Our original post about gender and minority status showed that minority students and women had higher rates of success. That analysis was based on 49 cases and looked at admissions to any program, not just PGR top-50 programs. A larger number of participants reported the results of their applications to PGR top-50 programs, but these were recorded in such a way that we had to code them manually before we could analyze them. Once we had done this, we were able to expand the analysis. The new models, including those presented here, are based on more observations and give a different picture, at least of the effect of minority status.
We wanted to know if race adds anything to the prediction of PhD program admission success. To do this, we compared a model using only the three predictors we had previously selected to one that also included minority status. If the model that includes minority status gives a significantly better prediction than the one that doesn’t, this would tell us that PhD program admission rates differ between white and minority students with the same gender, undergrad GPA and verbal GRE score. Four of the participants whose answers we used to build our three-predictor model didn’t report minority status (cases 43, 64, 65, and 81). We could only use cases that reported minority status to build the model containing this variable, so we used the remaining cases. Since we can only meaningfully compare two models built on the same dataset, we also had to build a new three-predictor model using these cases. We created a dataset that didn’t contain the four cases, built the two models, and compared them using an analysis of deviance.
Here are the intercept and slope coefficients, with Wald Z statistics and p values, for the three-predictor model constructed using the new dataset:
These are very close to the results we got when we used the full data set. There is still no evidence for lack of fit, chisq(70) = 66.46273, p = 0.5977475. The deviance test comparing this model to the null model is still strongly significant, chisq(3) = 43.52594, p = 1.902966e-09. Gender is now (just barely) not significant, but as discussed earlier, the Wald Z test used to measure the contribution of each predictor to the strength of prediction is not that sensitive. The important thing to note is that the coefficient estimates hardly changed at all, suggesting that the relationships we observed among the predictors and the outcome are fairly reliable, and don’t depend that much on which cases are included.

Here’s our new regression equation:

Y = -17.72928 + gender*0.32076 + gpa*1.45217 + gre_verbal*0.11182
We also built a model that included minority status. Here is a summary of the results:
Other than the inclusion of minority status as a predictor, this model is pretty similar to the three-predictor model. The relationships we observed among the other three predictors and the outcome don’t change very much when minority status is accounted for. The Wald test for the minority status variable is nowhere near significant. There is no evidence for lack of fit with this model, chisq(69) = 65.88326, p = 0.5841074. It definitely performs better than the null model, chisq(4) = 44.10541, p = 6.100333e-09. The p value is slightly higher than for the three-predictor model because this model has one more free parameter (that’s why the chi square test has 4 degrees of freedom instead of 3) while the deviance decreases only a little (the two chi square values are similar).
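To make the point about degrees of freedom concrete, here is a minimal sketch that recomputes the two model-versus-null p values from the chi square statistics reported above. The helper `chi2_sf` is our own illustrative name, not something from the original analysis; it uses closed-form expressions for the chi square upper tail with integer degrees of freedom and needs only the Python standard library (`scipy.stats.chi2.sf` would do the same job if SciPy is available).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Model-versus-null statistics as reported in the text.
p_three = chi2_sf(43.52594, 3)  # gender + GPA + verbal GRE
p_four = chi2_sf(44.10541, 4)   # ... + minority status

print(f"three-predictor model: p = {p_three:.3e}")
print(f"four-predictor model:  p = {p_four:.3e}")
```

The chi square value barely moves between the two models, but it is referred to a distribution with one more degree of freedom, which puts more probability in the upper tail and so yields the larger p value.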
We compared the two models by testing the difference in residual deviance (the amount of variation in the outcome that each model fails to explain) using a chi square test, chisq(1) = 0.57947, p = 0.4465201. Adding another predictor to a model will always decrease the residual deviance a little bit, but the difference in this case was small and not statistically significant. To give you an idea of how small, the residual deviance of the simpler model was 66.463, and the difference was 0.57947. An improvement this small could easily have occurred by chance. There is no evidence for a relationship between minority status and the odds of admission when controlling for gender, undergrad GPA and verbal GRE score.
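The comparison itself is just a subtraction followed by a chi square lookup. The sketch below reproduces it from the residual deviances and degrees of freedom reported above; the variable names are ours, and it relies on the fact that with 1 degree of freedom the chi square upper tail reduces to `erfc(sqrt(x / 2))`.

```python
import math

# Residual deviances and degrees of freedom as reported in the text.
dev_simple, df_simple = 66.46273, 70  # gender + GPA + verbal GRE
dev_full, df_full = 65.88326, 69      # ... + minority status

diff = dev_simple - dev_full          # drop in deviance from adding the predictor
df_diff = df_simple - df_full         # number of added parameters (1 here)

# This closed form for the upper tail is valid only for 1 degree of freedom.
p_value = math.erfc(math.sqrt(diff / 2))

print(f"chisq({df_diff}) = {diff:.5f}, p = {p_value:.4f}")
```

The drop of 0.57947 is less than 1% of the simpler model's residual deviance, which is why the test comes out nowhere near significance.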
What’s especially interesting about this is that there is a relationship between minority status and program admissions when gender, undergrad GPA and verbal GRE score are not controlled for. We can demonstrate this by building a model that contains only minority status and comparing it to the no-predictor model. Here’s what that model would look like:
The Wald statistic for minority status is significant, and so is the overall model, chisq(1) = 5.002334, p = 0.02531316. Unfortunately, the deviance test detected a model violation, chisq(73) = 105.6568, p = 0.007471957. Nine outliers were present (cases 37, 14, 38, 33, 53, 10, 23, 18, and 44). I wasn’t comfortable deleting that many observations, so instead I tested the marginal relationship between minority status and admission success using Pearson’s chi square test of association. Nonminority applicants succeeded in 201 out of 602 applications, whereas minority students succeeded in 32 out of 152 applications. Being white was associated with a significantly higher rate of success, chisq(1) = 4.7407, p = 0.0295. However, we already know that this effect shrinks markedly and is no longer statistically significant when gender, undergrad GPA and verbal GRE score are modeled.
Essentially, the lower admission rates for minority students are statistically accounted for by their GPAs, verbal GRE scores, and gender. Since higher GPAs and GRE scores have positive effects on admissions, this means that the minority students who applied to PhD programs and submitted their results had, on average, weaker applications on these quantitative measures. There are many possible explanations for why this may be the case, and again, these results are based on the data we have. When we looked at a smaller subset of students and expanded our measure of success to include admissions outside of the PGR top-50, we found that minority students had a higher likelihood of success.
We also wanted to know if gender had an effect on graduate school admissions, so we built a model that didn’t include gender and compared it to the original three-predictor model. Here’s what our new two-predictor model looks like:
The two-predictor model predicts significantly better than the null model, chisq(2) = 54.47855, p = 1.479562e-12, with no evidence for violation of the logit model according to the chi square deviance test, chisq(74) = 73.89527, p = 0.4487002. Now let’s compare this model to the original three-predictor model. Once again, we are testing the difference in deviance between the two models against the null hypothesis that gender is unrelated to application success in the population when undergrad GPA and verbal GRE score are controlled for. We know that the deviance of the more complex model will be somewhat lower, but we want to see how much lower, and to test this difference for significance using a chi square test.
Model 1: (success, applied) ~ gpa + gre_verbal_round
Model 2: (success, applied) ~ gender + gpa + gre_verbal_round
| Resid. Df | Resid. Dev | Df | Deviance |
How much greater is the probability of success for a female applicant? That depends on the values of the other predictors. Let’s look at the modal values for undergrad GPA and verbal GRE score in our sample. For GPA, the most common answer is 4, which actually denotes grade point averages ranging from 3.9 to 4. For verbal GRE, the most common answer is 99, which actually denotes a range of percentile scores from 96 to 99.
For a typical (in our sample, but probably not anywhere else) male candidate, the estimated log odds would be -17.72928 + 4*1.45217 + 99*0.11182 = -0.85042, corresponding to odds of e^-0.85042 = 0.4272, or a probability of success for each application of 0.4272/(0.4272+1) = 0.299 (or 29.9%). The 13 candidates who actually have this combination of predictor values submitted 123 applications, of which 45 were successful, for a rate of 36.6%. So far our model looks pretty good. For a female candidate with the same GPA and verbal GRE, the estimated log odds would be -17.72928 + 0.32076 + 4*1.45217 + 99*0.11182 = -0.52966, corresponding to odds of e^-0.52966 = 0.58881, or a probability of success for each application of 0.58881/(0.58881+1) = 0.3706 (or 37.06%). The five candidates who actually have this combination of predictor values submitted 40 applications, of which 24 were successful, for a rate of 60%. This is considerably higher than our model predicts, though it is based on only five candidates.

The chi square test for the difference in deviance between the two models is significant, chisq(1) = 4.457, p = 0.03475849.
In other words, there is a statistically significant difference between the predictive power of the model that includes gender and the one that doesn’t. Female applicants have greater odds of being admitted to a top 50 program than male applicants with the same undergraduate GPA and verbal GRE score. How much greater? We can get the difference in odds between male and female applicants by antilogging the coefficient for gender, e^0.3356 = 1.399, meaning that female applicants have 39.9% higher odds of succeeding with each application.
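For readers who want to check the arithmetic, here is a small sketch of the two calculations used in this section: turning the regression equation into a predicted probability at the modal GPA and verbal GRE values, and antilogging the gender coefficient to get an odds ratio. The function and variable names are ours, and the coding of gender as 1 = female, 0 = male is an assumption inferred from the worked example. A probability is recovered from odds as odds / (1 + odds).

```python
import math

# Coefficients copied from the regression equation quoted in the post.
INTERCEPT = -17.72928
B_GENDER = 0.32076       # assumed coding: 1 = female, 0 = male
B_GPA = 1.45217
B_GRE_VERBAL = 0.11182


def success_probability(female, gpa, gre_verbal):
    """Convert the linear predictor (log odds) into a probability."""
    log_odds = INTERCEPT + B_GENDER * female + B_GPA * gpa + B_GRE_VERBAL * gre_verbal
    odds = math.exp(log_odds)
    return odds / (1 + odds)


# Modal values in the sample: GPA code 4, verbal GRE code 99.
p_male = success_probability(0, 4, 99)
p_female = success_probability(1, 4, 99)
print(f"predicted: male {p_male:.3f}, female {p_female:.3f}")
print(f"observed:  male {45 / 123:.3f}, female {24 / 40:.3f}")

# Odds ratio for gender, using the coefficient quoted in the paragraph above.
odds_ratio = math.exp(0.3356)
print(f"odds ratio = {odds_ratio:.3f}")
```

Note that an odds ratio multiplies the odds, not the probability, so a 39.9% increase in odds translates into a smaller increase in the probability of success, and the size of that increase depends on the baseline.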