That claim that Harvard admissions discriminate in favor of Jews? After seeing the statistics, I don’t see it.
Statistical Modeling, Causal Inference, and Social Science 2013-03-15
A few months ago we discussed Ron Unz’s claim that Jews are massively overrepresented in Ivy League college admissions, not just in comparison to the general population of college-age Americans, but even in comparison to other white kids with comparable academic ability and preparation.
Most of Unz’s article concerns admissions of Asian-Americans, and he also has a proposal to admit certain students at random (see my discussion in the link above). In the present post, I concentrate on the statistics about Jewish students, because this is where I have learned that his statistics are particularly suspect, with various numbers being off by factors of 2 or 4 or more.
Unz’s article was discussed, largely favorably, by academic bloggers Tyler Cowen, Steve Hsu, and . . . me! Hsu writes: “Don’t miss the statistical supplement.” But a lot of our trust in those statistics seems to be misplaced. Some people have sent me some information showing serious problems with Unz’s methods and his numbers.
This post is long because, if we’re adjudicating claims based on statistics, details matter. The short story, though, is that Unz appears to be (a) overestimating the number of Jews at Harvard, and (b) underestimating the proportion of Jews among the set of high-achieving potential Harvard applicants. Put these together and I don’t see the evidence that Jews receive preferential admissions compared to other whites. (Again, admissions of Asian-Americans are another story, not the topic of the present post.)
I personally have connections both to Harvard and to Jews, so you can make of this what you will. All I can say on that account is that, when Unz’s article came out a few months ago, I had no problem presenting its claims as stated; it was only after receiving some recent emails with detailed statistics that I got the impression that Unz’s numbers were mistaken. What Unz did seems reasonable from a distance (and I can understand why he made the choices he did in making his estimate), but his conclusions don’t seem to hold up on closer inspection.
Unz’s claim
Unz’s argument has two parts, a numerator and a denominator. First, that Ivy League colleges were admitting tons and tons of Jews: 25% at Harvard, Yale, and Columbia and “this same general pattern” in the other five Ivy League schools. Second, Unz writes that the academic credentials of American Jews are not so impressive, with Jews representing “less than 6 percent” of National Merit Scholar semifinalists, a number which Unz presents as “an extreme upper bound to a more neutrally-derived total.”
This seems pretty clear. You have a group that’s 6% of the top achievers, getting 25% of the places in top colleges. A factor of 4, that’s a lot. Sure, Unz’s reasoning can be questioned on the edges: Ivy League schools draw more students from the Northeast, and Unz’s estimates are only approximate. Unz acknowledges some of these issues, writing, “any of the individual figures provided above should be treated with great caution, but the overall pattern of enrollments—statistics compiled over years and decades and across numerous different universities—seems likely to provide an accurate description of reality.” In short, a factor of 4! That would seem pretty solid.
Nope. When you look at the numbers carefully, though, that factor of 4 erodes and erodes until there’s nothing left.
What percentage of Harvard College students are Jewish?
Start with the claim that 25% of Harvard College students are Jewish. That number comes from the Hillel Foundation, the Jewish student organization. I received an email from a Harvard alum who went through the names of Harvard students from the classes of 2009-2012 and estimated the proportion of Jews using the same scale-up methods [see details below] that Unz used to validate his personal estimates of the rate of Jewish names in high-achieving groups (Unz stated here that these scale-up methods produced results within 1% of his own estimates based on direct inspection). Using the scale-up methods, you get an estimate that 10-11% of students at Harvard are Jewish, not 25%. My correspondent suspects that the scale-up estimates are too low and that Hillel’s numbers are too high.
As described by Unz, here are the two Jewish name analyses he uses:
We can perform the same population estimate using distinctly Jewish last names, such as the small set of Cohen, Kaplan, Levy, and “Gold—“ (J1) which were suggested by blogger Steve Sailer and his Jewish correspondent, or else extended to include the full set of such names (J2) utilized by Weyl by adding Berman, Bernstein, Epstein, Friedman, Greenberg, Katz, Levine, Rosenberg, and Stern. Based on the 2000 Census estimates, the first group includes approximately 1 in 20 American Jews, while the larger set raises the fraction to 1 in 12.
I’m a little miffed the list doesn’t include Rosenthal or Gelman, but hey, what can you do?
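To make the scale-up procedure concrete, here is a minimal Python sketch. The surname lists and the 1-in-20 and 1-in-12 multipliers come from the passage quoted above; the function names, the treatment of “Gold” as a prefix match, and the sample roster are illustrative assumptions, not Unz’s or Weyl’s actual code.

```python
# Sketch of the surname scale-up ("Weyl") estimate described in the quote above.
# Name lists and multipliers are taken from that quote; everything else is assumed.

J1_EXACT = {"cohen", "kaplan", "levy"}
J1_PREFIXES = ("gold",)  # assumption: "Gold-" means any surname starting with "Gold"
J2_EXTRA = {
    "berman", "bernstein", "epstein", "friedman", "greenberg",
    "katz", "levine", "rosenberg", "stern",
}
J1_SCALE = 20  # J1 names cover roughly 1 in 20 American Jews
J2_SCALE = 12  # J2 names cover roughly 1 in 12 American Jews


def is_j1(surname: str) -> bool:
    """True if the surname is on the small (J1) list."""
    s = surname.strip().lower()
    return s in J1_EXACT or s.startswith(J1_PREFIXES)


def is_j2(surname: str) -> bool:
    """True if the surname is on the extended (J2) list, which includes J1."""
    return is_j1(surname) or surname.strip().lower() in J2_EXTRA


def scale_up_share(surnames, matcher, scale):
    """Estimated Jewish share of a roster: (matching names / total names) * scale."""
    if not surnames:
        raise ValueError("roster is empty")
    hits = sum(1 for name in surnames if matcher(name))
    return scale * hits / len(surnames)


# Hypothetical roster of 240 students, invented purely for illustration.
roster = ["Cohen", "Goldberg", "Katz"] + ["Smith"] * 237
j1_estimate = scale_up_share(roster, is_j1, J1_SCALE)  # 2 hits: 20 * 2 / 240
j2_estimate = scale_up_share(roster, is_j2, J2_SCALE)  # 3 hits: 12 * 3 / 240
print(f"J1 estimate: {j1_estimate:.1%}, J2 estimate: {j2_estimate:.1%}")
```

Note how coarse the estimate is on a small roster: with 240 names, a single additional J1 match moves the J1 estimate by about 8 percentage points. That is one reason to report a range from the two lists rather than a single number.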
Numerator incompatible with denominator
As my correspondent writes, “If Unz wants to compare the representation of Jewish students among National Merit Scholar semifinalists to that among Harvard undergrads, he must use the same methodology for both data sets. This substantially nullifies Unz’s arguments about Harvard’s admissions preferences for unqualified Jewish students (particularly in comparison to non-Jewish whites, whose enrollment at Harvard he substantially underestimated).”
OK, so going from 25% to 10%—that’s a factor of 2.5. What about the rest? Two things: geography and counting.
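As a quick check on the arithmetic, the factor of 4 can be split into the part explained by switching Harvard’s figure to the name-based method and the part still left over. The sketch below uses only numbers quoted in this post; taking 10% (the low end of the 10-11% range) is my choice for illustration.

```python
# Decompose the headline "factor of 4" using only figures quoted in the post.
hillel_share = 0.25      # Hillel-based figure for Harvard
name_based_share = 0.10  # low end of the 10-11% name-based estimate for Harvard
unz_nms_share = 0.06     # Unz's figure for National Merit Scholar semifinalists

claimed_ratio = hillel_share / unz_nms_share         # a bit over 4
method_factor = hillel_share / name_based_share      # 2.5
remaining_factor = name_based_share / unz_nms_share  # what geography and counting must explain

print(f"claimed overrepresentation:        {claimed_ratio:.2f}x")
print(f"removed by using one method:       {method_factor:.2f}x")
print(f"left for geography and counting:   {remaining_factor:.2f}x")
```

The three numbers are linked by claimed_ratio = method_factor * remaining_factor, so what remains to be explained is a factor of roughly 1.7, not 4.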
First, the geography. Typically over 40% of Harvard College students come from New England and the mid-Atlantic, a group of states that includes 48% of American Jews but only 21% of the white population.
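Here is a rough sketch of how much geography alone could matter. It takes the three figures in the previous paragraph and asks how the Jewish-to-white ratio of a class that draws 40% of its students from this region compares with a class drawn in proportion to the white population. The model is a strong simplification and entirely my assumption: it treats achievement and application rates as identical across regions and groups, and it works with a ratio of Jews to whites rather than a share of all students.

```python
# Rough geographic reweighting. Inputs are the three figures quoted above;
# the model (same achievement and application rates everywhere) is an assumption.
jews_in_region = 0.48        # share of American Jews in New England + mid-Atlantic
whites_in_region = 0.21      # share of the white population living there
students_from_region = 0.40  # share of Harvard College students from there ("over 40%")

# Jewish-to-white ratio inside and outside the region, relative to the national ratio.
ratio_inside = jews_in_region / whites_in_region
ratio_outside = (1 - jews_in_region) / (1 - whites_in_region)

# A class drawn in proportion to the white population recovers the national ratio (1.0).
baseline = whites_in_region * ratio_inside + (1 - whites_in_region) * ratio_outside

# A class that draws 40% of its students from the region instead.
geographic_multiplier = (
    students_from_region * ratio_inside
    + (1 - students_from_region) * ratio_outside
)

print(f"ratio inside region:   {ratio_inside:.2f}x national")
print(f"ratio outside region:  {ratio_outside:.2f}x national")
print(f"geographic multiplier: {geographic_multiplier:.2f}x")
```

Under these assumptions the regional tilt by itself raises the expected Jewish-to-white ratio by roughly 30%. Because “over 40%” is a lower bound on the regional share, the multiplier from this simple model would be somewhat larger with the true figure.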
In comparison, Jewish admissions in competitive west coast colleges such as Stanford and the University of California are much lower, reflecting that they are drawing from a different geographic distribution of applicants. My correspondent writes:
Performing Weyl Analysis on Stanford’s public directory yields the result that 4-5% of Stanford’s undergrads are Jewish (half of the 9.5% Hillel figure cited by Unz), which also happens to coincide with the percentage of Jewish CA NMS semifinalists one finds via Weyl analysis. Note that this figure is below Unz’s estimate (which we shall soon argue is an underestimate) that Jews represent 6% of all NMS semifinalists. This does not suggest that Stanford is discriminating against Jews but rather reflects the fact that Stanford draws many of its students from California, where Jews are a much smaller % of the population than in the Northeast where the Ivies are located. When searching Stanford’s public directory, one can easily see the students’ majors, and it is interesting to note that relatively few of the students with presumably Jewish names are studying engineering, while performing a search for Asian names like Chen or Wang yields many engineering students. This casts doubt on Unz’s implication that the reason there are relatively few Jews at MIT and Caltech is due to their “objective/meritocratic” admissions practices; rather, relatively few Jews seem interested in pursuing engineering.
Second, the counting. The person who emailed me had gone to the trouble of replicating Unz’s calculations, that is, going through the names of National Merit Scholar semifinalists, counting the number of typically Jewish surnames from the two lists specified above, and then scaling these up using the procedure described by Unz to obtain an estimate of total % Jews. For almost every state, these replicated estimates are higher than the numbers reported by Unz. Here are some states:
Pennsylvania: replication estimates 14-21% Jewish; Unz reported 9% Jewish
Massachusetts: replication estimates 9-14%; Unz did not count
Maryland: replication estimates 12-15%; Unz reported 11%
Virginia: replication estimates 7-9%; Unz reported 6%
Ohio: replication estimates 6-7%; Unz reported 4%
Illinois: replication estimates 10%; Unz reported 8%
Florida: replication estimates 9%; Unz reported 8%
Michigan: replication estimates 4%; Unz reported 2%
New York: replication estimates 24%; Unz reported 21%
This all suggests that Unz’s estimate of 6% Jewish National Merit Scholar semifinalists is too low, even using the Weyl method that Unz described (and which, by Unz’s report, gave results within 0.1 percentage point of what he got from direct inspection of the names). While the Weyl method may produce different results on a state-by-state basis from Unz’s “direct inspection” method, my correspondent found that it produced higher Jewish totals than Unz reported in almost every state checked.
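The state-by-state comparison is easy to tabulate. The sketch below encodes the figures listed above (Massachusetts is left out because Unz reported no number for it) and computes the gap in percentage points; the variable names are mine, and single-number estimates are stored as a range with equal endpoints.

```python
# State-level figures as quoted in the post, in percent:
# (replication low, replication high) versus Unz's reported value.
# Massachusetts is omitted because Unz reported no figure for it.
states = {
    "Pennsylvania": ((14, 21), 9),
    "Maryland": ((12, 15), 11),
    "Virginia": ((7, 9), 6),
    "Ohio": ((6, 7), 4),
    "Illinois": ((10, 10), 8),
    "Florida": ((9, 9), 8),
    "Michigan": ((4, 4), 2),
    "New York": ((24, 24), 21),
}

gaps = {}
for state, ((low, high), unz) in states.items():
    gaps[state] = (low - unz, high - unz)  # gap at the low and high end, in points
    print(f"{state:<13} replication {low}-{high}%  Unz {unz}%  "
          f"gap {low - unz} to {high - unz} points")

# Does the replication exceed Unz even at the low end of its range?
all_higher = all(low_gap > 0 for low_gap, _ in gaps.values())
print("replication higher in every listed state:", all_higher)
```

Even taking the low end of each replication range, the replicated figure is above Unz’s in every state listed here, by anywhere from 1 to 5 percentage points.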
Recall that the Weyl method, when applied to the list of Harvard names, estimates Harvard’s student population as 10-11%, not 25%, Jewish. The point here is not that Unz was trying to get the wrong answer but rather that he was using different data sources and estimation methods in different places, and this leads to systematic errors when the data are combined.
Unz’s estimate is also low because he was missing data from some of the most populous states such as Massachusetts that account for a disproportionate percentage of the U.S. Jewish population. And that’s all before adjusting to account for Ivy students being disproportionately from the Northeast as well as PSAT cutoff scores required for NMS status being generally higher in that region of the U.S.
In summary: Unz claimed that Harvard and other Ivies massively over-admit Jews relative to their academic accomplishments (in comparison to non-Jewish whites). But if you are careful with the statistics and compare comparable numbers, the differences go away.
That “strange collapse of Jewish academic achievement”? It’s not such a collapse nor is it so strange.
Here’s another comparison, this based on Harvard’s Phi Beta Kappa awardees.
Unz: “By the late 2000s and early 2010s, Jewish students had become one of the academically weakest groups at Harvard, constituting 25% or more of all students, but just 11-13% of PBKs selections. Meanwhile, during the 2010s the average Asian student was nearly 300% more likely to make PBK, with their proportion of Junior Year PBKs running even higher. And white Gentiles seemed to perform best of all, being about 400% more likely to gain PBK honors than their Jewish classmates.”
My correspondent writes: “Unz obtains these figures by classifying all non-obviously Asian and non-obviously Jewish high academic achievers as non-Jewish white, even though there are many Jews with non-obviously Jewish names (not to mention biracial students with an Asian mother and non-Asian people of color). In addition, I checked his stats for junior PBKs for the classes of 2010-13, and he inflated the Asian numbers by 7 percentage points. Perhaps he would claim that he checked the photos of all 96 students and somehow confirmed that some of the students with “white” names are actually Asian (or half-Asian), but had he done so, he would have noticed that at least 3 of the students are non-Asian students of color, whom he classified as non-Jewish white. Unz completely disregards the existence of black and Hispanic high academic achievers, thus obtaining inflated figures for the academic performance of white Gentiles.”
Unz also looks at very high-performing math students to document his claim of “the strange collapse of Jewish academic achievement.” If you look at the numbers carefully, though, it’s not such a collapse nor is it so strange. I learned this via some numbers compiled by Janet Mertz, a professor of oncology at the University of Wisconsin – Madison who has published a relevant article in the Notices of the American Mathematical Society on mathematics performance by gender and ethnicity on national and international mathematics competitions. She also happens to be the mother of one of the Jewish students with an Anglicized surname whom Unz failed to count among the Jewish 21st century US IMO team members and Putnam Fellows.
Unz: “The U.S. Math Olympiad began in 1974, and all the names of the top scoring students are easily available on the Internet. During the 1970s, well over 40 percent of the total were Jewish, and during the 1980s and 1990s, the fraction averaged about one-third. However, during the thirteen years since 2000, just two names out of 78 or 2.5 percent appear to be Jewish.” [full list is here]
Mertz: “For the 2000s (i.e., 2000-2009) row of his table, Unz claims there were only 3% Jews. Based upon my direct personal knowledge of these students, I have determined that there actually were at most 35% non-Jewish whites, 49% Asians, and at least 16% Jews, with 1/2 Jews being counted 50-50 between Jewish and non-Jewish white and 1/4 Jews being counted as 25-75 between Jewish and non-Jewish white. The 44% Jews Unz claims for the 1970s teams is probably a significant over-estimate. The “Weyl method” (whose numbers need to be multiplied by 12 to estimate total % Jews) yields only 1 obviously Jewish name out of 48 for that era while it yields 2 out of 78 for the 21st century. Unfortunately, my ethnicity data from the 1970s is too incomplete for me to give a firm maximum % Jews.”
OK, let’s say that again: Unz says the rate in the 21st century has been 2.5%; the actual number is over 12% (taking the whole period 2000-2012). A factor of 5 makes a difference!
Enter the New York Times
In the New York Times, David Brooks wrote the following, in a column celebrating Unz’s piece as one of the best magazine articles of the year: “You’re going to want to argue with Unz’s article all the way along, especially for its narrow, math-test-driven view of merit. But it’s potentially ground-shifting. Unz’s other big point is that Jews are vastly overrepresented at elite universities and that Jewish achievement has collapsed. In the 1970s, for example, 40 percent of top scorers in the Math Olympiad had Jewish names. Now 2.5 percent do.”
Nope. Mertz contacted Brooks and the New York Times about this (not to be all narrow and math-testy, but 2.5% != 12+%), but the Times has not (yet) run a correction. I had a similar experience after pointing out a statistical error in a different NYT column by a different Brooks, but it was no go on that one either, so I could relate to Mertz’s story.
OK, back to Jewish academic performance.
The rate of Jews in the U.S. Olympiad teams has declined over these decades, maybe by a factor of 2 or 3 rather than Unz’s claimed factor of 17. What explains the factor of 2-3?
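For reference, here is where those factors come from. The sketch uses only percentages quoted in this post: Unz’s 44% for the 1970s (as cited by Mertz), his 2.5% for 2000-2012, and the corrected figure of “over 12%” for the same period. Treating 12% as a point value is a simplification on my part, since it is a lower bound.

```python
# Percentages quoted in the post.
unz_1970s = 44.0        # Unz's table figure for the 1970s, as cited by Mertz
unz_2000s = 2.5         # Unz's figure for 2000-2012
corrected_2000s = 12.0  # lower bound for 2000-2012 ("over 12%")

claimed_decline = unz_1970s / unz_2000s          # Unz's implied drop
undercount_factor = corrected_2000s / unz_2000s  # how far off the 2.5% is
# If 44% is itself too high (as Mertz suspects) and 12% is a floor,
# the true decline is smaller than this value.
decline_ceiling = unz_1970s / corrected_2000s

print(f"claimed decline:      {claimed_decline:.1f}x")
print(f"undercount of 2000s:  {undercount_factor:.1f}x")
print(f"decline is at most:   {decline_ceiling:.1f}x")
```

The claimed decline works out to about 17.6, the recent figure is understated by a factor of nearly 5, and the decline implied by the corrected numbers is capped at under 4, consistent with the rough factor of 2 or 3.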
Mertz writes, “the recent modest drop off in the % Jews can be fully accounted for by their now having to compete with the recent influx to the US of high-achieving Asians for the fixed number of 6 slots per year available on the US IMO team, combined with Jews, especially non-ultra-Orthodox ones, having become a smaller percentage of the US population over the past few decades.”
In his article, Unz wrote: “today’s overwhelmingly affluent Jewish students may be far less diligent in their work habits or driven in their studies than were their parents or grandparents.” But if we accept that Asian-Americans are a high-achieving group in math, and we realize that they now greatly outnumber American Jews, it’s really no mystery that the proportion of Jews on the U.S. Olympiad team has dropped by a factor of 2 or 3. “Increased competition for a fixed number of slots,” together with demographic changes, would seem to be a sufficient explanation in and of itself.
There’s a similar problem with Unz’s analysis of winners of the Putnam math competition, of which he writes, “Over 40 percent of the Putnam winners prior to 1950 were Jewish, and during every decade from the 1950s through the 1990s, between 22 percent and 31 percent of the winners seem to have come from that same ethnic background. But since 2000, the percentage has dropped to under 10 percent, without a single likely Jewish name in the last seven years.”
But Mertz finds something quite different:
For the 2000s (i.e., 2000-2009), at least 34% of the students were foreigners who had only come to the US to matriculate to college here after having participated in the IMO as a member of their own country’s IMO team, i.e., they were born and educated prior to college outside of the US and, thus, competed for admission to elite US colleges in the foreign applicant pool, not the US one. This fact is very easy to confirm by simply searching the official IMO web site for what country’s team each student represented. Again, I say “at least” by assuming any student who did not participate in the IMO is a US citizen/resident. By subtracting out these foreign students, I then obtain at most 33% non-Jewish whites, at most 17% US Asian-Americans, and at least 15% Jews. My own son was among the Jewish Putnam Fellows in the past 7 years that Unz failed to count because he does not have an obviously Jewish surname.
Likewise, for the 2010s (i.e., 2010 and 2011 which is a total of only 10 students, at least 40% of whom were foreign), I calculate at most 40% non-Jewish whites, 10% US Asian-Americans, and at least 10% Jews. Again, the collapse Unz claims in Jewish achievement is simply not present in these data sets when Jews are properly counted by directly asking them whether they have any Jewish parents or grandparents (and counting people with partial Jewish ancestry fractionally in the total).
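The fractional counting that Mertz describes is simple to write down. In this sketch each student contributes the fraction of their ancestry that is Jewish (1, 1/2, 1/4, or 0) to the Jewish total. The eight-person roster is invented purely for illustration and is not Mertz’s data; the all-or-nothing function is a crude stand-in I added to show how a count that only catches fully Jewish students would differ.

```python
from fractions import Fraction

# Hypothetical roster: the fraction of each student's ancestry that is Jewish.
# These eight values are invented for illustration; they are not Mertz's data.
jewish_fraction = [Fraction(1), Fraction(1, 2), Fraction(1, 4)] + [Fraction(0)] * 5


def fractional_share(fractions):
    """Jewish share of the roster when partial ancestry is counted fractionally."""
    if not fractions:
        raise ValueError("roster is empty")
    return sum(fractions) / len(fractions)


def all_or_nothing_share(fractions):
    """Jewish share if only students with fully Jewish ancestry are counted."""
    if not fractions:
        raise ValueError("roster is empty")
    return Fraction(sum(1 for f in fractions if f == 1), len(fractions))


fractional = fractional_share(jewish_fraction)          # (1 + 1/2 + 1/4) / 8
all_or_nothing = all_or_nothing_share(jewish_fraction)  # 1 / 8

print(f"fractional count:     {float(fractional):.1%}")
print(f"all-or-nothing count: {float(all_or_nothing):.1%}")
```

With tiny rosters like the Putnam and Olympiad lists, one or two students with partial ancestry or non-obvious surnames can shift the percentage by several points, which is why the counting rule has to be the same on both sides of any comparison.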
Mertz summarizes:
By far the biggest problem with the Unz article is his failure to consistently use the same methodology for counting Jews (or US Asian-Americans). I believe the primary reason he sees disparities is due to changing methods for counting, comparing data from methods which under-count against data from methods which over-count. If he maintained a consistent method, it wouldn’t matter much if his method were somewhat off (except in the cases of the very tiny data sets such as the Putnam and Olympiad data or when he compares the 1970s against the 21st century because of changing demographics). So, for example, the problem with his claim that Jews are being over-admitted to Ivy League colleges is that he uses the Hillel data sets (which are likely over-counts) against his Jewish names method (which likely under-counts). If he used his Jewish names method, rather than the Hillel data, to determine % Jews attending Harvard, Yale, etc., he might well find the disparity disappears. Likewise, he includes foreign whites and Asians and others who do not yet have green cards in his Putnam and Harvard Phi Beta Kappa data even though these students were in a separate applicant pool for admissions than were the U.S. students. The U.S. Asians who do not yet have green cards are also mixed in with the Asian-Americans in his National Merit Scholar semifinalist data. In addition, he assumes that all non-Jewish, non-Asian students are non-Jewish US whites, ignoring the fact that some of these students are, in fact, African-American, Hispanic, or, in the case of the Phi Beta Kappa and Putnam data, foreign.
Not a slam, not a fisk
My point here is not to slam or “fisk” Unz. Rather, I’m giving all these details because, as a statistician, I think details are important. I suspect that one reason Unz’s article was originally received so uncritically by many bloggers and journalists is that Unz presented lots of numbers and described where he got them. There were no other easily accessible numbers on the topic, so we were inclined to go with what Unz had. So I think it’s important to pick on the details here, so that it’s clear that these are not minor technical criticisms but rather get to the center of Unz’s argument. I read his article as claiming that Jewish students are overrepresented in Harvard’s admissions by roughly a factor of 4 (compared to other whites) relative to their academic achievements and abilities. But when you make the comparisons carefully, this disparity goes away. There still is arguably an underrepresentation of high-achieving Asian students, but some of Unz’s comparisons there are off too, in that he at times is lumping U.S. and foreign Asians into a single category, whereas it would seem more appropriate to focus on Asian-Americans when comparing to other college applicants.
Summary
In his article, Unz claims to have found that elite college admissions underrepresent Asian-Americans (in comparison to their academic talent and achievements) and overrepresent Jews, leaving non-Jewish whites squeezed out. Looking at the statistics more carefully, we see no evidence that Jews are admitted preferentially compared to other whites. Unz’s error arose because he used different sorts of information with different biases that did not cancel out but actually reinforced each other, underestimating the proportion of high-achieving Jews and overestimating the Jewish presence among Ivy League students.
I have long argued that meritocracy can’t work (for more recent discussion, see here and here) and so I’m sympathetic to Unz’s general concerns. But it looks like he garbled the analysis for one of his main points.
The statistical message: your conclusion is only as good as the numbers that go into it.
P.S. Unz replies here. I should prepare a longer explanation, but, just briefly, it does not address the questions raised in this blog post, most obviously that he is using different and incompatible sources for his numerator and denominator and that one of his dramatic numbers is off by at least a factor of 5.