Stalking the Statistically Improbable Restaurant… With Data!

...My heart's in Accra 2025-07-03

Last summer, I wrote about the statistically improbable restaurant, the restaurant you wouldn’t expect to find in a small American city: the excellent Nepali food in Erie, PA and Akron, OH; a gem of a Gambian restaurant in Springfield, IL. Statistically improbable restaurants often tell you something about the communities they are based in: Erie and Akron have large Lhotshampa refugee populations, Nepali-speaking people who lived in Bhutan for years before being expelled from their country; Springfield has the University of Illinois Springfield, which attracts lots of West African students, some of whom have settled in the area.

Fine food from The Gambia in Springfield, IL

The existence of the statistically improbable restaurant implies a statistically probable restaurant distribution: the mix of restaurants we’d expect to find in an “average” American city. Of course, once you dig into the idea of an “average” city, the absurdity of the concept becomes clear. There are 343 cities in the US with populations of over 100,000 people, from 8.47 million in New York City to 100,128 in Sunrise, Florida (a small city in the Ft. Lauderdale, FL metro area). Within that set are global megacities like New York and LA, state capitals, college towns, towns growing explosively and those shrinking slowly.

I’ve retrieved data about the restaurants in 340 of these cities using the Google Places API. This is a giant database of geographic information from across the world – not only does it include information about restaurants, but about parks, churches, museums and other points of interest. The API was designed to make it easy to search by proximity – “return all restaurants within 2km of this point” – but it’s recently gained an “aggregate” attribute, which allows you to ask questions like “How many Mexican restaurants are there in Wichita Falls, Texas?”.

The API is not perfect. I tested my queries on my hometown of Pittsfield, MA and while it got some questions (the number of Dunkin’ Donuts) completely correct, it missed others entirely, failing to identify our two excellent Brazilian restaurants when I searched for that category. We’re going to proceed with the assumption that the data is imperfect, and sanity-check when we get surprising results.

graph of the relationship between population and number of restaurants

For starters, we look to see whether there’s a relationship between the population of a city and the number of restaurants located within city limits. It seems obvious that New York City should have significantly more restaurants than Lincoln, Nebraska, and indeed, that’s true. When we look at the whole set of cities between 100k – 8 million, there’s a straightforward linear relationship between population and restaurants with a few interesting outliers: Houston has more restaurants than we might expect, Phoenix fewer than we’d anticipate for cities their size.

graph of the relationship between restaurants and population in large american cities

The data is messier as we look at smaller sets of cities. Looking at cities with populations over 250,000, lopping off the four largest US cities (New York, Los Angeles, Chicago, Houston), a linear regression no longer fits as well. Some of the cities that are celebrated for their “creative economies” – Austin, San Francisco, Portland, Seattle, Nashville, Boston – have more restaurants than we might expect, while some less celebrated cities of comparable size – Fort Worth, Jacksonville, Indianapolis, El Paso, Oklahoma City – have fewer than we might expect.

Graph of the relationship between population and number of restaurants in small American cities

Exploring the cities between 100,000 – 250,000, there’s still a clear relationship between population and the number of restaurants, but that relationship explains just more than half the data variance (R² = 0.5333). Some of the cities that are especially restaurant-rich are relatively small capital cities – Little Rock, AR; Providence, RI; Baton Rouge, LA; Tallahassee, FL – and college towns – Knoxville, TN; Tempe, AZ. Some of the cities that have fewer restaurants than expected are close to larger cities – Cape Coral, FL is next to Fort Myers; Yonkers, NY is next to New York City; Moreno Valley, CA may be overshadowed by Riverside and San Bernardino.

(These are rough guesses based on staring at scatterplots. I’ll want to try some regressions before positing that capital cities have a higher than usual number of restaurants because lobbyists need to take legislators out to eat.)
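For anyone who wants to reproduce this kind of scatterplot reading, here is a minimal sketch of an ordinary least squares fit with R² and per-city residuals. It is not the code behind the graphs above; the city names and numbers in `sample` are invented placeholders, and the function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def residuals(cities, slope, intercept):
    """Actual minus predicted restaurants; positive means 'restaurant-rich'."""
    return {
        name: restaurants - (slope * population + intercept)
        for name, (population, restaurants) in cities.items()
    }


# Invented placeholder data: name -> (population, restaurant count).
sample = {
    "City A": (100_000, 310),
    "City B": (150_000, 420),
    "City C": (200_000, 700),
    "City D": (250_000, 690),
}

populations = [p for p, _ in sample.values()]
counts = [r for _, r in sample.values()]
slope, intercept = fit_line(populations, counts)
print("R^2:", r_squared(populations, counts, slope, intercept))
print(residuals(sample, slope, intercept))
```

Cities with large positive residuals are the ones that look restaurant-rich for their size, and large negative residuals flag the cities that look underserved.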

With all this data, we can now imagine an “average” American city of 100,000 people. We’ll call our imagined city “New Springfield, California”. (California has 76 cities with 100,000 or more people, ahead of Texas with 42. There are three Springfields in our set of cities, and 5 cities that start with “New”.)

AI generated image of an imaginary city with a "Welcome to New Springfield" sign

There are 305 restaurants in New Springfield. 61 (just over 20%) are fast food outlets, including:
– 9 Starbucks and 4 Dunkin’s
– 6 McDonalds, 3 Burger Kings and 3 Wendy’s
– 4 Taco Bells and 2 Chipotles
– 9 Subways
– 3 Dominos and 2.5 Chick-Fil-A’s

55 restaurants describe themselves as selling “American” food. Additionally, New Springfield boasts 5 BBQ joints, 5 diners, 12 bar and grills, 22 burger joints, 29.5 pizza parlors, 28 sandwich shops and 5 steakhouses.

122 restaurants offer some sort of “international” cuisine. Mexican is the most numerous with 38 eateries. There are 12.5 Chinese restaurants, 12 that identify as “Asian” (not clear how those categories overlap), 11 Japanese, 3.5 Korean restaurants, 1.5 Ramen bars, 7 sushi restaurants, 4 Thai restaurants and 3.5 Vietnamese places. (Not clear if any have good banh mi, or just pho.) There are 4.5 “Mediterranean” restaurants, 10 Italian, 1.5 Greek, two Middle Eastern and half a Lebanese place. There are four Indian restaurants, two Brazilian restaurants, and my favorite place has a 47% chance of being African on any given night, a 20% chance of being Afghan and is otherwise likely to be Turkish.

These numbers won’t add up neatly, due to rounding, overlap between restaurants (the combination Pizza Hut and Taco Bell might well be coded as Mexican, Italian and fast food) and the fact that this is an imaginary city based on the distribution of messy and incomplete data. But it gives us a statistically probable city that we can now deviate from.
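The fractional restaurants in New Springfield look consistent with multiplying a category's mean share by the city's 305 restaurants. That is a reading of the numbers, not a confirmed description of how they were produced. A small sketch of the arithmetic, using mean shares quoted elsewhere in this post (20.15% fast food, 0.15% African, 0.07% Afghan); the function name is hypothetical:

```python
TOTAL_RESTAURANTS = 305  # restaurants in the imagined city, from the post

# Mean category shares quoted in the post, as fractions of all restaurants.
MEAN_SHARE = {
    "fast food": 0.2015,
    "African": 0.0015,
    "Afghan": 0.0007,
}


def expected_count(share, total=TOTAL_RESTAURANTS):
    """Expected number of restaurants in a category for the average city."""
    return share * total


for category, share in MEAN_SHARE.items():
    # A value below 1 can be read as the chance that one restaurant exists.
    print(f"{category}: {expected_count(share):.4f}")
```

Under that assumption, fast food comes out at roughly 61 outlets, African at about 0.46 of a restaurant and Afghan at about 0.21, which lines up with the 61, 47% and 20% figures above.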

We can – and will, in just a moment – look at individual variables to discover that Newark, NJ has the highest percentage of African restaurants and that Quincy, MA has the lowest proportion of Mexican restaurants of any American city over 100,000. (If you’ve been to Quincy, that tracks – while the town is not solely white and blue collar as it used to be, there’s been an influx of Asian immigrants and far fewer Latinx immigrants than in Boston’s western suburbs.) For now, we want to ask: which of our 340 cities is the closest to New Springfield, the most “average”?

I represented the restaurant distribution for each city as a vector of 41 values, each a probability between 0 and 1 that a random restaurant fits within a specific category (Is it fast food? Chinese? A Dunkin’ Donuts? etc.). The cities closest to the centroid of that vector space are Lexington, KY; Colorado Springs, CO; North Charleston, SC; Indianapolis, IN and Columbus, OH. Three of those cities are relatively close to one another (Lexington, Columbus and Indianapolis), suggesting some sort of southern/midwest conceptual center for the nation’s culinary tastes.

(I badly wanted Peoria, IL to be close to the centroid because of the old idea that Peoria was middle American enough to be America’s test market. That old saw hasn’t been true for decades – Peoria hasn’t diversified as quickly as the rest of the US and is no longer demographically average. Interestingly enough, Columbus, OH is one of the cities most mentioned when people look for a demographically representative test city. And in an ironic twist, both Peoria, IL and Peoria, AZ are right in the middle of my centroid ranking, meaning they are right in the middle between being average and being unusual.)

The five cities furthest from the centroid – statistically the five most unusual cities – are a weird mix: South Fulton, GA; Garden Grove, CA; Menifee, CA; Jurupa Valley, CA; and Quincy, MA. All three California cities are in the south of the state, east of Los Angeles. Menifee and Jurupa Valley are part of the “Inland Empire”, while Garden Grove borders on Anaheim. Garden Grove, CA and Quincy, MA have larger Asian populations than many similarly-sized cities, and Jurupa Valley is majority Latinx, not unusual for California, but quite different from the rest of the nation. South Fulton, GA is a suburb of Atlanta. Like Jurupa Valley, CA, it was recently incorporated, which might explain why it’s got the fewest restaurants of any city in our set. (It might also be a data error.)
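The "closest to the centroid" and "furthest from the centroid" rankings can be sketched as follows. The post doesn't say which distance measure was used, so plain Euclidean distance is an assumption here, and the three-category vectors and city names are invented placeholders rather than real data.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_typicality(city_vectors):
    """City names sorted from most typical (nearest centroid) to least."""
    center = centroid(list(city_vectors.values()))
    return sorted(
        city_vectors,
        key=lambda name: euclidean(city_vectors[name], center),
    )


# Invented shares for three categories: fast food, Mexican, Chinese.
sample = {
    "City A": [0.20, 0.12, 0.04],
    "City B": [0.33, 0.30, 0.01],
    "City C": [0.09, 0.03, 0.12],
    "City D": [0.21, 0.13, 0.05],
}

ranking = rank_by_typicality(sample)
print("most typical:", ranking[0])
print("most unusual:", ranking[-1])
```

With 41 categories instead of three, the same ranking gives both ends of the list: the head is the "average" cities and the tail is the unusual ones.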

My centroid calculations currently weigh all vectors equally, despite the fact that some have very little variation, and others have lots. Google’s API has a category for Indonesian restaurants, despite the fact that the vast majority of US cities don’t have any – removing Indonesian restaurants, for example, might give me more explicable results and help me find clusters of restaurants from the data.
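One way to act on that idea is to drop any category that is zero in most cities before computing distances. The sketch below is an illustration, not the method used here: the 50% cutoff, the function name and the sample values are all assumptions.

```python
def drop_sparse_categories(city_vectors, categories, min_nonzero_share=0.5):
    """Remove categories that are non-zero in too few cities.

    city_vectors maps a city name to a list of shares, one per category.
    Returns the surviving category names and the filtered vectors.
    """
    n_cities = len(city_vectors)
    keep = []
    for j in range(len(categories)):
        nonzero = sum(1 for vec in city_vectors.values() if vec[j] > 0)
        if nonzero / n_cities >= min_nonzero_share:
            keep.append(j)
    kept_categories = [categories[j] for j in keep]
    filtered = {
        name: [vec[j] for j in keep] for name, vec in city_vectors.items()
    }
    return kept_categories, filtered


# Invented placeholder shares; only one city has any Indonesian restaurants.
categories = ["fast_food", "mexican", "indonesian"]
sample = {
    "City A": [0.20, 0.12, 0.0],
    "City B": [0.33, 0.30, 0.0],
    "City C": [0.09, 0.03, 0.003],
    "City D": [0.21, 0.13, 0.0],
}

kept, filtered = drop_sparse_categories(sample, categories)
print(kept)
```

Filtering first leaves the distance calculation unchanged; it only has fewer, better-populated dimensions to work with.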

But we don’t care about clusters of cities – we’re looking for statistically improbable food! I’m right there with you, friends.

Our method is quite good at showing us concentrations of restaurants in the categories that Google explicitly tracks. In the average American city, 0.07% of restaurants serve Afghan food. But six California cities – Fremont, Elk Grove, El Cajon, Tracy, Hayward and Concord – boast that at least 1% of their restaurants are Afghan. Fremont, Concord and Hayward are all in the East Bay, inland from San Francisco, and Elk Grove and Tracy are in the same general part of the state, suggesting that migrants may well move to parts of the US where there’s an established population of compatriots to open their businesses.

African restaurants are similarly rare – 0.15% of total restaurants in our set. But the cities with high concentrations of African food are more widely distributed. Newark, NJ is well within the orbit of New York City, and Inglewood, CA within LA’s penumbra, and both have attracted African immigrants for whom real estate within the megalopolis is too expensive. The other three cities with more than 1% African restaurants – Minneapolis, MN, St. Paul, MN and Fargo, ND – are well known destinations for migrants from East Africa, particularly Somalis. (Wonderfully, two cities in easy driving distance of me – Albany, NY and Worcester, MA – rank in the top 20 of African restaurant distribution.)

Sometimes what’s interesting is what’s NOT present in a city. The cities with high concentrations of Mexican restaurants are where you would expect them to be – southern California, with a few in the Central Valley; border areas of Texas and New Mexico; the Phoenix suburbs. Mexican food deserts include some very cold places (Rochester and Buffalo, NY), and some cities with large non-Mexican immigrant populations (Arabic speakers in Dearborn, MI, Asian Americans in Quincy, MA). Digging into demographic data, I discovered that two Mexican food deserts in Florida are demographically distinct from the Miami area, where both are located. Miami Gardens, FL is 62% African American, down from a decade ago – there’s significant Latinx immigration, but it’s very demographically distinct from Miami, which is 70% Latinx and 12% African American. Sunrise, FL, another Miami area city, has large Jamaican and Haitian populations (as well as a surprising number of Yiddish speakers).

I had a hypothesis that concentrations of fast food were correlated to poverty. As the data is coming in, I think it may correlate more closely to rapidly growing suburbs – West Jordan, UT (Salt Lake City), North Las Vegas, NV (Las Vegas), Ontario, Rialto and Menifee, CA (Inland Empire/Riverside/San Bernardino) all rank high on that score. Fast food also may inversely correlate to population. Of the twenty largest cities in the US, only four have fast food prevalence over the mean (20.15%): Phoenix (20.85%); Jacksonville (21.72%); Fort Worth (21.53%); Oklahoma City (21.77%).

There’s something of a snob factor going on as well. The nine cities I found with fewer than 10% fast food restaurants are: San Francisco, CA; Seattle, WA; Portland, OR; Berkeley, CA; San Mateo, CA; Miami, FL; Oakland, CA; Pittsburgh, PA and Honolulu, HI. Four of those cities are in the SF Bay Area, one of the wealthiest and most expensive parts of the country. Neither the Pacific Northwest nor Hawaii is cheap, either.

I’ve got tons more to do with this data. I’m fooling around with k-means clustering, trying to identify emerging patterns. I’ll make my code more efficient and expand this to the cities I am most in love with – the 50k to 100k cities – and see if the overall patterns change. Once I’ve fixed a few more data quality problems, I’ll release a spreadsheet or CSV of the data – if you’d like to play with it in the meantime, let me know.

For now, let me close with a set of top ten lists:

Highest prevalence of fast food:
1. Menifee, California: 35.66%
2. West Jordan, Utah: 35.07%
3. Rialto, California: 32.77%
4. Ontario, California: 32.72%
5. North Las Vegas, Nevada: 32.55%
6. Fontana, California: 32.35%
7. Independence, Missouri: 31.85%
8. Victorville, California: 31.72%
9. Wichita Falls, Texas: 31.50%
10. Olathe, Kansas: 31.44%

Highest prevalence of American restaurants:
1. Lincoln, Nebraska: 28.29%
2. Surprise, Arizona: 27.54%
3. Goodyear, Arizona: 27.15%
4. Menifee, California: 27.13%
5. Tuscaloosa, Alabama: 26.09%
6. Lafayette, Louisiana: 25.68%
7. Independence, Missouri: 25.56%
8. Rio Rancho, New Mexico: 25.53%
9. Billings, Montana: 25.47%
10. Moreno Valley, California: 25.43%

Highest prevalence of BBQ restaurants:
1. Shreveport, Louisiana: 5.23%
2. Kansas City, Kansas: 4.55%
3. Chattanooga, Tennessee: 4.18%
4. Columbus, Georgia: 4.05%
5. New Braunfels, Texas: 4.01%
6. Huntsville, Alabama: 3.94%
7. Honolulu, Hawaii: 3.87%
8. Fayetteville, Arkansas: 3.85%
9. Concord, North Carolina: 3.84%
10. Beaumont, Texas: 3.74%

Highest prevalence of Bar and Grill restaurants:
1. Davenport, Iowa: 13.73%
2. Cedar Rapids, Iowa: 9.90%
3. Sioux Falls, South Dakota: 9.89%
4. Madison, Wisconsin: 9.42%
5. Akron, Ohio: 9.09%
6. Evansville, Indiana: 8.96%
7. Manchester, New Hampshire: 8.82%
8. Lee's Summit, Missouri: 8.62%
9. Spokane Valley, Washington: 8.60%
10. Omaha, Nebraska: 8.59%

Highest prevalence of diners:
1. Yonkers, New York: 4.30%
2. Lancaster, California: 3.96%
3. Davenport, Iowa: 3.87%
4. Hesperia, California: 3.86%
5. Mobile, Alabama: 3.74%
6. Rochester, New York: 3.64%
7. Cape Coral, Florida: 3.41%
8. Macon, Georgia: 3.40%
9. Augusta, Georgia: 3.37%
10. Jurupa Valley, California: 3.37%

Highest prevalence of burger joints:
1. Rio Rancho, New Mexico: 17.73%
2. Menifee, California: 17.05%
3. Moreno Valley, California: 14.74%
4. Jurupa Valley, California: 13.94%
5. West Jordan, Utah: 13.74%
6. Victorville, California: 13.45%
7. Yuma, Arizona: 13.36%
8. Fontana, California: 12.75%
9. Nampa, Idaho: 12.72%
10. Surprise, Arizona: 12.68%

Highest prevalence of pizza parlors:
1. Hampton, Virginia: 17.70%
2. Worcester, Massachusetts: 17.31%
3. Lowell, Massachusetts: 17.24%
4. Deltona, Florida: 16.96%
5. Quincy, Massachusetts: 16.80%
6. Newport News, Virginia: 16.29%
7. Chesapeake, Virginia: 15.95%
8. Lynn, Massachusetts: 15.52%
9. Warren, Michigan: 15.45%
10. Virginia Beach, Virginia: 15.30%

Highest prevalence of steakhouses:
1. Billings, Montana: 4.09%
2. Evansville, Indiana: 3.54%
3. Tyler, Texas: 3.34%
4. San Angelo, Texas: 3.31%
5. McAllen, Texas: 3.23%
6. Fort Wayne, Indiana: 3.22%
7. Suffolk, Virginia: 3.21%
8. Davenport, Iowa: 3.17%
9. Shreveport, Louisiana: 3.14%
10. Rockford, Illinois: 3.09%

Highest prevalence of Afghan restaurants:
1. Concord, California: 2.01%
2. Hayward, California: 1.74%
3. Tracy, California: 1.68%
4. El Cajon, California: 1.26%
5. Elk Grove, California: 1.18%
6. Fremont, California: 1.01%
7. West Valley City, Utah: 0.65%
8. Sacramento, California: 0.55%
9. Kent, Washington: 0.54%
10. Antioch, California: 0.52%

Highest prevalence of African restaurants:
1. Newark, New Jersey: 2.08%
2. Inglewood, California: 1.85%
3. Minneapolis, Minnesota: 1.62%
4. Fargo, North Dakota: 1.58%
5. St. Paul, Minnesota: 1.20%
6. Richmond, California: 0.93%
7. Arlington, Texas: 0.78%
8. Menifee, California: 0.78%
9. Worcester, Massachusetts: 0.77%
10. Providence, Rhode Island: 0.72%

Highest prevalence of Brazilian restaurants:
1. Newark, New Jersey: 3.96%
2. Lowell, Massachusetts: 3.88%
3. Worcester, Massachusetts: 3.27%
4. Richmond, California: 2.80%
5. Carlsbad, California: 2.33%
6. Coral Springs, Florida: 2.26%
7. South Fulton, Georgia: 2.22%
8. Huntington Beach, California: 2.18%
9. Brockton, Massachusetts: 2.08%
10. Orlando, Florida: 2.06%

Highest prevalence of Chinese restaurants:
1. Quincy, Massachusetts: 12.70%
2. Bellevue, Washington: 11.18%
3. Daly City, California: 10.58%
4. Philadelphia, Pennsylvania: 10.05%
5. San Mateo, California: 9.92%
6. Fremont, California: 9.92%
7. Sunnyvale, California: 9.07%
8. San Francisco, California: 9.05%
9. New York, New York: 8.80%
10. El Monte, California: 8.50%

Highest prevalence of Greek restaurants:
1. Carmel, Indiana: 2.33%
2. Boca Raton, Florida: 1.82%
3. Alexandria, Virginia: 1.75%
4. Tempe, Arizona: 1.74%
5. Lee's Summit, Missouri: 1.72%
6. Cincinnati, Ohio: 1.71%
7. Salt Lake City, Utah: 1.66%
8. Stamford, Connecticut: 1.64%
9. High Point, North Carolina: 1.64%
10. Manchester, New Hampshire: 1.63%

Highest prevalence of Indian restaurants:
1. Sunnyvale, California: 16.95%
2. Fremont, California: 13.77%
3. Irving, Texas: 9.82%
4. Tracy, California: 8.40%
5. Santa Clara, California: 8.07%
6. Frisco, Texas: 8.00%
7. Jersey City, New Jersey: 7.36%
8. Cary, North Carolina: 7.09%
9. Naperville, Illinois: 6.34%
10. Bellevue, Washington: 6.30%

Highest prevalence of Indonesian restaurants:
1. West Covina, California: 0.71%
2. Torrance, California: 0.40%
3. El Monte, California: 0.40%
4. Inglewood, California: 0.37%
5. Albany, New York: 0.30%
6. Sugar Land, Texas: 0.27%
7. Round Rock, Texas: 0.26%
8. Oceanside, California: 0.23%
9. Rancho Cucamonga, California: 0.22%
10. Philadelphia, Pennsylvania: 0.20%
(NB: Indonesian restaurants are quite uncommon in the US - that 0.3% in Albany, NY represents a single restaurant.)

Highest prevalence of Italian restaurants:
1. Boca Raton, Florida: 11.62%
2. Stamford, Connecticut: 10.38%
3. Yonkers, New York: 9.31%
4. Worcester, Massachusetts: 9.04%
5. Scottsdale, Arizona: 8.80%
6. Boston, Massachusetts: 8.25%
7. Coral Springs, Florida: 7.74%
8. New Haven, Connecticut: 7.39%
9. Pompano Beach, Florida: 7.38%
10. Palm Coast, Florida: 6.76%

Highest prevalence of Japanese restaurants:
1. Torrance, California: 15.59%
2. Honolulu, Hawaii: 15.32%
3. San Mateo, California: 13.32%
4. Costa Mesa, California: 12.85%
5. Berkeley, California: 11.04%
6. Federal Way, Washington: 9.34%
7. Bellevue, Washington: 9.15%
8. Irvine, California: 9.09%
9. Elk Grove, California: 8.53%
10. San Francisco, California: 8.34%

Highest prevalence of Korean restaurants:
1. Carrollton, Texas: 14.67%
2. Federal Way, Washington: 12.45%
3. Santa Clara, California: 8.74%
4. Garden Grove, California: 8.20%
5. Irvine, California: 7.75%
6. Fullerton, California: 7.46%
7. Ann Arbor, Michigan: 5.14%
8. Honolulu, Hawaii: 5.13%
9. Killeen, Texas: 4.40%
10. Torrance, California: 4.25%

Highest prevalence of Lebanese restaurants:
1. Dearborn, Michigan: 4.73%
2. Sterling Heights, Michigan: 2.08%
3. Miramar, Florida: 1.44%
4. Toledo, Ohio: 1.29%
5. Paterson, New Jersey: 1.28%
6. Richardson, Texas: 1.19%
7. Downey, California: 1.07%
8. Anaheim, California: 1.03%
9. Peoria, Illinois: 0.95%
10. Lafayette, Louisiana: 0.91%

Highest prevalence of Mediterranean restaurants:
1. Glendale, California: 8.08%
2. Burbank, California: 5.97%
3. Richardson, Texas: 5.93%
4. Sterling Heights, Michigan: 5.88%
5. Dearborn, Michigan: 5.68%
6. Plantation, Florida: 4.12%
7. Irvine, California: 3.43%
8. Pasadena, California: 3.38%
9. Tempe, Arizona: 3.36%
10. Warren, Michigan: 3.33%

Highest prevalence of Mexican restaurants:
1. Jurupa Valley, California: 32.69%
2. Santa Ana, California: 29.71%
3. Buckeye, Arizona: 29.55%
4. Oxnard, California: 28.91%
5. Pasadena, Texas: 27.66%
6. Santa Maria, California: 27.62%
7. El Monte, California: 27.53%
8. Salinas, California: 27.07%
9. Brownsville, Texas: 26.51%
10. Laredo, Texas: 26.46%

Highest prevalence of Middle Eastern restaurants:
1. Sterling Heights, Michigan: 7.96%
2. Dearborn, Michigan: 7.57%
3. Glendale, California: 5.27%
4. Paterson, New Jersey: 4.04%
5. Anaheim, California: 3.08%
6. Burbank, California: 2.99%
7. El Cajon, California: 2.93%
8. Warren, Michigan: 2.73%
9. Richardson, Texas: 2.57%
10. Plantation, Florida: 2.47%

Highest prevalence of ramen bars:
1. Honolulu, Hawaii: 2.45%
2. San Mateo, California: 2.35%
3. Elk Grove, California: 2.06%
4. Cambridge, Massachusetts: 2.05%
5. Torrance, California: 2.02%
6. Tempe, Arizona: 2.01%
7. Fullerton, California: 1.99%
8. Coral Springs, Florida: 1.94%
9. Daly City, California: 1.92%
10. Costa Mesa, California: 1.89%

Highest prevalence of Spanish restaurants:
1. Elizabeth, New Jersey: 3.95%
2. Newark, New Jersey: 2.40%
3. Lynn, Massachusetts: 2.30%
4. Yonkers, New York: 1.91%
5. Paterson, New Jersey: 1.91%
6. Hialeah, Florida: 1.86%
7. Rochester, New York: 1.73%
8. Albany, New York: 1.51%
9. Jersey City, New Jersey: 1.45%
10. Worcester, Massachusetts: 1.35%
(NB: Knowing some of these cities well, I strongly suspect that "Spanish" is shorthand for "Latin", and includes Puerto Rican, Dominican, Cuban etc.)

Highest prevalence of sushi restaurants:
1. San Mateo, California: 6.79%
2. Simi Valley, California: 6.32%
3. Costa Mesa, California: 5.67%
4. Roseville, California: 5.58%
5. Coral Springs, Florida: 5.48%
6. Boca Raton, Florida: 5.24%
7. Honolulu, Hawaii: 5.18%
8. Pembroke Pines, Florida: 5.15%
9. Davie, Florida: 5.06%
10. Berkeley, California: 5.00%

Highest prevalence of Thai restaurants:
1. St. Paul, Minnesota: 5.67%
2. Portland, Oregon: 4.97%
3. Amarillo, Texas: 4.63%
4. Berkeley, California: 4.58%
5. Vancouver, Washington: 4.12%
6. Seattle, Washington: 4.08%
7. Anchorage, Alaska: 3.74%
8. Alexandria, Virginia: 3.71%
9. Tacoma, Washington: 3.57%
10. Lowell, Massachusetts: 3.45%

Highest prevalence of Turkish restaurants:
1. Paterson, New Jersey: 2.77%
2. Plantation, Florida: 1.65%
3. El Cajon, California: 1.26%
4. Richardson, Texas: 1.19%
5. Waterbury, Connecticut: 0.99%
6. Daly City, California: 0.96%
7. West Jordan, Utah: 0.95%
8. Dearborn, Michigan: 0.95%
9. Kent, Washington: 0.82%
10. Bridgeport, Connecticut: 0.76%

Highest prevalence of Vietnamese restaurants:
1. Garden Grove, California: 21.12%
2. San Jose, California: 6.44%
3. Renton, Washington: 5.71%
4. Federal Way, Washington: 4.67%
5. Tacoma, Washington: 4.46%
6. El Monte, California: 4.45%
7. Kent, Washington: 4.36%
8. Garland, Texas: 4.18%
9. Everett, Washington: 3.49%
10. Lowell, Massachusetts: 3.45%

Finally, some grossly oversimplified aggregate statistics:

Highest prevalence of "domestic" cuisine, including "American", diners, pizza, burgers, bar and grills, steakhouses:
1. Davenport, Iowa: 76.76%
2. Billings, Montana: 75.16%
3. Goodyear, Arizona: 74.66%
4. Evansville, Indiana: 74.30%
5. Surprise, Arizona: 73.92%
6. Springfield, Illinois: 72.52%
7. Spokane Valley, Washington: 72.39%
8. Rio Rancho, New Mexico: 71.64%
9. Suffolk, Virginia: 71.55%
10. Lee's Summit, Missouri: 71.11%

Highest prevalence of "international" cuisine, including all restaurants that mention a specific non-US nationality, plus sushi/ramen:
1. San Mateo, California: 71.83%
2. Federal Way, Washington: 68.86%
3. Sunnyvale, California: 67.07%
4. Garden Grove, California: 67.00%
5. Santa Clara, California: 62.77%
6. Berkeley, California: 62.49%
7. Fremont, California: 62.14%
8. San Francisco, California: 61.40%
9. Bellevue, Washington: 61.39%
10. Costa Mesa, California: 61.26%

So very many disclaimers apply:

– Some of this data is guaranteed to be wrong. Some will be wrong because Google’s knowledge of US restaurants is imperfect. Some will be wrong because my code got something wrong. I welcome your “that can’t possibly be right” comments, but won’t be fixing rankings or code in response to them.

– There’s a small set of cities with populations over 100,000 for which I had consistent problems getting accurate data. They’ve been removed from the data set. (They know who they are.) I think this is a Google Places API problem but remain open to the idea that it’s my particular stupidity.

– These categories don’t make sense! Aren’t all the pizza places Italian restaurants! What the hell’s the difference between a Lebanese, Middle Eastern and Mediterranean restaurant? Yep. This one’s on Google – those are the categories I have access to. I strongly suspect they overlap, and it’s not clear whether restaurants self-categorize or are somehow categorized into these buckets.

– Where are the Uighur, Burmese and Peruvian places? Again, blame Google and its categories. I, for one, would welcome an app that told me how many miles I am from Uighur food at all times, and how to alter my driving so I can detour to eat cumin lamb. But so far, this is what I have easy access to. My next version of the tool is going to search for specific terms – “Uighur”, “Xinjiang”, “Ughyur”, etc. – in hopes of identifying some of my favorite cuisines.
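A term search like the one described in the last point could start as simple keyword matching over restaurant names or descriptions. This is a sketch under assumptions, not the planned tool: the variant lists go beyond the three Uighur spellings quoted above ("uyghur" and the Burmese and Peruvian variants are guesses), and `tag_cuisines` is a hypothetical name.

```python
import re

# Cuisine -> lowercase spellings to look for. Only "uighur", "xinjiang"
# and "ughyur" come from the post; the rest are illustrative guesses.
CUISINE_TERMS = {
    "Uighur": ["uighur", "uyghur", "ughyur", "xinjiang"],
    "Burmese": ["burmese", "burma", "myanmar"],
    "Peruvian": ["peruvian", "peru"],
}


def tag_cuisines(text, terms=CUISINE_TERMS):
    """Return the cuisines whose spellings appear as whole words in text."""
    lowered = text.lower()
    found = []
    for cuisine, variants in terms.items():
        if any(re.search(rf"\b{re.escape(v)}\b", lowered) for v in variants):
            found.append(cuisine)
    return found


# Invented restaurant names, for illustration only.
for name in ["Silk Road Uyghur Kitchen", "Taste of Xinjiang", "Joe's Pizza"]:
    print(name, "->", tag_cuisines(name))
```

Matching on whole words keeps "peru" from firing inside unrelated words, at the cost of missing creative spellings that aren't in the list.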

Lastly, the code that I used for this analysis was written almost solely by Google Gemini, which was an experience in and of itself. I’ll post about that at a later date.

The post Stalking the Statistically Improbable Restaurant… With Data! appeared first on Ethan Zuckerman.