What We Discovered on ‘Deep YouTube’
beSpacific 2025-04-25
The Atlantic [no paywall] – The video site isn’t just a platform. It’s infrastructure [The article is an important read especially in this turbulent time]. “Until last month, nobody outside of YouTube had a solid estimate for just how many videos are currently on the site. Eight hundred million? One billion? It turns out that the figure is more like 14 billion—more than one and a half videos for every person on the planet—and that’s counting strictly those that are publicly visible. I have that number not because YouTube maintains a public counter and not because the company issued a press release announcing it. I’m able to share it with you now only because I’m part of a small team of researchers at the University of Massachusetts at Amherst who spent a year figuring out how to calculate it. Our team’s paper, which was published last month, provides what we believe is the most comprehensive analysis of the world’s most important video-sharing platform to date. The viral videos and popular conspiracy theorists are, of course, important. But the reality is that the number and perhaps even importance of those videos are dwarfed by hours-long church services, condo-board meetings, and other miscellaneous clips that you’ll probably never see. Unlike stereotypical YouTube videos—personality-driven and edited to engage the broadest possible audience—these videos aren’t uploaded with profit in mind. Instead, they illustrate some of the ways that people rely on YouTube for a much wider range of activities than you would find while casually scrolling through its algorithmically driven recommendations. YouTube may have started as a video platform, but it has since become the backbone of one of the 21st century’s core forms of communication. Despite its global popularity, YouTube (which is owned by Google) veils its inner workings. 
When someone studies, for example, the proliferation of extreme speech on YouTube, they can tell us about a specific sample of videos—their content, view count, what other videos they link to, and so on. But that information exists in isolation; they cannot tell us how popular those videos are relative to the rest of YouTube. To make claims about YouTube in its entirety, we either need key information from YouTube’s databases, which isn’t realistic, or the ability to produce a big-enough, random sample of videos to represent the website. That is what we did. We used a complicated process that boils down to making billions upon billions of random guesses at YouTube IDs (the identifiers you see in the URL when watching videos). We call it “dialing for videos,” inspired by the “random digit dialing” used in polling. It took a sophisticated cluster of powerful computers at the University of Massachusetts months to collect a representative sample; we then spent another few months analyzing those videos to paint what we think is the best portrait to date of YouTube as a whole. (We use a related, slightly faster method at this website to keep regularly updated data.)
So much of YouTube is effectively dark matter. Videos with 10,000 or more views account for nearly 94 percent of the site’s traffic overall but less than 4 percent of total uploads. Just under 5 percent of videos have no views at all, nearly three-quarters have no comments, and even more have no likes. Popularity is almost entirely algorithmic: We found little correlation between subscribers and views, reflecting how YouTube recommendations, and not subscriptions, are the primary drivers of traffic on the site. In other words, people tend to watch just a sliver of what YouTube has to offer, and, on the whole, they follow what the algorithm serves to them.”
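The "dialing for videos" idea in the excerpt rests on simple sampling arithmetic: if random guesses are spread uniformly over the space of possible IDs, the fraction that land on a real video, multiplied by the size of that space, estimates how many videos exist. The Python sketch below illustrates only that arithmetic on a made-up toy "site". It is not the researchers' pipeline and makes no network requests. The function names, the toy ID-space size, and the normal-approximation interval are all assumptions chosen for illustration.

```python
import math
import random


def estimate_population(hits: int, guesses: int, id_space: int) -> tuple[float, float, float]:
    """Return (estimate, low, high) for the number of valid IDs.

    Assumes guesses were drawn uniformly at random from the ID space.
    The interval is a rough 95% normal-approximation interval.
    """
    if guesses <= 0:
        raise ValueError("guesses must be positive")
    rate = hits / guesses
    # Standard error of a binomial proportion.
    stderr = math.sqrt(rate * (1 - rate) / guesses)
    low = max(0.0, rate - 1.96 * stderr) * id_space
    high = min(1.0, rate + 1.96 * stderr) * id_space
    return rate * id_space, low, high


def simulate(id_space: int, true_count: int, guesses: int, seed: int = 0) -> tuple[float, float, float]:
    """Run the estimator against a hypothetical toy site with a known size."""
    rng = random.Random(seed)
    # Hidden set of "valid" IDs; a real sampler would not know this set.
    valid = set(rng.sample(range(id_space), true_count))
    # Each guess is an independent uniform draw from the whole ID space.
    hits = sum(1 for _ in range(guesses) if rng.randrange(id_space) in valid)
    return estimate_population(hits, guesses, id_space)


if __name__ == "__main__":
    # Toy numbers only: 1,000,000 possible IDs, 50,000 of them valid.
    estimate, low, high = simulate(id_space=1_000_000, true_count=50_000, guesses=20_000)
    print(f"estimated valid IDs: {estimate:,.0f} (rough 95% interval {low:,.0f} to {high:,.0f})")
```

In the toy run, 1 in 20 IDs is valid, so a few thousand guesses already give a usable estimate. When valid IDs are a vanishingly small share of the space, far more guesses are needed to see any hits at all, which is consistent with the excerpt's description of billions of guesses and months of computing time.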