Not Free, Not Easy, Not Trivial — The Warehousing and Delivery of Digital Goods


“There is a persistent conceit stemming from the IT arrogance we continue to see around us, but it’s one that most IT professionals are finding real problems with — the notion that storing and distributing digital goods is a trivial, simple matter, adds nothing to their cost, and can be effectively done by amateurs. In fact, a study done last year found that initiatives to move to cloud-based computing stalled most often because of higher-than-expected costs... I’m not talking here about just things like the cost of electricity, which should be enough on its own to disabuse idealists of their vacuous notions of what makes the world go around. I analyzed this at length in another post earlier this year. Even beyond just their power requirements, digital goods have particular traits that make them difficult to store effectively, challenging to distribute well, and much more effective when handled by paid professionals. First, digital goods are not intangible. They occupy physical space, be that on a hard drive, on flash memory, or during transmission... In fact, their small size makes managing them somewhat unique and in some ways quite difficult. Because digital products are information products, they need more data than just their inherent data to be managed and used. This is metadata — descriptions of what’s in the tiny packet, where it resides, what forebears it has, what dependencies it has, and how it can be used... Creating, updating, and tracking the metadata is a chore for owners of digital goods... Then we get into the pesky problem that digital goods are very small and very easy to create. The proliferation of digital goods — photos, music, Web pages, blog posts, social media shares, tweets, ratings, movies and videos, and so much more — puts incredible and growing pressure on metadata management techniques and layers. 
This means building more and larger warehouses, which adds to both ongoing costs for current users and migration costs as older warehouses are outstripped by new demands. Megabytes become gigabytes become terabytes become zettabytes and beyond. Where will they all fit? Building more and larger digital warehouses and the information to make those digital goods accessible entail major work from highly technical people using sophisticated equipment and engineering. These people are expensive, and work at multiple levels in the digital economy. Let’s not forget the human effort involved in distributing and storing digital goods... Well-stored digital goods need a superordinate data structure to organize them, a structure that becomes more complex and expensive to maintain the more robust, commercial, and valuable the underlying digital assets become. And this structure can change... Digital goods also need to be backed up. Because they’re small, they are fragile. Because they are digital, they are all-or-nothing... Digital goods need to be secure... Digital goods have owners. The legal barrier for protection of databases is fairly low — ‘the author has to make choices about the selection, coordination, or arrangement of the facts or data’ — and some databases are very valuable... In fact, while PubMed Central is free to access, PubMed itself charges a licensing fee, often into the tens of thousands of dollars. Free abstracts compiled with taxpayer funds, and the government charges for them? Well, OA zealots, I’ll leave you to ponder that one... Even Creative Commons — which trades in nothing tangible — spends $2.5 million a year doing what it does, which is basically distributing labels it defines... Digital goods have to be available all the time. This requires a lot of infrastructure, more than a typical physical warehouse does... Digital warehouses are more expensive to build. Site planning is a major undertaking...
Digital goods also vary in quality, and some of the quality has to do with the infrastructure in which they exist. You can pay for faster downloads at some sites, because bandwidth is an expensive variable in the purveyance of digital goods. Do you pay more for a robust data plan? Do you pay more for a bigger pipe? The costs associated with data provision and storage can shock IT managers because they reveal all the overheads of digital goods. After all, in standard IT budgets, things like electricity, heat, and rent are handled in other internal budgets. Once data move to the cloud, the IT budget is paying for those things directly via the vendor, and these additional expenses can be sizable — or so says Jonathan Alboum, CIO at the U.S. Department of Agriculture’s Food and Nutrition Service (FNS): ‘With the cloud, these basic infrastructure charges are baked into the overall cost, so I’m now paying for some things that previously didn’t come out of my IT budget.’ So, are digital goods infinitely reproducible? Not practically. Information has an energy cost, even in biological systems... In the realm of digital goods, we’re reaching a point at which we’re facing trade-offs. Already, some data sets are propagating at a rate that exceeds Moore’s Law, which may still accurately predic



