Who Is Winning the Generative AI Race? Nobody (yet).

e-Literate 2023-08-31

This is a post for folks who want to learn how recent AI developments may affect them as people interested in EdTech who are not necessarily technologists. The tagline of e-Literate is “Present is Prologue.” I try to extrapolate from today’s developments only as far as the evidence takes me with confidence.

Generative AI is the kind of topic that’s a good fit for e-Literate because the conversations about it are fragmented. The academic and technical literature is boiling over with developments on practically a daily basis but is hard for non-technical folks to sift through and follow. The grand syntheses about the future of…well…everything are often written by incredibly smart people who have to make a lot of guesses at a moment of great uncertainty. The business press has important data wrapped in a lot of WHEEEE!

Generative AI will definitely look exactly like this!

Let’s see if we can run this maze, shall we?

Is bigger better?

OpenAI and ChatGPT set many assumptions and expectations about generative AI, starting with the idea that these models must be huge and expensive. Which, in turn, means that only a few tech giants can afford to play.

Right now there are five widely known giants. (Well, six, really, but we’ll get to the surprise contender in a bit.) OpenAI’s ChatGPT and Anthropic’s Claude are pure plays created by start-ups. OpenAI started the whole generative AI craze by showing the world how much anyone who can write English can accomplish with ChatGPT. Anthropic has made a bet on “ethical AI” with more protections from harmful output and a few differentiating features that are important for certain applications but that I’m not going to go into here.

Then there are the big three SaaS hosting giants. Microsoft has been tied very tightly to OpenAI, in which it owns a 49% stake. Google, which has been a pioneering leader in AI technologies but has been a mess with its platforms and products (as usual), has until recently focused on promoting several of its own models. Amazon, which has been late out of the gate, has its own Titan generative AI model that almost nobody has seen yet but seems to be betting on hosting an ecosystem of platforms, including Anthropic and others.

About that ecosystem thing. A while back, an internal paper called “We Have No Moat, and OpenAI Doesn’t Either” leaked from Google. It made the argument that so much innovation was happening so quickly in open-source generative AI that the war chests and proprietary technologies of these big companies wouldn’t give them an advantage over the rapid innovation of a large open-source community.

I could easily write a whole long post about the nature of that innovation. For now, I’ll focus on a few key points that should be accessible to everyone. First, it turns out that the big companies with oodles of money and computing power—surprise!—decided to rely on strategies that required oodles of money and computing power. They didn’t spend a lot of time thinking about how to make their models smaller and more efficient. Open-source teams with far more limited budgets quickly demonstrated that they could make huge gains in algorithmic efficiency. The barrier to entry for building a better LLM—money—is dropping fast.

Complementing this first strategy, some open-source teams worked particularly hard to improve data quality, which requires more hard human work and less brute computing force. It turns out that the old adage holds: garbage in, garbage out. Even smaller systems trained on more carefully curated data are less likely to hallucinate and more likely to give high-quality answers.

And third, it turns out that we don’t need giant all-purpose models all the time. Writing software code is a good example of a specialized generative AI task that can be accomplished well with a much smaller, cheaper model using the techniques described above.

The internal Google memo concluded by arguing that “OpenAI doesn’t matter” and that cooperating with open source is vital.

That missive was leaked in May. Guess what’s happened since then?

The swarm

Meta had already announced in February that it was releasing an open-source-ish model called Llama. It was only open-source-ish because its license limited it to research use. That restriction was quickly hacked around. The academic teams and smaller startups, which were already innovating like crazy, took advantage of the oodles of money and computing power that Meta was able to put into Llama. Unlike the other giants, Meta doesn’t make money by hosting software. They make money from content. Commoditizing generative AI will lead to much more content being generated. Perhaps seeing an opportunity, when Meta released Llama 2 in July, the only unusual restrictions they placed on the open-source license were to prevent big hosting companies like Amazon, Microsoft, and Google from making money off Llama without paying Meta. Anyone smaller than that can use the Llama models for a variety of purposes, including commercial applications. Importantly, Llama 2 is available in a variety of sizes, including one small enough to run on a newer personal computer.

To be clear, OpenAI, Microsoft, Google, Anthropic, and Amazon are all continuing to develop their proprietary models. That isn’t going away. But at the same time…

  • Microsoft, despite their expensive continuing love affair with OpenAI, announced support for Llama 2 and has a license (but no announced products that I can find yet) for Databricks’ open-source Dolly 2.0.
  • Google Cloud is adding both Llama 2 and Anthropic’s Claude 2 to the list of 100 LLMs it supports, including its own open-source Flan-T5 and PaLM models.
  • Amazon now supports a growing range of LLMs, including open-source models from Stability AI and Llama 2.
  • IBM—’member them?—is back in the AI game, trying to rehabilitate its image after the much-hyped and mostly underwhelming Watson products. The company is trotting out watsonx (with the very now, very wow lower-case “w” at the beginning of the name and “x” at the end) integrated with HuggingFace, which you can think of as being a little bit like the GitHub for open-source generative AI.

It seems that the Google memo about no moats, which was largely shrugged off publicly way back in May, was taken seriously privately by the major players. All the big companies have been hedging their bets and increasingly investing in making the use of any given LLM easier rather than betting that they can build the One LLM to Rule Them All.

Meanwhile, new specialized and generalized LLMs pop up weekly. For personal use, I bounce between ChatGPT, Bing Chat, Bard, and Claude, each for different types of tasks (and sometimes a couple at once to compare results). I use DALL-E and Stable Diffusion for image generation. (Midjourney seems great but trying to use it through Discord makes my eyes bleed.) I’ll try the largest Llama 2 model and others when I have easy access to them (which I predict will be soon). I want to put a smaller coding LLM on my laptop, not to have it write programs for me but to have it teach me how to read them.

The most obvious possible end result of this rapid, sprawling growth of supported models is that, far from being the singular Big Tech miracle that ChatGPT sold us on with its sudden and bold entrance onto the world stage, generative AI is going to become just one more part of the IT stack, albeit a very important one. There will be competition. There will be specialization. The big cloud hosting companies may end up distinguishing themselves not so much by being the first to build Skynet as by their ability to make it easier for technologists to integrate this new and strange toolkit into their development and operations. Meanwhile, a parallel world of alternatives for startups and small or specialized use will spring up.

We have not reached the singularity yet

Meanwhile, that welter of weekly announcements about AI advancements I mentioned before has not included massive breakthroughs in super-intelligent machines. Instead, many of the announcements have been about supporting more models and making them easier to use for real-world development. For example, OpenAI is making a big deal out of how much better ChatGPT Enterprise is at keeping the things you tell it private.

Oh. That would be nice.

I don’t mean to mock the OpenAI folks. This is new tech. Years of effort will need to be invested in making this technology easy and reliable for the uses it’s being put to now. As an enterprise application, ChatGPT has largely been a very impressive demo, while ChatGPT Enterprise is exactly what it sounds like: an effort to make ChatGPT usable in the enterprise.

The folks I talk to who are undertaking ambitious generative AI projects, including ones whose technical expertise I trust a great deal, are telling me they are struggling. The tech is unpredictable. That’s not surprising; generative AI is probabilistic. The same function that enables it to produce novel content also enables it to make up facts. Try QA testing an application built on technology like that while avoiding regressions—i.e., bugs you thought you fixed but that came back in the next version. Meanwhile, the toolchain around developing, testing, and maintaining generative AI-based software is still very immature.
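One pattern teams use to cope with that unpredictability is to test *properties* of a model’s output rather than exact strings. Here is a minimal, self-contained sketch of the idea in Python. To be clear, `generate_summary` is a hypothetical stand-in for a real LLM call, and the specific checks are illustrative assumptions, not a recommended production test suite:

```python
import random

def generate_summary(text: str) -> str:
    """Hypothetical stand-in for an LLM call; real output varies per run."""
    fillers = ["In short,", "Briefly,", "To summarize,"]
    return f"{random.choice(fillers)} {text[:60].strip()}..."

def check_properties(output: str, source: str) -> list[str]:
    """Return a list of violated properties instead of comparing exact text."""
    failures = []
    if len(output) > 200:
        failures.append("too long")
    if not output.endswith("..."):
        failures.append("missing truncation marker")
    # Guard against the model inventing content absent from the source.
    if "guarantee" in output.lower() and "guarantee" not in source.lower():
        failures.append("unsupported claim")
    return failures

source = "Generative AI toolchains for testing and maintenance are still immature."
for _ in range(5):  # run several times; wording differs, properties should not
    out = generate_summary(source)
    assert check_properties(out, source) == [], out
```

The point is that exact-match tests break on every run of a probabilistic system, while property checks (length bounds, required markers, no unsupported claims) can at least catch the regressions you know how to name.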

These problems will be solved. But if the past six months have taught us anything, it’s that our ability to predict the twists and turns ahead is very limited at the moment. Last September, I wrote a piece called “The Miracle, the Grind, and the Wall.” It’s easy to produce miraculous-seeming one-off results with generative AI but often very hard to achieve them reliably at scale. And sometimes we hit walls that prevent us from reaching goals for reasons that we don’t see coming. For example, what happens when you run a data set that has some very subtle problems with it through a probabilistic model with half a trillion computing units, each potentially doing something with the data that is affected by those problems and passing the modified, problematic data on to other parts of the system? How do you trace and fix those “bugs” (if you even call them that)?

It’s fun to think about where all of this AI stuff could go. And it’s important to try. But personally, I find the here-and-now to be fun and useful to think about. I can make some reasonable guesses about what might happen in the next 12 months. I can see major changes and improvements AI can contribute to education today that minimize the risk of the grind and the wall. And I can see how to build a curriculum of real-world projects that teaches me and others about the evolving landscape even as we make useful improvements today.

What I’m watching for

Given all that, what am I paying attention to?

  • Continued frantic scrambling among the big tech players: If you’re not able to read and make sense of the weekly announcements, papers, and new open-source projects, pay attention to Microsoft, Amazon, Google, IBM, OpenAI, Anthropic, and HuggingFace. The four traditional giants in particular seem to be thrashing a bit. They’re all tracking the developments that you and I can’t and are trying to keep up. I’m watching these companies with a critical eye. They’re not leading (yet). They’re running for their lives. They’re in a race. But they don’t know what kind of race it is or which direction to go to reach the finish line. Since these are obviously extremely smart people trying very hard to compete, the cracks and changes in their strategies tell us as much as the strategies themselves.
  • Practical, short-term implementations in EdTech: I’m not tracking grand AI EdTech moonshot announcements closely. It’s not that they’re unimportant. It’s that I can’t tell from a distance whose work is interesting and don’t have time to chase every project down. Some of them will pan out. Most won’t. And a lot of them are way too far out over their skis. I’ll wait to see who actually gets traction. And by “traction,” I don’t mean grant money or press. I mean real-world accomplishments and adoptions. On the other hand, people who are deploying AI projects now are learning. I don’t worry too much about what they’re building, since a lot of what they do will be either wrong, uninteresting, or both. Clay Shirky once said the purpose of the first version of software isn’t to find out if you got it right; it’s to learn what you got wrong. (I’m paraphrasing since I can’t find the original quote.) I want to see what people are learning. The short-term projects that are interesting to me are the experiments that can teach us something useful.
  • The tech being used along with LLMs: ChatGPT did us a disservice by convincing us that it could soon become an all-knowing, hyper-intelligent being. It’s hard to become the all-powerful AI if you can’t reliably perform arithmetic, are prone to hallucinations, can’t remember anything from one conversation to the next, and start to space out if a conversation runs too long. We are being given the impression that the models will eventually get good enough that all these problems will go away. Maybe. For the foreseeable future, we’re better off thinking about them as interfaces with other kinds of software that are better at math, remembering, and so on. “AI” isn’t a monolith. One of the reasons I want to watch short-term projects is that I want to see what other pieces are needed to realize particular goals. For example, start listening for the term “vector database.” The larger tech ecosystem will help define the possibility space.
  • Intellectual property questions: What happens if The New York Times successfully sues OpenAI for copyright infringement? It’s not like OpenAI can just go into ChatGPT and delete all of those articles. If intellectual property law forces changes to AI training, then the existing models will have big problems (though some have been more careful than others). A chorus of AI cheerleaders tell us, “No, that won’t happen. It’s covered by fair use.” That’s plausible. But are we sure? Are we sure it’s covered in Europe as well as the US? How much should one bet on it? Many subtle legal questions will need to be sorted over the coming several years. The outcomes of various cases will also shape the landscape.
  • Microchip shortages: This is a weird thing for me to find myself thinking about, but these large generative AI applications—especially training them—run on giant, expensive GPUs. One company, Nvidia, has far and away the best processors for this work. So much so that there is a major race on to acquire as many Nvidia processors as possible due to limited supply and unlimited demand. And unlike software, a challenger company can’t shock the world with a new microprocessor overnight. Designing and fabricating new chips at scale takes years. More than two. Nvidia will be the leader for a long time. Therefore, AI’s ability to grow will be, in some respects, constrained by the company’s production capacity. Don’t believe me? Check out their five-year stock price and note the point when generative AI hype really took off.
  • AI on my laptop: On the other end of the scale, remember that open source has been shrinking the size of effective LLMs. For example, Apple has already optimized a version of Stable Diffusion for their operating system and released an open-source one-click installer for easier consumer use. The next step one can imagine is for them to optimize their computer chip—either the soon-to-be-released M3 or the M4 after it. (As I said, computer chips take time.) From there, one can easily imagine image generation, software code generation, and a chatbot that understands and can talk about the documents you have on your hard drive. All running locally and privately. In the meantime, I’ll be running a few experiments with AI on my laptop. I’ll let you know how it goes.
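Since I flagged the term “vector database” above, here is a toy sketch of the core operation behind one: store documents as vectors and retrieve the ones nearest to a query vector. The three-dimensional vectors and document names below are made up for illustration; real systems use embedding models that produce hundreds or thousands of dimensions, and purpose-built stores rather than a Python dict:

```python
import math

def cosine(a, b):
    """Cosine similarity: how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "embeddings"; a real system would get these from an embedding model.
docs = {
    "syllabus": [0.9, 0.1, 0.0],
    "grading policy": [0.8, 0.3, 0.1],
    "campus parking": [0.1, 0.2, 0.9],
}

def nearest(query_vec, k=2):
    """Rank stored documents by similarity to the query and return the top k."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:k]

print(nearest([0.85, 0.2, 0.05]))  # → ['syllabus', 'grading policy']
```

An LLM application then stuffs the retrieved documents into the prompt, which is how these systems “remember” things the model itself cannot.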

Present is prologue

Particularly at this moment of great uncertainty and rapid change, it pays to keep your eyes on where you’re walking. A lot of institutions I talk to either are engaged in 57 different AI projects, some of which are incredibly ambitious, or are looking longingly for one thing they can try. I’ll have an announcement on the latter possibility very shortly (which will still work for folks in the former situation). Think about these early efforts as CBE for the future of work. The thing about the future is that there’s always more of it. Whatever the future of work is today will be the present of work tomorrow. But there will still be a future of work tomorrow. So we need to build a continuous curriculum of project-based learning with our AI efforts. And we need to watch what’s happening now.

Every day is a surprise. Isn’t that refreshing after decades in EdTech?
