Episode 32

LLM AI, the fourth pillar of software

After a year when the sudden leap forward of Gen AI and LLMs took almost everyone by surprise, it’s a good time to ignore the hype, take a step back and look at what we have.

Guido Appenzeller is a special advisor to a16z, a former CTO at Intel and VMware and a leading voice in the current wave of software and technology. In part one of our conversation with Guido, we discuss where AI fits into the existing software universe, what phase 2 might look like, the potential limitations – regulatory and infrastructural - and what it means for developers when you have an additional fundamental system building block with which to work.

Listen to part 2.

Episode Transcript

Chris Kindt

Welcome to Orbit, the Hg podcast series where we speak to leaders and innovators from across the software and tech ecosystem to discuss the key trends in building businesses that endure. My name is Chris Kindt. I'm the head of the value creation team at Hg, and I'm delighted to be joined today by Guido Appenzeller. Guido is a special advisor to Andreessen Horowitz, former CTO at Intel and VMware and a leading technology expert. At Hg we've been fortunate to hear him speak, most recently on AI and its effect on the world of B2B SaaS. Thank you, Guido, for joining us.

Guido Appenzeller

It's great to be here today. Thank you.

Chris Kindt

So Guido, I can't believe that it's now almost a year since ChatGPT was launched and hit our world, and I must admit I didn't, and I think many of us didn't, quite see the powerful impact that it would have and how much discussion has been sparked around it since. It would be really helpful to have your perspective on how we got here, and really what took us to the place where we are today.

Guido Appenzeller

Yeah, I think it surprised everyone how suddenly AI has made a leap forward. When I did my PhD at Stanford, AI was known as this thing which looked fantastic in demos, but then when you actually tried to put it in production, it never quite worked. And that continued like that pretty much until the mid 2010s, like 2015 or so, when some of the large hyperscalers, Google, Facebook, Uber, Tesla, really figured out how to use deep learning type techniques and really create value. They built these massive internal networks to, for example, optimize advertising and similar tasks, but it was still something which only the very large hyperscaler-style companies could do. You needed large amounts of investment. You needed real specialists, like my fellow Stanford PhDs, these kinds of people that are very expensive, and they would work for a long time and build a model that could solve one particular problem.

And then suddenly, I want to say mid last year roughly, we had a couple of major breakthroughs where we made models bigger. Suddenly we saw these new emergent behaviors, and nobody quite understands why. But basically once you cross certain size thresholds, you saw new capabilities that previous models didn't exhibit, and those made these models massively more useful and enabled a different usage model with this idea of foundation models. And that really changed the unit economics and the adoption of this newer AI.

Chris Kindt

And I guess the question that people are now grappling with is what analogy do we use for this technology change? What would you draw the parallel to in terms of previous technology changes that we've seen hit our worlds?

Guido Appenzeller

Yeah, it's a great question. To me it looks right now like this is a very big technology wave, something that creates immediate value. I'm using this technology every day, unlike some other technology waves where the value took much, much longer to develop. We're also seeing a very rapid uptake. If you look at the statistics on how quickly people are adopting this, OpenAI, for example, is possibly the fastest company from the first dollar to a billion-dollar run rate in the history of tech. So we're seeing these tools being adopted much quicker than previous revolutions of a similar type. I think in part what's happening here is that we've effectively created a new fundamental component to build systems, and that doesn't happen very often. We invented CPUs and that essentially took compute and made it free. We created databases, which allowed somebody who has no idea how balanced trees or any of the complex technologies underneath work to quickly access data.

We developed networking that allowed us to transmit data for essentially free, at least a factor of a hundred thousand cheaper than before. And now we created something new, models which allow us to solve certain problems orders of magnitude faster and cheaper than previously. And if you look at these other technologies like compute, storage, networking, they became fundamental building blocks, which today are used in pretty much every piece of software that's out there. My current gut feeling is we've created a fourth component here, which in the future will also be in every piece of software that we build. And if that is true, that's a very major disruption. If you look at the last couple of ones, when we built CPUs, they created Microsoft and created Intel. When we created databases, that created Oracle. When we created networking, we created the Ciscos and Googles of the world. We probably have something of that scale ahead of us, something that'll create trillion-dollar companies.

Chris Kindt

This is where the crystal ball I suspect gets a little bit hazy and fuzzy, but what's your perspective on what we might therefore see or how we might expect to see software companies evolve? And maybe it's easier to think about this in the shorter term, maybe something that is a bit more visible. And then do you have any kind of thoughts on what potential routes or vectors of disruption we might see in software longer term?

Guido Appenzeller

Yeah, it's a great question and I think the honest answer is we don't know yet. Look, assume we were sitting here and we just had seen the first demo of a web browser. It's very hard to predict all of the internet. There's so many things happening. I think nobody looked at the first web browser and said, "Oh, wow, Walmart is toast as the number one retailer in the United States, they're going to get replaced." So making these leaps, understanding these secondary effects is really very difficult. But if we look at other similar transitions what often happens is you have a first wave where basically the technology gets retrofitted into existing products. And I think we're seeing that. We're seeing that with, for example, Microsoft with their Copilot style products.

They're saying you have PowerPoint and now you have an assistant that helps you use PowerPoint more efficiently but still with the same workflows as before. Or Adobe I think did a great job with this inpainting tool, where it's now a new Photoshop tool. I can mark an area and say replace this area with something. But then there's usually a second wave where basically people rethink the software from the bottom, really how... If I would reinvent Photoshop and my assumption is everything is centered around models, it would probably look completely different to what I have today. And if you look at a modern Stable Diffusion interface, I think you can see some of the traits of that. So I think the second wave will be the really interesting thing where we see a much bigger impact on incumbents.

Chris Kindt

The question that I'm sure many software executives and investors are grappling with is who is going to really help shape that next evolution? What defensive moats do the incumbents have? What can they leverage, and how real might it be that the companies of the future haven't yet been formed and that there's a chance of outside disruption? Do you have a perspective on who's best placed to push ahead here?

Guido Appenzeller

Yeah, I think it depends very much industry by industry. I don't see any general rule for this. In general I think AI itself does not offer... We haven't seen any very deep moats so far. There are certain industries that have a network effect or a two-sided marketplace, and those provide really deep moats. It's very hard to catch up to incumbents. In AI we have model training costs, but if you look at Llama 2 for example, which is one of the best models that's out there, that was trained for low single-digit millions of dollars, and that's not a big moat. Maybe having people who understand how to do this is a moat, but it is more like a traditional advantage in having the best team.

Having proprietary data could be a moat. If you control a substantial fraction of all the data in a particular space, say you're Bloomberg or so, maybe that's enough of a moat to provide you with some defensibility. But I think thanks to the Renaissance, this idea that knowledge should be universally available, most research today is available on the internet and most text is available on the internet. For most of these models, the data is publicly available. I think there are no really deep moats here. I think at the end it'll probably look much more like a classic software industry where it's about execution, about building great teams. It's about having the best infrastructure and running fast.

Chris Kindt

Interesting. And you bring up an important and interesting kind of discussion point around the fact that so much of the knowledge that these models get trained on is out there in the public domain, and a lot of these models have been trained on large portions or maybe all of the available good quality data that's out on the web. So that also then raises an immediate next question: okay, how much further do we have to go on these models? Are we now at an inflection point or are there other routes to driving continued step changes and improvements in what we are seeing in terms of these core AI model performances?

Guido Appenzeller

Yeah, so another really, really good question, and I'm not sure there's a widely shared opinion on that. I think there's an active debate, with various players arguing one way or the other. But if we look at it, we know... Let's take large language models as an example. For large language models, there's something called the Chinchilla scaling laws that basically tell us if you want to train a larger model, you need more data. And if you train on less data, you're leaving money on the table. It's not actually a better model. You can get some benefit by over-training a model. So you can basically make a small model better by training it longer and longer. But model size is coupled to the amount of training data that you have.

Today the top models are trained on something like 3 trillion tokens, order of magnitude, which is roughly 3 trillion words. That represents a substantial fraction of all human written knowledge that was ever created. And probably in terms of quality, the best parts of it. It's all of Wikipedia, it's a large part of the internet, and some of them have taken Reddit and GitHub and some of these other sources in there. So can you go up by another factor of 10? Maybe. Can you go up by a factor of a hundred? It's totally unclear to me. At least, we would really have to do something radically new, like teaching models to do thought experiments or something like that. Even beyond thought experiments, I think there are probably other areas. So just in terms of purely scaling models up, we're running into limitations.
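To make the coupling between model size and training data concrete, here is a minimal back-of-the-envelope sketch. It is not from the conversation: the ratio of roughly 20 training tokens per parameter and the cost of roughly 6 floating point operations per parameter per token are commonly quoted rules of thumb, used here purely as assumptions, and the function names are hypothetical.

```python
# Rough Chinchilla-style estimate. Both constants are rule-of-thumb
# ASSUMPTIONS for illustration, not figures quoted in the conversation.
TOKENS_PER_PARAM = 20       # assumed compute-optimal tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6   # assumed training cost per parameter per token


def compute_optimal_params(train_tokens: float) -> float:
    """Model size (in parameters) that a given token budget supports."""
    return train_tokens / TOKENS_PER_PARAM


def training_flops(params: float, train_tokens: float) -> float:
    """Approximate total training compute in floating point operations."""
    return FLOPS_PER_PARAM_TOKEN * params * train_tokens


if __name__ == "__main__":
    # 3 trillion tokens today, then the hypothetical 10x and 100x scale-ups.
    for tokens in (3e12, 3e13, 3e14):
        params = compute_optimal_params(tokens)
        flops = training_flops(params, tokens)
        print(f"{tokens:.0e} tokens -> ~{params / 1e9:,.0f}B params, "
              f"~{flops:.1e} training FLOPs")
```

Under these assumptions, every 10x increase in model size asks for 10x more training text, which is why the supply of high-quality data becomes the limit.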

There's a second issue and that's just the compute platform we're running on. If I can run a model inside a single graphics card, things are a lot easier versus having to distribute it over multiple cards. There are some interesting technologies like a mixture of experts, where we basically have multiple smaller models and route tokens to one of them in a clever way. But all these increase complexity. So if you go with a monolithic model, even at 8 bit, you're limited to basically 80 billion parameters because currently you don't have much more than 80 gigabyte graphics cards to run these on. So I think we're running into a number of limitations. I wouldn't expect us to see big size increases... Well, I would expect us to see some level of size increase, but they'll probably slow down a little bit in the future.
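The 80 billion parameter figure follows from simple arithmetic on weight storage. The sketch below shows that arithmetic; it counts only the model weights and ignores activations, caches and framework overhead, so real limits are tighter. The function names are hypothetical.

```python
def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def max_params_for_card(card_gb: float, bits_per_param: int) -> float:
    """Largest monolithic model whose weights fit on a single card."""
    return card_gb * 1e9 * 8 / bits_per_param


if __name__ == "__main__":
    # An 80 GB card at several precisions (weights only).
    for bits in (32, 16, 8, 4):
        limit = max_params_for_card(80, bits)
        print(f"{bits:>2}-bit weights: up to ~{limit / 1e9:.0f}B parameters on 80 GB")
```

At 8 bits each parameter takes one byte, so 80 GB holds about 80 billion parameters, matching the number quoted above.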

Chris Kindt

We've seen in our Hg portfolio that our companies are grappling with what the right response is. There are many options out there for how to get started, and many different providers. You've got open source models as well. I know this is a big complex topic to unpick. Maybe as a starting point it would be really helpful to hear how you think about what the key components are of that tech stack, that AI tech stack, and then maybe we can unpick some of the trends and the runners and the riders that we have within that.

Guido Appenzeller

Yeah, that sounds great. Well, I think an AI stack in many ways looks quite similar to a classic software stack. At the bottom you have the silicon, which for AI today is mostly Nvidia, and there are some other chips from the large cloud providers. All of the big clouds have their own or are building their own, and then you also have some other third parties, like AMD and Intel, that are trying. But Nvidia just has a fantastic lock-in because of their software stack. I mean, it's a little bit like the old Intel days when you had Intel plus Windows on top. It was very hard to break into that. We have the same thing here: Nvidia built CUDA and then all the software was built on top of that, so it's very hard to replace. So they're sitting pretty, and Jensen is a happy camper there.

On top of that, we have the clouds that actually make them accessible. Today most AI software runs in clouds and very few people run their own data center, for many reasons. But one of them is actually that if you want to build high density AI builds, you have such crazy power and heat requirements, cooling requirements, that a classical data center is often not able to absorb it. A single high-end AI server needs more power than the whole rack in a classic data center can in some cases actually supply. So we have AI clouds, and all of the big ones are offering AI compute capacity. We also have seen some amazing specialty clouds, and we've heard from CoreWeave that they announced, what was it? They raised about 2 billion in debt and announced that they had north of 2 billion in committed deals for next year. For a start-up, these are crazy numbers.

Chris Kindt

Absolutely.

Guido Appenzeller

It turns out in an early gold rush making picks and shovels is a great business here- [inaudible 00:14:06]

Chris Kindt

Yeah, absolutely.

Guido Appenzeller

Then on top of that, we have a software ecosystem, and that is sort of also layering like a classic software ecosystem. So we have the very early guys like an OpenAI. They have to integrate vertically just because if there's no infrastructure, you have to do everything yourself. But then with some of the newer companies, we're seeing a split where one company trains a model, like Meta with Llama or Stability with Stable Diffusion. Then you have another company hosting them, for example, Replicate or Hugging Face. They're doing great business just taking these models and turning them into very easy to use serverless APIs. And then on top of that you have the actual application, which can sometimes be very thin, which may or may not be a good thing. Thin means you have less defensibility, but it also means you have amazingly fast time to market and it can run very quickly. I think it's starting to look like a classic infrastructure stack where you might have clouds with databases and applications on top.

Chris Kindt

And this might be a question that dates very quickly, and I mean maybe we should call out that it's September 2023, but I guess one thing that we know software CTOs are grappling with is what the right technology strategy is in this space. And I know we've seen many go with the front-runner, OpenAI and using that vertically integrated offering that they provide, but then there are all sorts of other temptations that pull them away from that. So I wonder whether you have a perspective on if we were to try and roll the clock forwards 12 months, will we still see OpenAI with a level of dominance and it being the default option for many SaaS businesses or do you expect that alternative part of the ecosystem to develop and gain more share?

Guido Appenzeller

Boy, predicting one year ahead in AI, that's... So we'll position these as best guesses here, I think.

Chris Kindt

Absolutely.

Guido Appenzeller

So first let's go through what we have today. I mean, we have a range of closed source model providers. Let's stick to LLMs for a second because there are many categories of models today, but LLMs are the more mature model type. So we have OpenAI, we have Anthropic with Claude, we have Cohere, Inflection, a longer list, and we have specialty models like Character.AI for example. They built a model specifically for chat, so let's leave those separate and focus on more mainstream model-as-a-service type companies. So we have the closed source folks, but then we also now have open source alternatives. Meta has Llama, which is an amazing model. Llama 2 is, I'd say, roughly at the level of a GPT-3.5, which is very impressive, and you can just take that and run it in your own data center or go to somebody else and have them host it for you. And there are other open source models as well.

So in these markets, if we compare that to, say, open source operating systems or databases, typically as the market matures for one use case, you see the field slim down a bit. It concentrates towards the market leader. So if you are OpenAI on the closed source side or Llama 2 on the open source side, I'd be pretty optimistic that they'll still be around. Usually if you're the number one and can keep that spot, more gravitates towards you. I think if you're not the number one and you don't have a clearly defined niche, life gets harder. Take being the number two closed source operating system. I'm not sure who that would be, like server operating system, the BSD probably or something like... No, that's open source, hold on, but they're the number one closed sources. But basically they have a little bit of a positioning problem. My best guess from what we see today is that I would expect concentration towards the top in each of these categories, open source and closed source.

Chris Kindt

Got it. One other topic, and maybe we see more of this in Europe, although there's a discussion that's happening everywhere, is around regulation and how the regulators will respond. And also, is it going to be multiple different regulatory responses or will there be some kind of convergence as well? What's your take on what you're seeing develop in the market at this stage?

Guido Appenzeller

Yeah, it's interesting. Regulating tech markets very early tends to not go well. First of all, something that's developing as quickly as AI is incredibly hard to regulate, but then also the regulatory agencies often are not even at the frontier of what's happening, and being six months too late is fatal in these markets. You're just too far behind the curve. My impression is... Let's first ask the question, why do we need regulation here? I think if you ask people, there are actually lots of different reasons that get pointed out. There's this sort of, "Oh, AI will take over the world and kill us all." And to me that's just sort of a category error. It's like saying your toaster will kill us all. This is a tool. You input something, you get something out, but there's no living thing there.

I think that people are projecting something into these models that just doesn't exist in that sense, and maybe that'll change at some point in the future. It's just very hard to reason about something that doesn't exist yet, and then think about the threat model there. So that just doesn't make a lot of sense to me. There's a question of regulating it against biases, and it's not clear to me we actually need any new regulation for that. I think a lot of the tools today are actually quite effective. There are similar concerns about intellectual property, and for those categories, I think if we focus on regulating the actual applications that use these models as opposed to the models themselves, we'd probably end up in a much, much better spot. Because with a lot of the software today, if you have certain types of software for certain applications, you may not have certain biases. If you are creating certain sets of data or calling in certain ways, you have to respect copyright.

AI doesn't fundamentally change that. We may have to tweak it a little bit, but I think directionally... And this comes from discussions I've had in the United States with even some senior legal folks on the copyright holder side: they feel like we may have enough tools already to enforce what we need to enforce there. So right now, I think trying to regulate this is doing a lot more harm than good. I mean, I think the EU regulation, from what I've seen, and I haven't studied it in depth, seems very misguided. I think the United States, with a more voluntary regime, seems to be a little bit better. I recently had a discussion with the CTO of NASA, and his thoughts on regulation were that in the early 2000s, when United States private space companies were taking off, like SpaceX and various others, one thing... Because they looked at it and decided very consciously that they don't want to immediately regulate it because it's still too unclear how this will evolve.

So they decided to have a learning period, which actually I believe just ended, where they basically said during this learning period, we're not going to regulate. We're just going to do one-on-one decisions but not try to pass anything blanket, and just approve individual flights. And then at the end of it, we think we have enough data that we can find good regulation. I think something like that might work here as well, because something this early... I think if you put the smartest people of AI in one room, they still couldn't find good regulation today.

Chris Kindt

It's interesting how hard it is in this conversation to look even 12 months ahead, where normally we have conversations where we're trying to look many more years ahead rather than just trying to get to grips with what the next 12 months might look like. But you are describing a picture where the ecosystem around AI is still responding, including the regulator. The other response is, and you called it out, we had that Renaissance mindset around making knowledge and data broadly and publicly available. Do you see the potential for a move away from that, for people to close off that knowledge sharing more widely and publicly in response to all these models training on that data? Is that something that you see could be more of a trend in the months and quarters or even years ahead?

Guido Appenzeller

I think to some degree that has happened. If you look at it, up to GPT-3, OpenAI published all the details around every model, exactly how it was trained. Then they stopped because suddenly vast amounts of money were involved, and it became a much more competitive industry. That said, I think while the companies are becoming more commercially focused, which I think is fair, the training data itself is still mostly public. If you look at what these models are trained on, most of the data is available on the internet. You can just download it. You need a lot of hard drive space, and the data at the end of the day is... I think the quality of a model, to the extent we can tell, depends on two things. It's the quality of the data that goes in. Then there's still some voodoo in training and tuning hyper-parameters and really figuring out the training regime. But the data sources for most of these top models actually are fairly similar. Some of them may have proprietary data. It doesn't feel to me like right now that is really what differentiates these models.

Chris Kindt

Okay. So there isn't something where you say that there are a handful of key super high impact data sources, be it a Stack Overflow, be it a Reddit or other types of data sources. Your point is that a lot of the value that the models get through training comes from those broadly available data sets that can't just be locked down by a handful of key data owners.

Guido Appenzeller

That's right. I mean, I love Reddit, but when it comes to high quality data sources, Wikipedia ranks a little ahead of [inaudible 00:24:41] Reddit.

Chris Kindt

Are there any other kinds of disruption risks that you see on the horizon here that people should be mindful of? And I guess one common news story that's been in the press a bit more lately is, again, around Taiwan and global chip supply. Is that something that we should have on our radars, or are there any other risks that you'd add to that list as well?

Guido Appenzeller

There's probably a couple. The chip situation at the moment is already very constrained, even taking out geopolitical risk. We currently have an acute shortage of GPUs, and it's not entirely clear, but it looks like we might actually run into not only capacity limits of Nvidia building enough cards, but also capacity limits of TSMC, Taiwan Semiconductor, to build enough of these chips, because the three nanometer nodes are getting constrained and sold out at this point, I believe for at least 2024, possibly 2025. And that means the number of chips we can build is limited, because building new fab capacity takes a very long time. Building a fab costs many billions, and it takes many years.

So we are looking... And these models are ridiculously computationally intensive. Training these models takes 10 to the 23 floating point operations. That's an absolutely crazy number. Even inference, per word of output, takes basically twice as many operations as you have parameters in the model. So that's billions. Billions of floating point operations per word that is output. So you multiply that with the number of people in the world that want to use this and you need a staggering number of these chips. I think that there's a real shortage. It's probably going to continue, at least for a little bit. Yes, any kind of geopolitical hiccups would potentially make this much worse. I think at this point people understand that, and there are a lot of efforts underway to beef up chip capacity and distribute it more widely around the world, including Europe. There's a number of initiatives underway. So I think we're slowly getting a handle on it, but these are very slow processes. Building fabs takes a lot of time.
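The inference rule of thumb above, roughly two floating point operations per parameter for each word of output, can be turned into a quick estimate. In this sketch the 70 billion parameter model size, the number of users and the words generated per user are all hypothetical inputs chosen for illustration; only the factor of two and the 10 to the 23 training figure come from the conversation.

```python
FLOPS_PER_PARAM_PER_WORD = 2   # rule of thumb quoted in the conversation
TRAINING_FLOPS = 1e23          # order of magnitude quoted in the conversation


def inference_flops_per_word(params: float) -> float:
    """Approximate floating point operations to generate one word of output."""
    return FLOPS_PER_PARAM_PER_WORD * params


def daily_inference_flops(params: float, users: float, words_per_user: float) -> float:
    """Approximate total inference compute per day for a user population."""
    return inference_flops_per_word(params) * users * words_per_user


if __name__ == "__main__":
    params = 70e9            # HYPOTHETICAL model size
    users = 100e6            # HYPOTHETICAL number of daily users
    words_per_user = 1000    # HYPOTHETICAL words generated per user per day

    per_word = inference_flops_per_word(params)
    per_day = daily_inference_flops(params, users, words_per_user)
    print(f"Per word: ~{per_word:.1e} FLOPs")
    print(f"Per day:  ~{per_day:.1e} FLOPs")
    print(f"Days of serving to match the training budget: ~{TRAINING_FLOPS / per_day:.1f}")
```

With inputs like these, the compute spent serving users passes the one-off training cost within days, which is the sense in which demand for chips is driven by usage and not only by training.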

Chris Kindt

And just out of curiosity, how much lag time is there in the system? So if things shut down, I guess, tonight, how long would it take for us to then see the repercussions in terms of what kind of compute power and access we have to AI? Would the lead times to impact be quite short, or actually are there enough steps in the supply chain here that it'll take a while for us to feel the effects?

Guido Appenzeller

No, we'll feel the effects quickly and it will take us a long time to catch up. So building a new fab is on the order of three to five years, depending on how complex the process is. And these are very complex ones that we're looking at. It takes a vast amount of money. It's also a complex international supply chain where certain parts only come from Europe and other parts only come from Asia. So you have to get all these pieces lined up correctly. All that said, I think we'll eventually sort out that bit. I think at the moment the biggest problem we have is just not enough capacity. This exponential boom, where GPUs went from being this niche product for gamers or maybe a couple of AI researchers to now every software application needing this as a foundational technology to run on, is a huge step.

Chris Kindt

In many markets, an economist might tell you that if you're in a market where you have supply capacity increasingly constrained, yet demand keeps booming, that a very natural response will be a very steep step up in price.

Guido Appenzeller

I think we're seeing that today.

Chris Kindt

We are. But just again, where do you think that might go? Because at the moment we are still reveling in the dramatic cost and productivity improvements. If we're looking at how quickly we can access certain images that would otherwise have taken a long time for a specialist video editor or artist to create, that gap is still tremendous. But how quickly might that kind of pricing picture move on us?

Guido Appenzeller

I think we're still early enough in the cycle. We're seeing enough technological advancement that I would expect things to get continuously cheaper. It's just a question of how steep that drop is. So to give an example, when we started doing large language models last year, the efficiency and performance that we had was actually fairly low in many cases. If you naively run a transformer on a graphics card for inference, a large model, you're probably going to be in the single digit percentage of GPU utilization. And people ran this with 32 bits, but then people figured out that instead of 32 bits, we can go down to 16-bit precision. So we can make the individual floating point numbers shorter, which allows you to basically do twice as many operations in one cycle, and then they moved to eight bits. So that gives you a factor of four.

Then people figured out techniques like flash attention or sparse networks, a variety of tricks to basically speed this up further. So today, between an unoptimized version and an optimized large language model, we probably have more than a factor of 10 in performance difference. And probably that journey isn't quite over yet. If you can go from eight bit to four bit and then to two bits... I'm not quite sure if two bit works, but four bit probably works for some applications. We have to figure out exactly how we need to normalize. There's a learning curve there, but I think eventually we might get there. We're getting better at understanding how to make models smaller while keeping the same performance, either by breaking them up into multiple smaller models and routing cleverly or just by over-training them.
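As a rough illustration of why fewer bits cost accuracy, and why two bits is doubtful, here is a minimal sketch of the simplest form of quantization: scale the weights so the largest one maps to the biggest representable integer, round, and scale back. This is a toy symmetric scheme with made-up weights, assumed here for illustration only; production methods are considerably more elaborate.

```python
def quantize(weights, bits):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


def max_abs_error(weights, bits):
    """Worst-case rounding error introduced at a given bit width."""
    quantized, scale = quantize(weights, bits)
    restored = dequantize(quantized, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


if __name__ == "__main__":
    weights = [0.42, -1.0, 0.27, 0.1]     # made-up example values
    for bits in (8, 4, 2):
        print(f"{bits}-bit: max error {max_abs_error(weights, bits):.4f}")
```

Each step down in bit width shrinks storage and speeds up arithmetic but leaves fewer levels to represent each weight, so the rounding error grows; at two bits only three values remain in this scheme.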

So in the early days of databases or so, we saw, even on the same hardware, continuous improvements year over year just because people understood the technology better. I think the same thing will happen with AI. So I don't see the cost going up, and the cost today is frankly very low. An image generation is about a hundredth of a cent, right? Before I go back to having a human edit that in Photoshop-

Chris Kindt

It's a long way to go.

Guido Appenzeller

Yeah, it is a long way to go. The magic here is that we created something that can in many cases reduce cost by a factor of a hundred thousand or so, and I think we'll never go back. It'll never be expensive enough for us to go back.

Chris Kindt

So your point is that on the one hand there is this tremendous amount of headroom and there's a real step change that we are seeing in terms of productivity. And then on top of that, just in the underlying technology, there's just so many efficiency improvements that we're still working through that that is giving us a lot of extra headroom in terms of further improvement.

Guido Appenzeller

Exactly right.
