It’ll be interesting to see how well companies will be able to monetise large, multi-purpose language and image-generation models.
Companies and investors are spending increasingly huge amounts of money on ML research talent and compute, typically with the hope that investments in this area lead to extremely profitable products. But—even if the resulting products are very useful and transformative—it still seems like a bit of an open question how profitable they’ll be.
Some analysis:[1]
1.
Although huge state-of-the-art models are increasingly costly to create, the marginal cost of generating images and text using these models will tend to be low. Since competition tends to push the price of a service down close to the marginal cost of providing the service, it’ll be hard for any company to charge a lot for the use of their models.
As a result: It could be hard—or simply take a long time—for companies to recoup sufficiently large R&D costs, even if a lot of people end up using their models.
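To make the break-even arithmetic concrete, here’s a toy back-of-the-envelope sketch in Python. Every number in it (the fixed R&D cost, the per-query serving cost, the margin competition allows) is invented purely for illustration, not an estimate for any actual model or company:

    # Toy break-even arithmetic with made-up numbers (purely illustrative,
    # not estimates for any real model or company).
    fixed_rnd_cost = 50_000_000        # hypothetical one-off cost to build the model ($)
    marginal_cost_per_query = 0.002    # hypothetical compute cost to serve one generation ($)
    margin_over_marginal_cost = 0.10   # competition pushes price to ~10% above marginal cost

    price_per_query = marginal_cost_per_query * (1 + margin_over_marginal_cost)
    profit_per_query = price_per_query - marginal_cost_per_query  # $0.0002 per query

    queries_to_break_even = fixed_rnd_cost / profit_per_query
    print(f"{queries_to_break_even:,.0f} queries to recoup R&D")  # 250,000,000,000

With prices squeezed down near marginal cost, even billions of queries barely dent the fixed cost; the same arithmetic looks much friendlier if a firm can sustain prices well above marginal cost.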
2.
Of course, famously, this dynamic applies to most software. But some software services (e.g. Microsoft Office) still manage to charge users fees that are much higher than the cost of running the service.
Some things that can support important, persistent quality differences are:[2]
(a) patents or very-hard-to-learn-or-rediscover trade secrets that prevent competitors from copying valuable features;
(b) network effects that make the service more valuable the more other people are using it (and therefore create serious challenges for new entrants);
(c) steep learning curves or strong dependencies that, for reasons that go beyond network effects, make it very costly for existing users to switch to new software;
(d) customer bases that are small (which limit the value of trying to enter the area) or hard to cater to without lots of specialist knowledge and strong relationships (which raise the cost/difficulty of entering the area);
(e) other extreme difficulties involved in making the software;
(f) customer bases that are highly sensitive to very small differences in quality.
3.
It’s not totally clear to what extent any of these conditions will apply, at least to large language and image-generation models.
Patents in this space currently don’t seem like a huge deal; none of the key things that matter for making large models are patented. (Although maybe that will start to change?) Trade secrets also don’t seem like a huge deal; lots of companies are currently producing pretty similar models using pretty similar methods.
It’s also not clear to me that network effects or steep learning curves are a big deal: for example, if you want to generate illustrations for online articles, it doesn’t currently seem like it’d be too problematic or costly to switch from using DALL-E to using Imagen. It’s also not clear that it matters very much how many other people are using one or the other to generate their images. If you want to generate an illustration for an article, then I think it probably doesn’t really matter what the authors of other articles tend to use. It’s also not clear to me that there will tend to be a lot of downstream dependencies that you need to take into account when switching from one model to another. (Although, again, maybe this will all change a lot over time?) At least big general models will tend to have large customer bases, I think, and their development/marketing/customer-support will not tend to heavily leverage specialist knowledge or relationships.
These models also don’t seem super, super hard to make — it seems like, for any given quality threshold, we should expect multiple companies to be able to reach that threshold within a year of each other. To some extent, it seems, a wealthy tech company can throw money at the problem (compute and salaries for engineering talent) if it wants to create a model that’s close to as good as the best model available. At least beyond a certain performance level, I’m also not sure that most customers will care a ton about very slight differences in quality (although this might be the point I’m most unsure about).
4.
If none of the above conditions end up holding, to a sufficiently large extent, then I suppose the standard thing is you offer the service for free, try to make it at least slightly better than competitors’ services, and make money by showing ads (up to the point where it’s actively annoying to your user) and otherwise using customer data in various ways.
That can definitely work. (E.g. Google’s annual ad revenue is more than $100 billion.) But it also seems like a lot of idiosyncratic factors can influence a company’s ability to extract ad revenue from a service.
This also seems like, in some ways, a funny outcome. When people think about transformative AI, I don’t think they’re normally imagining it being attached to a giant advertising machine.
5.
One weird possible world is a world where the most important AI software is actually very hard to monetize. Although I’d still overall bet against this scenario[3], I think it’s probably worth analyzing.
Here, I think, are some dynamics that could emerge in this world:
(a) AI progress is a bit slower than it would otherwise be, since—after a certain point—companies realise that the financial returns on AI R&D are lower than they hoped. The rapid R&D growth in these areas eventually levels off, even though higher R&D levels could support substantially faster AI progress.[4]
(b) Un-monetized (e.g. academia-associated) models are pretty commonly used, at least as foundation models, since companies don’t have strong incentives to invest in offering superior monetized models.
(c) Governments become bigger players in driving AI progress forward, since companies are investing less in AI R&D than governments would ideally want (from the standpoint of growth, national power and prestige, or scientific progress for its own sake). Governments might step up their own research funding—or take various actions to limit downward pressure on prices.
I’m not widely read here, or an economist, so it’s possible these points are all already appreciated within the community of people thinking about how inter-lab competition to create large models is going to play out. Alternatively, the points might just be wrong.
This list here is mostly inspired by my dim memory of the discussion of software pricing in the book Information Rules.
Companies do seem to have an impressive track record of monetizing seemingly hard-to-monetize things.
Maybe competition also shifts a bit toward goods/services that complement large multi-purpose models, whatever these might be, or toward fine-tuned/specialized models that target more niche customer bases or that are otherwise subject to less-intense downward pressure on pricing.
This is insightful. Some quick responses:
My guess would be that the ability to commercialize these models will strongly hinge on firms’ ability to wrap them up with complementary products that contribute to an ecosystem with network effects, dependencies, evangelism, etc.
I wouldn’t draw too strong conclusions from the fact that the few early attempts to commercialize models like these, notably by OpenAI, haven’t succeeded in creating the preconditions for generating a permanent stream of profits. I’d guess that their business models look less-than-promising on this dimension because (and this is just my impression) they’ve been trying to find product-market fit, and have gone lightly on exploiting the particular fits they’ve found by building platforms to service them.
Instead, better examples of what commercialization looks like are GPT-3-powered companies, like copysmith, which seem a lot more like traditional software businesses, with the usual tactics for locking users in, creating network effects, and encouraging single-homing behaviour.
I expect that companies will have ways to create switching costs for these models that traditional software products don’t have. I’m particularly interested in fine-tuning as a way to lock in users, by enabling models to strongly adapt to context about users’ workloads. More intense versions of this might also exist, such as learning directly from individual customers’ feedback through something like RL. Note that this is actually quite similar to how non-software services create loyalty. (A rough sketch of this lock-in dynamic follows these points.)
I agree that it seems hard to commercialize these models out-of-the-box with something like paid API access, but, given the points above, I expect this to be superseded by better strategies.
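Here is the rough sketch of the fine-tuning lock-in mentioned above. It is purely hypothetical (the Vendor class and its methods are placeholders, not any real provider’s API); the point it tries to make concrete is that the customer’s accumulated adaptation lives on the vendor’s side, so switching providers means starting that adaptation from scratch:

    # Hypothetical sketch of fine-tuning-based lock-in. "Vendor" stands in for a
    # hosted-model provider; nothing here corresponds to a real API.

    class Vendor:
        def __init__(self, name):
            self.name = name
            self.customer_examples = {}  # per-customer adaptation data, held by the vendor

        def fine_tune(self, customer_id, examples):
            # Each batch of workload-specific examples adapts the hosted model
            # a little more to this customer.
            self.customer_examples.setdefault(customer_id, []).extend(examples)

        def quality_for(self, customer_id, base_quality=0.7):
            # Stand-in for output quality: base model quality plus a bonus that
            # grows with the amount of customer-specific adaptation.
            n = len(self.customer_examples.get(customer_id, []))
            return min(1.0, base_quality + 0.01 * n)

    incumbent, challenger = Vendor("A"), Vendor("B")
    for month in range(12):
        incumbent.fine_tune("acme", [f"workload example {month}-{i}" for i in range(5)])

    print(incumbent.quality_for("acme"))   # 1.0 -- after a year of adaptation
    print(challenger.quality_for("acme"))  # 0.7 -- identical base model, starting from zero

The accumulated adaptation is the switching cost: a competitor offering the same base model still starts the customer back at the un-adapted baseline.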
Couldn’t the exact same arguments be made to argue that there would not be successful internet companies, because the fundamental tech is hard to patent, and any website is easy to duplicate? But this just means that instead of monetising the bottom layer of tech (TCP/IP, or whatever), they make their billions from layering needed stuff on top—search, social network, logistics.
Definitely!
(I say above that the dynamic applies to “most software,” but should have said something broader to make it clear that it also applies to any company whose product—basically—is information that it’s close to costless to reproduce/generate. The book Information Rules is really good on this.)
Sometimes the above conditions hold well enough for people to be able to keep charging for software or access to websites. For example, LinkedIn can charge employers to access its specialized search tools, etc., due to network effects.
What otherwise often ends up happening is that something is offered for free, with ads—because there’s some quality difference between products, which is too small for people to be willing to pay to use the better product but large enough for people to be willing to look at sufficiently non-annoying ads to use the better product. (E.g. Google vs. the next-best search engine, for most people.) Sometimes that can still lead to a lot of revenue, other times less so.
Other times companies just stop very seriously trying to directly make money in a certain domain (e.g. online encyclopaedias). Sometimes—as you say—that leads competition to shift to some nearby and complementary domain, where it is more possible to make money.
As initial speculation: It seems decently likely to me (~60%?) that it will be hard for companies making large language/image-generation models to charge significant prices to most of their users. In that scenario, it’s presumably still possible to make money through ads or otherwise by collecting user information.
It’d be interesting, though, if that revenue wasn’t very high—then most of the competition might happen around complementary products/services. I’m not totally clear on what these would be, though.
Ah, you do say that. Serves me right for skimming!
To start, you could have a company for each domain area in which an AI needs to be fine-tuned, marketed, and adapted to meet any regulatory requirements. Writing advertising copy, editing, insurance evaluations, etc.
As for the foundation models themselves, I think training models is too expensive for them to go back to academia, as you suggest. And I think that there are some barriers to getting priced down. Firstly, when you say you need “patents or very-hard-to-learn-or-rediscover trade secrets”, does the cost of training the model not count? It is a huge barrier. There are also difficulties in acquiring AI talent. And future patents seem likely. We’re already seeing a huge shift with AI researchers leaving big tech for startups, to try to capture more of the value of their work, and this shift could go a lot further.
Relevant: “A reminder that OpenAI claims ownership of any image generated by DALL-E2” - https://mobile.twitter.com/mark_riedl/status/1533776806133780481
Related: Imagen replicating DALL-E very well seems like good evidence that there’s healthy competition between big tech companies, which drives down profits.
One thing that might push against this is economies of scope; another is whether data really does become the new oil and grow more relevant over time.
This was excellent!