Google’s True Moonshot

2023-12-18 | Ben Thompson | Stratechery


When I first went independent with Stratechery, I had a plan to make money on the side with speaking, consulting, etc.; what made me pull the plug on the latter was my last company speaking gig, with Google in November 2015 (I have always disclosed this on my About page). It didn’t seem tenable for me to have any sort of conflict of interest with companies I was covering, and the benefit of learning more about the companies I covered — the justification I told myself for taking the engagement — was outweighed by the inherent limitations that came from non-public data. And so, since late 2015, my business model has been fully aligned to my nature: fully independent, with access to the same information as everyone else.1

I bring this up for three reasons that I will get to over the course of this Article. The first has to do with titles: it was at that talk that a Google employee asked me what I thought of invoking the then-unannounced Google Assistant by saying “OK Google”. “OK Google” was definitely a different approach from Apple and Amazon’s “Siri” and “Alexa”, respectively, and I liked it: instead of pretending that the assistant was the dumbest human you have ever talked to, why not portray it as the smartest robot, leaning on the brand name that Google had built over time?

“OK Google” was, in practice, not as compelling as I hoped. It was better than Siri or Alexa, but it had all of the same limitations that were inherent to the natural language processing approach: you had to get the incantations right to get the best results, and the capabilities and responses were ultimately more deterministic than you might have hoped. That, though, wasn’t necessarily a problem for the brand: Google search is, at its core, still about providing the right incantations to get the set of results you are hoping for; Google Assistant, like Search, excelled in more mundane but critical attributes like speed and accuracy, if not personality and creativity.

What was different from search is that an Assistant needed to provide one answer, not a list of possible answers. This, though, was very much in keeping with Google’s fundamental nature; I once wrote in a Stratechery Article:

An assistant has to be far more proactive than, for example, a search results page; it’s not enough to present possible answers: rather, an assistant needs to give the right answer.

This is a welcome shift for Google the technology; from the beginning the search engine has included an “I’m Feeling Lucky” button, so confident was Google founder Larry Page that the search engine could deliver you the exact result you wanted, and while yesterday’s Google Assistant demos were canned, the results, particularly when it came to contextual awareness, were far more impressive than the other assistants on the market. More broadly, few dispute that Google is a clear leader when it comes to the artificial intelligence and machine learning that underlie their assistant.

That paragraph was from Google and the Limits of Strategy, where I first laid out some of the fundamental issues that have, over the last year, come into much sharper focus. On one hand, Google had the data, infrastructure, and customer touch points to win the “Assistant” competition; that remains the case today when it comes to generative AI, which promises the sort of experience I always hoped for from “OK Google.” On the other hand, “I’m feeling lucky” may have been core to Google’s nature, but it was counter to their business model; I continued in that Article:

A business, though, is about more than technology, and Google has two significant shortcomings when it comes to assistants in particular. First, as I explained after this year’s Google I/O, the company has a go-to-market gap: assistants are only useful if they are available, which in the case of hundreds of millions of iOS users means downloading and using a separate app (or building the sort of experience that, like Facebook, users will willingly spend extensive amounts of time in).

Secondly, though, Google has a business-model problem: the “I’m Feeling Lucky Button” guaranteed that the search in question would not make Google any money. After all, if a user doesn’t have to choose from search results, said user also doesn’t have the opportunity to click an ad, thus choosing the winner of the competition Google created between its advertisers for user attention. Google Assistant has the exact same problem: where do the ads go?

It is now eight years on from that talk, and seven years on from the launch of Google Assistant, but all of the old questions are as pertinent as ever.

Google’s Horizontal Webs

My first point brings me to the second reason I’m reminded of that Google talk: my presentation was entitled “The Opportunity — and the Enemy.” The opportunity was mobile, the best market the tech industry had ever seen; the enemy was Google itself, which even then was still under-investing in its iOS apps.

In the presentation I highlighted the fact that Google’s apps still didn’t support Force Touch, which Apple had introduced to iOS over a year earlier; to me this reflected the strategic mistake the company made in prioritizing Google Maps on Android, which culminated in Apple making its own mapping service. My point was one I had been making on Stratechery from the beginning: Google was a services company, which meant their optimal strategy was to serve all devices; by favoring Android they were letting the tail wag the dog.

Eight years on, and it’s clear I wasn’t the only one who saw the Maps fiasco as a disaster to be learned from: one of the most interesting revelations from the ongoing DOJ antitrust case against Google was reported by Bloomberg:

Two years after Apple Inc. dropped Google Maps as its default service on iPhones in favor of its own app, Google had regained only 40% of the mobile traffic it used to have on its mapping service, a Google executive testified in the antitrust trial against the Alphabet Inc. company. Michael Roszak, Google’s vice president for finance, said Tuesday that the company used the Apple Maps switch as “a data point” when modeling what might happen if the iPhone maker replaced Google’s search engine as the default on Apple’s Safari browser.

It’s a powerful data point, and I think the key to understanding what you might call the Google Aggregator Paradox: if Google wins by being better, then why does it fight so hard for defaults, both for search and, in the case of Android, the Play Store? The answer, I think, is that it is best to not even take the chance of alternative defaults being good enough. This is made easier given the structure of these deals, which are revenue shares, not payments; this does show up on Google’s income statement as Traffic Acquisition Costs (TAC), but from a cash flow perspective it is foregone zero marginal cost revenue. There is no pain of payment, just somewhat lower profitability on zero marginal cost searches.
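A toy calculation makes the accounting concrete. The roughly 36% Safari revenue-share rate was reported in trial testimony; the gross revenue figure below is invented purely for illustration.

    # Toy illustration of "no pain of payment": a revenue share is
    # foregone revenue, not a bill. The ~36% rate was reported in trial
    # testimony; the gross figure here is made up.
    gross_safari_revenue = 20_000_000_000  # hypothetical annual ad revenue
    apple_share = 0.36                     # reported revenue-share rate

    tac = gross_safari_revenue * apple_share       # booked as TAC
    net_to_google = gross_safari_revenue - tac

    print(f"TAC to Apple:  ${tac / 1e9:.1f}B")             # $7.2B
    print(f"Net to Google: ${net_to_google / 1e9:.1f}B")   # $12.8B
    # Because each marginal search costs ~nothing to serve, Google never
    # writes a check against money it has not already made; it just
    # keeps a smaller slice.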

The bigger cost is increasingly legal: the decision in the DOJ case won’t come down until next year, and Google may very well win; it’s hard to argue that the company ought not be able to bid on Apple’s default search placement if its competitors can (if anything the case demonstrates Apple’s power).

That’s not Google’s only legal challenge, though: last week the company lost another antitrust case, this time to Epic. I explained why the company lost — while Apple won — in last Tuesday’s Update:

That last point may seem odd in light of Apple’s victory, but again, Apple was offering an integrated product that it fully controlled and customers were fully aware of, and is thus, under U.S. antitrust law, free to set the price of entry however it chooses. Google, on the other hand, “entered into one or more agreements that unreasonably restrained trade” — that quote is from the jury instructions, and is taken directly from the Sherman Act — by which the jurors mean basically all of them: the Google Play Developer Distribution Agreement, investment agreements under the Games Velocity Program (i.e. Project Hug), and Android’s mobile application distribution agreement and revenue share agreements with OEMs, were all ruled illegal.

This goes back to the point I made above: Google’s fundamental legal challenge with Android is that it sought to have its cake and eat it too: it wanted all of the shine of open source and all of the reach and network effects of being a horizontal operating system provider and all of the control and profits of Apple, but the only way to do that was to pretty clearly (in my opinion) violate antitrust law.

Google’s Android strategy was, without question, brilliant, particularly when you realize that the ultimate goal was to protect search. By making it “open source”, Google got all of the phone carriers desperate for an iOS alternative on board, ensuring that hated rival Microsoft was not the alternative to Apple as it had been on PCs; a modular approach, though, is inherently more fragmented — and Google didn’t just want an alternative to Apple, they wanted to beat them, particularly in the early days of the smartphone wars — so the company spun a web of contracts and incentives to ensure that Android was only really usable with Google’s services. For this the company was rightly found guilty of antitrust violations in the EU, and now, for similar reasons, in the U.S.

The challenge for Google is that the smartphone market has a lot more friction than search: the company needs to coordinate both OEMs and developers; when it came to search the company could simply take advantage of the openness of the web. This resulted in tension between Google’s nature — being the one-stop shop for information — and the business model of being a horizontal app platform and operating system provider. It’s not dissimilar to the tension the company faces with its Assistant, and in the future with Generative AI: the company wants to simply give you the answer, but how to do that while still making money?

Infrastructure, Data, and Ecosystems

The third reason I remember that weekend in 2015 is it was the same month that Google open-sourced TensorFlow, its machine-learning framework. I thought it was a great move, and wrote in TensorFlow and Monetizing Intellectual Property:

I’m hardly qualified to judge the technical worth of TensorFlow, but I feel pretty safe in assuming that it is excellent and likely far beyond what any other company could produce. Machine learning, though, is about a whole lot more than a software system: specifically, it’s about a whole lot of data, and an infrastructure that can process that data. And, unsurprisingly, those are two areas where Google has a dominant position.

Indeed, as good as TensorFlow might be, I bet it’s the weakest of these three pieces Google needs to truly apply machine learning to all its various businesses, both those of today and those of the future. Why not, then, leverage the collective knowledge of machine learning experts all over the world to make TensorFlow better? Why not make a move to ensure the machine learning experts of the future grow up with TensorFlow as the default? And why not ensure that the industry’s default machine learning system utilizes standards set in place by Google itself, with a design already suited for Google’s infrastructure?

After all, contra Gates’ 2005 claim, it turns out the value of pure intellectual property is not derived from government-enforced exclusivity, but rather from the complementary pieces that surround that intellectual property which are far more difficult to replicate. Google is betting that its lead in both data and infrastructure are significant and growing, and that’s a far better bet in my mind than an all-too-often futile attempt to derive value from an asset that by its very nature can be replicated endlessly.

In fact, it turned out that TensorFlow was not so excellent — that link I used to support my position in the above excerpt now 404s — and it has been surpassed by Meta’s PyTorch in particular; at Google Cloud Next the company announced a partnership with Nvidia to build out OpenXLA as a compiler of sorts to ensure that output from TensorFlow, JAX, and PyTorch can run on any hardware. This matters for Google because those infrastructure advantages very much exist; the more important “Tensor” product for Google is its Tensor Processing Unit series of chips, the existence of which makes Google uniquely able to scale beyond whatever allocation it can get of Nvidia GPUs.
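The portability that OpenXLA promises is easiest to see from the JAX side, one of the frontends just mentioned. Here is a minimal sketch (the function and shapes are my own, purely illustrative): the same Python function is traced and compiled by XLA for whatever backend happens to be attached, whether CPU, GPU, or TPU, with no changes to the model code.

    import jax
    import jax.numpy as jnp

    @jax.jit  # XLA traces and compiles this for the attached backend
    def predict(params, x):
        w, b = params
        return jnp.tanh(x @ w + b)

    key = jax.random.PRNGKey(0)
    w = jax.random.normal(key, (4, 2))   # toy weights
    b = jnp.zeros(2)
    x = jnp.ones((3, 4))                 # toy batch

    print(jax.devices())        # e.g. CPU, GPU, or TPU devices
    print(predict((w, b), x))   # the identical call runs on any of them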

The importance of TPUs was demonstrated with the announcement of Gemini, Google’s latest AI model; the company claims the “Ultra” variant, which it hasn’t yet released, is better than GPT-4. What is notable is that Gemini was trained on and will run inference on TPUs. While there are some questions about the ultimate scalability of TPUs, for now Google is the best positioned to both train and, more importantly, serve generative AI in a cost-efficient way.

Then there is data: a recent report in The Information claims that Gemini relies heavily on data from YouTube, and that is not the only proprietary data Google has access to: free Gmail and Google Docs are another massive resource, although it is unclear to what extent Google is using that data, or if it is, for what. At a minimum there is little question that Google has the most accessible repository of Internet data going back a quarter of a century to when Larry Page and Sergey Brin first started crawling the open web from their dorm room.

And so we are back where we started: Google has incredible amounts of data and the best infrastructure, but once again, an unsteady relationship with the broader development community.

Gemini and Seamless AI

The part of the Gemini announcement that drew the most attention did not have anything to do with infrastructure or data: what everyone ended up talking about was the company’s Gemini demo video, and the fact that it wasn’t representative of Gemini’s actual capabilities.

Parmy Olson for Bloomberg Opinion was the first to highlight the problem:

In reality, the demo also wasn’t carried out in real time or in voice. When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text,” and they pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out human-made prompts they’d made to Gemini, and showing them still images. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it.

This was obviously a misstep, and a bizarre one at that: as I noted in an Update, Google, given its long-term advantages in this space, would have been much better served by being transparent, particularly since it suddenly finds itself with a trustworthiness advantage relative to Microsoft and OpenAI. The goal for the company should be demonstrating competitiveness and competence; a faked demo did the opposite.

And yet, I can understand how the demo came to be: it reaches for the holy grail of Assistants, an entity with which you can conduct a free-flowing conversation, without the friction of needing to invoke the right incantations or type and read big blocks of text. If Gemini Ultra really is better than GPT-4, or even roughly competitive, then I believe this capability is close. After all, I got a taste of it with GPT-4 and its voice capabilities; from AI, Hardware, and Virtual Reality:

The first AI announcement of the week was literally AI that can talk: OpenAI announced that you can now converse with ChatGPT, and I found the experience profound.

You have obviously been able to chat with ChatGPT via text for many months now; what I only truly appreciated after talking with ChatGPT, though, was just how much work it was to type out questions and read answers. There was, in other words, a human constraint in our conversations that made it feel like I was using a tool; small wonder that the vast majority of my interaction with ChatGPT has been to do some sort of research, or try to remember something on the edge of my memory, too fuzzy to type a clear search term into Google.

Simply talking, though, removed that barrier: I quickly found myself having philosophical discussions including, for example, the nature of virtual reality. It was the discussion itself that provided a clue: virtual reality feels real, but something can only feel real if human constraints are no longer apparent. In the case of conversation, there is no effort required to talk to another human in person, or on the phone; to talk to them via chat is certainly convenient, but there is a much more tangible separation. So it is with ChatGPT.

The problem is that this experience requires a pretty significant suspension of disbelief, because there is too much friction. You have to open the OpenAI app, then you have to set it to voice mode, then you have to wait for it to connect, then every question and answer contains a bit too much lag, and the answers start sounding like blocks of text instead of a conversation. Notice, though, that Google is much better placed than OpenAI to solve all of these challenges:

  • Google sells its own phones, which could be configured to have a conversation UI by default (or with Google’s Pixel Buds). This removes the friction of opening an app and setting a mode. Google also has a fleet of home devices already designed for voice interaction.
  • Google has massive amounts of infrastructure all over the globe, with the lowest latency and fastest response; a rough latency budget sketched after this list shows why that matters. This undergirds search today, but it could undergird a new generative AI assistant tomorrow.
  • Google has access to gobs of data specifically tied to human vocal communication, thanks to YouTube in particular.
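To see why latency is the binding constraint, consider a rough budget for a single conversational turn. Every number below is an illustrative assumption, not a figure from Google or OpenAI; the point is simply that network round-trips and model serving dominate the total, which is exactly where owning global infrastructure helps.

    # A rough, illustrative latency budget for one turn of a voice
    # conversation. Every number is an assumption made for this sketch.
    budget_ms = {
        "audio capture + wake": 100,
        "speech-to-text": 150,
        "model inference (first token)": 300,
        "text-to-speech (first audio)": 150,
        "network round-trips": 100,
    }

    total = sum(budget_ms.values())
    print(f"time to first spoken word: ~{total} ms")  # ~800 ms
    # Much past a second, the exchange stops feeling like conversation;
    # serving and round-trip times are the levers an integrated
    # provider can actually pull.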

In short, the Gemini demo may have been faked, but Google is by far the company best positioned to make it real.

Pixie

There was one other interesting tidbit in The Information article (emphasis mine):

Over the next few months, Google will have to show it can integrate the AI models it groups under the Gemini banner into its products, without cannibalizing existing businesses such as search. It has already put a less advanced version of Gemini into Bard, the chatbot it created to compete with ChatGPT, which has so far seen limited uptake. In the future, it plans to use Gemini across nearly its entire line of products, from its search engine to its productivity applications and an AI assistant called Pixie that will be exclusive to its Pixel devices, two people familiar with the matter said. Products could also include wearable devices, such as glasses that could make use of the AI’s ability to recognize the objects a wearer is seeing, according to a person with knowledge of internal discussions. The device could then advise them, say, on how to use a tool, solve a math problem or play a musical instrument.

The details of Pixie, such as they were, came at the very end:

The rollout of Pixie, an AI assistant exclusively for Pixel devices, could boost Google’s hardware business at a time when tech companies are racing to integrate their hardware with new AI capabilities. Pixie will use the information on a customer’s phone — including data from Google products like Maps and Gmail — to evolve into a far more personalized version of the Google Assistant, according to one of the people with knowledge of the project. The feature could launch as soon as next year with the Pixel 9 and the 9 Pro, this person said.

That Google is readying a super-charged version of the Google Assistant is hardly a surprise; what is notable is the reporting that it will be exclusive to Pixel devices. This is counter to Gemini itself: the Gemini Nano model, which is designed to run on smartphones, will be available to all Android devices with neural processing units like Google’s Tensor G3. That is very much in line with the post-Maps Google: services are most valuable when they are available everywhere, and Pixel has a tiny amount of marketshare.

That, by extension, makes me think that the “Pixie exclusive to Pixel” report is mistaken, particularly since I’ve been taken in by this sort of thing before. That Google Assistant piece I quote above — Google and the Limits of Strategy — interpreted the launch of Google Assistant on Pixel devices as evidence that Google was trying to differentiate its own hardware:

Today’s world, though, is not one of (somewhat) standards-based browsers that treat every web page the same, creating the conditions for Google’s superior technology to become the door to the Internet; it is one of closed ecosystems centered around hardware or social networks, and having failed at the latter, Google is having a go at the former. To put it more generously, Google has adopted Alan Kay’s maxim that “People who are really serious about software should make their own hardware.” To that end the company introduced multiple hardware devices, including a new phone, the previously-announced Google Home device, new Chromecasts, and a new VR headset. Needless to say, all make it far easier to use Google services than any 3rd-party OEM does, much less Apple’s iPhone.

What is even more interesting is that Google has also introduced a new business model: the Pixel phone starts at $649, the same as an iPhone, and while it will take time for Google to achieve the level of scale and expertise to match Apple’s profit margins, the fact there is unquestionably a big margin built-in is a profound new direction for the company.

The most fascinating point of all, though, is how Google intends to sell the Pixel: the Google Assistant is, at least for now, exclusive to the first true Google phone, delivering a differentiated experience that, at least theoretically, justifies that margin. It is a strategy that certainly sounds familiar, raising the question of whether this is a replay of the turn-by-turn navigation disaster. Is Google forgetting that they are a horizontal company, one whose business model is designed to maximize reach, not limit it?

My argument was that Google was in fact being logical, for the business model reasons I articulated both in that Article and at the beginning of this year in AI and the Big Five: simply giving the user the right answer threatened the company’s core business model, which meant it made sense to start diversifying into new ones. And then, just a few months later, Google Assistant was available to other Android device makers. It was probably the right decision, for the same reason that the company should have never diminished its iOS maps product in favor of Android.

And yet, all of the reasoning I laid out for making the Google Assistant a differentiator still holds: AI is a threat to Search for all of the same reasons I laid out in 2016, and Google is uniquely positioned to create the best Assistant. The big potential difference with Pixie is that it might actually be good, and a far better differentiator than the Google Assistant. The reason, remember, is not just about Gemini versus GPT-4: it’s because Google actually sells hardware, and has the infrastructure and data to back it up.

Google’s True Moonshot

Google’s collection of moonshots — from Waymo to Google Fiber to Nest to Project Wing to Verily to Project Loon (and the list goes on) — have mostly been science projects that served primarily to divert profits from Google Search away from shareholders. Waymo is probably the most interesting, but even if it succeeds, it is ultimately a car service rather far afield from Google’s mission statement “to organize the world’s information and make it universally accessible and useful.”

What, though, if the mission statement were the moonshot all along? What if “I’m Feeling Lucky” were not a whimsical button on a spartan home page, but the default way of interacting with all of the world’s information? What if an AI Assistant were so good, and so natural, that anyone with seamless access to it simply used it all the time, without thought?

That, needless to say, is probably the only thing that truly scares Apple. Yes, Android has its advantages relative to iOS, but they aren’t particularly meaningful to most people, and even for those who care — like me — they are not large enough to give up on iOS’s overall superior user experience. The only thing that drives meaningful shifts in platform marketshare is a paradigm shift, and while I doubt the first version of Pixie would be good enough to drive iPhone users to switch, there is at least a path to where it does exactly that.

Of course Pixel would need to win in the Android space first, and that would mean massively more investment by Google in go-to-market activities in particular, from opening stores to subsidizing carriers to ramping up production capacity. It would not be cheap, which is why it’s no surprise that Google hasn’t truly invested to make Pixel a meaningful player in the smartphone space.

The potential payoff, though, is astronomical: a world with Pixie everywhere means a world where Google makes real money from selling hardware, in addition to services for enterprises and schools, and cloud services that leverage Google’s infrastructure to provide the same capabilities to businesses. Moreover, it’s a world where Google is truly integrated: the company already makes the chips, in both its phones and its data centers, it makes the models, and it does it all with the largest collection of data in the world.

This path does away with the messiness of complicated relationships with OEMs and developers and the like, which I think suits the company: Google, at its core, has always been much more like Apple than Microsoft. It wants to control everything, it just needs to do it legally; that the best manifestation of AI is almost certainly dependent on a fully integrated (and thus fully seamless) experience means that the company can both control everything and, if it pulls this gambit off, serve everyone.

The problem is that the risks are massive: Google would not only be risking search revenue, it would also estrange its OEM partners, all while spending astronomical amounts of money. The attempt to be the one AI Assistant that everyone uses — and pays for — is the polar opposite of the conservative approach the company has taken to the Google Aggregator Paradox. Paying for defaults and buying off competitors is the strategy of a company seeking to protect what it has; spending on a bold assault on the most dominant company in tech is to risk it all.

And yet, to simply continue on the current path, folding AI into its current products and selling it via Google Cloud, is a risk of another sort. Google is not going anywhere anytime soon, and Search has a powerful moat in terms of usefulness, defaults, and most critically, user habits; Google Cloud, no matter the scenario, remains an attractive way to monetize Google AI and leverage its infrastructure, and perhaps that will be seen as enough. Where will such a path lead in ten or twenty years, though?

Ultimately, this is a question for leadership, and I thought Daniel Gross’s observation on this point in the recent Stratechery Interview with him and Nat Friedman was insightful:

So to me, yeah, does Google figure out how to master AI in the infrastructure side? Feels pretty obvious, they’ll figure it out, it’s not that hard. The deeper question is, on the much higher margin presumably, consumer angle, do they just cede too much ground to startups, Perplexity or ChatGPT or others? I don’t know what the answer is there and forecasting that answer is a little bit hard because it probably literally depends on three or four people at Google and whether they want to take the risk and do it.

We definitively know that if the founders weren’t in the story — we could not definitively, but forecast with pretty good odds — that it would just run its course and it would gradually lose market share over time and we’d all sail into a world of agents. However, we saw Sergey Brin as an individual contributor on the Gemini paper and we have friends that work on Gemini and they say that’s not a joke, he is involved day-to-day. He has a tremendous amount of influence, power, and control over Google so if he’s staring at that, together with his co-founder, I do think they could overnight kill a lot of startups, really damage ChatGPT, and just build a great product, but that requires a moment of [founder initiative].

It’s possible, it’s just hard to forecast if they will do it or not. In my head, that is the main question that matters in terms of whether Google adds or loses a zero. I think they’ll build the capability, there’s no doubt about it.

I agree. Google could build the AI to win it all. It’s not guaranteed they would succeed, but the opportunity is there if they want to go for it. That is the path that would be in the nature of the Google that conquered the web twenty years ago, the Google that saw advertising as the easiest way to monetize what was an unbridled pursuit of self-contained technological capability.

The question is if that nature has been superseded by one focused on limiting losses and extracting profits; yes, there is still tremendous technological invention, but as Horace Dediu explained on Asymco, that is different than innovation, which means actually making products that move markets. Can Google still do that? Do they want to? Whither Google?


  1. I do still speak at conferences, but last spoke for pay in January 2017.

