When Meta first released its open source foundation AI model, LLaMA, I thought it was a shrewd move to gain public trust for a social media giant with a huge trust deficit. However, the more I think about Meta’s open source AI posture from a business strategy angle, the more I realize that the chess game is bigger than just gaining trust via transparency and earning some PR points.

Meta’s end game may be to commoditize all foundation AI models through open source, consequently rendering Google, OpenAI, or any other closed-source AI model makers “moatless”.

There is actually a series of historical examples where open source was used as a defensive business lever to improve a company’s positioning relative to its competitors. Meta (back then, Facebook) was the main player in one of these examples.

History of Open Source as Defense

In 2009, when Facebook’s user growth was hitting escape velocity, its infrastructure team needed to completely redesign every layer of its data center to meet this once-in-a-lifetime scaling challenge. From custom software and servers, to physical racks and power supplies, every component was redesigned. This effort culminated in Facebook’s own data center located in Prineville, Oregon – dubbed the “Tibet of North America” and chosen specifically for its plentiful supply of dry, cool air sitting on a plateau 2,800 feet above sea level. (Here is a great story in Wired Magazine about the Facebook Prineville data center, published in 2011.)

Instead of keeping all this custom engineering and design work a secret, Facebook open sourced the whole thing. This move stood in stark contrast to Google, which at the time kept all its infrastructure technology a secret. Externally, Facebook framed this decision to open source in magnanimous terms – promoting open collaboration in hardware to accelerate innovation and drive efficiency. But the business logic was to commoditize the infrastructure stack, so Facebook wouldn’t be held hostage by vendors who made money from proprietary infrastructure technology and could hinder its torrid growth.

By forcing every vendor that wanted to work with Facebook, from server makers to telecom companies, to be compatible with its open sourced design, Facebook created an open standard that reduced the leverage of proprietary IP in the data center, rendering these companies “moatless” against Facebook. This design seeded what became the Open Compute Project Foundation, which now stewards the open data center standard initiated by Facebook more than a decade ago.

This defensive play allowed Facebook to focus on growing its user base and its data center capability unencumbered, during a time when public cloud was still a novel concept and Facebook was still figuring out its own business model. If Facebook had decided to monetize its infrastructure by offering a cloud service like AWS, this move to open source its data center design would have backfired, because that would have meant turning a defensive posture into an offensive strategy. And Facebook did toy with this idea by acquiring a backend-as-a-service startup called Parse in 2013 to serve as the core of a possible cloud offering. But then Facebook figured out mobile advertising, Parse was shut down in 2017, and the rest is history.

Meta’s Prineville, Oregon data center

Facebook and the Open Compute Project is one of the most illustrative, though lesser-known, examples of the “open source as defense” strategy. Two other examples are better known.

One is Google’s decision to open source Kubernetes in mid-2014. It is a container orchestration infrastructure software layer that gave Google a wedge into AWS’s then total dominance in the cloud. Google switched posture from its “infrastructure is secret” days to open sourcing a core part of how it operates its own infrastructure. Kubernetes eventually became an open standard (like Facebook’s data center design) and seeded the Cloud Native Computing Foundation (like the Open Compute Project Foundation). Its popularity forced AWS to offer Kubernetes-related services, which (somewhat) eroded AWS’s dominance and moat in cloud computing over time.

Another example that most people in tech (or even outside of tech) have heard of is the iOS versus Android story. Android is the open source mobile operating system that came out of Google to defend against the iOS ecosystem’s dominance. While it is debatable how open Android really is these days, it has been open enough to fuel many smartphone brands to compete with Apple and take market share – Samsung, Xiaomi, Oppo, Huawei, and of course Google’s own Pixel.

Bill Gurley, the famed venture capitalist, gave a succinct overview of all three examples of “open source as defense” on a recent episode of Tim Ferriss’s podcast.

Meta’s Open Source Playbook

Since the Open Compute Project, Facebook/Meta has open sourced many pieces of its tech stack that became popular – database technologies (Cassandra, RocksDB, Presto), GraphQL, React, PyTorch, to name a few. This “generous” behavior was quite common among the tech giants over the last decade, many of which picked and chose pieces of technology that were not core to their business models to open source. By doing so, they were able to improve certain layers of their tech stack faster (open source tends to produce faster iterations and more secure software), keep their top engineers happy, and help recruit more engineers, most of whom prefer working at a company with a strong open source ethos.

So when Meta released LLaMA as an open source project, as opposed to a well-packaged chatbot (which it did before with BlenderBot and failed), the decision was not entirely surprising. Also implicit in this decision was perhaps an admission that Meta’s foundation AI model was not as good as OpenAI’s or Google’s. But it was still speculative to frame LLaMA as an “open source as defense” strategic play to erode competitors’ moat…until last week.

Appearing on the Lex Fridman podcast, Mark Zuckerberg directly brought up the Open Compute Project as a reference point when discussing the rationale behind Meta’s AI open source strategy. Here’s the clip:

It’s now clearer than ever that Meta is dusting off the “open source as defense” playbook, with the end game possibly being to become the Android of foundation AI models. Meta may not have the best model, but even having the third or fourth best and making it open source and popular among developers would speed up the erosion of any moat that having the best model may deliver. And if Meta decides to license the next version of LLaMA to permit commercial use (the current version only allows for scientific research use), that erosion will only accelerate. OpenAI seems to already recognize just how shallow its moat is. It is rumored to be building an app-store-like marketplace for models, as a hedge against the eventuality that having the best model won’t mean much.

Of course, just because open source can be a powerful commoditization lever, that doesn’t mean closed source technologies cannot co-exist and still thrive. After all, iOS and AWS are doing fine. OpenAI will do fine, even if it continues to monetize its best models while keeping them closed. But it will have to contend with Meta, a tech giant with a long history of applying open source as part of its competitive business strategy, and one that doesn’t plan to make a dime from foundation AI models.
