Louise Matsakis of Semafor reported an important scoop this week: the White House’s upcoming executive order on AI may require cloud computing platforms to share customer information with regulators. If the scoop turns out to be true, this EO will effectively bring a “know your customer” (KYC) scheme to the realm of AI, turning clouds into banks.
Is this a good approach?
While I’m not generally in favor of (too much) regulation, if we, as a society, have agreed on an intention to regulate (which appears to be the case regarding generative AI), then it’s better to use analogous rules that are already working elsewhere and fit them within the existing industry landscape, rather than come up with something totally new.
So if the goal is to prevent bad actors or foreign adversaries from accessing AI computing power, as training and inference chips become strategically critical resources, then yes, I think applying a KYC-like scheme to the clouds is reasonable and practical. Let’s expand on why.
Regulating Within A Well-Baked Industry
The cloud computing industry is maturing and the landscape is becoming increasingly well-baked. This is not to suggest that competition isn’t still fierce and there won’t be more innovation and growth. But the key players – the hyperscalers like AWS, Azure, GCP, Oracle, and to some extent AliCloud, Huawei Cloud, and Tencent Cloud – are well-established. (See my previous post on the global data center footprint comparison between the American and the Chinese clouds.)
This industry maturity makes regulating certain aspects of generative AI more straightforward, because almost all users and builders of future AI applications will have to access computing resources via one or two of these massive cloud platforms. Sure, there will be the occasional deep-pocketed or compliance-sensitive enterprise that spends the money and human power to buy its own GPUs and build its own AI training infrastructure. But, by and large, companies that want to build or use AI will rent the resources from the cloud.
This is good news for pragmatic, well-intentioned regulators for two reasons.
First, there is only a small universe of hyperscalers that matters. Unlike banking, where there are more than 4,000 banks in the US, 36 of which have assets of over $100 billion, there are literally only a handful of clouds that are relevant in the AI age. We can list them all here: AWS, Azure, GCP, Oracle, IBM, and a few upstarts in the Nvidia orbit, like CoreWeave and Lambda Labs. This is a great setup that makes enforcement practical. (I’m, of course, leaving out the Chinese clouds since they are outside the jurisdiction of US regulators.)
Second, most of these hyperscalers already serve as the IT infrastructure for large banks and other financial services institutions, so they are familiar with the complex set of regulatory requirements that the financial industry must meet. Banks are arguably the most demanding, but also the most lucrative, customers for cloud platforms. On the technical side, banks require the highest level of data accuracy, consistency, and security, since they store people’s money. On the compliance side, they need a huge number of granular capabilities to meet various audit requirements, like KYC. It is a monumental undertaking to get a bank to move to the cloud, but once it does, it stays for the long haul. That’s why AWS touts its work with HSBC and Standard Chartered, Azure does the same with BlackRock and RBC, and GCP highlights its collaboration with Goldman Sachs and Deutsche Bank. Banks are the customers that will bring in more customers from all industries.
Hyperscalers are already good at compliance. They have to be for business reasons. Routing AI-related regulatory concerns through the clouds by tapping into their existing compliance processes, perhaps with some small additions here and there, is the most effective and least cumbersome way to reach regulatory goals. This approach would have prevented embarrassing loopholes, like how Chinese tech companies that are on the US entity list simply rented Nvidia GPUs from different cloud service providers when they were barred from buying those chips.
I would even venture to say that requiring cloud platforms to share customer information with regulators would be good for new AI startups too. With AI being regarded as a strategic capability for national competitiveness, geopolitics is always in the air, and which customer from which country is using which AI service is always under scrutiny. Putting the compliance onus on the hyperscalers or specialized AI clouds would remove a huge burden that few startups have the resources to bear.
As with any regulatory approach, there are drawbacks to regulating AI through the clouds. Here are a couple I can think of.
Compute Threshold Hard to Draw: if the amount of compute a customer uses is what triggers reporting from the cloud provider to the government, drawing the right line can be difficult, if not impossible. As Matsakis noted in her reporting, as the cost of compute to train AI models continues to come down, a compute threshold is too fast-moving a target. In my view, if the regulation’s goal is to preemptively prevent AI threats, especially from foreign actors, then a compute threshold is not the right trigger. In this case, the “who” is the most important factor. If a terrorist organization from the Middle East or a SenseTime (or any other blacklisted Chinese company) runs even a tiny workload using AI chips racked in an AWS data center in the UAE or South Africa, wouldn’t an American regulator want to know?
The cleanest way this regulation would work is a constant cross-checking process between the hyperscalers’ customer list and the Commerce Department’s entity list, Treasury Department’s OFAC list, State Department’s Foreign Terrorist Organizations list, and other similar lists the US government currently maintains. Blacklisted entities aren’t stupid, of course, and are already using subsidiaries and shell companies to obfuscate their identity when trying to access sanctioned computing resources in the cloud. A random, new customer that appears out of nowhere and starts using AI compute resources should trigger an automatic report. Enforcement will require extra vigilance and cooperation between the hyperscalers and the regulators.
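To make the cross-checking idea concrete, here is a minimal sketch in Python, not a description of how any hyperscaler or regulator actually does this. Everything in it is an assumption for illustration: the `Customer` record, the `screen` function, the 30-day "new account" window, and the placeholder list entries, which stand in for the real lists the agencies publish.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical placeholder entries; the real lists are maintained by the
# Commerce, Treasury, and State Departments and are far larger.
WATCHLISTS = {
    "Commerce Entity List": {"example entity co"},
    "Treasury OFAC list": {"example sanctioned trading ltd"},
    "State FTO list": {"example militant group"},
}


@dataclass
class Customer:
    name: str
    signup: date
    uses_ai_compute: bool


def normalize(name: str) -> str:
    """Lowercase and drop punctuation so trivial spelling variants still match."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def screen(customer: Customer, today: date, new_account_days: int = 30) -> list[str]:
    """Return the reasons (possibly none) this customer should be reported."""
    reasons = []
    key = normalize(customer.name)
    # Check the customer name against every government list.
    for list_name, entries in WATCHLISTS.items():
        if key in entries:
            reasons.append(f"listed: {list_name}")
    # Assumed rule: a brand-new account that immediately uses AI compute is flagged.
    age_days = (today - customer.signup).days
    if customer.uses_ai_compute and age_days <= new_account_days:
        reasons.append("new account using AI compute")
    return reasons
```

Exact name matching like this would not catch subsidiaries or shell companies, which is precisely why the "new customer out of nowhere" rule and human follow-up matter more than the list lookup itself.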
The Big Gets Bigger: this may sound counterintuitive, but more AI regulatory requirements placed on the hyperscalers will only make them stronger in the marketplace. These already big players will get bigger. There will be less room for new entrants to disrupt this market, unless they have some special relationship and backing from an existing big player, like the relationship between CoreWeave and Nvidia, the AI kingmaker. It is not surprising that this “KYC the cloud” idea has been pushed by Microsoft and OpenAI; being the regulatory targets in this case benefits them, the incumbents.
This is a classic example of “regulatory capture”, where regulation empowers the incumbents, promotes more rent seeking, reduces competition, and produces net-negative effects for society. I don’t have a good solution for this drawback. It has happened time and time again in the US, given the prominent place that corporate lobbying occupies in the American lawmaking process. Benchmark’s Bill Gurley gave a compelling presentation a couple of weeks ago on this very subject, citing past regulatory capture examples in the telecom and pharmaceutical industries, and the impending AI regulations being pushed by Sam Altman and others.
My view sits somewhere in the middle. I don’t think zero regulation on AI is right. I don’t think lots of regulations that clearly only benefit the incumbents are right either. There are many bone-headed, counterproductive ways to regulate AI. However, requiring some customer reporting transparency from the hyperscalers and treating the clouds like banks is not one of them.
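Louise Matsakis of the US tech outlet Semafor reported an important scoop this week: the White House’s upcoming executive order on AI may require cloud computing platforms to share customer information with regulators. If the report proves true, the effect of this executive order will be to implement a “know your customer” (KYC) type scheme in the realm of AI, regulating cloud platforms like banks.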
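The cloud computing industry is maturing, and the overall landscape is increasingly stable. This is not to say that competition is no longer fierce or that there won’t be more innovation and growth. But the key players – AWS, Azure, GCP, Oracle, and to some extent Chinese vendors like AliCloud, Huawei Cloud, and Tencent Cloud – are firmly established. (See my previous post on the global data center footprint comparison between the American and the Chinese clouds.)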
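Second, most of these hyperscalers already provide IT infrastructure for large banks and other financial services institutions, so they are familiar with the complex regulatory compliance requirements the financial industry must meet. Banks are arguably the most demanding, but also the most lucrative, customers for cloud platforms. On the technical side, banks require extremely high data accuracy, consistency, and security, because what they hold is customers’ money. On the compliance side, they need to retrieve many fine-grained pieces of information efficiently to meet various audit requirements, such as KYC. Moving a bank to the cloud is a huge undertaking, but once it has moved, it is a long-term customer that will not relocate easily. That is why AWS touts its work with HSBC and Standard Chartered, Azure does the same with BlackRock and RBC, and GCP highlights its collaboration with Goldman Sachs and Deutsche Bank. Banks are the customers that attract more customers from all industries.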
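The Big Gets Bigger: this may sound counterintuitive, but placing more AI regulatory requirements on the hyperscalers will only make them stronger in the marketplace. The big players will get bigger. Startups hoping to break into this AI cloud computing market will have less room to do so, unless they have some special relationship with and backing from an existing big player, like the relationship between CoreWeave and Nvidia, the AI kingmaker. Unsurprisingly, this “KYC the cloud” regulatory proposal has been pushed jointly by Microsoft and OpenAI. In this case, being the target of regulation actually benefits the big players.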
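This is a classic example of “regulatory capture”, where regulation empowers large incumbents, promotes more rent seeking, reduces competition, and produces net-negative effects for society. I don’t have a good solution for this drawback either. In the US, this has happened again and again, because corporate lobbying groups hold a highly influential place in the American lawmaking process. Bill Gurley, a partner at the well-known venture firm Benchmark, gave a compelling presentation on this subject a few weeks ago, citing past regulatory capture examples in the telecom and pharmaceutical industries, as well as the impending AI regulation being pushed by Sam Altman and others.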