On the surface, cloud data center locations and women’s right to abortion in America are two topics that could not be further apart. Yet they became intimately intertwined when Proov, a healthcare tech company focused on fertility, decided to move its entire cloud infrastructure to Google Cloud Platform (GCP), solely to maximize user data privacy protection and redundancy in abortion-friendly states.
Last week, Protocol reported on Proov’s decision to go “all in” on GCP; it had previously used both AWS and Azure as its cloud providers. The decision rested on the fact that GCP is the only major cloud platform in America with a data center in Nevada, giving Proov access to local data storage in three abortion-friendly states: California, Oregon, and Nevada. (AWS and Azure are in only two such states.)
To understand a decision like Proov’s and appreciate its larger implications, let’s dig into how cloud storage works and why data centers are built where they are in the first place.
What is Stored in Vegas Stays in Vegas
Even though cloud computing enables a seamless user experience across all types of digital products, anywhere and anytime, the way cloud data storage works behind the scenes is very local: physical distance and location matter a great deal.
Physical distance tends to impact user experience and performance. Location tends to matter for compliance or privacy protection needs. I’ll illustrate with a hypothetical example.
To give a user that “seamless” experience, cloud infrastructure engineers purposely store a copy of that user’s data and records (e.g. shopping history) in the data center closest to that user’s location (often called the “primary” copy), so the user can access an app or website faster. Another copy of the same user data is typically stored in a data center that is not too far away as a “secondary” backup, in case the “primary” data center has technical issues. Oftentimes, a third copy is stored in another data center far (really far) away as a “disaster recovery” backup to the “secondary”, just in case an earthquake, hurricane, or flood destroys the “primary” and the close-by “secondary”. In this architecture with built-in redundancy, this user’s data is always securely stored, somewhere in the world.
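To make the primary, secondary, and disaster recovery idea concrete, here is a minimal Python sketch. It is only an illustration, not how any cloud provider actually places data: the site names, the approximate coordinates, and the `place_replicas` helper are all assumptions of mine, and real placement decisions weigh far more than straight-line distance.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical data center sites with approximate (latitude, longitude).
# These names and coordinates are illustrative assumptions only.
SITES = {
    "las-vegas": (36.17, -115.14),
    "los-angeles": (34.05, -118.24),
    "oregon": (45.60, -121.18),
    "northern-virginia": (39.04, -77.49),
}


def distance_km(a, b):
    """Great-circle distance in km between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def place_replicas(user_location, sites=SITES):
    """Pick the nearest site as primary, the next nearest as secondary,
    and the farthest site as the disaster recovery copy."""
    ranked = sorted(sites, key=lambda name: distance_km(user_location, sites[name]))
    return {
        "primary": ranked[0],
        "secondary": ranked[1],
        "disaster_recovery": ranked[-1],
    }


# A user located near Las Vegas.
placement = place_replicas((36.10, -115.20))
```

With these made-up sites, a user near Las Vegas gets a nearby primary, a secondary one state over, and a disaster recovery copy on the opposite coast, which mirrors the three-copy layout described above.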
Data center location also matters when storing this same user’s data must comply with data governance rules or when the privacy of this data needs to be protected. In the “compliance scenario”, many countries and jurisdictions, e.g. China under its new Personal Information Protection Law, require that the data of citizens from those places be stored in data centers physically located in those places. Thus, if this user happens to be Chinese, the person’s data must be stored in a data center located inside China. In the “privacy scenario”, a company may want to keep this user’s data away from certain unfriendly or adversarial countries or governments to prevent tracking, hacking, or other forms of exploitation and abuse. Thus, if you are an American user of TikTok, your data is (or is supposed to be) stored in data centers located away from the reach of the Chinese government. (As I illustrated in “TikTok's American Credibility Problem”, TikTok has yet to follow through on its privacy promises.)
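Both scenarios boil down to filtering candidate data centers by where they physically sit. The Python sketch below shows that idea under simplifying assumptions: the `DataCenter` type, the `eligible` helper, and the facility names are hypothetical, and real residency rules involve much more than a jurisdiction label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCenter:
    name: str          # hypothetical facility name
    jurisdiction: str  # country or state where the facility physically sits


def eligible(centers, must_be_in=None, must_avoid=frozenset()):
    """Keep only data centers allowed by a location policy.

    must_be_in: the compliance scenario (data must stay inside these places).
    must_avoid: the privacy scenario (data must stay out of these places).
    """
    return [
        dc
        for dc in centers
        if (must_be_in is None or dc.jurisdiction in must_be_in)
        and dc.jurisdiction not in must_avoid
    ]


# Illustrative inventory; none of these are real facility names.
CENTERS = [
    DataCenter("dc-shanghai", "China"),
    DataCenter("dc-california", "California"),
    DataCenter("dc-oregon", "Oregon"),
    DataCenter("dc-nevada", "Nevada"),
    DataCenter("dc-texas", "Texas"),
]

# Compliance scenario: data that must remain inside China.
compliance = eligible(CENTERS, must_be_in={"China"})

# Privacy scenario: data kept only in the three states named in this article.
privacy = eligible(CENTERS, must_be_in={"California", "Oregon", "Nevada"})
```

The same function handles both cases because, from the storage layer’s point of view, “must stay in” and “must stay out” are just two sides of one location policy.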
Of course, both the compliance and privacy scenarios are far more complex than what I described; I simply want to illustrate the importance of a data center’s physical location when dealing with these issues. In Proov’s case, what made GCP’s new data center near Las Vegas the deciding factor was the “privacy scenario”. Proov wants to be able to safely store its users’ fertility and other health data in as many data center locations away from state governments unfriendly to abortion rights as possible.
GCP, by being the only major cloud platform to have a data center in Nevada, whose governor publicly committed to protecting abortion rights, won the deal! As Proov’s CTO, Jeff Schell, shared: “[W]hen you're talking two versus three viable options, it is significant, so that was something that informed our choice to move to Google…We are less likely to be subject to subpoenas to produce the data of our consumers.”
What is stored in Vegas stays in Vegas.
Did Google intentionally invest in Nevada back in 2018, hoping the location would yield a competitive advantage in storing fertility data? Of course not. That’s not how choosing a data center location works. At least not yet.
Why Are Data Centers Where They Are
The typical set of considerations that goes into choosing a data center location is a combination of existing infrastructure, proximity to users, available land to build, and other natural factors that give operating a data center some advantages. These factors are all reflected in one way or another in the top three clusters of data centers in the US – Virginia, California, and Oregon.
Virginia is the densest location of data centers primarily because of its proximity to both east coast internet users and Washington DC, where it can leverage existing, high-quality digital infrastructure already built for the US government. That’s why the data centers there are all clustered in Northern Virginia, essentially a suburb of DC, and not other parts of the state.
California is popular because of its proximity to west coast internet users and its status as the hotbed of technology innovation. It also has large swaths of land to develop, plus existing network cable infrastructure that connects the US to the fast-growing APAC region. Oregon, besides having the same west coast advantage that California has, also has regions with naturally dry, cool air, making the cooling of data centers less energy-intensive and less costly. (Facebook’s investment in designing its own data centers in Prineville, Oregon, to take advantage of these natural elements is a good read.)
As the demand for more cloud computing grew at a torrid pace, different states began using policy and tax incentives to lure big tech companies to build new data centers in their backyards. Iowa has been especially successful in this regard. On top of offering cheaper energy than California and no sales tax on power use, both the Iowa state legislature and various local governments have been willing to add more tax exemptions to sweeten their “data center package”. The city of Council Bluffs, for example, offered Google $33 million in property tax exemptions over 20 years to win the search giant’s $600 million data center investment. (The Atlantic has a nice article from a few years ago detailing how Iowa became a new hot destination for building data centers.)
Google’s data center investments in Nevada bear a similar flavor to what it did in Iowa. The same Las Vegas data center that was instrumental in GCP winning Proov’s business was “sweetened” with a combination of sales, use, and property tax exemptions from the Nevada state government worth $25.2 million. Seeing a “win-win” formula for both sides, Google committed an additional $1.2 billion in 2020 to expand its Las Vegas data center and build a new one in Northern Nevada, while receiving a similar $25 million worth of tax incentives from the Reno-Tahoe local government.
None of these considerations took into account the probability (now fact) that Roe v. Wade would be overturned by the Supreme Court, instantly making data localization for fertility and reproductive data a “states’ rights” issue.
Two Unlikely Overlaying Maps
Usually, when assessing a cloud platform’s data center coverage, you start by looking at a map that shows the geographical distribution of all its data centers, as well as the type and quality of the network cables that connect these data centers.
Here is GCP’s map:
We may be entering a period in US history where, if you work in the healthcare technology space, you may want to overlay a cloud data center map with another map that shows which states protect abortion rights, and to what extent.
Here is the most current state-by-state map of the legal status of abortion, according to the New York Times:
By doing this unlikely overlay, we can see that among all the locations where Google has announced it would expand data center capacity, most (Ohio, Oklahoma, South Carolina, Texas, Alabama, Tennessee) fall under either the “Banned” or “Restricted” categories. Other locations (Nebraska, Virginia) are in the uncertain “Legal for now” category. Only Nevada and Oregon are safely in the “Legal or protected” category.
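For readers who think in code, the overlay is essentially a lookup and a group-by. The Python sketch below uses only the states and categories listed in the paragraph above; the dictionary layout and the `group_by_status` helper are my own illustrative choices, and “Banned or Restricted” is kept as one combined label because the paragraph does not say which of those two categories each state falls under.

```python
from collections import defaultdict

# Announced Google expansion locations and their category on the
# abortion-status map, as summarized in the paragraph above.
EXPANSION_STATUS = {
    "Ohio": "Banned or Restricted",
    "Oklahoma": "Banned or Restricted",
    "South Carolina": "Banned or Restricted",
    "Texas": "Banned or Restricted",
    "Alabama": "Banned or Restricted",
    "Tennessee": "Banned or Restricted",
    "Nebraska": "Legal for now",
    "Virginia": "Legal for now",
    "Nevada": "Legal or protected",
    "Oregon": "Legal or protected",
}


def group_by_status(status_by_state):
    """Overlay the two maps: group data center states by legal category."""
    groups = defaultdict(list)
    for state, status in status_by_state.items():
        groups[status].append(state)
    return {status: sorted(states) for status, states in groups.items()}


overlay = group_by_status(EXPANSION_STATUS)
safe_locations = overlay["Legal or protected"]
```

A healthcare company could run the same grouping against any provider’s region list to see how many storage locations remain once the legal map is taken into account.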
It is quite unfathomable to think that the otherwise purely economic and technical considerations that undergird a data center location choice may now be blended with the most controversial political topic of this American generation. It is equally unfathomable to think that state governors and legislatures, flexing the full power of “states’ rights”, may use their stance on abortion rights to either attract new data center investments or subpoena people’s data inside those data centers.
Our world is interconnected in so many mysterious ways.
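(The Chinese version of this article was compiled by reader Ben Yu, and I made some edits before publishing it. Many thanks to Ben for his contribution!)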
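The title may look a bit baffling: what could the geographic location of cloud data centers possibly have to do with American women’s right to abortion? But consider this piece of news: Proov, a healthcare tech company focused on fertility-related services, chose to move its cloud infrastructure entirely into states that protect abortion rights in order to maximize the protection of user data privacy, and the big winner is Google Cloud.
Last week, the US outlet Protocol reported this news. Proov previously used AWS and Azure, and a major reason for the move is that Google Cloud is the only mainstream cloud platform in America with a data center in Nevada, which lets Proov store data in three states that protect abortion rights at the same time. (The three states are California, Oregon, and Nevada, while AWS and Azure have data centers in only the other two.)
To understand what Proov’s decision means, and the larger impact behind it, we first need to look closely at how cloud storage works and how data center sites are actually chosen.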
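To give users a smoother experience, cloud infrastructure engineers deliberately store one copy of a user’s data and records (such as shopping history) in the data center closest to that user’s location (usually called the “primary copy”), so that the user can access an app or website faster.
Protecting data privacy means that a company may want to keep user data as far as possible from certain hostile countries or governments, to guard against tracking, hacking, or other forms of data abuse. Thus, if you are a TikTok user in the US, your data should be stored in data centers far away from China. (As I wrote in “TikTok's American Credibility Problem”, TikTok has yet to follow through on its privacy promises.)
Of course, compliance and privacy protection in practice are far more complex than the examples I described; what I want to emphasize is the importance of a data center’s physical location when dealing with these issues. In Proov’s case, what made Google Cloud’s new data center near Las Vegas the deciding factor was “protecting data privacy”. Proov wants to be able to safely store its users’ fertility and other health data in as many data centers as possible, away from state governments unfriendly to abortion rights.
Google Cloud, as the only mainstream cloud platform with a data center in Nevada, won the deal, because Nevada’s governor has publicly committed to protecting abortion rights. As Proov’s CTO, Jeff Schell, shared: “The difference between two and three data centers is significant, and that is why we chose Google Cloud, so that we do not have to worry as much about being subpoenaed to produce our users’ data.”
Of course, this does not mean that back in 2018 Google foresaw this competitive advantage and therefore decided to invest in building data centers in Nevada.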
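California is popular mainly because it is close to west coast internet users and is a hotbed of technology innovation. It also has large swaths of land available for development, plus existing network cable infrastructure that connects the US to the fast-growing Asia-Pacific region. Oregon, besides sharing California’s west coast advantage, also has regions with naturally dry, cool air, which makes cooling data centers less energy-intensive and lowers operating costs. (Further reading: Facebook’s investment in building its own data centers in Prineville, Oregon, to take full advantage of these natural factors.)
As demand for cloud computing surged, different states began using policy and tax incentives to attract tech giants to build new data centers. Iowa has been especially successful in this regard: besides offering cheaper energy than California and no sales tax on power use, the Iowa legislature and various local governments have also been willing to add more tax exemptions. The city of Council Bluffs, for example, offered Google $33 million in property tax exemptions over 20 years, and in doing so won a $600 million data center investment. (This article, published by The Atlantic a few years ago, details how Iowa became a new hot spot for building data centers.)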