On the surface, cloud data center locations and women’s right to abortion in America are two topics that could not be further apart. Yet they became intimately intertwined when Proov, a healthcare tech company focused on fertility, decided to move its entire cloud infrastructure to Google Cloud Platform (GCP), solely to maximize user data privacy protection and redundancy in abortion-friendly states.

Last week, Protocol reported on Proov’s decision to go “all in” on GCP; it had previously used both AWS and Azure as its cloud providers. The decision rested on the fact that GCP is the only major cloud platform in America with a data center in Nevada, giving Proov access to local data storage in three abortion-friendly states: California, Oregon, and Nevada. (AWS and Azure are only in two such states.)

(For more of our previous writings on cloud data centers and the cloud computing industry, see “Why Is Facebook Not in the Cloud Business?” (premium content) and the Cloud Industry tag.)

To understand a decision like Proov’s and appreciate its larger implications, let’s dig into how cloud storage works and why data centers are built where they are in the first place.

What is Stored in Vegas Stays in Vegas

Even though cloud computing enables a seamless user experience for all types of digital products, anywhere and anytime, the way cloud data storage works behind the scenes is very local: physical distance and location matter a great deal.

Physical distance tends to impact user experience and performance. Location tends to matter for compliance or privacy protection needs. I’ll illustrate with a hypothetical example.

To give a user that “seamless” experience, cloud infrastructure engineers purposely store a copy of that user’s data and records (e.g. shopping history) in the data center closest to that user’s location (often called the “primary” copy), so the user can access an app or website faster. Another copy of the same data is typically stored in a data center that is not too far away as a “secondary” backup, in case the “primary” data center has technical issues. Oftentimes, a third copy is stored in another data center far (really far) away as a “disaster recovery” backup to the “secondary”, just in case an earthquake, hurricane, or flood destroys both the “primary” and the nearby “secondary”. In this architecture with built-in redundancy, the user’s data is always securely stored somewhere in the world.
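The primary, secondary, and disaster recovery placement described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any cloud provider actually does it: the site names and coordinates are hypothetical, "closest" is measured with a simple great-circle distance, and the disaster recovery site is simply the remaining site farthest from the primary.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class DataCenter:
    name: str
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat = p2 - p1
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def place_replicas(user_lat, user_lon, centers):
    """Pick primary, secondary, and disaster recovery sites for one user."""
    if len(centers) < 3:
        raise ValueError("need at least three data centers")
    # Rank every site by its distance to the user.
    ranked = sorted(
        centers,
        key=lambda c: distance_km(user_lat, user_lon, c.lat, c.lon),
    )
    primary, secondary = ranked[0], ranked[1]
    # Disaster recovery: among the remaining sites, the one farthest from
    # the primary, so a regional disaster is unlikely to hit both.
    recovery = max(
        ranked[2:],
        key=lambda c: distance_km(primary.lat, primary.lon, c.lat, c.lon),
    )
    return {
        "primary": primary.name,
        "secondary": secondary.name,
        "disaster_recovery": recovery.name,
    }


# Hypothetical sites with illustrative coordinates (not real facilities).
CENTERS = [
    DataCenter("site-west-1", 37.0, -122.0),
    DataCenter("site-west-2", 45.6, -121.2),
    DataCenter("site-central", 41.3, -95.9),
    DataCenter("site-east", 39.0, -77.5),
]

# A user located roughly in Los Angeles.
placement = place_replicas(34.05, -118.25, CENTERS)
print(placement)
```

With these made-up coordinates, the nearest west coast site becomes the primary, the other west coast site becomes the secondary, and the east coast site holds the disaster recovery copy.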

Data center location also matters when storing this same user’s data must comply with data governance rules or when the privacy of this data needs to be protected. In the “compliance scenario”, many countries and jurisdictions, e.g. China under its new Personal Information Protection Law, require that the data of their citizens be stored in data centers physically located within their borders. Thus, if this user happens to be Chinese, the person’s data must be stored in a data center located inside China. In the “privacy scenario”, a company may want to keep this user’s data away from certain unfriendly or adversarial countries or governments to prevent tracking, hacking, or other forms of exploitation and abuse. Thus, if you are an American user of TikTok, your data is (or is supposed to be) stored in data centers located beyond the reach of the Chinese government. (As I illustrated in “TikTok's American Credibility Problem”, TikTok has yet to follow through on its privacy promises.)
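Both scenarios boil down to filtering candidate data centers by jurisdiction. Here is a minimal sketch, assuming a made-up list of sites and two hypothetical parameters of my own naming: `must_stay_in` for the compliance scenario and `must_avoid` for the privacy scenario. Real data governance rules are far more nuanced than a single jurisdiction label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    jurisdiction: str  # where the data center physically sits


def eligible_sites(sites, must_stay_in=None, must_avoid=frozenset()):
    """Return the sites that satisfy both location rules.

    must_stay_in: compliance scenario. If set, only sites inside that
                  jurisdiction qualify.
    must_avoid:   privacy scenario. Sites inside any of these
                  jurisdictions are excluded.
    """
    result = []
    for site in sites:
        if must_stay_in is not None and site.jurisdiction != must_stay_in:
            continue
        if site.jurisdiction in must_avoid:
            continue
        result.append(site)
    return result


# Hypothetical sites, for illustration only.
SITES = [
    Site("site-1", "China"),
    Site("site-2", "United States"),
    Site("site-3", "Singapore"),
]

# Compliance scenario: a Chinese user's data must stay inside China.
compliance = [s.name for s in eligible_sites(SITES, must_stay_in="China")]

# Privacy scenario: keep an American user's data out of China.
privacy = [s.name for s in eligible_sites(SITES, must_avoid={"China"})]

print(compliance)
print(privacy)
```

The same filter works at the state level: swap countries for US states and the "avoid" set becomes the list of states a company does not want its data to sit in.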

Of course, both the compliance and privacy scenarios are far more complex than what I described; I simply want to illustrate the importance of a data center’s physical location when dealing with these issues. In Proov’s case, what made GCP’s new data center near Las Vegas the deciding factor was the “privacy scenario”. Proov wants to be able to safely store its users’ fertility and other health data in as many data center locations away from state governments unfriendly to abortion rights as possible.

GCP, by being the only major cloud platform to have a data center in Nevada, whose governor publicly committed to protecting abortion rights, won the deal! As Proov’s CTO, Jeff Schell, shared: “[W]hen you're talking two versus three viable options, it is significant, so that was something that informed our choice to move to Google…We are less likely to be subject to subpoenas to produce the data of our consumers.”

What is stored in Vegas stays in Vegas.

GCP’s Henderson, Nevada data center

Did Google intentionally invest in Nevada back in 2018, hoping the location would yield a competitive advantage in storing fertility data? Of course not. That’s not how choosing a data center location works. At least not yet.

Why Are Data Centers Where They Are

The typical set of considerations that goes into choosing a data center location is a combination of existing infrastructure, proximity to users, available land to build, and other natural factors that give operating a data center some advantages. These factors are all reflected in one way or another in the top three clusters of data centers in the US – Virginia, California, and Oregon.

Virginia is the densest location of data centers primarily because of its proximity to both east coast internet users and Washington DC, where it can leverage existing, high-quality digital infrastructure already built for the US government. That’s why the data centers there are all clustered in Northern Virginia, essentially a suburb of DC, and not other parts of the state.

California is popular because of its proximity to west coast internet users and its status as the hotbed of technology innovation. It also has large swaths of land to develop, plus existing network cables that connect the US to the fast-growing APAC region. Oregon, besides having the same west coast advantage that California has, also has regions with naturally dry, cool air, which makes cooling data centers less energy-intensive and less costly. (Facebook’s investment in designing its own data centers in Prineville, Oregon, to take advantage of these natural elements is a good read.)

As demand for cloud computing has grown at a torrid pace, states have begun using policy and tax incentives to lure big tech companies to build new data centers in their backyards. Iowa has been especially successful in this regard. On top of offering cheaper energy than California and no sales tax on power use, both the Iowa state legislature and various local governments have been willing to add more tax exemptions to sweeten their “data center package”. The city of Council Bluffs, for example, offered Google $33 million in property tax exemptions over 20 years to win the search giant’s $600 million data center investment. (The Atlantic has a nice article from a few years ago detailing how Iowa became a new hot destination for building data centers.)

GCP data center in Iowa

Google’s data center investments in Nevada bear a similar flavor to what it did in Iowa. The same Las Vegas data center that was instrumental in GCP winning Proov’s business was “sweetened” with a combination of sales, use, and property tax exemptions from the Nevada state government worth $25.2 million. Seeing a “win-win” formula for both sides, Google committed an additional $1.2 billion in 2020 to expand its Las Vegas data center and build a new one in Northern Nevada, while receiving a similar $25 million worth of tax incentives from the Reno-Tahoe local government.

None of these considerations took into account the probability (now fact) that Roe v. Wade would be overturned by the Supreme Court, instantly making data localization for fertility and reproductive data a “states’ rights” issue.

Two Unlikely Overlaying Maps

Usually, when assessing a cloud platform’s data center coverage, you start by looking at a map that shows the geographical distribution of all its data centers, as well as the type and quality of the network cables that connect these data centers.

Here is GCP’s map:

Source: https://cloud.google.com/about/locations#network

We may be entering a period in US history where, if you work in the healthcare technology space, you may want to overlay a cloud data center map with another map that shows which states protect abortion rights, which do not, and to what extent.

Here is the most current state-by-state map of the legal status of abortion, according to the New York Times:

Source: https://www.nytimes.com/interactive/2022/us/abortion-laws-roe-v-wade.html

By doing this unlikely overlay, we can see that among all the locations where Google has announced it would expand data center capacities, most of those locations (Ohio, Oklahoma, South Carolina, Texas, Alabama, Tennessee) fall under either the “Banned” or “Restricted” categories. Other locations (Nebraska, Virginia) are in the uncertain “Legal for now” category. Only Nevada and Oregon are safely in the “Legal or protected” category.
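For readers who prefer code to maps, the overlay above is just a lookup that groups one list of states by the categories of another. The sketch below uses only the states and categories named in the paragraph above; because that paragraph lumps “Banned” and “Restricted” together, so does the code. The function and variable names are my own.

```python
from collections import defaultdict

# Legal status of abortion for the states named above, per the NYT map
# as summarized in this article ("Banned" and "Restricted" are grouped).
ABORTION_STATUS = {
    "Ohio": "banned or restricted",
    "Oklahoma": "banned or restricted",
    "South Carolina": "banned or restricted",
    "Texas": "banned or restricted",
    "Alabama": "banned or restricted",
    "Tennessee": "banned or restricted",
    "Nebraska": "legal for now",
    "Virginia": "legal for now",
    "Nevada": "legal or protected",
    "Oregon": "legal or protected",
}

# States where Google has announced data center capacity expansions,
# as listed in this article.
EXPANSION_STATES = [
    "Ohio", "Oklahoma", "South Carolina", "Texas", "Alabama",
    "Tennessee", "Nebraska", "Virginia", "Nevada", "Oregon",
]


def overlay(locations, status_by_state):
    """Group data center locations by the legal status of their state."""
    grouped = defaultdict(list)
    for state in locations:
        grouped[status_by_state.get(state, "unknown")].append(state)
    return dict(grouped)


result = overlay(EXPANSION_STATES, ABORTION_STATUS)
for status, states in result.items():
    print(f"{status}: {', '.join(states)}")
```

Running it prints six states under "banned or restricted", two under "legal for now", and only Nevada and Oregon under "legal or protected".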

It is quite unfathomable to think that the otherwise purely economic and technical considerations that undergird a data center location choice may now be blended with the most controversial political topic of this American generation. It is equally unfathomable to think that state governors and legislatures, flexing the full power of “states’ rights”, may use their stance on abortion rights to either attract new data center investments or subpoena people’s data inside those data centers.

Our world is interconnected in so many mysterious ways.

云数据中心和美国妇女堕胎权利间的关系

(本篇中文版文章是读者 Ben Yu 做的编译,我做了一些修改而发表。非常感谢Ben的贡献!)

标题看起来可能有些莫名其妙,云数据中心的地理位置和美国妇女堕胎权利能有什么关联呢?但请关注这一则新闻:专注于提供生育相关服务的医疗科技公司 Proov 为了最大限度地保护用户数据隐私,选择把云基础设施完全迁移到保护堕胎权利的州境内,大赢家是谷歌云。

上周,美媒 Protocol 报道了这则新闻。Proov 此前使用 AWS 和 Azure,而迁移的很大原因,是谷歌云是美国唯一在内华达州设有数据中心的主流云服务平台,这让 Proov 可以同时在三个保护堕胎权利的州境内存储数据。(三个州分别是加州、俄勒冈州、内华达州,而 AWS 和 Azure 只在其中两个州设有数据中心。)

(如果想了解更多有关云数据中心和云计算的内容,可以阅读此前我写的一些文章:《为什么Facebook不做云的生意?》(付费可读),以及  Cloud Industry 标签。)

为了理解 Proov 决策的含义,并了解背后更大的影响,我们需要先深入了解云存储是如何运作的,以及数据中心究竟是如何择址的。

本地数据优先

尽管云计算服务让用户能够不受空间和时间的限制,自由使用任意数字产品,但云数据存储背后的运作方式实际上是非常本地化的,物理距离和具体位置都非常重要。

物理距离往往会影响用户体验和性能,而位置则往往关系到合规性或隐私保护,这里将用一个虚构案例来说明。

为了提供给用户更流畅的体验,云基础设施工程师会特意将用户的数据记录(如购物记录)的一个副本存储在离该用户位置最近的数据中心(通常称为“主副本”),这样该用户就可以更快地访问 App 或网站。

同一用户数据的另一个副本通常存储在不远处的数据中心,作为“二级副本”,以防存有“主副本”的数据中心出现技术问题。通常情况下,还有第三个副本会被存储在另一个很远很远的数据中心,作为“二级副本”的备份,以防地震、飓风或洪水摧毁“主副本”和邻近的“二级副本”。在这种具有内置冗余的架构中,用户数据总是安全地存储在世界的某个地方。

上面解释了云数据需要距离用户更近,以提供更好的用户体验;而当存储同一用户数据必须符合数据管理规则,或需要保护该数据隐私时,数据中心的位置就变得非常重要。

符合数据管理规则是指,许多国家和司法管辖区,例如中国根据其新的个人信息保护法,要求如果某用户是中国公民,则这个人的数据就必须存储在位于中国境内的数据中心。

保护数据隐私是指,公司也许希望将用户数据尽可能远离某些敌对国家或政府,以防跟踪、黑客攻击或其他形式的数据滥用。因此,如果你是一个在美国的 TikTok 的用户,你的数据应该被储存在远离中国的数据中心。(正如我在《TikTok在美国的信誉问题》一文中所写到的,TikTok 尚未履行其隐私承诺。)

当然,具体合规和保护隐私的现实操作都比我描述的例子要复杂得多,我想强调说明的是,在处理这些问题时,数据中心物理位置的重要性。在 Proov 这个案例中,让谷歌云在拉斯维加斯附近的新数据中心成为决定性因素的,是“保护数据隐私”。Proov 希望能够在尽可能多的数据中心安全地存储用户的生育和其他健康数据,远离对堕胎权利不友好的州政府。

谷歌云作为唯一一个在内华达州拥有数据中心的主流云服务平台,拿下了这笔单子,因为内华达州州长公开承诺保护堕胎权。正如 Proov 的CTO Jeff Schell 分享的:“两个和三个数据中心的差异是很大的,这也是为什么我们选择谷歌云的原因,这样我们就不用太担心被传唤提供用户的数据了。”

储存在拉斯维加斯的数据会一直存留在拉斯维加斯。

谷歌云在内华达州的数据中心

当然,这也并不意味着早在 2018 年,谷歌就预见到了这一竞争优势,从而决定在内华达州投资搭建数据中心。

数据中心是如何择址的?

选择数据中心位置的典型因素是现有的基础设施、与用户的距离、可用于建设的土地以及其他给运营数据中心提供一定优势的自然因素的组合。这些因素都以各种方式反映在美国的三大数据中心集群中——弗吉尼亚州、加利福尼亚州和俄勒冈州。

弗吉尼亚州是数据中心最密集的地方,主要是因为它既靠近东海岸的互联网用户,又靠近华盛顿特区,从而可以利用现有的、已经为美国政府建设的高质量数字基础设施。这就是为什么那里的数据中心都集中在弗吉尼亚州的北部,基本上就是华盛顿的郊区,而不是该州的其他地方。

加利福尼亚州之所以受欢迎,主要是因为它靠近西海岸的互联网用户,又是技术创新的热土。它还有大片土地可供开发,加上现有的基础设施网络电缆,将美国与快速增长的亚太地区连接起来。 俄勒冈州,除了拥有与加州一样的西海岸优势外,还自然拥有些干冷空气的地区,使得数据中心的冷却能耗较低,降低运营成本。(延伸阅读:Facebook 在俄勒冈州普莱恩维尔投资搭建自己的数据中心,以充分利用这些自然因素。)

随着更多云计算需求的急剧增长,不同州开始利用政策和税收优惠来吸引科技巨头们建立新的数据中心。在这一方面,爱荷华州尤其成功,除了提供比加州更便宜的能源和没有销售税的电力使用之外,爱荷华州立法机构和各地方政府也都愿意增加更多的免税措施。例如,康瑟尔布拉夫斯市向谷歌提供了 3300 万美元长达 20 年期的财产税豁免,以此赢得了 6 亿美元的数据中心投资。(这篇几年前 The Atlantic 出版的文章,详细介绍了爱荷华州如何成为搭建数据中心的新热点。)

谷歌云在爱荷华州的数据中心

谷歌在内华达州的数据中心投资与它在爱荷华州的投资“味道”很像。谷歌的拉斯维加斯数据中心得到了价值 2520 万美元的内华达州政府的销售税、使用税和财产税豁免优惠。谷歌在 2020 年承诺追加 12 亿美元,以扩大其在拉斯维加斯的数据中心,并在内华达州北部建立一个新的数据中心,同时从 Reno-Tahoe 政府获得价值 2500 万美元的类似税收优惠。

但是,上述这些都没有考虑到罗诉韦德案被美最高法院推翻的可能性(现在是事实了),从而使生育数据的本地化成为一项“州权利”的问题。

堕胎倾向和数据中心地图指南

通常情况下,在评估一个云平台的数据中心覆盖范围时,首先需要看一张显示所有数据中心的地理分布的地图,这张地图还需要显示连接这些数据中心的网络电缆的类型和质量。

下面这是谷歌云的地图:

来源: https://cloud.google.com/about/locations#network

我们可能正在见证美国历史进入另一个时期,某个在医疗科技领域工作的员工,需要在云数据中心地图上叠加另一张地图,显示哪个州以及在何种程度上保护或不保护堕胎权利。

下图是《纽约时报》报道的各州最新的堕胎法律情况地图:

来源: https://www.nytimes.com/interactive/2022/us/abortion-laws-roe-v-wade.html

叠加后我们可以看到,在所有谷歌已经宣布扩大数据中心容量的地方中,大多数地方(俄亥俄州、俄克拉荷马州、南卡罗来纳州、德克萨斯州、阿拉巴马州、田纳西州)属于“禁止”或“限制”类别,其他地方(内布拉斯加,弗吉尼亚)属于不确定的“暂时合法”类别,只有内华达州和俄勒冈州安全地处于“合法或受保护”的类别。

原本对数据中心的择址更多是经济和技术上的考量,而现在则和可能是这一代美国人经历的最有争议的政治话题混合在一起,令人匪夷所思。另外,各州州长和立法机构可能会充分运用“州权”的力量,利用他们对堕胎权的立场来吸引新的数据中心投资,或传唤这些数据中心内的用户数据。

当今的世界总是以一种无法预料的方式交织和“互联”在一起。
