AWS vs Azure: Return of the JEDI?

You may have come here because of the “clever” title, but I promise you this is a serious analysis of the ongoing saga around the U.S. Department of Defense’s (DoD) massive cloud computing contract, JEDI (Joint Enterprise Defense Infrastructure).

Earlier this month, the saga was reignited by a new complaint that Amazon filed directly with the DoD, which led to dueling blog posts from Microsoft’s and Amazon’s public relations departments that are anything but cordial.

This movie will not be over any time soon. And it’s worth analyzing why, from both a technical and a political angle.

Current State of Play

The quality of reporting on this issue has been mixed. I don’t blame the reporters. To really explain what’s going on, you need a solid grounding in how cloud computing works, both technically and as a business, and in how federal contracting works in the context of power and influence in Washington. And this is not just any run-of-the-mill federal contract, but a massive, 10-year investment by the U.S. military. I have found reporting by the Federal News Network, a niche media outlet, the most useful so far.

The current state of play, in one set of bullet points without legal jargon:

  • The DoD gave the whole contract to Microsoft Azure.
  • Amazon filed a lawsuit alleging that Donald Trump interfered with the contracting to get back at Jeff Bezos (who owns the Washington Post).
  • Amazon also thinks the entire six-part technical evaluation was flawed and wants a total re-do.
  • The Judge found at least one of the six categories -- Price Scenario 6 -- faulty and paused the whole thing for 120 days (until August 17) to give all the parties time to sort this out.
  • DoD volunteered to revise and reopen the bidding of Price Scenario 6.
  • Amazon did not like that and filed another complaint (not a lawsuit) directly with the DoD about the new, limited rebidding.
  • Meanwhile, the DoD’s Inspector General’s office (a self-investigative watchdog unit, similar to an independent auditing committee of a company) tried to investigate the whole thing and issued a big 300-plus page report.
  • Two main takeaways from the report: 1. DoD officials accidentally shared some Azure confidential information with the AWS team; 2. The question of possible political interference from the Trump White House is inconclusive (read: we don’t know enough to know).

With that context in mind, let’s dive deeper.


Assess Data Storage Pricing

The overarching technical question is whether revising Price Scenario 6 is enough. This category has to do with data storage. Based on the public portions of the court filings, Judge Patricia E. Campbell-Smith found that Azure’s solution did not meet the standards of the contract for a “cloud data warehouse” that needs to be “online” and “highly accessible” to JEDI users “without human intervention.” Azure thinks AWS priced its bid too high and lost fair and square. AWS thinks Azure’s bid was technically deficient and won only because the DoD’s guidelines were ambiguous.

Which argument is more plausible?

The crux of this category seems to be a straightforward delivery of a highly available data warehouse. And evaluating high availability is both a storage durability issue and a networking issue; you can’t have one without the other.

Having data “online” and “accessible” (or “available”) “without human intervention” (aka automatically) requires some implementation of data replication plus a consensus algorithm (e.g. Paxos or Raft), so that the system automatically makes copies of the data (at least 3) and keeps all of those copies in sync to make sure they are the same (aka consistency). This replication will likely happen both within a single data center region and across different geographical regions to guarantee “accessibility” with reasonable performance. E-commerce retailers do this so people can buy stuff. Social networks do this so people can “like” stuff. I expect much higher requirements from the world’s largest military.
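
To make the "at least 3 copies" idea concrete, here is a minimal Python sketch of majority-quorum replication. It is a toy under stated assumptions, not Paxos or Raft and not how AWS or Azure actually build their storage: the `Replica` and `ReplicatedStore` names are hypothetical, everything lives in one process, and a single coordinator hands out version numbers. It only illustrates why three copies let the data stay readable, with no human stepping in, when one copy goes offline.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data (hypothetical stand-in for a storage node)."""
    name: str
    online: bool = True
    data: dict = field(default_factory=dict)  # key -> (version, value)


class ReplicatedStore:
    """Toy majority-quorum store: a write or read needs most replicas online."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1  # 2 out of 3, 3 out of 5, ...
        self.version = 0

    def write(self, key, value):
        """Copy the value to every online replica; fail without a majority."""
        self.version += 1
        acks = 0
        for replica in self.replicas:
            if replica.online:
                replica.data[key] = (self.version, value)
                acks += 1
        if acks < self.quorum:
            raise RuntimeError("write failed: no majority of replicas online")
        return acks

    def read(self, key):
        """Ask online replicas and return the value with the newest version."""
        answers = [r.data[key] for r in self.replicas
                   if r.online and key in r.data]
        if len(answers) < self.quorum:
            raise RuntimeError("read failed: no majority of replicas online")
        return max(answers, key=lambda answer: answer[0])[1]


store = ReplicatedStore([Replica("a"), Replica("b"), Replica("c")])
store.write("report", "v1")
store.replicas[2].online = False       # one copy goes down
store.write("report", "v2")            # still succeeds with 2 of 3
store.replicas[2].online = True        # stale copy comes back
store.replicas[0].online = False       # a different copy goes down
print(store.read("report"))            # newest version wins over the stale copy
```

A real system also has to repair the stale copy, elect a leader, and survive network partitions, which is exactly the work a consensus algorithm takes on.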

All this technical work happening behind the scenes to keep stored data “online” automatically means a lot of data transfer cost over networks. Cloud industry observers are well aware of the trend: storage costs keep going down, but data transfer costs do not. And no organization of any meaningful scale would store a single copy of its data on AWS S3 or Azure Blob Storage and be done with it. Network data transfer is the bigger chunk of the cost of making “highly accessible” data a reality.
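
A back-of-the-envelope sketch in Python shows how a storage-only price can miss part of the bill. Every number below is a hypothetical placeholder, not an actual AWS, Azure, or JEDI figure, and the `monthly_cost` helper is invented for illustration. The model assumes each newly written gigabyte must be shipped once to every other replica; whether transfer ends up larger than storage depends heavily on how much data changes each month.

```python
GB_PER_TB = 1000


def monthly_cost(stored_tb, written_tb, replicas,
                 storage_price_per_gb, transfer_price_per_gb):
    """Return (storage_cost, transfer_cost) in dollars for one month.

    Storage: every replica holds a full copy of the stored data.
    Transfer: each written GB is sent to the other (replicas - 1) copies.
    """
    storage = stored_tb * GB_PER_TB * replicas * storage_price_per_gb
    transfer = (written_tb * GB_PER_TB * (replicas - 1)
                * transfer_price_per_gb)
    return storage, transfer


# Hypothetical workload and prices, chosen only to illustrate the shape
# of the calculation.
storage, transfer = monthly_cost(
    stored_tb=100,               # data at rest
    written_tb=300,              # data written or changed per month
    replicas=3,                  # "at least 3" copies
    storage_price_per_gb=0.02,   # assumed $/GB-month
    transfer_price_per_gb=0.02,  # assumed $/GB between regions
)
print(f"storage:  ${storage:,.0f}/month")
print(f"transfer: ${transfer:,.0f}/month")
print(f"a storage-only quote covers {storage / (storage + transfer):.0%} of the bill")
```

Under these made-up inputs, a bid that prices only the storage line would cover a minority of the monthly cost, which is the gap the paragraph above describes.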

Lest you think I’m just splitting hairs here, I strongly believe that precise definitions, and a shared understanding of their implications, are fundamental to any fair evaluation. If the definition and scope of Price Scenario 6 are only about storage, it’s quite possible that Azure’s lower-priced bid does not capture the whole picture, while AWS’s bid appears expensive but is more comprehensive.

To be clear, this is not a dunk on Azure’s alleged technical inferiority. Even though I have criticized Azure’s reliability before, especially during the COVID-19-induced usage surge (a line of argument that AWS is apparently using now), building a government cloud for the military is a unique technical challenge. Regardless of which company gets picked to build this, judging it by how its public cloud performs is not entirely fair.

If Price Scenario 6 is in fact limited to data storage, with imprecise definitions and expectations, it is reasonable for the re-bidding process to go beyond this single category to at least include whichever category deals with networking. As far as I know, there is no public information as to what the other five categories are. Having said this, I’m not sure if Amazon’s request for a total re-do of all six categories is reasonable from a technical perspective. Did the DoD make some mistakes? It certainly appears so, based on the Judge’s ruling and the IG report. But did it mess up the entire process during the two-plus years of evaluation? Unlikely. Given the Pentagon’s current response to Amazon’s lawsuit, it clearly prefers the most limited rebidding process possible.

So what is Amazon trying to achieve with its lawsuits and complaints? I think there is a larger strategy at play that has less to do with technology and more to do with the U.S. political calendar.

Bigger Strategy: Get To November

I hate to make everything about politics, elections, and Donald Trump, but it’s unavoidable in this case. Amazon’s bigger strategy is to delay any work or action on the JEDI contract by all means possible until November when the U.S. presidential election happens. It’s betting on a Joe Biden victory. If Biden does become president, Washington will go through a complete leadership reshuffle at all levels of the government, and Amazon may very well get the total JEDI redo it is hoping for.

Amazon’s lawsuit has caused a pause until August 17, the endpoint of the current 120-day remand period. There are checkpoints along the way and the remand could be extended. This part in one of the Federal News Network reports provides a clear look ahead:

“Judge Campbell-Smith ordered government [aka DoD], AWS and Microsoft attorneys to submit a status report by June 16 to update the court on how the process is playing out. Once the reconsideration process is finished, another update is due within five days, along with each party’s views on whether the lawsuit needs to continue. But the order makes clear that the 120-day remand period could be extended.”

If the remand period is not extended beyond August 17, that scenario obviously won’t get Amazon to Election Day. Thus, Amazon filed its latest complaint directly with the DoD two weeks ago, while waging an increasingly bitter public relations war with Microsoft to sway the court of public opinion, which always has some influence on the court of law.

The current lawsuit sits in the U.S. Court of Federal Claims. This court has at least two layers of more powerful courts above it: the U.S. Court of Appeals for the Federal Circuit and the U.S. Supreme Court. If Amazon receives an unfavorable ruling from the Court of Federal Claims, an appeal to a higher-level court is almost guaranteed. Amazon will use a combination of lawsuits and appeals, direct complaints, and public relations campaigns to drag this out to November. Microsoft will not be sitting idle either; its recent blog post responding to Amazon’s latest DoD complaint, authored by a former Marine, is full of emotional appeals to patriotism and jabs at Amazon as a sore loser.

But will a Biden administration be so nice to Amazon that a “Return of the JEDI” will happen? Nothing is guaranteed, but what is certain is that Amazon will have a more receptive audience. Trump’s hatred towards Bezos personally and Amazon broadly is well-documented. Though to be fair, most Democrats don’t like Amazon either. Amazon’s upper leadership is stacked with former Obama-Biden administration officials, the most high-profile being Jay Carney, who is Amazon’s SVP of Global Corporate Affairs. He was Biden’s communications director when Biden was Vice President and later Obama’s White House Press Secretary. (Disclaimer: I used to work for Jay in the White House as one of his many assistants. He was a really nice boss.)

If Biden becomes president, Amazon will use those relationships to influence the JEDI outcome. And I’m not here to single out Amazon. Microsoft would do the same. So would Oracle. So would every company that does, or wants to do, business with the Federal government. (When I wrote “Why Zoom Chose Oracle” a few weeks ago, I explained that Zoom’s government business prospects were one of the reasons it chose Oracle as its newest cloud provider.) All of this is part of the “DC influence game” that gives Washington the swampy reputation it frankly deserves.

Of course, there is no guarantee that Biden will win. At this moment, his chances of beating Trump are 50/50 at best. But if what’s on the line is a second shot at a $10 billion USD contract that will most definitely open the door to many more contracts, worth millions, with other government agencies, the expected value of this coin toss is totally worth the effort. All the lawyer fees and possible short-term reputational damage are peanuts compared to the potential winnings.
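
The coin-toss logic can be written out as a rough expected-value calculation in Python. The 50% odds and the $10 billion contract value come from the paragraph above; the $100 million cost of lawsuits and PR is a purely hypothetical placeholder, and the sketch generously treats a Biden win as if it handed Amazon the contract, when in reality it would only buy a second shot at bidding.

```python
def expected_net_value(p_win, contract_value, campaign_cost):
    """Expected payoff of fighting on: odds times prize, minus the cost."""
    return p_win * contract_value - campaign_cost


def break_even_probability(contract_value, campaign_cost):
    """Smallest chance of winning at which the fight still pays for itself."""
    return campaign_cost / contract_value


CONTRACT_VALUE = 10_000_000_000   # $10 billion, from the article
P_WIN = 0.5                       # "50/50 at best", from the article
CAMPAIGN_COST = 100_000_000       # hypothetical legal and PR spend

print(f"expected net value: ${expected_net_value(P_WIN, CONTRACT_VALUE, CAMPAIGN_COST):,.0f}")
print(f"break-even odds:    {break_even_probability(CONTRACT_VALUE, CAMPAIGN_COST):.1%}")
```

Even if the real odds of ending up with the contract are a small fraction of 50%, the break-even figure suggests the fight stays worthwhile as long as the costs remain tiny relative to the prize.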


AWS vs Azure:JEDI重来?

你来这里可能是因为这个看似“聪明”的标题,但我向你保证,这是篇围绕美国国防部打造云计算平台JEDI(Joint Enterprise Defense Infrastructure)这个大单子里,五花八门的事件的严肃分析。

本月早些,Amazon直接向国防部提交了一项新的申诉,再次点燃整个事件,这导致了Microsoft和Amazon公关部门之间的博客文章决斗,态度极不友好。

这部“精彩大片”一时半会儿结束不了了。我们也值得花点时间从技术和政治角度分析背后的原因。

目前状态

关于整个故事的报道质量参差不齐。我也不怪记者。要真正解释整个事件的各种角度,你需要对云平台的技术和商业层面有了解,以及熟悉联邦政府采购与华府的权力和影响力斗争的背景。而且这不是个普通的采购,而是美国军方在云计算上长达10年的大规模投资。我发现联邦新闻网(Federal News Network)的报道是迄今为止最有价值。

我们刨除法律术语,先通俗点的总结一下整个故事目前的状态:

  • 国防部把整个合同给了Microsoft Azure。
  • Amazon不开心,提起诉讼,指控Trump干涉了采购过程,因为他很讨厌Jeff Bezos(Bezos也是《华盛顿邮报》的老板)。
  • Amazon还认为,整个分六部分的技术评估存在缺陷,希望全面重新评估。
  • 法官发现这六部份中至少有一部份——Pricing Scenario 6——有问题,暂停整个合同120天(直到8月17日),给所有当事组织时间调解。
  • 国防部自愿修改并重新开始为Pricing Scenario 6招标。
  • Amazon不开心,直接向国防部提交新的投诉(这不是个诉讼)。
  • 与此同时,国防部监察长办公室(Inspector General,是一个自我调查的独立监管部门,类似于一个公司的独立审计委员会)试图调查整个事件,并发布了一份300多页的大报告
  • 报告的两个主要结论是:(1)国防部官员意外地把一些Azure的机密信息泄漏给了AWS。(2)关于特朗普白宫的政治干预嫌疑还无法下结论(意思是:我们知道的还不够)

有了这个总结做铺垫,那我们开始深入探讨吧。

评估数据存储定价

这里首要的技术问题是,修改Pricing Scenario 6是否足够?这个定价类别与数据存储有关。根据法庭文件公开的部分,法官发现,Azure的解决方案不符合“云数据仓库”合同的标准,该“云数据仓库” 需要 “在线” 和“高度可访问”,而且不需要 “人工干预”。Azure认为AWS的定价太高,没赢是理所当然的。AWS认为Azure的技术是有缺陷的,而是因为国防部采购的标准的模糊而赢得合同的。

哪个论点更合理呢?

这个类别的目标看似就是交付一个高可用的云数据仓库。评估“高可用”既是一个存储持久性问题,也是一个网络问题;两项都不能缺。

让数据“在线”并且“可访问”(或“可用”)“无需人工干预”(或“全自动”)需要一些数据复制再加上一致性算法(例如Paxos或Raft)。因此整个系统需要有自动复制数据(至少3个copy)的功能加同步该数据的所有副本,以确保它们都一样(又名“一致性”)。这种实现可能同时在单个数据中心区域内和跨不同地域的数据中心区域同时进行,以确保随时的“可访问性”。电商常常这样做,方便人们随时买东西。社交网络也常常也这么做,方便人们随时点赞。我相信世界上最大的军队的要求只会比这更高。

所有这些技术工作都在后台无形的进行着,让存储的数据自动“在线”意味着大量的网络传输成本。关注云行业的人都很清楚,虽然存储成本在下降,但数据传输成本却没有下降。任何有点规模的组织都不会将自己的数据,以单个副本存储在AWS S3或Azure Blob Storage上就完事了。网络数据传输是让“高度可访问”变成现实的更大块成本。

大家可能觉得我有点咬文嚼字。但我坚信,每个定义的准确度以及对这些定义的共同理解是任何公正评估的基础。如果Pricing Scenario 6的定义和范围只限于存储,那么很有可能Azure的低价竞价并不能反映所有有关的成本,而AWS看似更贵的竞价其实更全面。

这个观点并不是对Azure被指控的技术劣势的指责。尽管我之前批评过Azure的可靠性,特别是在疫情导致的使用激增期间(AWS显然开始使用了这个观点),但为军方构建政府级别的云是一项独特的技术挑战。不管最终选择哪家公司来构建这个云平台,从他们的公有云的表现来判断水平都不是最公平的。

如果Pricing Scenario 6仅限于数据存储的定义而且预期表达得不精确,那么整个重新竞标的过程起码也应该包括与网络成本有关的那个类别。据我所知,与其他五个类别有关的信息都还没有公开。话虽如此,但我也不确定Amazon要求对所有六个类别都重新洗牌是不是合理的。国防部有没有犯错误?根据法官的裁决和IG的报告,显然是有的。但这两年多的评估过程中的所有工作都不合格?这也不太可能。从五角大楼目前对Amazon诉讼的回应来看,它显然更倾向于尽可能限制重新招标的范围。

那么,Amazon的诉讼和投诉到底是想要达到什么目的呢?我认为有一个更大的战略目标在起作用,它与技术无关,而是与美国大选的时间有关。

大目标:熬到11月

我不喜欢把什么事都政治化、都和特朗普扯上关系,但在看这个问题的时候,无法避免。Amazon大的战略目标显然是,尽一切可能将JEDI合约的任何工作或行动延迟到11月美国总统大选后,赌拜登的胜利。如果拜登真的成为了总统,整个华盛顿的每一层政府领导将重新替换改组,到那时候Amazon很可能得到它梦寐以求的“JEDI重来”。

Amazon的诉讼已经把事情暂停到8月17日,也就是目前120天的还押期结束,该还押期限是可以延长的。联邦新闻网报道中的这一部分清楚的提供了一版未来展望:

“法官命令政府、AWS和Microsoft的律师在6月16日前提交一份状态报告,向法庭通报这一过程的进展情况。一旦复议程序结束,五天内将有另一个最新的状态情况,以及各方对诉讼是否需要继续的意见。但命令明确表示,120天的还押期限可以延长。”

如果还押期在8月17日结束,那对Amazon来说当然不够久,延不到大选之日。因此,Amazon在两周前直接向国防部提出的申诉,同时与Microsoft展开了一场愈演愈烈的公关战,都是为了给整个事件的舆论环境添油加醋,从多方面影响法庭对这件事的裁决。

目前的诉讼是由U.S. Court of Federal Claims审理。这个法院至少有两层更高的法院在上:U.S. Court of Appeals 和 U.S. Supreme Court。如果Court of Federal Claims最后的裁决对Amazon不利,它一定会向更高一级的法院上诉的。Amazon将通过诉讼和上诉、直接投诉和公关活动等多种方式将整个事件拖到11月。Microsoft也不会坐视不管;它最近回应Amazon的博客里,充满了爱国主义的情感诉求,并抨击Amazon是个没有度量的败者,文章作者还是名前海军陆战队的军人。

但是,拜登政府一定就会对Amazon友好,以至于让JEDI招标重新开始吗?在发生之前,没有什么是可以百分之百保证的,但可以肯定的是,Amazon将有一个更接受它的听众。特朗普对Bezos个人和Amazon整个公司的仇恨已经众所周知的。公平地说,大多数民主党人也不喜欢Amazon。Amazon的高层领导层里有很多前奥巴马-拜登政府的官员,其中曝光率最高的是Amazon全球企业事务高级副总裁Jay Carney。他曾经在拜登担任副总统时担任他的公关总监,后来担任奥巴马白宫主发言人。(免责声明:我曾经在白宫为Jay工作过,是他的众多助理之一。他是个好老板。)

如果拜登成为总统,Amazon将利用这些关系来影响和引导JEDI合同的大结局。我这么说并不是要单独挑出Amazon,Microsoft也会这么做,Oracle也是一样,所有想与联邦政府做生意的公司都会这样做。(我认为这也是为什么Zoom选择Oracle作为其最新的云提供商的主要原因之一。)这是“DC势力游戏”的一部分,也是为什么华府有着自己独特的腐败声誉。

当然,拜登能不能赢还是个未知数。目前,他打败特朗普的机率最多就是50/50。但如果这个“抛硬币”的机率的背后是第二次赢得100亿美元合同的机会,同时也是赢得未来其他政府机构数百万合同的大门,那是完全值得努力争取的。与未来可能的收益相比,所有的律师费和可能短期的声誉损伤,微不足道。
