People should pay for what they use

A group of research institutions and hospitals are in a partnership to study 2 PBs of genomic data. The institute that owns the data stores it in an Amazon S3 bucket and updates it regularly. The institute would like to give all of the organizations in the partnership read access to the data. All members of the partnership are extremely cost-conscious, and the institute that owns the account with the S3 bucket is concerned about covering the costs for requests and data transfers from Amazon S3.

Which solution allows for secure data-sharing without causing the institute that owns the bucket to assume all the costs for S3 requests and data transfers?

  1. Ensure that all organizations in the partnership have AWS accounts. In the account with the S3 bucket, create a cross-account role for each account in the partnership that allows read access to the data. Have the organizations assume and use that read role when accessing the data.

  2. Ensure that all organizations in the partnership have AWS accounts. Create a bucket policy on the bucket that owns the data. The policy should allow the accounts in the partnership read access to the bucket. Enable Requester Pays on the bucket. Have the organizations use their AWS credentials when accessing the data.

  3. Ensure that all organizations in the partnership have AWS accounts. Configure buckets in each of the accounts with a bucket policy that allows the institute that owns the data the ability to write to the bucket. Periodically sync the data from the institute’s account to the other organizations. Have the organizations use their AWS credentials when accessing the data using their accounts.

  4. Ensure that all organizations in the partnership have AWS accounts. In the account with the S3 bucket, create a cross-account role for each account in the partnership that allows read access to the data. Enable Requester Pays on the bucket. Have the organizations assume and use that read role when accessing the data.


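The question says that sharing the data must not leave the data owner paying for requests and data transfer. That means the requesters have to cover those costs. Among the answers, options 2 and 4 both enable Requester Pays, so both appear to meet this condition.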

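But this is exactly where the question gets tricky. Options 2 and 4 both propose the following:

  • Every organization has its own AWS account
  • The other organizations are allowed to access the data
  • Requester Pays is enabled
  • The organizations access the data from their own accounts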

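The biggest difference is in how the S3 data is shared. Both approaches can share the data: option 2 uses what the documentation calls Resource-based Access (a bucket policy), and option 4 uses Cross-account IAM roles. But consider this sentence from the Requester Pays Buckets documentation: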

When the requester assumes an AWS Identity and Access Management (IAM) role prior to making their request, the account to which the role belongs is charged for the request.

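According to this, the account that owns the role is the one charged. In option 4 the other organizations reach the data through cross-account roles, and those roles are all created in the data owner's account, so the data owner still ends up paying. Requester Pays therefore does not have its intended effect.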
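The billing rule quoted above can be written down as a small decision function. This is only a toy model of that rule as stated in the documentation, not an AWS API; the function name `billed_account` and the account labels are hypothetical, chosen for illustration.

```python
from typing import Optional


def billed_account(
    bucket_owner: str,
    requester_account: str,
    requester_pays: bool,
    assumed_role_account: Optional[str] = None,
) -> str:
    """Return the account charged for an S3 request and its data transfer.

    Toy model of the Requester Pays rule quoted above (hypothetical helper).
    Storage costs are not modeled here.
    """
    if not requester_pays:
        # Without Requester Pays, the bucket owner pays for requests.
        return bucket_owner
    if assumed_role_account is not None:
        # The request is made with an assumed role: the account that
        # owns the role is charged, not the caller's home account.
        return assumed_role_account
    # Request made with the requester's own credentials.
    return requester_account


# Option 2: partners use their own credentials via a bucket policy.
option_2 = billed_account("institute", "partner", requester_pays=True)

# Option 4: partners assume a role that lives in the institute's account.
option_4 = billed_account(
    "institute", "partner", requester_pays=True,
    assumed_role_account="institute",
)
```

Under this model, `option_2` is billed to the partner, while `option_4` falls back to the institute, which matches the reasoning above.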

So the answer is 2.
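For reference, option 2 might look roughly like the following with boto3. This is a minimal sketch under assumptions: the bucket name, account IDs, and helper function names are hypothetical, and the functions take an S3 client as a parameter (for example one created with `boto3.client("s3")`) so that nothing here touches AWS until it is called. Users in the partner accounts would also need matching S3 permissions from IAM in their own accounts.

```python
import json


def build_read_policy(bucket: str, partner_account_ids: list) -> dict:
    """Bucket policy granting read access to the partner accounts."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PartnerReadAccess",
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        f"arn:aws:iam::{account_id}:root"
                        for account_id in partner_account_ids
                    ]
                },
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


def configure_owner_bucket(s3_client, bucket: str, partner_account_ids: list) -> None:
    """Run in the institute's account: attach the policy, enable Requester Pays."""
    policy = build_read_policy(bucket, partner_account_ids)
    s3_client.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
    s3_client.put_bucket_request_payment(
        Bucket=bucket,
        RequestPaymentConfiguration={"Payer": "Requester"},
    )


def download_as_requester(s3_client, bucket: str, key: str) -> bytes:
    """Run in a partner account with that account's own credentials.

    RequestPayer="requester" acknowledges that the caller will be charged.
    """
    response = s3_client.get_object(
        Bucket=bucket, Key=key, RequestPayer="requester"
    )
    return response["Body"].read()
```

The institute would call `configure_owner_bucket` once with its own client, and each partner would call `download_as_requester` with a client built from its own credentials rather than from an assumed role in the institute's account.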