Part III · Chapter 07

Storage & Data Services

After this chapter you will be able to attack a cloud provider's object-storage layer as a system — squat its global namespace, exploit predictable and provider-generated bucket names, and turn cross-tenant storage trust into a confused-deputy or supply-chain compromise.

~3,900 words · 3 figures

An engineer on a team you have never met opens a console, picks a service, and clicks through a wizard. Behind the scenes, the cloud quietly sets up a storage bucket to hold their files. They never named it. They never saw it. They just assume it is theirs.

It is not. You created a bucket with that exact name last month, because the name was easy to guess, and you have been waiting ever since. Their files now land in your hands, and nothing in their account's logs shows that anything went wrong.

The problem

A web engineer hears "cloud storage attack" and pictures a public S3 bucket leaking a spreadsheet. That is an in-tenant misconfiguration — one customer's mistake, one customer's blast radius — and this course deliberately demotes it. We are here to attack storage as a provider-side, multi-tenant system: the shared registry, the cross-tenant trust, the auto-fetch-and-execute paths the provider built for everyone at once.

First, vocabulary, because you know HTTP but not necessarily S3. Object storage is a key-value store at planetary scale. A bucket (AWS S3, GCS) or a container inside a storage account (Azure Blob) is a named container; an object (or blob) is a sequence of bytes addressed by a key. Each is reachable over HTTPS at a predictable URL:

Provider	URL anatomy	Guessable parts
AWS S3	`https://<bucket>.s3.<region>.amazonaws.com/<key>`	bucket name, region, key
Azure Blob	`https://<account>.blob.core.windows.net/<container>/<blob>`	account, container, blob
Google Cloud Storage	`https://storage.googleapis.com/<bucket>/<object>`	bucket name, object
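The URL anatomy in the table can be written down as three string templates. This is a minimal sketch: the function names are hypothetical, and the S3 form assumes the virtual-hosted style shown in the table. The point is that every component is a plain, guessable string.

```python
from urllib.parse import quote


def s3_url(bucket: str, region: str, key: str) -> str:
    """Virtual-hosted-style S3 URL, following the table's anatomy."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def azure_blob_url(account: str, container: str, blob: str) -> str:
    """Azure Blob URL: the storage-account name is part of the hostname."""
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob)}"


def gcs_url(bucket: str, obj: str) -> str:
    """Path-style Google Cloud Storage URL."""
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"
```

Nothing in any of the three forms names the owning account, subscription, or project.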

The problem is not that someone left a bucket public. The problem is two structural facts about how object storage is built. The first is the global namespace: bucket names are claimed first-come-first-served from one flat registry shared by every tenant on Earth, with nothing in the name that says "this belongs to me." The second is cross-tenant storage trust: services, SDKs, and infrastructure-as-code templates routinely resolve a name to data and then act on that data — often fetching and executing it — without ever checking who owns the bucket the data came from. Put those two facts together and you get the chapter's thesis: in object storage the enforcement plane and the attack surface are the same flat string. Whoever controls the name controls the bytes; and whoever controls the bytes a victim's service auto-fetches controls the victim.

◈ Concept · The global bucket namespace

Object-storage names occupy a single flat namespace shared by every tenant of the provider — one S3 bucket name is unique across all AWS accounts, one Azure storage-account name across all of *.blob.core.windows.net. There is no account, subscription, or project component in the name. The account boundary is a wall IAM enforces on every request; the naming boundary has no wall — only a turnstile, where whoever calls CreateBucket first wins and a deleted name drops back into a global free pool.

Figure 7.1 · The global namespace as a shared component. Three tenants, one of them an attacker, draw from one flat registry with no per-tenant partition. The failing boundary is naming; the shared component is the namespace itself; the provider "magic" is the SDKs and services that generate names you can predict.

Why it matters / how it differs from a traditional pentest

Pentesting a tenant, you would look for one customer's public bucket and report it. That finding has a blast radius of one. Attacking the provider's storage layer is a different exercise, and three properties make it so.

The namespace is the sixth isolation boundary, and it is global. Chapters 1–6 attacked five boundaries — network, identity, hypervisor, Linux namespace, account/subscription. This chapter introduces naming. A naming bug does not affect one tenant; it affects whoever next claims, or next trusts, a name. When the namesquatting victims turned out to include AWS's own service teams, the blast radius became the entire tenant base.

Recon leaves no trace in the victim's logs. Claiming a bucket happens in the attacker's account, so the victim's CloudTrail records nothing. Enumerating Azure storage accounts happens over public DNS, entirely off-platform. This is the storage-specific instance of the log-invisible reconnaissance from Chapter 2 — the victim's detection surface for the recon phase is precisely nothing.

The payoff runs through the confused deputy. A traditional pentest exploits the attacker's own access. Here the impactful step is performed by the victim's service — CloudFormation, Glue, a build pipeline — running with the victim's privilege on attacker-supplied content. Recall from Chapter 3: a confused deputy is a privileged component tricked into misusing its authority on an attacker's behalf. Storage is where you weaponise it cheaply, silently, and patiently.

The methods at a glance

Four techniques, each a different way the namespace or the trust placed in it breaks. The breakdown sections below take them one at a time.

Technique	What it abuses	Payoff
Squatting a predictable name	Region-templated bucket names; foreseeable future regions	Hijack a code-delivery path into many tenants
Shadow resources & Bucket Monopoly	Provider-generated bucket names derived from non-secret values	Cross-account administrator
Enumerating the namespace	Storage-account names that are DNS names	Log-invisible discovery of every tenant's storage
Storage as a confused deputy	Components that trust whatever lands in a bucket	Log forgery; tampered templates; poisoned artifacts

Technique 1 · Squatting a predictable name

The simplest abuse of a first-come-first-served namespace: claim a name a victim expects to own — before they create it, or after they delete it — so their traffic, uploads, or deployments land in attacker-controlled storage.

◈ Concept · Bucket squatting

Bucket squatting is the storage-namespace member of a family you already know — domain squatting (register the domain a company forgot to renew) and dependency confusion (publish a public package named like a victim's private one so their build pulls yours). The difference is that the registry here is the cloud provider's own bucket namespace, registration costs nothing, and it leaves no trace in the victim's logs.

The attack works only because bucket names are predictable. Software that deploys across many regions cannot hard-code one bucket name — it needs one per region — so it builds names from a template, typically prefix-{region}. The region is a known, small, public set. The cruel part: future regions are foreseeable. AWS announces regions long before they open, and new region names leak even earlier through Certificate Transparency logs as AWS provisions infrastructure. An attacker who knows a template uses appdevzipfiles-{region} can register appdevzipfiles-<next-region> before AWS launches that region.

That is exactly what Ian McKay demonstrated in 2019, the foundational disclosure of the problem^[1]^#211. He found a Sumo Logic CloudFormation quick-start template — the one-click install an ISV ships to thousands of customers — that sourced a Lambda's code like this:

"Code": {
  "S3Bucket": { "Fn::Join": ["", ["appdevzipfiles-", { "Ref": "AWS::Region" }]] },
  "S3Key": "cloudwatchlogs-with-dlq.zip"
}

{ "Ref": "AWS::Region" } resolves at deploy time to whatever region the customer is deploying into. McKay registered appdevzipfiles-eu-north-1 before AWS launched the Stockholm region. Any customer who later deployed that template in Stockholm would pull their Lambda's executable code (not their data, their code) straight out of his bucket. He held a code-delivery path into potentially thousands of accounts for the price of a bucket name and the patience to wait for a region to open.
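To see why the template's bucket name is a pure function of the region, here is a small sketch that evaluates the fragment above for a given region. It assumes only the two intrinsics the fragment uses (Ref on AWS::Region and Fn::Join); the resolve and bucket_for helpers are hypothetical and are not a CloudFormation evaluator.

```python
import json

# The "Code" property from the quick-start fragment above, wrapped as an object.
CODE = json.loads("""
{
  "S3Bucket": { "Fn::Join": ["", ["appdevzipfiles-", { "Ref": "AWS::Region" }]] },
  "S3Key": "cloudwatchlogs-with-dlq.zip"
}
""")


def resolve(node, region: str):
    """Evaluate Ref AWS::Region and Fn::Join; reject anything else."""
    if isinstance(node, dict):
        if node == {"Ref": "AWS::Region"}:
            return region
        if "Fn::Join" in node:
            separator, parts = node["Fn::Join"]
            return separator.join(resolve(part, region) for part in parts)
        raise ValueError(f"unsupported intrinsic: {node}")
    return node  # plain string literal


def bucket_for(region: str) -> str:
    """Bucket name the template would read its Lambda code from in `region`."""
    return resolve(CODE["S3Bucket"], region)


if __name__ == "__main__":
    for region in ("us-east-1", "eu-west-1", "eu-north-1"):
        print(region, "->", bucket_for(region))
```

An author auditing their own templates can run the same evaluation over every region they support and confirm they own each resulting name.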

ℹ Note · The provider falls into its own trap

After McKay reported it, AWS's security team eventually handed him back 52 buckets that belonged to AWS service teams — CloudFormation, Systems Manager, VPC, Lambda, Control Tower^[1]^#211. The people most exposed to a namespace bug are the provider's own service teams, because they ship name-templated artifacts to every customer. The blast radius of one squatted service bucket is the whole tenant base.

Notice the shape. The squat is a control-plane race — a single CreateBucket call — whose payoff is a data-plane interception: the victim's GetObject reads attacker bytes. The victim's CloudTrail records nothing, because the attacker's bucket lives in the attacker's account.

Technique 2 · Shadow resources & Bucket Monopoly

McKay's attack still needed a human to write a predictable template. The 2024 escalation removed the human: the provider itself creates predictably named buckets, automatically, with names the customer never chose and often never sees.

◈ Concept · Shadow resource

A shadow resource is a resource a managed service creates on the customer's behalf, automatically, with a name the customer never chose. Open AWS Glue Studio for the first time and Glue silently creates an S3 bucket named aws-glue-assets-{AccountID}-{Region} to stage your scripts. The name is fully derivable from values the customer is not even trying to keep secret — so if a name is generated by a rule, an attacker who knows the rule can generate it first.

Aqua's Bucket Monopoly research catalogued six first-party AWS services that auto-create predictably named S3 buckets^[2]^#236:

Service	Shadow bucket pattern	Attacker needs
CloudFormation	`cf-templates-{Hash}-{Region}`	region (+ hash discovery)
Glue	`aws-glue-assets-{AccountID}-{Region}`	account ID + region
EMR	`aws-emr-studio-{AccountID}-{Region}`	account ID + region
SageMaker	`sagemaker-{Region}-{AccountID}`	account ID + region
CodeStar	`aws-codestar-{Region}-{AccountID}`	account ID + region
Service Catalog	`cf-templates-{Hash}-{Region}`	region (+ hash discovery)

For four of those six, the entire secret ingredient is the victim's AWS account ID — and an account ID is not a secret. It is printed in every ARN, leaks in S3 "access denied" error messages, and appears in public role trust policies. Recover it once and you can compute every Glue/EMR/SageMaker/CodeStar shadow-bucket name the victim will ever use, in every region.
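The four account-ID patterns in the table reduce to string formatting. A minimal sketch, useful for listing the names your own account should already hold; the function name is hypothetical and 123456789012 is a placeholder account ID.

```python
# Patterns copied from the table above; only the four that need
# nothing beyond account ID and region.
SHADOW_PATTERNS = {
    "Glue": "aws-glue-assets-{account_id}-{region}",
    "EMR": "aws-emr-studio-{account_id}-{region}",
    "SageMaker": "sagemaker-{region}-{account_id}",
    "CodeStar": "aws-codestar-{region}-{account_id}",
}


def shadow_bucket_names(account_id: str, regions: list[str]) -> dict[tuple[str, str], str]:
    """Map (service, region) to the shadow-bucket name the pattern yields."""
    if not (len(account_id) == 12 and account_id.isascii() and account_id.isdigit()):
        raise ValueError("AWS account IDs are 12 decimal digits")
    return {
        (service, region): pattern.format(account_id=account_id, region=region)
        for service, pattern in SHADOW_PATTERNS.items()
        for region in regions
    }


if __name__ == "__main__":
    names = shadow_bucket_names("123456789012", ["us-east-1", "eu-west-1"])
    for (service, region), name in sorted(names.items()):
        print(f"{service:10} {region:10} {name}")
```

Generating the names does not tell you who owns them. One way to check, as an assumption about your tooling rather than anything from the research, is a HeadBucket request that pins the expected owner (boto3's head_bucket accepts an ExpectedBucketOwner argument).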

Now the "Monopoly." An attacker pre-creates the predictably named bucket in every region the victim has not used yet. When the victim first enables the service in one of those regions, the service needs that bucket, and the only bucket with that name is the attacker's. The attacker holds a monopoly on that namespace slot, and the victim is forced to interact with attacker-controlled storage the first time they touch that service in that region. There is no "choose a different name" escape hatch; the name is hard-coded into the service.

The CDK case is worth a beat because the SDK hands the attacker the name. CDK bootstraps with a staging bucket called cdk-{Qualifier}-assets-{AccountID}-{Region}. The default qualifier is the fixed string hnb659fds^[3]^#230 — it looks random, but it is hard-coded into CDK and used by the overwhelming majority of CDK projects. So the "random-looking" bucket name collapses to account ID + region, both knowable. That is the bucket from the cold open.
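The collapse described above is easy to make concrete. A short sketch, assuming the name layout quoted in the paragraph; the helper names and the placeholder account ID are hypothetical.

```python
DEFAULT_QUALIFIER = "hnb659fds"  # the fixed default quoted above


def cdk_staging_bucket(account_id: str, region: str,
                       qualifier: str = DEFAULT_QUALIFIER) -> str:
    """Staging-bucket name following cdk-{Qualifier}-assets-{AccountID}-{Region}."""
    return f"cdk-{qualifier}-assets-{account_id}-{region}"


def uses_default_qualifier(bucket_name: str) -> bool:
    """True when a staging-bucket name carries the well-known default qualifier."""
    return bucket_name.startswith(f"cdk-{DEFAULT_QUALIFIER}-assets-")


if __name__ == "__main__":
    name = cdk_staging_bucket("123456789012", "us-east-1")
    print(name, uses_default_qualifier(name))
```

With the default in place, the name has no unknowns beyond account ID and region. Bootstrapping with a custom qualifier (cdk bootstrap accepts a --qualifier option) adds a third value an outsider has to learn first, though a qualifier is not designed to be a secret.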

⚠ Pitfall · Real, but opportunistic and slow

Do not over-sell namespace bugs. For a CDK staging-bucket hijack to pay off, the victim must (a) use CDK, (b) have deleted the bootstrap bucket at some point, and (c) later run cdk deploy again^[3]^#230. All three must hold, or there is no payoff. The honest risk profile: low probability per target, free to attempt, rewarded by patience, and catastrophic when it lands.

How the chain reaches cross-account admin

Squatting a code-delivery path is bad; Aqua turned it into full account takeover, and the route runs through the confused deputy. The escalation works because the squatted bucket is not a passive sink — S3 buckets support event notifications. The attacker pre-creates cf-templates-{hash}-{region} in an unused region and attaches a notification that triggers an attacker-owned Lambda on every PutObject. When the victim later runs a CloudFormation deployment in that region, their template is uploaded into the attacker's bucket. The attacker's Lambda fires, reads the template (data theft) and injects new resources — most usefully a new IAM role with AdministratorAccess and a trust policy that lets the attacker's account assume it. CloudFormation then deploys the tampered template with the victim's privilege, creating an admin role inside the victim's account.

Technique steps · Bucket Monopoly to cross-account admin (#236)

1. Recon. Read the victim's AWS account ID off a public ARN; account IDs are not secret.
2. Pre-claim the namespace. CreateBucket the predictably named shadow bucket in every region the victim has not used, holding a monopoly on those slots.
3. Arm the bucket. Attach an S3 event notification pointed at an attacker Lambda, plus a permissive bucket policy granting the victim's services cross-account write.
4. Wait. The victim opens Glue / runs a CloudFormation deploy in that region; their template lands in the attacker's bucket.
5. Tamper. The attacker Lambda exfiltrates the template and injects an AdministratorAccess IAM role trusting the attacker's account.
6. Confused deputy. The victim's CloudFormation reads the tampered template and deploys it with the victim's privilege, creating the admin role inside the victim account.
7. Impact. sts:AssumeRole into the new role → cross-account administrator.

Figure 7.2 · Bucket Monopoly to cross-account admin (#236). The squat in steps 2–3 happens entirely in the attacker's account, invisible to the victim's CloudTrail. The victim's CloudFormation (step 6) is the confused deputy: it executes attacker-supplied template content with the victim's own privilege.

Read through the lens: naming failed first, which then defeated the account boundary; the attacker never used their own privilege for the impactful step — the victim's CloudFormation role did; the detection surface on the victim side is empty until the admin role appears, by which point it is over.
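The tamper step leaves one artifact a defender can look for before the role is ever assumed: an IAM role in the deployed template whose trust policy names an account that is not yours. A minimal sketch of that check, assuming the template has already been parsed into a dict. The function name and sample account IDs are hypothetical, and the check does not evaluate intrinsic functions or Condition blocks.

```python
import re

ARN_ACCOUNT = re.compile(r"arn:aws[a-z-]*:iam::(\d{12}):")
BARE_ACCOUNT = re.compile(r"\d{12}")


def _as_list(value):
    return value if isinstance(value, list) else [value]


def foreign_trust(template: dict, own_account: str) -> list[tuple[str, str]]:
    """Return (logical_id, principal) for roles that trust another account or '*'."""
    findings = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::IAM::Role":
            continue
        policy = resource.get("Properties", {}).get("AssumeRolePolicyDocument", {})
        for statement in _as_list(policy.get("Statement", [])):
            if statement.get("Effect") != "Allow":
                continue
            principal = statement.get("Principal", {})
            if principal == "*":
                findings.append((logical_id, "*"))
                continue
            if not isinstance(principal, dict):
                continue
            for entry in _as_list(principal.get("AWS", [])):
                if not isinstance(entry, str):
                    continue  # intrinsic function; not evaluated in this sketch
                if entry == "*":
                    findings.append((logical_id, entry))
                    continue
                match = ARN_ACCOUNT.match(entry)
                account = match.group(1) if match else (
                    entry if BARE_ACCOUNT.fullmatch(entry) else None)
                if account is not None and account != own_account:
                    findings.append((logical_id, entry))
    return findings
```

Run against the template a deployment actually used, not the copy on the engineer's laptop, since the tampering happens between upload and deploy.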

Technique 3 · Enumerating the namespace

A reasonable reader objects: Azure made blobs private by default years ago; surely the namespace is no longer a free recon channel. CyberArk's 2021 blob hunt answers that objection by finding 50 million publicly readable files across roughly 100,000 storage accounts^[4]^#192 — PII, health records, around a million invoices, half a million log files, and plaintext credentials and keys.

"Private by default" did not stop this, and the reason is structural: the defense is a per-container toggle; the attack is against the namespace, which no toggle touches. The enumeration pipeline:

Technique steps · the BlobHunter enumeration (#192)

1. Storage-account discovery via DNS. A storage-account name is a DNS name. Generate candidate names, fire a DNS query for <name>.blob.core.windows.net; an answering A record proves the account exists. CyberArk scanned roughly 200 million candidate names and confirmed about 100,000 real accounts, entirely off-platform, against public resolvers.
2. Container discovery. For each confirmed account, send the List Blobs REST call against a wordlist of common container names (backup, logs, data, $web). Only containers set to public access level Container answer, but a great many do.
3. File triage. Walk the listings and pull anything interesting. 50 million files.

The provider-side teaching point is step 1. Because the storage-account name is a DNS name, the entire global namespace is enumerable for free, from outside the cloud, leaving zero entries in any victim's log. The existence and naming of every storage account leak through DNS regardless of any blob ACL — and the victim's detection surface for that recon is precisely nothing. That is a provider architecture decision, not a customer slip. CyberArk packaged the technique as the open-source tool BlobHunter.
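The existence signal in step 1 is nothing more than a name lookup. A minimal sketch for a single name, for example one you own or are about to retire. It assumes Azure's storage-account naming rule (3 to 24 lowercase letters and digits); the function names are hypothetical, and the resolver is injectable so the logic can be exercised without touching the network.

```python
import re
import socket

# Azure storage-account names: 3-24 characters, lowercase letters and digits only.
NAME_RULE = re.compile(r"[a-z0-9]{3,24}")


def blob_hostname(account: str) -> str:
    """Build the public Blob endpoint hostname for a storage-account name."""
    if not NAME_RULE.fullmatch(account):
        raise ValueError(f"not a valid storage-account name: {account!r}")
    return f"{account}.blob.core.windows.net"


def account_exists(account: str, resolve=socket.gethostbyname) -> bool:
    """True if the account's Blob hostname resolves; no request reaches the account."""
    try:
        resolve(blob_hostname(account))
        return True
    except socket.gaierror:
        return False
```

The lookup goes to a DNS resolver, not to the storage account, which is why nothing appears in the account owner's logs.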

Figure 7.3 · One of the 50 million files the blob hunt surfaced from publicly readable containers: a completed employment application carrying a name, address, and Social Security number (redacted by the researchers). Source: ^[4]^#192

⚠ Pitfall · A default is not a boundary

"Private by default" protects a blob's initial ACL — a default value, not an enforced wall — and owners flip the container-level public flag and never re-audit it. Even on AWS, with bucket-wide Block Public Access on, an object-level ACL can still mark an individual object public^[5]^#007. When you audit storage, never trust the account-level toggle — enumerate object ACLs.

Technique 4 · Storage as a confused deputy & a supply chain

The Bucket Monopoly chain and the pickle-model kicker are two instances of one idea: a victim's component trusts the contents of storage and acts on them. Object storage is a trust sink — and a sink is dangerous in both directions. Things downstream trust what they read out of it; and a "private" bucket is not safe if something upstream hands out scoped write credentials and the consumer trusts whatever lands there.

☢ War Story · Writing to Amazon Go's logging bucket

In 2018, Rhino Security Labs walked into an Amazon Go store — the cashier-less "just walk out" grocery concept — with a laptop and a Wi-Fi hotspot, and walked out with write access to an internal Amazon S3 bucket^[6]^#141.

The Go mobile app talks to an internal Amazon service, IhmFenceService. Intercepting the app's traffic in Burp revealed it being handed short-lived AWS credentials — the credential-vending pattern from the IMDS chapter. The operation a request invokes is selected by the X-Amz-Target header. Decompiling the Android APK with JADX surfaced other operation names; swapping getTransientQueue for getUploadCredentialsV2 returned a credential set scoped for log upload. Static analysis revealed the target bucket hard-coded in the APK: ihm-device-logs-prod.

The bucket blocked public access. It did not matter — the vended credentials granted s3:PutObject. The researchers could write arbitrary files: fill the bucket to run up Amazon's bill, or, more interestingly, forge log content, since the log format was simple text trivially edited before upload. A "private" bucket whose contents a downstream pipeline trusts is not a safe place to put data — it is an injection point.

The template-as-supply-chain pattern is not a one-off either. McKay's Sumo Logic finding was an ISV shipping a name-templated CloudFormation quick-start to every customer; the same shape recurred when Microsoft's Defender for Cloud distributed an AWS-onboarding CloudFormation template through public GitHub with an under-scoped IAM trust policy^[7]^#238. And the trust web runs all the way up: the BingBang research^[8]^#085, whose headline is an Entra multi-tenant auth flaw, also exposed an internal Microsoft application called COSMOS — a file-management front-end over more than four exabytes of Microsoft internal files, with in-app list, read, and edit. Object storage is rarely the end of a chain; it is the medium the chain travels through. Chapter 9 develops CI/CD supply chain in full — here the warm-up lesson is that a squatted bucket or a tampered template feeds a victim's pipeline.

Attacker's checklist · storage & data services

Namespace recon. Enumerate the target's storage namespace off-platform — DNS sweeps of *.blob.core.windows.net, bucket-name guessing — knowing it leaves no trace in any victim log.
Harvest account IDs. Pull AWS account IDs from public ARNs, access-denied errors, shared snapshots, and role trust policies — they seed every shadow-resource name.
Enumerate shadow-resource patterns. For each managed service the victim might use, compute the predictable bucket name (Glue, EMR, SageMaker, CodeStar, CloudFormation, CDK hnb659fds).
Pre-claim unused regions. CreateBucket the shadow / staging buckets in every region the victim has not yet deployed into — establish the monopoly and wait.
Arm squatted buckets. Attach S3 event notifications → attacker Lambda and permissive bucket policies, so a victim's first interaction triggers tampering.
Probe write-trust sinks. Find vended credentials (mobile apps, IMDS-like endpoints) that grant PutObject to "private" buckets a downstream pipeline trusts.

Defender's mirror · retrofitting a boundary onto a name

Resource-owner condition keys. AWS's aws:ResourceAccount IAM condition lets a role refuse to read or write any bucket whose owning account is not yours — the direct fix for shadow-resource and CDK-squatting confused-deputy chains. Apply it to service and deploy roles.
Account-namespaced bucket names. AWS's namespace syntax <prefix>-<accountid>-<region>-an retrofits a tenant identity into the name itself: only the embedded account, in the embedded region, can create the bucket^[9]^#207. The org-wide s3:x-amz-bucket-namespace SCP condition key lets a provider or org require the namespaced pattern fleet-wide.
Region-mapping, not region-substitution, in IaC. Replace the Fn::Join over { "Ref": "AWS::Region" } with an Fn::FindInMap lookup against an explicit RegionMap of buckets you actually own. This is McKay's own remediation.
Domain-verified namespaces. GCS already allows domain-named buckets that only the verified domain owner can create — a namespace with a built-in ownership proof.
The honest limit. None of these protect retroactively. Already-published templates with prefix-{region} names stay vulnerable. Bucketsquatting is dying, not dead.
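The owner-check item at the top of this list can be sketched as an IAM policy document built in code. This is an illustration under stated assumptions, not a drop-in policy: the function name, the Sid, and the blanket s3:* scope are hypothetical choices, and a deny this broad can also block legitimate reads from provider-owned buckets, so scope it per role before use.

```python
import json


def deny_foreign_buckets(own_account_id: str) -> dict:
    """Policy that denies S3 actions on resources owned by any other account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyS3OutsideOwnAccount",
                "Effect": "Deny",
                "Action": "s3:*",
                "Resource": "*",
                "Condition": {
                    # aws:ResourceAccount is the account that owns the bucket
                    # being accessed, regardless of what the bucket is named.
                    "StringNotEquals": {"aws:ResourceAccount": own_account_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_foreign_buckets("123456789012"), indent=2))
```

Attached to a service or deploy role, a statement like this makes the name irrelevant: a squatted bucket with the right name but the wrong owner is refused.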

Across all three major providers, the comparison is worth holding as one table:

	AWS S3	Azure Blob	Google Cloud Storage
Namespace scope	Global, flat, all accounts	Global (storage-account name)	Global, flat
Tenant component in name	None (legacy)	None	None (legacy)
Predictable-name driver	Service shadow buckets; CDK qualifier `hnb659fds`	—	—
Free enumeration channel	`CreateBucket` collision / `HEAD`	DNS (`*.blob.core.windows.net`)	DNS / `storage.googleapis.com`
Aggravating limit	—	24-char account-name cap	—
Squatting fix	`-an` account namespace + `s3:x-amz-bucket-namespace`	None — no namespace fix	Domain-verified names
Owner-check fix	`aws:ResourceAccount` condition key	—	—

◆ Key takeaways

Naming is the sixth isolation boundary, and it has no wall. The bucket namespace is global, flat, and first-come-first-served — a turnstile, not a barrier. In storage, the enforcement plane and the attack surface are the same string.
Predictability is the exploit. Provider services and SDKs generate bucket names from account IDs, regions, and fixed strings. A shadow resource is provider convenience turned into a pre-computable attack surface.
The payoff is the confused deputy. Bucket Monopoly works because the victim's own CloudFormation executes attacker-supplied content with the victim's privilege — squatting a name escalates to cross-account admin.
A default is not a boundary. "Private by default" protects a blob's initial ACL; it does nothing about a DNS-enumerable namespace or per-container toggles owners flip and forget.

References

[1] Ian McKay, "S3 Bucket Namesquatting - Abusing predictable S3 bucket names." Archived: local copy · Original: onecloudplease.com. Corpus #211.
[2] Aqua Security (Yakir Kadkoda, Michael Katchinskiy, Ofek Itach), "Bucket Monopoly: Breaching AWS Accounts Through Shadow Resources." Archived: local copy · Original: aquasec.com. Corpus #236.
[3] Aqua Security, "AWS CDK Risk: Exploiting a Missing S3 Bucket Allowed Account Takeover." Archived: local copy · Original: aquasec.com. Corpus #230.
[4] CyberArk (Daniel Niv), "Hunting Azure Blobs Exposes Millions of Sensitive Files." Archived: local copy · Original: cyberark.com. Corpus #192.
[5] Cloud Security Alliance, "3 Big Amazon S3 Vulnerabilities You May Be Missing." Archived: local copy · Original: cloudsecurityalliance.org. Corpus #007.
[6] Rhino Security Labs, "AWS Misconfiguration: Arbitrary File Upload in Amazon Go." Archived: local copy · Original: rhinosecuritylabs.com. Corpus #141.
[7] "Unauthorized access to AWS account findings in Microsoft Defender for Cloud." Archived: local copy · Original: cloudvulndb.org. Corpus #238.
[8] Wiz (Hillai Ben-Sasson), "BingBang: AAD misconfiguration led to Bing.com results manipulation and account takeover." Archived: local copy · Original: wiz.io. Corpus #085.
[9] Ian McKay, "Bucketsquatting is (finally) dead - AWS's S3 bucket-naming attack mitigated." Archived: local copy · Original: onecloudplease.com. Corpus #207.