Attacking Cloud Service Providers ACSP Chapter 02
Part I · Chapter 02

The CSP Kill Chain & Reconnaissance

After this chapter you will be able to describe the five-stage shape every provider compromise takes, and run reconnaissance against a cloud provider as a deliberate hunt for the one exploitable crack that puts you inside provider-controlled territory.

~5,000 words · 3 figures

Picture two people standing in front of the same building. The first is a burglar: they want into one apartment, they pick one lock, they take what is in that one flat. The second is something rarer — someone who wants the building itself. Not a tenant's belongings, but the janitor's master key, the wiring closet, the elevator controls. Owning one apartment is a crime. Owning the building is owning every apartment at once.

A cloud provider is that building. Millions of customers rent space inside it, each behind their own lock. The provider runs the plumbing they all share — the identity system, the host machines, the internal network, the management APIs. An attacker who only wants one customer's data picks one lock. An attacker who wants the provider is hunting for the wiring closet: a way out of any single apartment and into the shared machinery behind the walls.

But you cannot pick a lock you have not found. Before any of that happens, the attacker has to do the unglamorous work of walking the perimeter, rattling doors, and noticing which one is not quite shut. That is reconnaissance — and against a cloud provider it is a very specific hunt: not for a customer's secrets, but for the one foothold that drops you inside provider-controlled territory. This chapter shows you the shape of the whole attack first, then the six ways attackers find that opening.

The CSP kill chain

You already know Lockheed Martin's kill chain and MITRE ATT&CK from web security. Before we talk about reconnaissance, it helps to see the whole attack the recon serves. The CSP kill chain is a provider-specific five-stage model: a template that any provider compromise can be poured into. It is deliberately abstract — not a story about one bug, but a frame for all of them. Reconnaissance, the subject of the rest of this chapter, is the work of finding a way into Stage 1. Seeing all five stages now tells you what that foothold is for.

StageThe question the attacker answersPrimitive chaptersLens emphasis
S1 — Foothold"How do I get any code or request execution inside a managed service?"Ch 4, 6, 8, 9, 10Plane: usually data
S2 — Local control"How do I become root / SYSTEM / admin in my own sandbox?"Ch 3, 6, 8Identity propagation
S3 — Boundary crossing"How do I leave my tenant's sandbox?" — the defining CSP move.Ch 5, 6, 11Isolation boundary
S4 — Provider infrastructure"How do I reach the control-plane infrastructure the provider runs?"Ch 4, 5, 8Shared component + magic
S5 — Blast radius"How many tenants — how much of the provider — does this now own?"Ch 7, 11, 12Detection surface
Figure 2.1The CSP kill chain. Five stages, twelve chapters. The chain visibly pierces the dashed-red tenant boundary at S3 and enters purple provider infrastructure at S4. Reconnaissance — this chapter — is the hunt for an entry into S1.

S1 — Foothold. The foothold stage asks one plain question: what component inside a managed service will run input you supply? The answer is almost always something the tenant was invited to use. A bundled notebook executes your code because executing code is the feature. A database extension runs queries because running queries is the feature. A container image you deploy executes because deploying containers is the product. S1 is rarely a "bug" — it is the legitimate front door, and the attacker simply notices that the front door is also an execution surface. The plane question of the course's six-part lens almost always answers "data plane" here.

S2 — Local control. Given execution, what privilege does that context hold, and can you raise it within the sandbox? This is ordinary local privilege escalation — a host process running more privileged than intended, a writable path, a misconfigured runtime. The crucial subtlety is that S2 keeps you inside your tenant boundary. Becoming root in your own container is, alone, unremarkable. Its value is as a stepping stone: root is what lets you tamper with whatever the provider placed inside your sandbox to constrain you.

S3 — Boundary crossing. S3 is the stage that makes an attack a CSP attack rather than a tenant pentest. Everything before it happens inside the customer's own allocation; S3 steps over the line into shared or provider-owned territory — a network path the local firewall was supposed to block, a container-to-node escape, a request replayed against a shared runtime. Structurally it is always the same: the attacker reaches a component on the trusted side of the isolation boundary.

S4 — Provider infrastructure. Having crossed the boundary, S4 asks what provider-run infrastructure is now reachable: a host-agent channel, an internal management API, a control-plane endpoint. These components were built on the assumption that anything talking to them is already trusted, so they are frequently under-hardened — they may not validate the caller at all. S4 is where the lens's "shared component" and "provider magic" questions earn their keep.

S5 — Blast radius. S5 is not an exploitation step; it is a measurement step. The chain has reached provider infrastructure — now you ask how far that reaches. Usually the answer is a stolen credential, and the question becomes: what is its scope? One tenant, one cluster, one region, or every customer of the service? Whatever the credential unlocks is the chain's ceiling — and the ceiling, not the entry point, is the finding's true severity.

◈ Concept · The weakest-boundary principle

Score a chain by the last boundary it crosses — the most trusted one — not the first. A trivial S1 foothold does not dilute a chain that ends inside provider infrastructure; the terminal blast radius sets the severity. A chain starting with a non-finding and ending at the control plane is a critical finding.

⌖ Case · Mapping known chains onto the model

The model is abstract on purpose, but it is grounded in real chains. ChaosDB[2]#056 — a Cosmos DB takeover you will study in Chapter 8 — and Azurescape, the Azure Container Instances cross-tenant escape from Chapter 6, both decompose cleanly onto the five stages:

StageChaosDB (Ch 8)Azurescape (Ch 6)
S1 FootholdBundled notebook runs codeDeploy a container image
S2 Local controlPrivileged host process → rootRuntime breakout → root on node
S3 Boundary crossingReach the shared host networkNode → multitenant cluster
S4 Provider infraHost-agent channel hands out certsReach the cluster api-server
S5 Blast radiusOne cert → many tenants, many regionsCluster-admin → all co-tenants

Notice the shape both share: S1 and S2 are non-findings, S3 and S4 are "components doing their job," and the criticality is emergent — it lives in the composition. And notice where each begins: at S1, with a foothold that recon had to find first.

The problem

The kill chain tells you what the attack becomes. It does not tell you how it starts. Every chain in this book begins at S1 — a foothold — and a foothold does not appear by wishing for one. Someone has to find it. That finding is reconnaissance, and it is the unglamorous first phase of any attack: before you exploit anything, you discover what is there and where it is soft.

Against a normal target — a web application, a corporate network — recon means port scans, directory brute-forcing, reading job postings to guess the stack. You build a picture, then attack the weak spot. Against a cloud provider the target is built differently, so the hunt is different too. You are not surveying a customer's belongings; you are looking for the one place where the customer's space and the provider's space touch in a way the provider did not intend. A managed product that runs your code on a provider machine. An SSRF that lets you speak to a provider-internal address. An exposed admin service on a provider-owned host. A leaked credential that belongs to the provider's own staff.

◈ Concept · The control plane

The control plane is the machinery that provisions and manages infrastructure — identity, host agents, internal management APIs — as opposed to the data plane, which moves customer data. CSP reconnaissance is the search for a path that begins in the data plane (a product you rent) and ends with a toehold in the control plane (machinery the provider runs).

So reframe recon for this course. Reconnaissance against a CSP is the hunt for an exploitable foothold into provider-controlled territory. Not "what does this customer own" — that is a different job — but "where, across this provider's enormous surface, is there a crack I can stand inside." The rest of this chapter is six concrete ways attackers have found that crack, every one of them anchored to published, real-world research.

The methods at a glance

The corpus of published provider-side research is dominated by AWS and Azure — an accident of who writes blog posts, not of where the bugs live. Strip the branding away and every major provider exposes the same handful of openings. The rest of this chapter breaks down six recon methods. Read the table first; each row is a section below.

MethodWhat it hunts forFoothold it yields (S1)
1. Scan for exposed control-plane instancesBruteforceable SSH, unauthenticated admin/API services, open dashboards on provider hostsDirect access to a provider-operated machine or service
2. Code execution inside a managed productA rented product that runs your code on a provider host with a weak internal network around itA vantage point on the provider's internal network
3. SSRF to the metadata serviceA request-forgery primitive that reaches IMDS and the provider's own internal endpointsIAM credentials of the provider's internal accounts
4. Code execution on any managed-compute surfaceThe building-block primitive — any way to run code on provider-managed computeExecution context inside a shared service
5. OSINT against the providerLeaked CSP-internal credentials and undocumented internal endpointsProvider credentials or a non-public attack surface
6. Predictable naming & shared resourcesProvider-owned buckets/registries with guessable or squattable namesA resource the provider itself trusts and consumes

Notice the third column. Every method is graded by the same question: what foothold does it actually hand you? Recon is not done when you have a list — it is done when you have a candidate S1.

◈ Concept · The foothold primitive

A foothold primitive is any reproducible capability that puts attacker-controlled code or requests inside provider-managed territory: code execution in a managed product, an SSRF reaching internal addresses, a credential, an exposed service. Methods differ; the deliverable is always one of these. Everything after S1 in the kill chain builds on it.

Method 1 — Scan the internet for exposed control-plane instances

What it is. The bluntest recon method needs no cleverness at all: point a scanner at the provider's address space and look for things that should not answer. Bruteforceable SSH on a management host, an unauthenticated admin API, a dashboard with no login, a debug port left open on a worker node. This is the same internet-wide scanning a traditional attacker does — the difference is the target. You are not scanning a customer's VMs; you are scanning the provider's own machines and the managed-service infrastructure they operate.

How it works. The hyperscalers (AWS, Azure, GCP) are large enough and process-mature enough that a raw exposed-service find on their core control plane is now rare — but not gone, because managed services constantly stand up new fleets of worker nodes, and those fleets are provider-operated machines with their own listening ports. Smaller and regional CSPs are a different story: they ship products faster than they harden them, and an exposed management service on a regional provider can be the entire foothold. Be honest about the corpus here — published exposed-service research against major providers is thinner than for the other five methods, precisely because the easy finds get fixed.

Recon steps · scanning for exposed CSP control-plane instances
  1. Enumerate the provider's published IP ranges (AWS, Azure, and GCP all publish them; smaller CSPs reveal theirs through ASN lookups and Certificate Transparency).
  2. Mass-scan for management ports — SSH, RDP, JMX/RMI, Kubernetes API (6443/10250), database admin ports, and HTTP dashboards.
  3. For each responder, test for the absence of authentication or for default/bruteforceable credentials.
  4. Confirm the host is provider-operated (a managed-service worker, not a customer VM) — that distinction is what makes the find an S1 into provider territory.
⌖ Case · An exposed worker node on a hyperscaler

Even a hyperscaler is not immune. GCP's Cloud Dataflow ran its worker nodes — machines the customer is not supposed to have to secure — with an unauthenticated Java JMX service that was, under certain configurations, exposed straight to the internet.[1]#039 Reaching that port gave unauthenticated root code execution on a Google-managed worker, and from there the host-networked metadata service. A plain port scan found a provider machine with the door open.

The lesson is calibration. Against AWS, Azure, or GCP, treat exposed-service scanning as a low-probability sweep that occasionally pays off when a new managed-service fleet is rushed out. Against a regional or niche provider, treat it as a primary technique — their hardening maturity is years behind, and an exposed admin panel may be the only foothold you need.

Method 2 — Code execution in a managed product, then scan the provider's internal network

What it is. This is the workhorse of CSP attacks, and the cleanest illustration of why the kill chain begins where it does. You rent a managed product — a managed Jupyter notebook, a managed Postgres instance, a managed analytics runtime — and you use the product exactly as advertised to run code. That code does not execute in some abstract cloud; it runs on a real machine in the provider's datacenter. And the network that machine sits on is the provider's internal network, where tenant isolation is often weak or absent. Your recon is then a network scan from inside the building.

How it works. A managed product is sold as a sealed appliance, but underneath it is a host VM or container on a provider-run subnet. Once you have code execution on that host — typically a feature, not a bug — you can do what no external scanner can: enumerate the internal subnet, probe neighbouring tenants' instances, query the host's metadata service, and find the provider's internal management endpoints. The product was the foothold; the internal network is what you reconnoitre with it.

Recon steps · pivoting from a managed product into the provider network
  1. Provision the managed product yourself and find its built-in code-execution surface (a notebook cell, a database extension, an analytics connector).
  2. From inside the product, fingerprint the host: OS, network interfaces, the IP range it sits on, what agents the provider installed.
  3. Scan the local subnet for other tenants' instances and for provider-internal services — host agents, management APIs, certificate endpoints.
  4. Catalogue every reachable internal address; each is a candidate S3 boundary-crossing target for later chapters.
⌖ Case · ChaosDB, SynLapse, ExtraReplica — three products, one shape

ChaosDB[2]#056 began in a Cosmos DB managed Jupyter notebook: Wiz ran C# in the notebook, found the host process ran as root, and discovered the internal network reached Azure's Service Fabric. ExtraReplica[3]#061 began with code execution on an attacker-owned Azure PostgreSQL Flexible Server, whose internal subnet turned out to be reachable to other tenants' database instances. SynLapse[4]#062 began with command injection in a bundled ODBC connector inside Synapse's shared Integration Runtime. Three different managed products — and all three recon steps are identical: get code running, then look around the internal network the provider forgot you could see.

The reason this method is so productive is structural. The provider must give you a product that runs your input — that is the whole value proposition — but isolating the host that runs it, and the network around that host, is separate, harder, deadline-pressured work. The gap between "the product runs my code" and "the host is properly contained" is exactly the recon territory of Method 2.

Method 3 — SSRF to the metadata service and the provider's internal accounts

What it is. Server-Side Request Forgery is a vulnerability where you trick a service into making an HTTP request for you, to a destination you choose. In ordinary web security, SSRF is bad because it reaches internal apps. In the cloud it is far worse, because every compute instance carries a special internal address — the Instance Metadata Service (IMDS) — that hands out the IAM credentials of whatever identity the instance runs as. Find an SSRF in a provider-operated service, point it at IMDS, and you steal the credentials of the provider's own internal account.

◈ Concept · The instance metadata service (IMDS)

IMDS is a link-local endpoint (169.254.169.254 on AWS/GCP, 168.63.129.16 for Azure's WireServer) reachable only from on the instance itself. It serves configuration and, critically, short-lived IAM credentials for the instance's role. An SSRF that reaches IMDS turns "make a request" into "become this machine's identity." Chapter 4 takes IMDS apart in full.

How it works. The attacker first finds a service that fetches a URL on their behalf — a template renderer, a webhook tester, a bot's outbound-request feature, a data-connection importer. They then point that fetch at the metadata address. If the service is a provider-operated service, the credentials returned belong to a provider-internal role, and the recon payoff is enormous: you now hold an identity the provider trusts and can use it to enumerate internal infrastructure.

Recon steps · SSRF-to-IMDS against a provider service
  1. Inventory provider-operated features that take a URL or fetch remote content — template engines, importers, webhook validators, AI agents with browsing.
  2. Test each for SSRF: can you make it request an attacker host? Can it reach link-local addresses?
  3. If filtered, attempt bypasses — redirect chains (a 301 to an internal host), alternate IP encodings, DNS rebinding.
  4. Point the working SSRF at IMDS / WireServer, retrieve role credentials, and enumerate what that provider-internal identity can see.
⌖ Case · BreakingFormation and the Azure SSRF cluster

BreakingFormation[5]#005: Orca found an XXE in AWS CloudFormation's template renderer, read files off an AWS-internal host, and used SSRF to make requests as an internal AWS service principal — a foothold inside AWS infrastructure, not a tenant. The Azure side is a whole cluster: SSRF in Copilot Studio[6]#254 and in the AI Health Bot[7]#256 reached Microsoft's internal metadata endpoints; Azure Machine Learning's data-connection API was SSRF-able via a redirect that bypassed its internal-host filter[8]#258; and Mandiant's WireServing[9]#255 showed AKS pods could reach the WireServer host-agent endpoint and pull node bootstrap credentials. Same primitive, five provider services.

Method 4 — Code execution on any managed-compute surface

What it is. Methods 2 and 3 are specific shapes of a more general truth: any way to run code on provider-managed compute is a foothold worth hunting. Method 4 is the building-block primitive in its purest form. The recon question is not "what product" or "what vulnerability class" — it is simply "where, anywhere on this provider, can I get arbitrary code to execute on a machine the provider operates?" Managed CI runners, AI inference hosts, container build services, serverless workers — every one is provider-managed compute, and code execution on any of them is an S1.

How it works. The attacker enumerates every surface that accepts and runs something: a container image, a notebook, a model file, a task definition, a build script. Many of these surfaces run untrusted input by design and rely on isolation alone for safety. The recon is finding which one will run your code on a host you do not own — and crucially, finding the surfaces where the input format itself is an execution vector that the provider may not have appreciated.

Recon steps · finding a managed-compute execution surface
  1. List every provider surface that ingests executable artefacts — container images, ML models, notebooks, build/CI scripts, task definitions, plugins.
  2. For each, identify the implicit execution vector: a pickle-deserialising model loader, a container entrypoint, a build step, a task command field.
  3. Submit a benign proof-of-execution artefact and confirm it runs on provider-managed compute (check the hostname, the network, the identity).
  4. Record the execution context — what user, what container, what host identity — as the starting state for S2 local escalation.
⌖ Case · Malicious models and weaponized task definitions

Wiz's Hugging Face research[10]#091 uploaded a malicious model — AI model formats such as pickle permit arbitrary code execution on load — to the Inference API, ran code inside the EKS-hosted shared service, and used container escape plus IMDS to reach other tenants' private models. On AWS, Rhino's ECS task-definition work[11]#002 showed that registering a task definition and calling run-task runs an attacker-chosen command on managed ECS compute, immediately exposing the task-role credentials from the metadata endpoint. An uploaded file and a JSON field — two formats, both quietly an "execute my code" button.

Method 5 — OSINT against the provider itself

What it is. The methods so far probe live systems. Method 5 reads what is already public. Open-source intelligence against a customer is routine — leaked customer keys in GitHub commits. Open-source intelligence against the provider is the same craft pointed one level up: leaked credentials belonging to the provider's own staff and systems, and the discovery of the provider's non-production and internal endpoints that were never meant to be found.

How it works. Two veins are especially productive. The first is leaked CSP-internal credentials: provider engineers commit secrets to public repos exactly as customers do, and a provider-internal credential is a direct S1. The second is endpoint discovery through Certificate Transparency — the public, append-only logs of every issued TLS certificate. Every time a provider stands up a new internal or pre-production service behind TLS, the hostname becomes public in CT, including non-production API endpoints no wordlist would ever guess.

Datadog's Certificate-Transparency recon pipeline: Cert Spotter on EC2, an SQS queue, an AWS API Identifier Lambda, a second SQS queue, an API Fingerprinter autoscaling group, and two DynamoDB tables.
Figure 2.2Datadog's CT-logs recon pipeline — Cert Spotter → SQS → an "AWS API identifier" Lambda → SQS → an API-fingerprinter fleet, with results in DynamoDB. Recon to find recon targets, built as production infrastructure. Source: [12]#152

Datadog's research[12]#152 turned this into an industrial pipeline. AWS runs thousands of non-production API endpoints — pre-prod, gamma, beta — and they are not idle DNS records: they are IAM-enabled and callable with ordinary credentials, and some write no audit log at all. Finding them is the hard part, because by definition they are in no documentation. The pipeline watches CT for new certificates on AWS-owned domains, classifies each candidate hostname as an AWS API or not, then fingerprints the survivors with an autoscaling fleet.

Inverted funnel: 83,514,897 domains analyzed, 48,539 AWS API endpoints identified, 6,516 AWS API endpoints fingerprinted.
Figure 2.3The pipeline's output: 83.5 million domains narrowed to 6,516 fingerprinted AWS API endpoints — a map of provider attack surface that exists in no documentation. Source: [12]#152
Recon steps · OSINT for provider credentials and hidden endpoints
  1. Search public code hosts, paste sites, and artefact registries for secrets scoped to the provider's own accounts, not just customers'.
  2. Stream Certificate Transparency logs for new certificates on provider-owned domains (amazonaws.com, aws.dev, a2z.com, and equivalents).
  3. Classify each discovered hostname — is it a callable provider API? — and fingerprint the survivors for IAM behaviour and audit-log coverage.
  4. Flag undocumented or non-production endpoints: a callable internal API is a foothold no scanner of the documented surface would ever find.
⌖ Case · The non-production fleet and the undocumented private API

Datadog's non-production-endpoints research[12]#152 found that AWS's beta/gamma endpoints could be called with stolen credentials to enumerate permissions silently — a provider-side attack surface that simply is not in the docs. A smaller cousin: by watching the AWS Console's own traffic, Datadog also found the undocumented iamadmin private API[13]#080, whose batch IAM methods returned results with no CloudTrail event. Undocumented endpoints hide in plain sight — in CT logs, and sometimes in the provider's own JavaScript.

Method 6 — Recon by predictable naming & shared resources

What it is. The final method exploits a quirk of how cloud providers name things. Storage buckets, container registries, and staging resources frequently live in global namespaces, and providers often generate their names by formula. If a name is predictable, an attacker can compute it ahead of time — and if the resource does not exist yet, the attacker can create it first. The recon is the act of working out the naming formula and finding which predictable resources are unclaimed.

How it works. Many provider services auto-create helper resources — a CloudFormation staging bucket, a Glue assets bucket, an EMR scratch bucket — with deterministic names derived from inputs an attacker can know, such as a 12-digit account ID and a region. Because the namespace is global and first-come-first-served, an attacker who computes the name before the provider's service does can register it themselves. The provider's own service then reads from, or writes to, a bucket the attacker controls — a resource the provider trusts by name.

Recon steps · namesquatting provider-trusted resources
  1. Reverse-engineer the naming formula a provider service uses for its helper resources (read its docs, its IaC templates, its open-source SDK code).
  2. Identify the variable inputs — account ID, region, service name — and which of them an attacker can obtain or guess.
  3. Compute candidate names for a target account, especially for regions or services the victim has not used yet.
  4. Pre-register the unclaimed names; the foothold is a resource the provider's service will later trust and consume.
⌖ Case · S3 namesquatting and Bucket Monopoly

Ian McKay's original S3 namesquatting work[14]#211 abused region-substituted bucket names in IaC templates: guess the next region's name from CT logs, pre-register the bucket, and the template serves your malicious Lambda code into the victim. He claimed 52 AWS-service buckets this way. Aqua's Bucket Monopoly[15]#236 generalised it: six AWS services auto-create staging buckets with names like aws-glue-assets-{AccountID}-{Region}, so an attacker who knows only a victim's account ID can pre-claim the bucket in an unused region and inject code when the victim first uses the service. Predictable names are an attack surface.

Putting the six together

The six methods are not a menu to pick one from — they are a sweep. A real CSP engagement runs all of them against a target provider and treats every hit as a candidate S1. Method 1 finds open doors; Methods 2 and 4 find products that run your code; Method 3 finds request-forgery into the control plane; Method 5 finds what is already leaked or hidden; Method 6 finds resources the provider trusts by name. Each one ends at the same place: a foothold inside provider-controlled territory, and the start of the kill chain you saw at the top of this chapter.

Attacker's checklist · hunting an S1 foothold
  • Frame the hunt. The deliverable is not an asset inventory — it is a candidate foothold into provider-controlled territory. Grade every find by "what S1 does this give me?"
  • Sweep for open doors. Scan the provider's published ranges for exposed management services — heavily for regional/niche CSPs, opportunistically for hyperscaler managed-service worker fleets (#39).
  • Rent and run. Provision managed products, find their built-in code-execution surface, and scan the provider's internal network from inside (#56, #61, #62).
  • Hunt request-forgery. Inventory provider features that fetch URLs; test for SSRF to IMDS/WireServer and the provider's internal accounts (#5, #255).
  • Probe every execution surface. Container images, ML models, notebooks, task definitions — any artefact a provider runs is an S1 (#91, #2).
  • Mine the public record. OSINT for leaked provider-internal credentials and CT-driven discovery of undocumented/non-production endpoints (#152, #80).
  • Compute the names. Reverse-engineer naming formulas and pre-claim provider-trusted buckets and resources (#211, #236).
Defender's mirror
  • Treat managed-service hosts as production attack surface. Every worker fleet you stand up is a provider machine on the internet; an unauthenticated debug port on it (#39) is an S1 handed out for free. Scan your own ranges as an attacker would.
  • Isolate the host, not just the product. A managed product will run customer code — that is the feature. Containment must be the host VM and the network around it, never a control inside the box the tenant runs code in (#56).
  • Block IMDS from every request-forgery surface. Any provider service that fetches a URL must not be able to reach link-local metadata addresses — and SSRF filters must re-validate after redirects (#258).
  • Audit predictable names. If a service auto-creates a resource from a formula in a global namespace, an attacker can pre-claim it. Use account-scoped conditions and pre-create resources in all regions (#236).
  • Shrink the discoverable surface. Non-production endpoints in Certificate Transparency, secrets in public repos, undocumented APIs in console JavaScript — every one is a recon gift. Monitor CT for your own domains and treat non-prod endpoints as in-scope for hardening (#152).
◆ Key takeaways
  • The CSP kill chain — Foothold → Local control → Boundary crossing → Provider infrastructure → Blast radius — is the shape every provider compromise takes. Reconnaissance is the hunt for an entry into Stage 1.
  • Provider recon is not tenant recon. Tenant recon inventories one customer's account; provider recon hunts for a crack in the provider's own surface that becomes a foothold inside provider-controlled territory.
  • The deliverable of recon is a foothold primitive: code execution in a managed product, an SSRF reaching internal addresses, an exposed service, or a provider credential — never just an asset list.
  • Exposed-service scanning is opportunistic against hyperscalers but a primary technique against smaller, less-hardened regional CSPs.
  • Managed products are the workhorse foothold: the provider must run your code to sell you the product, but isolating the host that runs it is separate, harder, deadline-pressured work — and that gap is the recon prize.
  • Predictable names, leaked provider credentials, and undocumented endpoints are all attack surface. If a name is a formula or a hostname is in Certificate Transparency, an attacker can find it before you do.

References

  1. M. Brancato, "Remote Code Execution in Google Cloud Dataflow." Archived: local copy · Original: mbrancato.github.io. Corpus #39.
  2. N. Ohfeld & S. Tzadik, Wiz Research, "ChaosDB Explained: Azure's Cosmos DB Vulnerability Walkthrough." Archived: local copy · Original: wiz.io. Corpus #56.
  3. Wiz Research, "ExtraReplica: Cross-Account Database Vulnerability in Azure PostgreSQL." Archived: local copy · Original: wiz.io. Corpus #61.
  4. T. Pahima, Orca Security, "SynLapse — Technical Details for Critical Azure Synapse Vulnerability (CVE-2022-29972)." Archived: local copy · Original: orca.security. Corpus #62.
  5. Orca Security, "BreakingFormation: Exploiting XXE in AWS CloudFormation." Archived: local copy · Original: orca.security. Corpus #5.
  6. Tenable, "SSRFing the Web with Copilot Studio (CVE-2024-38206)." Archived: local copy · Original: cloudvulndb.org. Corpus #254.
  7. Tenable, "Azure AI Health Bot Privilege Escalation via SSRF." Archived: local copy · Original: cloudvulndb.org. Corpus #256.
  8. Tenable & Wiz, "Azure Machine Learning SSRF via Redirect Bypass." Archived: local copy · Original: cloudvulndb.org. Corpus #258.
  9. Mandiant, "WireServing: AKS Node Credential Exposure via WireServer." Archived: local copy · Original: cloudvulndb.org. Corpus #255.
  10. Wiz Research, "Cross-Tenant Access via IMDS on Hugging Face AI-as-a-Service Infrastructure." Archived: local copy · Original: wiz.io. Corpus #91.
  11. Rhino Security Labs, "Weaponizing ECS Task Definitions to Steal Container Credentials." Archived: local copy · Original: rhinosecuritylabs.com. Corpus #2.
  12. Datadog Security Labs, "Non-Production Endpoints as an Attack Surface in AWS." Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #152.
  13. Datadog Security Labs, "Bypassing CloudTrail Logging via the Undocumented iamadmin API." Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #80.
  14. I. McKay, "S3 Bucket Namesquatting — Abusing Predictable S3 Bucket Names." Archived: local copy · Original: onecloudplease.com. Corpus #211.
  15. Aqua Security, "Bucket Monopoly: Breaching AWS Accounts Through Shadow Resources." Archived: local copy · Original: cloudvulndb.org. Corpus #236.