Attacking Cloud Service Providers (ACSP)
Part IV · Chapter 12

Logging, Detection Evasion & Cloud Incident Response

After this chapter you will be able to predict what a cloud provider's audit pipeline will and won't record about an action, operate beneath that threshold, establish persistence that survives credential revocation — and then switch hats and design the telemetry that catches a cross-tenant attacker anyway.

~4,500 words · 7 figures

A security team sets a trap. They scatter a few fake login credentials in places an intruder is likely to look — a config file, an old code repository — and wire them to an alarm. The deal is simple: the credentials are dead, they unlock nothing, and the instant anyone so much as tries to use them, the team gets paged. It is a clean idea. It has caught real attackers.

Then someone discovers a way to ask the question quietly. Before touching the bait, an intruder can find out whether a set of stolen credentials is wired to an alarm — using a corner of the cloud where the alarm simply does not listen. The trap never springs. The team never gets paged. The intruder pockets the real credentials and walks past the tripwire it had been told would always fire.

This chapter is about that quiet corner: the gap between what a cloud provider says it records and what it actually records, who lives in that gap, and how — from the provider's chair — you catch them anyway.

The problem

Every major cloud provider hands its customers an audit log: a running record of who did what, when, and from where. AWS calls it CloudTrail; Azure splits it into the Activity Log and the Entra sign-in/audit logs; Google calls it Cloud Audit Logs. It is marketed as the single source of truth for an account — the thing your detection engineering, your compliance posture, and your post-incident investigation all stand on.

The problem is the gap between that promise and the wiring. The audit log is not a magical, all-seeing eye. It is a managed service like any other — code the provider runs, on the control plane, that is separate from the API it records. Between "an API handler processed a request" and "an event landed in your log" there is a multi-stage pipeline, and every stage can fail. Some actions are never recorded by design. Some are recorded only if the customer pays extra. Some are recorded against the wrong actor. And — the heart of this chapter — some are missing because the provider's own code simply does not emit the event it is documented to emit.

◈ Concept · The cloud audit log

A cloud audit log is the provider-operated, append-only record of API activity in an account, tenant, or project. The crucial framing: it is a managed service, and it is not the same component as the API it records — so every gap between the API and the log is a detection-evasion opportunity. The thing you talk to is not always the thing that records you.

This is the home of the detection surface — the sixth lens from Chapter 1, the one that has surfaced in every chapter's Defender's mirror without ever getting a chapter of its own. It is also a callback to Chapter 2, where you used error-message differentials to enumerate AWS roles and users without leaving a trace in the victim's logs and were promised the general theory later. This is later: "logging-invisible enumeration," generalised into a discipline.

Why it matters / how it differs from a traditional pentest

In a traditional pentest, "the logs" belong to the target organisation. Detection evasion means disabling endpoint agents, clearing event logs, timing your actions for the quiet hours — you fight the customer's SOC, and the customer owns the gaps. Cloud detection evasion is a different game.

When a cloud audit log fails to record something, the gap usually sits on the provider's side of the shared-responsibility line. If AWS ships a service that emits no event, the customer cannot fix it — no policy, no configuration setting, nothing. That makes a logging gap a provider vulnerability with an all-customer blast radius: every tenant of that service is blind to it simultaneously, and stays blind until the provider patches. For a CSP red-teamer, finding and reporting a logging gap is as valuable as finding a cross-tenant isolation break.

The second difference: there are two planes to record, and the audit log covers only one of them well. Cloud audit logs are overwhelmingly a control-plane record: API calls that mutate or read configuration. They are not an EDR; they do not see in-guest process execution on the data plane at all. A defender who forgets this watches the control plane and declares victory while malware runs untouched inside the VMs.

Figure 12.1. The six-stage audit-log pipeline. Stages 1–3 are the provider's code, so no customer configuration can repair a gap there. Any logging-evasion case can be placed on this spine.

The methods at a glance

Detection evasion is the craft of exploiting any one stage of that six-stage pipeline. The corpus cases featured in this chapter cluster into four techniques, plus the persistence discipline that lets an attacker stay once detection fails. Each gets a breakdown section below.

| Technique | What it exploits | Pipeline stage | Featured case |
| --- | --- | --- | --- |
| Invisible enumeration | Services and APIs the audit log does not cover at all | 1 (API handler) | IAM enumeration 2.0 (#134), AssumeRole role enumeration (#137) |
| Invisible change | Non-production endpoints that skip the logging route | 3 (delivery) | Service Catalog CloudTrail bypass (#154) |
| Provider logging failures | Documented events the service never emits | 2 (event generation) | Copilot Studio logging gaps (#168) |
| Under-attribution | Events recorded against the wrong actor | 2 (event generation) | Resource Explorer quiet enumeration (#148) |
| Persistence | Staying after detection: an anti-IR discipline | n/a (post-exploitation) | Persistence-as-a-service in AWS (#149) |
◈ Concept · Management events, data events, and the "no event" tier

Management events mutate or read resource configuration (CreateUser, AssumeRole) and are logged by default. Data events touch the contents of a resource (S3 GetObject, Lambda Invoke) and are not logged unless the customer enables and pays for it. The "no event" tier is the surface the pipeline never covers — unsupported services, undocumented APIs, mis-classified actions — and no customer configuration makes it visible.[11]

Picture these as three concentric rings of visibility: the inner ring (management events) is lit by default, the middle ring (data events) is dark unless the customer pays to light it, and the outer ring stays dark no matter what the customer does. An attacker's job is to push every action outward; a defender's job is to know which ring each action lands in and to drag as many actions inward as the budget allows. The three providers draw the rings slightly differently — worth committing to memory before the breakdowns.

| Provider | Control-plane audit log | Identity audit log | Retention default |
| --- | --- | --- | --- |
| AWS | CloudTrail: "Event history" (90 days, free); a trail persists to S3 | IAM events sit inside CloudTrail | 90 days (Event history) |
| Azure | Activity Log (ARM resource operations) | Entra ID audit & sign-in logs (a separate pipeline) | 90 days unless exported via Diagnostic Settings |
| GCP | Cloud Audit Logs: Admin Activity / Data Access / System Event / Policy Denied | IAM events sit inside Cloud Audit Logs | Admin Activity 400 days; Data Access 30 days |

Note Azure's split brain: an attacker working the identity side shows up in the Entra logs but not the Activity Log.[12] GCP's Admin Activity stream is always on and free; its Data Access stream is off by default, the exact analogue of AWS's data-event tier.[13]

Figure 12.2. Three rings of visibility, with corpus cases plotted in the ring they actually land in. The amber arrow is the attacker's objective: every action, pushed outward.
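The three-ring model can be written down as a tiny lookup, which is a useful habit when planning or reviewing an action list. The sketch below is illustrative only: the category labels ("management", "data") and the function name are assumptions made for this example, not provider API values.

```python
from enum import Enum


class Ring(Enum):
    MANAGEMENT = 1  # inner ring: logged by default
    DATA = 2        # middle ring: logged only if the customer enables it
    NO_EVENT = 3    # outer ring: never logged, whatever the customer does


def visibility(category: str, data_logging_enabled: bool = False):
    """Return (ring, recorded) for an action of the given category.

    `category` is a hypothetical label: "management", "data", or anything
    else for actions the pipeline never covers.
    """
    if category == "management":
        return Ring.MANAGEMENT, True
    if category == "data":
        return Ring.DATA, data_logging_enabled
    return Ring.NO_EVENT, False


# A data event stays dark until the customer enables data-event logging.
print(visibility("data"))                             # ring 2, not recorded
print(visibility("data", data_logging_enabled=True))  # ring 2, recorded
print(visibility("unsupported-service"))              # ring 3, not recorded
```

The only input a customer controls is `data_logging_enabled`; nothing in the signature can move an action out of Ring 3, which is the point of the model.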

Technique 1 · Invisible enumeration

What it is. Reconnaissance that produces zero events in the victim's audit log. The attacker learns account structure, identities, and roles — and the defender's pipeline records nothing, success or failure.

How it works. CloudTrail has a fixed coverage map: it supports a defined list of AWS services and ignores the rest. Calling a service outside that map emits no event at all. But the API still responds — and an access-denied response is a free oracle, because it often returns the caller's identity or distinguishes "exists" from "does not exist". The attacker treats the error message as a boolean and scripts against it, entirely outside the victim's Stage 1.

⌖ Case · IAM enumeration 2.0 — bypassing CloudTrail (#134)

This is the chapter's foundational logging-evasion example. Rhino Security Labs found that CloudTrail does not support a set of AWS services at all — and the access-denied error from one of them still returns the caller's full ARN. The technique steps:

Technique steps · honeytoken-safe identity recon
  1. Hold an unknown set of AWS keys (found in a repo, a leaked config, a honeytoken trap).
  2. Call an unsupported service — Rhino's example is AppStream DescribeFleets. CloudTrail emits no event.
  3. Read the access-denied error: it contains the caller's ARN, e.g. arn:aws:iam::111111111111:user/the-path/TheUserName (a role's ARN includes the session name).
  4. Extract the account ID; learn whether the identity is a user or a role.
  5. Inspect the IAM path in the ARN. The SpaceCrab honeytoken project places decoy users under /SpaceCrab/ — visible verbatim. The honeytoken's alarm is CloudTrail; the unlogged service lets the attacker read the path and walk away before the tripwire fires.

Rhino shipped this as the Pacu module iam__detect_honeytokens and a standalone awshoney_check.py. AWS's response: ARNs "are not sensitive," and CloudTrail was working as intended.[1]#134
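To see why a distinctive IAM path gives a decoy away, it helps to look at how little parsing the ARN needs. The following is a minimal sketch that works on an error string only; it makes no API calls. The message wording in the usage example and the names `parse_caller` and `has_decoy_path` are assumptions for illustration, and this is not the Pacu module's code.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches IAM user/role ARNs and STS assumed-role ARNs inside free text.
ARN_RE = re.compile(
    r"arn:aws:(?:iam|sts)::(\d{12}):(user|role|assumed-role)/([\w+=,.@/-]+)"
)


@dataclass
class Caller:
    account_id: str
    kind: str   # "user", "role" or "assumed-role"
    path: str   # IAM path, "/" when none is present
    name: str


def parse_caller(message: str) -> Optional[Caller]:
    """Extract the caller identity from an access-denied message, if any."""
    match = ARN_RE.search(message)
    if match is None:
        return None
    account_id, kind, rest = match.groups()
    parts = rest.split("/")
    if kind == "assumed-role":
        # assumed-role/<RoleName>/<SessionName>: the IAM path is not shown.
        return Caller(account_id, kind, "/", parts[0])
    path = "/" + "".join(p + "/" for p in parts[:-1])
    return Caller(account_id, kind, path, parts[-1])


def has_decoy_path(caller: Caller, decoy_paths=("/SpaceCrab/",)) -> bool:
    """True when the IAM path matches a known honeytoken naming convention."""
    return any(caller.path.startswith(p) for p in decoy_paths)


# Hypothetical error text; only the ARN shape is taken from the chapter.
msg = ("User: arn:aws:iam::111111111111:user/SpaceCrab/decoy "
       "is not authorized to perform: appstream:DescribeFleets")
caller = parse_caller(msg)
print(caller, has_decoy_path(caller))
```

For a defender the lesson runs the other way: a honeytoken whose path or name announces the product that minted it can be recognised from any response that echoes the ARN.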

Case #134 was recon against the identity behind keys you already hold. The companion case turns the same idea outward, against someone else's account, leaving their pipeline completely empty.

⌖ Case · Enumerating AWS roles through AssumeRole (#137)

Rhino found that sts:AssumeRole returned a different AccessDenied message depending on whether the target role existed:

// role exists, caller not trusted:
"User: arn:... is not authorized to perform: sts:AssumeRole
 on resource: arn:aws:iam::TARGET:role/SecretRole"

// role does not exist:
"Not authorized to perform sts:AssumeRole"

Technique steps · cross-account role enumeration
  1. Obtain the target's 12-digit account ID — far less private than people assume; it leaks via public AMIs, public EBS/RDS snapshots, screenshots, and GitHub.
  2. Script sts:AssumeRole against arn:aws:iam::TARGET:role/<guess> using a role-name wordlist. No account lockout, no throttling.
  3. Treat the message difference as a boolean oracle — confirm which role names exist.
  4. Note where this lands: the sts:AssumeRole calls are logged in the attacker's own account; the victim's CloudTrail records nothing. The victim only ever sees an event if a genuinely misconfigured role is actually assumed.

Rhino used this to find 50+ accounts whose roles carried "AWS": "*" trust policies — assumable by the entire internet.[2]#137
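The "boolean oracle" in step 3 amounts to a three-way classification of the error text. The sketch below encodes only the two historical messages quoted above and works on strings; it makes no calls to STS. The function name is an assumption for illustration.

```python
from typing import Optional


def role_exists(error_message: str) -> Optional[bool]:
    """Interpret an sts:AssumeRole AccessDenied message as a three-way signal.

    True  -> the verbose form that names the target resource (role exists)
    False -> the short form with no resource (role does not exist)
    None  -> anything else, e.g. a normalised message that leaks nothing
    """
    text = " ".join(error_message.split())  # collapse line breaks and spacing
    if "sts:AssumeRole on resource: arn:aws:iam::" in text:
        return True
    if text == "Not authorized to perform sts:AssumeRole":
        return False
    return None


verbose = ("User: arn:... is not authorized to perform: sts:AssumeRole\n"
           " on resource: arn:aws:iam::TARGET:role/SecretRole")
print(role_exists(verbose))
print(role_exists("Not authorized to perform sts:AssumeRole"))
print(role_exists("Access denied"))
```

The `None` branch is what a fix looks like from the caller's side: once every failure returns the same text, the classifier has nothing left to separate.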

The family extends. Case #135 applies the verbose-error trick to users, confirming IAM usernames at scale from an account ID and a wordlist with no victim CloudTrail.[3]#135 Then AWS patched, normalising the STS error to a flat Access denied. Rhino routed around it: case #136 found a new oracle in trust-policy principal validation — saving a trust policy that names a principal ARN forces AWS to resolve it on the backend, and an invalid principal throws Invalid principal in policy.[4]#136 The same pattern recurs in undocumented internal APIs: case #080's Console-only AWSIdentityManagementAdminService exposed batch IAM methods that emitted no CloudTrail events, until AWS made it emit the matching iam events.[5]#080

⚠ Pitfall · Error-message oracles get patched; feature-inherent oracles don't

It is tempting to treat #135, #136 and #137 as "the same bug." They are not. A verbose-error differential is a discrete defect — the provider normalises the message and the technique dies. A feature-inherent oracle (#136) is a property the feature must have to work, so it cannot be removed without breaking the feature. When you choose an evasion, prefer the one rooted in a feature.

Technique 2 · Invisible change

What it is. Everything in Technique 1 was read-only — the attacker learned things invisibly but changed nothing. Invisible change is the step up: altering the environment with no event generated, a true post-exploitation evasion rather than recon.

How it works. Cloud providers run non-production copies of their services (beta and gamma endpoints) for internal testing. These endpoints often accept ordinary customer credentials and execute real actions, but they sit outside the production logging route (Stage 3 of the pipeline). The hard part is finding them. The AWS Console leaks them: its Content-Security-Policy, which Console pages carry in a meta tag, lists the endpoints the page may talk to, including the non-production ones.

⌖ Case · Bypassing CloudTrail in AWS Service Catalog (#154)

Datadog found the first publicly known CloudTrail bypass that permits invisible write actions. The technique steps:

Technique steps · invisible write via a beta endpoint
  1. Read the Content-Security-Policy meta tag of an AWS Console page; spot the anomalous entry https://aws242-servicecatalog-beta.us-east-1.amazonaws.com sitting next to the normal servicecatalog.us-east-1.amazonaws.com.
  2. Pull the Console's JavaScript; extract the service definition for the internal service AWS242ServiceCatalogService (target prefix AWS242ServiceCatalogService).
  3. Sign requests to the beta and gamma endpoints with ordinary, valid SigV4 credentials.
  4. Observe that both read and write Service Catalog actions execute normally — and generate no CloudTrail events at all. Creating a portfolio on the beta endpoint is invisible to a normal ListPortfolios audit.

A separate, related bug caused missing CloudTrail logging in AWS Control Tower. AWS fixed Service Catalog on 2023-02-07 and Control Tower on 2023-02-13.[6]#154

HTTP 200 response from a non-production servicecatalog endpoint returning portfolio JSON.
Figure 12.3. A signed request to the aws242-servicecatalog non-production endpoint returns live portfolio data: a Console-leaked endpoint that executed actions outside the CloudTrail pipeline. Source: [6]#154

This is not a one-off. Case #152 mapped AWS's IAM-enabled non-production endpoints as an entire attack-surface class, disclosing two more CloudTrail bypasses (ce:GetCostAndUsage and route53resolver:ListFirewallConfigs) plus event-source obfuscation, where one action fires multiple, misleading events.[7]#152 The non-production endpoint is a recurring seam, and the Console's Content-Security-Policy is, reliably, where you find it.
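Spotting the anomalous entry in a Content-Security-Policy is a string-processing exercise, and it is the same check a provider's own release pipeline could run before shipping a Console page. A minimal sketch, assuming the policy text has already been copied out of the page; the marker list ("beta", "gamma") and the function name are assumptions for illustration.

```python
import re
from typing import List
from urllib.parse import urlparse

NONPROD_MARKERS = ("beta", "gamma")  # assumed naming convention


def nonprod_hosts(csp: str) -> List[str]:
    """Return hosts in a CSP string whose labels contain a non-prod marker."""
    hosts = set()
    for directive in csp.split(";"):
        tokens = directive.split()
        for source in tokens[1:]:          # tokens[0] is the directive name
            if "://" not in source:
                continue                   # skip keywords such as 'self'
            host = urlparse(source).hostname or ""
            labels = re.split(r"[.-]", host)
            if any(marker in labels for marker in NONPROD_MARKERS):
                hosts.add(host)
    return sorted(hosts)


csp = ("default-src 'self'; "
       "connect-src https://servicecatalog.us-east-1.amazonaws.com "
       "https://aws242-servicecatalog-beta.us-east-1.amazonaws.com")
print(nonprod_hosts(csp))
```

Splitting the hostname on dots and hyphens avoids flagging hosts that merely contain the letters "beta" inside a longer word.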

Technique 3 · Under-attribution

What it is. A softer evasion than invisibility: the action is logged, but the log names the wrong actor. The event exists, yet the alert never fires, because nobody is looking for "the AWS service did it."

How it works. When an action is proxied through a managed service, the audit log often attributes it to the service principal, not to the identity that triggered it. The canonical case is the AppSync confused-deputy attack from Chapter 9 (#074), where a cross-account role assumption appears as appsync.amazonaws.com and blends into ordinary service noise. The cleaner teaching example sits in this chapter's own corpus.

◈ Concept · Under-attribution

Under-attribution means an event is recorded but credited to the wrong actor — typically a legitimate service principal instead of the compromised identity. It is more durable than full invisibility: a missing event is a discrete bug the provider can patch in a sprint, whereas "all actions through service X look like X" is an architectural property of how the service was designed.

⌖ Case · Resource Explorer quiet enumeration (#148)

Datadog found that resource-explorer-2:ListResources was classed as a data event — so unlogged without customer configuration — and it effectively proxied enumeration through the service, so calls were not traced to the compromised identity. The result was account-wide resource discovery with none of the noisy List*/Describe* plus AccessDenied spam that ordinary enumeration produces. After Datadog's disclosure, AWS reclassified it as a management event on 2025-07-15, so it now logs by default — a clean example of the disclosure-to-reclassification fix.[8]#148
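From the defender's side, under-attribution is something you can at least measure: pull out the events credited to a service principal and see which services are acting, and how often. The sketch below assumes events shaped like CloudTrail records, using the `userIdentity.type` value `AWSService` and the `invokedBy` field; the sample events are invented for illustration.

```python
from collections import Counter
from typing import Iterable


def service_attributed_calls(events: Iterable[dict]) -> Counter:
    """Count (service, eventName) pairs for events credited to an AWS service."""
    counts = Counter()
    for event in events:
        identity = event.get("userIdentity", {})
        if identity.get("type") == "AWSService":
            service = identity.get("invokedBy", "unknown")
            counts[(service, event.get("eventName", "unknown"))] += 1
    return counts


# Invented sample records: two service-attributed calls, one human call.
sample = [
    {"eventName": "AssumeRole",
     "userIdentity": {"type": "AWSService", "invokedBy": "appsync.amazonaws.com"}},
    {"eventName": "AssumeRole",
     "userIdentity": {"type": "AWSService", "invokedBy": "appsync.amazonaws.com"}},
    {"eventName": "CreateUser",
     "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::111111111111:user/alice"}},
]
print(service_attributed_calls(sample).most_common())
```

A baseline of these counts per account makes "the service did it" a reviewable quantity instead of background noise; a service that suddenly starts assuming roles it never assumed before is worth a look.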

Technique 4 · Provider logging failures

What it is. Not an unsupported service and not a non-production endpoint — a production, fully supported service that fails to emit events it is documented to emit. Stage 2 of the pipeline, broken inside the provider's own code.

How it works. "Documented as logged" is a claim, and claims can be wrong. A service may simply not emit an event that the provider's documentation lists as audited. When the missing event is an administrative action that itself disables logging, the gap and the off-switch compose into a complete audit blackout.

⌖ Case · Logging gaps in Microsoft Copilot Studio (#168)

From at least 2025-08-29 to 2025-09-25, Microsoft Copilot Studio did not log four administrative actions that Microsoft's own documentation lists as recorded in the Unified Audit Log: BotUpdateOperation-BotAuthUpdate (removing an agent's authentication), BotUpdateOperation-BotAppInsightsUpdate (disabling conversation logging), BotUpdateOperation-BotShare, and BotUpdateOperation-BotPublish. The technique steps an attacker with only the Editor role could chain:

Technique steps · the Copilot Studio audit blackout
  1. Strip the agent's authentication (BotAuthUpdate) — exposing it to anonymous users. Unlogged.
  2. Disable App Insights conversation logging (BotAppInsightsUpdate). Unlogged.
  3. Publish the agent (BotPublish). Unlogged.
  4. Interact with the now-anonymous agent to exfiltrate its data — anonymous interaction also skips the CopilotInteraction event, so the exfiltration is unlogged too.

MSRC fixed it at important severity by 2025-10-05 — and Datadog then observed a regression: two of the four events still log only intermittently.[9]#168

Diagram of a malicious Copilot Studio Editor stripping agent auth, disabling App Insights, and publishing — all unlogged.
Figure 12.4. The Copilot Studio blackout: a malicious Editor strips authentication (BotAuthUpdate), disables App Insights logging (BotAppInsightsUpdate), and publishes (BotPublish), every step unlogged. Source: [9]#168
☢ War Story · "Documented as logged" is a claim, not a fact

A detection engineer who built a rule on the assumption that BotAuthUpdate always fires would have a silent gap and never know it. The discipline is to verify your own provider — emit the event yourself in a test tenant and confirm it lands — and to re-test, because the Copilot Studio fix regressed within weeks. A fix is a point-in-time event; detection coverage decays.
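The "verify, then re-test" discipline reduces to a set difference: the events the documentation promises, minus the events you actually saw after performing each action in a test tenant. A minimal sketch, assuming you have already exported the observed operation names from the audit log; the export step and the observed list are hypothetical.

```python
from typing import Iterable, Set

# The four operations the chapter lists as documented for Copilot Studio.
DOCUMENTED = {
    "BotUpdateOperation-BotAuthUpdate",
    "BotUpdateOperation-BotAppInsightsUpdate",
    "BotUpdateOperation-BotShare",
    "BotUpdateOperation-BotPublish",
}


def coverage_gaps(documented: Set[str], observed: Iterable[str]) -> Set[str]:
    """Return documented events that never appeared in the observed log."""
    return set(documented) - set(observed)


# Hypothetical export from a test tenant after performing all four actions.
observed = ["BotUpdateOperation-BotShare", "BotUpdateOperation-BotPublish"]
for name in sorted(coverage_gaps(DOCUMENTED, observed)):
    print("documented but not observed:", name)
```

Run on a schedule rather than once, the same check would surface a regression like the one Datadog observed after the fix.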

Technique 5 · Persistence — operating after you are caught

What it is. Persistence is the attacker's answer to being detected anyway. It is an anti-incident-response discipline: pre-positioned access designed to survive the obvious remediation — revoking the credential the responder found.

How it works. Persistence has been used in this book without being named — Golden SAML (Chapter 3) survives a password change, backdoor roles survive key rotation. The discipline has five classes, and a serious operator layers at least two, because an IR team that finds and kills one will assume the job is done.

◈ Concept · The five classes of cloud persistence
  • Credential-class — extra IAM users and access keys, login profiles, certificate-based service-principal logins.
  • Trust-class — a role whose trust policy lets an external attacker account assume it (#158); a consented multi-tenant OAuth app.
  • Compute-class — "persistence-as-a-service": attacker code that re-mints access on demand (#149).
  • Anti-containment — making the foothold un-removable: Entra restricted-management Administrative Units (#174), disabling org-level trusted access (#149).
  • Telemetry-class — turning logging off so future actions are invisible: StopLogging/DeleteTrail, Azure Policy modify effects that disable diagnostic settings (#173).
⌖ Case · Persistence-as-a-service in AWS (#149)

Datadog's threat-hunters worked a real intrusion: a leaked long-term AKIA* key belonging to an IAM user in an AWS Organization management account. Within a 150-minute window, five distinct IP addresses picked up the key. The technique steps the attacker executed:

Technique steps · the #149 persistence chain
  1. Use the leaked AKIA* key; mint STS temporary credentials for Console access.
  2. Create a Lambda function (buckets555) fronted by an HTTP API Gateway; the Lambda code calls iam:CreateUser. Now a single HTTP request to the API Gateway URL mints a fresh attacker IAM user — persistence-as-a-service, decoupled from the original credential.
  3. Call DisableAWSServiceAccess for six services — including IAM Access Analyzer, SSM, and CloudFormation StackSets — degrading the org's own security tooling.
  4. Create an Identity Center group secure and user Secret, assign a permission set, and modify the SSO instance's MFA configuration to sign in without MFA.

A telling detail: a ConsoleLogin event arrived from a Telegram IP, 149.154.161.235 — the attacker drove the Console via a Telegram bot, and Telegram's link-preview crawler followed the sign-in URLs, stamping the attacker's tooling choice straight into the victim's CloudTrail.[10]#149

Figure 12.5. The #149 persistence chain. Steps 2–4 are all management events, so they do hit CloudTrail. Persistence is loud; the attacker is betting nobody watches.
ℹ Note · Persistence is usually loud — and that is the defender's opening

Every persistence step in Figure 12.5 — CreateFunction, CreateUser, DisableAWSServiceAccess, the Identity Center calls — is a management event, logged by default. Establishing persistence is typically the noisiest phase of an intrusion. The attacker is not invisible here; they are gambling that nobody is watching, and detection's job is to make that gamble lose.
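Because these steps are logged by default, even a plain signature rule over event names catches the build phase. The sketch below filters CloudTrail-shaped records for the event names mentioned in this chapter. Matching `CreateFunction` by prefix is an assumption, made because Lambda event names can carry an API-version suffix; the sample records are invented.

```python
from typing import Iterable, List

# Event names taken from the chapter's persistence and tamper discussion.
EXACT_NAMES = {"CreateUser", "DisableAWSServiceAccess", "StopLogging", "DeleteTrail"}
NAME_PREFIXES = ("CreateFunction",)  # assumed: suffix may follow the name


def persistence_signals(events: Iterable[dict]) -> List[dict]:
    """Return events whose name matches a persistence or tamper action."""
    hits = []
    for event in events:
        name = event.get("eventName", "")
        if name in EXACT_NAMES or name.startswith(NAME_PREFIXES):
            hits.append(event)
    return hits


# Invented sample records for illustration.
sample = [
    {"eventName": "DescribeInstances"},
    {"eventName": "CreateFunction20150331"},
    {"eventName": "DisableAWSServiceAccess"},
]
for hit in persistence_signals(sample):
    print("review:", hit["eventName"])
```

A rule this blunt will be noisy in accounts that legitimately create users and functions; it is a starting point for triage, and pairs naturally with the behavioural context discussed later in the chapter.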

Three more cases round out the taxonomy. Case #174 documents Entra ID restricted-management Administrative Units: an attacker account placed inside a restricted AU with no AU-scoped admin becomes un-disable-able through any tenant admin's normal flow — a "by design" abuse, the cleanest anti-containment example in the corpus.[14]#174 Case #173 abuses Azure Policy: a Resource Policy Contributor edits a shared policy definition, and append/modify effects can disable diagnostic logging or inject backdoor VM extensions while needing no service principal and producing no logs.[15]#173 Case #158 is the textbook trust-class case — a role SupportAWS whose trust policy named an external attacker-owned AWS account, surviving revocation of every credential in the victim account.[16]#158

Defender's mirror — detection from the provider's vantage

This is the chapter's second half, not its coda. Everything above told you how to evade the pipeline. Now you build the pipeline that catches the evader anyway. Because you come to this from web security, the detection-engineering vocabulary gets a callout before the techniques.

◈ Concept · Detection engineering — the minimum vocabulary

Telemetry is the raw event stream; a detection is a rule that fires on a pattern within it. Detections come in three flavours — signature (an exact indicator), anomaly (a statistical spike), and behavioural (a suspicious sequence). A coverage gap is a technique with no detection rule — the precise defensive twin of a logging gap.

Figure 12.6. The detection funnel and its leaks. An attack escapes detection only if it leaks at every stage; the defender's job is to ensure some partial signal always survives.

Even under evasion, threat actors have a fingerprint

The defender's strongest answer to Techniques 1–5 is case #102. Unit 42 took two real adversary groups — Muddled Libra (financially motivated, overlapping with Scattered Spider) and Silk Typhoon (nation-state) — and mapped each group's cloud MITRE ATT&CK techniques to the alert rules they trigger. Across 22 industries (June 2024–June 2025), Muddled Libra's 11 techniques mapped to roughly 70 unique alert rules; Silk Typhoon's 12 to about 50 — and the two sets shared only three rules.[18]#102 The distribution of alert types in a victim environment fingerprints which group is operating, enough to attribute an intrusion from telemetry alone. The teaching point is the chapter's defensive thesis: you cannot evade every logged action — some always land in Ring 1, and the pattern of even partial telemetry is itself a signal. MITRE ATT&CK's Cloud matrices give every technique a stable ID, which is what makes this cross-group comparison possible.[17]

Bar chart of unique cloud alert counts per industry, from 22 down to 5.
Figure 12.7. Unit 42's measured distribution of unique cloud alert types across 22 industries, the raw material of the threat-actor fingerprint. Source: [18]#102
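The fingerprinting idea can be illustrated with a set-overlap score: compare the alert rules that fired in an environment against each group's known rule set and rank the groups by similarity. The sketch below uses Jaccard similarity, which is a choice made for this example and not necessarily the method Unit 42 used; the group names and rule IDs are invented.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_profiles(observed: Set[str],
                  profiles: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Rank known alert-rule profiles by similarity to the observed alerts."""
    scores = [(name, jaccard(observed, rules)) for name, rules in profiles.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Invented profiles: two groups whose rule sets overlap in one rule.
profiles = {
    "group_a": {"rule_01", "rule_02", "rule_03", "rule_04"},
    "group_b": {"rule_04", "rule_05", "rule_06"},
}
observed = {"rule_01", "rule_02", "rule_04"}
print(rank_profiles(observed, profiles))
```

The less two profiles overlap, the fewer surviving alerts are needed to separate them, which is why a small shared set between groups makes partial telemetry so informative.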

Pick the right telemetry source

Case #104 is the Azure-side example. AzureHound — the Go-based Azure component of the BloodHound suite — enumerates Entra ID and Azure through the Microsoft Graph and Azure REST (ARM) APIs. Both are externally reachable, so AzureHound does not need to run inside the victim; it has been misused by Peach Sandstorm, Void Blizzard, and the ransomware operator Storm-0501. Unit 42's key observation: the activity is loud in the Entra sign-in and audit logs — a burst of Graph plus ARM read calls from one service principal — even though it is silent in the resource-level Activity Log.[19]#104 A defender who watches only the Activity Log misses AzureHound entirely; detection engineering starts with choosing the telemetry source that carries the signal.

⚠ Pitfall · CloudTrail is not an EDR

Case #108 (Unit 42) tracked five evolving ELF malware families — NoodleRAT, Winnti, SSHdInjector, Pygmy Goat, AcidPour — targeting cloud Linux workloads. The tradecraft is LD_PRELOAD dynamic-linker hijacking: the malware injects into a legitimate process such as sshd with no on-disk binary altered. None of it appears in CloudTrail or the Activity Log, because the cloud audit log records the control plane, not in-guest process execution. In-workload persistence is a separate telemetry domain — host EDR, eBPF sensors, runtime monitoring — and ignoring it leaves a coverage gap the size of every running VM.[20]#108

How you actually hunt this

Case #165 is Datadog's hypothesis-driven threat-hunting methodology, converting the abstractions above into concrete signals. Backdoor IAM user names from real intrusions — sesadministrator, ampIify-dev (a capital-I homoglyph), ses_legion — are signature IoCs, but the durable detections are behavioural: a high volume of AccessDenied from one identity, and IAM-user creation originating from an EC2 instance (a workload role should almost never call CreateUser).[21]#165 Case #159 shows the pivot in action: SNS GetSMSAttributes enumeration in CloudTrail led Datadog to an attacker IP, then to an entire SMS-phishing campaign.[22]#159
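The two behavioural detections named above can be sketched over CloudTrail-shaped records. Two things here are assumptions for illustration: the threshold of 20 is arbitrary, and treating a session name that starts with `i-` as "originating from an EC2 instance" is a heuristic based on instance-role sessions being named after the instance ID.

```python
from collections import Counter
from typing import Dict, Iterable, List


def access_denied_spikes(events: Iterable[dict], threshold: int = 20) -> Dict[str, int]:
    """Identities with at least `threshold` AccessDenied errors."""
    counts = Counter(
        e.get("userIdentity", {}).get("arn", "unknown")
        for e in events
        if e.get("errorCode") == "AccessDenied"
    )
    return {arn: n for arn, n in counts.items() if n >= threshold}


def create_user_from_instance(events: Iterable[dict]) -> List[dict]:
    """CreateUser calls made by an assumed-role session named like an instance."""
    hits = []
    for e in events:
        identity = e.get("userIdentity", {})
        session = identity.get("arn", "").rsplit("/", 1)[-1]
        if (e.get("eventName") == "CreateUser"
                and identity.get("type") == "AssumedRole"
                and session.startswith("i-")):
            hits.append(e)
    return hits


# Invented sample: a workload role creating an IAM user.
sample = [{
    "eventName": "CreateUser",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::111111111111:assumed-role/WebRole/i-0abc123def456",
    },
}]
print(len(create_user_from_instance(sample)))
```

Neither rule depends on a specific user name or IP address, which is what makes them outlive the signature IoCs listed above.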

⚠ Pitfall · A log sink is data — and data flows

A delicious inversion from case #037: GCP's VPC Service Controls write a Policy Denied log entry for every blocked request — recording the attacker-chosen request parameters into the attacker's own project. An attacker inside a perimeter could encode stolen secrets into the parameters of API calls they knew would be denied, turning the security control's deny log itself into an exfiltration channel.[23]#037 When you design telemetry, ask not only "what does this record" but "who can read what it records."

Cloud Incident Response from the provider's chair

When a provider runs incident response — not a tenant — they hold a cross-tenant view of the control plane, host-agent telemetry, and the internal-API logs. But they inherit the tenant's blind spots too: if a service emitted no event, the provider's SOC is just as blind. The IR loop — detect → scope → contain → evict → recover → learn — has two cloud-specific wrinkles. First, eviction must explicitly kill persistence: revoking the leaked key from #149 does nothing about the API-Gateway-fronted Lambda that re-mints users. Second, "scope" and "recover" are corrupted when the attacker tampered with the very logs you would use — which is why telemetry-class persistence does not just hide the next action, it blinds the entire post-incident investigation.
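The first wrinkle, that eviction must cover every persistence class, lends itself to a simple completeness check on the response plan. The sketch below is a hypothetical helper: the class labels follow this chapter's five-class taxonomy, and the plan structure (a mapping from class to the sweep actions taken) is an assumption for illustration.

```python
from typing import Dict, List

# The five persistence classes defined earlier in this chapter.
PERSISTENCE_CLASSES = (
    "credential",
    "trust",
    "compute",
    "anti-containment",
    "telemetry",
)


def unswept_classes(plan: Dict[str, List[str]]) -> List[str]:
    """Return the persistence classes for which the plan lists no action."""
    return [c for c in PERSISTENCE_CLASSES if not plan.get(c)]


# A plan that only revokes the leaked key leaves four classes unexamined.
plan = {"credential": ["revoke leaked access key"]}
print(unswept_classes(plan))
```

In the #149 intrusion, a plan like this one would leave the compute class unswept, which is exactly where the API-Gateway-fronted Lambda lived.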

Attacker's checklist

Attacker's checklist · operating beneath the provider's eye
  • Before acting, place every planned action on the six-stage pipeline (Figure 12.1) and in a visibility ring (Figure 12.2). Know what each action will record.
  • For recon, prefer Ring 3: unsupported services, undocumented internal APIs, non-production endpoints. Read the Console's Content-Security-Policy to find beta/gamma endpoints.
  • Check for honeytokens through an unlogged service before using unknown keys — read the IAM path in an access-denied error (#134).
  • Prefer feature-inherent oracles over verbose-error oracles; the former survive patching (#136 vs #137).
  • Where invisibility is impossible, aim for under-attribution — route actions through a service principal so the log names the service, not you (#148).
  • Assume the build of persistence is logged — it almost always is. Establish it fast, then layer at least two persistence classes so eviction of one is not eviction of all.
  • Telemetry-class persistence (disabling logging) is force-multiplying: it blinds not just your next action but the defender's whole post-incident scope.
  • Remember your tooling leaves fingerprints: a Telegram link-preview crawler stamped a Telegram IP address into the victim's CloudTrail (#149).

Defender's mirror

Defender's mirror · building telemetry that catches the evader
  • Enable data-event / Data Access logging on crown-jewel resources — drag Ring 2 inward where it matters, accepting the cost.
  • Use organisation-wide trails, log-file integrity validation (digest files), and immutable / object-locked storage so Stage 4 cannot be tampered with.
  • Watch both Azure pipelines — the Activity Log and the Entra audit/sign-in logs. Identity-side attacks (AzureHound) live only in the latter.
  • Alert on the tamper actions themselves: StopLogging, DeleteTrail, DisableAWSServiceAccess, diagnostic-setting changes, GuardDuty's Stealth:IAMUser/CloudTrailLoggingDisabled.
  • Write behavioural detections, not just signatures: AccessDenied spikes per identity; CreateUser originating from an EC2 instance role (#165).
  • Map your detections to MITRE ATT&CK Cloud and audit for coverage gaps — the defensive twin of a logging gap (#102).
  • Treat in-workload telemetry as a separate, mandatory domain: host EDR / eBPF / runtime monitoring catches what CloudTrail structurally cannot (#108).
  • Verify, do not assume: emit each "documented as logged" event in a test tenant and confirm it lands — and re-test, because fixes regress (#168).
  • In IR, eviction must explicitly hunt and kill persistence across all five classes; revoking the initial credential is necessary but never sufficient.
◆ Key takeaways
  • The audit log is a managed service, separate from the API it records. Six pipeline stages sit between "request processed" and "analyst alerted" — and every one can fail.
  • Three rings of visibility: management events (logged by default), data events (logged only if configured), and the "no event" tier (never logged, uncloseable by the customer).
  • Logging-gap bugs sit on the provider's side of the shared-responsibility line — a logging gap is a provider vulnerability with an all-tenant blast radius.
  • Service Catalog (#154) was the step-change: the first publicly known bypass that let an attacker change the environment invisibly, not merely enumerate it.
  • Feature-inherent oracles outlast error-message oracles; under-attribution outlasts full invisibility — both are properties of design, not patchable defects.
  • Persistence is an anti-IR discipline with five classes; establishing it is usually loud, which is precisely the defender's opening.
  • You cannot evade every logged action — the distribution of surviving telemetry fingerprints the attacker (#102). CloudTrail is not an EDR; in-workload activity is a separate telemetry domain.

Chapter 13 re-analyses ChaosDB, Azurescape, and SynLapse as complete chains, and asks you to attach a detection analysis to every step: what did the provider's pipeline see, what did it miss, what telemetry would have caught it. Carry Figure 12.1 forward — it is now a chaining-analysis tool. For every link in every chain, answer the sixth lens.

References

  1. Rhino Security Labs, "AWS IAM Enumeration 2.0: Bypassing CloudTrail Logging". Archived: local copy · Original: rhinosecuritylabs.com. Corpus #134.
  2. Rhino Security Labs, "Assume the Worst: Enumerating AWS Roles through AssumeRole". Archived: local copy · Original: rhinosecuritylabs.com. Corpus #137.
  3. Rhino Security Labs, "Using AWS Account IDs for IAM User Enumeration". Archived: local copy · Original: rhinosecuritylabs.com. Corpus #135.
  4. Rhino Security Labs, "Unauthenticated AWS Role Enumeration (IAM Revisited)". Archived: local copy · Original: rhinosecuritylabs.com. Corpus #136.
  5. Datadog Security Labs, "Bypassing CloudTrail logging via undisclosed private APIs (iamadmin)". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #080.
  6. Datadog Security Labs, "Bypassing CloudTrail in AWS Service Catalog, and other logging research". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #154.
  7. Datadog Security Labs, "Non-production endpoints as an attack surface in AWS". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #152.
  8. Datadog Security Labs, "Enumerating AWS the quiet way: CloudTrail-free discovery with Resource Explorer". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #148.
  9. Datadog Security Labs, "Uncovering agent logging gaps in Microsoft Copilot Studio". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #168.
  10. Datadog Security Labs, "Tales from the cloud trenches: the attacker doth persist too much" (Cloud Attacker Persistence Techniques in AWS). Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #149.
  11. Amazon Web Services, "Logging data events — AWS CloudTrail" and "CloudTrail events" (management vs data events; defaults). Original: docs.aws.amazon.com.
  12. Microsoft, "Audit logs in Microsoft Entra ID" and "Diagnostic settings — log options" (Azure's separate Activity Log and Entra audit/sign-in pipelines). Original: learn.microsoft.com.
  13. Google Cloud, "Cloud Audit Logs overview" (Admin Activity / Data Access / System Event / Policy Denied streams). Original: cloud.google.com.
  14. Datadog Security Labs, "Abusing Entra ID Administrative Units for sticky persistence". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #174.
  15. "Azure Policy abuse for privilege escalation and persistence". Archived: local copy. Corpus #173.
  16. "Unwanted visitor: an AWS cloud intrusion case study" (cross-account backdoor role). Archived: local copy. Corpus #158.
  17. MITRE, "ATT&CK Enterprise — Cloud matrices (IaaS, Identity Provider, SaaS, Office Suite)". Original: attack.mitre.org.
  18. Palo Alto Networks Unit 42, "Novel Technique to Detect Cloud Threat-Actor Operations via Cloud Logging". Archived: local copy · Original: unit42.paloaltonetworks.com. Corpus #102.
  19. Palo Alto Networks Unit 42, "Cloud Discovery and Threat-Actor Misuse of AzureHound". Archived: local copy · Original: unit42.paloaltonetworks.com. Corpus #104.
  20. Palo Alto Networks Unit 42, "The Evolution of Linux Binaries in Targeted Cloud Operations". Archived: local copy · Original: unit42.paloaltonetworks.com. Corpus #108.
  21. Datadog Security Labs, "Following attackers' CloudTrail in AWS: methodology and findings in the wild". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #165.
  22. Datadog Security Labs, "Using AWS CloudTrail to identify malicious activity and spot phishing campaigns". Archived: local copy · Original: securitylabs.datadoghq.com. Corpus #159.
  23. TrustOnCloud, "Exfiltrate data from your super-secure Google Cloud project using the security control built to prevent it". Archived: local copy · Original: trustoncloud.com. Corpus #037.