Shai-Hulud compromised a dev machine and raided GitHub org access: a post-mortem

2025-12-14 10:07 262 184 trigger.dev

On November 25th, one of our engineers was compromised by the Shai-Hulud npm supply chain worm. Here's what happened, how we responded, and what we've changed.


On November 25th, 2025, we were on a routine Slack huddle debugging a production issue when we noticed something strange: a PR in one of our internal repos was suddenly closed, showed zero changes, and had a single commit from... Linus Torvalds?

The commit message was just "init."

Within seconds, our #git Slack channel exploded with notifications. Dozens of force-pushes. PRs closing across multiple repositories. All attributed to one of our engineers.

We had been compromised by Shai-Hulud 2.0, a sophisticated npm supply chain worm that compromised over 500 packages, affected 25,000+ repositories, and spread across the JavaScript ecosystem. We weren't alone: PostHog, Zapier, AsyncAPI, Postman, and ENS were among those hit.

This is the complete story of what happened, how we responded, and what we've changed to prevent this from happening again.

No Trigger.dev packages were ever compromised. The @trigger.dev/* packages and trigger.dev CLI were never infected with Shai-Hulud malware. This incident involved one of our engineers installing a compromised package on their development machine, which led to credential theft and unauthorized access to our GitHub organization. Our published packages remained safe throughout.

The Attack Timeline

Time (UTC)	Event
Nov 24, 04:11	Malicious packages go live
Nov 24, ~20:27	Engineer compromised
Nov 24, 22:36	First attacker activity
Nov 25, 02:56-05:32	Overnight reconnaissance
Nov 25, 09:08-15:08	Legitimate engineer work (from Germany)
Nov 25, 09:10-09:17	Attacker monitors engineer activity
Nov 25, 15:17-15:27	Final recon
Nov 25, 15:27-15:37	Destructive attack
Nov 25, ~15:32	Detection
Nov 25, ~15:36	Access revoked
Nov 25, 16:35	AWS session blocked
Nov 25, 22:35	All branches restored
Nov 26, 20:16	GitHub App key rotated

The compromise

On the evening of November 24th, around 20:27 UTC (9:27 PM local time in Germany), one of our engineers was experimenting with a new project. They ran a command that triggered pnpm install. At that moment, somewhere in the dependency tree, a malicious package executed.

We don't know exactly which package delivered the payload. The engineer was experimenting at the time and may have deleted the project directory as part of cleanup. By the time we investigated, we couldn't trace back to the specific package. The engineer's shell history showed install commands in only three places: our main trigger repo, our cloud repo, and one experimental project.

This is one of the frustrating realities of these attacks: once the malware runs, identifying the source becomes extremely difficult. The package doesn't announce itself. The pnpm install completes successfully. Everything looks normal.

What we do know is that the Shai-Hulud malware ran a preinstall script that:

Downloaded and executed TruffleHog, a legitimate security tool repurposed for credential theft
Scanned the engineer's machine for secrets: GitHub tokens, AWS credentials, npm tokens, environment variables
Exfiltrated everything it found

When the engineer later recovered files from their compromised laptop (booted in recovery mode), they found the telltale signs:

TruffleHog artifacts found on compromised machine

The .trufflehog-cache directory and trufflehog_3.91.1_darwin_amd64.tar.gz file found on the compromised machine. The extract directory was empty, likely cleaned up by the malware to cover its tracks.

17 hours of reconnaissance

The attacker had access to our engineer's GitHub account for 17 hours before doing anything visible. According to our GitHub audit logs, they operated methodically.

Just over two hours after the initial compromise, the attacker validated their stolen credentials and began mass cloning:

Time (UTC)	Location	Activity
22:36:50	US	First attacker access, mass cloning begins
22:36-22:39	US	73 repositories cloned
22:48-22:50	US	~70 more repositories cloned (second wave)
22:55-22:56	US	~90 repositories cloned (third wave)
22:59-23:04	US	~70 repositories cloned (fourth wave)
23:32:59	India	Attacker switches to India-based infrastructure
23:32-23:37	India	73 repositories cloned
23:34-23:35	US + India	Simultaneous cloning from both locations

The simultaneous activity from US and India confirmed we were dealing with a single attacker using multiple VPNs or servers, not separate actors.

While our engineer slept in Germany, the attacker continued their reconnaissance. More cloning at 02:56-02:59 UTC (middle of the night in Germany), sporadic activity until 05:32 UTC. Total repos cloned: 669 (527 from US infrastructure, 142 from India).

Here's where it gets unsettling. Our engineer woke up and started their normal workday:

Time (UTC)	Actor	Activity
09:08:27	Engineer	Triggers workflow on cloud repo (from Germany)
09:10-09:17	Attacker	Git fetches from US, watching the engineer
09:08-15:08	Engineer	Normal PR reviews, CI workflows (from Germany)

While our engineer worked, unaware of the compromise, the attacker was monitoring their activity.

During this period, the attacker created repositories with random string names to store stolen credentials, a known Shai-Hulud pattern:

github.com/[username]/xfjqb74uysxcni5ztn
github.com/[username]/ls4uzkvwnt0qckjq27
github.com/[username]/uxa7vo9og0rzts362c

They also created three repos marked with "Sha1-Hulud: The Second Coming" as a calling card. These repositories were empty by the time we examined them, but based on the documented Shai-Hulud behavior, they likely contained triple base64-encoded credentials.

10 minutes of destruction

At 15:27 UTC on November 25th, the attacker switched from reconnaissance to destruction.

The attack began on our cloud repo from India-based infrastructure:

Time (UTC)	Event	Repo	Details
15:27:35	First force-push	triggerdotdev/cloud	Attack begins
15:27:37	PR closed	triggerdotdev/cloud	PR #300 closed
15:27:44	BLOCKED	triggerdotdev/cloud	Branch protection rejected force-push
15:27:50	PR closed	triggerdotdev/trigger.dev	PR #2707 closed

The attack continued on our main repository:

Time (UTC)	Event	Details
15:28:13	PR closed	triggerdotdev/trigger.dev PR #2706 (release PR)
15:30:51	PR closed	triggerdotdev/trigger.dev PR #2451
15:31:10	PR closed	triggerdotdev/trigger.dev PR #2382
15:31:16	BLOCKED	Branch protection rejected force-push to trigger.dev
15:31:31	PR closed	triggerdotdev/trigger.dev PR #2482

At 15:32:43-46 UTC, 12 PRs on jsonhero-web were closed within a 3-second window, which was clearly automated. The closed PRs included #47, #169, #176, #181, #189, #190, #194, #197, #204, #206, and #208.

Our critical infrastructure repository was targeted next:

Time (UTC)	Event	Details
15:35:41	PR closed	triggerdotdev/infra PR #233
15:35:45	BLOCKED	Branch protection rejected force-push (India)
15:35:48	PR closed	triggerdotdev/infra PR #309
15:35:49	BLOCKED	Branch protection rejected force-push (India)

The final PR was closed on json-infer-types at 15:37:13 UTC.

Detection and response

We got a lucky break. One of our team members was monitoring Slack when the flood of notifications started:

Our #git Slack channel during the attack. A wall of force-pushes, all with commit message "init."

Every malicious commit was authored as:


Author: Linus Torvalds <[email protected]>

An attacked branch: a single "init" commit attributed to Linus Torvalds, thousands of commits behind main.

We haven't found reports of other Shai-Hulud victims seeing this same "Linus Torvalds" vandalism pattern. The worm's documented behavior focuses on credential exfiltration and npm package propagation, not repository destruction. This destructive phase may have been unique to our attacker, or perhaps a manual follow-up action after the automated worm had done its credential harvesting.

Within 4 minutes of detection, we identified the compromised account and removed it from the GitHub organization. The attack stopped immediately.

Our internal Slack during those first minutes:

"Urmmm guys? what's going on?"

"add me to the call @here"

"Nick could you double check Infisical for any machine identities"

"can someone also check whether there are any reports of compromised packages in our CLI deps?"

Within the hour:

Time (UTC)	Action
~15:36	Removed from GitHub organization
~15:40	Removed from Infisical (secrets manager)
~15:45	Removed from AWS IAM Identity Center
~16:00	Removed from Vercel and Cloudflare
16:35	AWS SSO sessions blocked via deny policy (sessions can't be revoked)
16:45	IAM user console login deleted
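
The 16:35 row above notes that existing sessions could not be revoked and were blocked with a deny policy instead. The article does not show the policy itself. A common pattern for this, sketched below as an assumption about how it might look, is to deny every action for temporary credentials issued before a cutoff, using the `aws:TokenIssueTime` condition key. The function name and output file name are hypothetical.

```shell
# Print an IAM policy that denies every action for temporary credentials
# issued before the given ISO 8601 UTC timestamp.
deny_sessions_before() {
  cutoff=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "DateLessThan": { "aws:TokenIssueTime": "$cutoff" }
      }
    }
  ]
}
EOF
}

# Example (hypothetical file name):
#   deny_sessions_before "2025-11-25T16:35:00Z" > deny-old-sessions.json
```

Credentials issued after the cutoff are unaffected, so a policy like this blocks stolen sessions without locking out a freshly re-authenticated user.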

The damage

Repository clone actions: 669 (public and private), including infrastructure code, internal documentation, and engineering plans.

Branches force-pushed: 199 across 16 repositories

Pull requests closed: 42

Protected branch rejections: 4. Some of our repositories have main branch protection enabled, but we had not enabled it for all repositories at the time of the incident.

npm packages were not compromised. This is the difference between "our repos got vandalized" and "our packages got compromised."

Our engineer didn't have an npm publishing token on their machine, and even if they had, we already required 2FA for publishing to npm. Without those safeguards, Shai-Hulud could have published malicious versions of @trigger.dev/sdk, @trigger.dev/core, and others, potentially affecting thousands of downstream users.

No production databases or other AWS resources were accessed. Our AWS CloudTrail audit showed only read operations from the compromised account:

Event Type	Count	Service
ListManagedNotificationEvents	~40	notifications
DescribeClusters	8	ECS
DescribeTasks	4	ECS
DescribeMetricFilters	6	CloudWatch

These were confirmed to be legitimate operations by our engineer.

One nice surprise: AWS actually sent us a proactive alert about Shai-Hulud. They detected the malware's characteristic behavior (ListSecrets, GetSecretValue, BatchGetSecretValue API calls) on an old test account that hadn't been used in months, so we just deleted it. But kudos to AWS for the proactive detection and notification.

The recovery

GitHub doesn't have server-side reflog. When someone force-pushes, that history is gone from GitHub's servers.

But we found ways to recover.

Push events are retained for 90 days via the GitHub Events API. We wrote a script that fetched pre-attack commit SHAs:


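# Find the pre-attack commit SHA for a branch from the repo's push events.
# Events are returned newest first, so the first match is the latest push.
gh api "repos/$REPO/events" --paginate | \
 jq -r --arg ref "refs/heads/$BRANCH" \
 '.[] | select(.type == "PushEvent" and .payload.ref == $ref) | .payload.before' | \
 head -1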

Public repository forks still contained original commits. We used these to verify and restore branches.

Developers who hadn't run git fetch --prune (all of us?) still had old SHAs in their local reflog.

Within 7 hours, all 199 branches were restored.
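
Once a pre-attack SHA is known (from the events API, a fork, or a local reflog), restoring a branch comes down to pushing that commit back over the vandalized ref. The following is a minimal sketch of that step, not the script we actually ran. The function name `restore_branch` is hypothetical, and it assumes the commit already exists in the local repository.

```shell
# Point BRANCH on REMOTE back at SHA. The commit must exist in the local
# repository (for example via a local reflog or a fetch from a fork).
restore_branch() {
  remote=$1 branch=$2 sha=$3
  # Stop early if SHA does not resolve to a commit we actually have.
  git cat-file -e "${sha}^{commit}" 2>/dev/null || {
    echo "commit $sha not found locally" >&2
    return 1
  }
  git push --force "$remote" "${sha}:refs/heads/${branch}"
}

# To see where a remote-tracking branch pointed before recent fetches:
#   git reflog show origin/main
```

A force push is needed here because the attacker's "init" commit shares no history with the original branch.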

GitHub app private key exposure

During the investigation, our engineer was going through files recovered from the compromised laptop and discovered something concerning: the private key for our GitHub App was in the trash folder.

When you create a private key in the GitHub App settings, GitHub automatically downloads it. The engineer had created a key at some point, and while the active file had been deleted, it was still in the trash, potentially accessible to TruffleHog.

Our GitHub App has the following permissions on customer repositories:

Permission	Access Level	Risk
contents	read/write	Could read/write repository contents
pull_requests	read/write	Could read/create/modify PRs
deployments	read/write	Could create/trigger deployments
checks	read/write	Could create/modify check runs
commit_statuses	read/write	Could mark commits as passing/failing
metadata	read	Could read repository metadata

To generate valid access tokens, an attacker would need both the private key (potentially compromised) and the installation ID for a specific customer. Installation IDs are stored in our database, which was not compromised, and were not on the compromised machine.

We immediately rotated the key:

Time (UTC)	Action
Nov 26, 18:51	Private key discovered in trash folder
Nov 26, 19:54	New key deployed to test environment
Nov 26, 20:16	New key deployed to production

We found no evidence of unauthorized access to any customer repositories. The attacker would have needed installation IDs from our database to generate tokens, and our database was not compromised as previously mentioned.

However, we cannot completely rule out the possibility. An attacker with the private key could theoretically have called the GitHub API to enumerate all installations. We've contacted GitHub Support to request additional access logs. We've also analyzed the webhook payloads to our GitHub App, looking for suspicious push or PR activity from connected installations and repositories. We haven't found any evidence of unauthorized activity in these webhook payloads.
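
For context on why the private key alone was worth rotating: a GitHub App authenticates by signing a short-lived JWT with its private key, and that JWT is what the installation endpoints accept. The sketch below follows GitHub's documented flow using `openssl`. The function names and the app ID argument are placeholders, and nothing here reflects our actual configuration.

```shell
# Base64url-encode stdin without padding or newlines.
b64url() {
  openssl base64 | tr -d '=\n' | tr '/+' '_-'
}

# Build an RS256-signed JWT for a GitHub App. GitHub caps the lifetime at
# 10 minutes, so the token is backdated 60s and expires 9 minutes from now.
make_app_jwt() {
  app_id=$1 key_file=$2
  now=$(date +%s)
  header=$(printf '{"alg":"RS256","typ":"JWT"}' | b64url)
  payload=$(printf '{"iat":%s,"exp":%s,"iss":"%s"}' \
    "$((now - 60))" "$((now + 540))" "$app_id" | b64url)
  signature=$(printf '%s.%s' "$header" "$payload" |
    openssl dgst -sha256 -sign "$key_file" | b64url)
  printf '%s.%s.%s\n' "$header" "$payload" "$signature"
}

# With such a JWT as a Bearer token, the REST API exposes:
#   GET  https://api.github.com/app/installations
#   POST https://api.github.com/app/installations/INSTALLATION_ID/access_tokens
```

Rotating the key invalidates any JWT an attacker could mint with the old one, which is why rotation closes this path regardless of whether installation IDs were known.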

We've sent out an email to potentially affected customers to notify them of the incident, with detailed instructions on how to check if they were affected. Please check your email for more details if you've used our GitHub App.

Technical deep-dive: how Shai-Hulud works

For those interested in the technical details, here's what we learned about the malware from Socket's analysis and our own investigation.

When npm runs the preinstall script, it executes setup_bun.js:

Detects OS/architecture
Downloads or locates the Bun runtime
Caches Bun in ~/.cache
Spawns a detached Bun process running bun_environment.js with output suppressed
Returns immediately so npm install completes successfully with no warnings

The malware runs in the background while you think everything is fine.
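
Because the entry point is an install-time lifecycle script, one rough way to see your exposure is to list which installed packages declare one. The sketch below is a crude text match and not a proper JSON parse, so it can produce false positives (for example, a dependency that is itself named "install"). The function name is hypothetical.

```shell
# Print every package.json under ROOT that declares an install-time
# lifecycle script (preinstall, install, or postinstall).
list_install_scripts() {
  root=${1:-node_modules}
  find "$root" -type f -name package.json \
    -exec grep -lE '"(preinstall|install|postinstall)"[[:space:]]*:' {} + 2>/dev/null
  # grep exits non-zero when nothing matches; that is not an error here.
  return 0
}

# Example:
#   list_install_scripts ./node_modules
```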

The payload uses TruffleHog to scan $HOME for GitHub tokens (from env vars, gh CLI config, git credential helpers), AWS/GCP/Azure credentials, npm tokens from .npmrc, environment variables containing anything that looks like a secret, and GitHub Actions secrets (if running in CI).

Stolen credentials are uploaded to a newly-created GitHub repo with a random name. The data is triple base64-encoded to evade GitHub's secret scanning.

Files created:

contents.json (system info and GitHub credentials)
environment.json (all environment variables)
cloud.json (cloud provider credentials)
truffleSecrets.json (filesystem secrets from TruffleHog)
actionsSecrets.json (GitHub Actions secrets if any)
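
To illustrate the triple base64 encoding described above: each pass turns the previous output into ordinary base64 text, so the final blob no longer contains a recognizable token prefix for a pattern-based scanner to match. This is a minimal sketch, assuming a `base64` tool that accepts `-d` for decoding. The function names and the placeholder value are made up for illustration and are not taken from the malware.

```shell
# Encode stdin with three successive base64 passes.
triple_b64_encode() {
  base64 | base64 | base64
}

# Reverse the three passes to recover the original bytes.
triple_b64_decode() {
  base64 -d | base64 -d | base64 -d
}

# Example (placeholder value, not a real credential):
#   printf 'ghp_exampletoken' | triple_b64_encode | triple_b64_decode
```

The same decode pipeline is what an investigator would use to inspect the JSON files in one of these exfiltration repositories.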

If an npm publishing token is found, the malware validates the token against the npm registry, fetches packages maintained by that account, downloads each package, patches it with the malware, bumps the version, and re-publishes, infecting more packages.

This is how the worm spread through the npm ecosystem, starting from PostHog's compromised CI on November 24th at 4:11 AM UTC. Our engineer was infected roughly 16 hours after the malicious packages went live.

If no credentials are found to exfiltrate or propagate, the malware attempts to delete the victim's entire home directory. Scorched earth.

File artifacts to look for: setup_bun.js, bun_environment.js, cloud.json, contents.json, environment.json, truffleSecrets.json, actionsSecrets.json, .trufflehog-cache/ directory.

Malware file hashes (SHA1):

bun_environment.js: d60ec97eea19fffb4809bc35b91033b52490ca11
bun_environment.js: 3d7570d14d34b0ba137d502f042b27b0f37a59fa
setup_bun.js: d1829b4708126dcc7bea7437c04d1f10eacd4a16

We've published a detection script that checks for Shai-Hulud indicators.
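
Independently of that script, the indicators listed above can be checked with standard tools. The following is a minimal sketch and not the published detection script: it looks for the more distinctive artifact names and compares file hashes against the three SHA-1 values above. Generic names such as cloud.json and contents.json are left out on purpose because they would match many unrelated files. The function names are hypothetical.

```shell
# SHA-1 hashes of known malicious files, as listed above.
KNOWN_BAD_SHA1="d60ec97eea19fffb4809bc35b91033b52490ca11
3d7570d14d34b0ba137d502f042b27b0f37a59fa
d1829b4708126dcc7bea7437c04d1f10eacd4a16"

# Print the SHA-1 of a file using whichever tool is installed.
sha1_of() {
  if command -v sha1sum >/dev/null 2>&1; then
    sha1sum "$1" | awk '{print $1}'
  else
    shasum -a 1 "$1" | awk '{print $1}'
  fi
}

# List artifact names under a directory and flag files whose hash is known bad.
scan_for_indicators() {
  dir=${1:-$HOME}
  find "$dir" \( -name setup_bun.js -o -name bun_environment.js \
      -o -name truffleSecrets.json -o -name actionsSecrets.json \
      -o -name .trufflehog-cache \) -print 2>/dev/null |
  while IFS= read -r path; do
    if [ -f "$path" ] && printf '%s\n' "$KNOWN_BAD_SHA1" | grep -qx "$(sha1_of "$path")"; then
      echo "KNOWN-BAD $path"
    else
      echo "SUSPICIOUS $path"
    fi
  done
}

# Example:
#   scan_for_indicators "$HOME"
```

A name match alone is only a prompt to look closer. A hash match is much stronger evidence, though new variants will have different hashes.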

What we've changed

We disabled npm scripts globally:


npm config set ignore-scripts true --location=global

This prevents preinstall, postinstall, and other lifecycle scripts from running. It's aggressive and some packages will break, but it's the only reliable protection against this class of attack.

We upgraded to pnpm 10. This was significant effort (had to migrate through pnpm 9 first), but pnpm 10 brings critical security improvements. Scripts are ignored by default. You can explicitly whitelist packages that need to run scripts via pnpm.onlyBuiltDependencies. And the minimumReleaseAge setting prevents installing packages published recently.


minimumReleaseAge: 4320 # 3 days in minutes

To whitelist packages that legitimately need build scripts, run pnpm approve-builds. This prompts you to select which packages to allow (like esbuild, prisma, sharp).

For your global pnpm config:


pnpm config set minimumReleaseAge 4320
pnpm config set --json minimumReleaseAgeExclude '["@trigger.dev/*", "trigger.dev"]'

We switched npm publishing to OIDC. No more long-lived npm tokens anywhere. Publishing now uses npm's trusted publishers with GitHub Actions OIDC. Even if an attacker compromises a developer machine, they can't publish packages because there are no credentials to steal. Publishing only happens through CI with short-lived, scoped tokens.

We enabled branch protection on all repositories. Not just critical repos or just OSS repos. Every repository with meaningful code now has branch protection enabled.

We've adopted Granted for AWS SSO. Granted encrypts SSO session tokens on the client side, unlike the AWS CLI which stores them in plaintext.

Based on PostHog's analysis of how they were initially compromised (via pull_request_target), we've reviewed our GitHub Actions workflows. We now require approval for external contributor workflow runs on all our repositories (previous policy was only for public repositories).

Lessons for other teams

The ability for packages to run arbitrary code during installation is the attack surface. Until npm fundamentally changes, add ignore-scripts=true to your ~/.npmrc.

Yes, some things will break. Whitelist them explicitly. The inconvenience is worth it.

pnpm 10 ignores scripts by default and lets you set a minimum age for packages:


pnpm config set minimumReleaseAge 4320 # 3 days

Newly published packages can't be installed for 3 days, giving time for malicious packages to be detected.

Branch protection takes 30 seconds to enable. It prevents attackers from pushing directly to a main branch, where a push could trigger malicious GitHub Actions workflows.

Long-lived npm tokens on developer machines are a liability. Use trusted publishers with OIDC instead.

If you don't need a credential on your local machine, don't have it there. Publishing should happen through CI only.

Our #git Slack channel is noisy. That noise saved us.

One of the hardest parts of this incident was that it happened to a person.

"Sorry for all the trouble guys, terrible experience"

Our compromised engineer felt terrible, even though they did absolutely nothing wrong. It could have happened to any team member.

Running npm install is not negligence. Installing dependencies is not a security failure. The security failure is in an ecosystem that allows packages to run arbitrary code silently.

They also discovered that the attacker had made their GitHub account star hundreds of random repositories during the compromise. Someone even emailed us: "hey you starred my repo but I think it was because you were hacked, maybe remove the star?"

Metric	Value
Time from compromise to first attacker activity	~2 hours
Time attacker had access before destructive action	~17 hours
Duration of destructive attack	~10 minutes (15:27-15:37 UTC)
Time from first malicious push to detection	~5 minutes
Time from detection to access revocation	~4 minutes
Time to full branch recovery	~7 hours
Repository clone actions by attacker	669
Repositories force-pushed	16
Branches affected	199
Pull requests closed	42
Protected branch rejections	4

Have questions about this incident? Reach out on Twitter/X or Discord.


Comments

By snickerbockers 2025-12-14 17:12 (7 replies)

>Running npm install is not negligence. Installing dependencies is not a security failure. The security failure is in an ecosystem that allows packages to run arbitrary code silently.

No, your security failure is that you use a package manager that allows third parties to push arbitrary code into your product with no oversight. You only have "security" to the extent that you can trust the people who control those packages to act both competently and in good faith ad infinitum.

Also the OP seemingly implies credentials are stored on-filesystem in plaintext but I might be extrapolating too much there.

By majormajor 2025-12-14 21:55 (1 reply)

> Running npm install is not negligence. Installing dependencies is not a security failure. The security failure is in an ecosystem that allows packages to run arbitrary code silently.

This is wildly circular logic!

"One person using these tools isn't bad security practice, the problem is that EVERYONE ELSE ["the ecosystem"] uses these tools and doesn't have higher standards!"

It should be no shock to anyone at this point that huge chunks of common developer tools have very poor security profiles. We've seen stories like this many times.

If you care, you need to actually care!

By perching_aix 2025-12-14 23:17 (2 replies)

So do you actually agree or disagree that there's something wrong with npm? It reads as if you were playing both sides, just to land on blaming the individual each time.

Even if this was actually some weirdly written plea for shared responsibility, surely it makes sense that in a hierarchy, one would prioritize trying to fix things upstream closer to the root, rather than downstream closer to the leaves, doesn't it?

> This is wildly circular logic!

They're very clearly implying a semantic disagreement there, not making a logical mistake.

By jrflowers 2025-12-15 5:01 (1 reply)

I can’t speak for majormajor but I thought the language was kind of funny. “The problem is an ecosystem that allows packages to run arbitrary code silently” is an odd statement because for many people that’s kind of what a package manager does.

By ballpug 2025-12-15 12:17

[dead]

By chatmasta 2025-12-15 11:14

> one would prioritize trying to fix things upstream closer to the root

One should prioritize fixing things one is responsible for. If you make a commitment to protect your user’s data, then you take responsibility for the tools you use, and how you use them.

Whether or not you – or someone else – should fix those tools upstream, is a separate issue to be solved later. First solve the problems that are your responsibility. Then worry about everyone else.

The npm ecosystem has many security issues but they are all mitigatable.

By deepsun 2025-12-14 17:34 (2 replies)

Same thing with IDE plugins. At least some IDEs come full-featured from the manufacturer, but I couldn't get on with VS Code because for every small feature I had to install some random plugin (popular, perhaps, but still developed by who-knows-who).

By willvarfar 2025-12-14 21:16

The number of browser extension authors who have talked openly about being approached to sell their extension or insert malicious code is large, and presumably many others have taken the money and not told us about it. It seems likely there are IDE extensions doing or going to do the same thing...

By packtreefly 2025-12-14 23:47

It's painful, but I've grown distrustful enough of the ecosystem that I disable updates on every IDE plugin not maintained by a company with known-adequate security controls and review the source code of plugin changes before installing updates, typically opting out unless something is broken.

It's unclear to me if the code linked on the plugin's description page is in any way guaranteed to be the code that the IDE downloads.

The status quo in software distribution is simultaneously convenient, extraordinarily useful, and inescapably fucked.

By amluto 2025-12-15 7:25

>> The security failure is in an ecosystem that allows packages to run arbitrary code silently.

> No, your security failure is that you use a package manager that allows third-parties push arbitrary code into your product with no oversight.

How about both? It’s conceptually straightforward to build a language in which code cannot do anything other than read its inputs, consume resources, and produce correctly typed output.

This would not fully solve the supply chain problem — malicious code could produce maliciously incorrect output or exploit side channels, but the exposure would be much, much less than it is now.

By atherton94027 2025-12-15 4:38 (3 replies)

> No, your security failure is that you use a package manager that allows third-parties push arbitrary code into your product with no oversight.

Could you explain how you'd design a package manager that does not allow that? As far as I understand the moment you use third party code you have to trust to some extent the code that you will run.

By tkinom 2025-12-15 5:23 (2 replies)

Can we design something like virustotal setup? (https://en.wikipedia.org/wiki/VirusTotal)

Could NPM keep something like a dl_files_security_sigs.db database covering every file downloaded from npm, including offline installs? It would list all versions, the latest modification date, and multiple cryptographic signatures (sha256, etc.), record whether each file has been reviewed by multiple security orgs/researchers, and auto-flag any contents that are not pure clear/clean text...

If it detects anything (file date, size, crypto sigs) < N days and have not been thru M="enough" security reviews, the npm system will automatically raise a security flag and stop the install and auto trigger security review on those files.

With proper (default secure) setup, any new version of npm downloads (code, config, scripts) will auto trigger stop download and flagged for global security review by multiple folks/orgs.

When/if this setup is available as the NPM default, would it stop a similar compromise from happening to NPM again? Can anyone think of any way to hack around this?

By delusional 2025-12-15 6:04

How would you identify "security researchers" and tell them apart from the attacker in a trench coat?

After you've done that, why would these supposedly expert security researchers review random code in your package manager?

By duckmysick 2025-12-15 8:34 (1 reply)

> have been reviewed by multiple security org/researchers

I imagine reviewing all the code for all the packages for all the published versions gets really expensive. Who's paying for this?

By Orygin 2025-12-15 14:05

Microsoft has a 3.5 trillion dollar market cap. I guess they can pay for it?

By snickerbockers 2025-12-15 16:43

I'm speaking to the concept of automatic updates in general, which package managers either enable by default or implicitly allow through lack of security measures.

One obvious solution is to host your own repositories so that nothing gets updated without having been signed off by a trusted employee. Another is to check the cryptographic hash of all packages so it cannot change without the knowledge and consent of your employees.

You're right in that this does not completely eliminate the possibility of trojan horses being sneaked in through open-source dependencies but it would at the very least require some degree of finesse on the part of the person making the trojan horse so that they have to manipulate the system into doing something it was not designed to do.

One thing I really hate about the modern cybersecurity obsession is that there's a large contingent of people who aggressively advocate against anything which might present a problem if misused (rust, encryption on everything no matter how inconsequential, deprecating FTP, UEFI secure boot, timing side-channels, etc) yet at the same time there's a massive community of high-level software developers who appear to be under the impression that extremely basic vulnerabilities (trojan package managers, cross-site scripting, letting my cell phone provider steal my identity because my entire life is authenticated by a SIM card, literally just concatenating strings received over the internet into an SQL statement, etc) are unsolved problems which just has to be tolerated for now until somebody figures out a way to not download and execute non-vetted third-party code. Somehow the two groups never seem to cross swords.

TL;DR: Reading HN i feel like im constantly getting criticized for using C because I might fuck up and let a ROP through yet so many of the most severe modern security breaches are coming from people who think turning off automatic updates is like being asked to prove the Riemann hypothesis.

By vasco 2025-12-15 4:59 (3 replies)

They can't explain, it's just victim blaming. The market currently doesn’t have a proper solution to this.

Everyone works with these package managers, I bet the commenter also has installed pip or npm packages without reading its full code, it just feels cool to tell other people they are dumb and it's their own fault for not reading all the code beforehand or for using a package manager, when every single person does the same. Some just are unlucky.

The whole ecosystem is broken, the expectations of trust are not compatible with the current amount of attacks.

By voidnap 2025-12-15 9:12 (3 replies)

It isn't victim blaming. People like you make it impossible to avoid attacks like these because you have no appetite for a better security model.

I run npm under bubblewrap because npm has a culture of high risk; of using too many dependencies from untrusted authors. But being scrupulous and responsible is a cost I pay with my time and attention. But it is important because if I run some untrusted code and am compromised it can affect others.

But that is challenging when every time some exploit rolls around people, like you, brush it off as "unlucky". As if to say it's unavoidable. That nobody can be expected to be responsible for the libraries they use because that is too hard or whatever. You simply lack the appetite for good hygiene and it makes it harder for the minority of us who care about how our actions affect others.

By VPenkov 2025-12-15 10:20 (1 reply)

> you have no appetite for a better security model

For what it's worth, there are some advancements. PNPM - the packager used in this case - doesn't automatically run postinstall scripts. In this case, either the engineer allowed it explicitly, or a transitive dependency was previously considered safe, and allowed by default, but stopped being safe.

PNPM also lets you specify a minimum package age, so you cannot install packages younger than X. The combination of these would stop most attacks, but becomes less effective if everyone specifies a minimum package age, so no one would fall victim.

It's a bit grotesque because the system relies on either the package author noticing on time, or someone falling victim and reporting it.

NPM now supports publishing signed packages, and PNPM has a trustPolicy flag. This is a step in a good direction, but is still not enough, because it relies on publishers to know and care about signing packages, and it relies on consumers to require it.

There _is_ appetite for a better security model, but a lot of old, ubiquitous packages, are unmaintained and won't adopt it. The ecosystem is evolving, but very slowly, and breaking changes seem needed.

By VPenkov 2025-12-15 18:26

I had the chance to finish reading and it looks like Trigger were using an older version of PNPM which didn't do any of the above, and have since implemented everything I've mentioned in my post, plus some additional Git security.

So a slight amendment there on the human error side of things.

By vasco 2025-12-15 15:37 (1 reply)

What no appetite? I just don't like your solution. The industry needs an answer to this problem stat, and it can't be "just read the code before".

By voidnap 2025-12-15 18:03 (1 reply)

At some point you must be open to being compelled to read code you run or ship. Otherwise, if that's too hard, then I don't know what to tell you. We'll just never agree.

If you find a better solution than being responsible for what you do and who you trust, I'm all for it. Until then, that's part of the job.

When I was a junior, our company paid for a commercial license for some of the larger libraries we used and it included support. Or you can manage risk by using fewer and more trustworthy projects like Django instead of reaching for a new dependency from some random person every time you need to solve a simple problem.

> What no appetite? I just don't like your solution.

When I say "appetite" I am being very deliberate. You are hungry but you won't eat your vegetables. When you say "I just don't like your vegetables", then you aren't that hungry. You don't have the appetite. You'd rather accept the risk. Which is fine but then don't complain when stuff like this happens and everyone is compromised.

By vasco 2025-12-1611:06

I hope you've read every diff to every Linux kernel you've ever deployed... There's LOADS of code you've deployed that I can bet a large amount of money you never read. So clearly there are solutions to the problem of having to read every line of every dependency you deploy. It's just that certain ecosystems are easier to exploit, so new solutions are needed. "Read everything" is not a solution, it's a bandaid that shows there's a problem of trust to be solved (or improved enough to discourage this wave of attacks) with a technical solution.

By godelski 2025-12-1510:211 reply

No, you are the problem because you have a higher expectation than reality. People shouldn't have to run npm in containers. You're oversimplifying with one case where you have found one solution while ignoring the identical problems elsewhere. You are preventing us from looking at other solutions because you think the one you have is enough and works for everyone.

By voidnap 2025-12-1517:451 reply

I agree with you that I shouldn't have to treat my libraries like untrusted code. I don't know what the rest of your comment means. I don't see how I'm preventing anybody from looking at other solutions to npm, they just don't want to do it because it's hard. And I have similar criticisms for cargo as it just copies npm and inherits all of its problems. I hate that.

npm has had a bad ecosystem since its inception. The left-pad thing being some of my earliest memories of it [1]. So none of this is new.

But all of this is still an issue because it's too convenient and that's the most important thing. Even cargo copies npm because they want to be seen as convenient and the risk is acknowledged. Nobody has the appetite to be held accountable for who they put their trust in.

[1] https://en.wikipedia.org/wiki/Npm_left-pad_incident

By godelski 2025-12-1522:521 reply

The problem is you're victim blaming.

  > snickerbockers > No, your security failure is that you use a package manager
  > you > It isn't victim blaming. People like you make it impossible to avoid attacks like these because you have no appetite for a better security model.

I'd wager a large portion of people with `npm` don't actually realize they have `npm`. I'd also wager that most people that know they have `npm` aren't aware of the security issues.

Under those conditions, people are not in fact making choices. These are not people "that have no appetite for a better security model". These are people who don't even know they are unsafe!

Yes, this is victim blaming. Just in the same way people blame a rape victim for what they wear. Does what you wear modify the situation? Yes. Does it cause the situation? No. We only really blame a victim if they are putting themselves directly, and knowingly, in harm's way. This is not that case! This is a case where people are uninformed, both in the dangers present as well as the existence of danger.

FFS, on more than one occasion I've installed a package only to see that it bundles `npm` along with it. And I'm more diligent than most people, so I know tons of people don't know it's happening. Especially because you can't always run `which npm` to find if it is installed. But the fact is that you can do something like `brew install foo` and foo has a dependency that has a dependency that has node as a dependency.

Dependency hell is integral to the problem here! So you can go ahead and choose a package manager that doesn't allow 3rd parties to push arbitrary code and end up with a package manager that allows 3rd parties to push arbitrary code! That's even what made left-pad a thing (and don't get me started on the absurdity of using a module for this functionality!).

  > Nobody has the appetite to be held accountable for who they put their trust in

That is just not the reality of things. In the real world nobody can read all the lines of code. It just simply isn't possible. You aren't reading everything that you're running, let alone all the dependencies and all the way down to the fucking kernel. There just isn't enough time in the day to do this within your lifetime, even if you are running a very cut down system. There's just too many lines of code!

So stop this bullshit rhetoric of "know what you're running" because it is ignoring the reality of the situation. Yes, people should do due diligence and inspect, but the reality is that this is not possible to do. Nor is it bulletproof, as it requires the reader to be omniscient themselves, or at least a security expert with years of training to even be able to spot security mistakes. Hell, if everyone (or just programmers) already had that kind of training then I'd wager 90+% of issues wouldn't even exist in the code in the first place.

So stop oversimplifying the situation because we can't even begin to talk about what needs to be done to solve things if we can't even discuss the reality of the problem.

By voidnap 2025-12-220:36

It's cute that you truncated the most important part of the other commenter's message; "your security failure is that you use a package manager [that allows third-parties push arbitrary code into your product with no oversight]."

> I'd wager a large portion of people with `npm` don't actually realize they have `npm`.

Recklessness is not a defense.

> But the fact is that you can do something like `brew install foo` and foo has a dependency that has a dependency that has node as a dependency.

That's good to know. I've never looked at brew and wasn't planning on using it, but I will stay away from it in the future. It sounds like you learned your lesson though, right?

Because if you haven't, that sounds like negligence. You can't be unaccountable for your actions by admitting that you did not expect those outcomes when you did not do your due diligence. And if you don't hold yourself accountable, then you sure aren't about to hold others accountable either. So your whole ecosystem is screwed.

> Yes, this is victim blaming. Just in the same way people blame a rape victim for what they wear.

Not even remotely. I can say that it's bad for people to abuse exploits and that victims don't deserve that. At the same time, if I put my private key without a passphrase into the public, or commit secrets to git and share them with the public, I am being negligent.

You are leaving your car unlocked with the windows rolled down in a dodgy part of town overnight. And when it's gone/pilfered in the morning, it's completely fair to say that you did a stupid thing.

We can say that is negligent without saying that you deserved it or that it ought to have happened. And it's absolutely okay for me, or anybody else, to say that you should have known better, without you comparing me to a rape apologist.

> In the real world nobody can read all the lines of code. There's just too many lines of code!

I don't know why you went on that rant when you quoted me talking about "trust". I wouldn't need trust if I could fully understand everything about every machine I use and only rely on myself.

> So stop this bullshit rhetoric of "know what you're running" because it is ignoring the reality of the situation.

Naw, it isn't. I trust packages from my operating system's package manager. The issues we see with left-pad and Shai-Hulud have never happened and will never happen to me using those packages, because they simply do not accept the kinds of garbage people put up on npm, or brew apparently, as you pointed out.

I avoid running stuff like oh-my-zsh because I don't have the patience to audit that and I certainly don't want to run untrusted stuff in my shell as root. But it's a very popular package because people, like you, have a greater risk tolerance. And that's fine, as long as you accept the consequences of that risk tolerance. You aren't paying for support or liability, you aren't reading the code, you are putting trust in random sources and hoping that things work out.

If you want the luxury of running untrusted code as root, or the luxury of leaving your car open in a dodgy part of town overnight, then maybe what you want is a surveillance state, idk. There is a cost to that. A tradeoff. If that's what you want and that's your goal, then I can't stop you. But you could also just ... not do such risky things.

By u8080 2025-12-1512:20

>it's their own fault for not reading all the code beforehand or for using a package manager, when every single person does the same.

But like, isn't that actually the core of the problem? People choose to blindly trust some random 3rd parties - doesn't exploiting this trust seem to be an inevitable and predictable outcome?

By snickerbockers 2025-12-1516:58

>it's just victim blaming

Victim-blaming is when a girl gets raped and you tell her that it's her fault for dressing like a skank and getting drunk at a college fraternity party. Telling the bank they should have put the money in a vault instead of leaving it in an unlocked drawer next to the cash register is not victim-blaming. Telling the CIA that they shouldn't have given Osama bin Laden guns and money to fight the Soviets in Afghanistan is not victim-blaming. Telling President Roosevelt it was a poor decision to park the entire Pacific fleet in a poorly-defended naval base adjacent to an expansionist empire which is already at war with most of America's allies is not victim-blaming. *Telling a well-funded corporation to not download and execute third-party code with privileges is not victim blaming, especially as their customers are often the ones who are actually being targeted.*

>I bet the commenter also has installed pip or npm packages without reading its full code

I think I did use pip at some point about a decade ago but I can't remember what for. In general though you lose that bet because I don't use either of these programs.

> it just feels cool to tell other people they are dumb

it does, yes.

>and it's their own fault for not reading all the code beforehand or for using a package manager, when every single person does the same.

I don't suppose you've ever played an old video game called "Lemmings"?

>Some just are unlucky.

Lol.

>The whole ecosystem is broken, the expectations of trust are not compatible with the current amount of attacks.

that's kind of my point, except it doesn't mitigate responsibility for participating in that ecosystem.

By LtWorf 2025-12-1418:461 reply

> Also the OP seemingly implies credentials are stored on-filesystem in plaintext but I might be extrapolating too much there.

Doesn't really matter, if the agent is unlocked they can be accessed.

By johncolanduoni 2025-12-1420:271 reply

This is not strictly true - most OS keychain stores have methods of authenticating the requesting application before remitting keys (signatures, non-user-writable paths, etc.), even if it's running as the correct user. That said, it requires careful design on the part of the application (and its install process) to not allow a non-elevated application to overwrite some part of the trusted application and get the keys anyway. macOS has the best system here in principle with its bundle signing, but most developer tools are not in bundles so it's of limited utility in this circumstance.

By michaelt 2025-12-1422:562 reply

> This is not strictly true - most OS keychain stores have methods of authenticating the requesting application before remitting keys (signatures, non-user-writable paths, etc.), even if it's running as the correct user.

Isn't that a smartphone-and-app-store-only thing?

As I understand it, no mainstream desktop OS provides the capabilities to, for example, protect a user's browser cookies from a malicious tool launched by that user.

That's why e.g. PC games ship with anti-cheat mechanisms - because PCs don't have a comprehensive attested-signed-code-only mechanism to prevent nefarious modifications by the device owner.

By acdha 2025-12-1423:041 reply

> As I understand it, no mainstream desktop OS provides the capabilities to, for example, protect a user's browser cookies from a malicious tool launched by that user.

macOS sandboxing has been used for this kind of thing for years. Open a terminal window on a new Mac and trying to open the user’s photo library, Desktop, iCloud documents, etc. will trigger a permissions prompt.

By michaelt 2025-12-1423:371 reply

Interesting, it's a few years since I've used a Mac.

Descriptions of this stuff online are pretty confusing. Apparently there's an "App Sandbox" and also "Transparency Consent and Control" - I assume from your mention of the photo library that you're describing the latter?

How does this protection interact with IDEs? For some operations conducted in an IDE, like checking out code and collecting dependencies the user grants the software access to SSH keys, artifact repo credentials and suchlike. But unsigned code can also be run as a child process of the IDE - such as when the user compiles and runs their code.

How does the sandboxing protection interact with the IDE and its subprocesses, to ensure only the right subprocesses can access credentials?

By acdha 2025-12-1513:38

They added sandboxing in the 2000s, which does mandatory access control (e.g. you can write a rule that Firefox.app can’t access ~/Library/Keychains) and expanded it with containers (not OCI) which standardize the layout starting with the App Store so they all follow common restrictions for what they can access and where they store different classes of data. Those policies are inherited by child processes (e.g. your Terminal.app permissions apply to CLI tools you run in its windows but not something you start by logging in via SSH) so much of the effort has been standardizing the UX – don’t access photos directly, use the system picker which allows the user to select subsets, etc.

https://developer.apple.com/documentation/security/app-sandb...

So the answer to that question depends on what permissions the IDE has asked for and been granted. It's likely that after opening a shell inside the IDE you'd get prompted for permission to access protected locations the first time you ran a command which did something protected, but they could ask for something like full disk access at install time to avoid many prompts.

By johncolanduoni 2025-12-1517:53

macOS and Windows’s native keychains both support this - they encrypt the secrets with a key that is not accessible to apps that run with user permissions without sudo (macOS) or elevation (Windows). The actual user can still access them, but a normal app (other than the one that stored the secret in the keychain originally) running as that user cannot do so directly.

By c0balt 2025-12-151:561 reply

> Also the OP seemingly implies credentials are stored on-filesystem in plaintext but I might be extrapolating too much there.

To be fair, some tools only support a netrc file for http(s) based auth. Regardless, if you want to use git via http this vector exists almost always.

By woodruffw 2025-12-152:201 reply

Serious question: what tools only support netrc for authentication? I'm aware of lots of tools that (unfortunately IMO) support netrc as a source of credentials, but I can't think of a single one that requires it.

By c0balt 2025-12-177:58

Afaik, nix for https-based git(hub/lab/...) repositories and http-auth protected resources (via fetchurl and friends).

By elif 2025-12-1418:121 reply

It wasn't in their product. It was just on a devs machine

By hnlmorg 2025-12-1418:242 reply

I think the OP is aware of that and I agree with them that it’s bad practice despite how common it is.

For example with AWS, you can use the AWS CLI to sign you in and that goes through the HTTPS auth flow to provide you with temporary access keys. Which means:

1. You don’t have any access keys in plain text

2. Even if your env vars are also stolen, those AWS keys expire within a few hours anyway.

If the cloud service you’re using doesn’t support OIDC or any other ephemeral access keys, then you should store them encrypted. There’s numerous ways you can do this, from password managers to just using PGP/GPG directly. Just make sure you aren’t pasting them into your shell otherwise you’ll then have those keys in plain text in your .history file.

I will agree that it does take effort to get your cloud credentials set up in a convenient way (easy to access, but without those access keys in plain text). But if you’re doing cloud stuff professionally, like the devs in the article, then you really should learn how to use these tools.

By robomc 2025-12-1420:252 reply

> If the cloud service you’re using doesn’t support OIDC or any other ephemeral access keys, then you should store them encrypted. There’s numerous ways you can do this, from password managers to just using PGP/GPG directly. Just make sure you aren’t pasting them into your shell otherwise you’ll then have those keys in plain text in your .history file.

This doesn't really help though, for a supply chain attack, because you're still going to need to decrypt those keys for your code to read at some point, and the attacker has visibility on that, right?

Like the shell isn't the only thing the attacker has access to, they also have access to variables set in your code.

By hnlmorg 2025-12-1421:35

I agree it doesn’t keep you completely safe. However scanning the file system for plain text secrets is significantly easier than the alternatives.

For example, for vars to be read, you’d need the compromised code to be part of the same project. But if you scan the file system, you can pick up secrets for any project written in any language, even those which differ from the code base that pulled the compromised module.

This example applies directly to the article; it wasn’t their core code base that ran the compromised code but instead an experimental repository.

Furthermore, we can see from these supply chain attacks that they do scan the file system. So we do know that encrypting secrets adds a layer of protection against the attacks happening in the wild.

In an ideal world, we’d use OIDC everywhere and not need hardcoded access keys. But in instances where we can’t, encrypting them is better than not.
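To see how little effort a file system scan takes, here is a minimal sketch you could point at your own home directory to audit for plain text AWS access key IDs. The regex is a common heuristic rather than an official specification ("AKIA" prefixes long-term key IDs, "ASIA" temporary ones), and the demo files are made up, using the example key ID from AWS's own documentation.

```python
import re
import tempfile
from pathlib import Path

# Heuristic: access key IDs are 20 characters starting with AKIA or ASIA.
KEY_ID_PATTERN = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_key_ids(root: Path, max_bytes: int = 1_000_000) -> list[tuple[Path, str]]:
    """Return (file, key ID) pairs for every match under `root`."""
    hits = []
    for path in sorted(root.rglob("*")):
        try:
            if not path.is_file() or path.stat().st_size > max_bytes:
                continue
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for match in KEY_ID_PATTERN.finditer(text):
            hits.append((path, match.group(0)))
    return hits


# Demo on a throwaway directory with one fake credentials file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "credentials").write_text("aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n")
    (root / "notes.txt").write_text("nothing sensitive here\n")
    found = [(p.name, key) for p, key in find_key_ids(root)]

print(found)  # [('credentials', 'AKIAIOSFODNN7EXAMPLE')]
```

An encrypted store defeats this kind of blind pattern match, which is the layer of protection being argued for here.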

By majormajor 2025-12-1421:59

It's certainly a smaller surface that could help. For instance, a compromised dev dependency that isn't used in the production build would not be able to get to secrets for prod environments at that point. If your local tooling for interacting with prod stuff (for debugging, etc) is set up in a more secure way that doesn't mean long-lived high-value secrets staying on the filesystem, then other compromised things have less access to them. Add good, phishing-resistant 2FA on top, and even with a keylogger to grab your web login creds for that AWS browser-based auth flow, an attacker couldn't re-use it remotely.

(And that sort of ephemeral-login-for-aws-tooling-from-local-env is a standard part of compliance processes that I've gone through.)

By cyberax 2025-12-1422:151 reply

> 1. You don’t have any access keys in plain text

That's not correct. The (ephemeral) keys are still available. Just do `aws configure export-credentials --profile <YOUR_OIDC_PROFILE>`

Sure, they'll likely expire in 1-24 hours, but that can be more than enough for the attacker.

You also can try to limit the impact of the credentials by adding IP restrictions to the assumed role, but then the attacker can just proxy their requests through your machine.
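For reference, an IP restriction of this kind is usually written as a deny statement keyed on `aws:SourceIp`. The sketch below builds such a policy document as a Python dict; the CIDR range is a placeholder from the documentation address space, and the statement ID is made up.

```python
import json


def deny_outside_ips(allowed_cidrs: list[str]) -> dict:
    """Build an IAM policy that denies requests from outside `allowed_cidrs`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedIps",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's IP is not in the allowed ranges...
                    "NotIpAddress": {"aws:SourceIp": allowed_cidrs},
                    # ...unless an AWS service is making the call on your behalf.
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            }
        ],
    }


policy = deny_outside_ips(["203.0.113.0/24"])  # placeholder office range
print(json.dumps(policy, indent=2))
```

As noted, this only constrains where requests originate, so it does nothing against an attacker who relays requests through the compromised machine itself.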

By hnlmorg 2025-12-156:472 reply

> That's not correct. The (ephemeral) keys are still available. Just do `aws configure export-credentials --profile <YOUR_OIDC_PROFILE>`

That’s not on the file system though. Which is the point I’m directly addressing.

I did also say there are other ways to pull those keys and how this isn’t a complete solution. But it’s still vastly better than having those keys in clear text on the file system.

Arguing that there are other ways to circumvent security policies is a lousy excuse to remove security policies that directly protect you against known attacks seen in the wild.

> Sure, they'll likely expire in 1-24 hours, but that can be more than enough for the attacker.

It depends on the attacker, but yes, in some situations that might be more than long enough. Which is why I would strongly recommend people don’t set their OIDC creds to 24 hours. 8 hours is usually long enough, shorter should be required if you’re working on sensitive/high profile systems. And in the case of this specific attack, 8 hours would have been sufficient given the attacker probed AWS while the German team were asleep.

But again, I do agree it’s not a complete solution. However it’s still better than hardcoded access keys in plain text saved in the file system.

> You also can try to limit the impact of the credentials by adding IP restrictions to the assumed role, but then the attacker can just proxy their requests through your machine.

In practice this never happens (attacks proxying) in the wild. But you’re right that might be another countermeasure they employ one day.

Security is definitely a game of ”cat and mouse”. But I wouldn’t suggest people use hardcoded access keys just because there are counter attacks to the OIDC approach. That would be like “throwing the baby out with the bath water.”

By cyberax 2025-12-157:142 reply

> That’s not on the file system though.

They are. In `~/.aws/cli/cache` and `~/.aws/sso/cache`. AWS doesn't do anything particularly secure with its keys. And none of the AWS client libraries are designed for the separation of the key material and the application code.

I also don't think it's even possible to use the commonly available TPMs or Apple's Secure Enclave for hardware-assisted signatures.

> 8 hours is usually long enough. And in the case of this specific attack, 8 hours would have been sufficient given the attacker probed AWS while the German team were asleep.

They could have just waited a bit. 8 hours does not materially change anything, the credential is still long-lived enough.

I love SSO and OIDC but the AWS tooling for them is... not great. In particular, they have poor support for observability. A user can legitimately have multiple parallel sessions, and it's more difficult to parse the CloudTrail. And revocation is done by essentially pushing the policy to prohibit all the keys that are older than some timestamp. Static credentials are easier to manage.

> In practice this never happens (attacks proxying) in the wild. But you’re right that might be another countermeasure they employ one day.

If I remember correctly, LastPass (or was it Okta?) was hacked by an attacker spying on the RAM of the process that had credentials.

And if you look at the timeline, the attack took only minutes to do. It clearly was automated.

I tried to wargame some scenarios for hardware-based security, but I don't think it's feasible at all. If you (as a developer) have access to some AWS system, then the attacker running code on your behalf can also trivially get it.

By nijave 2025-12-1512:49

You can use keyring/keychain with credential_process although it's only a minor shift in security from "being able to read from the fs" to "being able to execute a binary"
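A minimal sketch of such a helper, assuming the secrets live in some keychain behind a lookup function. `fetch_secret` and the entry names are hypothetical stand-ins for whatever keychain API you use; the JSON keys follow the output format that `credential_process` expects.

```python
import json
from typing import Callable, Optional


def credential_process_output(fetch_secret: Callable[[str], Optional[str]]) -> str:
    """Render credentials as the JSON document a credential_process must print."""
    access_key_id = fetch_secret("aws_access_key_id")
    secret_access_key = fetch_secret("aws_secret_access_key")
    if not access_key_id or not secret_access_key:
        raise RuntimeError("credentials not found in the keychain")
    payload = {
        "Version": 1,
        "AccessKeyId": access_key_id,
        "SecretAccessKey": secret_access_key,
    }
    session_token = fetch_secret("aws_session_token")
    if session_token:  # only present for temporary credentials
        payload["SessionToken"] = session_token
    return json.dumps(payload)


# Demo with an in-memory dict standing in for the keychain.
fake_keychain = {
    "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
    "aws_secret_access_key": "not-a-real-secret",
}
output = credential_process_output(fake_keychain.get)
print(output)
```

A profile in `~/.aws/config` would then reference the script with a line such as `credential_process = python3 /path/to/helper.py` (path hypothetical). Any code able to run that command still gets the keys, which is the minor shift described above.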

By hnlmorg 2025-12-157:411 reply

> They are. In `~/.aws/cli/cache` and `~/.aws/sso/cache`. AWS doesn't do anything particularly secure with its keys.

Thanks for the correction. That’s disappointing to read. I’d have hoped they’d have done something more secure than that.

> And none of the AWS client libraries are designed for the separation of the key material and the application code.

The client libraries can read from env vars too. Which isn’t perfect either, but on some OSs, can be more secure than reading from the FS.

> If I remember correctly, LastPass (or was it Okta?) was hacked by an attacker spying on the RAM of the process that had credentials.

That was a targeted attack.

But again, I’m not suggesting OIDC solves everything. But it’s still more secure than not using it.

> And if you look at the timeline, the attack took only minutes to do. It clearly was automated.

Automated doesn’t mean it happens the moment the host is compromised. If you look at the timeline, you see that the attack happened overnight; hours after the system was compromised.

> They could have just waited a bit. 8 hours does not materially change anything, the credential is still long-lived enough.

Except when you look at the timeline of this specific attack, they probed AWS more than 8 hours after the start of the working day.

A shorter TTL reduces the window of attack. That is a material change for the better. Yes I agree on its own it’s not a complete solution. But saying “it has no material benefit so why bother” is clearly ridiculous. By the same logic, you could argue “why bother rotating keys at all, we might as well keep the same credentials for years”….

Security isn’t a Boolean state. It’s incremental improvements that leave the system, as a whole, more of a challenge.

Yes there will always be ways to circumvent security policies. But the harder you make it, the more you reduce your risk. And having ephemeral access tokens reduces your risk because an attacker then has a shorter window for attack.
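As a rough worked example of that window, the sketch below checks whether a session credential would still be valid when the overnight reconnaissance in the article's timeline began (Nov 25, 02:56 UTC). The issue time of 09:00 UTC the previous morning is purely an assumption for illustration; the article does not say when the engineer's session was issued.

```python
from datetime import datetime, timedelta, timezone


def still_valid(issued_at: datetime, ttl: timedelta, at: datetime) -> bool:
    """True if a credential issued at `issued_at` with lifetime `ttl` is live at `at`."""
    return issued_at <= at < issued_at + ttl


# Hypothetical: credentials issued at the start of the previous working day.
issued = datetime(2025, 11, 24, 9, 0, tzinfo=timezone.utc)
# From the article's timeline: start of the overnight reconnaissance.
recon_start = datetime(2025, 11, 25, 2, 56, tzinfo=timezone.utc)

results = {
    hours: still_valid(issued, timedelta(hours=hours), recon_start)
    for hours in (8, 24)
}
print(results)  # {8: False, 24: True}
```

Under that assumption an 8-hour credential has expired by the time the attacker looks, while a 24-hour one is still usable. A session issued later in the day would shift the result.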

> I tried to wargame some scenarios for hardware-based security, but I don't think it's feasible at all. If you (as a developer) have access to some AWS system, then the attacker running code on your behalf can also trivially get it.

The “trivial” part depends entirely on how you access AWS and what security policies are in place.

It can range anywhere from “forced to proxy from the host’s machine from inside their code base while they are actively working” to “has indefinite access from any location at any time of day”.

A sufficiently advanced attack can gain access but that doesn’t mean we shouldn’t be hardening against less sophisticated attacks.

To use an analogy, a burglar can break a window to gain access to your house, but that doesn’t mean there isn’t any benefit in locking your windows and doors.

By cyberax 2025-12-1519:40

Agreed.

> A sufficiently advanced attack can gain access but that doesn’t mean we shouldn’t be hardening against less sophisticated attacks.

I'm a bit worried that with the advent of AI, there won't be any real difference between these two. And AI can do recon, choose the tools, and perform the attack all within a couple of minutes. It doesn't have to be perfect, after all.

I've been thinking about it, and I'm just going to give up on trying to secure the dev environments. I think it's a done deal that developers' machines are going to be compromised at some point.

For production access, I'm going to gate it behind hardware-backed 2FA with a separate git repository and build infrastructure for deployments. Read-write access will be available only via RDP/VNC through a cloud host with mandatory 2FA.

And this still won't protect against more sophisticated attackers that can just insert a sneaky code snippet that introduces a deliberate vulnerability.

By voxic11 2025-12-157:171 reply

They are on the filesystem though.

By hnlmorg 2025-12-157:23

Oh that’s disappointing. Thanks for the correction.

By marifjeren 2025-12-1422:121 reply

> """ I'm strongly in favor of blocking post-install scripts by default. :+1: This is a change that will have a painful adjustment period for our users, but I believe in ~1 year everyone will look back and be thankful we made it. It's nuts that a [pnpm|yarn|npm] install can run arbitrary code in the first place. """

- a pnpm maintainer 1 year ago

https://github.com/pnpm/pnpm/pull/8897

By classified 2025-12-156:001 reply

And yet here we are…

Convenience trumps security every time. With people who allegedly know better.

By M4v3R 2025-12-158:02

Well, pnpm has done it by default for quite some time. It’s annoying, yes, but I’ll take a little annoyance if it means I’m more secure.

By KomoD 2025-12-1415:291 reply

> stored in our database which was not compromised

Personally I don't really agree with "was not compromised"

You say yourself that the guy had access to your secrets and AWS, I'd definitely consider that compromised even if the guy (to your knowledge) didn't read anything from the database. Assume breach if access was possible.

By nsonha 2025-12-1415:453 reply

There are logs for accessing aws resources and if you don't see the access before you revoke it then the data is safe

By MrDarcy 2025-12-1416:062 reply

Unless the attacker used any one of hundreds of other avenues to access the AWS resource.

Are you sure they didn’t get a service account token from some other service then use that to access customer data?

I’ve never seen anyone claim in writing all permutations are exhaustively checked in the audit logs.

By otterley 2025-12-1416:381 reply

It depends on what kind of access we're talking about. If we're talking about AWS resource mutations, one can trust CloudTrail to accurately log those actions. CloudTrail can also log data plane events, though you have to turn it on, and it costs extra. Similarly, RDS access logging is pretty trustworthy, though functionality varies by engine.

By MrDarcy 2025-12-1516:421 reply

What do you mean by “trust CloudTrail”?

So CloudTrail shows the compromised account logging into an EC2 instance every day like normal.

Then service account credentials are used to access user data in S3.

How does CloudTrail indicate the compromised credentials were used to access the customer data in S3?

By otterley 2025-12-1517:211 reply

If you have data events enabled for your S3 bucket, CloudTrail will log every access to that bucket along with the identity of the principal used to access it. https://docs.aws.amazon.com/awscloudtrail/latest/userguide/l...

By MrDarcy 2025-12-1518:081 reply

Right and in my example it would be the principal of the service account, not the compromised AWS account.

If you ran a CloudTrail query that's essentially "Did Alice access user data in S3 ever?" the answer would be "No"

So that brings us back to the question, what is meant by "trust CloudTrail"

By otterley 2025-12-1518:38

Most non-trivial security investigations involve building chains of events. If SSM Session Manager was used to access the EC2 instance (as is best practice) using stolen credentials, then the investigation would connect access to the instance to the use of instance credentials to access the S3 bucket, as both events would be recorded by CloudTrail.

CloudTrail has what it has. It's not going to record accesses to EC2 instances via SSH because AWS service APIs aren't used. (That's one of the reasons why using Session Manager is recommended over SSH.) But that doesn't mean CloudTrail isn't trustworthy; it just means it's not omniscient.
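A minimal sketch of that chain-building step over already-exported CloudTrail records. It assumes two things worth verifying against your own logs: that `StartSession` events carry the instance ID in `requestParameters.target`, and that instance-role sessions use the instance ID as the last segment of the caller's ARN. The sample records are made up, and a real investigation would also compare event times.

```python
def s3_access_via_instances(events: list[dict], compromised_arn: str) -> list[dict]:
    """Link a principal's Session Manager sessions to S3 calls made from those instances."""
    # Step 1: instances the compromised principal opened sessions on.
    instances = {
        e["requestParameters"]["target"]
        for e in events
        if e["eventSource"] == "ssm.amazonaws.com"
        and e["eventName"] == "StartSession"
        and e["userIdentity"]["arn"] == compromised_arn
    }
    # Step 2: S3 data events made with those instances' role credentials.
    return [
        e for e in events
        if e["eventSource"] == "s3.amazonaws.com"
        and e["userIdentity"]["arn"].rsplit("/", 1)[-1] in instances
    ]


alice = "arn:aws:iam::111122223333:user/alice"  # hypothetical compromised user
role = "arn:aws:sts::111122223333:assumed-role/app-role/"  # hypothetical instance role
events = [
    {"eventSource": "ssm.amazonaws.com", "eventName": "StartSession",
     "userIdentity": {"arn": alice}, "requestParameters": {"target": "i-0abc123"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"arn": role + "i-0abc123"},
     "requestParameters": {"bucketName": "customer-data", "key": "export.csv"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"arn": role + "i-0ffff999"},
     "requestParameters": {"bucketName": "customer-data", "key": "other.csv"}},
]
linked = [e["requestParameters"]["key"] for e in s3_access_via_instances(events, alice)]
print(linked)  # ['export.csv']
```

Asking only "did Alice touch S3?" returns nothing here. The link appears once the session event and the instance-role event are joined on the instance ID.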

By johncolanduoni 2025-12-1420:43

Ideally you should have a clear audit log of all developer actions that access production resources, and clear records of custody over any shared production credentials (e.g. you should be able to show the database password used by service A is not available outside of it, and that no malicious code was deployed to service A). A lot of places don't do this, of course, but often you can come up with a pretty good circumstantial case that it was unlikely that exfiltration occurred over the time range in question.

By zymhan 2025-12-150:151 reply

Because an attacker would never cover their tracks...

By everfrustrated 2025-12-1515:42

Indeed, being able to trust your audit logs is imperative.