
⚡ Better `pre-commit`, re-engineered in Rust.
BTW. Pre-commit hooks are the wrong way to go about this stuff.
I'm advocating for JJ to build a proper daemon that runs "checks" per change in the background. So you don't run pre-commit checks when committing; they just happen in the background, and by the time you get to sharing your changes, everything has been verified for each change/commit, effortlessly, without you wasting time or needing to do anything special.
I have something a bit like that implemented in SelfCI (a minimalistic, local-first, Unix-philosophy-abiding CI) https://app.radicle.xyz/nodes/radicle.dpc.pw/rad%3Az2tDzYbAX... and it has replaced my use of pre-commit hooks entirely. Users have already told me that it feels like commit hooks done right.
Just because the hooks have the label "pre-commit" doesn't mean you have to run them before committing :).
I, too, want checks per change in jj -- but (in part because I need to work with people who are still using git) I need to still be able to use the same checks even if I'm not running them at the same point in the commit cycle.
So I have an alias, `jj pre-commit`, that I run when I want to validate my commits. And another, `jj pre-commit-branch`, that runs on a well-defined set of commits relative to @. They do use `pre-commit` internally, so I'm staying compatible with git users' use of the `pre-commit` tool.
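For anyone curious, such aliases live in jj's config; the following is only a sketch of the general shape (the actual aliases described above surely differ, and the exact `pre-commit` invocation is illustrative):

```toml
# ~/.config/jj/config.toml (or .jj/repo/config.toml)
[aliases]
# Run pre-commit's hooks from a jj workflow via `jj util exec`.
pre-commit = ["util", "exec", "--", "pre-commit", "run", "--all-files"]
```

Because the alias shells out to the same `pre-commit` tool, the hook definitions stay shared with git-using collaborators.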
What I can't yet do is run the checks in the background or store the check status in jj's data store. I do store the tree-ish of passing checks though, so it's really quick to re-run.
Yep, I think a watcher is better suited [0] to trigger on file changes.
I personally can't stand my git commit command to be slow or to fail.
[0]: such as https://github.com/watchexec/watchexec
I prefer to configure my IDE to apply precisely the same linting and formatting rules as used for commits and in CI. Save a file, see the results, nothing changes between save, commit, stage, push, PR, merge.
> I personally can't stand my git commit command to be slow or to fail.
I feel the same way but you can have hooks run on pre-push instead of pre-commit. This way you can freely make your commits in peace and then do your cleanup once afterwards, at push time.
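With pre-commit-the-tool, that's just a matter of declaring the stage and installing the pre-push hook; a minimal sketch (the repo and hook chosen here are illustrative):

```yaml
# .pre-commit-config.yaml
default_stages: [pre-push]   # hooks run at push time, not commit time
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
```

followed by `pre-commit install --hook-type pre-push` so git actually invokes it on push.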
To myself: sometimes I think the background process should be committing for me automatically each time a new working set exists, and I should only rebase and squash before pushing.
That’s reversing the flow of control, but might be workable!
jj already pretty much does that with the oplog. A consistent way of making new snapshots in the background would be nice though. (Currently you have to run a jj command — any jj command — to capture the working directory.)
I don't think you have to, you can run the integrated watcher, no?
You can configure watchman to do it. `fsmonitor.watchman.register-snapshot-trigger = true`
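For reference, the full jj config for this looks roughly like the following (watchman itself must be installed separately):

```toml
# jj config sketch: snapshot the working copy in the background via a
# registered watchman trigger.
[core]
fsmonitor = "watchman"

[fsmonitor.watchman]
register-snapshot-trigger = true
```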
I don't recommend it, though, at least not on large repositories. Too much opportunity to collide with command-line jj write operations.
> I personally can't stand my git commit (...) or to fail.
But that's the whole point of locally checking the code, no? Would you prefer to commit broken things, fix them and then rebase and squash each time?
I see it as a layered system, each one slower than the last, but saving time in the long run.
* in-editor, real time linting / formatting / type checking. This handles whatever file you have open at the time.
* pre-commit, do quick checks for all affected code - linting, type checking, formatting, unit tests.
* CI server, async / slow tests. Also does all the above (because pre-commit / pre-push scripts are clientside and cannot be guaranteed to run), plus any slower checks like integration tests.
Basically "shift left", because it takes 100x as long to find and fix a typo (for example) if you find it in production compared to in your editor while writing.
I like this approach. Something related I've been tinkering with are "protected bookmarks" - you declare what bookmarks (main, etc) are protected in your config.toml and the normal `jj bookmark` commands that change the bookmark pointer will fail, unless you pass a flag. So in your local "CI" script you can do `jj bookmark set main -r@ --allow-protected` iff the tests/lints pass. Pairs well with workspaces and something that runs a local CI (like a watcher/other automated process).
I haven't yet submitted it upstream for design discussion, but I pushed up my branch[1]. You can also declare a revset that the target revision must match, for extra belts and suspenders (e.g., '~conflicts()').
[1] https://github.com/paulsmith/jj/tree/protected-bookmarks
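To make the idea concrete, a purely hypothetical config sketch; the actual key names in the protected-bookmarks branch may well differ:

```toml
# Hypothetical config.toml sketch for protected bookmarks; key names
# are illustrative, not necessarily what the branch implements.
[bookmarks]
protected = ["main"]
# Extra guard: the target revision must match this revset.
protected-target-revset = "~conflicts()"
```

With something like this in place, `jj bookmark set main -r @` would fail unless `--allow-protected` is passed.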
Cool! That would pair well with SelfCI's MQ daemon, preventing accidentally forgetting about merging in stuff without running the local CI.
That's a great idea, and I was just thinking about how it would pair with self-hosted CI of some type.
Basically what I would want is: write a commit (because I want to commit early and often), then run the lint (and tests) in a sandboxed environment. If they pass, great. If they fail and HEAD has moved ahead of the failing commit, create a "FIXME" branch off the failure. Back on main (or whatever branch head was pointed at), if tests start passing, you probably never need to revisit the failure.
I want to know about local test failures before I push to remote with full CI.
automatic branching and workflow stuff is optional. the core idea is great.
> automatic branching and workflow stuff is optional. the core idea is great.
I'm not sure if I fully understood. But SelfCI's Merge-Queue (mq) daemon has a built-in hook system, so it's possible to do custom stuff at certain points. So you should probably be able to implement it already, or it might require a couple of minor tweaks (which should be easy to do on SelfCI's side after some discussion).
Oh interesting. Checks sound similar to lix validation rules [1].
We were coming from an application perspective where blocking the user's intent is a no-go.
Do you have a link to a discussion where the JJ community is discussing checks?
I think we built similar things: http://github.com/bjackman/limmat
From the docs I think Limmat is much more minimal. It doesn't have a merge queue or anything, "jobs" are just commands that run in a worktree.
I would be interested to try SelfCI coz I have actually gone back and forth on whether I want that merge queue feature in Limmat. Sometimes I think for that feature I no longer want it to be a local tool but actually I just want a "proper CI system" that isn't a huge headache to configure.
I had been eagerly moving over to using JJ when I discovered that 'hook' behavior was not present. Pre-push hooks for formatting and linting were very helpful for me because I needed to enforce those standards on others who were more junior. It would be great for JJ to incorporate it in some way if possible. I understand the structural differences and why that makes it hard, but something about that pre-* hook just hits right.
That looks really cool! I've been looking for a more thought-out approach to hooks on JJ, I'll dig into this. Do you have any other higher level architecture/overview documentation other than what is in that repo? It has a sense of "you should already know what this does" from the documentation as is.
Also, how do you like Radicle?
> Do you have any other higher level architecture/overview documentation other than what is in that repo?
SelfCI is _very_ minimal by design. There isn't really all that much to document other than what is described in the README.
> Also, how do you like Radicle?
I enjoy that it's p2p, and it works for me in that respect. Personally, I disagree with its attempt to duplicate the features of a GitHub-like forge, instead of the original collaboration model of the Linux kernel that git was built for. I think it should try to replicate something more like SourceHut: mailing-list threads, communication that includes patches, etc. But I haven't really _collaborated_ much using Radicle yet; I just push and pull stuff from it, and it works fine for that.
Looks very interesting, I fully agree that running CI locally is viable.
But what I didn't pick up from a quick scan of the README is the best pattern for integrating with git. Do you expect users to manually run (a script calling) selfci, or is it hooked up to git or similar? When do the merge hooks come into play? Do you ask selfci to merge?
I want a multilayered reactive DAG à la Maya for source code.
Being visible is useful; this is probably better suited to an IDE than a hook or a daemon.
git ls-files | entr pre-commit run --all-files
I have also been working on an alternative written in Rust, but in my version the hooks are WASI programs. They run on a virtual filesystem backed by the Git repo. That means a) there are no security issues (they have no network access, and no file access outside the repo), b) you can run them in parallel, c) you can choose whether to apply fixes or not without needing explicit support from the plugin, and most importantly d) they work reliably.
I'm sure this is more reliable than pre-commit; there you still have hooks building Python wheels and whatnot, which fails annoyingly often.
The VFS stuff is not quite finished yet though (it's really complicated). If anyone wants to help me with that it would be welcome!
the second the hooks modify the code they've broken your sandbox
I think wasi is a cool way to handle this problem. I don't think security is a reason though.
> the second the hooks modify the code they've broken your sandbox
Changes to code would obviously need to be reviewed before they are committed. That's still much better than with pre-commit, where e.g. to do simple things like banning tabs you pretty much give some guy you don't know full access to your machine. Even worse - almost everyone that uses pre-commit also uses tags instead of commit hashes so the hook can be modified retroactively.
One interesting attack would be for a hook to modify e.g. `.vscode/settings.json`... I should probably make the default config exclude those files. Is that what you meant? Even without that it's a lot more secure than pre-commit.
I wouldn't want hooks modifying the code; they should only approve/reject. Ideally landlock rules would give them only read-only access to the repo dir.
It's going to be optional - the hooks will always fix the code if they can, but then you can supply a `--no-fix` flag (or config) if you want to tell it to not actually apply those changes to the real filesystem.
It doesn't need Landlock because WASI already provides a VFS.
It depends. I wrote a pre-commit hook (in shell, not pre-commit the tool) at a previous job that ran terraform fmt on any staged files (and added the changes to the commit), because I was really tired of having people push commits that would then fail for trivial things. It was overridable with an env var.
IMO if there’s a formatting issue, and the tool knows how it should look, it should fix it for you.
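A sketch of that kind of hook as a plain shell script, installed as `.git/hooks/pre-commit`; the variable names `SKIP_TF_FMT` and `TF_FMT` are illustrative, not standard:

```shell
#!/bin/sh
# Sketch of the hook described above: a plain .git/hooks/pre-commit
# script (not pre-commit-the-tool). SKIP_TF_FMT and TF_FMT are
# illustrative variable names, not standard ones.

tf_fmt_staged() {
    # Escape hatch: SKIP_TF_FMT=1 git commit ... skips formatting.
    if [ -n "${SKIP_TF_FMT:-}" ]; then
        return 0
    fi

    # Format each staged *.tf file, then restage it so the commit
    # contains the formatted version.
    git diff --cached --name-only --diff-filter=ACM -- '*.tf' 2>/dev/null |
    while IFS= read -r f; do
        ${TF_FMT:-terraform fmt} "$f"
        git add "$f"
    done
}

tf_fmt_staged
```

Restaging with `git add` is what makes the fix land in the commit itself rather than being left as an unstaged diff.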
The standard way for this with current tools is to have the formatter/linter make the changes but exit with a non-zero status, failing the hook. Then the person reviews the changes, stages, and commits. (That's what our setup currently has `tofu fmt` do.)
But if you don't want to have hooks modify code, in a case like this you can also just use `tofu validate`. Our setup does `tflint` and `tofu validate` for this purpose, neither of which modifies the code.
This is also, of course, a reasonable place to have people use `tofu plan`. If you want bad code to fail as quickly as possible, you can do:
tflint -> tfsec -> tofu validate -> tofu plan
That'll catch everything Terraform will let you catch before deploy time— most of it very quickly— without modifying any code.
> make the changes but exit with a non-zero status
That's reasonable. My personal take (and that of my team at the time) was that I was willing to let formatting - and only formatting - be auto-merged into the commit, since that isn't going to impact logic. For anything else, though, I would definitely want the submitter to review the changes.
ok but I was replying to a comment about a tool which advertises precisely that feature
I think it was a massive mistake to build on the pre-commit plugin base. pre-commit is probably the most popular tool for pre-commit hooks but the platform is bad. My main critique is that it mixes tool installation with linting—when you will undoubtedly want to use linters _outside_ of hooks. The interface isn't built with parallelism in mind, it's sort of bolted on but not really something I think could work well in practice. It also uses a bunch of rando open source repos which is a supply chain nightmare even with pinning.
pre-commit considered harmful if you ask me. prek seems to largely be an improvement but I think it's improving on an already awful platform so you should not use it.
I know I am working on a competing tool, but I don't share the same criticism for lefthook or husky. I think those are fine and in some ways (like simplicity) better than hk.
I think really they just need to implement some kind of plug-in or extension framework. Extensions are just not first class citizens but they really should be.
There should be a .gitextensions file in the repo that the repo owners maintain, just like .gitignore and .gitattributes etc. Everything can still be opt-in for every user, but at least all git clients would be able to know about, pull down, and install extensions at the user's discretion.
It seems pretty basic in this day and age but it's still a gaping hole. You still need to manually run `git lfs install`, for goodness' sake.
> My main critique is that it mixes tool installation with linting
If you use a tool like this via Devenv instead of using its built-in mechanisms for installing tools:
- you can add a linter without putting it on your path
- you can put a linter on your path without enabling any git hooks for it
- if you are already using a linter in a git hook, adding it to your environment will get you the exact same version as you're already using for your git hook at no additional storage cost
- if you are already using a linter at the CLI, and you add a git hook for it, your hook will run with the exact same version that you are already using at the CLI at no additional storage cost
- your configuration interface is isomorphic to the upstream one, so
- any custom hooks you're already using can be added without modification beyond converting from one format to another;
- any online documentation you find about adding a custom hook not distributed with the upstream framework still applies;
- and you can configure new, custom, or modified hooks with a familiar interface.
- any hook you write as a script or with substantial logic can also be plugged into a built-in task runner for use outside git hook contexts, where
- you can express dependency relationships between tasks, so
- every task runs concurrently unless dependency relations mandate otherwise.
Which imo solves that problem pretty well. My team uses that kind of setup in all of our projects.

> The interface isn't built with parallelism in mind, it's sort of bolted on but not really something I think could work well in practice.
I'm curious about what this means. Could you expand on it?
You might be interested in something like `treefmt`, which is designed around a standard interface for configuring formatters to run in parallel, but doesn't do any package management or installation at all:
https://github.com/numtide/treefmt
(That might address both of the issues I've replied to you about so far, to some extent.)
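A minimal treefmt.toml sketch; the formatter names and commands are illustrative, and each tool must already be on your PATH:

```toml
# treefmt.toml: each [formatter.<name>] maps a command to the files
# it should handle; treefmt runs them in parallel across the tree.
[formatter.rust]
command = "rustfmt"
includes = ["*.rs"]

[formatter.shell]
command = "shfmt"
options = ["-w"]
includes = ["*.sh"]
```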
> It also uses a bunch of rando open source repos which is a supply chain nightmare even with pinning.
If the linters you're running are open source, isn't this what you're ultimately doing anyway? Nix gives a bit more control here, but I'm not sure whether it directly addresses your concern.
> I am working on a competing tool
Oh, dammit. Only after writing all of this out did I realize you're the author of mise. I'm sure you're well aware of Nix and Devenv. :)
Because I think your critiques make sense and might be shared by others, I'll post this comment anyway. I'm still interested in your replies, and your opinion of having some environment management tool plug into prek and supplant its software installation mechanisms.
And because I think mise likely gets many of these things right in the same way Devenv does, and reasonable people could prefer either, I'll include links to both Devenv and mise below:
It's idiomatic in pre-commit to leverage the public plugins which all do their own tool installation. If you're not using them, and also not using it for tool installation, I'm not sure why you'd not be using the much simpler lefthook.
If you look at hk you will understand what I'm talking about in regards to parallelism. hk uses read/write locks and other techniques like processing --diff output to safely run multiple fixers in parallel without them stomping on each other. treefmt doesn't support this either, it won't let you run multiple fixers on the same file at the same time like hk will.
> If the linters you're running are open source, isn't this what you're ultimately doing anyway?
You have to trust the linters. You don't also need to trust the plugin authors. In hk we don't really use "plugins" but the equivalent is first-party so you're not extending trust to extra parties beyond me and the lint vendors.
> If you look at hk you will understand what I'm talking about in regards to parallelism. hk uses read/write locks and other techniques like processing --diff output to safely run multiple fixers in parallel without them stomping on each other. treefmt doesn't support this either, it won't let you run multiple fixers on the same file at the same time like hk will.
That sounds pretty cool! I will definitely take a closer look.