Photo credit: Petter Boström
I recently attended the Goatmire Elixir Conf and one of the standout talks for me was Saša Jurić's "Tell Me a Story". It was an incredible presentation t
I find the sort of opinions in this post quite common among a subset of engineers: mid-levels with some time in their careers who start to consider themselves senior engineers and want everyone to follow the same set of strict rules they decided make sense. It’s the same mindset that makes people pedantically apply DRY to every situation or force others to TDD basic apps.
In practice:
- smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
- nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
- “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
You want PRs because they help others absorb what you’re doing (they’ll have to read that same code sooner or later). You don’t want to create a performance theater.
> nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
I do. Especially if the author is competent.
That said, empirically, you're correct that most people don't.
However, I think changing the culture rather than throwing away the practice would be a better response.
Reading and reviewing clean history is really so much nicer. I'd also argue that actually making your history clean (as opposed to theatrically and thoughtlessly making small commits, say) forces you as the author to review it more carefully.
Replying to echo this, I also read every commit message, and review PRs commit by commit. This was common practice at my last job (a small, experienced team), and the expectation was that commits were atomic.
Yes, we griped that GitHub would not allow us to merge individual commits, but if it was ever urgent or helpful to do so, we cherry-picked a commit into a separate PR.
Everyone's workflow is a bit different, and it can be hard to redirect organizational inertia. But without a doubt, reading a clean commit history is a pleasure.
Thank you for this! I find it disturbing that the original-original poster extrapolated their experience to the rest of us as if it were a fact.
PR is the atomic level of work. I'd argue PR-level history (i.e. squash) is often enough and is way cleaner. Why would I care about "commit A", "change parts of A because I misunderstood a requirement", "improve A based on code review" etc.?
If I want that granularity, I'd go read the original PR and the discussion that took place.
> PR is the atomic level of work.
The atomic level of work should be a single, logically coherent change to the codebase. It's not managerial, it's explanatory.
As you work, things naturally arise. Over here a reformatted file, over there comments to clarify an old function that confused you, to help the next developer who encounters it. Cleaning, preparatory refactoring that is properly viewed as a separate action, and so on. Each of these is a distinct "operation" on the codebase, and should be reviewed in isolation, as a commit.
Some of these operations have nothing to do with the new feature you're adding. And yet creating separate PRs for each of them would be onerous to your reviewers and spammy. Clean, atomic history lets you work naturally while still telling a clear story about how the code changed, both for reviewers and future developers.
> The atomic level of work should be a single, logically coherent change to the codebase.
Sure, and we must make that correspond to the atomic unit that our collaboration tools provide us for reviewing and merging. In GitHub and similar git forges, that's a PR, not a commit. A string of atomic changes should be represented as a series of PRs, not a series of commits in one PR, because GitHub isn't designed to review and merge individual commits.
The "atomic commits" crowd are (in my opinion) coming up with best practices for the tools they wish they had and working against the grain of the tools we actually use.
This is simply not true.
There is a "commits" tab and next button to quickly go through commits on every PR. It's very easy to use.
All that you mean is most people ignore it.
And then when people put comments on your commits and you force-push new ones, does it link the updated version of each commit to its previous self, giving a clear timeline of comments and changes? I don't think so. But if people write comments at the MR level and you fix them in new commits, then the throughline is clear.
I think a workflow like this for atomic commits would be nice. tangled.sh supports it for jujutsu¹, and it looks really neat. But the existing code review interface is clearly designed for code review to take place at the MR level.
I don't know -- these are fair points.
I do know that when I was using GH regularly on a team where a number of people wrote clean history, the problems you mentioned didn't come up, not that I can recall. So for the 90% case, let's say, you can do clean history on GH and get the majority of its benefits. But yes, I'm sure it's flawed especially in workflows where those types of problem arise often.
Yeah that's probably fair. I haven't used a commit-centric workflow in a professional context, so I can't really say how often, if ever, these issues come up.
These opinions could use some arguments for why they're useful.
Maybe if the majority of devs switched to jj or something similar, your idea might become semi-common. But it seems to me that currently, for most projects, it generates a lot of cost for very little gain.
My personal idea is that the code in each PR should be well documented enough for a review, but also for when people join the team and need to learn. Or when a poor soul needs to check on your code while you are out on a holiday. This personal rule does not apply to all projects, but for bread and butter stuff I tend to go by it and not care about clean commit histories. The cost-to-reward ratio seems way better.
The PR UI on GitHub definitely leads to treating it as the unit of work. I consider this unfortunate for the most part, but the basic side effect is that I'll often end up submitting every commit as a separate PR so they actually get looked at
Respectfully, I disagree. I understand people have vastly different experiences and preferences. In my ideal world, a PR is a unit of work that has an in-state and an out-state. I don't have to see an initial commit within a PR with a full fledged spec and then wonder if any future commits within the same PR overrode those changes. A PR will rarely be clean from the start.
The way it appears to me, if there's multiple commits submitted as separate PRs, then maybe the PR wasn't so atomic to begin with.
Indeed, I agree. If a PR has multiple commits and those commits are atomic then by definition the PR is not atomic.
> I'd go read the original PR and the discussion that took place.
Until your company switches code repos multiple times and all the PR history is gone or hard/impossible to track down.
I will say, I don't usually make people clean up their commits and also usually recommend squashing PRs for any teams that aren't comfortable with `git`. When people do take the time to make a sensible commit history (when a PR warrants more than one commit) it makes looking back through their code history to understand what was going on 1000% easier. It also forces people to actually look over all of their changes, which is something I find a lot of people don't bother to do and their code quality suffers a lot as a result.
It also enables bisect to work properly.
Bisecting on squashed commits is not usually helpful. You can still narrow down to the offending commit that introduced the error but… it’s somewhere in +1200/-325 lines of change. Good luck.
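To make the point above concrete, here is a rough sketch in Python. It is a toy model of the idea, not how `git bisect` is implemented: bisecting is a binary search over history, so the best it can do is point at one commit. The commit ids and per-commit line counts are made up for illustration; they are chosen to add up to the 1525 changed lines (+1200/-325) mentioned above.

```python
from typing import Callable, Sequence, Tuple

def first_bad_commit(history: Sequence[str],
                     is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Binary-search `history` (oldest first) for the first bad commit.

    Assumes the newest commit is bad and that every commit after the
    culprit is bad too. Returns (culprit, number of commits tested).
    """
    lo, hi = 0, len(history) - 1
    tested = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(history[mid]):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return history[lo], tested

# Hypothetical branch: six atomic commits and the lines each one touched.
LINES_CHANGED = {"a1": 40, "a2": 310, "a3": 12, "a4": 650, "a5": 88, "a6": 425}
atomic_history = list(LINES_CHANGED)

# Pretend the regression was introduced in "a3".
def atomic_is_bad(commit: str) -> bool:
    return atomic_history.index(commit) >= atomic_history.index("a3")

culprit, tested = first_bad_commit(atomic_history, atomic_is_bad)
lines_to_read_atomic = LINES_CHANGED[culprit]

# The same work squashed into one commit: bisect stops at that commit,
# and the whole diff is left to read by hand.
squashed_culprit, _ = first_bad_commit(["squashed"], lambda commit: True)
lines_to_read_squashed = sum(LINES_CHANGED.values())

print(culprit, tested, lines_to_read_atomic)
print(squashed_culprit, lines_to_read_squashed)
```

In this made-up history the search lands on a 12-line commit after testing two commits, while the squashed variant leaves all 1525 lines to inspect manually.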
If your PR is 1000+ lines of code long, you already made a mistake at the requirements and planning stage (like many teams do).
This sounds unattainable to me. For code bases in the 2 million or more lines range, something as simple as refactoring the name of a poorly named base class can hit 5000 lines. It also was not a mistake with the original name it had, but you'd still like to change it to make it more readable given the evolution of the codebase. You would not split that up into multiple commits because that would be a mess and it would not compile unless done in one commit.
How is this a mistake?
Such PRs shouldn't be the norm but the exception. What happens way more often is that such refactorings happen in addition to other changes on the same PR. In high-functioning teams I've worked on, this is usually done as a separate PR / change, as they are aware of the complexity this operation adds by mixing it with scope-related changes and that refactoring shouldn't be in the scope of the original change.
I don't agree; for example, my team includes yarn.lock in the commit, which adds quite a few lines to the PR.
Or finding what bug was reintroduced in a +13/-14398
Yeah, I agree with both you and the GP. There's a mess of commits that usually don't matter because mostly only the before and after level of an actual viable PR does, ergo I squash them.
I'm cool with other reasonable approaches though, but I'm pretty over pointless hoops because someone says so.
If the commits don't matter why did you make them separate to begin with?
“Updated something tiny and ran CI again until it failed on some other step”
Together with backing up your work.
Sure you can keep amending your last commit but whenever you detour to another problem in the same PR that turns into a mess.
Easier to just treat the PR as the atomic unit of work and squash away all that intermediate noise.
It also ensures that CI will pass on every commit on the main branch.
A lot of engineers do what you suggest rather than `git add; git commit --amend`
This is why commits are often noise. If people are using commits well, they tell a story. The fact that people often use the tool wrong certainly begs some criticism of the tool, but when used correctly commits are certainly worth looking at one by one
> The fact that people often use the tool wrong certainly begs some criticism of the tool, but when used correctly commits are certainly worth looking at one by one
What do you consider correct usage of git, and why? In this very discussion, I can see at least two distinct purposes that, more often than not, are mutually exclusive:
- To "tell a story" for other people
- To checkpoint units of work as the individual perceives them, helping them deal with interruptions (which include running out of work day).
Storytelling is a skill in itself, doing it is a distinct kind of extra work, so you can't really have people use git for both at the same time. Which is where the whole commit history management idea comes from - it's to separate the two into distinct phases; first you commit for yourself, then you rework it to tell a story for others.
In my usage, the story is for myself first and foremost. Telling the story helps keep me organized and helps me remember what I've done and where I'm going. I don't need to know that I fixed a typo in a comment, I need to know what the changes are overall doing.
Sometimes I go down a dead end, reverse out, and leave a comment about why a different approach would be a dead end. I (and others) don't need a record of the work I did on that path, just the synthesis (an explanatory comment)
There are multiple systems for structuring commits, but the commit message body content approximates to the same in all of them. The classic advice is https://tbaggery.com/2008/04/19/a-note-about-git-commit-mess... , but I find https://www.conventionalcommits.org/en/v1.0.0/ useful for looking at the oneline log
To address this point:
> To checkpoint units of work as the individual perceives them, helping them deal with interruptions (which include running out of work day).
Yes, commits can be used like this! But once you have a chunk of work ready for review, cleaning up the commit log/history, grouping related changes, and describing them is useful for maintaining the software.
I don't like squash merges personally, though they have their merits. But regardless, I would copy the commit subject/body content into the PR message, which then puts everything into the PR commit also, so technically the granular commits are less relevant when one merges, but occasionally are still useful to refer to
> In my usage, the story is for myself first and foremost.
But that's my point exactly. Unless you're an exceptionally clear thinker, a story that's natural for you is not very good for anyone else. Your story is optimized for an audience of 1, developed interactively, and meant to help you in the now. The story for the team is meant to help them orient themselves after the fact. Turning one into the other is its own kind of work.
But then different people and teams have different ways of working. VC isn't the whole world. In some projects, I'd make "team story" commits directly, because I used a separate text file to note down my thoughts, and used that to keep me on track. So it's a different way of solving this problem.
They are not mutually exclusive. While I don't personally make the checkpoint style commits ever, I work with those who do. But they re-create a set of logical atomic story commits before submitting as a PR or otherwise.
Which means they are inventing a narrative that did not exist when they were developing the PR. While also spending time on squashing and merging commits to pretend the development was linear. When it absolutely wasn't.
That is the perfect story for the final merge commit or the PR when this nicely crafted story is squashed into nothingness and merged.
I'm not the person you asked, but I often don't. I personally don't care about git history - I read code not commit messages. I don't really care how it got the way it is; I care about what it is. Maybe once in a blue moon there could potentially be some useful information in a commit message, but it's not enough.
So, I'll often just make lots of changes and then commit them all at once with some vague commit message and that's that. Nobody cares. If I want to tell people why the code is the way it is I'll just add a comment.
This is how I work. I have tried to be more disciplined with commits and stuff like that but I find that it just slows me down and makes the work feel more difficult. I also frequently just forget and find myself having made lots of changes without any commits so then I have to retroactively split it up into commits which can be difficult too. So I'd rather just not worry about it, focus on getting good work done and move on rather than obsess over a git history that's unlikely to ever be read by anyone. I realize that's a self-fulfilling prophecy in that it'd be more likely to be read if it was useful and well done but it's not just me. If I was in a team where everyone did it really well I'd try to keep my own work up to par. But usually I'm the one who cares most about how we do things and this just doesn't seem important to me.
The important thing about making commits separately and as self-contained as possible is to allow cherrypicking.
Say you're on a development branch and you added something new that the Project thinks can be and should be added to the Main branch. Having that addition in its own self-contained commit allows the Project to create a new branch, cherrypick the commit, and merge the branch to Main, without having to pull in the rest of the development branch.
It's of course not really necessary if you're the only person doing all the development, but it's just good etiquette if you're working with other people in the Project.
If there is only useful information in the commit message "once in a blue moon" it means someone isn't writing good commit messages.
Very often I look at a change and all I can think is "why did they/I do that". Having the answer to that question available saves relearning the lesson that led to the change.
Put that in the developer-docs/issue-tracker where it has a chance of being seen again. And not only by developers.
> I personally don't care about git history - I read code not commit messages.
Honest question: why do you even use version control? What do you get out of it?
Based on your workflow, you could just as well not use it at all, and create zip files and multiple copies of files with names like `_final3_working_20250925`.
Change history is the entire point of version control. It gives you the ability to revert to a specific point in time, to branch off and work on experiments, to track down the source of issues, and, perhaps most useful of all, to see why a specific change was done.
A commit gives you the ability to add metadata to a change that would be out of place in the code itself. This is often the best place to describe why the change was done, and include any background or pertinent information that can help future developers—including yourself—in many ways. Adding this to the code base in a comment or another document would be out of place, and difficult to manage and discover.
You may rarely need to use these abilities, but when you do, they are invaluable IME. And if you don't have them at that point, you'll be kicking yourself for not leveraging a VCS to its full potential.
> I have tried to be more disciplined with commits and stuff like that but I find that it just slows me down and makes the work feel more difficult.
Of course it slows you down. Taking care of development history requires time and effort, but the ROI of doing that is many times greater.
I encourage you to try to be disciplined for a while, and see how you feel about it. I use conventional commits and create atomic commits with descriptive messages even on personal projects that nobody else will work on. Mainly because it gives me the chance to reflect on changes, and include information that will be useful to my future self, months or years from now, after I inevitably abandon the project and eventually get back to it.
Here is an example from a project I was working on recently[1]. It's practically a blog post, but it felt good to air my grievances. :)
[1]: https://github.com/hackfixme/sesame/commit/10cd8597559b5f478...
We use it to save our progress, backup files, communicate with others. You know, the main benefits of version control?
Writing a story no one will ever see is not one of them. Write real docs and your PM, QA, SMEs will benefit as well, not only developers who bother to dig thru the history.
> We use it to save our progress
Saving progress is useless if your history is a mess and you have no idea what a previous state contains.
> backup files, communicate with others.
You do know that there are better tools than a VCS specifically built for these use cases, right?
> You know, the main benefits of version control?
No, I don't think you understand what version control is for.
You can use a knife to open a wine bottle, but that doesn't mean it's a good idea.
> Writing a story no one will ever see is not one of them.
You won't. I definitely will, and I take the liberty to be as verbose as I need in personal projects.
> Write real docs and your PM, QA, SMEs will benefit as well, not only developers who bother to dig thru the history.
You should write "real docs", but that's not what commit messages are for. They're not meant to be read by non-developers either. And developers don't have to "dig thru the history" to see them. Commits are easily referenced and accessible.
> Saving progress is useless if your history is a mess…
Nope, it works, commit! Tests pass, commit! Push.
You’ve demonstrated you don’t know what version control is for. Cleaning up the past is a peripheral nicety, that is not at all core.
In fact some situations prefer history not be changed at all.
If you truly think the main point of version control is to maintain a coherent commit history, then you are deluded. For most teams, if it can do the following:
1. Allow collaboration
2. Have branching and merging
3. Have diffs between two points in time/branches/tags
4. Allow release tagging
it is enough to work with. This is not to say that a coherent git history isn't great, but to call it the main point is something else. That is definitely not how a lot of teams are using git or any version control.
Honestly I didn't even know you were allowed to write that much. I've always tried to make the commit message fit in a sentence or two, what you showed here looks more like what I'd write on a PR description. Except I wouldn't write that much there either.
> Honest question: why do you even use version control? What do you get out of it?
> Change history is the entire point of version control. It gives you the ability to revert to a specific point in time, to branch off and work on experiments, to track down the source of issues, and, perhaps most useful of all, to see why a specific change was done.
You answered your own question here. I get pretty much all of that stuff. Maybe you get some of that stuff a bit better than I do but I don't really think there's much of a difference. I can still go back in time and make changes etc, I can't necessarily revert every specific small change ever made using git alone but I can easily just make that change as a new commit instead - which is probably faster than scanning through a million commit messages trying to find the one I want to revert anyway.
I can go back to some arbitrary point in time just like you can. My resolution may not be as fine as someone who makes more commits but so what? Being able to go back to an arbitrary day and hour would be timetravel enough for me, I don't need to be able to choose the specific second.
Just to be clear I do make commits and I do try to write descriptive messages on them - I just also try to avoid spending more than a few seconds deciding what to write. That commit you just showed is larger than most of mine. That's a whole PR for me, which is pretty much what I said initially: I'll just do the whole task and commit it all in one go - what I'm not doing is splitting it up into 50 individual commits like some people would want.
I think the primary difference between the two of us is that you write huge commit messages and I don't, aside from that our commits seem very similar to me.
I get where you're coming from, but I do think you're doing a disservice to yourself and your team by not doing granular commits. This doesn't mean splitting changes into arbitrary chunks, or that each commit must be of a specific size, but that each commit does a single change. I find conventional commits helpful for this, since they force you into thinking about the type of change, which naturally limits the commit scope. I don't do a perfect job at this either, and often bundle unrelated changes into one commit if I'm being lazy, but usually I strive to keep it clean.
There are many benefits of doing this. When following the output of `blame`, you can see exactly why a change was made, down to the line or statement level. This is very helpful for newcomers to the codebase, and yourself in a few months' time. It's invaluable for `bisect` and locating the precise change that introduced an issue. It's very useful for cherry picking specific changes across branches, or easily reverting them, as you mention. It makes it easier to write descriptive changelogs, especially if you also use conventional commits, which nowadays can be automated with LLMs. And so on.
Most of these tasks are very difficult or impossible if you don't have a clean history. Yes, they require discipline, time, and effort to do correctly, but it saves you and the team so much time and effort in the long run.
Ultimately, it's up to each person or team to use a VCS in whatever way they're most comfortable with. But ignoring or outright rejecting certain practices that can make your life as a developer easier and more productive, even though they require more time and effort upfront, is a very short-sighted mentality.
> I think the primary difference between the two of us is that you write huge commit messages and I don't
The commit I linked to is an outlier, and if you see my other commit messages, most are a few sentences long. It's not about writing a lot, but about describing the change: what led to it, why it was implemented in a specific way, mention any contextual information, trade-offs, external links, etc. In that particular case it was a major feature (indicated by the exclamation point in the subject) that changed large parts of the code base, so it deserved a bit more context. I was also feeling a particular way and used the opportunity to vent, which isn't a good place for it, but since this is a personal project, I don't mind it. Although how the programmer felt while writing the code is also relevant contextual information, and I would gladly read it from someone else in order to understand their state of mind and point of view better.
Also, these days with LLMs you can quickly summarize large amounts of text, so there's no harm in writing a lot, but you can never add context that doesn't exist in the first place.
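For readers who haven't seen the convention, here is a minimal sketch in Python of how a Conventional Commits style subject line breaks down. It only covers the header shape `type(scope)!: description` from the 1.0.0 spec linked earlier in the thread, where a `!` before the colon marks a breaking change. The `Header` type, the `parse_header` name, and the example subjects are made up for illustration, and the regex is a simplification rather than a full validator.

```python
import re
from typing import NamedTuple, Optional

# Simplified header pattern: type, optional (scope), optional "!", then ": description".
HEADER_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)"
    r"(?:\((?P<scope>[^()\r\n]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>.+)$"
)

class Header(NamedTuple):
    type: str
    scope: Optional[str]
    breaking: bool
    description: str

def parse_header(subject: str) -> Optional[Header]:
    """Return the parsed header, or None if the subject doesn't follow the shape."""
    m = HEADER_RE.match(subject)
    if m is None:
        return None
    return Header(m["type"], m["scope"], m["breaking"] is not None, m["description"])

# Hypothetical subjects, as they might appear in a one-line log.
for subject in [
    "feat(cli)!: replace config format",
    "fix: handle empty input",
    "fixed stuff",
]:
    print(subject, "->", parse_header(subject))
```

The last subject does not parse, which is roughly the argument being made: the convention nudges you to name the kind of change before describing it.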
I never work with squashers. There's no open source project I know of, which accepts squashers. Many companies do so, because their management is fucked, and if you see such a thing, look for another job immediately.
Baseless claim, just your opinion.
> Why would I care about "commit A", "change parts of A because I misunderstood a requirement", "improve A based on code review" etc.?
For me it’s because Feature A may largely be fine but one of those intermediary commits introduced a regression. I can bisect and isolate an issue much more easily if I have the full history to step through as opposed to “this big commit introduced a one-line regression _somewhere_ in a 900 line commit”.
> "commit A", "change parts of A because I misunderstood a requirement", "improve A based on code review" etc.
People are supposed to rebase all that noise away. Changes are supposed to be structured as sensible chunks that build up to the desired feature. It's like showing your work in a math exercise: you don't write out the final answer with no explanation, you demonstrate step by step how you reached it.
You should care because if the author cared enough to make descriptive atomic commits, they will help you understand why a particular change was done. This can often avoid unnecessary discussions.
And, no, PRs are not necessarily an atomic level of work. While they should contain a single feature, fix, etc., sometimes that work can span multiple commits.
If the PR includes superfluous commits, then they should be squashed into the appropriate commit. Squashing the entire PR when it includes multiple changes is simply a bad practice. It's bad because you lose all the history of how the overall change was done, which will be useful in the future when you need to do a blame, cherry pick, bisect, etc.
It's surprising to me how many developers misunderstand the value of atomic commits, or even what they are. And at the same time, it's exhausting having this discussion every time that happens, especially if there is continued pushback.
I am not against people having their preferred way of using VCS tools. As long as it works for their team, that's fine. But there are certain best practices that simply help everyone in the long-term, including the author, that I'm baffled whenever they're willfully ignored. I can't help but think that it's often done out of laziness, and lack of discipline and care in the work they do, which somehow becomes part of their persona as they gain more experience.
> You should care because if the author cared enough to make descriptive atomic commits, they will help you understand why a particular change was done. This can often avoid unnecessary discussions.
That's what the description field is for. I never, ever inspect the "commits" tab in a PR unless I see some ludicrous number on it. And even then it's just to see what the heck happened.
> If the PR includes superfluous commits, then they should be squashed into the appropriate commit.
This happens on merge if your Github is set up correctly.
> Squashing the entire PR when it includes multiple changes is simply a bad practice.
The bad practice is the PR changing multiple distinct things.
> It's bad because you lose all the history of how the overall change was done, which will be useful in the future when you need to do a blame, cherry pick, bisect, etc.
It's not.
> That's what the description field is for.
No. The PR description is for describing the overall change, which, again, may include multiple commits. The description can also include testing instructions, reviewing suggestions, and other information which is not suitable for a commit message.
PR descriptions can be edited and updated during the review, which can be helpful. A commit message is immutable, and remains as a historical artifact.
Also, when I'm working on a code base, the last thing I want is to go hunting for PRs to get context about a specific change. The commit message should have all the information I need directly in the repo.
> I never, ever inspect the "commits" tab in a PR unless I see some ludicrous number on it.
And... you're actually proud of this? Amazing.
Have you ever read a descriptive commit message? Do you even know what they look like?
I'm taken aback by the idea that there are developers who would take the time and effort to write a detailed commit message, only for others to not only never read it, but to be proud of that fact. Disgraceful.
> This happens on merge if your Github is set up correctly.
No. This is what I mean about developers not understanding what atomic commits even are. There are commits that will be done during a review, or as ad-hoc fixes, which indeed shouldn't exist when the PR is merged. But this doesn't mean that the entire PR should be squashed into a single commit.
Those useless commits should instead be squashed into the most relevant commit, which is straightforward if you create `--fixup` commits which can then be automatically squashed with `rebase --autosquash`.
But the PR may ultimately end up with multiple atomic commits, and squashing them all into a single commit would nullify the hard work the author did to keep them atomic in the first place.
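For anyone unfamiliar with the `--fixup` / `--autosquash` flow mentioned above, here is a toy model in Python of what happens to the rebase todo list. It is a sketch under simplifying assumptions, not git's implementation: it only handles `fixup! <subject>` commits and matches on the exact subject, whereas real git also accepts other forms such as `squash!`. The `autosquash` function name and the commit subjects are invented for the example.

```python
from typing import List, Tuple

Todo = List[Tuple[str, str]]  # (action, subject), oldest commit first

FIXUP_PREFIX = "fixup! "

def autosquash(subjects: List[str]) -> Todo:
    """Build a todo list where each fixup sits right after the commit it targets."""
    todo: Todo = []
    for subject in subjects:
        if subject.startswith(FIXUP_PREFIX):
            target = subject[len(FIXUP_PREFIX):]
            for i, (action, picked) in enumerate(todo):
                if action == "pick" and picked == target:
                    # Skip past fixups already attached to this target.
                    j = i + 1
                    while j < len(todo) and todo[j][0] == "fixup":
                        j += 1
                    todo.insert(j, ("fixup", subject))
                    break
            else:
                # No matching earlier commit: keep it as an ordinary pick.
                todo.append(("pick", subject))
        else:
            todo.append(("pick", subject))
    return todo

# Hypothetical branch as it looks after review, oldest first.
branch = [
    "refactor: extract parser",
    "feat: add export command",
    "fixup! refactor: extract parser",   # review fix for the first commit
    "docs: describe export",
    "fixup! feat: add export command",   # review fix for the second commit
]

todo = autosquash(branch)
for action, subject in todo:
    print(action, subject)

# After the rebase runs, only the "pick" entries remain as commits.
final_commits = [subject for action, subject in todo if action == "pick"]
print(final_commits)
```

In this example five commits collapse into three: the review fixes are folded into the commits they belong to, and the three atomic commits survive the merge.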
If you configure GitHub to always squash PRs, or to always create a merge commit, or to always rebase, you're doing it wrong. Instead, these are decisions that should be made on a case-by-case basis for each PR. There are situations when either one of them is the best approach.
> The bad practice is the PR changing multiple distinct things.
Right. I'm sure you enjoy the overhead of dealing with a flood of small PRs that are all related to a single change, when all of it could be done in a single PR with multiple commits. This is easier to review, discuss, and merge as a single unit, rather than have it spread out over multiple PRs because of a strict "one PR-one commit" policy.
All that rule does, especially if you have PR squashing enabled by default, is create a history of bloated commits with thousands of lines of unrelated changes, that are practically useless for cherry picking, bisecting, and determining why a specific change was done, which is the entire point of commits. Good luck working on that codebase.
> It's not.
k.
Virtue signaling is not a business need.
If you want to communicate with others, write proper docs in a format that won't be lost to time, and are accessible to everyone, not merely investigative developers.
Virtue signalling? What are you on about?
Everything I said has direct benefits for the team, and hence for the company.
> If you want to communicate with others, write proper docs in a format that won't be lost to time
You have a severe misunderstanding of what commit messages are for. They're meant to describe changes that can be used as historical reference by developers. They're not meant to be read by non-developers, serve as replacement for "proper docs", or for general communication.
A VCS history is by definition never "lost to time". It is an immutable record of the development process of the project. If you don't find that useful, choose not to use it to its full potential, and strangely relish in that fact, you might as well use another tool.
> And... you're actually proud of this? Amazing.
Your posts above are dripping in it.
Docs are available to everyone, accessibility in action. You have a severe misunderstanding of what communication is.
There’s no important developer information that should be explicitly and effectively hidden from others. There’s not even a proper search facility; you have to browse with a lot of background knowledge until you hopefully find something. Newer members won’t have this knowledge.
Code changes, requirements change, often. Info becomes obsolete rather quickly. Projects may last decades. By definition, historical assumptions are inferior. There’s already a mechanical commit record as well.
So yes, any important information buried there is going to be lost to time, and writing it is therefore a waste of time.
I always tell my engineers to create atomic commits and we usually review commit by commit. Obviously commits like "fixed review comments" or "removed some left-over comments" or "fixed typo" should not be pushed into a PR you asked others to review. I expect people to understand how to clean their commit history - if they don't I teach them. The senior people who are capable of structured work - e.g. are used to contributing to open source projects - do it anyway. Because messiness is usually not tolerated by maintainers of important projects.
You'll find that people who aren't able to craft clean commits and PRs usually thrive in environments in which people either work mostly alone or in which cooperation is enforced by external circumstances (like being in the same team in a company). As soon as many developers are free to choose whom to associate with and whose code they accept, rules are usually made and enforced.
> Obviously commits like "fixed review comments" or "removed some left-over comments" or "fixed typo" should not be pushed into a PR you asked others to review.
Could you explain this a bit more? I'm having trouble visualizing the end to end process.
1. Someone has what they feel is a complete change and submits a PR for review.
2. The reviewers read part of it, first half looks good, and halfway through they have concerns and request changes.
3. The submitter now has to fix those concerns. If they are not allowed to push an additional commit to do this, how do you propose they accomplish this? Certainly they should not force push a public branch, right? That would cause pain to any reviewer who fetched their changes, and also break the features on GitHub such as them marking which files they have already read in the UI. But if we cannot push new commits and we cannot edit the existing commits, what is the method you are suggesting?
The reason for that tidiness with some open source projects is because they want to conserve the valuable time of the maintainers, and are willing to expend a lot of time from people wanting changes to do that.
That's not the situation in a normal corporate environment. You want to reduce total time expended (or total cost, at least). It's going to be cheaper to just have a chat with your coworker when a PR is confusing.
> Why would I care about ...
You would not allow those commits. Code review improvements should appear as fixup commits which should be autosquashed on merge. It is a shame that GitHub does not support autosquash though.
Not sure, squashing is hiding history, I prefer to see history, even if it is not clean or buildable.
I actually prefer to not be dogmatic about one approach or the other, and I think the answer will be different for different dev/ci/deployment workflows.
Here's my approach, of course from experience limited to my (past) workplace. We have the usual CI setup, where each merged PR triggers a build followed by a deploy to staging.
This means that what goes in a PR is decided by what sub-functionality of the feature at hand has to be tested first[0], whereas what goes in commits is decided by what is easy to read for reviewers for PRs where such an approach makes sense [1], or it simply doesn't matter much like you said, for a lot of other cases.
That is the way I like to think about it.
I know I know git bisect etc... but IME in the rare cases we used it we just ran bisect on the master branch which had squashed PR level commits, and once we found the PR, it was fairly straightforward to manually narrow it down after that.
In more systems-level projects there will actually be clear layers for different parts of your code (let's be honest, business logic apps are not structured that well, especially as time goes on) and the individual-commits-should-work approach works well.
[0] ideally, the whole feature is one PR and you config-gate each behaviour to test individually, but that's not always possible.
[1] for example, let's say we have a kafka producer and consumer in the same repo. They each call a myriad of internal business logic functions, and modifications were required there too. It is much easier to read commits separating changes, even to the same function, made as part of the producer flow and the consumer flow.
I've met thousands of developers over my career and I could put them into two categories: those who don't give a shit about intermediate commit messages (majority) and those who browse every single intermediate commit message in a PR (very few). To be honest, the latter had some tendency to be difficult to work with. It was also a useful discriminator to avoid getting those into my teams.
I tend to read intermediate commits because it can be helpful in understanding how the engineer thought through developing the feature. This is especially informative when reviewing more junior/mid-level code, or when a feature grows beyond what I would consider acceptable scope - obviously, avoiding these kinds of branches is the ideal state, and unfortunately the reality doesn’t always let me push back for smaller PRs.
> Reading and reviewing clean history is really so much nicer.
You can have both with git and it's not even hard. Unfortunately it seems many people pride themselves in what little they know of git. I'm not being sarcastic, I've read people say this almost word-for-word.
git is a means, not an end
commits mean precisely what their author intends them to mean, nothing more
if you squash-merge every PR then history is clean where it matters
To quote my least favourite HN response: "No."
Such developers should be condemned to work with CVS until they repent for their sinful statements.
IMO there's no point having a clean history of commits within a PR. With rare exceptions, if you have a PR with a clean history of commits and each commit compiles and passes the tests... they should be separate PRs! If it isn't clean then it should be squashed.
A few exceptions:
1. When refactoring often your PR is "do an enormous search and replace, and then fix some stuff manually". In that case it's way easier to review if the mechanical stuff is in a separate commit.
2. Similarly when renaming and editing files, Git tracks it better if you do it in two commits.
3. Sometimes you genuinely have a big branch that's lasted months and has been worked on by many people and it's worth preserving history.
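On exception 2: Git does not record renames, it infers them from content similarity, so a rename combined with heavy edits in one commit can fall below the detection threshold. A small sketch of the two-commit approach, with made-up file names and contents:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

printf 'line 1\nline 2\nline 3\n' > old_name.txt
git add old_name.txt
git commit -q -m "Add old_name.txt"

# Commit 1: pure rename, content untouched, so similarity is 100%.
git mv old_name.txt new_name.txt
git commit -q -m "Rename old_name.txt to new_name.txt"

# Commit 2: the actual edit, reviewed as a normal diff.
printf 'line 1\nline 2 changed\nline 3\nline 4\n' > new_name.txt
git commit -q -a -m "Edit new_name.txt"

# The rename commit shows up as a rename, not as a delete plus an add.
git show -M --format= --name-status HEAD~1

# History follows the file across the rename.
git log --follow --format=%s -- new_name.txt
```

Reviewers get a trivially checkable rename commit followed by a diff that only contains the real change.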
Also I really really wish GitHub had proper support for stacked PRs.
This is truer now that `git bisect --first-parent` exists. But it didn't always. And even then, there are times you find out that there is "prep work" to land your feature. And a PR just to do some deck chair moving that makes a follow-up commit easier is kind of useless. I have done prep work as a separate PR, but this is usually when it is more extensive than the feature and it is worthwhile on its own.
Another instance is a build system rewrite. There was a (short) story of the new system itself and then a commit per module on top of that. It landed as 300+ commits in a single PR. And it got rebased 2-3 times a week to try and keep up as more bits were migrated (and new infra added for things other bits needed). Partial landing would have been useless and "rewrite the build system" would have been utter hell for both me developing and anyone that tries to blame across it if it hadn't been split up at least that much.
Basically, as with many things in software development, there are no black-and-white answers here.
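To make the `git bisect --first-parent` point concrete, here is a rough sketch, assuming a reasonably recent Git (as far as I know the option arrived in 2.29). The branch names, the `known-good` tag, and the `status.txt` "test" are all invented for the example. Feature A contains a broken WIP commit that bisect never visits, because only the merge commits on `main` are candidates.

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo ok > status.txt
git add status.txt
git commit -q -m "Initial commit"
git tag known-good

# Feature A: broken halfway through, fine by the time it merges.
git checkout -q -b feature-a
echo broken > status.txt
git commit -q -a -m "WIP"
echo ok > status.txt
echo a > a.txt
git add a.txt
git commit -q -a -m "Finish feature A"
git checkout -q main
git merge -q --no-ff -m "Merge feature-a" feature-a

# Feature B: the merge that actually introduces the regression.
git checkout -q -b feature-b
echo b > b.txt
git add b.txt
git commit -q -m "Add b.txt"
echo broken > status.txt
git commit -q -a -m "Change status handling"
git checkout -q main
git merge -q --no-ff -m "Merge feature-b" feature-b

echo c > c.txt
git add c.txt
git commit -q -m "Later work on main"

# Only walk main's first-parent chain; the "test" passes when status.txt says ok.
git bisect start --first-parent HEAD known-good >/dev/null
git bisect run grep -qx ok status.txt >/dev/null
first_bad="$(git rev-parse refs/bisect/bad)"
git bisect reset >/dev/null 2>&1

git log -1 --format=%s "$first_bad"
```

Bisect lands on the merge of `feature-b`, and from there you narrow it down inside that PR by hand or with a second, ordinary bisect.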
> IMO there's no point having a clean history of commits within a PR. With rare exceptions, if you have a PR with a clean history of commits and each commit compiles and passes the tests... they should be separate PRs! If it isn't clean then it should be squashed.
I think that whether clean history has a point really depends on how deep your refinement sessions are. And perhaps a bit on the general health of your codebase.
If you don't do refinement with your editors open and grind tickets into dust, there will be side-changes adjacent to each PR which are not directly related to the ticket. These are better to have their own commit (and commit message).
> IMO there's no point having a clean history of commits within a PR. With rare exceptions, if you have a PR with a clean history of commits and each commit compiles and passes the tests... they should be separate PRs! If it isn't clean then it should be squashed.
A perfect illustration of a backwards mindset. If this made sense then the standard or least common denominator PR tool would work better with many small PRs, which here also means they must be able to depend on each other. (separate PRs!!!) So is it?
> Also I really really wish GitHub had proper support for stacked PRs.
No. It doesn’t even support it.
So how does this make sense? This culture of people wanting “one PR” for each change, and then standard PR tool that everyone knows of doesn’t even support it? What’s the allegiance even to, here? Phabricator or whatever the “stacked” tools are?
It’s impressive that Git forge culture has managed to obfuscate the actual units of change so much that heavyweight PRs have become the obvious—they should be separate PRs!—unit of change... when they don’t even support one-change-then-another-one.
Yeah it's kind of infuriating really. It's not like it's an uncommon workflow either. Everywhere I've worked people end up with PRs that just say "this depends on this other PR; ignore the first commit".
Gitlab kind of supports it - if your second PR's target branch is the first PR then it will only show you the code from the second PR and it will automatically update the target branch to master when the first one gets merged. I wouldn't say it's first class support though.
Sapling sort of has support for making it work on GitHub: https://sapling-scm.com/docs/addons/reviewstack/
And there was some forge that supports Jujutsu that has proper first class support, but I can't find it now.
Anyway it's a very useful workflow that lots of people want and kind of insane that it isn't well supported by GitHub.
To be fair I can't remember the last time GitHub introduced any really new features. It's basically in maintenance mode.
Also replying to echo this. Hate these blanket statements.
The "mid levels who consider themselves senior" are the exact type of people who I see saying what you're saying, i.e.
* Yes, TDD on production code is nice in theory, but it doesnt work in my case.
* Yes, short PRs are nice in theory, but it doesnt work in my case.
In every case, as far as I can see, it meant "It does work, I just dont know how to do it".
When I say "if you dont think it works in your case, come to me, Ill show you" they often demur and I end up with a huge PR anyway.
In practice I dont think ive ever seen a long PR that wouldnt have benefitted from being strategically broken up, but every other day I see another one that should have been.
> Yes, TDD on production code is nice in theory, but it doesnt work in my case...
Parent said something more along the lines of "they don't work in every case, and trying to force it in every case is misguided".
I agree that too big is more common than too small with respect to PR size, but you aren't putting forward much of an argument against parent's "there are no absolutes" argument by straw-manning them.
I think "doesn't work in every case" is true for basically every rule of course. But 99% of people in the industry are not qualified to make that call because they will always choose "not" out of laziness rather than because it actually wasn't a good idea
Have we stopped celebrating laziness as a virtue in software development? Discipline doesn't and will never scale, and the pressures of business mean that processes that put up walls to shipping will always crumble.
Real example, we do PR reviews because they're required for our audit and I'm of the opinion that they're mostly theater. It's vanishingly rare that someone actually wants a review rather than hitting the approve button and will call it out specifically if they do. Cool. This means you can't count on code review to catch problems, discipline doesn't scale after all. So instead we rely on a suite of end to end tests that are developed independently by our QA team.
Give me one example then. One is all it takes to disprove a rule.
Im fairly sure that I could explain how to break up any long PR in a sensible way. Parent thinks couldnt be done, so do you - what is an example?
The only exception i can think of is something where 99.9% of the changes are autogenerated (where i wouldnt really be reading it carefully anyway, so the length is immaterial...).
> Im fairly sure that I could explain how to break up any long PR in a sensible way. Parent thinks couldnt be done, so do you - what is an example?
Not couldn't - but shouldn't, such as when there's tight coupling across many files/modules. As an example, changing the css classes and rules affecting 20+ components to follow updated branding should be in one big PR[1] for most branching strategies.
Sometimes it's easier to split this into smaller chunks and front-load reviews for PRs into the feature branch, and then merge the big change with no further reviews, which may go against some ham-fisted rule about merging to main. Knowing when to break rules and why, ownership, and caring for the spirit of the law and not just the letter are what separates mid-levels from seniors.
1. Or changeset, if your version control system allows stacking.
You can and should break that up because I'm probably going to want to see screenshots to ensure that the branding changes make sense in context and everything looks consistent.
How would you do this? You'd either
1. Create N pull requests then merge all of them together into a big PR that would get merged into mainline at once
2. Do the same thing but do a bit of octopus merging since git merge can take multiple branches as arguments. Since most source control strategies are locked down, this isn't usually something that I can tell my juniors to do
The point of breaking things down like this is to minimize reviewer context. With bigger PRs there's a human tendency to try and hold the whole thing in your head at once, even if parts of the pull request are independent from others.
> The point of breaking things down like this is to minimize reviewer context.
This principle is much more important than some rule that says "Merges to main should not be more than 150 lines long". Sticklers for hard-and-fast rules usually haven't achieved the experience to know that adhering to fundamental principles will occasionally direct you to break the rules.
> Merges to main should not be more than 150 lines long
This can be done by allowing a flag in the commit message that bypasses the 150 line long (or whatever example) rule in the CI that enforces it. Then the reviewers and submitter can agree whether or not it makes sense to bypass the rule for this specific case.
In many cases like this, it's okay to override a rule if the people in charge of keeping the codebase healthy agree it's a special case.
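A rough sketch of what such a CI check could look like. Everything here is hypothetical: the `[allow-large-pr]` marker, the function names, and the 150-line limit are placeholders, not an existing tool or convention.

```shell
# Hypothetical limit and bypass marker; pick whatever your team agrees on.
MAX_CHANGED_LINES=150
BYPASS_MARKER="[allow-large-pr]"

# changed_lines BASE HEAD: added + deleted lines since the merge base.
# Binary files show "-" in --numstat and are skipped.
changed_lines() {
  git diff --numstat "$1...$2" |
    awk '{ if ($1 != "-") total += $1 + $2 } END { print total + 0 }'
}

# has_bypass_marker BASE HEAD: true if any commit message in the range
# contains the marker.
has_bypass_marker() {
  git log --format=%B "$1..$2" | grep -qF "$BYPASS_MARKER"
}

# check_pr_size BASE HEAD: non-zero exit fails the CI job.
check_pr_size() {
  lines="$(changed_lines "$1" "$2")"
  if [ "$lines" -le "$MAX_CHANGED_LINES" ]; then
    echo "ok: $lines changed lines"
  elif has_bypass_marker "$1" "$2"; then
    echo "ok: $lines changed lines, limit bypassed by $BYPASS_MARKER"
  else
    echo "fail: $lines changed lines exceeds $MAX_CHANGED_LINES"
    return 1
  fi
}
```

In a pipeline this would be called as something like `check_pr_size origin/main HEAD`. The marker lives in the commit message, so the decision to bypass is visible in review and in history.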
minimizing reviewer context is one thing a PR can try to do, but it's not like that's any kind of universal most-important metric that every PR needs to optimize for, in fact very often minimizing reviewer context is in direct tension with making changes that are holistic and coherent
code review is meant to take time
>Not couldn't - but shouldn't, such as when there's tight coupling across many files/modules.
No, this is a pretty classic example of where you can break up the work by first refactoring out the tightly wound coupling in one PR before making the actual (now simpler/smaller) change in a second PR.
I agreed with you initially.
> I'm fairly sure that I could explain how to break up any long PR in a sensible way. Parent thinks couldnt be done, so do you - what is an example?
To me, when I meet experts in any field, the quality that stands out isn't that they do everything to expert level, it's that they get everything done as they said they would. Sometimes that means big PRs, because that's the environment created, and the expert finds the way to get the job done.
I'm not doubting you _could_ break up any PR into a shorter one. But that's kind of the point of an expert: they recognise what makes sense to do in reality, rather than just doing something because it's best practice and expecting everyone else to do the same.
They ultimately get the thing done how they said they would.
In my experience - this one is the correct one. Make a commitment, keep the commitment, stay responsible for the commitment afterwards.
This whole chain is like arguing on how tidy your desk should be. Some people like it fastidious to the nth degree. Some people prefer a little mess.
In neither case does that preference really matter much compared to all the other things a real job entails.
>I'm not doubting you _could_ break up any PR into a shorter one. But that's kind of the point of an expert: they recognise what makes sense to do in reality
I have seen plenty of huge PRs which were more trouble than they were worth to break up after discovery. At some point it becomes like unbaking a cake. It's a trade off.
Ive just never thought when I saw any of them that there wasnt a more practical way to get there with a bunch of smaller PRs.
Unlike dealing with an already existent large PR this isnt really a trade off thing - there are basically almost no circumstances when it is preferable to review one 1000 line code change instead of 4x self contained 200 line changes.
> they get everything done as they said they would
This. It saves everyone's time.
Let's say you have an API bug where you are allowing clients to ask for data with a very expensive query, and you want to disallow that behavior. I think it makes more sense to change both the API spec (remove the endpoint) and the backend (disallow queries of that kind in the future) in one go. That way you can note in the one PR exactly why both changes are made, referencing whatever bug/postmortem. Making two PRs that separately look like fixes to the issue can be confusing later, and doesn't really buy you any clarity in the PR itself.
Is it possible to break them up? Sure. Is it better to do so? I don't think so.
Also, for clarity, neither myself nor OP ever said it couldn't be done.
A widely used class has a bad API. You refactor it to make a cleaner API, but your change isn't backwards compatible. Your options are:
- Change everything all at once. This creates a large PR.
- Split it up into multiple small PRs. Now your individual PRs don't compile, and make less sense on their own.
- Create a new class, and then split up multiple PRs that transition code to use the new class, and then finally a PR to remove the old class. This is more work for both the author and the reviewer and the individual PRs are harder to understand on their own.
This is interesting. I believe one way to deal with such breaking changes is to have multiple PRs, where the breaking change in each is hidden (protected) by a feature flag and tested by unit tests. Once all the PRs are committed, end to end testing can be done by enabling the flag. Any problems in production can be quickly reverted by disabling the flag. Eventually, a final PR removes the now-useless flag.
Of course, your mileage may vary; this technique is certainly not suitable for all breaking changes or all workflows.
API changes breaking BC feel like they should be using versioning. I don't see enough people putting versioning support in to their API stuff at the outset. I've been chastised for doing that with "YAGNI". And then... one day, we do need it, and trying to introduce versioning support becomes... that much harder.
A new feature that fundamentally changes the way a lot of code is structured.
A group of features that only combined produce a measurable output, but each one does not work without the others.
A feature that will break a lot of things but needs to be merged now so that we have time for everyone to work on fixing the actual problems before deadline X as it is constantly conflicting every day and we need to spend time on fixing the actual issues, not fixing conflicts.
If I'm refactoring, truly refactoring, a 10k line PR where all the renaming happens at once is mandatory or else it won't compile. The only other option would be incremental refactors with an intermediate, parallel state that adds complexity, increases the likelihood of missing something and makes a 30 minute, 1 time PR review become 12x10 minute PR reviews.
Obviously it has to be a pure refactor, entirely isolated from functional changes but there are plenty of similar cases where doing it once is the least effort.
This doesn't match my experience and I assume is deeply cultural and subjective.
Sure.
> “nobody reads intermediate commit messages one by one on a PR”
I clean my history so that intermediate commits make sense. Nobody reads these messages in a pull request, but when I run git blame on a bug six months later I want the commit message to tell me something other than "stopping for lunch".
> pedantically apply DRY to every situation or forcing others to TDD basic app
Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
You may be 100% correct in your particular case, but in general if senior devs are complaining that your code is sloppy and under-tested, maybe they aren't just being pedantic.
Yes. I think many people have no culture of good commits, so they never use bisect or blame, so they never see the use of good commits. It's a cycle
Good commits are not a requirement for bisect. I commit when I think something is more or less completed, or when I want to start a major refactoring and I'm afraid I might need to revert it.
I don't always check if commits are buildable, PR should be, because that is what is merged to master and tip of master should be buildable.
I actually find the relevant PR/MR discussion a lot more useful than the commit messages themselves. So any git blame is just to get a commit hash and look that up in GitLab/GitHub to see the entire change set and any comments around it. It makes me wish those comments were bundled with the merge commit somehow and could easily be accessed in the terminal where I'm viewing the git history.
Not my experience. Often the single commit is all the context I need. If it's not, follow the merge to the ticket number to get more context.
> Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
This is a false dichotomy and an unproductive thing to focus on.
Experienced engineers know when to make an abstraction and when not to. It is based on knowledge of the project.
Abstract well and don't do compression. Easily said, and good engineers know how to do it.
My simple suggestion to my teams:
PRs are emails to your team and to your future self.
Framed in that context it's easier to carry the correct tone and think about scoping / what's important.
---
> pedantically apply DRY to every situation
I swear DRY has done more damage to the software industry from the developer side than it has done good because it has manifested into this big stick with which to bludgeon people without taking context into account.
A great way to frame DRY that I heard from hackernews: "DRY things that are supposed to have the same behavior, not things that happen to have the same behavior"
This is a really good way to put this. The "just because they do the same thing right now doesn't mean they _do the same thing_" concept is hard to convey!
I enjoy Sandi Metz' point there as well: Code just looking the same is not enough to call it duplication. Once you have to change two places looking the same to add a new feature or to fix a bug, then you have duplication and should centralize it.
I usually wait till 3; 2 is about the _very general_ point at which changing multiple places is about the same work as changing it to be centralized. 3 is almost always a better place to make that leap.
> PRs are emails to your team and to your future self.
Indeed! I've found many points in this discussion answered by the Linux kernel's use of mailing lists, where a change is discussed and then approved, often with feedback acknowledged
> PRs are emails to your team and to your future self.
This should be commits though. Typically, developers would look for clues in this order:
code -> code comment -> commit message -> PR text -> external document
So commit messages put the information closer to the user. One hop doesn't seem like much, but the time saved adds up as you go.
Also, as some other reader mentioned anecdotally, PRs may not be there forever. E.g. your team may migrate to a new platform and the PR text and reviews get left behind.
In most sane development cycles I've seen (from 2-people teams to 100k people teams), intermediate commits disappear as soon as you merge your branch (in other words, you do short lived branches + squash merge).
If you decide to do merges without squashing, then yes, you have to have more hygiene on each individual commit. It creates a lot of unnecessary friction and it's guaranteed to be slower (devs can't use commits as checkpoints/savepoints on their work, but rather each commit becomes a fully fleshed out "intermediate final state"). The only situation where I see this making sense is if you share work on a branch with other engineers (which is also a bad idea).
> devs can't use commits as checkpoints/savepoints on their work
But they can! In git you can do whatever you want with your local/remote working branch. And after you're done it's pretty straightforward to massage it into a coherent series of commits (especially if you had been working with that in mind).
> each commit becomes a fully fleshed out "intermediate final state"
This is really a team decision. You can allow intermediate commits to e.g. fail the tests, and add a tag to your main/master after each merge. Then you know that only the tagged commits are guaranteed to be fully functional.
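One way the tagging idea could look, as a sketch only. The `verified-N` tag names are a made-up convention, and in practice the tag would be created by CI after the tests pass on the merge, not by hand.

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git tag verified-1

# Intermediate commits on the branch are allowed to be broken.
git checkout -q -b feature
echo "half-done" > app.txt
git commit -q -a -m "WIP: tests may fail here"
echo "v2" > app.txt
git commit -q -a -m "Finish feature"

# Only the merge result is tested and tagged.
git checkout -q main
git merge -q --no-ff -m "Merge feature" feature
git tag verified-2

# From any commit, find the nearest known-good ancestor.
git describe --tags --match 'verified-*' --abbrev=0 HEAD
git describe --tags --match 'verified-*' --abbrev=0 feature~1
```

The intermediate history stays available for blame and review, while anyone who needs a guaranteed-working state only looks at the tagged commits.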
> And after you're done it's pretty straightforward to massage it into a coherent series of commits
Why waste time? Just squash and merge, you have a single commit and it WORKS. Intermediate messages disappear and you have a single, atomic rollback point on your main branch
> You can allow intermediate commits to e.g. fail the tests, and add a tag to your main/master after each merge. Then you know that only the tagged commits are guaranteed to be fully functional.
OR… squash and merge. Block merging with tests and compilation passing
For anything in tech, there’s the frictionless way and the busywork way. Both of your examples are busywork that’s completely unnecessary if you just… squash and merge
The best process is the process nobody needs to remember to do shit for it to work
Everything works, until it doesn't.
You always have the PR discussion to refer to, until you move to a different platform to cut costs.
You can always ask the author of the code, until they have left the company.
The squashed commit message remains, even in your extremely unlikely examples. Not sure what you’re protecting yourself against, in reality
I agree with you. It's how the best commits are on Terraform.
Everywhere I've worked the past few years squashes PRs on merge with the PR becoming the commit title + message so the context lives on in the git history.
> nobody reads intermediate commit messages one by one on a PR
I think it's fine to have a whole bunch of "WIP" commit messages on intermediate commits while the PR is in a draft stage, but then all of those garbage commits should really be squashed down into one commit and you should at least write a one liner that describes what the whole change is doing. I think it does materially make repo history harder to understand to merge in PR's with 10 garbage commits in them.
> nobody reads intermediate commit messages one by one on a PR, period.
I do! I find it the easiest way to review code when the author has taken the time to structure it in that way. I'm lucky to work with some great people.
>nobody reads intermediate commit messages one by one on a PR, period.
>Don’t waste your time writing stuff for no one.
I've thought about that as I continue to write them. I think I can justify it by saying they are mostly for me. Can I describe what I'm trying to do with a specific push in a few items? It lets me reflect on whether I'm waiting too long between commits or whether my ideas are getting too spread apart and really should be in two different branches that each have their own PRs. Then there is the rare case on a slower project where an item gets deprioritized and I come back to it weeks or even months later. Having the messages helps me get back up to speed.
As such, I find the 20 seconds or so to type out 1 to 2 sentences to be worthwhile, even if the ones reviewing the eventual PR never check. I'm also not above throwing in a "ditto" or "fixed issue" when a single commit really is that small or insignificant.
>“every commit must compile”
I agree with your take that this is overzealous, but to expand upon my previous point, if I know a commit on a branch won't compile (say something else came up and I need to swap focus for a few days), then I'll try to make sure I call that out in my last message just in case anyone else happens to get put on the project.
If I were to summarize my approach, treat PR messages seriously, but treat branch commit messages like sticky notes that will likely end up in the trash by week's end.
It's clear you've never worked on a large open source project... There are good reasons for all the practices you're thoughtlessly dismissing.
I agree that for a common team of programmers working for a single company, the value isn't always there. But that's the easiest and least interesting case... in big distributed projects this stuff really matters.
> - nobody reads intermediate commit messages one by one on a PR, period [...]
>
> - “every commit must compile” - again, unnecessary overzealousness. [...]
In my part of the world both of these are true, and proudly so. We keep catching a myriad of errors, big and small. The history is easy to read, and helps anyone catching up with how a certain project evolved.
I understand it might not be true for everyone, every team, in every line of business; but this sort of discipline pays off in quality, both of the code _and_ the team members' abilities.
Sometimes you're overhauling something which you can't do in chunks smaller than something like a 2,000-line PR. There's no intermediate working system. The problem is, "take this very large bit of code and throw it entirely away and rebuild it completely differently". Trying to craft some evolutionary step between A and B is just going to take 10x longer and won't help any code reviewers.
I agree.
When you have a large PR like this, here's how I like to get it reviewed.
1. Give reviewers some time to become familiar with the PR. They might not understand all parts of it, but they should have at least a cursory understanding of the PR.
2. Have a meeting where the PR is explained in front of the group of reviewers. The reviewers will understand the PR better and they can ask questions in realtime.
3. Let folks review the PR after the meeting in case they spot anything else, or think of additional questions.
Most of the time PR review is done asynchronously, but doing most of the review in the meeting can also be a decent team building exercise.
Yeah, ideally the reviewers have been in standups with you so that it isn't all new as a concept to them to begin with, or there's generally been communication that you're going to land the plans for a nuclear reactor in their work queue.
Hopefully you've been going around and around at a high level communicating back all the problems that you've hit and the design issues that emerged during exploratory surgery.
Then, you definitely want to schedule at least one meeting to go over it. Which can become several meetings, including follow-up meetings with one or two individuals to pound out some specific issue. Depends on the complexity of the nuclear reactor.
While I agree about not having hard and fast rules, like LoC per PR, the principles of this are very relevant.
When reviewing a conglomerate commit in a PR, I have to reverse engineer how the different changes interact to figure out the intent. I then have to do this on each update they make. Contrast that with when someone breaks up their commits, where I can zoom through variable renames, extracted functions, etc. to see the one-line change that all of that unblocked and that makes the difference. Then if updates are pushed, I only have to worry about the commits that were updated.
As for all commits compiling, that is helpful to review the individual commits.
Both of these (small commits, all compiling) are also great for bisecting. You get pointed to a very small change that you can more easily analyze vs dealing with breakages or having to analyze a large change to find what the problem is.
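To illustrate the bisecting point, here is a minimal Python sketch of the binary search that `git bisect` performs over a linear history. The commit names, the `bisect_first_bad` helper, and the `is_good` check are all made up for illustration, and the sketch assumes every commit can be built and tested, which is exactly what "all compiling" buys you.

```python
from typing import Callable, Sequence, Tuple


def bisect_first_bad(
    commits: Sequence[str], is_good: Callable[[str], bool]
) -> Tuple[str, int]:
    """Find the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is known good and commits[-1] is known bad,
    and that every commit in between can actually be tested.
    Returns the first bad commit and how many commits were tested.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi], tests


# Hypothetical history of 16 small commits; the bug arrives in "c11".
history = [f"c{i}" for i in range(16)]
culprit, tests = bisect_first_bad(history, lambda c: int(c[1:]) < 11)
print(culprit, tests)  # prints "c11 4"
```

With 16 small commits, four tests land on one small diff. A commit that cannot be built cannot be classified as good or bad; real `git bisect` has `git bisect skip` for that, at the cost of a less precise answer.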
Thank you, more people need to read this. The software industry seems packed with these strange gatekeeping structures that only hinder development.
Focus on customer outcomes, and keep main clean.
> “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
(With few exceptions,) I generally follow this practice; BUT, I think enforcing this on other developers feels like micromanagement. That being said, with few exceptions, committing code that doesn't compile feels like an incomplete sentence.
(Sometimes on massive refactors I make commits that don't compile. It gives me a place to roll back to. If someone thinks this is poor practice, then I think they're putting principles in place of practicality.)
that's the whole point.
A _branch_ is a unit of work that should be merged when done.
As the owner of a branch, an engineer has the ability to move into intermediate states. The larger the codebase, the larger the possibility of something unexpected breaking or not compiling. Just like editing a large body of text - you will have "incomplete sentences" through the process. It's part of writing. Expecting others to write their drafts the same way you like is just silly - it's putting rigid principles ahead of anything else that matters.
> reads intermediate commit messages
> every commit must compile
I’m in the opposite camp. Following these two practices often doesn’t make any difference but the few times it did saved me a ton of time.
Dropping commits or rebasing is much easier when you have descriptive, atomic commits. It’s also helpful when performing git blame archeology to try and understand why this code looks so weird and has no context. It’s also useful when bisecting (not so much a problem with small PRs, quite handy as they grow bigger and bigger)
As with everything it’s about context and circumstances. As you gain experience you can appreciate and gauge when it’s required. When you don’t have the experience then you follow rules so that you gain said experience. That’s how I see it.
> nobody reads intermediate commit messages one by one on a PR, period.
Very common practice at my old company, and one I continue in my current role.
> “every commit must compile”
sucks ass for anyone else trying to rebase your branch onto the updated main/master when they don't. Once your PR is out of "working on the feature" and into the "getting it merged" phase, do a little `git rebase -i` and squash your really intermediate commits into ones that compile. Ignore this if you have real grown up CI where your PRs never stay open for more than a day.
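One way to check that a squashed history really does compile commit by commit is `git rebase --exec`, which runs a command after each commit it replays and stops at the first failure. A minimal Python sketch that only builds the command line; the `rebase_exec_command` helper, the base branch `main`, and the build command `make test` are placeholders, not anything from this thread:

```python
import shlex
from typing import List


def rebase_exec_command(base: str, build_cmd: str) -> List[str]:
    """Build a `git rebase --exec` invocation.

    Git replays each commit of the current branch onto `base` and runs
    `build_cmd` after each one, stopping at the first commit where it fails.
    """
    return ["git", "rebase", "--exec", build_cmd, base]


# Placeholder values: substitute your own base branch and build command.
command = rebase_exec_command("main", "make test")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` inside a repository would actually run it; the sketch stops at printing so nothing gets rewritten by accident.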
> Ignore this if you have real grown up CI where your PRs never stay open for more than a day.
A vast majority of the drama that comes out of source control is associated with branches living for far too long.
I've got an internal alarm that starts to go off somewhere around 72 hours. If something takes longer than this, I've probably screwed up in my planning phase. There are some things that do need to sit, but they should be rebased every morning like clockwork. The moment things start to conflict, the PR gets closed and the branch is now a reference for how to do it again when whatever blocker is cleared.
Another way to think about all of this is to pretend like everything you are touching is taking a synchronous lock out (even if it's not), similar to how tools like Perforce behave. So, you generally want to move as quickly as possible to get out from under lock contention. Git allows you to pretend like you aren't conflicting for a really long time, but at some point you must answer for all of this debt (with interest).
> I've got an internal alarm that starts to go off somewhere around 72 hours.
Nah, in my experience, if you've got good commit hygiene you can often merge even ancient commits.
Here's a pretty hefty commit I merged five years after it was originally written, converting a ~100k line codebase from GTK to SDL2, written in 2015, committed in 2020, with tons of development in between, with "10 files changed, 777 insertions(+), 804 deletions(-)"
https://github.com/smcameron/space-nerds-in-space/commit/4ab...
I was expecting it to be a bit of a nightmare, but it really wasn't bad at all.
> sucks ass for anyone else trying to rebase your branch onto the update main
why would anyone else rebase your branches? YOU should rebase your branches.
>> You want PRs because they help others absorb what you’re doing
That isn't really where it came from though. The idea was, if I want an open source maintainer to accept my changes, I make a request to pull them from my branch. Once the open source maintainer has merged it in, they own it. If they don't like it (even one little bit), they can reject it because quality / ownership / maintenance is completely on them.
In a team environment where no one owns anything it is a little less clear what the value is. You want to incentivize the "betterness" of "something" and are using "broadened knowledge" as a proxy for that. Usually this just goes unexamined but really it would be good to establish how broad and deep you want this knowledge to be and work back from there - is the 5 minute PR review the best way to achieve it?
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
This is why I gave up reading the article shortly after reaching the point about making a history with commit messages. The comments—even if it is on a Git forum—will just be full of people that either say that it’s a waste of time or that it is literally impossible for this to be practiced by anyone.[1]
Your best bet is to find projects where this is practiced (and you don’t have to look far). But making the case to a general audience? No, too many loud voices that treat version control like “I am committing now because I need to pick up the dry-cleaning” arbitrary/random snapshot-maker.
[1] No one, period? Sounds like a bit of a strict ontological rule to me.
I disagree with everything you wrote here. PRs must absolutely limit their scope both in terms of length as well as what they accomplish (e.g. don't do unrelated refactorings in a PR delivering something else), large features must absolutely be broken into individual commits if not individual PRs, I definitely read each one and each one definitely has to complete and pass tests 100%, else they are leaking details into the next one, or into the one that would "fix" the problem. This is also absolutely nothing like "forcing TDD" on people and these are all practices that junior devs should absolutely be doing since it will help them to think about code, change and maintainability a ton.
> - nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
I often do. In a larger PR, or in one where it's hard to tell what is being accomplished, the commits can, as this article articulates, tell a story of the engineer's journey to a solution. Even if I review a commit that is largely undone by future commits, that piece of history is often key to my understanding.
This. Just last week I have split a coworker's single-commit-MR into multiple commits so that I could distinguish between unrelated changes and to check smaller chunks of code. It worked beautifully.
> - “every commit must compile” - again, unnecessary overzealousness.
Until you need to `git bisect`. Then you'll require that every commit compile, pass tests, etc.; even if that means rebase/squashing to do it.
You don't bisect a merge/pull request. There is no need for it, unless it is a giant, but then your workflow is different.
Main has clean history and every commit is good.
> - “every commit must compile” - again, unnecessary overzealousness.
So you're the one breaking git bisect all the time. Grrrr.
Use stgit and make decent commits instead of rolling in the dirt like an animal.
With commit messages you miss the point. It’s more like the final test of the commit. If you can’t formulate easily what you did and why, then you need to rethink your changes.
Thank you so much. You speak from the bottom of my heart.
>> every commit must compile
If every commit on the main branch must compile then why wouldn't it also compile in the PR branch? It doesn't make sense to ask people to review, then after that rebase and merge imo.
> “every commit must compile” - again, unnecessary overzealousness.
my understanding is that you commit when you are at the "good place", where the part of the code you are working on works. That way when you keep going and find yourself going in a direction that is not right, you can go back to the last good place. If your code doesn't even compile, that doesn't seem like a good place.
I agree with you that this shouldn't be 100% hard defaults, but it's a good standard to have, and imo it's valuable to explain why one is deviating from it.
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
Oh but they sure can be reviewed more easily, because they are shorter? Doing so feels like less effort, and you get a dopamine hit from hitting that "submit review" button faster/more often (improved morale, and PR turnaround time!). Plus, if there's a longer discussion about X, it's great if it's not tangled up with Y and Z at the same time - allowing you to dig into X.
> - nobody reads intermediate commit messages one by one on a PR, period.
Come on, that's intellectually dishonest.
1. VSCode displays commit messages inline as blame for me (and many of my colleagues), so even when we don't read the commit messages one by one _on a PR_, I often read them later in the IDE (we don't squash merge PRs). I spend significantly more time reading code than writing, and commit messages, PR descriptions and linked issues provide extra context that is useful to me especially for complex code. If those messages were entirely unreadable, I'd be annoyed.
2. When someone invests time into telling a good story commit by commit, in my team they write "Review commit-by-commit is encouraged" in the PR description, to tell the reviewers that yes, they should read the individual commits, as that'll make understanding the PR easier. Often as reviewer, I follow that suggestion.
> Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
It seems you're conflating "working on a feature" with "presenting it as a PR to review". Those are two very different things, and Edamagit in VSCode makes it so so easy to provide a reasonable commit history that hides some of your missteps, and to fill in commit messages.
> we don't squash merge PRs
You need to be careful with every single commit message, every commit must compile, etc., in your case. My comments apply if you squash-merge, in which case all that commit-level care is not necessary since intermediate commits go away on merge. You’re probably making your life harder for no reason by avoiding squash-merge, but that’s just my opinion.
> and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar.
Meh, most people won't address it or ask for that dollar. It does not mean I did not read it; I chuckled and moved on.
I do read every commit in a PR chain and every line. I am not necessarily a super attentive reviewer or something, but I never accept it without at least formally looking at it.
I write intermediate commit messages as notes to self. You don't always work continuously on the same PR. The commit messages are a useful context refresher.
Why advocate against this anyway? If no one reads them, it harms no one. Just like personal blogs. However, the writing of the blog is the useful act, not the reading.
Ironic that you are accusing the author of TFA of being an expert novice. I don't disagree with your take on him / the article, but you are committing the same sin.
You missed the point entirely. The point is forcing others to do something that has no inherent value to them or to your process, just because you like it, is junior behavior.
Who's forcing? I might have misread TFA I guess. My reading was that the guy attended a conference, enjoyed the storytelling kind of talk (I mean this is a tried and true approach, there are even flash card decks on the story technique), and wrote a blog to capture and crystallize what he liked about it as it applies to his daily activity of writing code. I didn't read anything there claiming it was the one true way and anything else is a bankrupt approach.
If the point is about forcing someone to write commit essays, then yes I did miss it.
PR review is probably at least a little performative.
But I trust my colleagues to do good reviews when I ask them to, and to ignore my PRs when I don't. That's kind of the way we all want it.
I regularly ask for a review of specific changes by tagging them in a comment on the lines in question, with a description of the implications and a direct question that they can answer.
This, "throw the code at the wall for interpretation" style PR just seems like it's doomed to become lower priority for busy folks. You can add a fancy narrative to the comments if you like, but the root problem is that presenting a change block as-is and asking for a boolean blanket approval of the whole thing is an expensive request.
There's no substitute for shared understanding of the context and intent, and that should come out-of-band of the code in question.
For any PR above a few lines of change, if a developer has not done a self review, I don’t review it at all.
Instead I request that it is self reviewed with context added, prior to requesting re-review.
I also tend to ask the question, “are any of these insights worth adding as comments directly to the code?”
9/10 the context they wrote down should be well thought out comments in the code itself. People are incredibly lazy sometimes, even unintentionally so. We need better lint tools to help the monkey brain perform better. I wish git platforms offered more UX friendly ways to force this kind of compliance. You can kind of fake it with CI/CD but that’s not good enough imo.
By self review, you mean that the developer adds comments in the code review tool? that is a great idea, I want to try this.
Yep. Self-reviewing your own PRs is a large boost to both yourself and the team, and often one of the first things I encourage new-ish developers to do.
- 90% of the time when you self-review your own PR, you're going to spot a bug or some incorrect assumption you made along the way. Or you'll see an opportunity to clean things up / make it better.
- Self-reviewing and annotating your reasons/thought process gives much more context to the actual reviewer, who likely only has a surface level understanding of what you're trying to do.
- It also signals to your team that you've taken the time to check your assumptions and verify you're solving the problem you say you are in the PR description.
Even when I worked for myself and had CodeRabbit help me do MRs, I still did a self-review before pushing any change to main.
Self-review is very, very helpful.
I always review my own PR before I expect someone else to, but I generally don't add comments. I just look it over and if I see something I want to fix I fix it. Adding comments for things I specifically want feedback on or am unsure about seems like a nice addition to the process though. I might start doing that too.
I thought everyone did this. I review twice. For each commit with -v and finally in GH/GL after I open the PR/MR. I often catch something on that last one.
It's rubber ducking.
I self review but I don’t write comments. I simply fix the code as I see the problems that I find self reviewing.
Unfortunately, these days, I am getting a lot of PRs where nobody has read the code, which came straight out of a robot. This makes me really angry.
What does `-v` mean here? I was assuming `git show`, but that doesn't seem to have a `-v` parameter.
`git commit` has a `-v` option that adds the diff to the bottom of the commit message template so you can see it while you write the message.
Yep!
Adding context to both your commits and your code review tool's pull requests / merge requests makes everyone's lives better. Including future you, who inevitably ends up looking at the PR or commit due to an incident caused by said change.
I have been following this personal rule for well over a decade, and have never regretted it.
I do it quite often and it's great, because it helps contextualise some changes that might not seem to be intuitive.
You could argue this is what commits are for, but given how people use GitHub and PRs, it gives some extra visibility.
And if you're going to use AI to assist you when writing the code I would argue this self-review step is 100% mandatory.
I've been doing this as part of my workflow for a few years now. My coworkers have expressed appreciation around that effort.
A nice side effect is that going through a self review and adding comments to the PR has helped me catch innumerable things that my coworkers never had to call me on.
Do you see any added value in adding the comments? I just fix the original commits and force push (each dev owns their branches at $JOB), but I'm wondering if I'm missing something?
The value I get is that it helps me catch errors before I ask others to check my work, and I use my PR comments as a teaching tool (I'm one of the seniors on the team).
Yeah, I never send a PR out without reviewing each commit myself and adding GitHub comments when I think it's relevant. Sometimes a PR is clear enough that I don't feel the need to add comments, though.
I self review but I don’t add comments I just fix the problems that I find. I should add clarifying comments.
I've done this and strongly recommend.
I often do this. It's a great way to highlight the areas you want a reviewer to focus on, the areas you are least happy about and want some collab. You can't always get collab pre-review, as you're writing. Plus you want to write it down and move on. Then you can async edit when the feedback comes. Not unlike a prose writing process.
This is a huge pet peeve of mine. At work I'm an expert on part of the code base that sees a fair number of contributions. I get many private IMs from colleagues asking me to "Approve please" or something like that with a dump of 100s of lines, of which maybe 10 lines are relevant to me (they touch files or behaviour that I'm an expert on, hence why they need my approval).
Minimally, I would like context for the change, why it required a change to this part of the codebase, and the thought process behind the change. Sometimes but not often enough I send the review back and ask them for this info.
IMO many software developers are just not fast enough at writing or language so providing an explanation for their changes is a lot of work for them. Or they are checked out and they just followed the AI or IDE until things worked, so they don't have explanations to provide. People not taking the time is what makes reviews performative.
> Minimally, I would like context for the change, …
… the why is important
> IMO many software developers … don't have explanations to provide. People not taking the time is what makes reviews performative.
… a lot of developers only consider the how.
i’ve had a lot of experiences of “once my PR is submitted that’s my work/ticket finished” kind of attitude.
i spent a year mentoring some people in a gaming community to become dev team members. one of the first things i said about PRs was — a new PR is just your first draft, there is still more work to do.
it helped that these folks were brand spanking new to development and weren’t sullied by some lazy teacher somewhere.
> … the why is important
>
> … a lot of developers only consider the how.
The why is someone else's job, so the developers should just ask them for a blurb to put in the PR for context, along with a note to the reviewer to ask that person for even more context if necessary.
I think there's a why with regard to the code. Why this "how" and not some other "how". (Why did you pick this algorithm, this pattern, this solution to the bigger business why.)
My team uses a github PR template with the following sections. Answers to each can be short yet it has been extraordinarily helpful to pass over important info to the reviewer that's not captured in code. It also borders on "checklist" that the code author has actually done the bare minimum to think things through.
# Goal (why is this change needed at all)
# What I changed and why I did it this way
# What I'm not doing here and how I'll follow up
# How I know it works (optional section, I include this only for teams with lots of very junior engs: "added a test" is often sufficient)
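If you want to nudge people toward a template like this, a check along these lines could run in CI. This is only a sketch: the `missing_sections` helper is hypothetical, the heading strings are copied from the template above, the example description is made up, and matching by heading prefix is an assumption about how strict you'd want to be.

```python
from typing import List

# Headings copied from the template; the optional
# "How I know it works" section is deliberately not enforced.
REQUIRED_SECTIONS = (
    "# Goal",
    "# What I changed and why I did it this way",
    "# What I'm not doing here and how I'll follow up",
)


def missing_sections(description: str) -> List[str]:
    """Return the required headings that do not start any line of the description."""
    lines = [line.strip() for line in description.splitlines()]
    return [
        heading
        for heading in REQUIRED_SECTIONS
        if not any(line.startswith(heading) for line in lines)
    ]


# Made-up PR description that forgets the follow-up section.
example = """# Goal (why is this change needed at all)
Stop retrying requests that can never succeed.

# What I changed and why I did it this way
Return early on 4xx responses.
"""
print(missing_sections(example))
```

Here the only heading reported is the "What I'm not doing here" one. A check like this only proves the headings exist, not that anyone thought about the answers, so it complements the reviewer rather than replacing one.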
That's a great idea, I should do that. There is another team that does something similar but everyone complains about it (it's not as good as your template).
> IMO many software developers are just not fast enough at writing or language
I think this is the overwhelming factor, software engineering doesn't select for communication skills (and plenty of SEs will mock that as a field of study), or at least most SEs don't start out with them.
> SEs will mock that as a field of study
Who are these people? I've never encountered that. In my experience engineers who aren't great at communication freely own up to it.
Commit descriptions are criminally under used for this purpose. You can add so much more context if you don't limit yourself to just the short commit message.
>that should come out-of-band of the code in question.
Ideally, yes. After a decade and something under ZIRP, it seems a lot of workers never had an incentive to remain conscious of their intents in context long enough to conduct productive discourse about them. Half of the people I've worked with would feel slighted not by the bitterness of the previous sentence, but by its length.
There's an impedance mismatch between what actually works, and what we're expected to expect to work. That's another immediate observation which causes people to painfully syntaxerror much more frequently than it causes them to actually clarify context and intent.
In that case the codebase remains the canonical representation of the context and intent of all contributors, even when they're not at their best, and honestly what's so bad about that? Maybe communicating them in-band instead of out-of-band might be a degraded experience. But when out-of-band can't be established, what else is there to do?
I'd be happy to see tools that facilitate this sort of communication through code. GitHub for example is in perfect position to do something like that and they don't. Git + PRs + Projects implement the exact opposite information flow, the one whose failure modes people these days do whole conference talks about.
Exactly, if you want people to think about your code/changes you should be able to give them the needed context.
If you don't know them, please realize your code isn't automatically a gift everybody waited for, you may see it that way, but from the other side this isn't clear until someone put in the work to figure out what you did.
In short: added code produces work. So the least you should do is try reducing that work by making it easy to figure out what your code is and isn't.
Sum up what changes you made (functionally), why you made them, which choices you made (if any) and why and what the state of the PR code is in your own opinion. Maybe a reasoning why it is needed, what future maintenance may look like (ideally low). In essence, ask yourself what you'd like to know if someone appeared at the door and gave you a thumb drive with a patch for your project and add that knowledge.
Also consider adding a draft PR for bigger features early on. This way you can avoid programming things that nobody wanted, or someone else was already working on. You also give maintainers a way to steer the direction and/or decline before you put in the work.
> and to ignore my PRs when I don't
PRs should be optional, IMHO. Not all changes require peer review, and if we trust our colleagues then we should allow them to merge their branch without wasting time with performative PRs.
There is a difference between a code review and approval to merge/release.
Part of the difference is that the idea that you can catch all problems with piecemeal code review is nonsense, so you should have at least some sweeping QA somewhere.
I always appreciate an extra pair of eyeballs, even on a one-liner. Everyone's an idiot sometimes.
I’m firmly in this boat too. If it’s a small change I can likely get it reviewed within minutes, if it isn’t small it should have a review regardless.
Trust, but verify. We're only human after all :-)
At $DAY_JOB we need approvals from peers due to industry regulation.
In my experience, US healthcare, that box can be checked at later stages, namely deployment to production. It's a choice to add it earlier.
If it is for checking a box, sure. If it is part of a process that aspires to deliver projects with quality and with somewhat predictable release dates, that seems way too late, imho.
And a great way to end up leaking customer data from a SQL injection or other error that could have easily been caught during a more piece-wise analysis and vetting of the related code nearer to time of writing.
Sadly it often is box checking, code review or not. I'm only stating that there is no requirement in US healthcare that I'm aware of that requires approvals before merging code. Maybe that's not true in other industries. But most regulatory frameworks that I'm aware of are flexible, ambiguous, on implementation details by design.
If you find that outcomes are the same by making approvals optional at that stage, then do so with accompanying justification.
Yes! I once read a great article I can no longer find that talked about 3 types of PRs. Simple ones that you self approve. Ones that you tag someone because you want to spread the knowledge of what has been done. And ones that need actual review. Everything being reviewed is simply unnecessary and exhausting.
SOX compliance audit looks suspiciously at this comment.
No single person being able to make changes to a system is a tenet of that.
It's a very common refrain but I don't really agree with it:
"How do you create a PR that can be reviewed in 5-10 minutes? By reducing the scope. A full feature should often be multiple PRs. A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory."
The problem with doing this is if you're building something a lot bigger and more complex than 500 lines of code, splitting that up across multiple PR's will result in:
- A big queue of PR's for reviewers to review
- The feature is split across multiple change sets, increasing cognitive load (coherence is lost)
- You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
The right answer for the size of a PR is NOT in lines of code. Exercise judgement as to what is logically easier to review. Sometimes bigger is actually better, it depends. Learn from experience, communicate with each other, try to be kind when reviewing and don't block things up unnecessarily.
> - A big queue of PR's for reviewers to review
This is a feature. I would infinitely prefer 12 PRs that each take 5 minutes to review than 1 PR that takes an hour. Finding a few 5-15 minute chunks of time to make progress on the queue is much easier than finding an uninterrupted hour where it can be my primary focus.
> - The feature is split across multiple change sets, increasing cognitive load (coherence is lost)
It increases it a little bit, sure, but it also helps keep things focused. Reviewing, for example, a refactor plus a new feature enabled by that refactor in a single PR typically results in worse reviews of either part. And good tooling also helps. This style of code review needs PRs tied together in some way to keep track of the series. If I'm reading a PR and think "why are they doing it like this" I can always peek a couple PRs ahead and get an answer.
> - You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This is a tooling problem. Git and GitHub are especially bad in this regard. Something like Graphite, Jujutsu, Sapling, git-branchless, or any VCS that supports stacks makes this essentially a non-issue.
code review isn't about diffs, it's about holistic changes to the project
the point is not queue progression, it is about dissemination of knowledge
one holistic change to a project = one PR
simple stuff really
I agree with this. One way to keep changes small but still compose them into a coherent PR is to make each commit in the final PR independently meaningful, rather than a record of what actually transpired during local development. TFA touches on this somewhat, contradicting the bit you quoted.
A trivial example would be adding the core logic and associated tests in the first commit, and all the remaining scaffolding and ceremony in subsequent commits. I find this technique especially useful when an otherwise additive change requires refactoring of existing code, since what I expect reviewers to check in each commit, and the expertise that takes, are often very different.
I don't mind squashing the branch before merging after the PR has been approved. The individual commits are only meaningful in the context of the review, but the PR is the unit that I care about preserving in git history.
The problem that I find myself in is that I almost always run into stuff I didn't expect. Some integration that I thought would be minor turns out to slowly get out of hand, and before I know it I've made way more changes than I meant to. And then it all gets tangled together.
Maybe it's just a me problem, maybe I need to be more disciplined. Not sure but it catches me quite often.
That's one of the challenges with making changes all at once: it is a lot easier for one thing going wrong to suddenly result in thousands of lines of changes.
One technique I use when I find that happening is to check out a clean branch, and first make whatever structural change I need to avoid that rabbit hole. That PR is easy to review, because it doesn't change any behavior and there are tests that verify none of my shuffling things around changed how the software behaves (if those tests don't exist, I add them first as their own PR).
Once I've made the change I need to make easy, then the PR for the actual change is easy to review and understand. Which also means the code will be easy to understand when someone reads it down the line. And the test changes in that PR capture exactly how the behavior of the system is changed by the code change.
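To make the "structural change first" step concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the shipping example, the function names, the rates); it only illustrates pinning existing behavior with tests so a reviewer can see that the refactor changes nothing.

```python
# Hypothetical example: a refactor-only PR whose tests pin the old behavior.

def shipping_cost_before(weight_kg: float, express: bool) -> float:
    """The original, tangled implementation (what is on main today)."""
    if express:
        if weight_kg > 10:
            return 25.0 + (weight_kg - 10) * 2.0
        return 25.0
    else:
        if weight_kg > 10:
            return 10.0 + (weight_kg - 10) * 2.0
        return 10.0


def base_rate(express: bool) -> float:
    """Extracted piece: the flat rate per service level."""
    return 25.0 if express else 10.0


def overweight_surcharge(weight_kg: float) -> float:
    """Extracted piece: 2.0 per kg above the 10 kg allowance."""
    return max(0.0, weight_kg - 10) * 2.0


def shipping_cost_after(weight_kg: float, express: bool) -> float:
    """The restructured implementation: same behavior, smaller parts."""
    return base_rate(express) + overweight_surcharge(weight_kg)


# Inputs chosen around the 10 kg boundary for both service levels.
CASES = [(0.0, False), (10.0, False), (12.5, False),
         (0.0, True), (10.0, True), (12.5, True)]


def behavior_unchanged() -> bool:
    """True if old and new implementations agree on every pinned case."""
    return all(
        shipping_cost_before(w, e) == shipping_cost_after(w, e)
        for w, e in CASES
    )
```

The follow-up PR that actually changes behavior (say, a new surcharge rule) would then touch only `overweight_surcharge` and its tests, which is the part that makes it easy to review.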
This skill of how to take big projects and turn them into a series of smaller logical steps is hard. It's not one that gets taught in college. But it lets us grow even large, complex code bases that do complex tasks without getting overwhelmed or lost or tangled up.
That makes sense. Reading your comment got me thinking some of the issue might be that I have always worked on somewhat immature projects. Either R&D or greenfield projects. Which is super nice in a whole lot of ways, but a lot of times I don't know what the final shape of the changes to the rest of the system are going to be, because that part of the system itself isn't well established yet. So it evolves throughout whatever I'm doing. Which would make it difficult to break them off and work them in a different branch.
Maybe there's a partial solution if I can keep those commits clean and separate in the tree. And then when I'm done reorder things such that those all happen as a block of contiguous commits.
There's a nice Manning book from 2014 about this way of working named The Mikado Method.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This shouldn't matter unless you are squashing commits further back in the tree before the PR or other people are also merging to main.
If a lot of people are merging back to main and you're worried about those merges causing problems, you could create a long-lived branch off main, branch from that and do smaller PRs back to it as you go, and then merge the whole thing back to main when you're done. That merge might be 2k lines of code (or whatever) but it's been reviewed along the way.
I don't necessarily disagree with you. Just pointing out that there are ways to manage it.
I also feel like what gets lost in this is that not everything you are building is a bite-size feature in a large existing project. Sometimes you are adding an entire large subsystem to something relatively greenfield. If you break that down into features, you will need 20 PRs, and if you wait for review, or even don't wait but have to circle back to integrate lots of requested changes, what might be a couple of weeks of work turns into 2 to 3 months of work. That just does not work unless you are in a massive enterprise that is ok with moving like molasses. Do you wind up with something not as high quality? Probably. But that is just the trade-off with shipping faster.
If you are the only developer who is ever going to work on something, maybe. Even then, I will argue you are more likely to deliver successfully if you are cutting your work into smaller pieces instead of not delivering anything at all for weeks at a time.
But for the company, having two people capable of working on a system is better than one, and usually you want a team. Which means the code needs to be something your coworkers understand, can read and agree with. Those changes they ask for aren't frivolous: they are an important part of building software collaboratively. And it shouldn't be that much feedback forever: after you have the conversation and you understand and agree with their feedback, the next time you can take that consideration into account when you are first writing the code.
If you want to speed that process up, you can start by pair programming and hashing out disagreements in real time, until you get confident you are mostly on the same page.
Professional programming isn't about closing tickets as fast as possible. It is about delivering business value as a team.
Here's an alternative approach: Discuss the design with your team beforehand, and have active ongoing discussions, sanity checks, and even pair programming during the development process. That way the review is not an exhaustive end-to-end review with the reviewer coming in cold. It's instead the final approval step in a long chain of decisions that have already been discussed and agreed upon.
Of course that won't work for all projects/teams/organizations. But I've found that it works pretty well in the kinds of projects/teams/organizations I've personally been a part of and contributed to.
A stack of PRs is much better for reviewers than a single massive PR.
Use jujutsu and then stacking branches is a breeze
100%. I think the right answer is to break features into atomic commits, but keep PRs at the feature level. This reduces the PR friction, while letting reviewers easily view change sets for specific features, and if a specific feature needs a patch you don't need to do any rebase gymnastics, just add a patch commit.
AI agents like frequent checkpoints because the git diff is like a form of working memory for a task, and it makes it easy to revert bad approaches. Agents can do this automatically so there isn't much of an excuse not to do it, but it does require some organization of work before prompting.
It's not about splitting up the PR: it is about splitting up the _work_.
If you don't have feature flags, that is step one. Even if you don't have a framework, you can use a Strategy or a configuration parameter to enable/disable the new feature, and still have automated testing with and without your changes.
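A rough sketch of what that can look like without any flag framework, in Python. The names (`Config`, `enable_discount_pricing`, the pricing classes) and the discount rule are made up for illustration; the point is only that a configuration parameter selects a Strategy, so the new code can be merged dark and tested both ways.

```python
from dataclasses import dataclass
from typing import Protocol


class PricingStrategy(Protocol):
    def total(self, subtotal_cents: int) -> int: ...


class LegacyPricing:
    """Existing behavior: the total is just the subtotal."""

    def total(self, subtotal_cents: int) -> int:
        return subtotal_cents


class DiscountPricing:
    """New behavior, merged to main but inactive until the flag is on."""

    def total(self, subtotal_cents: int) -> int:
        # Hypothetical rule: 10% off orders of 100.00 or more.
        if subtotal_cents >= 10_000:
            return subtotal_cents - subtotal_cents // 10
        return subtotal_cents


@dataclass(frozen=True)
class Config:
    # The "feature flag": a plain configuration parameter, off by default.
    enable_discount_pricing: bool = False


def pricing_for(config: Config) -> PricingStrategy:
    """Pick the strategy based on the flag."""
    if config.enable_discount_pricing:
        return DiscountPricing()
    return LegacyPricing()


def checkout_total(subtotal_cents: int, config: Config) -> int:
    """Call site stays the same regardless of which strategy is active."""
    return pricing_for(config).total(subtotal_cents)
```

Tests can then call `checkout_total` once with `Config()` and once with `Config(enable_discount_pricing=True)`, which covers the code path with and without the change in the same test run.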
Keep merging each PR into master under a feature flag, that's how it's done. Huge PRs that implement a feature in one swoop are pretty much the worst case scenario for every stage: review, testing, deployment and monitoring.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
+100 to this. My job should be thoughtfully building the solution, not playing around with git rebase for hours.
Just use jj instead of git and cut your rebasing time by 95%.
Suddenly rebasing a stack of branches becomes 1 command.
I do agree with the common refrain, actually, and disagree with the idea that work can be so big and complex that it has to be in one pull request.
> A big queue of PR's for reviewers to review
Yes, yes please. When each one is small and understandable, reviewers better understand the changes, so quality goes up. Also, when priorities change and the team has to work on something else, they can stop in the middle, and at least some of the benefits from the changes have been merged.
The PR train doesn't need to be dumped out in one go. It can come one at a time, each one with context around why it's there and where it fits into the grander plan.
> The [totality] of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
A primary goal of code review is to build up the mental map of the feature in the reviewers' brains. I argue it's better for that to be constructed over time, piece by piece. The immediate cognitive load for each pull request is lower, and over time, the brain makes the connections to understand the bigger picture.
They'll rarely achieve the same understanding of the feature that you have, you who created and built it. This is whether they get the whole shebang at once or piecemeal. That's OK, though. Review is about reducing risk, not eliminating it.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
I've learned not to charge too far ahead with feature work, because it does get harder to manage the farther you venture from the trunk. You will get conflicts. Saving up all the changes into one big hunk doesn't fix that.
A big benefit of trunk-based development, though, is that you're frequently merging back into the mainline, so all these problems shrink down. The way to do that is with lots of small changes.
One last thing: It is definitely more work, for you as the author, to split up a large set of changes into reviewable pieces. It is absolutely worth it, though. You get better quality reviews; you buy the ability to deprioritize at any time and come back later; most importantly for me, you grasp more about what you made during the effort. If you struggle to break up a big set of changes into pieces that others can understand, there's a good chance it has deeper problems, and you'll want to work those out before presenting them to your coworkers.