Printf debugging is OK

2025-01-06 2:42 119145 www.polymonster.co.uk

I stopped going on Twitter a while ago because it has the tendency to evoke rage, as it is designed to do. But every now and then I check back in - it can be useful sometimes for keeping up with graphics research, gamedev news and some people do post nice things, like sharing projects they are working on, so there is something to pull me back from time to time.

After checking the other day I saw this debate going around about not using an IDE or debugger, just using ‘notepad’ to write code. I looked in the comments, people arguing about who was right and all the usual toxic vibes, and it reminded me of some earlier occasions of people discussing the same topic.

It feels like the same old debate has been going on for a long time now. It’s packaged differently each time, but I don’t really know why people get so wound up about it. The main arguments are “if you need to use a debugger you’re an idiot and you don’t understand the code you are writing” (that’s not an actual quote but there was a take along those lines), and then there is “if you can’t use a debugger you’re an idiot”. The hating on the ‘printf’ crew is omnipresent.

At the risk of poking a hornet’s nest, I just wanted to share some thoughts and ideas on this subject in a balanced way, because I don’t think there needs to be an ultimate solution here. We need to debug code and there are tools out there to help us, some are more useful than others in certain situations, but at the end of the day do whatever you need to do to fix those bugs.

Debuggers

I use a debugger regularly. I launch most work in C++ from Visual Studio or Xcode, preferably in a debug build. I know for some people this is a terrible UX because of the performance of debug builds, so a prerequisite here is fast debug builds. That is hard to retrofit, but a usable debug build is worth having. Once running, I can use the debugger to break and step if I need to, and if I encounter a crash there is a nice call stack I can look through in more detail.

I have noticed that it is extremely common for graduate and junior software engineers to have little to no debugging knowledge or experience. It doesn’t seem to be taught at university, and I have also been told stories of teachers imposing their usage of VIM and esoteric debugging strategies upon the students. For the record I am not a VIM user (another topic that ends up in polarising debates). I find using a mouse and 2 finger typing works for me.

The moment when you show someone how to use a hardware breakpoint or a watchpoint and find a bug immediately is like seeing the lightbulb appear on top of their head, a whole world of possibility opening in front of them, or the dismay of the wasted hours trying to catch some dodgy logic through layers and layers of object oriented spaghetti.

Some of those who argue for using only ‘notepad’ and no debugger say they can dry run their code on paper and they “don’t write bugs”, but I find it difficult to understand how they work within a larger team project or codebase. A lot of the bugs and issues I have had to fix were not in code I wrote myself. They were in legacy systems, colleagues’ code, or in open source code that had just been lifted into a project (and some of those were hard as nails to track down too!). If you believe in the impending AI coding apocalypse then human engineers may merely be around to debug and fix issues with AI generated code. So yeah, being able to write perfect code yourself is one thing, but using a debugger to debug existing code in a large complex project shouldn’t be a thing of shame, and we might need all the tools we can get.

Along with debuggers we get all sorts of other tools, which should also be used as and when we need them. AddressSanitizer (ASan) can catch memory issues easily. Where in a bygone era we would have a 1 in 1000 crash somewhere from reading outside of an array’s bounds, we can now enable ASan and catch it every time without the undefined behaviour lottery. The same goes for UndefinedBehaviorSanitizer (UBSan): now we can catch UB while it’s still benign and not only when a noticeable side effect occurs.
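To make the “undefined behaviour lottery” concrete, here is a minimal sketch of an off-by-one read. The function `sum_prefix` and the array `kData` are made up for illustration and are not from any real codebase; the build flag mentioned in the comment is the usual Clang/GCC one for ASan.

```cpp
#include <cstddef>

// Hypothetical example: intended to sum the first `count` elements.
// BUG: `<=` should be `<`, so the loop also reads values[count].
int sum_prefix(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i <= count; ++i) {
        total += values[i];
    }
    return total;
}

// Illustrative data for the example.
static const int kData[4] = {1, 2, 3, 4};

// sum_prefix(kData, 2) reads indices 0..2: still in bounds, just a wrong sum.
// sum_prefix(kData, 4) would read kData[4], one element past the end. That is
// undefined behaviour which usually goes unnoticed in a normal build.
// Building with Clang or GCC and
//     -g -fsanitize=address
// should make ASan stop at the bad read with a buffer-overflow report.
```

Without the sanitizer, the out-of-bounds call tends to return a plausible-looking number, which is exactly why this class of bug only shows up as a rare crash far from the cause.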

I don’t know if these notepad-only coders are taking all of those tools off the table as well, but when you have something like ASan that can catch an issue for you I just don’t really know why you wouldn’t use it. I have seen a lot of comments that seem to suggest the debugger slows them down, but in this case I certainly think the debugger speeds you up.

So if you’re reading this and you don’t know about these tools I would say take a look and see, they can be useful and might be able to save you a lot of time. There are tons of things you can do and it’s hard to cover it all here. I learned a lot from working with other people and side by side debugging difficult problems. I think there should be more resources to teach these skills instead of it being handed down information.

Printf debugging is OK

So for the ‘printf’ haters I would also say that whilst I use a debugger most of the time, sometimes I revert to ‘printf’ debugging. There are some situations where there is no other choice - in the past I have had to debug release builds where we were unable to reproduce the bug in debug. Even pulling in debug modules for the engine (for on screen debug info) changed the executable such that we couldn’t reproduce the issue. The last resort was to put in a few print statements using raw ‘printf’, removing them and adding more as we narrowed down the issue, until we eventually extracted enough information to fix the problem.

I have also had the need to use ‘printf’ when debugging certain kinds of behaviours in an application. In the case of something like touch event tracking for mobile devices, if you try to debug an issue with breakpoints you interrupt the hardware and it makes it difficult to reproduce issues in the same way they appear naturally. So here printing the state of touch down events and touch up events, and being able to see the logical flow, can identify a problem. There are many more scenarios that benefit from this type of debugging. Just throw the prints in and make sure to remove them after so no one knows you were ever there, like a ninja.
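A rough sketch of what such a trace might look like is below. The `TouchEvent` struct, its fields, and the function names are assumptions made for this example; a real engine or platform API will have its own event type.

```cpp
#include <cstdio>
#include <string>

// Hypothetical event record; a real engine would have its own type.
struct TouchEvent {
    const char* phase;  // "down", "move" or "up"
    int id;             // finger / pointer id
    float x, y;         // position in pixels
};

// Build the log line separately so it can go to printf, a file, or a test.
std::string format_touch(const TouchEvent& e, unsigned frame) {
    char line[96];
    std::snprintf(line, sizeof line, "[touch] frame=%u id=%d %-4s (%.1f, %.1f)",
                  frame, e.id, e.phase, e.x, e.y);
    return line;
}

// Call this from the touch handler instead of stopping on a breakpoint, so
// the stream of input events is never interrupted.
void trace_touch(const TouchEvent& e, unsigned frame) {
    std::printf("%s\n", format_touch(e, frame).c_str());
}
```

Reading the resulting log top to bottom shows the order of down, move and up events per finger id, which is usually enough to spot a missing or duplicated event.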

Custom UI based debugging tools can go one step further than printf debugging, providing some similar traits but also allowing more flexibility and controllability. I assume the notepad wielders who don’t use a regular debugger must have some such custom tools to help them track down issues. I am a big fan of embedded debugging and profiling tools within an application. You know, stuff like performance counters that I can just pop up in a UI, or tweakable values to help refine behaviours or visual appearance. I find that since the explosion of ImGui the level of integrated ad-hoc debugging tools and info has exponentially increased.

But with these kinds of custom tools, I personally wouldn’t try and re-invent the wheel. I would aim to make stuff that complements the existing tools I can pull off the shelf. So for example I like to have a quick, at a glance profiler for all my key performance hotspots that I can check whenever I notice something. But for more in-depth profiling I would use a CPU or GPU profiler to dig deeper.
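As an illustration of an at-a-glance profiler, here is a minimal sketch of a scoped timer that accumulates into named counters. All names (`PerfCounter`, `ScopedTimer`, `perf_table`, `simulate_update`) are hypothetical, it is not thread-safe, and it prints the table rather than drawing it, since the UI side depends on whatever toolkit the application uses.

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical names throughout; a sketch, not any particular engine's API.
struct PerfCounter {
    long long total_ns = 0;  // accumulated time spent in the scope
    int calls = 0;           // number of times the scope was entered
};

// Global table keyed by hotspot name (not thread-safe; fine for a sketch).
std::map<std::string, PerfCounter>& perf_table() {
    static std::map<std::string, PerfCounter> table;
    return table;
}

// RAII timer: starts on construction, records into the table on destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        PerfCounter& c = perf_table()[name_];
        c.total_ns +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        c.calls += 1;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// Dump the table; an in-app UI could draw the same data instead of printing.
void print_perf_table() {
    for (const auto& entry : perf_table()) {
        std::printf("%-16s calls=%d total=%.3f ms\n", entry.first.c_str(),
                    entry.second.calls,
                    static_cast<double>(entry.second.total_ns) / 1e6);
    }
}

// Example hotspot: put a timer at the top of the scope to be measured.
int simulate_update(int n) {
    ScopedTimer timer("update");
    int acc = 0;
    for (int i = 0; i < n; ++i) acc += i % 7;
    return acc;
}
```

Something this small only tells you which hotspot got slower; finding out why is still a job for a proper CPU or GPU profiler.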

Just doing what needs to be done

At the end of the day, finding bugs is just something that we need to get done. Whatever helps you find and fix the issue doesn’t bother me as long as we get the job done. On a closing note, I noticed some code in a pull request, left in by accident by another person:

if(some_condition) {
    int x = 0;
}

I found this interesting. I do the same thing except I usually name my variable ‘a’. This is to insert some code where a breakpoint can be put on the ‘int x’ line and then it kind of acts like a conditional breakpoint when some_condition is true. You could use a conditional breakpoint within the debugger, but they can be slow and for me historically unreliable, but this little snippet gives you your own conditional breakpoint that works without fail.
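One caveat with the bare snippet is that an optimising compiler is free to drop an unused local, so the line may not exist to break on outside of a debug build. A possible variation, sketched below with made-up names (`debug_anchor`, `g_debug_hits`, `first_negative`, `kReadings`), is to route the condition through a tiny function that writes to a volatile counter, since volatile writes are not allowed to be optimised away.

```cpp
#include <cstddef>

// Hypothetical helper: a function that exists only to hold a breakpoint.
// The volatile counter gives the body an observable side effect, so the
// write below has to survive optimisation.
static volatile int g_debug_hits = 0;

void debug_anchor() {
    g_debug_hits = g_debug_hits + 1;  // <- set the breakpoint on this line
}

// Example use: stop only when a negative sample turns up.
// Returns the index of the first negative value, or `count` if there is none.
std::size_t first_negative(const int* samples, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (samples[i] < 0) {
            debug_anchor();  // only reached when the condition holds
            return i;
        }
    }
    return count;
}

// Illustrative data for the example.
static const int kReadings[5] = {4, 8, -15, 16, 23};
```

The same rule applies as with the original snippet: the call to the anchor is scaffolding and should come out before the pull request goes up.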

Just make sure to remove the code before the PR next time!


Comments

  • By kentonv 2025-01-06 4:50 (7 replies)

    If you have a reproducible test case that runs reasonably quickly, then I think printf debugging is usually just as good as a "real" debugger, and a lot easier. I typically have my test output log open in one editor frame. I make a change to the code in another frame, save it, my build system immediately re-runs the test, and the test log immediately refreshes. So when the test isn't working, I just keep adding printfs to get more info out of the log until I figure it out.

    This sounds so dumb but it works out to be equivalent to some very powerful debugger features. You don't need a magical debugger that lets you modify code on-the-fly while continuing the same debug session... just change the code and re-run the test. You don't need a magical record-replay debugger that lets you go "back in time"... just add a printf earlier in the control flow and re-run the test. You don't need a magical debugger that can break when a property is modified... just temporarily modify the property to have a setter function and printf in the setter.

    Most importantly, though, this sort of debugging is performed using the same language and user interface I use all day to write code in the first place, so I don't have to spend time trying to remember how to do stuff in a debugger... it's just code.

    BUT... this is all contingent on having fast-running automated tests that can reproduce your bugs. But you should have that anyway.

    • By whatevertrevor 2025-01-06 5:34

      > Most importantly, though, this sort of debugging is performed using the same language and user interface I use all day to write code in the first place, so I don't have to spend time trying to remember how to do stuff in a debugger... it's just code.

      Absolutely! Running a commandline debugger adds additional context I have to keep in my head (syntax for all the commands, output format etc), that actively competes with context required to debug my code. Printfs work just fine without incurring that penalty, granted this argument applies less to IDE debuggers because their UX is usually intuitive.

    • By eru 2025-01-06 5:10 (1 reply)

      > BUT... this is all contingent on having fast-running automated tests that can reproduce your bugs. But you should have that anyway.

      Ideally, yes. But for many bugs, getting to a reproduction is already more than half the battle. And a debugger can help you with that.

      • By skissane 2025-01-06 6:11 (1 reply)

        > But for many bugs, getting to a reproduction is already more than half the battle.

        Today I am working on a bug where a token expires after 2 hours and then we fail to request a new one; instead we just keep on using the now expired one, which (of course) doesn’t work. I have a script to reproduce it but it takes 2 hours to run. It would be great if there was some configuration knob to turn down the expiry just for this test so we can reproduce it faster - but there isn’t because nobody thought of that.

        • By atq2119 2025-01-06 6:28 (1 reply)

          Maybe it takes less than 2 hours to add such a configuration knob?

          • By skissane 2025-01-06 6:42 (1 reply)

            That’s a good idea except it is in another team’s service and I’m not very familiar with their code. But I can try.

            Well, the hard part isn’t actually the code change (reasonably obvious) it is deploying my own copy of their service…

            • By pantalaimon 2025-01-06 7:11 (2 replies)

              Can you just set the local time two hours into the future?

              Or modify the expiry time locally / not use the one from the token at all.

              • By skissane 2025-01-06 7:43 (1 reply)

                The token isn’t being verified in my service it is being verified in a remote service. I can’t easily change the clock on the remote service. It runs in a K8S pod and (AFAIK) K8S doesn’t support time namespaces. And even if it does, the pod is deployed by a custom K8S operator so unsure if I could make the operator turn that on even if it is available. And the remote service is a complex monolith which uses lots of RAM so trying to run it on my own laptop will be painful

                • By majewsky 2025-01-06 13:09

                  Reminds me of the classic adage that "everything can be solved by another layer of abstraction, except for the problem of too many abstraction layers".

              • By kmoser 2025-01-06 14:51

                Or modify the token itself so it's no longer valid? (Although that might come back with "invalid token" rather than "expired token.")

    • By roca 2025-01-06 9:40

      Adding print statements, rebuilding and rerunning the program and figuring out where in the logs the bug showed up this time can be a lot more tedious than setting a (possibly conditional) breakpoint in a reverse-execution debugger and doing "reverse-continue".

    • By seanmcdirmid 2025-01-06 5:50 (3 replies)

      The crashes/bugs I deal with are rarely straight down failures, they are often the 1 out of 100 runs kind, so printf debugging is the only way to go really. And I used to be big on using debuggers, but now I’m horribly out of practice.

      • By skissane 2025-01-06 6:20 (1 reply)

        > The crashes/bugs I deal with are rarely straight down failures, they are often the 1 out of 100 runs kind, so printf debugging is the only way to go really.

        Another thing I’ve found helpful, is to write out a “system state dump” (say in JSON) to a file whenever certain errors happen. Like we had a production system that was randomly hanging and running out of DB connections. So now whenever the DB connection pool is exhausted, it dumps a JSON file to S3 listing the status of every DB connection, including the stack dump of the thread that owns it, the HTTP request that thread is serving, the user account, etc. Once we did that, it went from “we don’t understand why this application is randomly falling over” to “oh this is the thing that is consistently triggering it”

        When it writes a dump, it then starts a “lockout period” in which it won’t write any further dumps even if the error reoccurs. Don’t want to make a meltdown worse by getting bogged down endlessly writing out diagnostics.

        • By alextingle 2025-01-06 7:20 (1 reply)

          Why not just dump a core, rather than going to all that effort?

          • By skissane 2025-01-06 7:53

            Because it is a lot easier to analyse a few megabytes of JSON than a heap dump which is gigabytes in size.

            Not my original idea, I got it from some Oracle products which have a similar feature (most notably Oracle Database)

      • By roca 2025-01-06 9:43 (1 reply)

        Record-and-replay debuggers are often better than anything else for debugging intermittent failures: run the program with recording many times until you eventually get the bug, then debug that recording at your leisure; you'll never have to waste time reproducing it again.

        • By johnisgood 2025-01-06 12:27

          I'm very rusty when it comes to debuggers.

          Where may I read about particular workflows involving debuggers, e.g. gdb?

          (Mainly for programs written in C.)

          Most of my issues are related to issues with concurrency though, deadlocks and whatnot.

      • By throwaway2037 2025-01-06 9:51 (2 replies)

        > they are often the 1 out of 100 runs kind

        Can you share an example? In my whole career, I have only seen one or two of them, but most of my work is CRUD type of stuff, not really gaming or systems programming where such a thing might happen.

        • By arter4 2025-01-06 10:37

          Let's say your application talks to a database.

          You reuse connections with a connection pool, but you accidentally reuse connections with different privileges and scopes. As a result, sometimes you get to read some data you shouldn't read and sometimes you don't.

          Or, concurrency bugs.

          You don't properly serialize transactions and sometimes two transactions overlap in time leading to conflicts.

        • By seanmcdirmid 2025-01-06 15:53

          In Android they are way too common. Like an animation that decides to fire after the UI element it was made for has been destroyed, or a race condition in destroying a resource, or VE logging.

    • By rtpg 2025-01-06 6:09

      I would say that like... saying "OK so `a.b` looks like this, what does `a.b.c` look like?" is a very nice flow with deep and messy objects (especially when working in languages with absolutely garbage default serialization logic like Javascript)

      But even with a debugger, there's still loads of value in sitting up, moving from writing a bunch of single line statements all over to writing a real test harness ASAP, so you don't have to rely on the debugger.

      For any non-trivial problem, you'll often very quickly appreciate a properly formatted output stack, and that output will be shaped to the problem you are looking at. Very hard for an in-process debugger to have the answer for you there immediately.

      Being serious about debugging can even get into things like writing actual new entrypoints (scripts with arguments and whatnot) and things like adding branches togglable through environment variables.

      I think a lot of people's mindset in debugging is "if I walk the tightrope and put in _one more hack_ I'll find my answer" and it gets more and more precarious. Debugging is always a bit of an exercise of mental gymnastics, but if you practice being really good at printf debugging _and_ configuring program flow easily, you can be a lot less stressed out.

      Like if you think your problem is going to take more than 20 minutes to debug, you probably should start writing a couple helper functions to get you on the right foot.

    • By NikkiA 2025-01-06 7:33 (1 reply)

      Things like variable watchpoints mean that debuggers are still the better option.

      • By guelo 2025-01-06 7:43 (1 reply)

        You can add the condition to the print statement and grep for it if it gets too verbose.

        • By NikkiA 2025-01-06 7:57

          Watchpoints can tell you which function is causing the variable to change; making printfs conditional cannot.

  • By victorNicollet 2025-01-06 11:45 (1 reply)

    One of the hardest bugs I've investigated required the extreme version of debugging with printf: sprinkling the code with dump statements to produce about 500GiB of compressed binary trace, and writing a dedicated program to sift through it.

    The main symptom was a non-deterministic crash in the middle of a 15-minute multi-threaded execution that should have been 100% deterministic. The debugger revealed that the contents of an array had been modified incorrectly, but stepping through the code prevented the crash, and it was not always the same array or the same position within that array. I suspected that the array writes were somehow dependent on a race, but placing a data breakpoint prevented the crash. So, I started dumping trace information. It was a rather silly game of adding more traces, running the 15-minute process 10 times to see if the overhead of producing the traces made the race disappear, and trying again.

    The root cause was a "read, decompress and return a copy of data X from disk" method which was called with the 2023 assumption that a fresh copy would be returned every time, but was written with the 2018 optimization that if two threads asked for the same data "at the same time", the same copy could be returned to both to save on decompression time...

    • By overfl0w 2025-01-06 11:47 (2 replies)

      Those are the kind of bugs one remembers for life.

      • By victorNicollet 2025-01-06 11:54

        Indeed. Worst week of 2023!

        But I consider myself lucky that the issue could be reproduced on a local machine (arguably, one with 8 cores and 64GiB RAM) and not only on the 32 core, 256GiB RAM server. Having to work remotely on a server would have easily added another week of investigation.

  • By onre 2025-01-06 3:49 (1 reply)

    I've gotten an OS to run on a new platform with a debugging tool portfolio consisting of a handful of LEDs and a pushbutton. After that, getting to the point where I could printf() to the console felt like more than anyone could ever ask for.

    Anecdote aside, it certainly doesn't hurt to be able to debug things without a debugger if it comes to that.

    • By shadowgovt 2025-01-06 4:25 (1 reply)

      Since most of my work is in distributed systems, I find the advice to never printf downright laughable.

      "Oh sure, lemme just set a breakpoint on this network service. Hm... Looks like my error is 'request timed out', how strange."

      That having been said: there are some very clever solutions in cloud-land for "printf" debugging. (Edit: forgot this changed names) Snapshot Debugger (https://github.com/GoogleCloudPlatform/snapshot-debugger) can set up a system where some percentage of your instances are run in a breakpointed mode, and for some percentage of requests passing through the service, they can log relevant state. You can change what you're tracking in realtime by adding listeners in the source code view. Very slick.

      • By mark_undoio 2025-01-06 9:29

        > "Oh sure, lemme just set a breakpoint on this network service. Hm... Looks like my error is 'request timed out', how strange."

        Time travel debugging (https://en.wikipedia.org/wiki/Time_travel_debugging) can help with this because it separates "recording" (i.e. reproducing the bug) from "replaying" (i.e. debugging).

        Breakpoints only need to be set in the replay phase, once you've captured a recording of the bug.

HackerNews