Replacing a $3000/mo Heroku bill with a $55/mo server

2025-10-21 20:28 · 813556 · disco.cloud

This content has moved. If you are not redirected, please click here: blog/how-idealistorg-replaced-a-3000mo-heroku-bill-with-a-55mo-server/


Read the original article

Comments

  • By speedgoose 2025-10-21 21:12 (15 replies)

    Looking at the htop screenshot, I notice the lack of swap. You may want to enable earlyoom so your whole server doesn't go down when a service goes bananas; the Linux kernel OOM killer often triggers a bit too late.

    You can also enable zram to compress RAM, so you can over-provision like the pros. A lot of long-running software leaks memory that compresses pretty well.

    Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.
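    For those not using Ansible, a minimal sketch of enabling zram via systemd's zram-generator; the file path and keys follow the zram-generator docs, but the size and algorithm choices here are only illustrative:

```ini
# /etc/systemd/zram-generator.conf
[zram0]
# Cap the compressed swap device at half of installed RAM (illustrative)
zram-size = ram / 2
compression-algorithm = zstd
```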

    • By TheDong 2025-10-22 3:05 (3 replies)

      Even better than earlyoom is systemd-oomd[0] or oomd[1].

      systemd-oomd and oomd use the kernel's PSI[2] information which makes them more efficient and responsive, while earlyoom is just polling.

      earlyoom keeps getting suggested, even though we have PSI now, just because people are used to using it and recommending it from back before the kernel had cgroups v2.

      [0]: https://www.freedesktop.org/software/systemd/man/latest/syst...

      [1]: https://github.com/facebookincubator/oomd

      [2]: https://docs.kernel.org/accounting/psi.html
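      To make the PSI point concrete, here is a hedged sketch (mine, not from any of the linked projects) of registering a PSI trigger as described in the kernel docs [2]: the kernel wakes the watcher only when the threshold is crossed, instead of the watcher re-reading memory statistics on a timer.

```python
import select

def psi_trigger(kind, stall_us, window_us):
    """Build a PSI trigger line, e.g. 'some 150000 1000000': notify when
    tasks are stalled on memory for a total of 150ms in any 1s window."""
    return f"{kind} {stall_us} {window_us}"

def wait_for_memory_pressure(path="/proc/pressure/memory"):
    # Writing a trigger to a PSI file arms a kernel-side notification;
    # poll() then sleeps (using no CPU) until the kernel raises POLLPRI.
    with open(path, "w") as f:
        f.write(psi_trigger("some", 150_000, 1_000_000))
        f.flush()
        poller = select.poll()
        poller.register(f.fileno(), select.POLLPRI)
        poller.poll()  # blocks until memory pressure crosses the threshold
        print("memory pressure threshold crossed")
```

      The thresholds (150ms of stall per 1s window) are just the example values from the kernel documentation; tune them to taste.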

      • By geokon 2025-10-22 8:59 (1 reply)

        Do you have any insight into why this isn't included by default in distros like Ubuntu? It's kind of bewildering that the default behavior on Ubuntu is to just lock up the whole system on OOM.

        • By TheDong 2025-10-22 9:42 (2 replies)

          I'm pretty sure systemd-oomd is enabled by default in Fedora and Ubuntu desktop.

          I think it's off on the server variants.

          • By galangalalgol 2025-10-22 10:17 (1 reply)

            Is there any way to get something like oomd or zram that works on GPU memory? I run into GPU memory leaks more often. It seems to be Electron, usually.

            • By fireant 2025-10-23 3:50

              The GPU memory model is quite different from the CPU memory model, with application-level explicit synchronization, coherency, and so on. I don't think transparent compression would be possible, and even if it were, it would surely carry a drastic perf downside.

          • By geokon 2025-10-22 10:53

            Kubuntu LTS definitely didn't have it by default. And there are no system settings exposing it (or zram).

      • By CGamesPlay 2025-10-22 4:33 (2 replies)

        "earlyoom is just polling"?

        > systemd-oomd periodically polls PSI statistics for the system and those cgroups to decide when to take action.

        It's unclear if the docs for systemd-oomd are incorrect or misleading; I do see from the kernel.org link that the recommended usage pattern is to use the `poll` system call, which in this context would mean "not polling", if I understand correctly.

        • By TheDong 2025-10-22 6:36

          systemd-oomd, oomd, and earlyoom all do poll for when to actually take action on OOM conditions.

          What I was trying to say is that the information on when there's memory pressure is more accurate for systemd-oomd / oomd because they use PSI, which the kernel itself updates over time and which they just read, while earlyoom also internally makes its own estimates, at a lower granularity than the kernel does.

        • By 100721 2025-10-22 4:48 (3 replies)

          Unrelated to the topic, it seems awfully unintuitive to name a function ‘poll’ if the result is ‘not polling.’ I’m guessing there’s some history and maybe backwards-compatible rewrites?

          • By CGamesPlay 2025-10-22 6:10 (1 reply)

            Specifically, earlyoom’s README says it repeatedly checks (“periodically polls”) the memory pressure, using CPU each time even when there is no change. The “poll” system call waits for the kernel to notify the process that the file has changed, using no CPU until the call resolves. It’s unclear what systemd-oomd does, because it uses the phrase “periodically polls”.

          • By unilynx 2025-10-22 6:04

            Poll takes a timeout parameter. ‘Not polling’ is just a really long timeout
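            A tiny illustration of that point, using Python's wrapper around the same system call:

```python
import os
import select

r, w = os.pipe()
poller = select.poll()
poller.register(r, select.POLLIN)

# timeout=0: a true non-blocking check ("polling")
assert poller.poll(0) == []  # nothing readable yet, returns immediately

os.write(w, b"x")

# timeout=None: sleep in the kernel until something is ready ("not polling")
events = poller.poll()  # returns at once here, since data is already waiting
assert events[0][0] == r and events[0][1] & select.POLLIN
```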

          • By friendzis 2025-10-22 6:07

            "Let the underlying platform do the polling and return once the condition is met"

      • By speedgoose 2025-10-22 7:47

        Thanks, I will try that out.

    • By Bender 2025-10-21 23:01 (2 replies)

      Another option would be to have more memory than required (over-engineer) and to adjust the OOM score per app, adding early-kill weight to non-critical apps and negative weight to important apps. oom_score_adj is already set to -1000 by OpenSSH, for example.

          NSDJUST=$(pgrep -x nsd); echo -en '-378' > /proc/"${NSDJUST}"/oom_score_adj
      
      Another useful thing to do is effectively disable over-commit on all staging and production servers (a 0 ratio instead of overcommit_memory 2 to fully disable, as these do different things; memory 0 still uses the formula)

          vm.overcommit_memory = 0
          vm.overcommit_ratio = 0
      
      Also set min_free and reserved memory based on installed memory, using a formula from Red Hat that I do not have handy. min_free can vary from 512KB to 16GB depending on installed memory.

          vm.admin_reserve_kbytes = 262144
          vm.user_reserve_kbytes = 262144
          vm.min_free_kbytes = 1024000
      
      At least that worked for me in about 50,000 physical servers for over a decade; they were not permitted to have swap, and installed memory varied from 144GB to 4TB of RAM. OOM would only occur when the people configuring and pushing code would massively over-commit and not account for memory required by the kernel, or would not follow best practices defined by Java, and that's a much longer story.

      Another option is to limit memory per application in cgroups but that requires more explaining than I am putting in an HN comment.
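      The short version of the cgroups approach, as a hedged sketch (the group name and limits are made up; this assumes a cgroups v2 host and needs root):

```python
from pathlib import Path

# cgroups v2 unified hierarchy, assumed mounted at the usual location
CGROUP_ROOT = Path("/sys/fs/cgroup")

def limit_memory(group, max_bytes, swap_max_bytes=0):
    """Create a cgroup capping memory; processes added to its cgroup.procs
    get OOM-killed inside the group instead of taking down unrelated
    services."""
    cg = CGROUP_ROOT / group
    cg.mkdir(exist_ok=True)
    (cg / "memory.max").write_text(str(max_bytes))
    (cg / "memory.swap.max").write_text(str(swap_max_bytes))
    return cg

# usage (as root):
#   limit_memory("thumbnailer", 2 * 1024**3)
#   echo $PID > /sys/fs/cgroup/thumbnailer/cgroup.procs
```

      In practice you would usually let systemd do this wiring for you (MemoryMax= on a unit) rather than writing the files by hand.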

      Another useful thing is to never OOM kill in the first place on servers that only do things in memory and need not commit anything to disk (so don't do this on a disked database). This is for ephemeral nodes that should self-heal. Wait 60 seconds so DRAC/iLO can capture the crash message, and then earth-shattering kaboom...

          # cattle vs kittens, mooooo...
          kernel.panic = 60
          vm.panic_on_oom = 2
      
      As a funny side note, those options can also be used as a holy hand grenade to intentionally unsafely reboot NFS diskless farms when failing over to entirely different NFS server clusters: set panic to 15 minutes, trigger an OOM panic by setting min_free to 16TB at the command line via Ansible (not in sysctl.conf), swap clusters, ARP storm, and reconverge.

      • By liqilin1567 2025-10-22 4:07

        Thanks for sharing, these are very useful suggestions.

      • By benterix 2025-10-22 7:49 (2 replies)

        The lengths people will go to avoid k8s... (very easy on Hetzner Cloud BTW).

        • By Bender 2025-10-22 10:37

          That's a more complex path I avoided discussing when I referenced CGroups. When I started doing these things kube clusters did not exist. These tips were for people using bare metal that have not decided as a company to go the k3/k8 route. Some of these settings will still apply to k8 physical nodes. The good people of Hetzner would be managing these settings on their bare metal that Kubernetes is running on and would not likely want their k8 nodes getting all broken, sticky and confused after a K8 daemon update results in memory leakage, billions of orphaned processes, etc...

          Companies that use k3s/k8s may still have bare-metal nodes dedicated to a role, such as databases, Ceph storage nodes, DMZ SFTP servers, PCI hosts deemed out of scope for kube clusters, and of course any "kittens", such as Linux nodes turned into proprietary appliances after installing some proprietary application that will blow chunks if shimmed into k8s or any other type of abstraction layer.

        • By lillecarl 2025-10-22 13:44 (1 reply)

          Every ClusterAPI infrastructure provider is similarly easy? Or what makes Hetzner Kubernetes extra easy?

          • By benterix 2025-10-22 14:47

            I mentioned Hetzner only because the original article mentions it. To be fair, currently it is harder to use than any managed k8s offering because you need to deploy your control plane yourself (but fortunately there are several projects that make it as easy as it can be, and this is what I was referring to).

    • By levkk 2025-10-21 21:51 (16 replies)

      Yeah, no way. As soon as you hit swap, _most_ apps are going to have a bad, bad time. This is well known, so much so that all EC2 instances in AWS disable it by default. Sure, they want to sell you more RAM, but it's also just true that swap doesn't work for today's expectations.

      Maybe back in the 90s, it was okay to wait 2-3 seconds for a button click, but today we just assume the thing is dead and reboot.

      • By bayindirh 2025-10-21 22:15 (8 replies)

        This is a wrong belief, because a) SSDs make swap almost invisible, so you have that escape ramp if something goes wrong, and b) swap space is no longer solely an escape ramp that RAM overflows into.

        In the age of microservices and cattle servers, reboot/reinstall might be cheap, but in the long run it is not. A long running server, albeit being cattle, is always a better solution because esp. with some excess RAM, the server "warms up" with all hot data cached and will be a low latency unit in your fleet, given you pay the required attention to your software development and service configuration.

        Secondly, Kernel swaps out unused pages to SWAP, relieving pressure from RAM. So, SWAP is often used even if you fill 1% of your RAM. This allows for more hot data to be cached, allowing better resource utilization and performance in the long run.

        So, eff it, we ball is never a good system administration strategy. Even if everything is ephemeral and can be rebooted in three seconds.

        Sure, some things like Kubernetes forces "no SWAP, period" policies because it kills pods when pressure exceeds some value, but for more traditional setups, it's still valuable.

        • By kryptiskt 2025-10-22 0:02 (4 replies)

          My work Ubuntu laptop has 40GB of RAM and a very fast NVMe SSD; if it gets under memory pressure it slows to a crawl and is for all practical purposes frozen, swapping wildly, for 15-20 minutes.

          So no, my experience with swap isn't that it's invisible with SSD.

          • By interroboink 2025-10-22 1:39 (1 reply)

            I don't know your exact situation, but be sure you're not mixing up "thrashing" with "using swap". Obviously, thrashing implies swap usage, but not the other way around.

            • By db48x 2025-10-22 3:44 (1 reply)

              If it’s frozen, or if the mouse suddenly takes seconds to respond to every movement, then it’s not just using some swap. It’s thrashing for sure.

              • By pdimitar 2025-10-22 12:18

                I get it that the distinction is real but nobody using the machine cares at this point. It must not happen and if disabling swap removes it then people will disable swap.

          • By webstrand 2025-10-22 3:14 (1 reply)

            I've experimented with no swap and found the same thing happens. I think the issue is that Linux can also evict executable pages (since it can just reload them from disk).

            I've had good experience with linux's multi-generation LRU feature, specifically the /sys/kernel/mm/lru_gen/min_ttl_ms feature that triggers OOM-killer when the "working set of the last N ms doesn't fit in memory".

            • By ValdikSS 2025-10-22 13:06

                  Enables Multi-Gen LRU (improved page reclaim and caching policy).
                  Prevents thrashing, improves loading speeds under low ram conditions.
                  Requires kernel 6.1+.
                  Has dramatic effect especially on slower HDDs.
                  For slower HDDs, consider 1000 instead of 300 for min_ttl_ms.
              
                  sudo tee /etc/tmpfiles.d/mglru.conf <<EOF
                  w-      /sys/kernel/mm/lru_gen/enabled          -       -       -       -       y
                  w-      /sys/kernel/mm/lru_gen/min_ttl_ms       -       -       -       -       300
                  EOF

          • By omgwtfbyobbq 2025-10-22 4:29

            It's seldom invisible, but in my experience how visible it is depends on the size/modularity/performance/etc of what's being swapped and the underlying hardware.

            On my 8gb M1 Mac, I can have a ton of tabs open and it'll swap with minimal slowdown. On the other hand, running a 4k external display and a small (4gb) llm is at best horrible and will sometimes require a hard reset.

            I've seen similar with different combinations of software/hardware.

          • By baq 2025-10-22 4:49 (1 reply)

            Linux being absolute dogshit if it’s under any sort of memory pressure is the reason, not swap or no swap. Modern systems would be much better off tweaking dirty bytes/ratios, but fundamentally the kernel needs to be dragged into the XXI century sometime.

            • By ValdikSS 2025-10-22 13:08

              It's kind of solved since kernel 6.1 with MGLRU, see above.

              The dirty buffer should also be tuned (limited), absolutely. The default is 20% of RAM (with a 5-second writeback and a 30-second expire interval), which is COMPLETELY insane. I usually limit it to 64 MB max, with 1-second writeback and 3-second expire intervals.
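              In sysctl.conf terms, those numbers look roughly like this (values converted from the comment above; the intervals are in centiseconds):

```ini
vm.dirty_bytes = 67108864
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 300
```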

        • By db48x 2025-10-22 3:43 (3 replies)

          This is not really true of most SSDs. When Linux is really thrashing the swap it’ll be essentially unusable unless the disk is _really_ fast. Fast enough SSDs are available though. Note that when it’s really thrashing the swap the workload is 100% random 4KB reads and writes in equal quantities. Many SSDs have high read speeds and high write speeds but have much worse performance under mixed workloads.

          I once used an Intel Optane drive as swap for a job that needed hundreds of gigabytes of ram (in a computer that maxed out at 64 gigs). The latency was so low that even while the task was running the machine was almost perfectly usable; in fact I could almost watch videos without dropping frames at the same time.

          • By ValdikSS 2025-10-22 13:16 (1 reply)

            It's fixed since Kernel 6.1 + MGLRU, see above, or read this: https://notes.valdikss.org.ru/linux-for-old-pc-from-2007/en/...

            • By webstrand 2025-10-22 15:24 (1 reply)

              Do you know how the le9 patch compares to mg_lru? The latter applies to all memory, not just files as far as I can tell. The former might still be useful in preventing eager OOM while still keeping executable file-backed pages in memory?

              • By ValdikSS 2025-10-22 22:08

                le9 is a 'simple' method to keep the fixed amount of the page cache. It works exceptionally well for what it is, but it requires manual tuning of the amount of cache in MB.

                MGLRU is basically a smarter version of the already existing eviction algorithm, which evicts (or keeps) both page cache and anon pages, and combined with min_ttl_ms it tries to keep the currently active page cache for a specified amount of time. It still takes swappiness into account and does not operate on a fixed amount of page cache, unlike le9.

                Both are effective at thrashing prevention, and both are different. MGLRU, especially with a higher min_ttl_ms, can invoke the OOM killer more frequently than you'd like. I find le9 more effective for desktop use on old low-end machines, but that's only because it just keeps a large(r) amount of page cache. It's not very preferable for embedded systems, for example.

          • By fulafel 2025-10-22 6:40 (1 reply)

            > Note that when it’s really thrashing the swap the workload is 100% random 4KB reads and writes in equal quantities.

            The free memory won't go below a configurable percentage and the contiguous io algorithms of the swap code and i/o stack can still do their work.

            • By db48x 2025-10-22 12:37 (1 reply)

              That may be the intention, but you shouldn’t rely on it. In practice the average IO size is, or at least was, almost always 4KB.

              Here’s a screenshot from atop while the task was running: <https://db48x.net/temp/Screenshot%20from%202019-11-19%2023-4...>. Note the number of page faults, the swin and swout (swap in and swap out) numbers, and the disk activity on nvme0n1. Swap in is 150k, and the number of disk reads was 116k with an average size of 6KB. Swap out was 150k with 150k disk writes of 4KB. It’s also reading from sdh at a fair clip (though not as fast as I wanted!)

              <https://db48x.net/temp/Screenshot%20from%202019-12-09%2011-4...> is interesting because it actually shows 24KB average write size. But notice that swout is 47k but there were actually 57k writes. That’s because the program I was testing had to write data out to disk to be useful, and I had it going to a different partition on the same nvme disk. Notice the high queue depth; this was a very large serial write. The swap activity was still all 4KB random IO.

              • By fulafel 2025-10-23 5:20 (1 reply)

                That's surprising. Do you know what your application's memory access pattern is like? Is it really this random, with single-page IO working along its grain, or is the page clustering, IO readahead, etc. just MIA?

                • By db48x 2025-10-24 3:56

                  I didn’t delve very deep into it, but the program was written in Go. At this point in the lifecycle of the program we had optimized it quite a bit, removing all the inefficiencies that we could. It was now spending around two thirds of its cpu cycles on garbage collection. It had this ridiculously large heap that was still growing, but hardly any of it was actually garbage.

                  I rewrote a slice of the program in Rust with quite promising results, but by that time there wasn’t really any demand left. You see, one of the many uses of Reposurgeon <http://www.catb.org/esr/reposurgeon/> is to convert SVN repositories into Git repositories. These performance results were taken while reposurgeon was running on a dump of the GCC source code repository. At the time this was the single largest open source SVN repository left in the world with 287k commits. Now that it’s been converted to a Git repository it’s unlikely that future Reposurgeon users will have the same problem.

                  Also, someone pointed out that MG-LRU <https://docs.kernel.org/admin-guide/mm/multigen_lru.html> might help by increasing the block size of the reads and writes. It was introduced a year or more after I took these screenshots, so I can’t easily verify that.

        • By eru 2025-10-21 23:48 (1 reply)

          How long is long running? You should be getting the warm caches after at most a few hours.

          > Secondly, Kernel swaps out unused pages to SWAP, relieving pressure from RAM. So, SWAP is often used even if you fill 1% of your RAM. This allows for more hot data to be cached, allowing better resource utilization and performance in the long run.

          Yes, and you can observe that even in your desktop at home (if you are running something like Linux).

          > So, eff it, we ball is never a good system administration strategy. Even if everything is ephemeral and can be rebooted in three seconds.

          I wouldn't be so quick. Google ran their servers without swap for ages. (I don't know if they still do it.) They decided that taking the slight inefficiency in memory usage, because they have to keep the 'leaked' pages around in actual RAM, is worth it to get predictability in performance.

          For what it's worth, I add generous swap to all my personal machines, mostly so that the kernel can offload cold / leaked pages and keep more disk content cached in RAM. (As a secondary reason: I also like to have a generous amount of /tmp space that's backed by swap, if necessary.)

          With swap files, instead of swap partitions, it's fairly easy to shrink and grow your swap space, depending on what your needs for free space on your disk are.

          • By bayindirh 2025-10-22 10:55

            > Yes, and you can observe that even in your desktop...

            Yup, that part of my comment was culmination of using Linux desktops for the last two decades. :)

            > I wouldn't be so quick. Google ran their servers without swap for ages.

            If you're designing this from the get-go and planning accordingly, it doesn't fit my definition of eff it, we ball; it's more let's try this and see whether we can make it work.

            > With swap files, instead of swap partitions,...

            I'm a graybeard. I eyeball a swap partition size while installing the OS, and just let it be. Being mindful and having a good amount of RAM means that swap acts as an eviction area for the OS first, and as an escape ramp second, in very rare cases.

            --

            Sent from my desktop.

        • By gchamonlive 2025-10-21 22:28 (3 replies)

          > SSDs make swap almost invisible

          It doesn't. SSDs have come a long way, but so have memory dies and buses, and with that the way programs work has also changed: more often than not, they can fit their stacks and heaps in memory.

          I have had a problem with shellcheck that for some reason eats up all my RAM when I open, I believe, .zshrc, and trust me, it's not invisible. The system crawls to a halt.

          • By bayindirh 2025-10-21 22:37 (1 reply)

            It depends on the SSD, I may say.

            If we're talking about SATA SSDs, which top out at 600MB/s, then yes, an aggressive application can make itself known. However, if you have a modern NVMe, esp. a 4x4 one like the Samsung 9x0 series, or if you're using a Mac, I bet you'll notice the problem much later, if ever. Remember the SSD thrashing problem on M1 Macs? People never noticed that the system used swap that heavily and thrashed the SSD on board.

            Then, if you're using a server with a couple of SAS or NVMe SSDs, you'll not notice the problem again, esp. if these are backed by RAID (even md counts).

            • By gchamonlive 2025-10-21 23:15

              Now that you say it, I have a new Lenovo Yoga with that SoC RAM with a crazy parallel-channel config (16GB spread across 8 dies of 2GB). It's noticeably faster than my Acer Nitro with dual-channel 16GB DDR5. I'll check that, but I'd say it's not what the average home user (or even server, I'd risk saying) would have.

          • By xienze 2025-10-21 23:16 (1 reply)

            > it's not invisible. The system crawls to a halt.

            I’m gonna guess you’re not old enough to remember computers with memory measured in MB and IDE hard disks? Swapping was absolutely brutal back then. I agree with the other poster: swap hitting an SSD is barely noticeable in comparison.

            • By pdimitar 2025-10-22 12:22

              I am not sure exactly what your point is. Is it "hey, it can be much worse"? If so, not a very interesting argument if your machine crawls to a halt.

          • By justsomehnguy 2025-10-21 22:56 (6 replies)

            What do you prefer:

            ( ) a 1% chance the system would crawl to a halt but would work

            ( ) a 1% chance the kernel would die and nothing would work

            • By gchamonlive 2025-10-21 23:16

              I think I've not made myself as clear as I could. Swap is important for efficient system performance way before you hit OOM on main memory. It's not, however, going to save system responsiveness in case of OOM. This is what I mean.

            • By eru 2025-10-22 0:09 (1 reply)

              The trade-off depends on how your system is set up.

              Eg Google used to (and perhaps still does?) run their servers without swap, because they had built fault tolerance in their fleet anyway, so were happier to deal with the occasional crash than with the occasional slowdown.

              For your desktop at home, you'd probably rather deal with a slowdown that gives you a chance to close a few programs than just crashing your system. After all, if you are standing physically in front of your computer, you can always just manually hit the reset button if the slowdown is too agonising.

              • By macintux 2025-10-22 0:55

                That’s very common in distributed systems: much better to have a failed node than a slow node. Slow nodes are often contagious.

            • By andai 2025-10-21 23:14 (4 replies)

              Can someone explain this to me? Doesn't swap just delay the fundamental issue? Or is there a qualitative difference?

              • By eru 2025-10-21 23:52

                Swap delays the 'fundamental issue', if you have a leak that keeps growing.

                If your problem doesn't keep growing, and you just have more data that programs want to keep in memory than you have RAM, but the actual working set of what's accessed frequently still fits in RAM, then swap perfectly solves this.

                Think lots of programs open in the background, or lots of open tabs in your browser, but you only ever rapidly switch between at most a handful at a time. Or you are starting a memory hungry game and you don't want to be bothered with closing all the existing memory hungry programs that idle in the background while you play.

              • By danielheath 2025-10-22 1:35

                I run a chat server on a small instance; when someone uploads a large image to the chat, the 'thumbnail the image' process would cause the OOM-killer to take out random other processes.

                Adding a couple of gb of swap means the image resizing is _slow_, but completes without causing issues.

              • By charcircuit 2025-10-22 3:29

                The problem is freezing the system for hours or more to delay the issue is not worth it. I'd rather a program get killed immediately than having my system locked up for hours before a program gets killed.

              • By justsomehnguy 2025-10-22 0:08 (2 replies)

                https://news.ycombinator.com/item?id=45007821

                > Doesn't swap just delay the fundamental issue?

                The fundamental issue here is that the Linux fanboys literally think that killing a working process, and most of the time the most important process[0], is a good solution for not solving the fundamental problem of memory allocation in the Linux kernel.

                Availability of swap allows you to avoid malloc failure in the rare case that your processes request more memory than is physically (or 'physically', heh) present in the system. But in the minds of so-called Linux administrators, if even one byte of swap were used, the system would immediately crawl to a stop and never recover. Why it always should be the worst and most idiotic scenario, instead of the sane 'needed 100MB more, got it (while some shit in memory that hadn't been accessed since boot was swapped out), did the things it needed to do and freed that 100MB', is never explained by them.

                [0] imagine a dedicated machine for *SQL server - which process would have the most memory usage on that system?

                • By ssl-3 2025-10-22 1:02 (1 reply)

                  Indeed.

                  Also: When those processes that haven't been active since boot (and which may never be active again) are swapped out, more system RAM can become available for disk caching to help performance of things that are actively being used.

                  And that's... that's actually putting RAM to good use, instead of letting it sit idle. That's good.

                  (As many are always quick to point out: Swap can't fix a perpetual memory leak. But I don't think I've ever seen anyone claim that it could.)

                  • By qotgalaxy 2025-10-22 1:59 (2 replies)

                    What if I care more about the performance of things that aren't being used right now than the things that are? I'm sick of switching to my DAW and having to listen to my drive thrash when I try to play a (say) sampler I had loaded.

                    • By ssl-3 2025-10-22 4:41

                      Just set swappiness to [say] 5, 2, 1, or even 0, and move on with your project with a system that is more reluctant to go into swap.

                      And maybe plan on getting more RAM.

                      (It's your system. You're allowed to tune it to fit your usage.)
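                      For reference, the persistent form of that knob (the file name is illustrative; the value 5 is just the example from above):

```ini
# /etc/sysctl.d/99-swappiness.conf
# Lower values make the kernel more reluctant to swap out anonymous pages
vm.swappiness = 5
```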

                    • By db48x 2025-10-22 4:32

                      Sounds like you just need more memory.

                • By ta1243 2025-10-22 12:04 (1 reply)

                  If I've got 128G of ram and need 100M more to get it, something is wrong.

                  What if I've got 64G of ram and 64G of swap and need the same amount of memory?

                  • By justsomehnguy 2025-10-22 19:58

                    "Why it always should be the worst and the most idiotic scenario "

                    And no, if you need 100MB more then it's literally not important how much RAM you have. You just needed 100MB more this time.

            • By ta1243 2025-10-22 12:00

              The second by a long shot.

              Detecting things are down is far easier than detecting things are slow.

              I'd rather OOM started killing things, though, than a kernel panic or a slow system. Ideally the thing that is leaking, but if not, the process using the most memory (and yes, I know that "using" is tricky).

            • By pdimitar 2025-10-22 13:24

              I don't count crawling to a halt as a working machine. Plus, it depends. Back in the day I had computers that got blocked for 30-ish seconds, which was annoying but gave you a window of opportunity to go kill the offending program. But then there were some that we left, out of curiosity, to work throughout the entire workday, and they never recovered.

              So most of the time I'd prefer option 3: the OOM killer to reap a few offending programs and let me handle restarting them.

        • By hhh 2025-10-22 3:14 (1 reply)

          Kubernetes supports swap now.

          I still don’t use it though.
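          For the curious, swap-aware nodes are opted in through the kubelet configuration (the NodeSwap feature; as of recent releases, LimitedSwap only lets Burstable pods use swap). A hedged sketch, not a complete config:

```yaml
# KubeletConfiguration fragment
failSwapOn: false
memorySwap:
  swapBehavior: LimitedSwap
```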

        • By vasco 2025-10-21 22:45 (1 reply)

          In EC2 using any kind of swapping is just wrong, the comment you replied to already made all the points that can be made though.

          • By bayindirh 2025-10-21 22:53 (1 reply)

            From my understanding, the comment I'm replying to uses EC2 example to portray that swapping is wrong in any and all circumstances, and I just replied with my experience with my system administrator hat.

            I'm not an AWS guy. I can see and touch the servers I manage, and in my experience, SWAP works, and works well.

            • By matt-p 2025-10-21 23:16 (1 reply)

              Just for context, EC2 typically uses network storage that, for obvious reasons, often has fairly rubbish latency and performance characteristics. Swap works fine if you have local storage, though obviously it burns through your SSD/NVMe drive faster and can have other side effects on its performance (usually not particularly noticeable).

              • By bayindirh 2025-10-22 10:50 (1 reply)

                Thanks, I'll keep that in mind if I start to use EC2 for workloads.

                However, from my experience, normal (eviction-based) usage of swap doesn't impact the life of an SSD in a measurable manner. My 256GB system SSD (in my desktop system) shows 78% life remaining after 4 years of power-on hours, and it also served as /home for at least half of its life.

                • By vasco 2025-10-2220:241 reply

                  You don't care about life of any hardware in the cloud, that doesn't really matter either unless you work for the cloud provider in their datacenter teams.

                  • By bayindirh 2025-10-237:54

                    Yes, but I care about hardware life on my own personal systems and infrastructure I manage, so... :)

        • By adastra22 2025-10-2122:402 reply

          What pressure? If your ram is underutilized, what pressure are you talking about?

          If the slowest drive on the machine is the SSD, how does caching to swap help?

          • By bayindirh 2025-10-2122:483 reply

            A long running Linux system uses 100% of its RAM. Every byte unused for applications will be used as a disk cache, given you read more data than your total RAM amount.

            This cache is evictable, but it'll be there eventually.

            In the old days, Linux wouldn't touch unused pages in RAM if it wasn't under memory pressure, but now it swaps out pages that haven't been used for a long time. This allows more cache space in RAM.

            > how does caching to swap help?

            I think I failed to convey what I tried to say. Let me retry:

            Kernel doesn't cache to SSD. It swaps out unused (not accessed) but unevictable pages to SWAP, assuming that these pages will stay stale for a very long time, allowing more RAM to be used as cache.

            When I look at my desktop system, in 12 days the Kernel moved 2592MB of my RAM to SWAP despite having ~20GB of free space. ~15GB of this free space is used as disk cache.

            So, to have 2.5GB more disk cache, Kernel moved 2592 MB of non-accessed pages to SWAP.
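            (Not from the thread, but the numbers above come straight out of /proc/meminfo; a small Linux-only Python sketch to pull them yourself:)

            ```python
            def meminfo():
                """Parse /proc/meminfo (Linux-only) into a {field: kB} dict."""
                fields = {}
                try:
                    with open("/proc/meminfo") as f:
                        for line in f:
                            key, _, rest = line.partition(":")
                            fields[key] = int(rest.split()[0])  # values are in kB
                except OSError:
                    pass  # not on Linux; return an empty dict
                return fields

            m = meminfo()
            if m:
                swapped = m.get("SwapTotal", 0) - m.get("SwapFree", 0)
                print(f"disk cache: {m.get('Cached', 0)} kB, swapped out: {swapped} kB")
            ```

            This is roughly what "free" parses for you; the "Cached" vs. swapped-out split is the trade the Kernel is making above.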

            • By adastra22 2025-10-2123:436 reply

              Yes, and if I am writing an API service, for example, I don’t want to suddenly add latency because I hit pages that have been swapped out. I want guarantees about my API call latency variance, at least when the server isn’t overloaded.

              I DON’T WANT THE KERNEL PRIORITIZING CACHE OVER NRU PAGES.

              The easiest way to do this is to disable swap.

              • By eru 2025-10-2123:551 reply

                You better not write your API in Python, or any language/library that uses amortised algorithms in the standard (like Rust and C++ do). And let's not mention garbage collection.

                • By pdimitar 2025-10-2213:46

                  Huh? Could you please clarify wrt to Rust and C++? Can't they use another allocator if needed? Or that's not the problem?

              • By dwattttt 2025-10-2210:551 reply

                If you're getting this far into the details of your memory usage, shouldn't you use mlock to actually lock in the parts of memory you need to stay there? Then you get to have three tiers of priority: pages you never want swapped, cache, then pages that haven't been used recently.

                • By pdimitar 2025-10-2213:491 reply

                  Can mlock be instructed to f.ex. "never swap pages from this pid"?

                  • By bayindirh 2025-10-2213:511 reply

                    The application requests this itself from the Kernel. See https://man7.org/linux/man-pages/man2/mlock.2.html

                    • By dwattttt 2025-10-2220:47

                      From the link, mlockall with MCL_CURRENT | MCL_FUTURE

                      > Lock all pages which are currently mapped into the address space of the process.

                      > Lock all pages which will become mapped into the address space of the process in the future.
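                      For what it's worth, the same call is reachable from Python via ctypes; a best-effort sketch (the MCL_* constants are the Linux <sys/mman.h> values, not taken from the thread):

                      ```python
                      import ctypes
                      import ctypes.util

                      # Values from Linux <sys/mman.h> (assumption: same on glibc and macOS)
                      MCL_CURRENT, MCL_FUTURE = 1, 2

                      def lock_all_pages():
                          """Best-effort mlockall(MCL_CURRENT | MCL_FUTURE) for this process.

                          Returns True on success, False if the call fails (e.g. RLIMIT_MEMLOCK
                          too low, missing CAP_IPC_LOCK) or there is no usable libc."""
                          name = ctypes.util.find_library("c")
                          if name is None:
                              return False
                          try:
                              libc = ctypes.CDLL(name, use_errno=True)
                              return libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0
                          except (OSError, AttributeError):
                              return False

                      print("locked:", lock_all_pages())
                      ```

                      After a successful call, nothing this process maps can be swapped out, which is exactly the "never swap this pid" behavior asked about above.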

              • By bayindirh 2025-10-2210:44

                > I DON’T WANT THE KERNEL PRIORITIZING CACHE OVER NRU PAGES.

                Then tell the Kernel about it. Don't remove a feature which might benefit other things running on your system.

              • By baq 2025-10-224:54

                If you’re writing services in anything higher level than C you’re leaking something somewhere that you probably have no idea exists and the runtime won’t ever touch again.

              • By sethherr 2025-10-2123:551 reply

                I’m asking because I genuinely don’t know - what are “pages” here?

                • By adastra22 2025-10-220:00

                  That’s a fair question. A page is the smallest allocatable unit of RAM, from the OS/kernel perspective. The size is set by the CPU, traditionally 4kB, but these days 8kB-4MB are also common.

                  When you call malloc(), it requests a big chunk of memory from the OS, in units of pages. It then uses an allocator to divide it up into smaller, variable length chunks to form each malloc() request.

                  You may have heard of “heap” memory vs “stack” memory. The stack of course is the execution/call stack, and heap is called that because the “heap allocator” is the algorithm originally used for keeping track of unused chunks of these pages.

                  (This is beginner CS stuff so sorry if it came off as patronizing—I assume you’re either not a coder or self-taught, which is fine.)
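                  To make "page" concrete, the size is easy to query (trivial sketch):

                  ```python
                  import mmap

                  # The smallest unit the kernel allocates, maps, and swaps in/out.
                  page = mmap.PAGESIZE
                  print(f"page size: {page} bytes")  # commonly 4096 on x86-64, 16384 on Apple silicon
                  ```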

              • By gnosek 2025-10-224:08

                Or you can set the vm.swappiness sysctl to 0.
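                For anyone following along, the usual way to persist that across reboots (the file name is conventional, any name under /etc/sysctl.d works):

                ```
                # /etc/sysctl.d/99-swappiness.conf
                # 0 = swap only to avoid OOM; the default is 60 on most distros
                vm.swappiness = 0
                ```

                Apply it without rebooting via `sudo sysctl --system`, or do a one-off `sudo sysctl -w vm.swappiness=0`.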

            • By ta1243 2025-10-2212:101 reply

              > A long running Linux system uses 100% of its RAM.

              How about this server:

                           total       used       free     shared    buffers     cached
                Mem:          8106       7646        459          0        149       6815
                -/+ buffers/cache:        681       7424
                Swap:         6228         25       6202
              
              Uptime of 2,105 days - nearly 6 years.

              How long does the server have to run to reach 100% of ram?

              • By bayindirh 2025-10-2212:151 reply

                You've already maxed it from the Kernel's PoV: 8GB of RAM, where 6.8GB is cache, ~700MB is resident, and 459MB is free because I assume the Kernel wants to keep some free space so it can allocate something quite fast.

                25MB of swap use seems normal for a server which doesn't juggle many tasks, but works on one.

                • By ta1243 2025-10-2217:14

                  So not 100% of ram, less than 95%

            • By wallstop 2025-10-2122:561 reply

              Edit:

                  wallstop@fridge:~$ free -m
                                 total        used        free      shared  buff/cache   available
                  Mem:           15838        9627        3939          26        2637        6210
                  Swap:           4095           0        4095
              
              
                  wallstop@fridge:~$ uptime
              
                  00:43:54 up 37 days, 23:24,  1 user,  load average: 0.00, 0.00, 0.00

              • By bayindirh 2025-10-2123:043 reply

                The command you want to use is "free -m".

                This is from another system I have close:

                                   total        used        free      shared  buff/cache   available
                    Mem:           31881        1423        1042          10       29884       30457
                    Swap:            976           2         974
                
                2MB of SWAP used, 1423 MB RAM used, 29GB cache, 1042 MB Free. Total RAM 32 GB.

                • By eru 2025-10-2123:581 reply

                  If you are interested in human consumption, there's "free --human", which decides on useful units by itself. The "--human" switch is also available for "du --human", "df --human" and "ls -l --human". It's often abbreviated as "-h", but not always, since that can also stand for "--help".

                  • By bayindirh 2025-10-2210:46

                    Thanks, I generally use free -m since my brain can unconsciously parse it after all these years. ls -lh is one of my learned commands though. I type it in automatically when analyzing things.

                    ls -lrt, ls -lSh and ls -lShr are also very common in my daily use, depending on what I'm doing.

                • By ta1243 2025-10-2212:101 reply

                  So that 2M of used swap is completely irrelevant. Same on my laptop

                                 total        used        free      shared  buff/cache   available
                      Mem:           31989       11350        4474        2459       16164       19708
                      Swap:           6047          20        6027
                  
                  My syslog server on the other hand (which does a ton of stuff on disk) does use swap

                      Mem:            1919         333          75           0        1511        1403
                      Swap:           2047         803        1244
                  
                  With uptime of 235 days.

                  If I were to increase this to 8G of ram instead of 2G, but for arguments sake had to have no swap as the tradeoff, would that be better or worse. Swap fans say worse.

                  • By bayindirh 2025-10-2212:21

                    > So that 2M of used swap is completely irrelevant.

                    As I noted somewhere, my other system has 2,5GB of SWAP allocated over 13 days. That system is a desktop system and juggles tons of things everyday.

                    I have another server with tons of RAM, and the Kernel decided not to evict anything to SWAP (yet).

                    > If I were to increase this to 8G of ram instead of 2G, but for arguments sake had to have no swap as the tradeoff, would that be better or worse. Swap fans say worse.

                    I'm not a SWAP fan, but I support its use. On the other hand I won't say it'd be worse, but it'd be overkill for that server. Maybe I can try 4, but that doesn't seem to be necessary if these numbers are stable over time.

                • By wallstop 2025-10-220:44

                  Thanks! My other problem was formatting. Just wanted to share that I see 0 swap usage and nowhere near 100% memory usage as a counterpoint.

          • By adgjlsfhk1 2025-10-2122:511 reply

            The OS uses almost all the ram in your system (it just doesn't tell you because then users complain that their OS is too ram heavy). The primary thing it uses it for is caching as much of your storage system as possible. (e.g. all of the filesystem metadata and most of the files anyone on the system has touched recently). As such, if you have RAM that hasn't been touched recently, the OS can page it out and make the rest of the system faster.

            • By adastra22 2025-10-2123:471 reply

              At the cost of tanking performance for the less frequently used code path. Sometimes it is more important to optimize in ways that minimize worst case performance rather than a marginal improvement to typical work loads. This is often the case for distributed systems, e.g. SaaS backends.

              • By bayindirh 2025-10-2210:13

                You can request things from the Kernel, like pinning cores or telling the Kernel not to swap your pages out (see mlockall() / madvise()).

                The easiest way affecting everything running on the system might not be the best or even the correct way to do things.

                There's always more than one way to solve a problem.

                Reading the Full Manual (TM) is important.

        • By commandersaki 2025-10-2122:351 reply

          > This is a wrong belief

          This is not about belief, but lived experience. Setting up swap, to me, is a choice between an unresponsive or downed system (with swap) and a responsive system with a few OOM kills (without).

          • By bayindirh 2025-10-2122:392 reply

            > This is not about belief, but lived experience.

            I mean, I manage some servers, and this is my experience.

            > Setting up swap to me is a choice between a unresponsive system (with swap) or a responsive system with a few oom kills or downed system.

            Sorry, but are you sure that you budgeted your system requirements correctly? A Linux system shall neither fill SWAP nor trigger OOM regularly.

            • By eru 2025-10-220:021 reply

              Swap also works really well for desktop workloads. (I guess that's why Apple uses it so heavily on their MacBooks etc.)

              With a good amount of swap, you don't have to worry about closing programs. As long as your 'working set' stays smaller than your RAM, your computer stays fast and responsive, regardless of what's open and idling in the background.

              • By bayindirh 2025-10-2210:42

                Yes, this is my experience, too. However, I still tend to observe my memory usage even if I have plenty of free RAM.

                Old habits die hard, but I'm not complaining about this one. :)

            • By commandersaki 2025-10-221:28

              It doesn’t happen often, and I have a multi user system with unpredictable workloads. It’s also not about swap filling up, but giving the pretense the system is operable in a memory exhausted state which means oom killer doesn’t run, but the system is unresponsive and never recovers.

              Without swap oom killer runs and things become responsive.

      • By KaiserPro 2025-10-2122:222 reply

        Yeahna, that's just memory exhaustion.

        Swap helps you use RAM more efficiently: you keep the hot stuff in RAM and let the rest fester on disk.

        Sure if you overwhelm it, then you're gonna have a bad day, but thats the same without swap.

        Seriously, swap is good, don't believe the noise.

        • By adastra22 2025-10-2122:431 reply

          I don’t understand. If you provision the system with enough RAM, then you have room for every page in RAM, hot or not.

          • By akvadrako 2025-10-2123:161 reply

            Only if you have more RAM than disk space, which is wasteful for many applications.

            • By adastra22 2025-10-2123:362 reply

              Running out of memory kills performance. It is better to kill the VM and restart it so that any active VM remains low latency.

              That is my interpretation of what people are saying upthread, at least. To which posters such as yourself are saying “you still need swap.” Why?

              • By eru 2025-10-220:032 reply

                RAM costs money, disk space costs less money.

                It's a bit wasteful to provision your computers so that all the cold data lives in expensive RAM.

                • By fluoridation 2025-10-220:083 reply

                  >It's a bit wasteful to provision your computers so that all the cold data lives in expensive RAM.

                  But that's a job applications are already doing. They put data that's being actively worked on in RAM and leave all the rest in storage. Why would you need swap once the entire working set already fits in RAM?

                  • By vlovich123 2025-10-220:402 reply

                    Because then you have more active working memory as infrequently used pages are moved to compressed swap and can be used for more page cache or just normal resident memory.

                    Swap ram by itself would be stupid but no one doing this isn’t also turning on compression.

                    • By eru 2025-10-228:391 reply

                      > Swap ram by itself would be stupid but no one doing this isn’t also turning on compression.

                      I'm not sure what you mean here? Swapping out infrequently accesses pages to disk to make space for more disk cache makes sense with our without compression.

                      • By vlovich123 2025-10-2215:09

                        Swapping out to RAM without compression is stupid - then you’re just shuffling pages around in memory. Compression is key so that you free up space. Swap to disk is separate.

                    • By fluoridation 2025-10-2216:541 reply

                      >Because then you have more active working memory as infrequently used pages are moved to compressed swap and can be used for more page cache or just normal resident memory.

                      Uhh... A VMM that swaps out to disk an allocated page to make room for more disk cache would be braindead. The process has allocated that memory to use it. The kernel doesn't have enough information to deem disk cache a higher priority. The only thing that should cause it to be swapped out is either another process or the kernel requesting memory.

                      • By vlovich123 2025-10-232:37

                        > A VMM that swaps out to disk an allocated page to make room for more disk cache would be braindead

                        Claiming any decision is “brain dead” in something as heuristic heavy and impossible to compute optimally as resident memory pages is quite the statement to make; this is a form of the knapsack problem (NP-complete at least) with the added benefit of time where the items are needed in some specific indeterminate order in the future and there’s a whole bunch of different workloads and workload permutations that alter this.

                        To drive this point home in case you disagree, what’s dumber? Swapping out to disk an allocated page (from the kernel’s perspective) that’s just sitting in the free list of the userspace allocator for that process or a page of some frequently accessed page of data?

                        Now, I agree that VMMs may not do this because it’s difficult to come up with these kinds of scenarios that don’t penalize the general case, more importantly than performance this has to be a mechanism that is explainable to others and understandable for them. But claiming it’s a braindead option to even consider is IMHO a bridge too far.

                  • By akvadrako 2025-10-227:01

                    This subthread is about a poster's claim above that every page would be in RAM if you have enough, "hot or not", not just the working set.

                  • By eru 2025-10-220:271 reply

                    Sure, some applications are written to manually do a job that your kernel can already do for you.

                    In that case, and if you are only running these applications, the need for swap is much less.

                    • By fluoridation 2025-10-220:471 reply

                      You mean to tell me most applications you've ever used read the entire file system, loading every file into memory, and rely on the OS to move the unused stuff to swap?

                      • By eru 2025-10-224:451 reply

                        No? What makes you think so?

                        • By fluoridation 2025-10-224:591 reply

                          Then what do you mean, some applications organize hot and cold data in RAM and storage respectively? Just about every application does it.

                          • By eru 2025-10-228:411 reply

                            A silly but realistic example: lots of applications leak a bit of memory here and there.

                            Almost by definition, that leaked memory is never accessed again, so it's very cold. But the applications don't put this on disk by themselves. (If the app's developers knew which specific bit was leaking, they'd rather fix the leak than write it to disk.)

                            • By fluoridation 2025-10-2211:09

                              That's just recognizing that there's a spectrum of hotness to data. But the question remains: if all the data that the application wants to keep in memory does fit in memory, why do you need swap?

                • By adastra22 2025-10-220:071 reply

                  When building distributed systems, service degradation means you’ll have to provision more systems. Cheaper to provision fewer systems with more RAM.

                  • By eru 2025-10-220:28

                    It depends on what you are doing, and how your system behaves.

                    If you size your RAM and swap right, you get no service degradation, but still get away with using less RAM.

                    But when I was at Google (about a decade ago), they followed exactly the philosophy you were outlining and disabled swap.

              • By KaiserPro 2025-10-228:56

                > Running out of memory kills performance. It is better to kill the VM and restart it so that any active VM remains low latency.

                Right, you seem to be not understanding what I'm getting at.

                Memory exhaustion is bad, regardless of swap or not.

                Swap gets you a better performing machine because you can swap out shit to disk and use that ram for vfs cache.

                the whole "low latency" and "I want my VM to die quicker" is tacitly saying that you haven't right sized your instances, your programme is shit, and you don't have decent monitoring.

                Like if you're hovering on 90% ram used, then your machine is too small, unless you have decent bounds/cgroups to enforce memory limits.

        • By gchamonlive 2025-10-2122:351 reply

          It's good, and AWS shouldn't disable it by default, but it won't save the system from OOM.

          • By matt-p 2025-10-2123:181 reply

            I bet there's a big "burns through our SSDs faster" spreadsheet column or similar that caused it to be disabled.

            • By gchamonlive 2025-10-2123:30

              Maybe. Or maybe it's an arbitrary decision.

              Many won't enable swap. For some swap wouldn't help anyways, but others it could help soak up spikes. The latter in some cases will upgrade to a larger instance without even evaluating if swap could help, generating AWS more money.

              Either way it's far-fetched to derive intention from the fact.

      • By Dylan16807 2025-10-222:57

        "as soon as you hit swap" is a bad way of looking at things. Looking around at some servers I run, most of them have .5-2GB of swap used despite a bunch of gigabytes of free memory. That data is never or almost never going to be touched, and keeping it in memory would be a waste. On a smaller server that can be a significant waste.

        Swap is good to have. The value is limited but real.

        Also not having swap doesn't prevent thrashing, it just means that as memory gets completely full you start dropping and re-reading executable code over and over. The solution is the same in both cases, kill programs before performance falls off a cliff. But swap gives you more room before you reach the cliff.

      • By gchamonlive 2025-10-2122:042 reply

        How programs use ram also changed from the 90s. Back then they were written targeting machines that they knew would have a hard time fitting all their data in memory, so hitting swap wouldn't hurt perceived performance too drastically since many operations were already optimized to balance data load between memory and disk.

        Nowadays when a program hits swap it's not going to fallback to a different memory usage profile that prioritises disk access. It's going to use swap as if it were actual ram, so you get to see the program choking the entire system.

        • By winrid 2025-10-2122:151 reply

          Exactly. Nowadays, most web services are run in a GC'ed runtime. That VM will walk pointers all over the place and reach into swap all the time.

          • By cogman10 2025-10-2122:354 reply

            Depends entirely on the runtime.

            If your GC is a moving collector, then absolutely this is something to watch out for.

            There are, however, a number of runtimes that will leave memory in place. They are effectively just calling `malloc` for the objects and `free` when the GC algorithm detects an object is dead.

            Go, the CLR, Ruby, Python, Swift, and I think node(?) all fit in this category. The JVM has a moving collector.

            • By zozbot234 2025-10-2123:102 reply

              Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out what objects have become garbage. Generational GC's can trace through the oldest generations less often, but that's about it.

              Tracing garbage collectors solve a single problem really really well - managing a complex, possibly cyclical reference graph, which is in fact inherent to some problems where GC is thus irreplaceable - and are just about terrible wrt. any other system-level or performance-related factor of evaluation.

              • By cogman10 2025-10-2123:341 reply

                > Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out what objects have become garbage.

                There's a lot of "it depends" here.

                For example, an RC garbage collector (like Swift and Python?) doesn't ever trace through the graph.

                The reason I brought up moving collectors is by their nature, they take up a lot more heap space, at least 2x what they need. The advantage of the non-moving collectors is they are much more prompt at returning memory to the OS. The JVM in particular has issues here because it has pretty chunky objects.

                • By Dylan16807 2025-10-223:15

                  > The reason I brought up moving collectors is by their nature, they take up a lot more heap space, at least 2x what they need.

                  If the implementer cares about memory use it won't. There are ways to compact objects that are a lot less memory-intensive than copying the whole graph from A to B and then deleting A.

              • By eru 2025-10-220:051 reply

                Modern garbage collectors have come a long way.

                Even not so modern ones: have you heard of generational garbage collection?

                But even in eg Python they introduced 'immortal objects' which the GC knows not to bother with.

                • By winrid 2025-10-226:291 reply

                  It doesn't matter. The GC does not know what heap allocations are in memory vs swap, and since you don't write applications thinking about that, running a VM with a moving GC on swap is a bad idea.

                  • By eru 2025-10-228:371 reply

                    A moving GC can make sure to separate hot and cold data, and then rely on the kernel to keep hot data in RAM.

                    • By winrid 2025-10-2220:51

                      Yeah, but in practice I'm not sure that really works well with any GCs today? I've tried this with modern JVM and Node VMs; it always ended up with random multi-second lockups. Not worth the time.

            • By manwe150 2025-10-222:26

              MemBalancer is a relatively new analysis paper that argues having swap allows maximum performance by absorbing small excesses, which avoids needing to over-provision RAM instead. The kind of GC doesn't matter, since data spends very little time in that state, and on the flip side, most of the time the application has access to twice as much memory to use.

            • By masklinn 2025-10-225:33

              Python’s not a mover but the cycle breaker will walk through every object in the VM.

              Also since the refcounts are inline, adding a reference to a cold object will update that object. IIRC Swift has the latter issue as well (unless the heap object’s RC was moved to the side table).

            • By eru 2025-10-220:051 reply

              A moving GC should be better at this, because it can compact your memory.

              • By cogman10 2025-10-220:291 reply

                A moving collector has to move to somewhere and, generally by its nature, it's constantly moving data all across the heap. That's what makes it end up touching a lot more memory while also requiring more memory. On minor collections it'll move memory between two different locations, and on major collections it'll end up moving the entire old gen.

                It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. But also the fact that moving collector like to hold onto memory as downsizing is pretty hard to do efficiently.

                Non-moving collectors are generally ultimately using C allocators which are fairly good at avoiding fragmentation. Not perfect and not as fast as a moving collector, but also fast enough for most use cases.

                Java's G1 collector would be the worst example of this. It's constantly moving blocks of memory all over the place.

                • By eru 2025-10-224:44

                  > It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. But also the fact that moving collector like to hold onto memory as downsizing is pretty hard to do efficiently.

                  The memory that's now not in use, but still held onto, can be swapped out.

        • By zoeysmithe 2025-10-2122:552 reply

          This is really interesting and I've never really heard about this. What is going on with the kernel team then? Are they just going to keep swap as-is for backwards compatibility then everyone else just disables it? Or if this advice just for high performance clusters?

          • By kccqzy 2025-10-2123:10

            No. I use swap for my home machines. Most people should leave swap enabled. In fact I recommend the setup outlined in the kernel docs for tmpfs: https://docs.kernel.org/filesystems/tmpfs.html which is to have a big swap and use tmpfs for /tmp and /var/tmp.

          • By gchamonlive 2025-10-2123:12

            As someone else said, swap is important not only in case the system exhausts main memory; it's also used to use system memory efficiently before that (caching, offloading page blocks that aren't frequently used to swap, etc...)

      • By LaurensBER 2025-10-2122:131 reply

        The beauty of ZRAM is that on any modern-ish CPU it's surprisingly fast. We're talking 2-3 ms instead of 2-3 seconds ;)

        I regularly use it on my Snapdragon 870 tablet (not exactly a top-of-the-line CPU) to prevent OOM crashes (it's running an ancient kernel and the Android OOM killer basically crashes the whole thing) when running a load of tabs in Brave and a Linux environment (through Termux) at the same time.

        ZRAM won't save you if you do actually need to store and actively use more than the physical memory but if 60% of your physical memory is not actively used (think background tabs or servers that are running but not taking requests) it absolutely does wonders!

        On most (web) app servers I happily leave it enabled to handle temporary spikes, memory leaks or applications that load a whole bunch of resources that they never ever use.

        I'm also running it on my Kubernetes cluster. It allows me to set reasonable strict memory limits while still having the certainty that Pods can handle (short) spikes above my limit.

        • By geokon 2025-10-229:09

          My understanding was that if you're doing random access - ZRAM has near-zero overhead. While data is being fetched from RAM, you have enough cycles to decompress blocks.

          Would love to be corrected if I'm wrong

      • By slyall 2025-10-222:07

        My 2 cents is that in a lot of cases swap is being used for unimportant stuff, leaving more RAM for your app. Do a "ps aux" and look at all the RAM used by weird stuff. Good news is those things will be swapped out.

        Example on my personal VPS

           $ free -m
                          total        used        free      shared  buff/cache   available
           Mem:            3923        1225         328         217        2369        2185
           Swap:           1535        1335         200
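        To see exactly which of that "weird stuff" is sitting in swap, the per-process number lives in the VmSwap field of /proc/<pid>/status (a Linux-only sketch, roughly the kind of breakdown tools like smem give you):

        ```python
        import glob
        import re

        def swap_by_process(top=5):
            """Return the biggest swap users as (kB, pid, name) tuples,
            read from /proc/<pid>/status (Linux-only; empty list elsewhere)."""
            rows = []
            for path in glob.glob("/proc/[0-9]*/status"):
                try:
                    text = open(path).read()
                except OSError:
                    continue  # process exited while we were scanning
                name = re.search(r"^Name:\s+(\S+)", text, re.M)
                swap = re.search(r"^VmSwap:\s+(\d+) kB", text, re.M)
                if name and swap:
                    rows.append((int(swap.group(1)), path.split("/")[2], name.group(1)))
            return sorted(rows, reverse=True)[:top]

        for kb, pid, name in swap_by_process():
            print(f"{kb:>8} kB  {pid:>7}  {name}")
        ```

        Run it on a box like the one above and the idle daemons tend to dominate the list, which is the point: that memory is doing nobody any good in RAM.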

      • By 01HNNWZ0MV43FF 2025-10-2122:162 reply

        It's not just 3 seconds for a button click: every time I've run out of RAM on a Linux system, everything locks up and thrashes. It feels like a 100x slowdown. I've had better experiences when my CPU was underclocked to 20% speed. I enable swap and install earlyoom. Let processes die, as long as I can move the mouse and operate a terminal.

        • By zozbot234 2025-10-2122:43

          > It feels like 100x slowdown.

          Yup, this is a thing. It happens because file-backed program text and read-only data eventually get evicted from RAM (to make room for process memory) so every access to code and/or data beyond the current 4K page can potentially involve a swap-in from disk. It would be nice if we had ways of setting up the system so that pages of code or data that are truly critical for real-time responsiveness (including parts of the UI) could not get evicted from RAM at all (except perhaps to make room for the OOM reaper itself to do its job) - but this is quite hard to do in practice.

        • By C7E69B041F 2025-10-2122:26

          This. I'm used to restarting my Plasma twice a day because PHPStorm just leaks memory; it eventually crashes and requires a hard reboot.

      • By akerl_ 2025-10-2123:25

        Is it possible you misread the comment you're replying to? They aren't recommending adding swap, they're recommending adjusting the memory tunables to make the OOM killer a bit more aggressive so that it starts killing things before the whole server goes to hell.

      • By Hendrikto 2025-10-2210:51

        > This is well known

        But also false. Swap is there so anonymous pages can be evicted. Not as a “slow overflow for RAM”, as a lot of people still believe.

        By disabling swap you can actually *increase* thrashing, because the kernel is more limited in what it can do with the virtual memory.

      • By AlexandrB 2025-10-2213:08

        > Maybe back in the 90s, it was okay to wait 2-3 seconds for a button click, but today we just assume the thing is dead and reboot.

        My experience is the exact opposite. If anything 2-3 second button clicks are more common than ever today since everything has to make a roundtrip to a server somewhere whereas in the 90s 2-3s button click meant your computer was about to BSOD.

        Edit: Apple recently brought "2-3s to open tab" technology to Safari[1].

        [1] https://old.reddit.com/r/MacOS/comments/1nm534e/sluggish_saf...

      • By zymhan 2025-10-2122:341 reply

        Where on earth did you get this misconception?

        • By commandersaki 2025-10-2122:361 reply

          Lived experience? With swap system stays up but is unresponsive, without it is either responsive due to oom kill or completely down.

          • By GuinansEyebrows 2025-10-2122:531 reply

            in either case, what do you do? if you can't reach a box and it's otherwise safe to do so, you just reboot it. so is it just a matter of which situation occurs more often?

            • By commandersaki 2025-10-222:36

              The thing is you can survive memory exhaustion if the OOM killer can do its job, which it often can't when there's swap. I guess the topmost response to this thread talks about an earlyoom tool that might alleviate this, but I've never used it, and I don't find swap helpful anyway, so there's no need for me to go down this route.

      • By the8472 2025-10-220:14

        YMMV. Garbage-collected/pointer-chasing languages suffer more from swapping because they touch more of the heap all the time. AWS suffers more from swap because EBS is ridiculously slow, and even their instance-attached NVMe is capped compared to physical NVMe sticks.

      • By henryfjordan 2025-10-2122:081 reply

        Does HDD vs SSD matter at all these days? I can think of certain caching use-cases where swapping to an SSD might make sense, if the access patterns were "bursty" to certain keys in the cache

        • By winrid 2025-10-2122:121 reply

          It's still extremely slow and can cause very unpredictable performance. I have swap setup with swappiness=1 on some boxes, but I wouldn't generally recommend it.

          • By eru 2025-10-220:071 reply

            HDDs are much, much slower than SSD.

            If swapping to SSD is 'extremely slow', what's your term for swapping to HDD?

            • By baq 2025-10-225:01

              ‘Hard reboot’ (not OP)

      • By elwebmaster 2025-10-221:46

        What an ignorant and clueless comment. Guess what? Today's disks are NVMe drives, which are orders of magnitude faster than the 5400rpm HDDs of the 90s. Today's swap is 90s RAM.

      • By goodpoint 2025-10-228:18

        No, swap is absolutely fine if used correctly.

    • By shrubble 2025-10-2122:322 reply

      It's always a good idea to have a tiny amount of swap just in case. Like 1GB.

      • By dd_xplore 2025-10-2211:11

        I have also seen this on Android (tested on multiple devices: S23U, OnePlus 6 and 8): whenever I completely turned off swap, the phone would sometimes hang after a day or two of heavy usage! It felt unintuitive, since these devices have a lot of RAM and shouldn't need swap, but turning off swap has always degraded performance for me.

      • By akerl_ 2025-10-2123:262 reply

        Why?

        • By CGamesPlay 2025-10-224:491 reply

          Because some portion of the RAM used by your daemons isn't actually being accessed, and using that RAM to store file cache is actually a better use than storing idle memory. The old rule about "as much swap as main memory" definitely doesn't hold any more, but a few GB to store unneeded wired memory to dedicate more room to file cache is still useful.

          As a small example from a default Ubuntu installation, "unattended-upgrades" is holding 22MB of RSS, and will not impact system performance at all if it spends next week swapped out. Bigger examples can be found in monolithic services where you don't use some of the features but still have to wire them into RAM. You can page those inactive sections of the individual process into swap, and never notice.
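
          If you want to see which of your daemons fall into this bucket, the kernel reports per-process swap usage as VmSwap in /proc; a rough sketch (Linux only, values are in kB; `smem` gives a nicer report if installed):

          ```shell
          # Print swapped-out memory per process, largest first.
          # VmSwap has been in /proc/<pid>/status since kernel 2.6.34.
          for s in /proc/[0-9]*/status; do
            awk '/^Name:/ {n=$2} /^VmSwap:/ {print $2, n}' "$s" 2>/dev/null
          done | sort -rn | head
          ```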

          • By akerl_ 2025-10-2211:211 reply

            If my swap is on my disk, what good is storing file cache there, next to the files?

            • By CGamesPlay 2025-10-2212:28

              There is absolutely no point to doing that, which is why file cache is never swapped out. The swapped part is not-recently-used, wired memory from processes, so that there is more room for file cache.

        • By angch 2025-10-224:151 reply

          Like a runaway-truck ramp on a highway, it gives you room to handle failures more gently, so services don't just get outright killed. If you monitor your swap usage, any use of swap is an early warning that your services already need more memory.

          Gives you some time to upgrade, or tune services before it goes ka-boom.

          • By akerl_ 2025-10-224:22

            If your memory usage is creeping up, the way you'll find out that you need more memory is by monitoring memory usage via the same mechanisms you'd hypothetically use to monitor your swap usage.

            If your memory usage spikes suddenly, a nominal amount of swap isn't stopping anything from getting killed; you're at best buying yourself a few seconds, so unless you spend your time just staring at the server, it'll be dead anyways.

    • By tarruda 2025-10-2218:36

      > Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.

      As someone with zero ansible experience, can you elaborate on why a yaml list is better than a simple shell script with comments before each command?

    • By cmurf 2025-10-225:09

      Some workloads may do better with zswap. Cache is compressed, and pages evicted to disk based swap on an LRU basis.

      The case of swap thrashing sounds like a misbehaving program, which can maybe be tamed by oomd.

      System responsiveness though needs a complete resource control regime in place, that preserves minimum resources for certain critical processes. This is done with cgroupsv2. By establishing minimum resources, the kernel will limit resources for other processes. Sure, they will suffer. That’s the idea.
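
      In systemd terms that reservation looks roughly like this (a sketch; the unit name is hypothetical, and MemoryMin/MemoryLow map to cgroup v2's memory.min/memory.low):

      ```ini
      # /etc/systemd/system/critical-app.service.d/override.conf (hypothetical unit)
      [Service]
      # Hard guarantee: the kernel will not reclaim pages below this amount.
      MemoryMin=256M
      # Best-effort protection above the hard minimum.
      MemoryLow=512M
      ```

      Everything outside the protected units then takes the reclaim pressure first, which is exactly the "sure, they will suffer" trade-off.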

    • By statictype 2025-10-2123:06

      Thanks for this. We resorted to setting ram thresholds in systemd.

      Is earlyoom a better solution than that to prevent an erratic process from making an instance unresponsive?

    • By icetank 2025-10-229:51

      Yeah, I had a few servers lock up on me without any clear way to recover because some app was eating up RAM. I am OK with the server coming to a crawl as soon as swap has to be used, but at least it won't stop responding altogether.

    • By nurettin 2025-10-225:01

      Of course swap should be enabled. But oom killer has always allowed access to an otherwise unreachable system. The pause is there so you can impress your junior padawan who rushed to you in a hurry.

    • By bouncycastle 2025-10-228:47

      Sometimes swap usage accumulates even though there is plenty of RAM. The default is too "greedy", probably tuned with desktops rather than servers in mind.

      Therefore it is better to always tune "vm.swappiness" to 1 in /etc/sysctl.conf.
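
      For example, as a drop-in file (sysctl.d is preferred over editing /etc/sysctl.conf directly on most modern distros):

      ```ini
      # /etc/sysctl.d/99-swappiness.conf
      # 1 = swap only under real memory pressure; the default of 60
      # swaps proactively, which suits desktops more than servers.
      vm.swappiness = 1
      ```

      Apply it without a reboot with `sudo sysctl --system`.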

      You can also configure your web server / TCP stack buffers / file limits so they never allocate memory over the physical ram available. (eg. in nginx you can setup worker/connection limits and buffer sizes.)

    • By cactusplant7374 2025-10-2121:174 reply

      What's the performance hit from compressing ram?

      • By YouAreWRONGtoo 2025-10-2121:22

        It's sometimes not a hit, because CPUs have caches and memory bandwidth is the limiting factor.

      • By aidenn0 2025-10-2121:39

        Depends on the algorithm (and how much CPU is in use); if you have a spare CPU, the faster algorithms can more-or-less keep up with your memory bandwidth, making the overhead negligible.

        And of course the overhead is zero when you don't page-out to swap.

      • By waynesonfire 2025-10-2121:371 reply

        > zram, formerly called compcache, is a Linux kernel module for creating a compressed block device in RAM, i.e. a RAM disk with on-the-fly disk compression. The block device created with zram can then be used for swap or as a general-purpose RAM disk

          To clarify OP's representation of the tool: it compresses swap space, not resident RAM. Outside of niche use-cases, compressing swap has overall little utility.

        • By coppsilgold 2025-10-220:341 reply

          Incorrect, with zram you swap ram to compressed ram.

          It has the benefit of absorbing memory leaks (which for whatever reason compress really well) and compressing stale memory pages.

          Under actual memory pressure performance will degrade. But in many circumstances where your powerful CPU is not fully utilized you can 2x or even 3x your effective RAM (you can opt for zstd compression). zram also enables you to make the trade-off of picking a more powerful CPU for the express purpose of multiplying your RAM if the workload is compatible with the idea.

          PS: On laptops/workstations, zram will not interfere with an SSD swap partition if you need it for hibernation. Though it will almost never be used for anything else if you configure your zram to be 2x your system memory.
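
            As a rough illustration of why leaked and stale pages compress so well (they tend to be zero- or pattern-filled), compare a zero-filled buffer against random data:

            ```shell
            # A zeroed 1 MiB buffer collapses to roughly a kilobyte with gzip...
            head -c 1048576 /dev/zero | gzip -c | wc -c
            # ...while incompressible random data stays at about 1 MiB.
            head -c 1048576 /dev/urandom | gzip -c | wc -c
            ```

            Real heap pages won't compress that extremely, but that's the mechanism behind the 2x-3x effective-RAM figure.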

          • By masklinn 2025-10-229:10

            > Incorrect, with zram you swap ram to compressed ram.

            That reads like what they said? You reserve part of the RAM as a swap device, and memory is swapped from resident RAM to the swap ramdisk, as long as there’s space on there. And AFAIK linux will not move pages between swap devices because it doesn’t understand them beyond priority.

            Zswap actually seems strictly better in many cases (especially interactive computers / dev machines) as it can more flexibly grow / shrink, and can move pages between the compressed RAM cache and the disk swap.
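
            For reference, zswap is typically enabled via kernel parameters (a sketch; it needs a kernel with zswap compiled in and a regular disk swap device behind it — append to any options already present):

            ```ini
            # /etc/default/grub — then run update-grub (or grub2-mkconfig) and reboot
            GRUB_CMDLINE_LINUX="zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20"
            ```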

      • By speedgoose 2025-10-2121:212 reply

        I haven’t scientifically measured, but you don’t compress the whole ram. It is more about reserving a part of the ram to have very fast swap.

        For an algorithm using the whole memory, that’s a terrible idea.

        • By LargoLasskhyfv 2025-10-224:50

          >...but you don’t compress the whole ram.

          I do: https://postimg.cc/G8Gcp3zb (casualmeasurement.png)

        • By sokoloff 2025-10-2121:261 reply

          > It is more about reserving a part of the ram to have very fast swap.

          I understand all of those words, but none of the meaning. Why would I reserve RAM in order to put fast swap on it?

          • By vlovich123 2025-10-2121:36

            Swap to disk involves a relatively small pipe (usually 10x smaller than RAM). So instead of paying the cost to page out to disk immediately, you create compressed pages and store that in a dedicated RAM region for compressed swap.

            This has a number of benefits: in practice more “active” space is freed up, since unused pages are often compressible. Oftentimes that is freed application memory that is still reserved in the allocator's free space, especially if the allocator zeroes those pages in the background, but even active application memory compresses (e.g. if you have a browser, a lot of its memory is probably duplicated across processes). So for a usually invisible cost you free up more system RAM. Additionally, the overhead of the swap is typically not much more than a memcpy even with compression, which means you get dedup, and if you compressed erroneously (data still needed), paging it back in is relatively cheap.

            It also plays really well with disk swap, since the least frequently used pages of the compressed region can be flushed to disk, leaving more space in compressed RAM for additional pages. And since you're flushing and retrieving compressed pages to and from disk, you're reducing writes on an SSD (longevity) and reducing read/write volume (less overhead than naive direct swap to disk).

            Basically if you think of it as tiered memory, you’ve got registers, l1 cache, l2 cache, l3 cache, normal RAM, compressed swap RAM, disk swap - it’s an extra interim tier that makes the system more efficient.

    • By RobRivera 2025-10-225:351 reply

      To learn tricks like this what resource do you recommend I read? System administrators handbook? (Still on my TOREAD queue)

      • By eitland 2025-10-2210:41

        "The practice of System and Network administration" by Tom Limoncelli and Christine Hogan[1] was, together with "Principles of Network and Systems Administration" by Mark Burgess have probably been the books that influenced my approach to sysadmin the most. I still have them. Between them they covered at a high level (at least back when I was sysadmin before devops and Kubernets etc) anything and everything from

        - hardware, networks, monitoring, provisioning, server room locations in existing buildings, how to prepare server rooms

        - and so on up to hiring and firing sysadmins, salary negotiations[2], vendor negotiations and the first book even had a whole chapter dedicated to "Being happy"

        [1] There is a third author as well now, but those two were the ones that are on the cover of my book from 2005 and that I can remember

        [2] Has mostly worked well after I more or less left sysadmin behind as well

    • By dboreham 2025-10-2123:29

      Haven't used swap since 2010.

    • By 1vuio0pswjnm7 2025-10-233:36

      "You may want to enable earlyoom, so your whole server doesn't go down when a service goes bananas."

      Another option is to run BSD to avoid the Linux oom issue

      For example, I'm not using Hetzner but I run NetBSD entirely from memory (no disk, no swap) and it never "went down" when out of memory

      Looks like some people install FreeBSD and OpenBSD on Hetzner

      https://gist.github.com/c0m4r/142a0480de4258d5da94ce3a2380e8...

      https://computingforgeeks.com/how-to-install-freebsd-on-hetz...

      https://web.archive.org/web/20231211052837if_/https://www.ar...

      https://community.hetzner.com/tutorials/freebsd-openzfs-via-...

      https://www.souji-thenria.net/posts/openbsd_hetzner/

      https://web.archive.org/web/20220814124443if_/https://blog.v...

      https://www.blunix.com/blog/how-to-install-openbsd-on-hetzne...

      https://gist.github.com/ctsrc/9a72bc9a0229496aab5e4d3745af0b...

      If it is possible to boot Hetzner from a BSD install image using "Linux rescue mode"^1, it should also be possible to run NetBSD entirely from memory using a custom kernel

      Every user is different but this is how I prefer to run UNIX-like OS for personal, recreational use; I find it more resilient

      1.

      https://docs.hetzner.com/robot/dedicated-server/troubleshoot...

      https://blog.tericcabrel.com/hetzner-rescue-mode-unlock-serv...

      https://github.com/td512/rescue

      https://gainanov.pro/eng-blog/linux/hetzner-rescue-mode/

      https://docs.hetzner.com/cloud/servers/getting-started/rescu...

      ChromeOS has an interesting approach to Linux oom issues. Not sure it has ever been discussed on HN

      https://github.com/dct2012/chromeos-3.14/raw/chromeos-3.14/m...

    • By awesome_dude 2025-10-2122:381 reply

      How do you get swap on a VPS?

      • By justsomehnguy 2025-10-2123:322 reply

        Search "linux enable swap in a file"

            To enable a swap file in Linux, first create the swap file using a command like sudo dd if=/dev/zero of=/swapfile bs=1G count=1 for a 1GB file. Then, set it up with sudo mkswap /swapfile and activate it using sudo swapon /swapfile. To make it permanent, add /swapfile swap swap defaults 0 0 to your /etc/fstab file.

        • By collinmanderson 2025-10-220:002 reply

          Yes. I think you might also need to chmod 600 /swapfile. I do this on all my VPSes; it especially helps for small ones with only 1GB of RAM:

             fallocate -l 1G /swapfile
             chmod 600 /swapfile
             mkswap /swapfile
             swapon /swapfile
          
          Works really well with no problems that I've seen. Really helps give a bit more of a buffer before applications get killed. Like others have said, with SSD the performance hit isn't too bad.

          • By awesome_dude 2025-10-222:021 reply

            IME SWAP has been explicitly disabled by the VPS providers.

            Partly it's a money thing (they want to sell you RAM), partly it's so that the shared disk isn't getting thrashed by multiple VPS

            • By efreak 2025-10-228:02

              Get a better VPS then. OpenVZ and other container-based virtualization have limits; go for Xen or KVM instead (Xen offers paravirtualization as well, but I'm not sure how much it's actually used). With full virtualization (as implemented by Xen and KVM), the host can't stop the guest from using swap.

          • By collinmanderson 2025-10-2217:08

            I forgot to mention what the parent comment said, yes, need to put something like this in /etc/fstab:

            /swapfile swap swap sw 0 0

            or via ansible:

                mount:
                  src: "/swapfile"
                  name: "swap"
                  fstype: "swap"
                  opts: "sw"
                  passno: "0"
                  dump: "0"
                  state: present

        • By awesome_dude 2025-10-2123:332 reply

          Strongly suggest you try doing that on a VPS, then report back

          • By justsomehnguy 2025-10-2123:511 reply

            https://news.ycombinator.com/item?id=45007821

            And that was like... two years ago? 1GB of RAM and actually ~700MB usable before I found the proper magic incantations to really disable kdump.

            Also have used 1GB machines for literally years.

            Strongly suggest you shouldn't strongly suggest.

          • By ahepp 2025-10-2123:442 reply

            What do you think is going to happen? I tested it out on an ec2 instance just now and it seems to have worked as one would expect.

            • By Zekio 2025-10-228:572 reply

              well once you "need" that swap, it will be writing pages across the network due to the storage being external to the physical server, so the latency is terrible

              • By ahepp 2025-10-232:19

                Latency of swap is always terrible in comparison to RAM. RAM vs disk is already something ~1000x right? I've never characterized EBS vs trad ssd, but I would be surprised if it's more than 10x.

                I don't think using swap as "emergency RAM" makes a lot of sense in 2025. The arguments in favor of swap which I find convincing are about allowing the system to evict low use pages which otherwise would not be evictable.

              • By kassner 2025-10-229:38

                Put swap on the “instance store” disk, not EBS.

            • By awesome_dude 2025-10-220:352 reply

              EC2 != VPS

              • By cmpxchg8b 2025-10-225:28

                They both offer virtualized guests under a hypervisor host. EC2 does have more offload specialization hardware but for the most part they are functionally equivalent, unless I'm missing something...

              • By ahepp 2025-10-232:22

                What VPS do you think this would cause problems on? Why?

  • By jdprgm 2025-10-2121:448 reply

    Just saw Nate Berkopec, who does a lot of Rails performance work, posting about the same idea yesterday, saying Heroku is a 25-50x price-for-performance markup, which is insane. They clearly have zero interest in competing on price.

    It's a shame they don't just license their whole software stack at a reasonable price, with a model similar to Sidekiq's, and let you sort out actually decent hardware yourself. It's insane that Heroku has, if anything, gotten more expensive and worse compared to a decade ago, while similarly priced server hardware has gotten WAY better over that same decade. $50 for a dyno with 1 GB of RAM in 2025 is robbery. It's even worse considering that running a standard Rails app hasn't changed dramatically from a resources perspective and has, if anything, become more efficient. It's comical how many developers are shipping apps on Heroku for hundreds of dollars a month on machines with worse performance/resources than the MacBook they develop on.

    It's the standard playbook that damn near everything in society is following now though: just jacking prices and targeting the wealthiest, least price-sensitive percentiles instead of making good products at fair prices for the masses.

    • By condiment 2025-10-220:582 reply

      Jacked up prices isn't what is happening here. There is a psychological effect that Heroku and other cloud vendors are (wittingly or unwittingly) the beneficiary of. Customer expectations are anchored in the price they pay when they start using the service, and without deliberate effort, those expectations change in _linear_ fashion. Humans think in linear terms, while actual compute hardware improvements are exponential.

      Heroku's pricing has _remained the same_ for at least seven years, while hardware has improved exponentially. So when you look at their pricing and see a scam, what you're actually doing is comparing a 2025 anchor to a mid-2010s price that exists to retain revenue. At the big cloud vendors, they differentiate customers by adding obstacles to unlocking new hardware performance in the form of reservations and updated SKUs. There's deliberate customer action that needs to take place. Heroku doesn't appear to have much competition, so they keep their prices locked and we get to read an article like this whenever a new engineer discovers just how capable modern hardware is.

      • By rtpg 2025-10-223:223 reply

        I mean Heroku is also offering all of the ancillary stuff around their product. It's not literally "just" hosting. It's pretty nice to not have to manage a kube cluster, to get stuff like ephemeral QA envs and the like, etc....

        Heroku has obviously stagnated now but their stack is _very cool_ if you have a fairly simple system and still want all the nice parts of a more developed ops setup. It almost lets you get away with not having an ops team for quite a while. I don't know any other provider that offers low-effort "decent" ops (Fly seems to directionally want to be the new Heroku but is still missing a _lot_ in my book, though it also has a lot)

        • By maccard 2025-10-228:40

          I think it’s easy to forget how much you get with a modern setup like this, and how much work it is to maintain it. If you’re at a big corp, the team who maintains this stuff is larger than most mid corp’s engineering departments. For a solo person, it’s fine. But if you have 10-30 engineers, it’s a lot of work, and paying heroku $1000/mo is significantly cheaper than having even a junior engineer spend 40% of their time on keeping up.

        • By TheTaytay 2025-10-2211:201 reply

          Well said. I’ve been expecting an obvious spiritual successor for a long time. They have a surprising number of features compared to most platforms. Their databases/redis and features like forking were quite good (as long as you were super big), logplex/log shipping, auto scale, add-on ecosystem, promotion pipelines, container support if needed (good build packs/git support if you don’t), good CLI or API, OS/patch management, hobby plans and enterprise plans, and more. And on top of all of that, the user/projects system is something mortals can wrap their heads around. They found the sweet spot between raw servers and the complexity quagmire of the mega-clouds a surprisingly long time ago.

          There are some folks with good offerings (Fly, Railway, etc), but the feature set of Heroku is deeper, and more important for production apps, than most people realize. They aren’t a good place for hobbyists anymore though. I agree with that.

          • By cpursley 2025-10-2310:59

            Is it deeper than render.com? Can heroku run static sites or distributed Elixir/Erlang? Personally I’m on fly as the pricing is even better and I prefer the UX, but render is basically what heroku should be in 2025.

        • By 91bananas 2025-10-2219:28

          Heroku made an application I worked on possible. I don't think we had the team to maintain the application stack without something like it. It enabled the company to exist long enough to get the magical stock exit. I'm forever grateful for it existing.

      • By sofixa 2025-10-229:22

        > other cloud vendors

        To be fair, AWS quite proudly talk about all the times they've lowered prices on existing services, or have introduced new generations that are cheaper (e.g. their Graviton EC2 instances).

    • By czhu12 2025-10-2121:474 reply

      > It's a shame they don't just license all their software stack at a reasonable price with a similar model like Sidekiq and let you sort out actually decent hardware

      We built and open sourced https://canine.sh for exactly that reason. There’s no reason PaaS providers should be charging such a giant markup over already marked up cloud providers.

      • By altairprime 2025-10-2123:57

        Heroku is pricing for “# of FTE headcount that can be terminated for switching to Heroku”; in that sense, this article’s $3000/mo bill is well below 1.0 FTE/month at U.S. pricing, so it’s not interesting to Heroku to address. I’m not defending this pricing lens, but it’s why their pricing is so high: if you aren’t switching to Heroku to lay off at least 1-2 FTE of salary per billing period, or using Heroku to replace a competitor’s equivalent replacement thereof, Heroku’s value assigned to you as a customer is net negative and they’d rather you went elsewhere. They can’t slam the door shut on the small fry, or else the unicorns would start up elsewhere, but they can set the pricing in FTE-terms and VCs will pay it for their moonshots without breaking a sweat.

      • By nicoburns 2025-10-2122:022 reply

        This looks decent for what it is. I feel like there are umpteen solutions for easy self-hosted compute (and tbh even a plain Linux VM isn't too bad to manage). The main reason to use a PAAS provider is a managed database with built-in backups.

        • By czhu12 2025-10-220:51

          It's the flexibility and power of Kubernetes that I think is incredible. Scaling to multiple nodes is trivial, and if your entire data plane is blown away, recovery is trivial.

          You can also self host almost any open source service without any fuss, and perform internal networking with telepresence. (For example, if you want to run an internal metabase that is not available on public internet, you can just run `telepresence connect`, and then visit the private instance at metabase.svc.cluster.local).

          Canine tries to leverage all the best practices and pre-existing tools that are already out there.

          But agreed, business critical databases probably shouldn't belong on Kubernetes.

        • By gregsadetsky 2025-10-2122:161 reply

          Fully agreed - our recommendation is to /not/ run your prod Postgres db yourself, but use one of the many great dedicated options out there - Crunchy Data, Neon, Supabase, or AWS RDS..!

          • By bcrosby95 2025-10-2122:231 reply

            It really depends upon how much data you have. If it's enough to just dump, then go crazy. If it isn't, it's a bit more trouble.

            Regardless, you're going to have a much easier time developing your app if your datastore access latency is submillisecond rather than tens of milliseconds.

            So that extra trouble might be worth it...

            • By bragr 2025-10-225:581 reply

              You're running at a pretty small scale if running your database locally for sub-millisecond latency is practical. The database solution provided by the DBA team in a data center is going to have about the same latency as RDS or equivalent. Typical intra-datacenter network latency alone is going to be 1-3ms.

              • By bcrosby95 2025-10-2215:22

                They were talking about using things like Supabase, not just RDS.

                Also, "small scale" means different things to different people. Given the full topic at hand, I would call it "nano scale". Depending upon your specific schema, you can serve tens of thousands of queries per second with a single server on modern hardware, which is way more than enough for the vast majority of workloads.

      • By odie5533 2025-10-2122:292 reply

        Does it run Sentry, so I can send logs, metrics, and traces to it? And are the queries fast?

      • By sreekanth850 2025-10-2211:16

        Canine looks cool man.

    • By layoric 2025-10-2122:585 reply

      > $50 for a dyno with 1 GB of ram in 2025 is robbery

      AWS isn't much better honestly.. $50/month gets you an m7a.medium which is 1 vCPU (not core) and 4GB of RAM. Yes that's more memory but any wonder why AWS is making money hand-over-fist..

      • By selcuka 2025-10-221:34

        Not sure if it's an apples-to-apples comparison with Heroku's $50 Standard-2X dyno, but an Amazon Lightsail instance with 1GB of RAM and 2 vCPUs is $7/month.

      • By NohatCoder 2025-10-2212:50

        AWS certainly also does daylight robbery. In the AWS model the normal virtual servers are overpriced, but not super overpriced.

        Where they get you is all the ancillary shit, you buy some database/backup/storage/managed service/whatever, and it is priced in dollars per boogaloo, you also have to pay water tax on top, and of course if you use more than the provisioned amount of hafnias the excess ones cost 10x as much.

        Most customers have no idea how little compute they are actually buying with those services.

      • By bearjaws 2025-10-220:20

        That is assuming you need that 1 core 24/7; you can get 2 cores / 8 GB for $43, which will most likely fit 90% of workloads (steady traffic with spikes, or a 9-5 cadence).

        If you reserve that instance you can get it for 40% cheaper, or get 4 cores instead.

        Yes, it's more expensive than OVH, but you also get everything AWS has to offer.

      • By troyvit 2025-10-224:42

        This, plus as a backup plan going from Heroku to AWS wouldn't necessarily solve the problem, at least with our infra. When us-east-1 went down this week so did Heroku for us.

      • By electroly 2025-10-220:191 reply

        m7a doesn't use HyperThreading; 1 vCPU is a full dedicated core.

        To compare to Heroku's standard dynos (which are shared hosting) you want the t3a family which is also shared, and much cheaper.

        • By layoric 2025-10-2310:071 reply

          I must be confused, my understanding was m7a was 4th generation Epyc (Genoa, Bergamo and Siena) which I believe all have 2 threads per core no?

          • By electroly 2025-10-2316:43

            You're not confused--AWS either gets custom chips without it, or they disable the SMT. I'm not sure which. Here's where AWS talks about it: https://aws.amazon.com/ec2/instance-types/m7a/

            > One of the major differences between M7a instances and the previous generations of instances, such as M6a instances, is their vCPU to physical processor core mapping. Every vCPU on a M7a instance is a physical CPU core. This means there is no Simultaneous Multi-Threading (SMT). By contrast, every vCPU on prior generations such as M6a instances is a thread of a CPU core.

            My wild guess is they're disabling it. For Intel instance families they loudly praise their custom Intel processors, but this page does not contain the word "custom" anywhere.

    • By herval 2025-10-222:311 reply

      Heroku is the Vercel of Rails: people will pay a fortune for it simply because it works. This has always been their business model, so it’s not really a new development. There’s little competition since the demand isn’t explosive and the margin is thin, so you end up with stagnation

      • By echelon 2025-10-222:361 reply

        Vercel should have a ton of competition on account of the frontend space being much larger than Heroku's market.

        Netlify sets the same prices.

        Just throw it into a cloud bucket from CI and be done with it.

        • By kazanz 2025-10-223:15

          You'd be surprised. There are very few because it takes a lot more work to build reliable systems across mid-market cloud providers (flaky APIs, missing functionality, etc.). Plus you need to know the idiosyncrasies of all the various frameworks and build systems.

          That said, they are emerging. I'm actually working on a drop-in Vercel competitor at https://www.sherpa.sh. We're 70% lower cost by running on EU based CDN and dedicated servers (Hetzner, etc). But we had to build the relationships to solve all the above challenges first.

    • By Onavo 2025-10-223:24

      I am not sure what's there to license. The hard and expensive part is in the labor to keep everything running. You are paying to make DevSecOps Somebody Else's Problem. You are paying for A Solution. You are not paying for software. There are plenty of Heroku clones mentioned in this thread.

    • By tonyhart7 2025-10-222:23

      Yeah, I chose Railway for my PaaS hosting for this reason

    • By __mharrison__ 2025-10-224:12

      Now I know why the teaching platform I use is trying to kick me off...

      Every other time I login to the admin site I get a Heroku error.

    • By teiferer 2025-10-2121:528 reply

      It's insane how much my local shop charges for an oil change, I can do it much cheaper myself!

      It's insane how much a restaurant charges for a decent steak, I can do it much cheaper myself!

      ...!

      • By jdprgm 2025-10-2122:003 reply

        I know you mean this sarcastically, but I actually 100% agree, particularly on the steak point. Especially with beef prices at all-time record highs and restaurant inflation being out of control post-pandemic. It takes so much of the enjoyment out of things for me if I feel I'm being ripped off left and right.

        • By swat535 2025-10-221:571 reply

          What you're missing here is that companies happily pay the premium to Heroku because it lets them focus on building the product and generating business rather than wasting precious engineering time managing infra.

          By the time the product is a success and reaches a scale where it becomes cost prohibitive, they have enough resources to expand or migrate away anyway.

          I suppose for solo devs it might be cheaper to set up a box for fun, but even then, I would argue that not everyone enjoys doing devops, and some prefer spending their time elsewhere.

          • By xp84 2025-10-2215:47

            Maybe what bothers people so much is the fact that when Heroku first came out, it was much harder to do what that platform does. In the past 20 years or so, there has been a ton of improvement in the tools available. What could've taken you three full-time employees can probably be done with 20% of someone's time after the initial setup, which also isn't that hard. So it seems like instead of charging 50x the cost of the servers themselves, maybe Heroku could be charging 10x. But it seems like Salesforce probably just bought Heroku as a cash-generating machine. They probably figure they have a lot more to lose in cutting the bills of their old customers who don't want to migrate anything than they could gain from attracting new customers who aren't already locked in.

            Honestly, reading these threads it sounds to me like a lot of people are still launching new projects on Heroku. I wouldn’t have guessed that was true before reading this.

        • By grebc 2025-10-2122:471 reply

          Where’s the beef inflation? Local butcher has prime rib fillet $30 AUD/KG cut to your liking.

          • By degamad 2025-10-222:12

            My understanding is that here in Oz we get access to cheaper beef than the rest of the world...

        • By rascul 2025-10-2122:541 reply

          One also doesn't get shamed by the steak snobs for having different steak preferences.

          • By waynesonfire 2025-10-2123:27

            Or having to cut the steak with a serrated "steak" knife that tears the meat.

      • By andrewstuart2 2025-10-2122:021 reply

        This argument doesn't work with such commoditized software. It's more like comparing an oil change for $100 plus an hour of research and a short drive against a convenient oil change right next door for $2,500.

        • By teiferer 2025-10-2122:511 reply

          Nobody is forced to go to the expensive one. If they are still in business then enough people apparently consider it a reasonable deal. You might not, but others do. Whether I'm being downvoted or not.

          • By Dylan16807 2025-10-223:19

            > If they are still in business then enough people apparently consider it a reasonable deal.

            Or they didn't check. A business still existing is pretty weak evidence that the pricing is reasonable.

      • By xmprt 2025-10-2122:032 reply

        Not the best comment, but I agree with the sentiment. Far too often, people complain about price when there are competitors or other cheaper options that could be used with a little more effort. If people cared so much, they should just use the alternative.

        No one gets hurt if someone else chooses to waste their money on Heroku, so why are people complaining? Of course it applies in cases where there aren't a lot of competitors, but there are literally hundreds of different options for deploying applications, and at least a dozen of them are just as reliable as Heroku and cheaper.

        • By __mharrison__ 2025-10-224:17

          I'm hurt because a service I'm using is based on Heroku. I'm on the "unlimited" plan but they have backtracked on that and now say I'm too big for them...

        • By strken 2025-10-2122:531 reply

          The problem with Heroku's pricing is that it's set high enough that I no longer use it and neither does anyone else I know. I suspect they either pivoted to a different target market than me, which would be inconvenient but I'd be okay with it, or killed off their own growth potential by trying to extract revenue, which I would find sad.

          • By xp84 2025-10-2215:50

            I’m pretty sure their target market is people who have already built something kind of complex on there and don’t have the time/money budget to do a big migration. In that way, they know their customers are stuck but can afford the current prices, so keeping pricing static or gradually increasing makes sense.

      • By g8oz 2025-10-2122:04

        The price value proposition here seems similar to that of a stadium hot dog.

      • By raincole 2025-10-2122:05

        It's just trendy to bash cloud and praise on-premises in 2025. In a few years that will turn around. Then in another few years it will turn around again.

      • By artifaxx 2025-10-2122:52

        Indeed, there are levels to the asymmetry though. Oil change might be ~5x cheaper vs the 20-50x claimed for Heroku...

      • By landdate 2025-10-220:021 reply

        > for an oil change, I can do it much cheaper myself

        Really? I mean oil changes are pretty cheap. You can get an oil change at walmart for like 40 bucks.

        • By RedShift1 2025-10-221:41

          And you get the stripped out bolt hole for free too.

  • By tempest_ 2025-10-2120:5411 reply

    The cloud has made people forget how far you can get with a single machine.

    Hosting staging envs in pricey cloud envs seems crazy to me but I understand why you would want to because modern clouds can have a lot of moving parts.

    • By jeroenhd 2025-10-2121:112 reply

      Teaching a whole bunch of developers some cloud basics and having a few cloud people around is relatively cheap for quite a while. Plus, having test/staging/prod on similar configurations will help catch mistakes earlier. None of that "localstack runs just fine but it turns out Amazon SES isn't available in region antarctica-east-1". Then, eventually, you pay a couple people's wages extra in cloud bills, and leaving the cloud becomes profitable.

      Cloud isn't worth it until suddenly it is because you can't deploy your own servers fast enough, and then it's worth it until it exceeds the price of a solid infrastructure team and hardware. There's a curve to how much you're saving by throwing everything in the cloud.

      • By nine_k 2025-10-2121:28

        Deploying to your private cloud requires basically the same skills: containers, k8s or whatnot, S3, etc. Operating a large DB on bare metal is different from using a managed DB like Aurora, but for developers, the difference is hardly visible.

      • By matt-p 2025-10-2123:211 reply

          RDS/managed databases are extremely nice, I will admit; otherwise I agree. Similarly with S3: if you're going to do object storage, then running MinIO or whatever locally is probably not cheaper overall than R2 or similar.

        • By objektif 2025-10-2213:35

          I would never ever go back to hosting my own DB. It is just a maintenance nightmare.

    • By rikafurude21 2025-10-2121:007 reply

      The cloud has made people afraid of Linux servers. The markup is essentially the price businesses pay for developer insecurity. The irony is that self-hosting is relatively simple, and a lot of fun. Personally, I never got the appeal of Heroku, Vercel and similar, because there's nothing better than spinning up a server and setting it up from scratch. Every developer should try it.

      • By jampekka 2025-10-2121:166 reply

        > The irony is that self hosting is relatively simple, and alot of fun. Personally never got the appeal of Heroku, Vercel and similar, because theres nothing better than spinning up a server and setting it up from scratch.

        It's fun the first time, but becomes an annoying faff when it has to be repeated constantly.

        In Heroku, Vercel and similar you git push and you're running. On a Linux server you set up the OS, server authentication, the application itself, the systemd services, the reverse proxy, code deployment, SSL key management, monitoring, etc.
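        For the systemd step, a minimal unit file goes a long way. This is only a sketch; the service name, user, and paths are hypothetical:

        ```ini
        # /etc/systemd/system/myapp.service
        [Unit]
        Description=My web application
        After=network-online.target
        Wants=network-online.target

        [Service]
        User=myapp
        WorkingDirectory=/srv/myapp
        ExecStart=/srv/myapp/bin/server --port 8000
        Restart=on-failure

        [Install]
        WantedBy=multi-user.target
        ```

        Then `systemctl enable --now myapp` starts it and keeps it running across reboots.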

        I still do prefer a linux server due to the flexibility, but the UX could be a lot better.

        • By lelanthran 2025-10-228:561 reply

          > It's fun the first time, but becomes an annoying faff when it has to be repeated constantly.

          I have to ask - do scripts not work for you?

          When I had to do this back in 2005 it was automated with 3 main steps:

          1. A preseeded (IIRC) Debian installation disc (all the packages I needed were installed at install time), and

          2. Which included a first-boot bash script that retrieved pre-compiled binaries from our internal ftp site, and

          3. A final script that applied changes to the default config files and ran a small test to ensure everything started.

          Zero human interaction after powering a machine on with the disc in the drive.

          These days I would do it even better (systemd configs, Nix perhaps; text files such as systemd units can be retrieved automagically after boot, etc.).

          • By chickensong 2025-10-2220:071 reply

            Your example only covers basic provisioning. The additional items mentioned by the parent comment can be a significant investment, both initially and over time.

            • By lelanthran 2025-10-234:521 reply

              > Your example only covers basic provisioning.

              No. It covered setting up all the applications needed as well (nginx, monitoring agent, etc), installing keys/credentials.

              What did parent mention that can't be covered by the approach I used?

              • By chickensong 2025-10-2310:571 reply

                I guess I read your comment as OS, the app, and configs, while the parent mentions auxiliary items, ending with "etc etc". The point is, all the extra things that aren't the app take knowledge and resources to set up and maintain.

                Sure you can script all the things into 3 steps, just like you can draw an owl with a couple circles.

                • By lelanthran 2025-10-2315:391 reply

                  > The point is, all the extra things that aren't the app take knowledge and resources to set up and maintain.

                  Maintain, maybe. The setup for everything extra can be scripted, including a few packages I had to build from source myself because there was no binary download.

                  • By chickensong 2025-10-2318:53

                    I hear you, and I'm passionate about automating all the things. I just wanted to add some perspective to the discussion to set expectations for less experienced people who might be considering a switch from PaaS to DIY.

                    I'm not a PaaS user, and I encourage people to avoid vendor lock-in and be in control of their own destiny. It takes work though, and you need to sweat the details if you care about reliability and security, which continue to be problem areas for more DIY solutions.

                    If people aren't willing to put in the work, I'd rather they stick to the managed services so they don't contribute to eroding the already abysmal trust of the industry at large.

        • By teekert 2025-10-2121:431 reply

          I use NixOS and a lot of it is in a single file. I just saw some Ansible coming by here, and although I have no experience with it, it looked a lot simpler than Nix (for someone from the old Linux world, like me, even though Nix is, looking through your eyelashes, just a pile of key/value pairs).

          • By eru 2025-10-220:15

            Nix is great, but it still requires some training and expertise.

            And the overlap between what Nix does and what the 'cloud' does for you is only partial. (Eg it can still make sense to use Nix in the cloud.)

        • By bigstrat2003 2025-10-220:272 reply

          > It's fun the first time, but becomes an annoying faff when it has to be repeated constantly.

          Certainly true, but there are a whole lot of tools to automate those operations so that you aren't doing them constantly.

          • By chickensong 2025-10-2220:14

            Even with automation, it can be a full-time job just to keep pace with the rate of change, never mind the initial development which can be non-trivial.

          • By liqilin1567 2025-10-223:361 reply

            Mind sharing these tools and what each one does?

            • By c0balt 2025-10-224:412 reply

              Ansible, Salt and Puppet are mostly industry standard. Those tools are commonly referred to as configuration management (systems).

              Ansible basically automates the workflow of: log in to host X, perform step Y (if Z is not already present). It has broad support for distros and OSes. It's mostly imperative and can be used like a glorified task runner.

              Salt lets you describe the state of a system mostly declaratively. It comes with an agent/central-host system for distributing this configuration from the central host to the minions (push).

              Puppet is also declarative and also comes with an agent/central host system but uses a pull based approach.

              Specialized/exotic options are also available, like mgmt or NixOS.
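              As a concrete taste of the declarative "ensure X is present" style, a minimal Ansible playbook might look like this (the host group and package are placeholders):

              ```yaml
              # site.yml — run with: ansible-playbook -i inventory site.yml
              - hosts: webservers
                become: true
                tasks:
                  - name: Ensure nginx is installed
                    ansible.builtin.package:
                      name: nginx
                      state: present

                  - name: Ensure nginx is enabled and running
                    ansible.builtin.service:
                      name: nginx
                      state: started
                      enabled: true
              ```

              Running it twice is safe: tasks that are already satisfied report "ok" instead of making changes, which is the idempotency these tools are built around.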

              • By liqilin1567 2025-10-225:213 reply

                Thanks, this is very detailed! Could you share some real-world use cases for these tools?

                Actually I am looking for tools to automate DevOps and security for self-hosting

                • By indigo945 2025-10-227:19

                  Salt and Puppet are useful for managing a fleet of servers running various applications, especially when you need to scale those applications horizontally or want geo-distribution.

                  Ansible can also do that, on top of literally anything else you could want - network configuration, infrastructure automation, deployment pipelines, migrations, anything. As always, that flexibility can be a blessing or a curse, but I think Ansible manages it well because it's so KISS.

                  RedHat's commercial Ansible Automation Platform gives you more power for when you need it, but you don't need it starting out.

                • By c0balt 2025-10-228:59

                  The other commenter already answered the usecase question, for self-hosting you will likely find ansible the easiest entrypoint.

                  It is in general the simplest of these systems to get started with and you should be able to incrementally adopt it. There is also a plethora of free online resources available for it.

                • By comprev 2025-10-228:49

                  A combination of HashiCorp Packer and Ansible means I can "publish" a VM ready-to-rock image to a public cloud provider gallery and use it to run a VM in said cloud.

                  Ansible-Lockdown is another excellent example of how Ansible can be used to harden servers via automation.

              • By chickensong 2025-10-2220:26

                > Puppet is also declarative and also comes with an agent/central host system but uses a pull based approach.

                The person you're replying to mentioned a self-hosting use case, so this probably isn't relevant for that, but Ansible can also be configured for a pull approach, which is useful for scaling.
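                A pull setup can be as simple as a cron entry running ansible-pull against a git repo on each host (the repo URL and playbook name here are hypothetical):

                ```
                # /etc/cron.d/ansible-pull — each host applies its own config every 30 minutes
                */30 * * * * root ansible-pull -U https://example.com/infra.git local.yml >> /var/log/ansible-pull.log 2>&1
                ```

                Each host clones the repo and applies the playbook to itself, so there is no central push server to scale.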

        • By tonyhart7 2025-10-222:31

          "The irony is that self hosting is relatively simple"

          Cloud is easy until it's not. For 90% of us, maybe we don't need multi-region with hot and cold storage.

          For those that need it, it's necessary.

        • By YouAreWRONGtoo 2025-10-2121:261 reply

          [flagged]

          • By dang 2025-10-2122:341 reply

            Can you please edit out swipes, putdowns, name-calling, etc., from your HN posts? It's not what this site is for, and destroys what it is for.

            This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

            • By YouAreWRONGtoo 2025-10-2221:291 reply

              I can't help it that humanity is so stupid.

              • By dang 2025-10-233:401 reply

                That's true, but you can stop posting to HN from that place.

                Edit: I feel like I should give you a more fulsome response, so here goes:

                I understand the frustration. I feel it too, even apart from HN making me feel it as part of my job. But I've had to learn some lessons about this, such as:

                1. It doesn't help to assume the position of the-one-who-is-not-stupid. Doing that is supercilious and just means you'll contribute to making things worse.

                2. Far better is to accept that, as one is human, one shares in all the qualities of being human, including a full complement of stupidity.

                3. I forget the third lesson!

                • By YouAreWRONGtoo 2025-10-2314:501 reply

                  Regarding 2., I am not stupid; I might be ignorant in some fields, but do you see me arguing against a world expert in some field I know nothing about?

                  Stupid people ruin everything.

                  • By dang 2025-10-2321:42

                    Ok, but please do stop posting to HN about how stupid others are. Being smarter is your burden to bear.

        • By tbrownaw 2025-10-2121:361 reply

          And all of that takes, what, a week? As a one time thing?

          • By jcynix 2025-10-2122:211 reply

            Takes less than a day, because most of the stuff is scriptable. And for a simple compute node setup at Hetzner (i.e. no bare metal, just a VM) it takes me less than half an hour.

            • By tbrownaw 2025-10-221:32

              But if you're that familiar with it, the overpriced turnkey stuff wouldn't look so tempting in the first place.

      • By daemonologist 2025-10-2121:051 reply

        I dunno, the cloud has mostly made me afraid of the cloud. You can bury yourself in towering complexity so easily on AWS. (The highly managed stuff like Vercel I don't have much experience with, so maybe it's different.)

        • By ygouzerh 2025-10-2121:13

          I'd recommend trying GCP or Azure; the complexity is lower there! AWS is great for big corporations that need a lot of Lego pieces for their own custom setup. By contrast, GCP and Azure solutions are often more bundled.

      • By tempest_ 2025-10-2121:141 reply

        It is way more than that though.

        It offloads things like:

        - Power usage
        - Colo costs
        - Networking (a big one)
        - Storage (SSD wear / HDD pools)
        - etc.

        It is a long list, but what it doesn't allow you to do is make trade-offs like spending way less but accepting downtime if your switch dies, etc.

        For a staging env these are things you might want to do.

        • By brandon272 2025-10-220:27

          "Self hosting" may actually be referring not to hosting your own on-prem hardware, but to renting bare metal in which case the concerns around power usage, networking, etc. are offloaded to the provider.

      • By fragmede 2025-10-2121:281 reply

        Never got the appeal of having someone else do something for you, and giving them money, in exchange for goods and services? Vercel is easy. You pay them to make it easy. When you're just getting started, you start on easy mode before you jump into the deep end of the pool. Everybody's got a different cup of tea, and some like it hot and others like it cold.

        • By rikafurude21 2025-10-2121:312 reply

          Sure I love having someone else do work for me and paying them for that, the question is if that work is worth a 50x markup.

          • By alwa 2025-10-2122:022 reply

            Flour, salt, and water are exceedingly cheap. I have to imagine the loaf of bread I buy from my baker reflects considerably more than a 50x markup compared to baking my own.

            It’s a lot cheaper than me learning to bake as well as he does—not to mention dedicating the time every day to get my daily bread—and I’ll never need bread on the kind of scale that would make it worth my time to do so.

            • By mediaman 2025-10-2122:581 reply

              Bread is a great example! You can buy a loaf for $3-4. It is not a 50x markup. Like growing your own veggies, baking bread is for fun, not for economics.

              But the cloud is different. None of the financial scale benefits are passed on to you. You save serious money running it in-house. The arguments around scale have no validity for the vast, vast majority of use cases.

              Vercel isn't selling bread: they're selling a fancy steak dinner, and yes, you can make steak at home for much less, and if you eat fancy steak dinners at fancy restaurants every night you're going to go broke.

              So the key is to understand whether your vendors are selling you bread, or a fancy steak dinner, and to not make the mistake of getting the two confused.

              • By alwa 2025-10-221:141 reply

                That’s a tremendously clarifying framework, and it makes a lot of sense to me. Thank you.

                I wonder, though—at the risk of overextending the metaphor—what if I don’t have a kitchen, but I need the lunch meeting to be fed? Wouldn’t (relatively expensive) catering routinely make sense? And isn’t the difference between having steak catered and having sandwiches catered relatively small compared to the alternative of building out a kitchen?

                What if my business is not meaningfully technical: I’ll set up applications to support our primary function, and they might even be essential to the meat of our work. But essential in the same way water and power are: we only notice it when it’s screwed up. Day-to-day, our operational competency is in dispatching vehicles or making sandwiches or something. If we hired somebody with the expertise to maintain things, they’d sit idle—or need a retainer commensurate with what the Vercels and Herokus of the world are charging. We only need to think about the IT stuff when it breaks—and maybe to the extent that, when we expect a spike, we can click one button to have twice as much “application.”

                In that case, isn’t it conceivable that it could be worth the premium to buy our way out of managing some portion of the lower levels of the stack?

                • By thequux 2025-10-226:32

                  In that case, you don't want cloud; you want an MSP, whose core competence is running those IT services. They, in turn, have the skills to colo a rack at a DC or to manage rented servers, amortized across a number of clients.

                  In practice, there are two situations where cloud makes sense:

                  1. You infrequently need to handle traffic that unpredictably bursts to a large multiple of your baseline. (Consider: you can over-provision your baseline infrastructure by an order of magnitude before you reach cloud costs.)

                  2. Your organization is dysfunctional in a way that makes provisioning resources extremely difficult, but cloud can provide an end run around that dysfunction.

                  Note that both situations are quite rare. Most industries that handle that sort of large burst are very predictable: event managers know when a client will be large and provision ticket-sales infra accordingly, e-commerce knows when the big sale days will be, and so on. In the second case, whatever organizational dysfunction made the cloud appealing will likely wrap itself around the cloud initiative as well.

            • By eru 2025-10-220:203 reply

              Please do yourself a flavour and check the price of flour.

              Water is cheap, yes. Salt isn't all that cheap, but you only need a little bit.

              > [...] and I’ll never need bread on the kind of scale that would make it worth my time to do so.

              If you knead bread by hand, it's a very small-scale affair. Your physique and time couldn't afford you large-scale bread making. You'd need a big special mixer and a big special oven etc. for that. And you'd probably want a temperature- and moisture-controlled room just for letting your dough rise.

              • By alwa 2025-10-220:343 reply

                $16 for a 50 pound sack right now

                https://postmates.com/store/restaurant-depot-4538-s-sheridan...

                I blush to admit that I do from time to time pay $21 for a single sourdough loaf. It’s exquisite, it’s vastly superior to anything I could make myself (or anything I’ve found others doing). So I’m happy to pay the extreme premium to keep the guy in business and maintain my reliable access to it.

                It weighs a couple of pounds, though I’m not clear how the water weight factors in to the final weight of a loaf. And I’m sure that flour is fancier than this one. I take your point—I don’t belong in the bread industry :)

                • By eru 2025-10-224:13

                  Well, in your case, you are mostly paying for the guy's labour, I presume.

                  (Similarly to how you pay Amazon or Google etc not just for the raw cloud resources, but for the system they provide.)

                  I grew up in Germany, but now live in Singapore. What's sold as 'good' sourdough bread here would make you fail your baker's training in Germany: huge holes in the dough and other defects. How am I supposed to spread butter over this? And Mischbrot, a mixture of rye and wheat, is almost impossible to find.

                  So we make our own. The goal is mostly to replicate the everyday bread you can buy in Germany for cheap, not to hit any artisanal highs. (Though they are massively better IMHO than anything sold as artisanal here.)

                  Interestingly, the German breads we are talking about are mostly factory made. Factory bread can be good, if that's what customers demand.

                  See https://en.wikipedia.org/wiki/Mischbrot

                  Going on a slight tangent: with tropical heat and humidity, non-sourdough bread goes stale and moldy almost immediately. Sourdough bread can last for several days or even a week without going moldy in a paper bag on the kitchen counter outside the fridge, depending on how sour you go. If you are willing to toast your bread, going stale during that time isn't much of an issue either.

                  (Going dry is not much of an issue with any bread here--- sourdough or not, because it's so humid.)

                • By hwntw 2025-10-2210:19

                  Where do you spend $21 for a loaf of sourdough?! My local baker sells a delicious loaf of artisanal sourdough for £4 here.

                  Of course, the difference between sourdough and anything else is astonishing, I just can't comprehend someone charging $21 for it!

                • By chickensong 2025-10-2220:39

                  You can make amazing sourdough at home in a cast iron pot. It requires time, that's the nature of sourdough, but it's not hard once you learn how. I guarantee you could make bread as good or better for a dollar of ingredients!

              • By jandrewrogers 2025-10-224:211 reply

                > Salt isn't all that cheap

                Wait, what? Salt is literally one of the cheapest of all materials per kilogram that exists in all contexts, including non-food contexts. The cost is almost purely transportation from the point of production. High quality salt is well under a dollar a pound. I am currently using salt that I bought 500g for 0.29 euro. You can get similar in the US (slightly more expensive).

                This was a meme among chemical engineers. Some people complain in reviews on Amazon that the salt they buy is cut with other chemicals that make it less salty. The reality is that there is literally nothing you could cut it with that is cheaper than salt.

                • By eru 2025-10-224:42

                  Well, salt is more expensive than water.

                  But sure, it's cheap otherwise. Point granted.

                  One way or another, salt is not a major driver of cost in bread, because there's relatively little salt in bread. (If there's 1kg of flour, you might have 20g of salt.)

              • By tonyhart7 2025-10-222:382 reply

                bread ingredients are cheap but the equipment that you need to do baking is not

                also skills, some people just bake better than others

                • By eru 2025-10-224:431 reply

                  > bread ingredients are cheap but the equipment that you need to do baking is not

                  It's actually not too bad, if you look at the capital cost of a bread factory amortised over each loaf of bread.

                  The equipment is comparatively more expensive for a home baker who only bakes perhaps two loaves a week.

                  • By chickensong 2025-10-2220:52

                    A comment in an adjacent thread above mentioned paying $21 per loaf! That could pay for the equipment needed to bake a couple loaves a week. You really don't need much besides a normal oven.

                • By chickensong 2025-10-2220:48

                  Unless you're talking about the oven, the equipment isn't expensive.

                  Some skills are required, but it's really not that hard once you learn the technique and have done it a few times.

          • By fragmede 2025-10-2121:341 reply

            Yeah, but then we're just haggling. If you know how to change the belt on your car and already have the tools, it's different from when you're stranded with no tools and no garage and no belt.

            • By rikafurude21 2025-10-2121:383 reply

              If you're a mechanic you're supposed to know how to change the belt on your car. It would be insane if you write code and work with computers for a living but you don't know how to set up a web server.

              • By auggierose 2025-10-2122:08

                I am pretty sure I know much more about code than you do, and at the same time you probably know much more about web servers and sysadmin than I do. I don't mind if it stays like that. And I am saying this having programmed my own web server in Java about 25 years ago.

              • By rascul 2025-10-2122:59

                A whole lot of coding and working with computers doesn't involve setting up a web server. It's not insane at all.

              • By everyone 2025-10-2122:012 reply

                It would be insane if you write code and work with computers for a living but you don't know how to write a game engine in assembly.

                • By marcosdumay 2025-10-221:21

                  Hum... Writing a game engine is a high-difficulty task that should be achievable by any reasonably good software developer with a few months to study for it. Making it in assembly is a sure way to take 10 times as long as another low-level language like C, but it shouldn't be an impossibility either.

                  Configuring a web server is a low-difficulty task that should be achievable by any good software developer with 3 days to study for it. It's absurd for a developer who needs a web server to insist on paying a large rent and ceding control to some 3rd party instead of just doing it.

                • By sgarland 2025-10-2122:561 reply

                  Installing a web server is in no way the same as writing a game engine, let alone in assembly, and I think you know that.

                  • By everyone 2025-10-226:581 reply

                    Fucking every web-dev assumes web-dev is all of programming. I have always been a game dev, never done any internety stuff, I was never interested in it. I would defo find the game engine task a lot easier. I already know what I would do. I wouldn't know where to start with the server + I don't know what the "gotchas" are. If I was forced to do that I would schedule a really long time for it.

                    • By sgarland 2025-10-2213:59

                      I don’t assume that (and am not a dev - DBRE / SRE) at all. I have massive respect for game devs, since you’re one of the few subsets that seems to give a shit about performance.

                      I bet you could figure out `apt install nginx` and a basic config pretty quickly, definitely faster than a web dev could learn game programming. “What do you mean, I have to finish each loop in 16 msec?”
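                      For reference, the "basic config" really is only a handful of lines. A minimal reverse-proxy sketch (the server name, app port, and file path are hypothetical, not from any particular setup):

                      ```nginx
                      # /etc/nginx/sites-available/myapp -- hypothetical file and names
                      server {
                          listen 80;
                          server_name example.com;

                          location / {
                              # hand every request to the app process on a local port
                              proxy_pass http://127.0.0.1:3000;
                              proxy_set_header Host $host;
                              proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                          }
                      }
                      ```

                      Symlink it into sites-enabled, then `nginx -t && systemctl reload nginx` — that covers most of the gotcha surface for a basic setup.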

      • By sokoloff 2025-10-2121:27

        > the price business has to pay because of developer insecurity

        Is it mostly developer insecurity, or mostly tech leadership insecurity?

      • By agumonkey 2025-10-2121:03

        my take is that it's fun up until there's just enough brittleness and chaos.. too many instance of the same thing but with too many env variables setup by hand and then fuzzy bug starts to pile up

      • By rapind 2025-10-220:081 reply

        Honestly I think it's the database that makes devs insecure. The stakes are high and you usually want PITR and regular backups even for low traffic apps. Having a "simple" turnkey service for this that can run in any environment (dedicated, VPS, colo, etc.) would be huge.

        I think this is partly responsible for the increased popularity of sqlite as a backend. It's super simple, and Litestream for recovery isn't that complicated.

        Most apps don't need 5 9s, but they do care about losing data. Eliminate the possibility of losing data, without paying tons of $ to also eliminate potential outages, and you'll get a lot of customers.
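        To illustrate how turnkey this can be: Litestream's whole setup is roughly one config file. A minimal sketch, assuming an S3-compatible bucket (the paths and bucket name here are made up):

        ```yaml
        # litestream.yml -- continuously replicates a SQLite database
        # to object storage, enabling point-in-time restore
        dbs:
          - path: /var/lib/myapp/app.db        # hypothetical database path
            replicas:
              - url: s3://my-backup-bucket/app-db   # hypothetical bucket
        ```

        Run `litestream replicate` alongside the app, and `litestream restore` rebuilds the database from the replica if the server dies.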

        • By tonyhart7 2025-10-222:361 reply

          isn't that just Neon DB???? but without the losing-data part

          • By rapind 2025-10-224:34

            Neon is definitely way more complex than what I'm talking about.

    • By noosphr 2025-10-2121:49

      The cloud was a good deal in 2006, when the smallest aws machine was about the size of an ok dev desktop and took over two years of rent to equal the cost of buying the physical machine outright.

      Today the smallest, and even large, aws machines are a joke, comparable to a mobile phone from 15 years ago or a terrible laptop today, and take about three to six months of rent to equal buying the hardware outright.

      If you're on the cloud without getting a 75% discount, you will save money and headcount by doing everything on prem.

    • By odie5533 2025-10-2120:561 reply

      Fully replicating prod is helpful. Saves time since deployment is similar and does a better test of what prod will be.

      • By teaearlgraycold 2025-10-2121:012 reply

        Completely agree. It’s not a staging server if it’s hosted on a different platform.

        • By odie5533 2025-10-2121:041 reply

          I think OP is using these less as staging and more as dev environments for individual developers. That seems like a great use of a single server to me.

          I'd still like a staging + prod, but keeping the dev environments on a separate beefy server seems smart.

          • By ricketycricket 2025-10-2121:091 reply

            I've been using a development server for about 9 years and the best thing I ever did was move to a machine with a low-power Xeon D for a time. It made development painful enough that I quickly fixed the performance issues I was able to overlook on more powerful hardware. I recommend it, even just as an exercise.

            • By eru 2025-10-220:22

              For similar reasons, in the Google office I worked in you had the option to connect to a really intentionally crappy wifi that was simulating a 2G connection.

        • By hamdingers 2025-10-2121:131 reply

          The "platform" software runs on is just other software. If your prod environment is managed kubernetes then you don't lose much if your staging environment is self-hosted kubernetes.

          • By odie5533 2025-10-2121:15

            Load balancers, IAM roles, kubernetes upgrades, postgres upgrades, security settings, DNS records, http routes... there's a lot that can go wrong and makes it useful to have a staging environment.

    • By jamestimmins 2025-10-2121:071 reply

      This could be the premise for a fun project based infra learning site.

      You get X resources in the cloud and know that a certain request/load profile will run against it. You have to configure things to handle that load, and are scored against other people.

      • By YouAreWRONGtoo 2025-10-2121:25

        All it means is that the cloud doesn't work like a power socket, which was the whole point of it.

        Things like Lambda do fit in this model, but they are too inefficient to model every workload.

        Amazon lacks vision.

    • By adgjlsfhk1 2025-10-2122:572 reply

      also how far you can get with a single machine has changed massively in the past 15 years. 15 years ago a (really beefy) single machine meant 8 cores with 256GB ram and a couple TB of storage. Now a single machine can be 256 cores on 8TB of ram and a PB of storage.

      • By layoric 2025-10-2123:021 reply

        Exactly, and the performance of consumer tech is wildly faster. Eg, a Ryzen 5825U mini PC with 16GB memory and 512GB NVMe is ~$250 USD. That thing will outperform a 14-core Xeon from ~2016 on multicore workloads and absolutely thrash it in single thread. Yes, lack of ECC is not good for any serious workload, but it's great for lower environments/testing/prototyping, and it sips power at ~50W full tilt.

        • By eru 2025-10-220:241 reply

          Curiously, RAM sizes haven't gone up much for consumer tech.

          As an example: my Macbook Pro from 2015 had 16 GiB RAM, and that's what my MacBook Air from 2025 also has.

          • By ericd 2025-10-221:431 reply

            Ehhh Macbook Pros can be configured with up to 128 now, iirc 16 was the max back then. But I guess the baseline hasn't moved as much.

            • By eru 2025-10-224:23

              Yes, there has been some movement. But even an 8 fold increase (128/16) over a decade is nothing compared to what we used to see in the past.

              Oh, and the new machine has unified RAM. The old machine had a bit of extra RAM in the GPU that I'm not counting here.

              As far as I can tell, the new RAM is a lot faster. That counts for something. And presumably also uses less power.

      • By wild_egg 2025-10-2213:51

        I saw a twitter thread recently where someone tried to make this point to someone shilling AWS spaghetti architectures. They were subsequently dog-piled into oblivion but the mental gymnastics people can do around this subject is a sight to behold.

        Simplicity is uncomfortable to a lot of people when they're used to doing things the hard way.

    • By j45 2025-10-2120:571 reply

      Cloud often has everyone thinking it's still 2008.

      • By tempest_ 2025-10-2120:591 reply

        With some obvious exceptions, there isn't much you can't run on a 200-core machine wrt web services.

        • By j45 2025-10-2220:24

          Of course.

          The default assumption that the cloud is more performant, even for basic to intermediate loads, than running on the hardware directly is what I'm referring to, and what the article is referring to.

          It's very easy to pay for cloud services per transaction at prices greatly inflated over what they actually cost, given how few CPU cores they actually use at any given time.

    • By MangoCoffee 2025-10-2121:00

      The cloud has made people forget that the internet is decentralized.

    • By altcognito 2025-10-2121:081 reply

      The weird thing is the relationship between developer costs and operations costs. For startups that pay salaries, $3000 a month is a pittance!*

      * The big caveat: If you don't incur the exact same devops costs that would have happened with a linux instance.

      Many tools (containers in particular) have cropped up that have made things like quick, redundant deployment pretty straightforward and cheap.

      • By andersa 2025-10-2121:22

        The best part is when you start with a $3000/month cloud bill during development and finally realize that hosting the production instance this way would actually cost $300k/month, but now it's too late to change it quickly.

    • By matt-p 2025-10-2123:19

      You put your staging env in the same (kind of) place you put your prod system because you need to replicate your prod environment as faithfully as possible. You also then get to re-use your deployment code.

    • By nimbius 2025-10-2121:173 reply

      you can literally buy a used Dell desktop that matches the Hetzner spec (8 cores, 32 gigs of RAM) for under 500 USD. Why wouldn't you just do that?

      As cloud marches on it continues to seem like a grift.

      • By sodality2 2025-10-2121:32

        Do you plan on keeping it in your home? At that point I'd be worried about ISP networking or power guarantees unless you plan on upgrading to business rates for both. If you mean colo, well, if you're sure you'll be using it in X years, it's worth it, but the flexibility of month-to-month might be preferable.

      • By SchemaLoad 2025-10-2121:443 reply

        Because that used desktop is subject to power outages, internet outages, the cleaners unplugging it, etc. Datacenters have redundancy on everything.

        • By jopsen 2025-10-2210:24

          Not to mention physical security.

          Breaking into a home is relatively easy.

          And unless you live in the US and are willing to actually shoot someone (with all the paperwork that entails, as well as the physical and legal risks), the fact is that you can't actually stop a burglary.

        • By ngold 2025-10-236:28

          I'm probably wrong, but this argument always cracks me up.

          It used to be 3 laptops, a power scrubber, and a backup battery, if you wanted to self-host things. If you were fancy you had two servers.

        • By eru 2025-10-220:25

          Also you still have to pay for the electricity on that thing.

          The cloud cost includes everything.

      • By marcosdumay 2025-10-221:35

        And you'll need some $100/month to colocate that thing, so you're better off spending a bit more and buying a reasonable server that uses only 1U.
