Ask HN: When has a "dumb" solution beaten a sophisticated one for you?

Comments

By atrettel 2026-01-11 16:42 (3 replies)

I recently wrote a command-line full-text search engine [1]. I needed to implement an inverted index. I chose what seems like the "dumb" solution at first glance: a trie (prefix tree).

There are "smarter" solutions like radix tries, hash tables, or even skip lists, but for any design choice, you also have to examine the tradeoffs. A goal of my project is to make the code simpler to understand and less of a black box, so a simpler data structure made sense, especially since other design choices would not have been all that much faster or use that much less memory for this application.

I guess the moral of the story is to just examine all your options during the design stage. Machine learning solutions are just that, another tool in the toolbox. If another simpler and often cheaper solution gets the job done without all of that fuss, you should consider using it, especially if it ends up being more reliable.

[1] https://github.com/atrettel/wosp

By zahlman 2026-01-18 7:35

> I chose what seems like the "dumb" solution at first glance: a trie (prefix tree).

> There are "smarter" solutions like... hash tables.... A goal of my project is to make the code simpler to understand and less of a black box, so a simpler data structure made sense, especially since other design choices would not have been all that much faster or use that much less memory for this application.

Strangely, my own software-related answer is the opposite for the same reason.

I was implementing something for which I wanted to approximate a shortest common supersequence (https://en.wikipedia.org/wiki/Shortest_common_supersequence), and my research at the time led me to a trie-based approach. But I was working in Python, and didn't want to actually define a node class and all the logic to build the trie, so I bodged it together with a dict (i.e., a hash table).
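For anyone curious what "bodging a trie together with a dict" can look like, here is a minimal sketch. It is an illustration only, not the code described above, and it does not attempt the supersequence approximation itself. The names `trie_insert`, `trie_contains`, and the `END` sentinel are assumptions made up for this example.

```python
# Hypothetical sketch: a trie stored as nested dicts, with no node class.
# END is an assumed sentinel key; it is longer than one character, so it
# cannot collide with the single-character keys used for child nodes.
END = "$end"

def trie_insert(trie, word):
    """Walk the nested dicts one character at a time, creating levels as needed."""
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True  # mark that a complete word stops here

def trie_contains(trie, word):
    """Return True only if `word` was inserted as a whole word."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return END in node

trie = {}
for w in ("car", "cart", "dog"):
    trie_insert(trie, w)
```

With this data, `trie_contains(trie, "cart")` is `True`, while `trie_contains(trie, "ca")` is `False` because "ca" is only a prefix of stored words.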

By bawis 2026-01-11 17:40 (1 reply)

What body of knowledge (books, tutorials, etc.) did you use while developing it?

By atrettel 2026-01-11 19:17 (1 reply)

Before I started the project, I was already vaguely familiar with the notion of an inverted index [1]. That small bit of knowledge meant that I knew where to start looking for more information and saved me a ton of time. Inverted indices form the bulk of many search engines, with the big unknown being how you implement them. I just had to find an adequate data structure for my application.

To figure that out, I remember searching for articles on how to implement inverted indices. Once I had a list of candidate strategies and data structures, I used Wikipedia supplemented by some textbooks like Skiena's [2] and occasionally some (somewhat outdated) information from NIST [3]. I found Wikipedia quite detailed for all of the data structures for this problem, so it was pretty easy to compare the tradeoffs between different design choices here. I originally wanted to implement the inverted index as a hash table but decided to use a trie because it makes wildcard search easier to implement.
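To make the wildcard point concrete, here is a minimal sketch of an inverted index stored in a trie. It is not the code from wosp; the class names `TrieNode` and `TrieIndex`, the whitespace tokenization, and the restriction to trailing wildcards (prefix search) are all assumptions made for illustration.

```python
class TrieNode:
    """One character position in the trie (hypothetical structure)."""
    def __init__(self):
        self.children = {}     # next character -> TrieNode
        self.postings = set()  # IDs of documents containing the word ending here

class TrieIndex:
    """A toy inverted index: each word maps to the set of documents containing it."""
    def __init__(self):
        self.root = TrieNode()

    def add(self, doc_id, text):
        # Assumed tokenization: lowercase, split on whitespace.
        for word in text.lower().split():
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.postings.add(doc_id)

    def _find(self, prefix):
        """Return the node reached by following `prefix`, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def search(self, word):
        """Exact-word lookup."""
        node = self._find(word.lower())
        return set(node.postings) if node else set()

    def search_prefix(self, prefix):
        """Trailing-wildcard lookup: union the postings of the whole subtree."""
        start = self._find(prefix.lower())
        if start is None:
            return set()
        result, stack = set(), [start]
        while stack:
            node = stack.pop()
            result |= node.postings
            stack.extend(node.children.values())
        return result

index = TrieIndex()
index.add(1, "simple search engine")
index.add(2, "searching with a trie")
index.add(3, "hash tables are simple")
```

Here `index.search("search")` returns `{1}`, while `index.search_prefix("sear")` returns `{1, 2}`, because every word starting with "sear" lives in one subtree. A plain hash table keyed on whole words would have to scan all of its keys to answer the same prefix query.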

After I developed most of the backend, I looked for books on "information retrieval" in general. I found a history book (Bourne and Hahn 2003) on the development of these kinds of search systems [4]. I read some portions of this book, and that helped confirm many of the design choices that I made. I actually was just doing what people traditionally did when they first built these systems in the 1960s and 1970s, albeit with more modern tools and much more information on hand.

The harder part of this project for me was writing the interpreter. I actually found YouTube videos on how to write recursive descent parsers to be the most helpful there, particularly this one [5]. Textbooks were too theoretical and not concrete enough, though Crafting Interpreters was sometimes helpful [6].
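As a rough idea of what a recursive descent parser looks like, here is a small sketch for a made-up boolean query grammar. This grammar (words, `AND`, `OR`, and parentheses) is purely an assumption for illustration and is not the query language that wosp implements; `tokenize`, `Parser`, and `parse` are hypothetical names.

```python
import re

def tokenize(query):
    # Split into parentheses and runs of non-space, non-parenthesis characters.
    return re.findall(r"\(|\)|[^\s()]+", query)

class Parser:
    """Recursive descent: one method per grammar rule (hypothetical grammar)."""
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    # expr := term ("OR" term)*
    def parse_expr(self):
        node = self.parse_term()
        while self.peek() == "OR":
            self.advance()
            node = ("OR", node, self.parse_term())
        return node

    # term := factor ("AND" factor)*   (AND binds tighter than OR)
    def parse_term(self):
        node = self.parse_factor()
        while self.peek() == "AND":
            self.advance()
            node = ("AND", node, self.parse_factor())
        return node

    # factor := WORD | "(" expr ")"
    def parse_factor(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of query")
        if tok == "(":
            node = self.parse_expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in (")", "AND", "OR"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("WORD", tok)

def parse(query):
    parser = Parser(tokenize(query))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

For example, `parse("a OR b AND c")` returns `("OR", ("WORD", "a"), ("AND", ("WORD", "b"), ("WORD", "c")))`. The nesting of the rule methods is what encodes operator precedence, which is the main thing the technique makes concrete.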

[1] https://en.wikipedia.org/wiki/Inverted_index

[2] https://doi.org/10.1007/978-3-030-54256-6

[3] https://xlinux.nist.gov/dads/

[4] https://doi.org/10.7551/mitpress/3543.001.0001

[5] https://www.youtube.com/watch?v=SToUyjAsaFk

[6] https://craftinginterpreters.com/

By bawis 2026-01-12 11:15 (1 reply)

Thanks for the details. How much time have you invested in it?

By atrettel 2026-01-12 15:31

I have spent around 170 hours on this so far, with only 60% of that being coding. The rest was mostly research or writing.

By an-allen 2026-01-18 8:43

Similarly, I have a script that takes input in the following format: “q replace all instances of http: with https: in all txt files recursively”

It sends the request to ChatGPT, which comes back with the appropriate command, and the script then runs it.

By jakevoytko 2026-01-18 6:55 (1 reply)

When I was on Google Docs, I watched the Google Forms team build a sophisticated ML model that attempted to detect when people were using it for nefarious purposes.

It underperformed banning the word "password" from a Google Form.

So that's what they went with.

By demaga 2026-01-18 20:46

I wonder if this is just an example of Goodhart's law. How did they measure the performance of those models? I would imagine they tried measuring against known cases of form misuse, i.e., forms that contained a 'password' field.

By eastoeast 2026-01-11 3:20 (1 reply)

I’m mostly a hardware engineer.

I needed to test pumping water through a special tube, but didn’t have access to a pump. I spent days searching for a way to rig a pump to this thing.

Then I remembered I could just hang a bucket of water up high to generate enough head pressure. Free instant solution!
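For a rough sense of scale, the static pressure from a column of water follows the standard hydrostatic relation. The 2 m height below is an assumed number for illustration; the actual height used is not stated.

```latex
p = \rho g h
  \approx (1000\ \mathrm{kg/m^3})\,(9.81\ \mathrm{m/s^2})\,(2\ \mathrm{m})
  \approx 19.6\ \mathrm{kPa}
```

That is roughly 2.8 psi of gauge pressure at the tube inlet, before any flow losses, so every extra meter of height adds about 9.8 kPa.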

By trueismywork 2026-01-18 8:23

You would have made Maxime Faget proud.

Hacker News