> If any maintainers stop by this comment section, I suggest offering some explanation about what this project _does_.
This. It's a pretty awful way to present anything. The reader is still clueless about what they are reading even after navigating through three or so links. At each click I was hoping to read something related to APIs, but all I got was bureaucratic noise.
Perhaps it's a good idea. I can't tell. I wonder if anyone can.
> However, recently I realized that I read code, but almost never write it.
This has been the norm for a few decades. Even software engineering courses stress that code is read far more often than it is written, which is the basis for all the fundamental principles about making code easy to read and navigate.
> I’m not justifying the design but splitting a table with several billion rows is not a trivial task, especially when ORMs and such are involved.
I don't agree. Let me walk you through the process.
- create the new table
- follow a basic parallel writes strategy:
  - update your database consumers to write to the new table without reading from it
  - run a batch job to populate the new table with data from the old table
  - update your database consumers to read from the new table while writing to both old and new tables
From this point onward, just pick a convenient moment to stop writing to the old table and call the migration done. Then do the post-migration cleanup tasks.
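The steps above can be sketched end to end. This is a minimal toy using Python's stdlib `sqlite3`; the table names (`orders_old`, `orders_new`) and schema are hypothetical, and a real migration would batch the backfill and guard against races:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_old (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("CREATE TABLE orders_new (id INTEGER PRIMARY KEY, total REAL)")

# Pre-migration data that exists only in the old table.
conn.execute("INSERT INTO orders_old VALUES (1, 9.99)")

def write_order(order_id, total):
    # Parallel-writes phase: consumers write to BOTH tables,
    # while reads still hit the old one.
    conn.execute("INSERT INTO orders_old VALUES (?, ?)", (order_id, total))
    conn.execute("INSERT INTO orders_new VALUES (?, ?)", (order_id, total))

write_order(2, 19.99)

# Batch job: backfill rows that predate the dual writes.
conn.execute(
    "INSERT INTO orders_new SELECT * FROM orders_old "
    "WHERE id NOT IN (SELECT id FROM orders_new)"
)

# Reads now switch to the new table; writes keep going to both
# until a convenient moment to drop the old table.
rows = conn.execute("SELECT id, total FROM orders_new ORDER BY id").fetchall()
```

After the backfill, `rows` contains both the backfilled row and the dual-written one, which is exactly the invariant that lets you cut reads over safely.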
> Additionally, it’s easier to get work scheduled to ship a feature than it is to convince the relevant players to complete the swing.
The ease of piling up technical debt is not a justification for keeping broken systems and designs. It's only OK to make a mess to deliver things because you're expected to clean up after yourself afterwards.
> Protobufs definitely doesn’t solve the problems described. Capnproto may solve it but I’m not 100% sure. JSON/XML/ASN.1 definitely don’t.
I'm not sure you are serious. What open problem do you have in mind? Support for persisting and deserializing optional fields? Mapping across data types? I mean, some JSON deserializers support deserializing sparse objects even to dictionaries. In .NET you can even deserialize random JSON objects to a dynamic type.
Can you be a little more specific about your assertion?
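To make the "sparse objects" point concrete: Python's stdlib `json` module, like the .NET deserializers mentioned above, happily deserializes objects with missing fields into plain dictionaries. The record shape here is hypothetical:

```python
import json

# Sparse JSON objects: each record carries only the fields it has.
records = [
    '{"id": 1, "name": "alice", "email": "a@example.com"}',
    '{"id": 2, "name": "bob"}',  # no email field at all
]

rows = [json.loads(r) for r in records]

# Missing fields simply aren't keys; the caller picks a default at read time.
emails = [row.get("email") for row in rows]
```

`emails` comes out as `['a@example.com', None]`: optional fields round-trip without any schema support from the format itself.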
> I can tell there is no correlation between the domain and the amount of columns.
This is unbelievable. In purely architectural terms that would require your database design to be an amorphous big ball of everything, with no discernible design or modelling involved. This is completely unrealistic. Are queries done at random?
In practical terms, your assertion is irrelevant. Look for sparse columns: the ones that are empty in most rows. Then move half of the columns to a new table and keep the other half in the original table. Congratulations, you just cut your column count in half and sped up your queries.
Even better: discover how your data is being used. Look at queries and check what fields are used in each case. Odds are, that's your table right there.
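A minimal sketch of that vertical split, again with `sqlite3`. All the names here (`users`, `users_extra`, the sparse columns) are invented for illustration:

```python
import sqlite3

# A toy "wide" table: hot columns plus rarely-populated sparse ones.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, "
    "fax_number TEXT, legacy_flags TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'alice', NULL, NULL)")
conn.execute("INSERT INTO users VALUES (2, 'bob', '555-0100', 'x')")

# Move the sparse columns to a side table keyed by the same id,
# keeping only the rows that actually have data.
conn.execute(
    "CREATE TABLE users_extra AS "
    "SELECT id, fax_number, legacy_flags FROM users "
    "WHERE fax_number IS NOT NULL OR legacy_flags IS NOT NULL"
)

# Hot-path queries now touch only the narrow columns...
names = conn.execute("SELECT name FROM users ORDER BY id").fetchall()

# ...and the rare query joins the side table back in only when needed.
fax = conn.execute(
    "SELECT u.name, e.fax_number FROM users u "
    "JOIN users_extra e ON e.id = u.id"
).fetchall()
```

To finish the split you would rebuild `users` without the sparse columns (or `ALTER TABLE ... DROP COLUMN` where your database supports it); the point is that the join only pays its cost on the queries that actually need those fields.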
Let's face it. There is absolutely no technical or architectural reason to reach this point. This problem is really not about structs.