A MySQL-compatible relational database with a storage agnostic query engine. Implemented in pure Go. - dolthub/go-mysql-server
Hi, this is my project :)
For us this package is most important as the query engine that powers Dolt:
https://github.com/dolthub/dolt
We aren't the original authors but have contributed the vast majority of its code at this point. Here's the origin story if you're interested:
https://www.dolthub.com/blog/2020-05-04-adopting-go-mysql-se...
This is very cool! Couple of suggestions:
- Don't use "mysql" in the name, this is a trademark of Oracle corporation and they can very easily sue you personally if they want to, especially since you're using it to develop a competing database product. Other products getting away with it doesn't mean they won't set their sights on you. This is just my suggestion and you can ignore it if you want to.
- Postgres wire/SQL compatibility. Postgres is for some reason becoming the relational king, so implementing some support sooner rather than later increases your project's relevance.
PostgreSQL support here
https://github.com/dolthub/doltgresql
Background and architecture discussion here
The vanilla package can replicate to or from MySQL via binlog replication. But since it's memory-only, that's probably not what you want. You probably want to supply the library with a persistent backend rather than the built-in memory-only one.
Dolt can do the same two directions of MySQL binlog replication, and also has its own native replication options:
Interesting!
> If you have an existing MySQL or MariaDB server, you can configure Dolt as a read-replica. As the Dolt read-replica consumes data changes from the primary server, it creates Dolt commits, giving you a read-replica with a versioned history of your data changes.
This is really cool.
Have you benchmarked the replication? Or do you know of anyone who's running it against a primary with a couple tens of thousands of writes per second?
That's a lot. With Percona clusters I started having issues requiring fine-tuning around a third of that at quite short peak loads, maybe ten minutes sustained high load topping out at 6-10k writes/s. Something like 24 cores, 192 GB RAM on the main node.
Not sure how GC works in Golang but if you see 20k writes/s sustained that's what I'd be nervous about. If every write is 4 kB I think it would be something like a quarter of a TB per hour, probably a full TB at edge due to HTTP overhead, so, yeah, a lot to handle on a single node.
Maybe there are performance tricks I don't know about that make 20k sustained a breeze. I just know that I had to spend time tuning RAM usage and whatnot for peaks quite a bit earlier, and already at that load I planned for sharding the traffic.
I don't think we have any benchmarks of replication from MySQL, but I am positive there's no chance it can handle 10,000 TPS.
Missed an opportunity to call this uSql!
I always found the idea behind dolt to be very enticing.
Not enticing enough to build a business around, due to it being that bit too different and the persistence layer being that bit too important. But the sort of thing that I'd love it if the mainstream DBs would adopt.
I didn't realise the engine was written in Go, and honestly the first place my mind wanders is to performance.
If you like the idea of the Dolt prolly trees[1], I'm building a database[2] that uses them for indexing, (eventually) allowing for shared index updates across actors. Our core uses open-source JavaScript[3], but there are a few other implementations including RhizomeDB in Rust[4]. I'm excited about the research in this area.
[1] https://docs.dolthub.com/architecture/storage-engine/prolly-...
We haven't benchmarked the in-memory database implementation bundled in go-mysql-server in a while, but I would be surprised if it's any slower than MySQL, considering that Dolt runs on the same engine and is ~2x slower than MySQL including disk-access.
Version Control is not the type of thing "mainstream DBs would adopt".
We needed to build a custom storage engine to make querying and diffing work at scale:
https://docs.dolthub.com/architecture/storage-engine
It is based on the work of Noms, including the data structure they invented, Prolly Trees.
https://docs.dolthub.com/architecture/storage-engine/prolly-...
This seems to be a wire-protocol proxy for mysql -> SQL.
The default proxied database is dolt. I'm guessing this is extracted from dolt itself as that claims to be wire-compatible with mysql. Which all makes total sense.
Not a proxy in the traditional sense, no. go-mysql-server is a set of libraries that implement a SQL query engine and server in the abstract. When provided with a compatible database implementation using the provided interfaces, it becomes a MySQL compatible database server. Dolt [1] is the most complete implementation, but the in-memory database implementation the package ships with is suitable for testing.
We didn't extract go-mysql-server from Dolt. We found it sitting around as abandonware, adopted it, and used it to build Dolt's SQL engine on top of the existing storage engine and command line [2]. We decided to keep it a separate package, and implementation agnostic, in the hopes of getting contributions from other people building their own database implementations on top of it.
[1] https://github.com/dolthub/dolt [2] https://www.dolthub.com/blog/2020-05-04-adopting-go-mysql-se...
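To make the "query engine plus pluggable database implementation" idea concrete, here is a minimal sketch in Go. Everything in it (`Row`, `Table`, `Database`, `Engine`, `SelectWhere`, `memDB`) is a hypothetical stand-in invented for illustration. It is not go-mysql-server's actual API, which is far richer; it only shows the shape of an engine that talks to storage through interfaces.

```go
package main

import "fmt"

// Row is one record. Table and Database are hypothetical interfaces that a
// storage backend would implement; they are NOT go-mysql-server's real ones.
type Row []any

type Table interface {
	Name() string
	Rows() []Row
}

type Database interface {
	Table(name string) (Table, bool)
}

// memTable and memDB form a trivial in-memory backend, suitable for tests.
type memTable struct {
	name string
	rows []Row
}

func (t *memTable) Name() string { return t.name }
func (t *memTable) Rows() []Row  { return t.rows }

type memDB map[string]*memTable

func (db memDB) Table(name string) (Table, bool) {
	t, ok := db[name]
	if !ok {
		return nil, false
	}
	return t, true
}

// Engine knows nothing about how rows are stored; it only sees Database.
type Engine struct{ db Database }

// SelectWhere scans a table and keeps rows matching pred. It stands in for
// a real engine's parse, plan, and execute pipeline.
func (e *Engine) SelectWhere(table string, pred func(Row) bool) ([]Row, error) {
	t, ok := e.db.Table(table)
	if !ok {
		return nil, fmt.Errorf("table not found: %s", table)
	}
	var out []Row
	for _, r := range t.Rows() {
		if pred(r) {
			out = append(out, r)
		}
	}
	return out, nil
}

func main() {
	db := memDB{"users": {name: "users", rows: []Row{{1, "ada"}, {2, "grace"}}}}
	engine := &Engine{db: db}

	rows, err := engine.SelectWhere("users", func(r Row) bool { return r[0].(int) > 1 })
	if err != nil {
		panic(err)
	}
	fmt.Println(rows)
}
```

Swapping `memDB` for a persistent implementation of the same interfaces would leave `Engine` unchanged, which is the property the comment above describes.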
Really excellent work! For the curious, would you all be creating an in-memory database implementation that is postgres compatible for the doltgres project?
We are moving in that direction, but it's not a primary goal at this point. Once we have more basic functionality working correctly in doltgres, we will examine splitting off a separate package for it. The in-memory implementation has a bunch of MySQL-specific stuff in it right now, and we're still learning which pieces need to be generalized to share code.