The PostgreSQL documentation and the limitations of community

2023-06-14 16:34 11760 rhaas.blogspot.com

In my opinion, the PostgreSQL documentation is simultaneously excellent and fairly poor, and both its excellence and its shortcomings are direct results of the process by which the documentation is produced. The PostgreSQL documentation is stored in the same git repository as the source code, and anyone who patches the source code so as to change documented behavior must also patch the documentation to match.
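
The rule above is enforced by human review rather than by tooling, but its mechanical core is easy to picture. What follows is a minimal Python sketch, not anything the project actually runs: it takes the list of paths a patch touches and flags patches that change code without touching documentation. The function name `needs_doc_review`, the sample file names, and the assumption that code lives under `src/` and documentation under `doc/` are all illustrative.

```python
from typing import Iterable


def needs_doc_review(changed_paths: Iterable[str],
                     code_prefix: str = "src/",
                     doc_prefix: str = "doc/") -> bool:
    """Return True when a patch touches code but no documentation.

    Hypothetical helper: the prefixes are assumptions about repository
    layout. A True result only means "a reviewer should ask", because
    many code changes do not alter documented behavior.
    """
    paths = list(changed_paths)
    touches_code = any(p.startswith(code_prefix) for p in paths)
    touches_docs = any(p.startswith(doc_prefix) for p in paths)
    return touches_code and not touches_docs


# Made-up patches, for illustration only.
code_only = ["src/backend/example.c"]
code_and_docs = ["src/backend/example.c", "doc/src/sgml/example.sgml"]

print(needs_doc_review(code_only))      # True
print(needs_doc_review(code_and_docs))  # False
```

A check like this can only prompt the question; deciding whether documented behavior actually changed still takes someone who understands the patch.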

This means that nearly all documentation updates are made by the developer who is most familiar with what is changing in the code, or sometimes by another developer who has studied those changes closely. Therefore, the documentation is usually extremely accurate. Sure, there are oversights, but it would be incredible to discover that some PostgreSQL command has a documented option which doesn't actually exist, or that a parameter which is documented to take a string argument actually takes an integer or a Boolean. Typically, the descriptions of what SQL statements do and how that behavior is changed by parameter settings or options passed to the command itself are crisp and precise.

We're particularly good at tables and lists. Every SQL command that exists is listed in the documentation index, and anybody who adds a new SQL command will add it to the list, and write a documentation page for it that looks just like all of the other ones. Every configuration parameter that exists is listed in the documentation with a description of what it does using semi-standard phraseology. When someone creates a new one, they're bound to add it to the list and describe it using the same phraseology that's already used for the existing parameters. Corrections to this kind of documentation are often on the level of re-alphabetizing things that have fallen out of alphabetical order, or fixing some bit of markup that wasn't done in the same way as all of the other entries. Grown men slink away in shame when it is pointed out to them that the parameter type is listed as bool when it actually should be enum.
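
Corrections of this mechanical kind are the sort of thing a script could, in principle, catch. As a minimal sketch in Python, using hypothetical function names and made-up parameter data (this is not a tool the project uses), the following reports entries that have fallen out of alphabetical order and entries whose documented type disagrees with the type in the code.

```python
from typing import Dict, List, Tuple


def out_of_order(names: List[str]) -> List[str]:
    """Return each name that sorts before its predecessor (case-insensitive)."""
    return [cur for prev, cur in zip(names, names[1:])
            if cur.lower() < prev.lower()]


def type_mismatches(documented: Dict[str, str],
                    actual: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (name, documented_type, actual_type) for every disagreement."""
    return [(name, doc_type, actual[name])
            for name, doc_type in documented.items()
            if name in actual and actual[name] != doc_type]


# Made-up parameter names and types, for illustration only.
doc_order = ["alpha_mode", "gamma_limit", "beta_size"]
documented = {"alpha_mode": "bool", "beta_size": "integer"}
actual = {"alpha_mode": "enum", "beta_size": "integer"}

print(out_of_order(doc_order))              # ['beta_size']
print(type_mismatches(documented, actual))  # [('alpha_mode', 'bool', 'enum')]
```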

The strengths of this process are also its weaknesses. A developer is, by definition, someone who spends the majority of their time doing development, which is to say writing code. Updating the documentation becomes a task that must be completed so that the code one has written can get committed so that one can move on to the next project and write some more code. If a change to the documentation would be beneficial but is unrelated to any particular patch, it's not likely to get done.

The results are, in a certain sense, pretty comical. Pop over to the documentation index and find the page that describes the work_mem parameter and the page that describes the CREATE TABLE command. If you're anything like me, this is actually quite easy for you, because you know that the work_mem documentation is going to be buried under "Server Configuration" and CREATE TABLE is going to be under "SQL Commands" and, knowing that, you'll have no trouble. But notice that those incredibly important chapters of the documentation get just the same amount of space in the top-level documentation index as "Background Worker Processes" and "Backup Manifest Format," just to pick a couple examples of chapters that I have personally added in faithful observance to the community process. Those equally-prominent chapters have got to be of interest to a comparatively tiny number of users.

But, you know, I was a loyal servant of the community process. I was asked to document that stuff, and I did, and I put it in the documentation in the place where it most logically seemed to go. The fact that the overall structure of the documentation probably isn't for the best is not my fault, nor is correcting it my responsibility. And it's not anyone else's responsibility, either.

Nonetheless, people do try, from time to time. Most efforts fall into one of two categories. Sometimes, someone realizes that a certain section of the documentation has become woefully out of date, often because it emphasizes whatever piece of technology existed first and needs to be rewritten to emphasize some newer innovation that is better (the table partitioning and pg_basebackup documentation are examples). Other times, people suggest structural adjustments. Such changes sometimes go through smoothly, but many of them prove to be controversial and take much more work to get committed than a documentation entry for a new parameter or a new SQL command.

It's not difficult to understand why this happens. If I add a new feature to do a certain thing to PostgreSQL, I am the expert on that feature. There's nobody else who knows better than I do what the documentation for that feature ought to say. My work might have shortcomings just like anyone else's, but especially if I'm just adding new entries to tables that already contain dozens or hundreds of existing entries, how much difference of opinion can there reasonably be? It's more likely that reading the documentation will cause someone to take issue with the design of the feature itself than it is that they won't like the way it's documented.

But if on the other hand I propose some change to documentation that has existed for a long time, or some kind of structural change, there's a lot more room for disagreement. Because the change isn't strictly mechanical, the right answer is a lot more subjective. And because it's a change to existing content rather than the addition of new content, many more people will be familiar with it and have opinions on how it ought to be changed, if at all. Consequently, even when some developer does take time away from writing code to try to make some larger change to the documentation, it's often an uphill battle to get anything done, and people typically have to be content with small improvements.

I hope I'm convincing you that this mixture of extreme rigor when it comes to mechanical updates and laxity when it comes to broader changes is inherent in the way that the community works. It is neither good nor bad; it just is. If we adopted some other process, it would have its own set of advantages and disadvantages, and it's anybody's guess whether we'd come out ahead. Personally, my guess is that most other things we could do would come out overall worse than what we are currently doing, so I have no particular process change to propose. However, I do think it's worth thinking about why we get the results that we do, both good and bad.


Comments

  • By pgaddict 2023-06-14 17:09, 7 replies

    I think the main limitation of our docs is that it mostly explains what the pieces do, not how to use them to achieve a particular goal. For example, we have pretty good documentation of all the pieces to do HA, we just don't tell people how to assemble them together.

    The reason is, I think, that flexibility is a pretty fundamental part of the project. We're great at providing building blocks (and documenting them), but we steer clear of describing a particular way to assemble them together.

    For example, we might describe a particular HA approach, but then that would be perceived as "recommended / official" way, giving it preference over other (and equally valid) approaches and tooling. These "how to" docs are bound to be way more opinionated, so we just focus on documenting the pieces.

    In other words, our docs are written by devs for devs, and we leave the higher level stuff to tutorials written by others etc.

    • By derefr 2023-06-15 17:08

      These concerns seem to be specific to cases where there are various competing high-level design "strategies" with political weight behind them.

      There are cases where PG is missing high-level docs where I don't think this applies.

      For example, there's no official doc on how to write PL/pgSQL code. There's just an extremely-low-level language reference, covering each syntax element separately. There's no cookbook (other than the few examples per syntax element that exist to document the edge-cases of use of that syntax element); no tutorial; no efficiency/performance/scalability guide discussing when certain language features should be favored over others given the current way they're executed (e.g. is IF-ELSE, CASE-WHEN, or a series of IFs with early returns cheaper? when should I favor using FOR with a query, vs. when should I query data into an in-memory array variable and then use FOREACH, vs. when should I query data into a TEMPORARY table and then query that?); no place where you can get a sense for how procedure CALLs interact with MVCC (e.g. when they acquire + release locks, and therefore how and when they cause blocking on contended tables vs. how and when a SELECTed function that uses dblink/fdw to run independent txs would do so); etc. There isn't even a single mention of which PL/pgSQL exceptions are potentially raised by what PG builtin functions when called in a PL/pgSQL context; how to name those exceptions to match on them to catch them; or how to raise them yourself. I often need to dig into the PG source code to figure that out! (PL/pgSQL honestly feels, in docs terms, like a proprietary third-party language-engine "plugin" that someone bolted on, where the docs were expected to be provided by the third party, but never were. But it's not! It's a first-party language, and the reference implementation of how to create a language extension!)

    • By akira2501 2023-06-15 20:19, 2 replies

      I really miss old-school printed documentation's "Theory of Operation" section. To me it's the most useful way to bridge this gap. The technical and operations manual describe all the parts and how they function, but the theory of operation really laid out how and _why_ all of these things were structured the way they were.

      It also forced the designers to think in those terms and to document the product from an overall perspective rather than a component perspective. It was high level enough to be useful, but not so high level as to be abstracted into hand holding tutorial exercises.

      I feel like most modern software documentation entirely misses this component and would benefit greatly from having it.

      • By giovannibonetti 2023-06-15 20:30

        Related: Diátaxis - A systematic framework for technical documentation authoring [1]

        "The Diátaxis framework aims to solve the problem of structure in technical documentation. It adopts a systematic approach to understanding the needs of documentation users in their cycle of interaction with a product.

        Diátaxis identifies four modes of documentation - tutorials, how-to guides, technical reference and explanation. It derives its structure from the relationship between them.(...)"

        [1] https://diataxis.fr/

      • By kaycebasques 2023-06-15 22:27, 1 reply

        Can you link me to a good old school "theory of operation" section? I get the idea but I want to see firsthand what you mean.

        • By akira2501 2023-06-15 23:31

          The older Harris Corporation AM and FM transmitter manuals were what came to mind when I wrote that. As equipment got modernized, there was less for the operator to know, so the manuals got shorter and shorter over time. Compare the SX-1 AM manual under "Principles of Operation" with something like the HT-35 FM manual under the same heading.

          Also, early computer manufacturers like MITS had manuals in a similar vein for their Altair 8800 box, and you can find many examples in this space. There's a stub of a Wikipedia page just for it:

          https://en.wikipedia.org/wiki/Theory_of_operation

    • By briffle 2023-06-15 16:52, 2 replies

      Another good example is the differences in the documentation for Indexes, vs the https://use-the-index-luke.com/ that explains many of the reasons WHY you want to organize it with great examples.

      A problem I have is so many tutorials, or 'best practices' I find on the internet are for older versions that don't really apply as well in newer versions of postgres. Like searching for logical replication, you find lots of information for pg_logical for older versions of postgres, but many of those parts are now baked into postgres, but with a different syntax, etc.

      I would love to see a 'tutorials/guide' and 'best practices' part of the documentation that is updated with each new release, that give examples of the most common tasks, and when/why to use them, and when to move to something more advanced.

      Some really basic stuff like "these are the 3 best ways to handle replication in version 15, and the 2 or 3 most common ways to do backups", or "these are the recommended ways to migrate from the previous version, either in place or to a new server", etc.

      • By friendzis 2023-06-16 7:03

        > Like searching for logical replication, you find lots of information for pg_logical for older versions of postgres, but many of those parts are now baked into postgres, but with a different syntax, etc.

        I tend to see this as duality of static/ephemeral. On one hand you have "vendors" treating software as rolling moving target - latest and greatest is all there is. On the other hand, there are community guides and tutorials that are out there and never retracted or amended.

        Let's say I find an article that luckily states it was written against an older version of the software. How do I know whether the information is still valid and applicable to a newer version? Well, if I am lucky and the vendor publishes a growing changelog, I have to read it and judge if the changes are applicable. Maybe I have to hunt down changelogs from all the interim versions just to see if anything important to me changed. Most probably I will find a very succinct summary of the change like "tool x was merged into tool y" or "updated interface of x". Now I have to try and find the discussion around the change in mailing lists (hopefully the team has not migrated to Discord).

        Contrast this with hardware. Quite possibly I will find errata and/or product change notices discussing the changes, maybe even application guides discussing how to apply changes. Unless there is an entirely new product line being released, there is a high chance there will be rather detailed documentation on changes. Software is usually majorly lacking in this regard.

        And no, versioned documentation is not the answer. Documentation gets updated not only technically, but also as discussed structurally. Once a piece of documentation gets moved to another section or even reworded I can no longer reasonably search for changes. Documentation of a changing, evolving thing is hard and interestingly this is where open source gets hit the most - there are almost no incentives for someone to write good documentation.

      • By starkparker 2023-06-16 1:16

        > I would love to see a 'tutorials/guide' and 'best practices' part of the documentation that is updated with each new release, that give examples of the most common tasks, and when/why to use them, and when to move to something more advanced.

        Tutorials and best practices come about from real-world usage, which is why they're often out of date — by the time someone becomes enough of an expert on features specific to a version to write good-quality content of that nature, a new version is out. The only way to update those is through real-world usage, which means shipping pre-release versions for long-enough periods of time and having them used to ship real things, all before the release.

        Does PostgreSQL do this? It hasn't in my experience, but it's admittedly limited to the last 5-6 years.

    • By friendzis 2023-06-15 16:27

      This reminds me of technical documentation for embedded devices. Usually you get multiple classes of documents: data sheets, application notes, reference designs, user guides, errata.

      The problems described come from trying to be everything in one place, but it does not have to be. As I understand you try to be mostly a data sheet, which is probably a net good, because it is the document needed to be maintained, even if hard to navigate.

      However, there are more document classes that can be produced. Yes, a reference design is inevitably going to be opinionated, whether it is produced by project team or some internet person. A reference design produced by project team at least has a fighting chance at staying somewhat up to date. And one can discuss tradeoffs between different approaches in an application note.

    • By Rapzid 2023-06-15 21:44, 1 reply

      > I think the main limitation of our docs is that it mostly explains what the pieces do, not how to use them to achieve a particular goal

      I honestly prefer this type of documentation. ASP.NET Core has the complete opposite problem where it's too example based.

      • By F-W-M 2023-06-16 13:40, 1 reply

        For many things it's simple enough to read the code, did this e.g. for the Configuration system in ASP.NET Core. Dunno, if I could do this with postgres.

        • By Rapzid 2023-06-16 18:34

          Yeah you'll end up reading the source code regardless if it's simple or not though haha. I've had to read through tons of the Identity and Auth code to get a handle on how to integrate with it and there are TONS of interfaces, implementations, and etc you have to cross-reference and hunt down to start building a mental model. You don't even get a diagram in the docs explaining how all the different filters and crap tie together(on top of auth having ITS OWN MIDDLEWARE PIPELINE) lol.

          Combine that with maybe the info you are looking for actually is in the docs, but it's sprinkled across loads of examples so it's hard to find or build a comprehensive understanding of.

    • By nazka 2023-06-16 7:01

      How-to and guides are an amazing way to do docs. I love the guides of Ruby on Rails. I rarely used the API documentation.

      There is a great article talking about the 4 different types of documentation, named the “documentation system”.

      https://documentation.divio.com/

    • By starkparker 2023-06-16 1:13

      From the post:

      > But, you know, I was a loyal servant of the community process. I was asked to document that stuff, and I did, and I put it in the documentation in the place where it most logically seemed to go. The fact that the overall structure of the documentation probably isn't for the best is not my fault, nor is correcting it my responsibility. And it's not anyone else's responsibility, either.

      ...

      > It's not difficult to understand why this happens. If I add a new feature to do a certain thing to PostgreSQL, I am the expert on that feature. There's nobody else who knows better than I do what the documentation for that feature ought to say. My work might have shortcomings just like anyone else's, but especially if I'm just adding new entries to tables that already contain dozens or hundreds of existing entries, how much difference of opinion can there reasonably be? It's more likely that reading the documentation will cause someone to take issue with the design of the feature itself than it is that they won't like the way it's documented.

      Which, I think, is blind to the bigger issue.

      As you note:

      > The reason is, I think, that flexibility is a pretty fundamental part of the project. We're great at providing building blocks (and documenting them), but we steer clear of describing a particular way to assemble them together.

      ...

      > In other words, our docs are written by devs for devs, and we leave the higher level stuff to tutorials written by others etc.

      The deeper and more common problem between those two symptoms, which the OP misses, is that the people writing the docs often don't use, and maybe haven't ever used, the tool or features they're documenting in their most common productive modes. The most productive devs are often the least knowledgeable people about how most, or even many, users use it.

      Companies often hire (and compensate) someone to try to take the giant mess of dev-written reference content and make guides out of them. But if those people don't use the product either, you just get better-organized docs that still miss the point.

      Most tools need usage experts writing docs far more than they need feature or software experts. The time of open-source tools' usage experts ("written by others etc.") is often as valuable as, or more valuable than, the time of the open-source project's engineers. Usage experts are likely being compensated to do almost anything but document the open-source tool, or might even be compensated to document the tools privately or internally for others in an organization to use them better than potential competitors, which is the opposite of community.

      The kinds of tools where this trends toward open tutorial creation and documentation tend to have communities of users who aren't as focused on specific tools or narrow usage as systems tools like PostgreSQL — gamedev and media production tools come to mind.

      "Make a game" or "make a movie" are no less varied than "make an app" or "make a service", and can still be prone to tooling disputes (e.g. using Unity vs. Unreal vs. Godot, Premiere vs. Final Cut vs. DaVinci), but seem to fall into the trap of hoarding tool knowledge less often. Maybe because there's more authorship and recognition to those types of work, I'm not sure.

  • By kaycebasques 2023-06-15 16:41, 6 replies

    Here's my perspective. I've been a technical writer (TW) for ~10 years. 3 at an IoT startup, 7 at Google.

    > The strengths of this process are also its weaknesses. A developer is, by definition, someone who spends the majority of their time doing development, which is to say writing code. Updating the documentation becomes a task that must be completed so that the code one has written can get committed so that one can move on to the next project and write some more code.

    I may be misinterpreting, but I get the sense that the author feels that there is some kind of more optimal way to split up docs duties. IMO there is not. At least, not for reference docs. As the author said, the people implementing the code are in the best position to keep the reference information up-to-date.

    If I grokked the rest of the article correctly, the author is essentially saying that the engineers have trouble writing and maintaining the other main types of docs [1] --- guides, tutorials, and overviews (explanations). It also sounds like they are having a "too many cooks in the kitchen" problem with pushing through changes to the other docs. I have a simple answer to that: hire some strong technical writers and make it clear to everyone that the TWs are Responsible and Accountable [2] for those docs. Also, make it explicit that the engineers are a Consulted role when it comes to guides / tutorials / overviews. Writing these types of docs is hard, specialized work. As the author said, the engineers have lots of other priorities. Of course, it's a bit self-serving for a TW to say "the solution is to hire TWs" but I get the sense that people don't realize that the easiest way to get good guides / tutorials / overviews is to hire people who have thought long and hard about those specialized tasks. If you want a good database, you don't expect your TWs to do the job. You get a database engineer. If you want good tutorials / guides / overviews, you likewise shouldn't expect your database engineer to do the job. You get TWs.

    [1] https://diataxis.fr

    [2] https://en.m.wikipedia.org/wiki/Responsibility_assignment_ma...

    • By actuallyalys 2023-06-15 21:48

      As someone who’s been both a software developer and a technical writer, I agree that a lot of these problems seem best suited for a technical writer. The expertise with creating different types of documents is valuable, of course, but there’s another benefit: The organization committing resources in the form of making it someone’s entire responsibility.

      While I think updating reference documents lends itself to subject matter experts (especially in a project where this approach is already successful), I think structuring them can be separated and given to a technical writer with more technical expertise or a developer with more documentation expertise.

    • By tetha 2023-06-15 22:07, 1 reply

      We're seeing similar things at work as the postgres documentation has.

      We in infra-ops can give you more details about how our database clusters are designed for resilience, security, safety than you want on more levels than most people in the company know exist. We also have reasoning for all of this available. This is really good to have for customer question sets during sales.

      However, this doesn't tell a developer how to connect his spring boot thingy to it, and how to connect and manage his service well. In fact, 80 - 90% or more of the things we know about our database are not relevant to a simple small-scale application running queries on it. And quite a lot of the issues you can have with running your application on a rock-solid database are entirely not relevant at a DBA level. Like, the database doesn't care if your DDL modification is backwards compatible.

      And that's something we're currently learning together with a foundation team at work. They are documenting how to use it well from their side, we're learning about easy mistakes to make and documenting those, and helping with the actionable documentation. And in hard cases we kinda have to talk about what the plan is, because it's usually not smart from a DBA's perspective.

      • By nightpool 2023-06-15 23:55, 2 replies

        How many small/new services do you have, and how many developers / teams do you have? If most of your developers are working on "small-scale applications running queries", aren't you just spending a lot of developer time on repeating the same work over and over again?

        • By tetha 2023-06-16 8:57

          20 - 30 development teams deploying 100 - 150 services in usually multiple stages. About half or a little more are rather new.

          And we have different angles of approach here. For example, we from infra-ops have set up a bunch of reusable and versioned template modules. These modules encapsulate all of the gritty consul/nomad/certificate handling, and someone actually using our databases mostly needs to supply the module with the necessary environment variables their application needs. This in turn has been extended by a bunch of these teams to provide preset templates for spring boot, .NET, python, libpq so most applications just need to configure "I'm springdev and need to go to this database".

          And currently another team is looking at building our own spring boot seeds, and/or using tools like cookiecutter or backstage to provide skeleton templates including most necessary boilerplate. We've started to maintain a template nomad job for simple applications as well.

          With these things maturing, getting a simple service stood up can be done fairly rapidly if you know what you're doing. Set up the skeleton, throw out a couple of things you don't need, be done in an hour or so.

          This however doesn't solve the initial problem. It's still tricky to know how to approach this stack and it takes time to either teach people how to approach this stack, or to condense all of this into simple how-to documents and/or workflows to get systems going.

        • By burnished 2023-06-16 0:21, 1 reply

          Reusable components and code generation cover this pretty well.

          • By nightpool 2023-06-16 0:33, 1 reply

            "However, this doesn't tell a developer how to connect his spring boot thingy to it, and how to connect and manage his service well" indicates that it does not

            • By burnished 2023-06-16 0:46

              Oh, pardon me, I was answering that as a question on its own - I should have made that clear.

    • By kaycebasques 2023-06-15 19:43

      Just re-read the last two paragraphs from Haas. I have a couple further comments / questions.

      Quote from Haas:

      > But if on the other hand I propose some change to documentation that has existed for a long time, or some kind of structural change, there's a lot more room for disagreement. Because the change isn't strictly mechanical, the right answer is a lot more subjective. And because it's a change to existing content rather than the addition of new content, many more people will be familiar with it and have opinions on how it ought to be changed, if at all. Consequently, even when some developer does take time away from writing code to try to make some larger change to the documentation, it's often an uphill battle to get anything done, and people typically have to be content with small improvements.

      Honest question: do PostgreSQL engineering decisions have the same dynamics as what's described here for docs decisions? If not, what is different about engineering decisions versus docs decisions? Is it just that engineering decisions can be literally benchmarked whereas docs decisions do not seem benchmark-able? Are there any other potential explanations for different dynamics between eng and docs?

      If it does indeed just boil down to "docs are not benchmarkable" then I would suggest otherwise. You can create docs benchmarks. They won't have the rigor of engineering benchmarks but they at least establish some notion of docs quality and facilitate more targeted discussions during docs reviews. E.g. when there's a disagreement between an author and reviewer the author can say, "what docs benchmarks am I not following here?" The power of that kind of interaction is that you often do realize that the docs benchmarks are incomplete and some new dimension needs to be added to them. Or do the PostgreSQL contributor docs already have some guidelines along the lines of a "content quality checklist" and it's still not working?

      A rigorous effort to survey the PostgreSQL community and get a deep sense of what the overall community considers "high-quality docs" can itself be a super insightful experience. Different developer communities often need / want a different focus in the docs. That can be the foundation for a fairly authoritative docs quality checklist.

      Another thing that I'm very interested in, and will need to think through deeply some other day, is this notion that once you publish a doc, you can't touch it. It happens all the time and it's really weird.

    • By BSEdlMMldESB 2023-06-16 1:07

      we need to develop the techniques (and then the tooling) so that the source code is sufficient to reconstruct the documentation.

      LLMs are fit for this use. I think they could get trained on assembly language.

      further, I think that at this point, the whole x86 spec is as impossible to analyze 'rationally' as any natural language. funny that it was created by dozens upon dozens of engineers in a rigorous and formal way; and the result is now mega complex. but I digress...

      my point is to reconsider x86 spec as a 'natural' language which is the common "tongue" between software and PC hardware.

      but by this point there's no expectation that a modern intel cpu is implementing the instructions in the way they were first created. it may well be emulating the x86 spec to the best of its capacity (another computer program "written" in 'microcode', an x86 virtual machine); so when thinking about x86 assembly spec as a language, the software computer is talking to the hardware computer; and if they agree the computer works. (so the socket and motherboard's buses can be understood as network devices xP )

      finally, if I can come up with this (LLMs trained on assembly) so can any security researcher from any advanced academy or corporation.... meaning security researchers NOT using LLMs like the ones I imagined are not doing their jobs correctly.

    • By _sharp 2023-06-16 0:30

      Our team recently adopted the Diataxis framework, I guess. We followed this guide:

      https://documentation.divio.com/

      We've been referring to it as the Divio system though, which appears to be the same thing.

    • By btilly 2023-06-15 17:19, 4 replies

      Just curious. Where do you think that an open source project like PostgreSQL gets a budget to hire anyone? Let alone to dictate a new line of authority to the volunteers who are already maintaining it?

      And don't forget that there are valuable volunteers who are likely to go elsewhere if too many new rules are added that they don't want to live with.

      • By laurenz_albe 2023-06-23 7:57

        PostgreSQL itself doesn't hire anyone. The trade mark is owned by the "PostgreSQL Community Association of Canada" (https://www.postgres.ca/), which does not hire people or engage in commercial activity. It's just there to protect the name.

        There is a PostgreSQL core team (https://www.postgresql.org/developer/core/) that is a kind of steering committee, but does not make decisions concerning development. It consists of people from different companies.

        The people who develop PostgreSQL are mostly hired by companies that have a commercial interest in the database and usually sell consulting, support, training, products that need PostgreSQL and customized closed-source forks of the database software.

        There is no single authority in the PostgreSQL project. Decisions are made in mailing list discussions, cumbersome as that may be. The ultimate authority over the code is wielded by the committers (https://wiki.postgresql.org/wiki/Committers), who are highly trusted core developers working for different companies. Development is slower than it might be with a single central authority, but that is also part of the reason for the high software quality that PostgreSQL is known for.

      • By minorninth 2023-06-15 18:30, 1 reply

        PostgreSQL, like many other open-source projects, has sponsors and accepts donations. Here's their sponsors page:

        https://www.postgresql.org/about/sponsors/

        Also, I think it's important to note that a lot of contributors aren't volunteering their "free time", they're being paid by some other employer to contribute to PostgreSQL as part of their job:

        https://www.enterprisedb.com/blog/importance-of-giving-back-...

        • By btilly 2023-06-15 22:38

          If you read https://www.postgresql.org/about/policies/sponsorship/ you'll find that the list of sponsors is essentially a recognition for companies paying their employees to contribute to PostgreSQL.

          It isn't for contributing into a pot of money allowing some central PostgreSQL committee to hand out money for other things, like hiring people to do documentation.

      • By kaycebasques 2023-06-15 17:28, 1 reply

        Open Web Docs is a potential model to draw inspiration from regarding funding: https://openwebdocs.org

        Presumably, PostgreSQL has leaders who are responsible for steering the ship. If the project is going to succeed long-term, those leaders have to find ways to keep their contributors happy while also creating an organizational structure that leads to good docs. Easier said than done, I know, but it really is as simple as that.

        Sorry if any of my comments came off naive or obtuse when it comes to open source dynamics. But the reality is that you need good docs, and I'm just trying to give an honest assessment from my experience of the conditions that lead to good docs.

        • By btilly 2023-06-15 17:53, 1 reply

          > Sorry if any of my comments came off naive or obtuse when it comes to open source dynamics.

          If you want that apology to be meaningful, you should learn something.

          When you're talking about a highly successful open source project that has been going for more than 3 decades, it is beyond ludicrous for you to say, "If the project is going to succeed long-term..." It already has succeeded long-term. And you would be better off figuring out why it works rather than lecturing about how it must work.

          When you talk about "a potential model to draw from" for funding, please note that I've been involved with open source for about a quarter of a century. I've seen a LOT of funding models attempted. Mostly they run into one big problem. And that problem is that adding funding creates bruised egos because people say, "Why is he getting paid when I'm not?"

          The one funding model that DOESN'T have this problem is when a company decides to pay its employees to work on features that it wants in the project. Now there are no bruised egos - the money comes from the company and it is clear why one person gets paid while another does not. There are still challenges with this model - employees are under pressure to get their contributions accepted whether or not the project likes them - but we've learned how to navigate those.

          But now we're left back where we started. Companies who hire core developers don't generally need comprehensive documentation - they build internal documentation straight for their use case. So comprehensive external documentation is hard to find. Sometimes you'll wind up with things like an excellent introductory tutorial like https://docs.python.org/3/tutorial/. Usually, you don't. And generally it is hard to simply pay someone to take care of it for you.

          • By kaycebasques 2023-06-15 18:53, 2 replies

            > When you're talking about a highly successful open source project that has been going for more than 3 decades, it is beyond ludicrous for you to say, "If the project is going to succeed long-term..." It already has succeeded long-term.

            Yes, your reaction here totally makes sense. Feedback acknowledged.

            > If you want that apology to be meaningful, you should learn something.

            I have re-read my earlier comments and I feel that you are being more hostile to me than is justified. I do not think you are adhering to HN's code of conduct guidelines for comments: https://news.ycombinator.com/newsguidelines.html#comments

            > you would be better off figuring out why it works rather than lecturing about how it must work

            This doesn't seem fair. The original post is about the limitations of the PostgreSQL docs. Docs have been the focus of my career for 10 years. I have experienced and analyzed docs problems in many contexts: small orgs, large orgs, open source, closed source. I made an on-topic comment about ways to resolve the problems that the PostgreSQL docs are facing. Is it the only solution? Of course not. But I totally have relevant experience in this domain and, just like you have a good idea about what generally works and doesn't work regarding open source funding, I have a pretty good idea about what generally works for creating the conditions that lead to good docs.

            > So comprehensive external documentation is hard to find.

            Again, I think the web platform space is relevant here. Web platform documentation could easily devolve into a tragedy of the commons situation. Yet MDN does exist and is an amazing resource.

            Paragraphs 4 to 6 of your last comment seem to be arguing that hiring TWs is not an option for PostgreSQL. That is totally understandable. On another day maybe we would have arrived at that understanding on friendly terms and would have had a constructive conversation about how to create good docs when hiring TWs is not possible. But it's clear that my ideas aren't welcome here so I'll just stop now.

            • By wrs 2023-06-15 19:33

              If someone gets weirdly hostile and condescending towards you on HN (sadly not uncommon), I recommend that you try to just ignore them and keep contributing. I’d like to hear what you have to say.

            • By burnished 2023-06-16 0:28

              Hey, I enjoyed your comments and appreciated your experienced view.

              And, that person is being overtly hostile - I can only describe their behavior as making shit up in order to justify picking a fight and talking down to you.

              I don't think you should take anything they say to heart - flag and move on at this point.

      • By justinclift 2023-06-17 9:12

        > Where do you think that an open source project like PostgreSQL gets a budget to hire anyone?

        From rough memory of an informal discussion with one of the members involved in the PG sponsorship area a while back, they had something like several hundred thousand US$ sitting around mostly unused.

        The real problem is proposing something like this, and getting it okay-d. Seems like a tough (political) challenge.

  • By fdr 2023-06-15 17:33, 1 reply

    I think Haas is basically right, that the structure flows from the community structure, and that it's not clear alterations would be a net win. pgsql-hackers is producing the kind of docs only they can, but many useful kinds of docs they cannot produce (per Haas's theory, e.g. more narrative in nature) are delegated to the relative anarchy of the Internet, in blogs, comment threads, stack exchange, and such.

    While there should be a lot of hesitancy at the implied assumption that any particular arrangement is at the efficient frontier -- most situations can be improved in most or all dimensions -- the kind of documents that pgsql-hackers is suited to producing would be hard to replace if they were traded away.

    • By paulryanrogers 2023-06-16 1:01

      The mailing list is very accessible if a bit old school. You can often get answers from core contributors. Overall I think their approach has scaled well enough.

      And in the past they have come around on things like replication in core. So perhaps the future will bring some more handholding docs to supplement the reference docs.
