Hacker News

Markdown is holding you back

2025-11-22 20:03 · 90 · 65 · newsletter.bphogan.com

Explore why Markdown, despite its ubiquity, might not be the best fit for technical content.


I've used many content formats over the years, and while I love Markdown, I run into its limitations daily when I work on larger documentation projects.

In this issue, you'll look at Markdown and explore why it might not be the best fit for technical content, and what else might work instead.

Markdown Lacks the Structure You Need

Markdown is everywhere. It's human-readable, approachable, and has just enough syntax to make docs look good in GitHub or a static site. That ease of use is why it's become the default choice for developer documentation. I'm using Markdown right now to write this newsletter issue. I love it.

But Markdown's biggest advantage is its biggest drawback: it doesn't describe the content like other formats can.

Think about how your content gets consumed. It isn't just for human readers. Machines use it too. Search engines index it and LLMs parse it, and both work from the well-formed HTML your systems publish. Markdown's basic syntax only emits a small subset of the semantic tags HTML allows.

IDE integrations can use your docs, too. And AI agents rely on structure to answer developer questions. If you're only feeding them plain-text Markdown documents to reduce the number of tokens you send, you're not providing as much context as you could.

Worse, when you want to reuse your content or syndicate content into another system, you quickly find out that Markdown is more of the lowest common denominator than a source of truth, as not all Markdown flavors are the same.

There are other options you can use that give you more control. But first, let's look deeper into why you should move away from Markdown for serious work.

Markdown is "implicit typing" for content

If you're a developer, you know all about type systems in programming languages. Some languages use implicit typing, in which the compiler or interpreter infers the data type from the value. These languages give you flexibility, but no guarantees. That's why many developers prefer languages that use explicit typing, where you predefine data types when writing the code. In those languages, the compiler doesn't just build your code; it guarantees specific rules are followed. That's the main reason for the rise of TypeScript over JavaScript: compile-time guarantees.

Markdown is implicit typing. It lets you write quickly, but without constraints or guarantees. There's no schema. No way to enforce consistency. A heading in one file might be a concept, in another it might be a step, and there's no machine-readable distinction between the two.
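To make the analogy concrete, here is a minimal Python sketch. The `Step` and `Note` classes and the `parse_markdown_bullets` helper are hypothetical names invented for this illustration; they are not part of any Markdown tool. The point is only that a bullet parsed from Markdown comes back as untyped text, while an explicitly typed model lets a program ask what each item is.

```python
import re
from dataclasses import dataclass


@dataclass
class Step:
    """An explicitly typed procedure step (hypothetical type)."""
    text: str


@dataclass
class Note:
    """An explicitly typed aside (hypothetical type)."""
    text: str


def parse_markdown_bullets(source: str) -> list[str]:
    """Return the text of each '- ' bullet; the bullet's role is not recoverable."""
    return re.findall(r"^- (.+)$", source, flags=re.MULTILINE)


markdown = "- Install the library\n- Requires Node.js 22 or later\n"

# Implicit: both bullets are just strings, so a step and a note look identical.
bullets = parse_markdown_bullets(markdown)

# Explicit: the author states what each item is, so filtering is trivial.
typed = [Step("Install the library"), Note("Requires Node.js 22 or later")]
steps = [item.text for item in typed if isinstance(item, Step)]
```

With the typed list, selecting only the steps is a one-line filter. With the bullet strings, the same selection would need a heuristic.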

To complicate things further, there are multiple flavors of Markdown, each with its own features and markup. Here are just a few:

You think you're writing "Markdown," but what works in one tool may not render in another. Some Markdown processors allow footnotes. Others ignore soft line breaks. And some even require different formatting for code blocks. Inconsistency makes Markdown a shaky foundation for anything beyond the most basic document.

And then there's MDX, which people often use to extend Markdown to support things it doesn't.

Here's a typical MDX snippet:

# Install

<Command>npm install my-library</Command>

That <Command> tag isn't Markdown at all; it's a React component. Instead of using a code block, the author chose to create a special component to standardize how all commands would display in the documentation.

It works beautifully on their site because their publishing system knows what <Command> means. But if they try to syndicate this content to another system, it breaks because that system also needs to implement that component. And even if it were supported elsewhere, there's no guarantee that the component is implemented the same way.
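The portability problem can be sketched in a few lines of Python. This is not how MDX is implemented; the `SITE_A` and `SITE_B` registries and the `render` function are hypothetical stand-ins for two publishing systems, one that defines the component and one that does not.

```python
import re

# Hypothetical registries: each publishing system knows only its own components.
SITE_A = {"Command": lambda body: f"<pre class='cmd'>$ {body}</pre>"}
SITE_B = {}  # a syndication target that never implemented <Command>

# Matches a capitalized tag pair such as <Command>...</Command>.
TAG = re.compile(r"<([A-Z]\w*)>(.*?)</\1>", re.DOTALL)


def render(source: str, components: dict) -> str:
    """Expand capitalized component tags; fail loudly on unknown ones."""
    def expand(match: re.Match) -> str:
        name, body = match.group(1), match.group(2)
        if name not in components:
            raise KeyError(f"no renderer registered for <{name}>")
        return components[name](body)

    return TAG.sub(expand, source)


snippet = "<Command>npm install my-library</Command>"
print(render(snippet, SITE_A))
```

Calling `render(snippet, SITE_B)` raises `KeyError`, because the meaning of `<Command>` lives in one site's code rather than in the document format.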

MDX shows that even in Markdown-centric ecosystems, people instinctively add more expressive markup. They know plain Markdown isn't enough. They're reinventing semantic markup, but in a way that's custom, brittle, and not portable.

Why semantic markup matters

Semantic markup describes what content is, not just how it should look. It's the difference between saying "here's a bullet with some text" and "here's a step in a procedure." To a human, those may look the same on a page. To a machine or to a publishing pipeline, they are entirely different.

Web developers already went through all this with HTML. Prior to HTML5, you had <div> as a logical container. But HTML5 introduced <section>, <article>, <aside>, and many other elements that described the content.

Semantic markup matters for two important and related reasons:

Transformation and reuse. With semantic markup, you can publish the same content to HTML, PDF, ePub, or even plain Markdown. With Markdown as your source, you can't easily go to another format. You can't turn a bullet into a <step> or a paragraph into a <para> without guessing. You can't add context if it wasn't there to begin with, but you can strip out what you don't need when you transform the document, and you can choose how to present each thing in a consistent way.

Machine consumption. LLMs and agents can make better use of content that carries structure. A step marked as a <step> is unambiguous. A bullet point might be a step, or a note, or just a list item. The machine has to guess. This is why XML was a preferred mechanism for web services for a long time, and why JSON Schema exists.
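Here is a rough sketch of "exporting downward" with Python's standard-library `xml.etree.ElementTree`. The `<procedure>`, `<step>`, and `<note>` vocabulary is made up for this example and is not taken from any real schema; `to_markdown` is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up semantic source: each element says what the content is.
SOURCE = """
<procedure>
  <step>Install the library</step>
  <step>Run the setup script</step>
  <note>Requires Node.js 22 or later</note>
</procedure>
"""


def to_markdown(xml_text: str) -> str:
    """Render steps as a numbered list and notes as blockquotes."""
    lines = []
    number = 0
    for element in ET.fromstring(xml_text):
        text = (element.text or "").strip()
        if element.tag == "step":
            number += 1
            lines.append(f"{number}. {text}")
        elif element.tag == "note":
            lines.append("")  # blank line separates the list from the quote
            lines.append(f"> **Note:** {text}")
    return "\n".join(lines)


print(to_markdown(SOURCE))
```

Going this way is mechanical because every element already states its role. The reverse trip, from the generated Markdown back to `<step>` and `<note>`, would depend on guessing what each list item or blockquote was meant to be.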

Let's explore four formats that give you more control over structure than plain Markdown.

reStructuredText

reStructuredText is a plain-text markup language from the Python/Docutils ecosystem that supports directives, roles, and structural semantics. It is the foundational format used by Sphinx for generating documentation.

Installation
============

.. code-block:: bash

   npm install my-library

.. note::

   This library requires Node.JS ≥ 22.

See also :ref:`usage-guide`.

Here you see a code-block directive, an admonition (note), and an explicit cross-reference via :ref:. You'll find support for images, figures, topics, sidebars, pull quotes, epigraphs, and citations as well.

All of those encode semantics, not just presentation.

AsciiDoc

AsciiDoc aims to be human-readable but semantically expressive. It has attributes, conditional content, include mechanisms, and more.

Here's an example of AsciiDoc:

= Installation
:revnumber: 1.2
:platform: linux
:prev_section: introduction
:next_section: create-project

[source,bash]
----
npm install my-library
----

NOTE: This library requires Node.JS ≥ 22.

See <<usage,Usage Guide>> for examples.

AsciiDoc has native support for document front-matter. Attributes like :revnumber: or :platform: let you parameterize content.
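As a rough illustration of what parameterizing content means, the Python sketch below collects `:name: value` lines and substitutes `{name}` references in the remaining text. It is a toy, not Asciidoctor's actual processing model, and `resolve_attributes` is a hypothetical helper written for this example.

```python
import re

ATTRIBUTE_LINE = re.compile(r":([\w-]+):\s*(.*)")
ATTRIBUTE_REF = re.compile(r"\{([\w-]+)\}")


def resolve_attributes(source: str) -> str:
    """Collect ':name: value' lines, then fill '{name}' references in the body."""
    attributes = {}
    body = []
    for line in source.splitlines():
        match = ATTRIBUTE_LINE.fullmatch(line)
        if match:
            attributes[match.group(1)] = match.group(2)
        else:
            body.append(line)
    text = "\n".join(body).strip()
    # References to attributes that were never defined are left untouched.
    return ATTRIBUTE_REF.sub(
        lambda ref: attributes.get(ref.group(1), ref.group(0)), text
    )


sample = ":platform: linux\n:revnumber: 1.2\n\nTested on {platform}, revision {revnumber}."
print(resolve_attributes(sample))
```

Changing the one `:platform:` line changes every place the document refers to it, which is the kind of reuse plain Markdown has no syntax for.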

<<usage,Usage Guide>> is a cross-reference syntax.

Like reStructuredText, AsciiDoc supports admonitions like NOTE and WARNING so you don't have to build your own custom renderer. It also has support for sidebars, and you can add line highlighting and callouts to your code blocks without additional extensions.

And if you're writing technical documentation, there's explicit support for marking up UI elements and keyboard shortcuts.

Using Asciidoctor, you can transform AsciiDoc into other formats, including HTML, PDF, ePub, and DocBook, which you'll look at next.

DocBook (XML)

DocBook is an XML-based document model explicitly designed for technical publishing. It expresses hierarchical and semantic structure in tags and attributes, enabling industrial-grade transformations.

Here's an example:

<article id="install-library">
 <title>Installation</title>
 <command>npm install my-library</command>
 <note>This library requires Node.JS &gt;= 22</note>
 <xref linkend="usage-chapter">Usage Guide</xref>
</article>

Every tag is meaningful: <command> vs <para>, <note> vs <xref>. You'll find predefined tags for function names, variables, application names, keyboard shortcuts, UI elements, and much more. Being able to mark up the specific product names and terminology you use makes it so much easier to create glossaries and indexes. And DocBook has tags for defining index terms, too.

DocBook's rich ecosystem of XSLT stylesheets supports transforming to HTML, PDF, man pages, and even Markdown. Using DocBook ensures structure and validation at scale, as long as you use the tags it provides.
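To see why named tags make indexes easy, here is a small sketch that reuses the example above and queries it with Python's standard-library `xml.etree.ElementTree`. It only reads well-formed XML; it does not validate against the DocBook schema, and a real pipeline would more likely use the XSLT stylesheets.

```python
import xml.etree.ElementTree as ET

# The DocBook example from above, reused as-is for illustration.
DOC = """
<article id="install-library">
 <title>Installation</title>
 <command>npm install my-library</command>
 <note>This library requires Node.JS &gt;= 22</note>
 <xref linkend="usage-chapter">Usage Guide</xref>
</article>
"""

root = ET.fromstring(DOC)

# Every command in the document, found by tag name alone.
commands = [element.text for element in root.iter("command")]

# Every cross-reference target, read from the linkend attribute.
targets = [element.get("linkend") for element in root.iter("xref")]

print(commands)
print(targets)
```

No heuristics are involved: the query for commands cannot accidentally pick up the note or the title, because each is marked up as what it is.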

Then there's DITA.

DITA (Darwin Information Typing Architecture)

DITA is a standard for writing, managing, and publishing content. It's a topic-based XML architecture with built-in reuse, specialization, and modular content design. It's an open standard, and it's widely used in enterprises for multi-channel, structured content that needs standardization and reuse.

Here's an example:

<task id="install">
 <title>Installation</title>
 <steps>
 <step><cmd>npm install my-library</cmd></step>
 </steps>
 <prolog>
 <note>This library requires Node.js &gt;= 22</note>
 </prolog>
</task>

DITA defines types like <task> and <step>, which cleanly map to procedural structure. You can compose topics, reuse via content references (conrefs), and specialize as your domain evolves.

One of the more interesting features DITA provides is the ability to filter content and create multiple versions from a single document.
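The filtering idea can be sketched with the standard library alone. DITA elements can carry a `platform` attribute for conditional processing; real toolchains apply filters through their own configuration, so treat `filter_platform` below as a hypothetical simplification. The install commands in the sample are invented for the example.

```python
import xml.etree.ElementTree as ET

# Sample topic with invented commands; two steps are platform-specific.
TOPIC = """
<task id="install">
 <title>Installation</title>
 <steps>
  <step platform="linux"><cmd>sudo apt install my-tool</cmd></step>
  <step platform="mac"><cmd>brew install my-tool</cmd></step>
  <step><cmd>my-tool --version</cmd></step>
 </steps>
</task>
"""


def filter_platform(xml_text: str, platform: str) -> ET.Element:
    """Drop elements whose platform attribute names only other platforms."""
    root = ET.fromstring(xml_text)
    # Snapshot both loops so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get("platform")
            if value is not None and platform not in value.split():
                parent.remove(child)
    return root


linux = filter_platform(TOPIC, "linux")
print([cmd.text for cmd in linux.iter("cmd")])
```

The same source yields a Linux version and a macOS version, and the unmarked step appears in both.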

The DITA Open Toolkit and many enterprise tools handle rendering, transformation, and reuse pipelines.

Ew. XML.

Yes, XML. The syntax is more verbose than Markdown. Tooling is less ubiquitous than Markdown. Migration requires effort, and your team may resist the learning curve. For small docs, Markdown's features are often enough.

But if you're already bolting semantics onto Markdown with MDX or plugins or custom scripts, you're paying that complexity cost anyway, and you don't get the benefits of standardization or portability. You're building a fragile, custom semantic layer instead of adopting one that already works.

So where does that leave you?

If you're writing a quick README or a short-lived doc, Markdown is fine. It's fast, approachable, and does the job. If you're building a developer documentation site that needs some structure, reStructuredText or AsciiDoc are better choices. They balance expressiveness with usability. And if you're managing a large doc set that needs syndication, reuse, and multi-channel publishing, DocBook and DITA give you the semantics and tooling to make that process more manageable.

The key is to start with the richest format you can manage and export downward. Markdown makes a great output for developers. It's approachable and familiar. But be careful not to lock yourself into it as your source of truth, because you can't add context back as easily as you can strip it out.

Things To Explore

I have a new book out. Check out Write Better with Vale. This book walks you through implementing Vale, the prose linter, on your next writing project to create consistent, quality content.
Tidewave.ai is a full-stack coding agent from the creators of the Elixir programming language. It supports Ruby on Rails, Phoenix, and React applications and has a free tier. You'll need an API key for OpenAI, Anthropic, or GitHub Copilot to use it.
Google's Chrome for Developers blog has a post on creating accessible carousels. It's worth the read if you have to implement one of these on your site.

Before the next issue, here are a couple of things you should try to get some hands-on experience with a different format.

As always, thanks for reading. Share this issue with someone who you think would find this helpful.

I'd love to talk with you about this issue on BlueSky, Mastodon, Twitter, or LinkedIn. Let's connect!

Please support this newsletter and my work by encouraging others to subscribe and by buying a friend a copy of Write Better with Vale, tmux 3, Exercises for Programmers, Small, Sharp Software Tools, or any of my other books.


zdw

Karma: 136259


Comments

By procaryote 2025-11-23 7:52

I use markdown because it's easy to read without rendering. All of the alternatives in the article seem worse

If I wanted more structure, I'd just write html; or mix html into the markdown.

Pandoc lets me do things like generate libreoffice or microsoft word documents from the markdown, using a reference document for styling of headings etc. This also gives me good enough control to generate OK looking pdfs. It's not LaTeX levels of control, but it's much easier

I don't want to do extra work to hypothetically make things easier for an LLM.

By dschuessler 2025-11-22 22:20 (3 replies)

You can include arbitrary HTML tags in Markdown at any place you need them.[0] I am not aware of any Markdown tooling that does not support this.

So, no, Markdown is not holding me back. It is perfectly capable of what the author claims it isn't.

[0]: https://daringfireball.net/projects/markdown/syntax#html

By throwaway150 2025-11-23 0:00 (1 reply)

> You can include arbitrary HTML tags in Markdown at any place you need them.

That is well known and I am sure the author is aware of it. The problem they are describing is not whether HTML is technically allowed inside Markdown. It's that when you are writing Markdown, you are writing Markdown, not HTML, and that comes with some problems.

> It is perfectly capable of what the author claims it isn't.

In theory, yes. In practice, using Markdown becomes much less appealing once you start dropping raw HTML all over the place. The whole point of choosing Markdown is that you do not want to spend your time typing <p>, <a>, <li> and the rest. You want to write in Markdown, with only occasional HTML when absolutely necessary.

That is exactly where the author's complaints become relevant. If the solution to Markdown's limitations is routinely switching to HTML, then the argument becomes circular. If you are expected to write HTML to address the author's complaints, why bother with Markdown at all? If the answer is just "write HTML", then you may as well skip Markdown in the first place.

By vorpalhex 2025-11-23 2:24 (1 reply)

Most markdown engines allow short tags to stand in for html, so for frequent features you can just use a short tag.

Alternatively you can extend markdown. I wrote a simple text based game engine that was markdown based but I needed some arbitrary additions appropriate for a game.. so I just added a few elements.

By hysan 2025-11-23 5:14

The author addresses this too. Once you start down this path, you go down the road of non-standardization which means losing portability, etc. I don’t see how this is a point against the author?

By henrebotha 2025-11-22 22:53 (2 replies)

There are real limitations to this: You can't arbitrarily mix and match HTML and Markdown. As soon as you introduce an HTML block, you're locked out of Markdown syntax.

AsciiDoc lets you mix and match however you want. Or, put differently: AsciiDoc's superiority over Markdown extends even to being better at shelling out to HTML.

By vidarh 2025-11-22 23:01

While that's true, I'd take Markdown + extensions to allow inline HTML or custom tags over AsciiDoc any day, even at the cost of losing some compatibility - converting that to plain Markdown is usually easy enough.

By tefkah 2025-11-22 22:56

mdx does tho. you could just not define any components, then you can nest markdown inside html no problem

By hizanberg 2025-11-23 7:28

I also put interactive components in my markdown docs, I’m only using Markdown for content now.

By jimmar 2025-11-22 22:09 (1 reply)

Markdown is the minimum viable product. It’s easy to learn and still readable if not rendered in an alternate format. It’s great.

For making PDFs, I’ve recently moved from AsciiDoc to Typst. I couldn’t find a good way to get AsciiDoc to make accessible PDFs, and I found myself struggling to control the output. Typst solves all of AsciiDoc’s problems for me.

But in the end, no markup language will make you write better. It’s kind of like saying that ballpoint pens are limiting your writing, so you should switch to mechanical pencils.

By undeveloper 2025-11-22 22:25 (2 replies)

typst looks interesting -- but how are you writing it? from what I looked at, it looks like theres an official web editor and a vscode plugin with limited support. this feels pretty limited, as someone who came in expecting something like obsidian.

By TRiG_Ireland 2025-11-22 22:41 (1 reply)

I'm not aware of any limitations in the Tinymist plugin.

And you can just write it in the plain text editor of your choice, and keep an eye on the PDF with typst watch.

By addaon 2025-11-23 0:48

> I'm not aware of any limitations in the Tinymist plugin.

I looked into this a while ago, and couldn't find a workflow I could live with. Have things improved? What's the workflow like for working on an image in, say, OmniGraffle to include in the document? Does text search in embedded PDFs work these days? LinkBack so I can edit the images easily inline?

By MillironX 2025-11-22 23:22

I've started experimenting with Typst for a few documents, and here's my stack:

- Zed editor with Typst plugin

- Tinymist LSP settings turned on to render on save in Zed, see https://code.millironx.com/millironx/nix-dotfiles/src/commit...

- Okular open to the output document. Okular refreshes the document when changed on disk.

It's not as polished as say, LaTeX Workshop in VSCode, but it gets the job done.