Transpiler, a Meaningless Word (2023)

2025-11-06 20:26 125113 people.csail.mit.edu

This tool is different from a compiler, which often has a complex frontend, an optimizing middle end, and code generators for various backends. The big problem is that most of the arguments for distinguishing between compilers and “transpilers” focus on language syntax. However, anyone who wants one of these tools to actually work has to contend with the fact that different languages have different semantics, and translating between those is a complex task; a task that compilers already do.

Lie #1: Transpilers Don’t have Frontends

Let’s look at a simple Python to C transpiler. Nuitka and Mojo both target this exact problem but sanely call themselves compilers. It takes Python code that looks like this:

def fact(n):
    x = 1
    for i in range(1, n):
        x *= i
    return x

Into some C code like this:

int fact(int n) {
    int x = 1;
    for (int i = 1; i < n; i++) {
        x *= i;
    }
    return x;
}

Wow, pretty simple! But of course, that piece of Python is not very idiomatic. We can make it a bit more terse using functools.reduce:

import functools as ft
def fact(n):
    lst = range(1, n)
    return ft.reduce(lambda acc, x: acc*x, lst)

Now our “transpiler” is in a little bit of trouble. The implementation of reduce is in pure Python so maybe we can still transpile it but range is implemented purely in C.

Looking into the implementation, it becomes even clearer that matching the semantics of this program is harder still: range is lazy (strictly speaking a lazy sequence object rather than a generator), which means that instead of actually computing the numbers from 1 to n up front, it only produces them when asked. This allows our function to save memory because we don’t actually have to allocate n words and can work using just the memory for the lazy range object and the local variables.
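The memory argument can be checked directly. The following is a minimal sketch, assuming CPython; the sizes reported by sys.getsizeof are implementation details and the value of n is arbitrary. It compares the lazy range object against a fully materialized list of the same numbers.

```python
import sys

n = 1_000_000  # arbitrary size chosen for illustration

lazy = range(1, n)          # stores only start, stop, and step
eager = list(range(1, n))   # materializes every element up front

# The range object's footprint does not grow with n; the list's does.
print(sys.getsizeof(lazy))
print(sys.getsizeof(eager))

# range still behaves like a sequence: it supports len() and indexing,
# and computes each element on demand.
print(len(lazy), lazy[10])
```

A C translation that wants the same memory behaviour has to reproduce this on-demand computation rather than allocate an array of n integers.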

Another problem is that there are hundreds of built-in library functions that need to be compiled from Python to C. Even a moderately useful subset would be unwieldy to implement by hand in our simple “transpiler”. Maybe one strategy we can take is to build some sort of tool that would simplify these hundreds of definitions into a more uniform representation to work with.

We’ll call it the transpiler-not-frontend to make sure people understand we’re not building a compiler here. It is not hard to find examples of things mislabelled as transpilers. However, I won’t name any specific projects because this is just a dumb diatribe about words; I actually think the projects themselves are cool.

Lie #2: Transpilers are Simple

BabelJS is arguably one of the first “transpilers” that was developed so that people could experiment with JavaScript’s new language features that did not yet have browser implementations. Technically, ECMAScript features.

For example, ES6 added support for generators (similar to those in Python) but a lot of browser frontends did not support them. Generators are pretty nice:

function *range(max) {
  for (var i = 0; i < max; i += 1) {
    yield i;
  }
}
// Force the evaluation of the generator
console.log([0, ...range(10)])

Facebook’s regenerator is a BabelJS-based “transpiler” to transform generators into language constructs that already existed in JavaScript. Shouldn’t be too hard, right?

var _marked = /*#__PURE__*/regeneratorRuntime.mark(range);
function range(max) {
  var i;
  return regeneratorRuntime.wrap(function range$(_context) {
    while (1) {
      switch (_context.prev = _context.next) {
        case 0:
          i = 0;
        case 1:
          if (!(i < max)) {
            _context.next = 7;
            break;
          }
          _context.next = 4;
          return i;
        case 4:
          i += 1;
          _context.next = 1;
          break;
        case 7:
        case "end":
          return _context.stop();
      }
    }
  }, _marked);
}
// Force the evaluation of the generator
console.log([0, ...range(10)]);

Guess what, it is. Implementing generators is a whole-program transformation: they fundamentally rely on the ability of the program to save its internal stack and pause its execution. In fact, making it fast requires enough tricks that we wrote a paper on it.
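To make the "save its state and pause" requirement concrete, here is a hand-written Python sketch of the same idea: a generator rewritten as an explicit state machine. This is an illustration only, not regenerator's actual algorithm; the class name RangeStateMachine and its state numbering are hypothetical.

```python
class RangeStateMachine:
    """Hand-written equivalent of this generator:

        def gen_range(max_value):
            i = 0
            while i < max_value:
                yield i
                i += 1
    """

    def __init__(self, max_value):
        self.max_value = max_value
        self.i = 0       # the generator's local variable, now kept on the object
        self.state = 0   # which resume point to jump to on the next call

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            if self.state == 0:        # loop test
                if not (self.i < self.max_value):
                    self.state = 2
                    continue
                self.state = 1         # resume after the "yield" next time
                return self.i
            elif self.state == 1:      # code that followed the yield
                self.i += 1
                self.state = 0
            else:                      # generator has finished
                raise StopIteration


# Force the evaluation of the hand-written "generator"
print([0, *RangeStateMachine(10)])
```

Even for this tiny loop, the local variable and the control-flow position both had to move out of the function body and into explicit storage, which is why the rewrite cannot be done by local syntactic substitution.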

The point here is that people call arbitrarily complex tools “transpilers”. Again, the problem is the misguided focus on language syntax and a lack of understanding of the semantic difference.

Lie #3: Transpilers Target the Same Level of Abstraction

This is pretty much the same as (2). The input and output languages have the syntax of JavaScript but the fact that compiling one feature requires a whole program transformation gives away the fact that these are not the same language. If we’re to get beyond the vagaries of syntax and actually talk about what the expressive power of languages is, we need to talk about semantics.

Lie #4: Transpilers Don’t have Backends

BabelJS has a list of “presets” which target different versions of JavaScript. This is not very different from LLVM having multiple different backends. If you’re going to argue that the backends all compile to the same language, see (3). People might argue that when Babel is compiling its operations, it can do it piecemeal: that is, the compilation of nullish coalescing operators has nothing to do with how classes are compiled.

This is exactly what compiler frontends do as well: they transform a large surface area of syntax into a smaller language, and a lot of operations are simple syntactic sugar which can be represented using other, more foundational primitives in the language. For example, in the Rust compiler, the mid-level representation (MIR) does away with features like if-let by compiling them into match statements. In fact, Clippy, a style suggestion tool for Rust, implements this as a source-to-source transformation: if you have simple match statements in your program, Clippy will suggest a rewrite to you.
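As a rough illustration of desugaring, the sketch below uses Python's standard ast module to rewrite augmented assignment (x *= i) into a plain assignment (x = x * i). The class name DesugarAugAssign is hypothetical, and this is not how CPython or rustc implement anything; it only shows the shape of a sugar-removing pass.

```python
import ast


class DesugarAugAssign(ast.NodeTransformer):
    """Rewrite `name op= value` into `name = name op value`."""

    def visit_AugAssign(self, node):
        self.generic_visit(node)
        # Only simple names are handled; a target such as a[f()] would need
        # care to avoid evaluating the subscript expression twice.
        if not isinstance(node.target, ast.Name):
            return node
        load = ast.Name(id=node.target.id, ctx=ast.Load())
        new = ast.Assign(
            targets=[node.target],
            value=ast.BinOp(left=load, op=node.op, right=node.value),
        )
        return ast.copy_location(new, node)


source = """
def fact(n):
    x = 1
    for i in range(1, n):
        x *= i
    return x
"""

tree = DesugarAugAssign().visit(ast.parse(source))
ast.fix_missing_locations(tree)
desugared = ast.unparse(tree)  # ast.unparse requires Python 3.9+
print(desugared)
```

Even this rewrite is only safe here because x is an integer: for objects that define in-place operators, x *= i and x = x * i can behave differently, so a correct pass needs semantic information and not just syntax.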

Compilers already do things that “transpilers” are supposed to do. And they do it better because they are built on the foundation of language semantics instead of syntactic manipulation.

Lie #5: Compilers only Target Machine Code

This one is interesting because instead of defining the characteristics of a “transpiler”, it focuses on restricting the definition of a compiler. Unfortunately, this one too is wrong. The term is widely used in many contexts where we are not generating assembly code and instead generating bytecode for some sort of virtual machine. For example, the JVM has an ahead-of-time compiler from Java source code to the JVM bytecode and another just-in-time compiler to native instructions. These kinds of multi-tier compilation schemes are extremely common in dynamic languages like JavaScript as well.
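CPython is another example that is easy to inspect, although the article itself only mentions the JVM and JavaScript. The sketch below uses the built-in compile() and the standard dis module to show source text being compiled to bytecode for a virtual machine; the function name double is made up for illustration.

```python
import dis

source = "def double(n):\n    return n * 2\n"

# compile() runs CPython's compiler: source text in, a code object out.
module_code = compile(source, "<example>", "exec")

namespace = {}
exec(module_code, namespace)
double = namespace["double"]

# The function body is stored as bytecode for the CPython virtual machine.
print(type(double.__code__.co_code))
dis.dis(double)  # the exact opcodes vary between CPython versions
```

No machine code is produced at any point, yet the standard library and documentation consistently call this step compilation.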

Lie #6: Transpilers are not Compilers

People seem to be scared of compilers and resort to claims like “I don’t want something as complex”, or “string interpolation is good enough”. This is silly. Anyone who has built one of these “transpilers” knows that inevitably, they get complex and poorly maintained precisely because of the delusion that they aren’t doing something complex.

Programming languages are not just syntax; they have semantics too. Pretending that you can get away with just manipulating the former is delusional and results in bad tools.

Lindsey Kuper has a well-written article on the same topic.

Comments

By jasode 2025-11-13 9:02 (11 replies)

Whenever someone argues the uselessness or redundancy of a particular word, a helpful framework to understand their perspective is "Lumpers vs Splitters" : https://en.wikipedia.org/wiki/Lumpers_and_splitters

An extreme caricature example of a "lumper" would just use the word "computer" to label all Turing Complete devices with logic gates. In that mindset, having a bunch of different words like "mainframe", "pc", "smartphone", "game console", "FPGA", etc are all redundant because they're all "computers" which makes the various other words pointless.

On the other hand, the Splitters focus on the differences and I previously commented why "transpiler" keeps being used even though it's "redundant" for the Lumpers : https://news.ycombinator.com/item?id=28602355

We're all Lumpers vs Splitters to different degrees for different topics. A casual music listener who thinks of orchestral music as background sounds for the elevator would "lump" both Mozart and Bach together as "classical music". But an enthusiast would get irritated and argue "Bach is not classical music, it's Baroque music. Mozart is classical music."

The latest example of this I saw was someone complaining about the word "embedding" used in LLMs. They were asking ... if an embedding is a vector, why didn't they just re-use the word "vector"?!? Why is there an extra different word?!? Lumpers-vs-splitters.

By Izkata 2025-11-13 16:05 (1 reply)

"Compiler" encompassing "transpiler" I think is wrong anyway. There's a third term that doesn't seem to get nearly as much pushback, that didn't come up in your link, has yet to be mentioned here, and isn't in the article, but adds context for these two: decompiler.

Compiling is high-level to low-level (source code to runnable, you rarely look at the output).

Decompiling is low-level to high-level (runnable to source code, you do it to get and use the output).

Transpiling is between two languages of roughly the same level (source code to source code, you do it to get and use the output).

Certainly there's some wishy-washy-ness due to how languages relate to each other, but none of these terms really acts like a superset of the others.

By cestith 2025-11-13 17:03 (3 replies)

I like your definitions, but all three of these could be called subsets of compilers.

By ablob 2025-11-13 18:03 (1 reply)

By the definitions given they cannot, as no function subsumes another. By whatever you define as "compiler" maybe, but I see no point in this kind of interaction that essentially boils down to subsumption to an entity you refuse to describe any further.

Is there a merit to this? Can whatever you call compiler do more? Is it all three of the things mentioned combined? Who knows - as it stands I only know that you disagree with the definitions given/proposed.

By cestith 2025-11-14 14:52

I think they are fine definitions. I think a transpiler, a term rewriter, an assembler, a stand-alone optimizer, and even some pretty printers are subclasses of compilers.

I define a compiler as something that takes an input in a language, does transformations, and produces a transformed output in a language. All of them do that, and they are more specific terms for types of compilers.

By jrm4 2025-11-13 19:21 (1 reply)

Except that they do what useful words do; provide (more) useful information.

By cestith 2025-11-14 14:53

Fair. I don’t believe I said they were useless terms for differentiation of types of compilers, though. I just said they can all be thought of as a class as different types of compilers.

By coldtea 2025-11-14 22:21 (1 reply)

So what? A car, a bike, and a truck can all be called subsets of vehicles, but we still have (and need) different words for each type.

By cestith 2025-11-17 21:08

Primates, canids, felids, and ungulates are all subsets of mammals and all have further subsets. Mammalia is further a subset of animalia. When we’re discussing categorizations, it’s often helpful to have multiple levels of categories. I’m not sure why you seem to be calling out specificity as a gotcha, when my argument isn’t at all that we don’t need multiple terms. It’s that we should consider these things in terms of similarity and specific differences, not throw away a term as useless as the article and its headline suggest.

By cyco130 2025-11-13 12:48

It's all about context, isn't it? "Humans vs. animals" is an important distinction to make in some contexts and useless in others. Insisting on the fact that humans are also animals if we're talking about, say, "language in humans vs. animals" is unproductive. It just makes discussions harder by forcing everyone to add "_non-human_ animals" to every mention. But if we're talking about, say, cellular biology, it's unproductive to force everyone to write "human and animal cells" instead of just "animal cells".

Similarly, distinguishing between transpilers and compilers might be important in some contexts and useless in others. Transpilers are source-to-source compilers, a subset of compilers. Whether it matters depends on the context.

By mlyle 2025-11-13 15:21

I think the argument here is not really where one should draw the line and whether transpiler should be a different word...

I think the argument centers on how transpilers are often justified as being something quite different in difficulty than writing a whole compiler -- and in practice, nearly the whole set of problems of writing a compiler show up.

So, it's more like, don't use the distinction to lie to yourself.

By mjburgess 2025-11-13 11:04 (4 replies)

I'm not convinced your L/S dichotomy applies. The concern there is that the natural world (or some objective target domain) has natural joints, and the job of the scientist (philosopher, et al.) is to uncover those joints. You want to keep 'hair splitting' until the finest bones of reality are clear, then grouping hairs up into lumps, so their joints and connections are clear. The debate is whether the present categorisation objectively under/over-generates, and whether there is a fact of the matter. If it over-includes, then real structure is missing.

In the case of embeddings vs. vectors, classical vs. baroque, transpiler vs. compiler -- I think the apparent 'lumper' is just a person ignorant of the classification scheme offered, or at least, ignorant of what property it purports to capture.

In each case there is a real objective distinction beneath the broader category that one offers in reply, and that settles the matter. There is no debate: a transpiler is a specific kind of compiler; an embedding vector is a specific kind of vector; and so on.

There is nothing at stake here as far as whether the categorisation is tracking objective structure. There is only ignorance on the part of the lumper: the ignorant will, of course, always adopt more general categories ("thing" in the most zero-knowledge case).

A real splitter/lumper debate would be something like: how do we classify all possible programs which have programs as their input and output? Then a brainstorm which does not include present joint-carving terms, eg., transformers = whole class, transformer-sourcers = whole class on source code, ...

By jasode 2025-11-13 14:34 (2 replies)

> I think the apparent 'lumper' is just a person ignorant of the classification scheme offered, or at least, ignorant of what property it purports to capture.

> In each case there is a real objective distinction

No, Lumper-vs-Splitter doesn't simply boil down to plain ignorance. The L/S debate in the most sophisticated sense involves participants who actually know the proposed classifications but _choose_ to discount them.

Here's another old example of a "transpiler" disagreement subthread where all 4 commenters actually know the distinctions of what that word is trying to capture but 3-out-of-4 still think that extra word is unnecessary: https://news.ycombinator.com/item?id=15160415

Lumping-vs-Splitting is more about emphasis vs de-emphasis via the UI of language. I.e. "I do actually see the extra distinctions you're making but I don't elevate that difference to require a separate word/category."

The _choice_ by different users of language to encode the difference into another distinct word is subjective not objective.

Another example could be the term "social media". There's the seemingly weekly thread where somebody proclaims, "I quit all social media" and then there's the reply of "Do you consider HN to be social media?". Both the "yes" and "no" sides already know and can enumerate how Facebook works differently than HN so "ignorance of differences" of each website is not the root of the L/S. It's subjective for the particular person to lump in HN with "social media" because the differences don't matter. Likewise, it's subjective for another person to split HN as separate from social media because the differences do matter.

By tshaddox 2025-11-13 17:29

> Here's another old example of a "transpiler" disagreement subthread where all 4 commenters actually know the distinctions of what that word is trying to capture but 3-out-of-4 still think that extra word is unnecessary

Ha. I see this same thing play out often where someone is arguing that “X is confusing” for some X, and their argument consists of explaining all relevant concepts accurately and clearly, thus demonstrating that they are not confused.

By mjburgess 2025-11-13 15:14 (1 reply)

I agree there can be such debates; that's kinda my point.

I'm just saying, often there is no real debate; it's just one side is ignorant of the distinctions being made.

Any debate in which one side makes distinctions and the other is ignorant of them will be an apparent L vs. S case -- to show "it's a real one" requires showing that answering the apparent L's question doesn't "settle the matter".

In the vast majority of such debates you can just say, eg., "transpilers are compilers that maintain the language level across input/output langs; and sometimes that's useful to note -- eg., that TypeScript has a JS target." -- if such a response answers the question, then it was a genuine question, not a debate position.

I think in the cases you list most people offering L-apparent questions are asking a sincere learning question: why (because I don't know) are you making such a distinction? That might be delivered with some frustration at their misperception of "wasted cognitive effort" in such distinction-making -- but it isn't a technical position on the quality of one's classification scheme.

By pessimizer 2025-11-13 17:06 (1 reply)

> it's just one side is ignorant of the distinctions being made.

> No, Lumper-vs-Splitter doesn't simply boil down to plain ignorance.

If I can boil it down to my own interpretation: when this argument occurs, both sides usually know exactly what each other are talking about, but one side is demanding that the distinction being drawn should not be important, while the other side is saying that it is important to them.

To me, it's "Lumpers" demanding that everyone share their value system, and "Splitters" saying that if you remove this terminology, you will make it more difficult to talk about the things that I want to talk about. My judgement about it all is that "Lumpers" are usually intentionally trying to make it more difficult to talk about things that they don't like or want to suppress, but pretending that they aren't as a rhetorical deceit.

All terminology that makes a useful distinction is helpful. Any distinction that people use is useful. "Lumpers" are demanding that people not find a particular distinction useful.

Your "apparent L's" are almost always feigning misunderstanding. It's the "why do you care?" argument, which is almost always coming from somebody who really, really cares and has had this same pretend argument with everybody who uses the word they don't like.

By mjburgess 2025-11-13 20:42

I mean, I agree. I think most L's are either engaged in a rhetorical performance of the kind you describe, or they're averse to cognitive effort, or ignorant in the literal sense.

There are a small number of highly technical cases where an L vs S debate makes sense, biological categorisation being one of them. But mostly, it's an illusion of disagreement.

Of course, the pathological-S case is a person inviting distinctions which are contextually inappropriate ("this isn't just an embedding vector, it's a 1580-dim! EV!"). So there can be S-type pathologies, but I think those are rarer and mostly people roll their eyes rather than mistake it as an actual "position".

By seg_lol 2025-11-13 11:48 (1 reply)

> I'm not convinced your L/S dichotomy applies.

Proceeds to urm actually split.

By datadrivenangel 2025-11-13 14:18 (2 replies)

All ontologies are false. But some are useful.

By mjburgess 2025-11-13 20:44

All ontologies people claim to be ontologies are false in toto.

All "ontologies" are false.

There is, to disquote, one ontology which is true -- and the game is to find it. The reason getting close to that one is useful, the explanation of utility, is its singular truth.

By cestith 2025-11-13 17:05

To be a lumper for a second, all models are flawed. But some are useful.

By ethmarks 2025-11-13 14:18

Ahh, so you're a meta-splitter.

https://xkcd.com/2518/

By thfuran 2025-11-13 15:17 (1 reply)

>An extreme caricature example of a "lumper" would just use the word "computer" to label all Turing Complete devices with logic gates.

"Computer"? You mean object, right?

By ObscureScience 2025-11-13 17:10

You mean point of local maximum in the mass field?

By kragen 2025-11-13 10:58 (1 reply)

> An extreme caricature example of a "lumper" would just use the word "computer" to label all Turing Complete devices with logic gates.

I don't think that's a caricature at all; I've often seen people argue that it should include things like Vannevar Bush's differential analyzer, basically because historically it did, even though such devices are neither Turing-complete nor contain logic gates.

By mjburgess 2025-11-13 11:11 (2 replies)

'computer' is an ambiguous word. In a mathematical sense a computational process is just any which can be described as a function from the naturals to naturals. Ie., any discrete function. This includes a vast array of processes.

A programmable computer is a physical device which has input states which can be deterministically set, and reliably produce output states.

A digital computer is one whose state transition is discrete. An analogue computer has continuous state transition -- but still, necessarily, discrete states (by def of computer).

An electronic digital programmable computer is an electric computer whose voltage transitions count as states discretely (ie., 0/1 V cutoffs, etc.); it's programmable because we can set those states causally and deterministically; and its output state arises causally and deterministically from its input state.

In any given context these 'hidden adjectives' will be inlined. The 'inlining' of these adjectives causes an apparent gatekeepery Lumpy/Splitter debate -- but it isn't a real one. It's just ignorance about the objective structure of the domain, and so a mistaken understanding about what adjectives/properties are being inlined.

By lesam 2025-11-13 12:02 (1 reply)

In fact ‘computer’ used to be a job description: a person who computes.

By kragen 2025-11-13 12:30

Yes, definitely. And "nice" used to mean "insignificant". But they don't have those meanings now.

By kragen 2025-11-13 11:14 (1 reply)

Most functions from the naturals to naturals are uncomputable, which I would think calls into question your first definition.

It's unfortunate that "computer" is the word we ended up with for these things.

By mjburgess 2025-11-13 11:24 (1 reply)

Ah well, that's true -- so we can be more specific: discrete, discrete computable, and so on.

But to the overall point, this kind of reply is exactly why I don't think this is a case of L vs. S -- your reply just forces a concession to my definition, because I am just wrong about the property I was purporting to capture.

With all the right joint-carving properties to hand, there is a very clear matrix and hierarchy of definitions:

abstract mathematical hierarchy vs., physical hierarchy

With the physical serving as implementations of partial elements of the mathematical.

By kragen 2025-11-13 11:27 (1 reply)

Word definitions are arbitrary social constructs, so they can't really be correct or incorrect, just popular or unpopular. Your suggested definitions do not reflect current popular usage of the word "computer" anywhere I'm familiar with, which is roughly "Turing-complete digital device that isn't a cellphone, tablet, video game console, or pocket calculator". This is a definition with major ontological problems, including things such as automotive engine control units, UNIVAC 1, the Cray-1, a Commodore PET, and my laptop, which have nothing in common that they don't also share with my cellphone or an Xbox. Nevertheless, that seems to be the common usage.

By lo_zamoyski 2025-11-13 12:39 (1 reply)

> Word definitions are arbitrary social constructs, so they can't really be correct or incorrect, just popular or unpopular.

If you mean that classifications are a matter of convention and utility, then that can be the case, but it isn’t always and can’t be entirely. Classifications of utility presuppose objective features and thus the possibility of classification. How else could something be said to be useful?

Where paradigmatic artifacts are concerned, we are dealing with classifications that join human use with objective features. A computer understood as a physical device used for the purpose of computing presupposes a human use of that physical thing “computer-wise”; that is to say, objectively, no physical device per se is a computer, because nothing inherent in the thing is computing (what Searle called “observer relative”). But the physical machine is objectively something, which is to say, ultimately, a collection of physical elements of certain kinds operating on one another in a manner that affords a computational use.

We may compare paradigmatic artifacts with natural kinds, which do have an objective identity. For instance, human beings may be classified according to an ontological genus and an ontological specific difference such as “rational animal”.

Now, we may dispute certain definitions, but the point is that if reality is intelligible–something presupposed by science and by our discussion here at the risk of otherwise falling into incoherence–that means concepts reflect reality, and since concepts are general, we already have the basis for classification.

By kragen 2025-11-13 13:08 (1 reply)

No, I don't mean that classifications are a matter of convention and utility, just word definitions. I think that some classifications can be better or worse, precisely because concepts can reflect reality well or poorly. That's why I said that the currently popular definition of "computer" has ontological problems.

I'm not sure that your definition helps capture what people mean by "computer" or helps us approach a more ontologically coherent definition either. If, by words like "computing" and "computation", you mean things like "what computers do", it's almost entirely circular, except for your introduction of observer-relativity. (Which is an interesting question of its own—perhaps the turbulence at the base of Niagara Falls this morning could be correctly interpreted as finding a proof of the Riemann Hypothesis, if we knew what features to pay attention to.)

But, if you mean things like "numerical calculation", most of the time that people are using computers, they are not using them for numerical calculation or anything similar; they are using them to store, retrieve, transmit, and search data, and if anything the programmers think of as numerical is happening at all, it's entirely subordinate to that higher purpose, things like array indexing. (Which is again observer-relative—you can think of array indexing as integer arithmetic mod 2⁶⁴, but you can also model it purely in terms of propositional logic.)

And I think that's one of the biggest pitfalls in the "computer" terminology: it puts the focus on relatively minor applications like accounting, 3-D rendering, and LLM inference, rather than on either the machine's Protean or universal nature or the purposes to which it is normally put. (This is a separate pitfall from random and arbitrary exclusions like cellphones and game consoles.)

By lo_zamoyski 2025-11-13 17:37

> That's why I said that the currently popular definition of "computer" has ontological problems.

Indeed. To elaborate a bit more on this...

Whether a definition is good or bad is at least partly determined by its purpose. Good as what kind of definition?

If the purpose is theoretical, then the common notion of "computer" suffers from epistemic inadequacy. (I'm not sure the common notion rises above mere association and family resemblance to the rank of "definition".)

If the purpose is practical, then under prevailing conditions, what people mean by "computer" in common speech is usually adequate: "this particular form factor of machine used for this extrinsic purpose". Most people would call desktop PCs "computers", but they wouldn't call their mobile phones computers, even though ontologically and even operationally, there is no essential difference. From the perspective of immediate utility as given, there is a difference.

I don't see the relevance of "social construction" here, though. Sure, people could agree on a definition of computer, and that definition may be theoretically correct or merely practically useful or perhaps neither, but this sounds like a distraction.

> I'm not sure that your definition helps capture what people mean by "computer" or helps us approach a more ontologically coherent definition either.

In common speech? No. But the common meaning is not scientific (in the broad sense of that term, which includes ontology) and inadequate for ontological definition, because it isn't a theoretical term. So while common speech can be a good starting point for analysis, it is often inadequate for theoretical purposes. Common meanings must be examined, clarified, and refined. Technical terminology exists for a reason.

> If, by words like "computing" and "computation", you mean things like "what computers do", it's almost entirely circular

I don't see how. Computation is something human beings do and have been doing forever. It preexists machines. All machines do is mechanize the formalizable part of the process, but the computer is never party to the semantic meaning of the observing human being. It merely stands in a relation of correspondence with human formalism, the same way five beads on an abacus or the squiggle "5" on a piece of paper denote the number 5. The same is true of representations that denote something other than numbers (a denotation that is, btw, entirely conventional).

Machines do not possess intrinsic purpose. The parts are accidentally arranged in a manner that merely gives the ensemble certain affordances that can be parlayed into furthering various desired human ends. This may be difficult for many today to see, because science has - for practical purposes or for philosophical reasons - projected a mechanistic conceptual framework onto reality that recasts things like organisms in mechanistic terms. But while this can be practically useful, theoretically, this mechanistic mangling of reality has severe ontological problems.

By AlienRobot 2025-11-13 11:20 (1 reply)

That's very interesting!

Splitters make more sense to me since different things should be categorized differently.

However, I believe a major problem in modern computing is when the splitter becomes an "abstraction-splitter."

For example, take the mouse. The mouse is used to control the mouse cursor, and that's very easy to understand. But we also have other devices that can control the mouse cursor, such as the stylus and touchscreen devices.

A lumper would just say that all these types of devices are "mouses" since they behave the same way mouses do, while a splitter would come up with some stupid term like "pointing devices" and then further split it into "precise pointing devices" and "coarse pointing devices", ensuring that absolutely nobody has any idea what they are talking about.

As modern hardware and software keep getting built on piles and piles of abstractions, I feel this problem keeps getting worse.

By ethmarks 2025-11-1314:281 reply

Doesn't it make sense to use words that mean what you're using them to mean?

By your logic I could use the term "apple" to describe apples, oranges, limes, and all other fruit because they all behave in much the same ways that apples do. But that's silly because there are differences between apples and oranges [citation needed]. If you want to describe both apples and oranges, the word for that is "fruit", not "apple".

Using a touchscreen is less precise than using a mouse. If the user is using a touchscreen, buttons need to be bigger to accommodate for the user's lack of input precision. So doesn't it make sense to distinguish between mice and touchscreens? If all you care about is "thing that acts like a mouse", the word for that is "pointing device", not "mouse".

By AlienRobot 2025-11-1414:47

The point is that it's simpler to understand what something is by analogy (a touchscreen is a mouse) than by abstraction (a mouse is a pointing device; a touchscreen is also a pointing device), since you need a third, abstracting concept to do the latter.

By bondarchuk 2025-11-1317:591 reply

Whenever someone argues the uselessness or redundancy of a particular word we just have to remember that the word exists because at least two parties have found it useful to communicate something between them.

By thfuran 2025-11-1321:15

But they may have done so before the meaning shifted or before other, more useful words were coined.

By zem 2025-11-1322:08

in addition to that, some people just seem to have an extreme aversion to neologisms. I remember being surprised by that when ajax (the web technology) first came out and there was a huge "why does this thing which is just <I honestly forget what it was 'just'> need its own name?" faction.

By antonvs 2025-11-1315:26

> But an enthusiast would get irritated and argue "Bach is not classical music, it's Baroque music. Mozart is classical music."

Baroque music is a kind of classical music, though.

By downboots 2025-11-147:03

and a combination lock is a permutation lock

By oersted 2025-11-139:262 reply

I don't understand what the issue is: a transpiler is a compiler that outputs in a language that human programmers use.

It's good to be aware of that from an engineering standpoint, because the host language will have significantly different limitations, interoperability and ecosystem, compared to regular binary or some VM byte-code.

Also, I believe that they are meaningfully different in terms of compiler architecture. Outputting an assembly-like language is quite different from generating an AST of a high-level programming language. Yes, of course it's fuzzy, because some compilers use intermediate representations that in some cases are fairly high-level, but still they are not meant for human use and there are many practical differences.

It's a clearly delineated concept, so why not have a word for it?
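To make the architectural comparison concrete, here is a minimal sketch of one shared frontend feeding two backends: one emits C source text, the other emits instructions for a stack machine. Everything here is an assumption made for illustration: the helper names (`parse_expr`, `emit_c`, `emit_stack`) and the `PUSH`/`LOAD`/`ADD`/`MUL` instruction set are hypothetical, and only arithmetic expressions are handled.

```python
import ast

# Operator tables for the two hypothetical backends.
C_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}
STACK_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def parse_expr(source):
    """Shared frontend: parse one Python expression into an AST node."""
    return ast.parse(source, mode="eval").body


def emit_c(node):
    """Backend 1: emit a fully parenthesized C expression as text."""
    if isinstance(node, ast.Constant):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp):
        op = C_OPS[type(node.op)]
        return f"({emit_c(node.left)} {op} {emit_c(node.right)})"
    raise NotImplementedError(type(node).__name__)


def emit_stack(node):
    """Backend 2: emit instructions for a made-up stack machine."""
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp):
        # Post-order walk: operands first, then the operator.
        return emit_stack(node.left) + emit_stack(node.right) + [STACK_OPS[type(node.op)]]
    raise NotImplementedError(type(node).__name__)


tree = parse_expr("a * (b + 2)")
print(emit_c(tree))      # (a * (b + 2))
print(emit_stack(tree))  # ['LOAD a', 'LOAD b', 'PUSH 2', 'ADD', 'MUL']
```

In a toy like this the two backends are nearly the same tree walk. The practical differences mentioned above (interoperability, ecosystem, limitations of the host language) only show up at a larger scale than a sketch can capture.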

By kragen 2025-11-1311:013 reply

GCC outputs textual GNU assembly language, in which I have written, for example, a web server, a Tetris game, a Forth interpreter, and an interpreter for an object-oriented language with pattern-matching. Perhaps you are under the illusion that I am not a human programmer because this is some kind of superhuman feat, but to me it seems easier than programming in high-level languages. It just takes longer. I think that's a common experience.

Historically speaking, almost all video games and operating systems were written in assembly languages similar to this until the 80s.

By oersted 2025-11-1314:291 reply

Of course I'm aware of this, I've written some assembly too, most definitions are leaky. And if GNU assembly had wide adoption among programmers right now and an ecosystem around it, then some people might also call GCC a transpiler (in that specific mode, which is not the default), if they care about the fact that it outputs in a language that they may read or write by hand comfortably.

They also called C a high-level language at that time. There was also more emphasis on the distinction between assemblers and compilers. Indeed, they may have used the word compiler more in the sense we use transpiler now, I'm sure people were also saying that it was just a fancy assembler. Terminology shifts.

By kragen 2025-11-1322:09

I think what happened was that, when writing in assembly language was a common thing to do, programmers had a clearer idea of what a compiler did, so they knew better than to say "transpiler".

By crazygringo 2025-11-1312:591 reply

https://news.ycombinator.com/item?id=45912557

By kragen 2025-11-1313:24

Thank you for the link; I've responded comprehensively at https://news.ycombinator.com/item?id=45914592.

By NuclearPM 2025-11-1312:34

You’re being \__

By krapp 2025-11-1314:431 reply

The issue is confused because of Javascript and the trend to consider Javascript "bytecode for the web" because it is primarily "compiled" from other languages, rather than being considered a language in its own right.

I've gotten into arguments with people who refuse to accept that there is any difference worth considering between javascript and bytecode or assembly. From that perspective, the difference between a "transpiler" and a "compiler" is just aesthetics.

By oersted 2025-11-1314:53

I do think you are right, the concept is not common outside of the JS ecosystem to be fair. Indeed, it probably wouldn't make much sense to transpile in the first place, if it wasn't for these browser limitations. People would just make fully new languages, and it is starting to happen with WebAssembly.

And the ecosystem of JVM and BEAM hosted languages does make the concept even murkier.

By jchw 2025-11-139:043 reply

Transpilers are compilers that translate from one programming language to another. I am not 100% sure where these "lies" come from, but it's literally in the name, it's clearly a portmanteau of translating compiler... Where exactly are people thinking the "-piler" suffix comes from?

Yes, I know. You could argue that a C compiler is a transpiler, because assembly language is generally considered a programming language. If this is you, you have discovered that there are sometimes concepts that are not easy to rigorously define but are easy for people to understand. This is not a rare phenomenon. For me, the difference is that a transpiler is intending to target a programming language that will be later compiled by another compiler, and not just an assembler. But, it is ultimately true that this definition is still likely not 100% rigorous, nor is it likely going to have 100% consensus. Yet, people somehow know a transpiler when they see one. The word will continue to be used because it ultimately serves a useful purpose in communication.

By s20n 2025-11-1313:181 reply

One distinction is that compilers generally translate from a higher-level language to a lower-level language, whereas transpilers translate between two languages which are very close in abstraction level. For example, a program that translated x86 assembly to RISC-V assembly would be considered a transpiler.

By kragen 2025-11-1313:232 reply

The article we are discussing has "Transpilers Target the Same Level of Abstraction" as "Lie #3", and it clearly explains why that is not true of the programs most commonly described as "transpilers". (Also, I've never heard anyone call a cross-assembler a "transpiler".)

By MrJohz 2025-11-147:37

I don't really agree with their argument, though. Pretty much all the features that Babel deals with are syntax sugar, in the sense that if they didn't exist, you could largely emulate them at runtime by writing a bit more code or using a library. The sugar adds a layer of abstraction, but it's a very thin layer, enough that most JavaScript developers could compile (or transpile) the sugar away in their head.

On the other hand, C to Assembly is not such a thin layer of abstraction. Even the parts that seem relatively simple can change massively as soon as an optimisation pass is involved. There is a very clear difference in abstraction layer going on here.

I'll give you that these definitions are fuzzy. Nim uses a source-to-source compiler, and the difference in abstraction between Nim and C certainly feels a lot smaller than the difference between C and Assembly. But the C that Nim generates is, as I understand it, very low-level, and behaves a lot closer to assembly, so maybe in practice the difference in abstraction is greater than it initially seems? I don't think there's a lot of value in trying to make a hard-and-fast set of rules here.

However, it's clear that there is a certain subset of compilers that aim to do source-to-source desugaring transformations, and that this subset of compilers has certain similarities and requirements that mean it makes sense to group them together in some way. And to do that, we have the term "transpiler".
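As an illustration of what a source-to-source desugaring pass can look like, here is a minimal sketch using Python's standard `ast` module (`ast.unparse` needs Python 3.9 or later). It rewrites `x op= y` into `x = x op y`. The names `DesugarAugAssign` and `desugar` are hypothetical, and this is not how Babel or any real tool is implemented.

```python
import ast


class DesugarAugAssign(ast.NodeTransformer):
    """Rewrite `name op= value` into `name = name op value`."""

    def visit_AugAssign(self, node):
        self.generic_visit(node)
        # Only handle plain names. For targets like `a[f()] += 1` the naive
        # rewrite would evaluate `f()` twice and change behavior.
        if not isinstance(node.target, ast.Name):
            return node
        new_node = ast.Assign(
            targets=[ast.Name(id=node.target.id, ctx=ast.Store())],
            value=ast.BinOp(
                left=ast.Name(id=node.target.id, ctx=ast.Load()),
                op=node.op,
                right=node.value,
            ),
        )
        return ast.copy_location(new_node, node)


def desugar(source):
    """Parse, transform, and print back out as Python source."""
    tree = DesugarAugAssign().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(desugar("x *= i"))  # x = x * i
```

Even this thin layer needs a parser, a tree rewrite, and a printer. It is also only semantics-preserving for values like integers: for a list, `x += y` mutates in place while `x = x + y` builds a new object, so the "sugar" is not purely syntactic.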

By jchw 2025-11-1320:38

Abstraction layers are close to the truth, but I think it's just slightly off. It comes down to the fact that transpilers are considered source-to-source compilers, but one man's intermediate code is another man's source code. If you are logically considering neither the input nor the output to be "source code", then you might not consider it to be a transpiler for the same reasons that an assembler is rarely called a compiler, even though assemblers can have compiler-like features: consider LLVM IR, for example. This is why a cross-assembler is not often referred to as a transpiler. Of course, terminology is often tricky: the term "recompiler" is often used for this sort of thing, even though neither the input nor the output is generally considered "source code", probably because they are designed to essentially construct a result as similar as possible to if you were able to recompile the source code for another target. This seems to contrast fairly well with "decompiler", as a recompiler may perform similar reconstructive analysis to a decompiler, but ultimately outputs more object code. Not that I am an authority on anything here, but I think these terms ultimately do make sense and reconcile with each other.

When people say "Same Level of Abstraction", I think what they are expressing is that they believe both of the programming languages for the input and output are of a similar level of expressiveness, though it isn't always exact, and the example of compiling down constructs like async/await shows how this isn't always cut-and-dry. It doesn't imply that source-to-source translations, though, are necessarily trivial, either: A transpiler that tries to compile Go code to Python would have to deal with non-trivial transformations even though Python is arguably a higher level of abstraction and expressiveness, not lower. The issue isn't necessarily the abstraction level or expressiveness, it's just an impedance mismatch between the source language and the destination language. It also doesn't mean that the resulting code is readable or not readable, only that the code isn't considered low level enough to be bytecode or "object code". You can easily see how there is some subjectivity here, but usually things fall far away enough from the gray area that there isn't much of a need to worry about this. If you can decompile Java bytecode and .NET IL back to nearly full-fidelity source code, does that call into question whether they're "compilers" or the bytecode is really object code? I think in those cases it gets close and more specific factors start to play into the semantics. To me this is nothing unusual with terminology and semantics, they often get a lot more detailed as you zoom in, which becomes necessary when you get close to boundaries. And that makes it easier to just apply a tautological definition in some cases: like for Java and .NET, we can say their bytecode is object code because that's what they're considered to be already, because that's what the developers consider them to be. Not as satisfying, but a useful shortcut: if we are already willing to accept this in other contexts, there's not necessarily a good reason to question it now.

And to go full circle, most compilers are not considered transpilers, IMO, because their output is considered to be object code or intermediate code rather than source code. And again, the distinction is not exact, because the intermediate code is also Turing complete, also has a human readable representation, and people can and do write code in assembly. But brainfuck is also Turing complete, and that doesn't mean that brainfuck and C are similarly expressive.
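To illustrate why compiling down a construct like async/await is not a trivial syntax swap, here is a hand-written sketch that uses a Python generator as a stand-in for an async function. The class `CountdownStateMachine` is a hypothetical lowering written by hand for this example, not the output of any real tool: the suspended position in the function body has to become explicit state.

```python
def countdown(n):
    """Sugared form: the language suspends and resumes at `yield`."""
    while n > 0:
        yield n
        n -= 1


class CountdownStateMachine:
    """Hand-lowered form: the suspension point becomes an explicit state."""

    START, SUSPENDED, DONE = 0, 1, 2

    def __init__(self, n):
        self.n = n
        self.state = self.START

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == self.SUSPENDED:
            self.n -= 1  # resume: run the code that followed the yield
        if self.state == self.DONE or self.n <= 0:
            self.state = self.DONE
            raise StopIteration
        self.state = self.SUSPENDED
        return self.n  # suspend: hand the value back to the caller


print(list(countdown(3)))              # [3, 2, 1]
print(list(CountdownStateMachine(3)))  # [3, 2, 1]
```

Both forms are Python, so the input and output sit at nominally the same level of abstraction, yet the rewrite restructures control flow rather than just renaming syntax.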

By vrighter 2025-11-1714:10

so is the thing that generates bytecode from java or c# source code a compiler or a transpiler? Because the generated code will be compiled by the JIT at runtime.

By kragen 2025-11-1313:211 reply

On the contrary: it reifies people's prejudices and prevents them from seeing reality, often in the service of intentional deception, which for my purposes is the opposite of a useful purpose in communication.

There's currently a fad in my country for selling "micellar water" for personal skin cleansing, touted as an innovation. But "micelles" are just the structure that any surfactant forms in water, such as soap, dish detergent, or shampoo, once a certain critical concentration is reached, so "micellar water" is just water with detergent in it. People believe they are buying a new product because it's named with words that they don't know, but they are being intentionally deceived.

Similarly, health food stores are selling "collagen supplements" for US$300 per kilogram to prevent your skin from aging. These generally consist of collagen hydrolysate. The more common name for collagen hydrolysate is "gelatin". Food-grade gelatin sells for US$15 per kilogram. (There is some evidence that it works, though it's far from overwhelming; what I'm focusing on here is the terminology.) People believe they are buying a special new health supplement because they don't know what gelatin is, but they are being intentionally deceived.

You might argue, "People somehow know micellar water when they see it," or, "People somehow know collagen supplements when they see them," but in fact they don't; they are merely repeating what it says on the jar because they don't know any better. They are imagining a distinction that doesn't exist in the real world, and that delusion makes them vulnerable to deception.

Precisely the same is true of "transpilers". The term is commonly used to mislead people into believing that a certain piece of software is not a compiler, so that knowledge about compilers does not apply to it.

By jchw 2025-11-1314:071 reply

> The term is commonly used to mislead people into believing that a certain piece of software is not a compiler, so that knowledge about compilers does not apply to it.

Why would people use a word that has the word "compiler" in it to try to trick people into thinking something is not a compiler? I'm filing this into "issues not caused by the thing that is being complained about".

By kragen 2025-11-1321:501 reply

Apparently nobody has ever said to you, "No, it's not a compiler, it's a transpiler," which makes you a luckier person than I am. People know less than you think.

By jchw 2025-11-1322:32

I don't even understand why someone would say that. What's the point in asserting that something isn't a compiler? Not that I doubt that this really happens, but I don't know what saying something "isn't a compiler" is meant to prove. Is it meant to downplay the complexity of a transpiler?

Obviously I believe transpilers are compilers. A cursory Google search shows that the word transpiler is equated to "source-to-source compiler" right away. If it truly wasn't a compiler, didn't have a true frontend and really did a trivial syntax-to-syntax translation, surely it would only be a translator, right? That is my assumption.

But all that put aside for a moment, I do stand by one thing; that's still not really an issue I blame on the existence of the word transpiler. If anything, it feels like it is in spite of the word transpiler, which itself heavily hints at the truth...