I'm building a clarity-first language (compiles to C++)

2026-02-11 4:30 3551 github.com

A clarity-first programming language that compiles to C++ - taman-islam/roxlang

ROX is a minimal, clarity-first programming language built on a simple belief:

Programming logic should not have to fight the language.

ROX removes implicit behavior, hidden conversions, and syntactic tricks so that expressing logic feels direct and mechanical rather than negotiated.

ROX compiles .rox source files into C++20 (.cc), which are then compiled into native executables using clang++.

This repository contains the ROX compiler implementation written in C++20.

In many languages, expressing simple logic often requires navigating:

  • Implicit type coercions
  • Silent conversions
  • Operator overloading
  • Hidden control flow
  • Exception systems
  • Special cases

ROX intentionally removes these.
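
For contrast, here is a small C++ sketch of the kind of implicit conversions the list above refers to. It is not ROX code and is not taken from the ROX repository; the function names are made up for illustration. Each function compiles without a cast, and the conversion happens silently.

```cpp
#include <cassert>

// double -> int: the fractional part is discarded without any cast.
int truncate_silently(double d) {
    int n = d;  // implicit narrowing conversion
    return n;
}

// bool -> int: each flag is promoted to 0 or 1 before the addition.
int add_flags(bool a, bool b) {
    return a + b;
}

// char -> int: the character is promoted to its numeric code.
int next_code(char c) {
    return c + 1;
}

// Self-check of the behaviour described in the comments above.
void check_implicit_conversions() {
    assert(truncate_silently(3.9) == 3);
    assert(add_flags(true, true) == 2);
    assert(next_code('a') == 'b');
}
```

A language that rejects all three forms would force the programmer to spell out the truncation, the flag count, and the character arithmetic.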

The goal is not convenience. The goal is clarity of expression.

In ROX:

  • Every type is explicit.
  • Every error is a value.
  • Every access is deliberate.
  • Every control structure is visible.
  • Nothing implicit happens behind your back.

You write the logic. The language stays out of the way.

ROX v0 enforces:

  • No implicit type conversions
  • No bracket indexing ([] only for list literals)
  • No exceptions — errors are explicit values (rox_result[T])
  • A single loop construct (repeat)
  • Explicit control flow only
  • Strict compile-time type checking

The surface area is intentionally small and opinionated.

Two Sum implemented in ROX:

function two_sum(list[num] nums, num target) -> list[num] {
    num n = nums.size();

    repeat i in range(0, n, 1) {
        repeat j in range(i + 1, n, 1) {

            rox_result[num] r1 = nums.at(i);
            if (not isOk(r1)) { return [-1, -1]; }
            num v1 = getValue(r1);

            rox_result[num] r2 = nums.at(j);
            if (not isOk(r2)) { return [-1, -1]; }
            num v2 = getValue(r2);

            if (v1 + v2 == target) {
                return [i, j];
            }
        }
    }

    return [-1, -1];
}

Things to note:

  • .at() returns rox_result[T]
  • Errors must be handled explicitly
  • No implicit casts
  • repeat over range(...) is the only loop construct
  • Lists are accessed only via .at()

ROX prioritizes clarity over convenience. Explicitness may cost more keystrokes, but it eliminates hidden behavior.

Types:

  • num (64-bit signed integer)
  • bool
  • float
  • char
  • string
  • none
  • list[T]
  • dictionary[K, V]
  • rox_result[T]

Control flow:

  • if / else
  • repeat i in range(start, end, step)
  • break
  • continue

Built-in functions:

  • print(val) -> none (supports string, num, float, bool, char, list)
  • isOk(rox_result[T]) -> bool
  • getValue(rox_result[T]) -> T
  • getError(rox_result[T]) -> string

Math functions:

  • num32_abs(n)
  • num32_min(a, b)
  • num32_max(a, b)
  • num32_pow(base, exp) -> rox_result[num32]
  • num_abs(n)
  • num_min(a, b)
  • num_max(a, b)
  • num_pow(base, exp) -> rox_result[num]
  • float_abs(n)
  • float_min(a, b)
  • float_max(a, b)
  • float_pow(base, exp)
  • float_sqrt(n) -> rox_result[float]
  • float_sin(n)
  • float_cos(n)
  • float_tan(n)
  • float_log(n) -> rox_result[float]
  • float_exp(n)
  • float_floor(n)
  • float_ceil(n)
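
Several of the math functions above return rox_result rather than a bare value. The README does not say which inputs count as errors, so the following C++ sketch is only a guess at the idea: integer power fails on a negative exponent or on 64-bit overflow, and square root fails on negative input. std::optional stands in for rox_result here (a real result would also carry an error message), the names checked_pow and checked_sqrt are hypothetical, and __builtin_mul_overflow is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>

// Integer power by repeated squaring; returns nullopt instead of overflowing.
// Assumption: a negative exponent is treated as an error.
std::optional<std::int64_t> checked_pow(std::int64_t base, std::int64_t exp) {
    if (exp < 0) return std::nullopt;
    std::int64_t result = 1;
    while (exp > 0) {
        // Multiply the current power into the result when the low bit is set.
        if ((exp & 1) != 0 && __builtin_mul_overflow(result, base, &result)) {
            return std::nullopt;
        }
        exp >>= 1;
        // Square the base only if it will still be needed.
        if (exp > 0 && __builtin_mul_overflow(base, base, &base)) {
            return std::nullopt;
        }
    }
    return result;
}

// Square root that reports negative input as an error instead of returning NaN.
std::optional<double> checked_sqrt(double n) {
    if (n < 0.0) return std::nullopt;
    return std::sqrt(n);
}

// Self-check of the cases described above.
void check_math() {
    assert(checked_pow(2, 10).value() == 1024);
    assert(!checked_pow(2, 63).has_value());   // 2^63 does not fit in int64
    assert(!checked_pow(2, -1).has_value());
    assert(checked_sqrt(9.0).value() == 3.0);
    assert(!checked_sqrt(-1.0).has_value());
}
```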

ROX does not use exceptions. Errors are explicit values:

rox_result[num] r = nums.at(i);
if (not isOk(r)) {
    return [-1, -1];
}
num value = getValue(r);

To get the error message:

if (not isOk(r)) {
    print("Error: ");
    print(getError(r));
    print("\n");
}

Nothing throws. Nothing hides.
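
To make the pattern concrete, here is a minimal C++ sketch of how a result type with isOk, getValue, and getError could be modeled, together with a bounds-checked at(). This is an illustration under assumptions, not the type the ROX compiler actually emits: the struct name RoxResult, its layout, and the error text are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical C++ model of rox_result[T]; the real generated type may differ.
template <typename T>
struct RoxResult {
    bool ok;
    T value;            // meaningful only when ok is true
    std::string error;  // meaningful only when ok is false
};

template <typename T>
bool isOk(const RoxResult<T>& r) { return r.ok; }

template <typename T>
T getValue(const RoxResult<T>& r) {
    assert(r.ok && "getValue called on an error result");
    return r.value;
}

template <typename T>
std::string getError(const RoxResult<T>& r) { return r.error; }

// Bounds-checked access in the spirit of list.at(): it never throws and never
// reads out of range; a bad index becomes an error value instead.
RoxResult<std::int64_t> at(const std::vector<std::int64_t>& xs, std::int64_t i) {
    if (i < 0 || static_cast<std::size_t>(i) >= xs.size()) {
        return {false, 0, "index out of range"};
    }
    return {true, xs[static_cast<std::size_t>(i)], ""};
}

// Self-check: one valid access and one out-of-range access.
void check_at() {
    std::vector<std::int64_t> nums{2, 7, 11, 15};
    auto r = at(nums, 1);
    assert(isOk(r) && getValue(r) == 7);
    auto bad = at(nums, 4);
    assert(!isOk(bad) && getError(bad) == "index out of range");
}
```

The caller has to inspect the result before using the value, which is the same discipline the ROX snippets above impose.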

Strings are immutable sequences of UTF-8 bytes.

string s = "Hello, World!";
print(s);
print("\n");

Dictionaries are hash maps for key-value storage.

dictionary[string, num] scores;
scores.set("Alice", 100);
if (scores.has("Alice")) {
    print(scores.get("Alice"));
}
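
For readers more familiar with C++, the dictionary operations used above could map onto std::unordered_map roughly as follows. The wrapper class Dictionary is hypothetical and not taken from the ROX runtime. Returning std::optional from get() is also an assumption, since the README does not state what get returns for a missing key.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical wrapper exposing set/has/get on top of std::unordered_map.
template <typename K, typename V>
class Dictionary {
public:
    // Inserts the key or overwrites the existing value.
    void set(const K& key, const V& value) { items_.insert_or_assign(key, value); }

    // True when the key is present.
    bool has(const K& key) const { return items_.find(key) != items_.end(); }

    // Returns the stored value, or nullopt for a missing key (assumed behaviour).
    std::optional<V> get(const K& key) const {
        auto it = items_.find(key);
        if (it == items_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<K, V> items_;
};

// Self-check mirroring the "scores" example.
void check_dictionary() {
    Dictionary<std::string, std::int64_t> scores;
    scores.set("Alice", 100);
    assert(scores.has("Alice"));
    assert(scores.get("Alice").value() == 100);
    assert(!scores.has("Bob"));
    assert(!scores.get("Bob").has_value());
}
```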

Comments are single-line and start with //.

// This is a comment
num x = 10; // This is also a comment

ROX is compiled, not interpreted.

.rox → .cc → native binary

  1. ROX source is parsed and type-checked.
  2. C++20 code is generated.
  3. clang++ compiles the emitted C++ into an executable.

The generated C++ is intentionally straightforward and readable.
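
As an illustration of what a straightforward lowering could look like, here is a hypothetical C++ rendering of repeat i in range(start, end, step). It is not taken from the ROX compiler's output. The half-open interval matches the Two Sum example (range(0, n, 1) with n = nums.size()), while the handling of zero and negative steps is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collects the values that a plain for-loop lowering of
//   repeat i in range(start, end, step) { ... }
// would visit. Assumptions: the interval is half-open [start, end),
// a negative step counts down, and a zero step runs no iterations.
std::vector<std::int64_t> range_values(std::int64_t start, std::int64_t end,
                                       std::int64_t step) {
    std::vector<std::int64_t> out;
    if (step > 0) {
        for (std::int64_t i = start; i < end; i += step) out.push_back(i);
    } else if (step < 0) {
        for (std::int64_t i = start; i > end; i += step) out.push_back(i);
    }
    return out;
}

// Self-check of the three cases.
void check_range() {
    assert((range_values(0, 5, 2) == std::vector<std::int64_t>{0, 2, 4}));
    assert((range_values(5, 0, -2) == std::vector<std::int64_t>{5, 3, 1}));
    assert(range_values(0, 3, 0).empty());
}
```

The sketch ignores overflow of the loop counter near the limits of a 64-bit integer, which a real code generator would have to consider.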

Requirements:

  • C++20-compatible compiler (e.g., clang++)
  • Make

Usage:

./rox run test/two_sum.rox
./rox format test/two_sum.rox
./rox generate test/two_sum.rox
./rox compile test/two_sum.rox

You can run all verified test programs with the provided script:

Alternatively, run them individually:

./rox run test/two_sum.rox

The test/ directory contains verified implementations of:

  • Two Sum
  • Valid Parentheses
  • Binary Search
  • Maximum Subarray (Kadane’s Algorithm)
  • Longest Substring Without Repeating Characters

These serve as correctness and regression tests for the compiler pipeline.

ROX v0 focuses on:

  • Core type system
  • Explicit error handling
  • Deterministic control flow
  • Clean C++ code generation
  • Minimal language surface

Future directions (ROX++) may include:

  • Module system
  • Expanded standard library
  • Static analysis improvements

ROX is not trying to compete with mainstream languages. It is an exploration of this question:

What does programming look like when the language refuses to be clever?

A local web-based playground is available to try ROX code interactively.

Then open http://localhost:3000 in your browser.



Comments

  • By HendrikHensen, 2026-02-15 7:53 (1 reply)

    If this is to be a real, (relatively) widely-used language, I would make some tough choices on where to innovate, and where to just leave things the same.

    One thing I noticed in the example is `num target`, especially because the focus is on "clarity". When I read the example, I was sure that `num` would be something like the JavaScript `Number` type. But to my surprise, it's just a 64-bit integer.

    For an extremely long time, languages have had "int", "integer", "int64", and similar. If you aim for clarity, I would strongly advise just keeping those names rather than inventing new words for them just because. Both because of familiarity (most programmers coming to your language will already be familiar with other languages which have "int(eger)"), and because of clarity ("int(eger)" is unambiguous, it is a well-defined term meaning a whole number; "num" is ambiguous and "number" can mean any type of number, e.g. integer, decimal, imaginary, complex, etc).

    The clearest option is fully explicit data types, e.g. `int64` (signed), `uint64` (unsigned), `int32`, etc.

    • By hedayet, 2026-02-15 8:14 (2 replies)

      [author here] That’s a good point. I can see why int might be clearer than num, especially given the long history of that naming. I’ll think about it.

      • By Nevermark, 2026-02-15 8:28 (1 reply)

        Definitely int for signed numbers. But I would call it "int64".

        Clarity means saying what you mean. The typename int64 could not be clearer that you are getting 64 bits.

        This is consistent with your (num32 -->) "int32".

        And it would remain consistent if you later add smaller or larger integers.

        This also fits your philosophy of letting the developer decide and getting out of their way. I.e. don't use naming to somehow shoehorn in the "standard" int size. Even if you would often be right. Make/let the developer make a conscious decision.

        Later, "int" could be a big integer, with no bit limit. Or the name will be available for someone else to create that.

        I do like your approach.

        (For unsigned, I would call them "nat32", "nat64", if you ever go there. I.e. unsigned int is actually an oxymoron. A sign is what defines an integer. Natural numbers are the unsigned ones. This would be a case of using the standard math term for its standard meaning, instead of the odd historical accident found in C. Math is more universal, has more lasting and careful terminology - befitting universal clarity. I am not a fan of new names for things for specialized contexts. It just adds confusion or distance between branches of knowledge for no reason. Just a thought.)

        • By hedayet, 2026-02-16 2:04

          Thank you and others for articulating this naming (and semantic) problem so well.

          I've just shipped int64, float64, and removed num32, float, etc.

      • By jiggawatts, 2026-02-15 9:49

        I would recommend outright copying Rust.

        Among other things, it's a systems programming language and hence its naming scheme is largely (if not entirely) compatible with modern C++ types.

        I.e.:

            +----------------+-------------------------+------------------------------+
            | Rust           | Modern C++              | Notes                        |
            +----------------+-------------------------+------------------------------+
            | i8             | std::int8_t             | exact 8-bit signed           |
            | u8             | std::uint8_t            | exact 8-bit unsigned         |
            | i16            | std::int16_t            | exact 16-bit signed          |
            | u16            | std::uint16_t           | exact 16-bit unsigned        |
            | i32            | std::int32_t            | exact 32-bit signed          |
            | u32            | std::uint32_t           | exact 32-bit unsigned        |
            | i64            | std::int64_t            | exact 64-bit signed          |
            | u64            | std::uint64_t           | exact 64-bit unsigned        |
            | i128           | (no standard type)      | GCC/Clang: __int128          |
            | u128           | (no standard type)      | GCC/Clang: unsigned __int128 |
            | isize          | std::intptr_t           | pointer-sized signed         |
            | usize          | std::uintptr_t          | pointer-sized unsigned       |
            | f32            | float                   | IEEE-754 single precision    |
            | f64            | double                  | IEEE-754 double precision    |
            | bool           | bool                    | same semantics               |
            | char           | char32_t                | Unicode scalar value         |
            +----------------+-------------------------+------------------------------+
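
As a rough illustration of the mapping in the table, the Rust-style names can be written as C++ type aliases and checked at compile time. This is only a sketch of the naming idea, not something from the ROX codebase. The 128-bit rows are left out because they rely on compiler extensions, and the float sizes hold on mainstream platforms but are not guaranteed by the C++ standard.

```cpp
#include <cstdint>
#include <type_traits>

// Rust-style spellings as aliases for the fixed-width C++ types in the table.
using i8 = std::int8_t;    using u8 = std::uint8_t;
using i16 = std::int16_t;  using u16 = std::uint16_t;
using i32 = std::int32_t;  using u32 = std::uint32_t;
using i64 = std::int64_t;  using u64 = std::uint64_t;
using isize = std::intptr_t;
using usize = std::uintptr_t;
using f32 = float;
using f64 = double;

// Compile-time checks: the width and signedness are part of the name.
static_assert(sizeof(i32) == 4 && std::is_signed_v<i32>);
static_assert(sizeof(u32) == 4 && std::is_unsigned_v<u32>);
static_assert(sizeof(i64) == 8 && std::is_signed_v<i64>);
static_assert(sizeof(u64) == 8 && std::is_unsigned_v<u64>);
static_assert(sizeof(isize) == sizeof(void*));
static_assert(sizeof(f32) == 4 && sizeof(f64) == 8);  // typical, not guaranteed
```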

  • By amluto, 2026-02-15 5:39 (5 replies)

    > 3. Values follow a strict rule: primitives pass by value, containers pass by read-only reference. This prevents accidental aliasing/mutation across scopes and keeps ownership implicit but predictable.

    There are plenty of languages where functions cannot mutate their parameters or anything their parameters reference — Haskell is one example. But these languages tend to have the ability to (reasonably) efficiently make copies of most of a data structure so that you can, for example, take a list as a parameter and return that list with one element changed. These are called persistent data structures.

    Are you planning to add this as a first-class feature? This might be complex to implement efficiently on top of C++’s object model — there’s usually a very specialized GC involved.

    • By hedayet, 2026-02-15 7:05 (1 reply)

      [author here] ROX avoids implicit structural sharing and persistent data structures. Allocation and mutation are explicit - if I want a modified container, I construct one.

      This is intentionally more resource-intensive. ROX trades some efficiency for simplicity and predictability.

      The goal is clarity of logic and clarity of behavior, even at slightly higher cost. And future optimizations should preserve that model rather than hide it.

      • By amluto, 2026-02-15 14:43

        I would check your “slightly”. If I have an algorithm that operates on an n-element data structure through a helper function (this includes almost any nontrivial program - think managing caches, small databases or tables, lists of clients, etc), you get an extra multiplicative factor of n for each operation. All those nice linear-time or n-log-n algorithms turn into n^2, and accidentally quadratic programs can be bad news.

        And if the language offers no facility at all to get the factor of n back, users may be forced to use something else.
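
A rough C++ sketch of the cost described in this reply, under the assumption that a helper can only receive a list by read-only reference and must return a fresh list to change a single element. The function names are hypothetical and this is not ROX-generated code. Updating each of n elements through such a helper copies n elements per call, so n * n in total.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read-only parameter: the only way to "change" one element is to build a
// new list. The counter records how many elements were copied to do so.
std::vector<std::int64_t> with_element(const std::vector<std::int64_t>& xs,
                                       std::size_t i, std::int64_t v,
                                       std::size_t& copied) {
    std::vector<std::int64_t> out = xs;  // copies all n elements
    copied += xs.size();
    out[i] = v;
    return out;
}

// Doubles every element through the helper and reports the copy count.
std::size_t elements_copied_doubling_all(std::size_t n) {
    std::vector<std::int64_t> xs(n, 1);
    std::size_t copied = 0;
    for (std::size_t i = 0; i < n; ++i) {
        xs = with_element(xs, i, xs[i] * 2, copied);
    }
    return copied;
}

// Self-check: a linear pass over 100 elements copies 100 * 100 elements.
void check_copy_cost() {
    assert(elements_copied_doubling_all(100) == 10000);
}
```

The same loop with in-place mutation would touch each element once, which is the factor of n the reply refers to.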

    • By cyber_kinetist, 2026-02-15 6:31

      I've seen some C++ libraries that implement persistent data structures, like immer (https://github.com/arximboldi/immer), but it seems to require the Boehm GC (which is notoriously slow, since it is a conservative GC and cannot exploit any of the specific semantics/runtime characteristics of the language you're making).

    • By pjmlp, 2026-02-15 9:30

      Brainstorming a bit, you could get into that via hazardous or deferred pointers, but yeah I guess it falls down into specialized GC kind of solution.

    • By paulddraper, 2026-02-15 6:19

      I don’t see the relevance of special GC.

      But yes you need immutable data structures designed for amortized efficient copy.

    • By eager_learner, 2026-02-15 5:44

      Comments like amluto's above are the reason my time spent on HN is not wasted.

  • By nynx, 2026-02-15 6:57 (3 replies)

    This is an interesting line in the readme:

    > The language forces clarity — not ceremony.

    I find this statement curious because a language like this, without the ability to build abstractions, forces exactly the opposite.

    • By raverbashing, 2026-02-15 9:16

      Yup exactly this

      It's the "C is a simple language" BS again

      Using a circular sawblade without the saw is as simple as it gets as well

      The simpler it is, the more you get annoyed at it and the easier it is to shoot yourself in the foot with it, because the world is not perfect

      Abstractions are great and I'm dying on this hill

      "getError" what year is it again?

    • By saghm, 2026-02-15 7:57

      Yeah, this seems to be a common thing nowadays, although often with the value cited as "simplicity". I've always found it a bit odd because it seems to me like there are tradeoffs where making things at one level of granularity more clear or simple (or whatever you want to call it) will come at the cost of making things less clear and simple if you zoom in or out a bit at what the code is doing. Assembly is more "clear" in terms of what the processor is doing, but it makes the overall control flow and logic of a program less clear than a higher level language. Explicitly defining when memory is allocated and freed makes the performance characteristics of a program more clear, but it's "ceremony" compared to a garbage collected language that doesn't require manually handling that by default.

      I think my fundamental issue with this sort of prioritization is that I think that there's a lot of value in being able to jump between different mental models of a program, and whether something is clear or absolutely ridden with "ceremony" can be drastically different depending on those models. By optimizing for exactly one model, you're making programs written in that language harder to think about in pretty much every other model while quickly hitting diminishing returns on how useful it is to try to make that one level of granularity even more clear. This is especially problematic when trying to debug or optimize programs after the initial work to write them is complete; having it be super clear what each individual line of code is doing in isolation might not be enough to help me ensure that my overall architecture isn't flawed, and similarly having a bunch of great high-level abstractions won't necessarily help me notice bugs that can live entirely in one line of code.

      I don't think these are specific use cases that a language can just consider to be outside of the scope in the same way they might choose not to support systems programming or DSLs or whatever; programmers need to be able to translate the ideas of how the program works into code and then diff between them to identify issues at both a macro and micro level regardless of what types of programs they're working on.

    • By hedayet, 2026-02-15 7:35

      [author here] That’s a very good point - "not ceremony" was poorly phrased.

      ROX does introduce more explicitness, which indeed introduces more ceremony. The goal isn’t to reduce keystrokes; it’s to reduce hidden behaviour.

      A better framing would be: ROX prioritizes clarity over convenience. Explicitness may cost more keystrokes, but it eliminates hidden behavior. [README updated]
