
The Om language is: a novel, maximally-simple concatenative, homoiconic programming and algorithm notation language with: a trivial-to-parse data transfer format. unicode-correct: any UTF-8 text…
The Om language is:
The Om language is not:
This program and the accompanying materials are made available under the terms of the Eclipse Public License, Version 1.0, which accompanies this distribution.
For more information about this license, please see the Eclipse Public License FAQ.
The Om source code can be used for:
The Om source code is downloadable from the Om GitHub repository:
To run scripts which build the dependency Libraries and generate the build project, the following programs are required:
sudo apt-get install build-essential)
To build the Documentation in the build project, the following additional programs are required:
To ensure that correct programs are used, programs should be listed in the command line path in the following order:
The following libraries are required to build the Om code:
A build project, containing targets for building the interpreter, tests, and documentation, can be generated into "[builds directory path]/Om/projects/[project]" by running the appropriate "generate" script from the desired builds directory:
Arguments include the desired project name (required), followed by any desired CMake arguments.
By default, this script automatically installs all external dependency libraries (downloading and building as necessary) into "[builds directory path]/[dependency name]/downloads/[MD5]/build/[platform]/install". This behaviour can be overridden by passing paths of pre-installed dependency libraries to the script:
-D Icu4cInstallDirectory:Path="[absolute ICU4C install directory path]"
-D BoostInstallDirectory:Path="[absolute Boost install directory path]"
The Om.Interpreter target builds the interpreter executable as "[Om build directory path]/executables/[platform]/[configuration]/Om.Interpreter". The interpreter:
The Om.Test target builds the test executable, which runs all unit tests, as "[Om build directory path]/executables/[platform]/[configuration]/Om.Test". These tests are also run when building the RUN_TESTS target (which is included when building the ALL_BUILD target).
The Om.Documentation target builds this documentation into the following folders in "[Om build directory path]/documentation":
Om is a header-only C++ library that can be incorporated into any C++ or Objective-C++ project as follows:
Call the Om::Language::System::Initialize function prior to use (e.g. in the main function), passing in the desired UTF-8 locale string (e.g. "en_US.UTF-8").
Construct an Om::Language::Environment, populate with any additional operator-program mappings, and call one of its Om::Language::Environment::Evaluate functions to evaluate a program.
For more in-depth usage of the library, see the Om code documentation.
An Om program is a combination of three elements—operator, separator, and operand—as follows:
An operator has the following syntax:
Backquotes (`) in operators are disregarded if the code point following is not a backquote, operand brace, or separator code point.
A separator has the following syntax:
An operand has the following syntax:
The Om language is concatenative, meaning that each Om program evaluates to a function (that takes a program as input, and returns a program as output) and the concatenation of two programs (with an intervening separator, as necessary) evaluates to the composition of the corresponding functions.
Unlike other concatenative languages, the Om language uses prefix notation. A function takes the remainder of the program as input and returns a program as output (which gets passed as input to the leftward function).
Prefix notation has the following advantages over postfix notation:
Only the terms (operators and operands) of a program are significant to functions: separators are discarded from input, and are inserted between output terms in a "normalized" form (for consistent formatting and proper operator separation).
There are three fundamental types of functions:
Programs are evaluated as functions in the following way:
For example, program "A B" is the concatenation of programs "A", " ", and "B". The separator evaluates to the identity operation and can be disregarded. The programs "A" and "B" evaluate to functions which will be denoted as A and B, respectively. The input and output are handled by the composed function as follows:
B receives the input, and its output becomes the input for function A.
A receives the input, and its output becomes that of the composed function.
Any programs may be concatenated together; however, note that concatenating programs "A" and "B" without an intervening separator would result in a program containing a single operator "AB", which is unrelated to operators "A" or "B".
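The composition rule can be sketched in Python. This is only an illustrative model, not how Om is implemented: a program is modelled as a Python function over a list of terms, the helper names (copy_op, drop_op, compose) are hypothetical, and every operation is assumed to receive enough operands.

```python
# Illustrative model only: an Om program is treated as a function that takes
# a list of terms (the rest of the program) and returns a list of terms.

def copy_op(terms):
    # Models the `copy` operation: duplicate the first operand.
    return [terms[0]] + terms

def drop_op(terms):
    # Models the `drop` operation: discard the first operand.
    return terms[1:]

def compose(a, b):
    # Models the program "A B": B receives the input first, then A
    # receives B's output.
    return lambda terms: a(b(terms))

program = compose(drop_op, copy_op)  # models the program "drop copy"
result = program(["{A}", "{B}"])
print(result)
```

Swapping the arguments to compose models the program "copy drop" instead, which gives a different result because the rightmost function always runs first.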
All operation implementations provided are documented in the Operation module.
There are no traditional data types in the Om language: every data value is represented by an operand.
The Om language uses a unique panmorphic type system, from Ancient Greek πᾶν (pan, "all") and μορφή (morphē, "form"), in which all data values are exposed exclusively through a common immutable interface.
In the case of the Om language, every data value is entirely represented in the language as an operand. Any operation will accept any operand as a valid input and interrogate its data solely through its contained program (a sequence of operator, separator, and/or operand). The operation is then free to process the data however is appropriate, and any operand that it produces as output can then be interrogated and processed by the next operation in the same way.
Although any operand can be treated as containing a literal array of operand, operator and/or separator elements, the implementation of operands takes advantage of some optimizations:
Operations in a program can be ordered by the programmer to increase performance by minimizing conversions between program implementations, but it is not necessary for obtaining a correct computation. Where relevant, an operation will document the program implementation types of its inputs and outputs to allow for this optional level of optimization.
All program implementations provided are documented in the Program module.
The following program contains a single operand containing an operator "Hello,", a separator " ", and another operator "world!":
{Hello, world!} |
{Hello, world!} |
The following program contains a single operand containing an operator "Hello,", a separator " ", and an operand "{universe!}" which in turn contains a single operator "universe!":
{Hello, {universe!}} |
{Hello, {universe!}} |
Note that separators are significant inside operands:
{Hello, world!} |
{Hello, world!} |
Operands can be dropped and copied via the drop and copy operations:
copy {A}{B}{C} |
{A}{A}{B}{C} |
The drop operation can therefore be used for comments:
drop {This is a comment.} {This is not a comment.} |
{This is not a comment.} |
The choose operation selects one of two operands, depending on whether a third is empty:
choose {It was empty.}{It was non-empty.}{I am not empty.} |
{It was non-empty.} |
choose {It was empty.}{It was non-empty.}{} |
{It was empty.} |
An operation without sufficient operands evaluates to itself and whatever operands are provided:
choose {It was empty.}{It was non-empty.} |
choose{It was empty.}{It was non-empty.} |
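The behaviour of drop, copy, and choose, including the case of an operation with too few operands, can be modelled with a small Python sketch. This is a simplified stand-in rather than the Om evaluator: terms are plain strings, anything starting with "{" is treated as an operand, and the names OPERATIONS, is_operand, and evaluate are hypothetical.

```python
# Simplified model: terms are strings, and a term starting with "{" is an operand.
def is_operand(term):
    return term.startswith("{")

# Each entry maps an operator name to (operand count, function returning output terms).
OPERATIONS = {
    "drop": (1, lambda a: []),
    "copy": (1, lambda a: [a, a]),
    "choose": (3, lambda if_empty, if_non_empty, tested:
               [if_empty] if tested == "{}" else [if_non_empty]),
}

def evaluate(terms):
    output = []
    # Walk right to left so each operator sees the already-evaluated remainder.
    for term in reversed(terms):
        if term in OPERATIONS:
            count, function = OPERATIONS[term]
            arguments = output[:count]
            if len(arguments) == count and all(map(is_operand, arguments)):
                output = function(*arguments) + output[count:]
                continue
        # Operands, unknown operators, and under-supplied operations pass through.
        output.insert(0, term)
    return output

print(evaluate(["choose", "{It was empty.}", "{It was non-empty.}", "{}"]))
print(evaluate(["choose", "{It was empty.}", "{It was non-empty.}"]))
```

In this model an under-supplied choose is simply left in the output in front of its operands, which mirrors the documented result of the last example.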
The quote and dequote operations add and remove a layer of operand braces, respectively:
dequote {copy} {A} |
{A}{A} |
Operands can be popped from and pushed into:
<-[characters] {ABC} |
{A}{BC} |
->[literal] {A}{BC} |
{ABC} |
<-[terms] {some terms} |
{some}{terms} |
A new operator definition can be provided with the define operation, where the first operand is treated as containing a Lexicon with operator-to-operand mappings, and the second operand contains the program to evaluate using the defined operator:
define { double-quote {quote quote} } { double-quote {A} } |
{{{A}}} |
Any string can be used as an operator, with separators and operand braces escaped with a backquote:
define { double` quote {quote quote} } { double` quote {A} } |
{{{A}}} |
<-[terms] { double` quote operator } |
{double` quote}{operator} |
Unicode is fully supported:
<-[characters] {한글} |
{한}{글} |
<-[code` points] {한글} |
{ᄒ}{ᅡᆫ글} |
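The code point result above is what canonical decomposition (NFD) of the string would produce. As a cross-check outside Om, the following Python sketch uses the standard unicodedata module to decompose the same text; it illustrates Unicode behaviour only and makes no claim about how Om implements it.

```python
import unicodedata

text = "\uD55C\uAE00"  # the two precomposed syllables in "한글"
nfd = unicodedata.normalize("NFD", text)

# Each precomposed Hangul syllable decomposes into three conjoining jamo.
code_points = [f"U+{ord(ch):04X}" for ch in nfd]
print(code_points)

# Popping one code point from the decomposed string yields the leading jamo.
first, rest = nfd[0], nfd[1:]
print(first == "\u1112")
```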
Strings are automatically normalized to NFD, but can be explicitly normalized to NFKD using the normalize operation:
Recursion is very efficient in the Om language, due to (a) the "eager" evaluation model enabled by prefix concatenative syntax (i.e. data is consumed immediately rather than being left on a stack), and (b) the non-recursive evaluation implementation in the evaluator that minimizes memory overhead of recursive calls and prevents stack overflow. The following example uses recursion to give the minutes in a colon-delimited 24-hour time string:
define { minutes { dequote choose {minutes} {} = {:} <-[characters] } } { minutes {1:23} } |
{23} |
An important feature of Om is that each step of an evaluation can be represented as a program. The following is the above program broken down into evaluation steps, where the code that is about to be replaced is bold, and the latest replacement is italicized:
define { minutes { dequote choose {minutes} {} = {:} <-[characters] } } |
define { minutes { dequote choose {minutes} {} = {:} <-[characters] } } { {23} } |
define { minutes { dequote choose {minutes} {} = {:} <-[characters] } } |
{23} |
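For readers more used to applicative languages, the recursion in the minutes definition can be paraphrased in Python. This is a rough analogue, not a translation of the Om evaluation model, and the handling of input without a colon is an assumption, since the example does not show that case.

```python
def minutes(time_string):
    # Pop the first character, as `<-[characters]` does in the Om version.
    head, rest = time_string[:1], time_string[1:]
    if head == ":":
        # Found the delimiter: what remains is the answer.
        return rest
    if not rest:
        # Assumption: with no colon in the input, return an empty string.
        # The Om example does not show this case.
        return ""
    # Otherwise recurse on the remainder, like the dequoted {minutes}.
    return minutes(rest)

print(minutes("1:23"))  # prints 23
```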
The rearrange operation provides operand name binding, allowing for a more applicative style. The following example is a simplistic implementation of a left fold, along with an example call:
define
{
    [Fold]<- {
        rearrange
        {
            rearrange
            {
                dequote
                choose
                quote Result
                pair pair pair {[Fold]<-} Function Result Remainder
                Remainder
            }
            {Result Remainder}
            dequote Function Base <-[terms] Source
        }
        {Function Base Source}
    }
}
{
    [Fold]<- {[literal]<-} {} {1 2 3}
}
The result is {321}.
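As a cross-check on that result, the same left fold can be paraphrased in Python. The sketch is hypothetical: combine is a stand-in chosen so that the output matches the documented {321} and is not a definition of the [literal]<- operation, and the fold assumes a non-empty source, as in the example call.

```python
def fold(function, base, source):
    # Pop the first term, as `<-[terms]` does in the Om version.
    first, remainder = source[0], source[1:]
    result = function(base, first)
    if not remainder:
        # Nothing left to consume: the accumulated result is the output.
        return result
    # Otherwise recurse with the new result as the base.
    return fold(function, result, remainder)

def combine(result, term):
    # Stand-in for the function passed in the example (an assumption):
    # each new term ends up in front of the accumulated result.
    return term + result

print(fold(combine, "", ["1", "2", "3"]))  # prints 321
```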
The example works as follows:
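[Fold]<- takes three operands: Function, Base, and Source.
The first term is popped from Source.
Function is applied to:
Base
the first term of Source
the remainder of Source
The first two operands output by Function are Result and Remainder.
If Remainder is empty, the Result is output. Otherwise, Function, Result, and Remainder are passed to a recursive [Fold]<- call.
A few things should be noted about the above example: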
There are several ways to contribute to the Om project:
Om is written in modern, portable C++ that adheres to the Sparist C++ Coding Standard.
Note: Because this is an early-stage project, there are not yet any compatibility guarantees between versions.
Om code can be forked from the Om GitHub repository.
See the Using section for instructions on building the code.
When adding or removing files from source, re-run the "generate" script from the build directory to update the project.
Additional native functionality can be added to the Om language by implementing new operations in C++.
There are two ways to implement an operation: as a composite operation, or an atomic operation.
To implement a composite operation, or an atomic operation that consumes no operands:
Define a class in the Om::Language::Operation namespace.
Define a GetName() method, which returns a static char const * containing the name.
Define a Give(Om::Language::Evaluation &) method, with no return value, to give existing operations and/or elements to the evaluation.
To define an atomic operation that consumes one or more operands:
For any operation implementation, code must be added to the operation header that inserts the operation into the system when the header is included, as follows (where WhateverOperation is a stand-in for the name of the operation class):
New data types can be added to the Om language by extending Om::Language::Program and defining the functions necessary to instantiate the class. Use existing programs as a guide.
Program types should be defined in the Om::Language namespace.
Some basic free static analysis tools can be applied to the Om code:
python [path]/hfcca.py -p -v code
[path]/cloc.pl code
The Om.Test target of the Xcode project generates test coverage data that can be viewed as follows:
"*.ipp" and "*.hpp" to the "SDK Files" list.
Changes can be submitted to Om via pull request.
Issues are reported and tracked with the Om GitHub issue tracker.
Before reporting an issue, please search existing issues first to ensure that it is not a duplicate.
The Om language is currently a spare-time project of one person. If you would like to speed the development of the Om language in either a general or domain-specific direction, please contact me at information@sparist.com.
The following additional reading may help explain some of the concepts that contributed to the Om language:
Thanks to all of the people who contributed to:
Would recommend placing example language syntax above the fold. Was tough to have to scroll halfway down the entire site to see any syntax. Nobody cares about the EBNF syntax until they have a feel for the language.
Almost every site for a new language that gets posted here does this. Every time someone points out how they don't care about anything until they've seen what code actually looks like. I'm surprised this still happens.
Ah, thought it was new.
Alt opinion: syntax is the least important part of a programming language. I can't wait for the day someone invents one where it's defined entirely as an AST (with the S standing for Semantic). Just bring your own weird syntax.
I guess Unison is the closest to this platonic ideal right now? https://github.com/unisonweb/unison/issues/499
That's cool, but I might prefer semantic whitespace. Sure would be neat if we could both work with the same code in our preferred forms.
Ah yes, now there is a LISP I can get behind!
Honestly, we can do better than LISPs.
Just use curly brackets and boom. LISP 3k.
You're welcome.
It can even be used to represent serialized objects...
Love this take! Unison is exactly this, and it's awesome!
Here's a quote from one of the creators:
> But here's the super cool thing about our language! Since we don't store your code in a text/source code representation, and instead as a typechecked AST, we have the freedom to change the surface syntax of the language very easily, which is something we've done several times in the past. We have this unique possibility that other languages don't have, in that we could have more than one "surface syntax" for the language. We could have our current syntax, but also a javascript-like syntax, or a python-like syntax.
Can Raku do something like this? I was lightly exploring it recently, and I thought I saw that something like this may be possible with it.
I'm not super familiar with Raku, but if RakuAST is what you had in mind it looks a bit different:
use experimental :rakuast;
my $ast = RakuAST::Call::Name.new(
    name => RakuAST::Name.from-identifier("say"),
    args => RakuAST::ArgList.new(
        RakuAST::StrLiteral.new("Hello world")
    )
);
Looks more like "low-level programming an AST" (which I believe other languages offer as well), rather than using a bidirectional transform. I don't know how you'd get Raku code back out, for example.
Edit: I should have looked deeper, `DEPARSE` does exactly this:
https://docs.raku.org/type/RakuAST
Neat!
It also goes from source code to AST:
$ raku -e 'say Q|say "Hello World!"|.AST'
RakuAST::StatementList.new(
    RakuAST::Statement::Expression.new(
        expression => RakuAST::Call::Name::WithoutParentheses.new(
            name => RakuAST::Name.from-identifier("say"),
            args => RakuAST::ArgList.new(
                RakuAST::QuotedString.new(
                    segments => (
                        RakuAST::StrLiteral.new("Hello World!"),
                    )
                )
            )
        )
    )
)
thanks, all.
I completely agree: If it is ugly-as-sin-but-useful I will learn it.
The aesthetic of mathematics as it appears in journals is I think questionable, but undeniably convenient for communication, so it is every language making the case that you (dear reader) can say something very complicated and useful in the ideal amount of space.
"Hello world" isn't that: That's the one program everyone should be able to write correctly, 100% of the time. That's how we can talk about brainfuck as exercise, but APL is serious.
Or put another way, even if seeing a new kind of "hello world" excites dear reader, it's probably not going to excite me, unless it's objectively disgusting.
What Om does here is exactly right for me: It tells me what it is, and makes it easy for me to drill down to each of those things to figure out what the author means by that, and decide if I am convinced.
I mean, that's the point right? I'm here trying to learn something new and that requires I allow myself to be convinced, and since "hello world" is table-stakes, seeing it can only slow my ability to be convinced.
This is a Very Bad Idea. Two people working with the same language will be unable to reason about each other's code, because it requires understanding their bespoke syntax and its nuances.
No it won't? That's exactly the point -- each of those people will be viewing the code in their own preferred syntax. If there is semantic nuance in the writer's syntax, the reader will see it presented in the best way their preferred syntax's representation can provide.
Imagine all the hours saved that are currently spent on tired tabs vs spaces debates, or manicuring .prettierrc, etc etc. The color of the bike shed might matter (sometimes a lot) to some people, I know, but it's storing bikes away from the elements and thieves that is the goal, not obsessing over optimizing something that is demonstrably a subjective matter of taste.
Those are both formatting examples though? You're suggesting totally different syntaxes, which means you can't even point to the same line in a codebase when talking about a PR. This throws up massive hurdles around communication when you could just agree on one standard and move on.
class Bean {
    private boolean sprouted;
    public void sprout() {
        this.sprouted = true;
        // ...
    }
}
or
data Bean = Dormant | Sprouted
sprout :: Bean -> Bean
sprout Dormant = Sprouted
sprout Sprouted = -- aw, beans, we could have modeled
                  -- this state as impossible to construct,
                  -- but you chose runtime checks, so
                  -- here we are.
As for pointing to the source line, I think JavaScript people solved that one for us with source maps. Just because we download and execute a single 4Mb line of minified code, doesn't mean we can't tell which line of the original source caused the error. :)
Oh lord, yeah this convinces me even more that this is a bad idea. I can't even tell at a glance if those do the same thing. Just pick one and move on, you're requiring everyone to pass around sourcemaps literally everywhere they go, one for every single pair of syntaxes. You can't even talk about the code with the same language with each other. Is Bean a "class" or a "datatype"? If I'm using one syntax, how do I tell you to fix a bug in your syntax?
> If I'm using one syntax, how do I tell you to fix a bug in your syntax?
How about "Hey, your Bean ain't sprouting"? :)
I'm sorry, I feel like I'm not communicating this properly. Um, have you ever discussed with someone a book or a TV show that was translated into your language? Did you have problems referring to the exact parts you liked or disliked? :)
Aren't LLMs supposed to write machine code directly, no more programming languages at all, any day now? Joking aside, programming languages are a good mental exercise. Forth was my first language after assembly. Didn't like the stack juggling and ended up using its macro assembler more and more, it became something else, conventions over code I suppose, like what to keep in registers. Forth (and Unix) got the composability requirement right, the testing of individual units.
So instead of using programming languages designed specifically to effectively express algorithms and data structures, we are going to use natural language like English that is clearly not expressive enough for this? It’s like rewriting a paper about sheaf cohomology in plain English without any mathematical notation and expecting it to be accessible to everyone.
:) Not exactly. We'll use English to get a kinda description, then test and debug to make that functional, then cycling the functionality with users to nail down what is actually needed. Which won't be written down anywhere. Like before. Except with autocomplete that tries to predict a page or two of code at a time. Often pretty accurately.
I do not think you are saying the same thing here :). No one doubts we can put "make a todo app" into English, and that you can, yeah, test with users. But that's different from a task which would articulate, in only English, the precise layout and architecture of the MVC that make the app possible.
English is fine, but I am personally a lot faster in my mind and fingers and IDE with a language suited for this stuff. AI guys just want to be teachers deep down I think :).
I'm still waiting to see the first "Show HN: I made a language designed for LLMs to write programs better".
A few days ago I asked Claude what kind of language it would like to program in, and it said something like Forth but with static typing, contracts, and constraint solving, implemented on the Erlang BEAM.
So I have been prodding Claude Code for a few sessions to actually do it. It's a silly experiment, but fun to watch. Right now it's implementing a JSON parser in the generated language as a kind of milestone example.
If you don't mind, could you drop some code? I'd be interested to see the result :)
sure, for example it generated this small demo of the type and contract safeguards. As you can see, it's mostly "Forth but with things":
# bank.ax — The Safe Bank (Milestone 2)
#
# Demonstrates property-based verification of financial operations.
# Each function has PRE/POST contracts. VERIFY auto-generates random
# inputs, filters by PRE, runs the function, and checks POST holds.
# DEPOSIT: add amount to balance
# Stack: [amount, balance] (amount on top)
# PRE: amount > 0 AND balance >= 0
# POST: result >= 0
DEF deposit : int int -> int
PRE { OVER 0 GTE SWAP 0 GT AND }
ADD
POST DUP 0 GTE
END
# WITHDRAW: subtract amount from balance
# Stack: [amount, balance] (amount on top)
# PRE: amount > 0 AND balance >= amount
# POST: result >= 0
DEF withdraw : int int -> int
PRE { OVER OVER GTE SWAP 0 GT AND }
SUB
POST DUP 0 GTE
END
# Verify both functions — 500 random tests each
VERIFY deposit 500
VERIFY withdraw 500
# Prove both functions — mathematically, for ALL inputs
PROVE deposit
PROVE withdraw
# Demo: manual operations
1000 200 deposit SAY
1000 300 withdraw SAY
Running it outputs:
VERIFY deposit: OK — 500 tests passed (1056 skipped by PRE)
VERIFY withdraw: OK — 500 tests passed (1606 skipped by PRE)
PROVE deposit: PROVEN — POST holds for all inputs satisfying PRE
PROVE withdraw: PROVEN — POST holds for all inputs satisfying PRE
1200
700
How does it do the proving?
Spawns and calls z3 under the hood, I did let it cheat there because otherwise it's rabbit holes below rabbit holes all the way down :)
thanks.
> There's something philosophically fun about that, I feel, like asking a painter to design their ideal brush.
With the large difference that painters actually paint and use the brush, they don't just algorithmically regurgitate paintings they've been shown.
I plan to keep dumping tokens into it here and there to see where it goes, it's a fun rabbit hole.
I find that it's able to produce working code in the generated language just from the README and the previous examples without much trouble. And I've seen the strict typing help it catch and correct bugs quickly. I steer it very little. The only thing I keep an eye on is not allowing it to cheat about things like writing lots of concrete functionality in the host language and then just calling that from the interpreter.
If you want to go really meta ask Claude Code to re-implement itself in the new language. You could keep going doing this.
It kind of feels like it wants to go there or thereabouts, but to be honest, I just half understand half of the roadmap it has written for itself:
- **v0.0.1** (complete): Interpreter, PRE/POST contracts, REPL, TIMES/WHILE, FILTER/MAP/REDUCE, strings, I/O
- **v0.1.0** (complete): Static type checker, VERIFY (property-based contract testing), maps, Safe Bank milestone
- **v0.2.0** (complete): PROVE — compile-time contract verification via Z3 SMT solver
- **v0.3.0** (complete): Algebraic data types (TYPE/MATCH) — Option, Result, and user-defined sum types with exhaustiveness checking
- **v0.4.0** (complete): JSON parser/encoder milestone — wildcard MATCH, string primitives, ROT4, PAIRS, NUM_STR, VERIFY for sum types
- **v0.4.1** (current): PROVE for IF/ELSE branches (via SMT-LIB `ite`), ABS/MIN/MAX, function call inlining
- **v0.5.0** (next): Practical language features — LET bindings, IMPORT/modules, error handling, standard library
- **v0.6.0**: PROVE for MATCH/algebraic types, refinement-style reasoning
- **v0.7.0**: Typed BEAM concurrency (typed message passing, stateful actors)
- **v0.8.0**: BEAM bytecode compilation
- **Future**: Declarative constraint solving, tensor/distribution primitives, multi-agent collaboration
I'd say the sky is the limit, but in fact the limit is the stingy token budget of the 20€ Claude sub...
It came up a few weeks ago already, can't find the link
There have been a few.
Conrad Barski, Feb 25, 2026:
> Working atari 2600 flappy bird, by just asking chatgpt to directly output the raw bytes for a cartridge image
TBH this is a bit unexpected: it should know how to encode instructions, of course, but calculating all jumps on the fly is rather hard (I think).
Let's have more programming language posts (even about "retro" ones like Icon, SNOBOL, Bliss, MUMPS, etc.), guys.
And less about AI topics.
Om.
And therein lies the tragedy of folks exploiting well known culturally loaded symbols/concepts for attention/etc. on the Internet.
For example, I really don't know what to make of this similarly named language; enthusiastic kid or attention-seeking influencer? - https://omlang.com/
people exploit anything for attention/etc. on the Internet.
If everyone on HN started downvoting all the AI posts (at least the slop ones), it would cut down submissions by half.
I think it's going to take some time for the reality of ns;nt and the disappointments to sink in. They need to learn that one cannot "claw" their way to a money machine.