Free-standing function call syntax considered kind of suboptimal.
Epistemic status: Don’t take it too seriously. Or do. idk, I can’t stop you.
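Consider code written in a pipelined style:
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
    data.iter()
        .filter(|w| w.alive)
        .map(|w| w.id)
        .collect()
}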
As opposed to code like this. (This is not real Rust code. Quick challenge for the curious Rustacean: can you explain why we cannot rewrite the above code like this, even if we import all of the symbols?)
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
collect(map(filter(iter(data), |w| w.alive), |w| w.id))
}
I honestly feel like this should be so obvious that it shouldn’t even be up for debate. The first code example—with its nice ‘pipelining’ or ‘method chaining’ or whatever you want to call it—just works. It can be read line-by-line. It’s easy to annotate with comments. It doesn’t require the introduction of new variables to become more readable, since it’s already readable as is.
As opposed to, y’know, the first word in the line describing the final action our function performs.
Let me make it very clear: First, this is a hot take about syntax. In practice, semantics beat syntax every day of the week. In other words, don’t take it too seriously.
Second, this is not about imperative vs. functional programming. This article takes for granted that you’re already on board with concepts such as ‘map’ and ‘filter’. It’s possible to overuse that style, but I won’t talk about it here.
Here is a feature that’s so bog-standard in modern programming languages that it barely feels like a feature at all: member access for structs or classes with our beloved friend, the .-operator.
This is a form of pipelining: it puts the data first, the operator in the middle, and concludes with the action (restricting to a member field).
type Bar struct {
field int
}
func get_field(bar Bar) int {
return bar.field
}
// vs. syntax like that of Python's `getattr` function
func get_field(bar Bar) int {
return getattr(bar, "field")
}
You see what I am getting at, right? It’s the same principle. One of the reasons why x.y-style member access syntax (and x.y()-style method call syntax!) is popular is that it’s easy to read and chains easily.
Let’s make the comparison slightly more fair, and pretend that we have to write x.get(y). Compare:
fizz.get(bar).get(buzz).get(foo)
// vs.
get(get(get(fizz, bar), buzz), foo)
Which one of these is easier to read? The pipelined syntax, obviously. This example is easy to parse either way, but imagine you’d like to tune out some of the information and purely focus on the final operation.
<previous stuff>.get(foo)
// vs.
get(<previous stuff>, foo)
You see the problem, right? In the first example, we have ‘all of the previous stuff’ and then apply another operation to it. In the second example, the operation which we want to perform (get) and the new operand (foo) are spread out, with ‘all of the previous stuff’ sitting between them.
Looking back at our original example, the problem should be obvious:
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
collect(map(filter(iter(data), |w| w.alive), |w| w.id))
}
-----------------------------1 // it's fun to parse the whole line to find the start
------------------------2
-----------------3
---------------------------------------4 // all the way back to find the second arg
-------------5
------------------------------------------------------6 // and all the way back again
-----7 // okay the final step is the first word in the line that makes sense
I cannot deny the allegations: I just don’t think it makes sense to write code like that as long as a clearly better option exists.
Why would I have to parse the whole line just to figure out where my input comes in, and why is the data flow ‘from the inside to the outside’? It’s kind of silly, if you ask me.
Readability is nice, and I could add a whole section complaining about the mess that’s Python’s ‘functional’ features.
However, let’s take a step back and talk about ease of editing. Going back to the example above, imagine you’d like to add another map (or any other function call) in the middle there. How easy is this?
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
collect(map(filter(map(iter(data), |w| w.toWingding()), |w| w.alive), |w| w.id))
}
Consider: the git diff of this is going to be basically unreadable, since everything is crammed onto one line. Now compare the pipelined version:
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
data.iter()
.map(|w| w.toWingding())
.filter(|w| w.alive)
.map(|w| w.id)
.collect()
}
This is adding a single line of code. No parentheses counting. It’s easy and obvious. It’s easy to write and easy to review. Perhaps most importantly, it shows up incredibly nicely in the blame layer of whatever editor or code exploration tool you’re using.
You might think that this issue is just about trying to cram everything onto a single line, but frankly, trying to move away from that doesn’t help much. It will still mess up your git diffs and the blame layer.
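As a rough sketch of what I mean (this multi-line layout is my own hypothetical formatting, keeping the same not-real-Rust free-function style as above): even spread across lines, inserting the extra map() forces you to re-indent the existing iter(data) line and work out where the new closing parenthesis goes, so the diff still touches more than the one call you actually added.
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
    collect(
        map(
            filter(
                map(                     // newly inserted call
                    iter(data),          // existing line, re-indented by the edit
                    |w| w.toWingding()), // newly inserted line, carries the new closing paren
                |w| w.alive),
            |w| w.id))
}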
You can, of course, just assign the result of every filter and map call to a helper variable, and I will (begrudgingly) acknowledge that that works, and is significantly better than trying to do absurd levels of nesting.
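For completeness, here’s a rough sketch of that helper-variable style in Rust (the intermediate names are mine, assuming the same Widget and Id types as the earlier examples):
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
    // Every intermediate step gets its own name instead of being chained.
    let widgets = data.iter();
    let alive_widgets = widgets.filter(|w| w.alive);
    let ids = alive_widgets.map(|w| w.id);
    ids.collect()
}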
When you press . in your IDE, it will show a neat little pop-up that tells you which methods you can call or which fields you can access.
This is probably the single IDE feature with the biggest value add, and if not that, then at least the single most frequently used one. Some people will tell you that static analysis for namespace or module-level code discovery is useless in the age of AI autocompletion and vibe coding, but I very much disagree.1
“grug very like type systems make programming easier. for grug, type systems most value when grug hit dot on keyboard and list of things grug can do pop up magic. this 90% of value of type system or more to grug” — grug
Words to live by. What he’s describing here is something that essentially requires pipelining to work at all. (And types or type annotation, but having those is the direction the industry is moving in anyway.)
It doesn’t matter if it’s the trusty . operator, C++’s ->, or something more bespoke such as Elm’s or Gleam’s |> or Haskell’s &. In the end, it’s a pipeline operator—the same principle applies. If your LSP knows the type of what’s on the left, it should in principle be able to offer suggestions for what to do next.
If your favorite language’s LSP/IDE does a poor job at offering suggestions during pipelining, then it’s probably one of the following reasons:
In either case, great editor/LSP support is more or less considered mandatory for modern programming languages. And of course, this is where pipelining shines.
Ask any IDE: autocompleting fizz.bu... -> fizz.buzz() is much easier than autocompleting bu... -> buzz(...), for the obvious reason that you didn’t even write fizz in the second example yet, so your editor has less information to work with.
Pipelining is amazing at data processing, and allows you to transform code that’s commonly written with ‘inside-out’ control flow into ‘line-by-line’ transformations.
Where could this possibly be more clear than in SQL, presumably the single most significant language for querying and aggregating complex large-scale datasets?
You’ll be pleased to hear that, yes, people are in fact working on bringing pipelining to SQL. (Whether it’s actually going to happen in this specific form is a different question, let’s not get too carried away here.)
Unless you’re one of those people who spends so much time dealing with SQL that it’s become second nature, and the thought that the control flow of nested queries is hard to follow for the average non-database engineer is incomprehensible to you, I guess.
Personally, I’m a fan.
Anyway, if you’re interested, listen to this ten minute talk presented at HYTRADBOI 2025.
I’ll put their example of how a standard nested query can be simplified here, for convenience:
SELECT c_count, COUNT(*) AS custdist
FROM
(
SELECT c_custkey, COUNT(o_orderkey) c_count
FROM customer
LEFT OUTER JOIN orders
ON c_custkey = o_custkey
AND o_comment NOT LIKE '%unusual%'
GROUP BY c_custkey
) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC;
Versus the SQL Syntax she told you not to worry about:
FROM customer
|> LEFT OUTER JOIN orders
ON c_custkey = o_custkey
AND o_comment NOT LIKE '%unusual%'
|> AGGREGATE COUNT(o_orderkey) AS c_count
GROUP BY c_custkey
|> AGGREGATE COUNT(*) AS custdist
GROUP BY c_count
|> ORDER BY custdist DESC;
Less nesting. More aligned with other languages and LINQ. Can easily be read line-by-line.
Here’s a more skeptical voice (warning, LinkedIn!). Franck Pachot raises the great point that the SELECT statement at the top of a query is (essentially) its function signature and specifies the return type. With pipe syntax, you lose some of this readability.
I agree, but that seems like a solvable problem to me.
Out of the Gang of Four Design Patterns, the builder pattern is one that isn’t completely irredeemable.
And—surprise, surprise—it fits pretty well into pipelining. In any situation where you need to construct a complex, stateful object (e.g. a client or runtime), it’s a great way to feed complex, optional arguments into the object.
Some people say they prefer optional/named arguments, but honestly, I don’t understand why: an optional named foo parameter is harder to track down in code (and harder to mark as deprecated!) than all instances of a .setFoo() builder function.
If you have no clue what I’m talking about, this is the type of pattern I mean: you have a ‘builder’ object, call some methods on it to configure it, and finally build() the object you’re actually interested in.
use tokio::runtime::Builder;
fn main() {
// build runtime
let runtime = Builder::new_multi_thread()
.worker_threads(4)
.thread_name("my-custom-name")
.thread_stack_size(3 * 1024 * 1024)
.build()
.unwrap();
// use runtime ...
}
This, too, is pipelining.
Haskell is hard to read.
It has these weird operators like <$>, <*>, $, or >>=, and when you ask Haskell programmers what they mean, they say something like “Oh, this is just a special case of the generalized Kleisli Monad Operator >=> in the category of endo-pro-applicatives over a locally small poset.” and your eyes have glazed over before they’ve even finished the sentence.
(It also doesn’t help that Haskell allows you to define custom operators however you please, yes.)
If you’re wondering “How could a language have so many bespoke operators?”, my understanding is that most of them are just fancy ways of telling Haskell to compose some functions in a highly advanced way. Here’s the second-most basic2 example, the $ operator.
Imagine you have functions foo, bar, and some value data. In a “““normal””” language you might write foo(data). In Haskell, this is written as foo data. This is because foo will automatically ‘grab’ values to the right as its arguments, so you don’t need the parentheses.
A consequence of this is that bar(foo(data)) is written as bar (foo data) in Haskell. If you wrote bar foo data, the compiler would interpret it as bar(foo)(data), which would be wrong. This is what people mean when they say that Haskell’s function call syntax is left-associative.
The $ operator is nothing but syntactic sugar that allows you to write bar $ foo data instead of having to write bar (foo data). That’s it. People were fed up with having to put parens everywhere, I guess.
If your eyes glazed over at this point, I can’t blame you.
Let’s get back on track.
Talking about any of the fancier operators would be punching well above my weight-class, so I’ll just stick to what I’ve been saying throughout this entire post already. Here’s a stilted Haskell toy example, intentionally not written in pointfree style.
-- Take an input string `content`
-- Split into lines, check whether each line is a palindrome and stringify
-- Ex. "foo\nradar" -> "False\nTrue"
checkPalindromes :: String -> String
checkPalindromes content = unlines $ map (show . isPalindrome) $ lines $ map toLower content
where
isPalindrome xs = xs == reverse xs
If you want to figure out the flow of data, this whole function body has to be read right-to-left.
To make things even funnier, you need to start with the where clause to figure out which local “variables” are being defined. This happens (for whatever reason) at the end of the function instead of at the start. (Calling isPalindrome a variable is misleading, but that’s beside the point.)
At this point you might wonder if Haskell has some sort of pipelining operator, and yes, it turns out that one was added in 2014! That’s pretty late, considering that Haskell has existed since 1990. This allows us to refactor the above code as follows:
checkPalindromes :: String -> String
checkPalindromes content =
content
& map toLower
& lines
& map (show . isPalindrome)
& unlines
where
isPalindrome xs = xs == reverse xs
Isn’t that way easier to read?
This is code which you can show to an enterprise Java programmer, tell them that they’re looking at Java Streams with slightly weird syntax, and they’ll get the idea.
Of course, in reality nothing is as simple. The Haskell ecosystem seems to be split between users of $, users of &, and users of the Flow-provided operators, which provide the same functionality but allow you to write |> instead of &.3
I don’t know what to say about that, other than that—not entirely unlike C++—Haskell has its own share of operator-related and cultural historical baggage, and a split ecosystem, and this makes the language significantly less approachable than it has to be.
In the beginning I said that ‘Pipelining is the feature that allows you to omit a single argument from your parameter list, by instead passing the previous value.’
I still think that this is true, but it doesn’t get across the whole picture. If you’ve paid attention in the previous sections, you’ll have noticed that object.member and iterator & map share basically nothing in common outside of the order of operations.
In the first case, we’re accessing a value that’s scoped to the object. In the second, we’re ‘just’ passing an expression to a free-standing function.
Or in other words, pipelining is not the same as pipelining. Even from an IDE-perspective, they’re different. In Java, your editor will look for methods associated with an object and walk up the inheritance chain. In Haskell, your editor will put a so-called ’typed hole’, and try to deduce which functions have a type that ‘fits’ into the hole using Hindley-Milner Type Inference.
Personally, I like type inference (and type classes), but I also like if types have a namespace attached to them, with methods and associated functions. I am pragmatic like that.
What I like about Rust is that it gives me the best out of both worlds here: You get traits and type inference without needing to wrap your head around a fully functional, immutable, lazy, monad-driven programming paradigm, and you get methods and associated values without the absolute dumpster fire of complex inheritance chains or AbstractBeanFactoryConstructors.
I’ve not seen any other language that even comes close to the convenience of Rust’s pipelines, and its lack of higher-kinded types or inheritance did not stop it. Quite the opposite, if anything.
I like pipelining. That’s the one thing that definitely should be obvious if you’ve read all the way through this article.
I just think they’re neat, y’know?
I like reading my code top-to-bottom, left-to-right instead of from-the-inside-to-the-outside.
I like when I don’t need to count arguments and parentheses to figure out which value is the first argument of the second function, and which is the second argument of the first function.
I like when my editor can show me all fields of a struct, and all methods or functions associated with a value, just when I press . on my keyboard. It’s great.
I like when git diff and the blame layer of the code repository don’t look like complete ass.
I like when adding a function call in the middle of a process doesn’t require me to parse the whole line to add the closing parenthesis, and doesn’t require me to adjust the nesting of the whole block.
I like when my functions distinguish between ‘a main value which we are acting upon’ and ‘secondary arguments’, as opposed to treating them all as the same.
I like when I don’t have to pollute my namespaces with a ton of helper variables or free-standing functions that I had to pull in from somewhere.
If you’re writing pipelined code—and not trying overly hard to fit everything into a single, convoluted, nested pipeline—then your functions will naturally split up into a few pipeline chunks.
Each chunk starts with a piece of ‘main data’ that travels on a conveyor belt, where every line performs exactly one action to transform it. Finally, a single value comes out at the end and gets its own name, so that it may be used later.
And that is—in my humble opinion—exactly how it should be. Neat, convenient, separated ‘chunks’, each of which can easily be understood in its own right.
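To make that concrete, here’s a hedged sketch of what I mean by ‘chunks’ (the Report type, its fields, and the second pipeline are hypothetical, and Id is assumed to implement Display purely for illustration):
// Hypothetical output type, just for the example.
struct Report {
    ids: Vec<Id>,
    summary: String,
}

fn build_report(data: Vec<Widget>) -> Report {
    // Chunk 1: the widgets are the 'main data', transformed into a list of ids.
    let ids: Vec<Id> = data.iter()
        .filter(|w| w.alive)
        .map(|w| w.id)
        .collect();

    // Chunk 2: the ids are now the 'main data', transformed into a summary string.
    let summary = ids.iter()
        .map(|id| id.to_string())
        .collect::<Vec<_>>()
        .join(", ");

    // Each chunk ends in exactly one named value; only those names are used later.
    Report { ids, summary }
}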
Thanks to kreest for proofreading this article.
The author keeps calling it "pipelining", but I think the right term is "method chaining".
Compare with a simple pipeline in bash:
grep needle < haystack.txt | sed 's/foo/bar/g' | xargs wc -l
Each of those components executes in parallel, with the intermediate results streaming between them. You get a similar effect with coroutines.
Compare Ruby:
data = File.readlines("haystack.txt")
.map(&:strip)
.grep(/needle/)
.map { |i| i.gsub('foo', 'bar') }
.map { |i| File.readlines(i).count }
In that case, each line is processed sequentially, with a complete array being created between each step. Nothing actually gets pipelined.
Despite being clean and readable, I don't tend to do it any more, because it's harder to debug. More often these days, I write things like this:
data = File.readlines("haystack.txt")
data = data.map(&:strip)
data = data.grep(/needle/)
data = data.map { |i| i.gsub('foo', 'bar') }
data = data.map { |i| File.readlines(i).count }
It's ugly, but you know what? I can set a breakpoint anywhere and inspect the intermediate states without having to edit the script in prod. Sometimes ugly and boring is better.
> The author keeps calling it "pipelining", but I think the right term is "method chaining". [...] You get a similar effect with coroutines.
The inventor of the shell pipeline, Douglas McIlroy, always understood the equivalence between pipelines and coroutines; it was deliberate. See https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf It goes even deeper than it appears, too. The way pipes were originally implemented in the Unix kernel was that when the pipe buffer was filled[1] by the writer, the kernel continued execution directly in the blocked reader process without bouncing through the scheduler. Effectively, arguably literally, coroutines; one process calls the write function and execution continues with a read call returning the data.
Interestingly, Solaris Doors operate the same way by design--no bouncing through the scheduler--unlike pipes today where long ago I think most Unix kernels moved away from direct execution switching to better support multiple readers, etc.
[1] Or even on the first write? I'd have to double-check the source again.
I don’t find your “seasoned developer” version ugly at all. It just looks more mature and relaxed. It also has the benefit that you can actually do error handling and have space to add comments. Maybe people don’t like it because of the repetition of “data =”, but in fact you could use descriptive new variable names, making the code even more readable (auto-documenting). I’ve always felt method chaining looks “cramped”, if that’s the right word. Like a person drawing on paper but only using the upper left corner. However, this surely is also a matter of preference or what you’re used to.
I have a lot of code like this. The reason I prefer pipelines now is the mental overhead of understanding the intermediate step variables.
Something like
lines = File.readlines("haystack.txt")
stripped_lines = lines.map(&:strip)
needle_lines = stripped_lines.grep(/needle/)
transformed_lines = needle_lines.map { |line| line.gsub('foo', 'bar') }
line_counts = transformed_lines.map { |file_path| File.readlines(file_path).count }
is a hell to read and understand later imo. You have to read a lot of intermediate variables that do not matter anywhere else in the code after you set them up, but you do not necessarily know in advance which matter and which don't unless you read and understand all of it. Also, it pollutes your workspace with too much stuff, so while this makes it easier to debug, it also makes it harder to read some time after. Moreover, it becomes even more clunky if you need to repeat code. You probably need to define a function block then, which moves the clunkiness there.
What I do now is start by defining the transformation in each step as a pure function, and chain them afterwards once everything works, plus enclosing it in an error handler so that I depend on breakpoint debugging less.
There is certainly a trade off, but as a codebase grows larger and deals with more cases where the same code needs to be applied, the benefits of a concise yet expressive notation shows.
Code in this "named-pipeline" style is already self-documenting: using the same variable name makes it clear that we are dealing with a pipeline/chain. Using more descriptive names for the intermediate steps hides this, making each line more readable (and even then you're likely to end up with `dataStripped = data.map(&:strip)`) at the cost of making the block as a whole less readable.
> Maybe people don’t like it because of the repetition of “data =“
Eh, at first glance it looks "amateurish" due to all the repeated stuff. Chaining explicitly eliminates redundant operations - a more minimal representation of data flow - so it looks more "professional". But I also know better than to act on that impulse. ;)
That said, it really depends on the language at play. Some will compile all the repetition of `data =` away such that the variable's memory isn't re-written until after the last operation in that list; it'll hang out in a register or on the stack somewhere. Others will run the code exactly as written, bouncing data between the heap, stack, and registers - inefficiencies and all.
IMO, a comment like "We wind up debugging this a lot, please keep this syntax" would go a long way to help the next engineer. Assuming that the actual processing dwarfs the overhead present in this section, it would be even better to add discrete exception handling and post-conditions to make it more robust.
In most debuggers I have used, if you put a breakpoint on the first line of the method chain, you can "step over" each function in the chain until you get to the one you want.
Bit annoying, but serviceable. Though there's nothing wrong with your approach either.
debuggers can take it even further if they want that UX. in firefox given a chain of foo().bar().baz() you can set a breakpoint on any of 'em.
https://gist.github.com/user-attachments/assets/3329d736-70f...
> The author keeps calling it "pipelining", but I think the right term is "method chaining".
Allow me, too, to disagree. I think the right term is "function composition".
Instead of writing
h(g(f(x)))
as a way to say "first apply f to x, after which g is applied to the result of this, after which h is applied to the result of this", we can use function composition to compose f, g and h, and then "stuff" the value x into this "pipeline of composed functions".
We can use whatever syntax we want for that, but I like Elm syntax, which would look like:
x |> f >> g >> h
If you add in a call to ".lazy" it won't create all the intermediate arrays. It's been there since at least 2.7. https://ruby-doc.org/core-2.7.0/Enumerator/Lazy.html
I do the same with Python, replacing multilevel comprehensions with intermediary steps of generator expressions, which are lazy and therefore do not impact performance and memory usage.
Ultimately it will depend on the functions being chained. If they can work with one part of the result, or a subset of parts, then they might not block, otherwise they will still need to get a complete result and the lazy cannot help.
Not much different from having a `sort` in shell pipeline I guess?
I think the best term is "function composition", but with a particular syntax so pipelining seems alright. Method chaining is a common case, where some base object is repeatedly modified by some action and then the object reference is returned by the "method", thus allowing the "chaining", but what if you're not dealing with objects and methods? The pipelined composition pattern is more general than method chaining imho.
You make an interesting point about debugging which is something I have also encountered in practice. There is an interesting tension here which I am unsure about how to best resolve.
In PRQL we use the pipelining approach by using the output of the last step as the implicit last argument of the next step. In M Lang (MS Power BI/Power Query), which is quite similar in many ways, they use second approach in that each step has to be named. This is very useful for debugging as you point out but also a lot more verbose and can be tedious. I like both but prefer the ergonomics of PRQL for interactive work.
Update: Actually, PRQL has a decent answer to this. Say you have a query like:
from invoices
filter total > 1_000
derive invoice_age = @2025-04-23 - invoice_date
filter invoice_age > 3months
and you want to figure out why the result set is empty. You can pipe the results into an intermediate reference like so:
from invoices
filter total > 1_000
into tmp
from tmp
derive invoice_age = @2025-04-23 - invoice_date
filter invoice_age > 3months
So, good ergonomics on the happy path and a simple enough workaround when you need it. You can try these out in the PRQL Playground btw: https://prql-lang.org/playground/
> The author keeps calling it "pipelining", but I think the right term is "method chaining".
I believe the correct definition for this concept is the Thrush combinator[0]. In some ML-based languages[1], such as F#, the |> operator is defined[2] for same:
[1..10] |> List.map (fun i -> i + 1)
Other functional languages have libraries which also provide this operator, such as the Scala Mouse[3] project.
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush
1 - https://en.wikipedia.org/wiki/ML_(programming_language)
2 - https://fsharpforfunandprofit.com/posts/defining-functions/
I'm not sure that's right, method chaining is just immediately acting on the return of the previous function, directly. It doesn't pass the return into the next function like a pipeline. The method must exist on the returned object. That is different to pipelines or thrush operators. Evaluation happens in the order it is written.
Unless I misunderstood the author, because method chaining is super common where I feel thrush operators are pretty rare, I would be surprised if they meant the latter.
They cite Gleam explicitly, which has a thrush operator in place of method chaining.
I get the impression (though I haven't checked) that the thrush operator is a backport of OOP-style method chaining to functional languages that don't support dot-method notation.
Shouldn’t modern debuggers be able to handle that easily? You can step in, step out, until you get where you want, or you could set a breakpoint in the method you want to debug instead of at the call site.
Even if your debugger can't do that, an AI agent can easily change the code for you to add intermediate output.
...an AI agent can independently patch your debugger to modify the semantics? Wow that's crazy.
Incidentally, have you ever considered investing in real estate? I happen to own an interest in a lovely bridge which, for personal reasons, I must suddenly sell at a below-market price.
> Despite being clean and readable, I don't tend to do it any more, because it's harder to debug. More often these days, I write things like this:
data = File.readlines("haystack.txt")
data = data.map(&:strip)
data = data.grep(/needle/)
data = data.map { |i| i.gsub('foo', 'bar') }
data = data.map { |i| File.readlines(i).count }
Hard disagree. It's less readable, the intent is unclear (where does it end?), and the variables are rewritten on every step and everything is named "data" (and please don't call them data_1, data_2, ...) so now you have to run a debugger to figure out what even is going on, rather than just... reading the code.
The person you are quoting already conceded that it is less readable, but that the ability to set a breakpoint easily (without having to stop the process and modify the code) is more important.
I myself agree, and find myself doing that too, especially in frontend code that executes in a browser. Debuggability is much more important than marginally-better readability, for production code.
> Debuggability is much more important than marginally-better readability, for production code.
I find this take surprising. I guess it depends on how much weight you give to "marginally-better", but IMHO readability is the single most important factor when it comes to writing code in most code-bases. You write code once, it may need to be debugged (by yourself or others) on rare occasions. However anytime anyone needs to understand the code (to update it, debug it, or just make changes in adjacent code) they will have to read it. In a shared code-base your code will be read many more times than it will be updated/debugged.
Yeah, part of it is that I do find
const foo = something()
.hoge()
.hige()
.hage();
better, sure, but not actually significantly harder to read than:
let foo = something();
foo = foo.hoge();
foo = foo.hige();
foo = foo.hage();
But, while reading is more common than debugging, debugging a production app is often more important. I guess I am mostly thinking about web apps, because that is the area where I have mainly found the available debuggers lacking. Although they are getting better, I believe, I've frequently seen problems where they can't debug into some standard language feature because it's implemented in C++ native code, or they just don't expose the implicit temporary variables in a useful way.
(I also often see similar-ish problems in languages where the debuggers just aren't that advanced, due to lack of popularity, or whatever.)
Particularly with web apps, though, we often want to attach to the current production app for initial debugging instead of modifying the app and running it locally, usually because somebody has reported a bug that happens in production (but how to reproduce it locally is not yet clear).
Alternatively stated, I guess, I believe readability is important, and maybe the "second most important thing", but nevertheless we should not prefer fancy/elegant code that feels nice to us to write and read, but makes debugging more difficult (with the prevailing debuggers) in any significant way.
In an ideal world, a difference like the above wouldn't be harder to debug, in which case I would also prefer the first version.
(And probably in the real world, the problems would be with async functions less conducive to the pithy hypothetical example. I'm a stalwart opponent of libraries like RxJs for the sole reason that you pay back with interest all of the gains you realized during development, the first time you have to debug something weird.)
> Each of those components executes in parallel, with the intermediate results streaming between them. You get a similar effect with coroutines.
Processes run in parallel, but they process the data in a strict sequential order: «grep» must produce a chunk of data before «sed» can proceed, and «sed» must produce another chunk of data before «xargs» can do its part. «xargs» in no way can ever pick up the output of «grep» and bypass the «sed» step. If the preceding step is busy crunching the data and is not producing the data, the subsequent step will be blocked (the process will fall asleep). So it is both, a pipeline and a chain.
It is actually a directed data flow graph.
Also, if you replace «haystack.txt» with a /dev/haystack, i.e.
grep needle < /dev/haystack | sed 's/foo/bar/g' | xargs wc -l
and /dev/haystack is waiting on the device it is attached to to yield a new chunk of data, all of the three, «grep», «sed» and «xargs», will block.
For debugging method chains you can just use `tap`
Isn't the difference between a pipeline and a method chain that a pipeline doesn't have to wait for the previous process to complete in order to send results to the next step? Grep sends lines as it finds them to sed and sed on to xargs, which acts as a sink to collect the data (an is necessary otherwise wc -l would write out a series of ones).
Given File.readlines("haystack.txt"), the entire file must be resident in memory before .grep(/needle/) is performed, which may cause unnecessary utilization. Iirc, in frameworks like Polars, the collect() chain ending method tells the compiler that the previous methods will be performed as a stream and thus not require pulling the entirety into memory in order to perform an operation on a subset of the corpus.
Yeah, I've always heard this called method chaining. It's widespread in C#, particularly with Linq (which was explicitly designed to leverage it).
I've only ever heard the term 'pipelining' in reference to GPUs, or as an abstract umbrella term for moving data around.
I have to object against reusing the 'data' var. Make up a new name for each assignment in particular when types and data structures change (like the last step is switching from strings to ints).
Other than that I think both styles are fine.
I agree with this comment: https://news.ycombinator.com/item?id=43759814 that this pollutes current scope, which is especially bad if scoping is not that narrow (the case in Python, where if-branches do not define their own scope; I don't know about Ruby).
Another problem of having different names for each step is that you can no longer quickly comment out a single step to try things out, which you can if you either have the pipeline or a single variable name.
In Python, steps like map() and filter() would execute concurrently, without large intermediate arrays. It lacks the chaining syntax for them, too.
Java streams are the closest equivalent, both by the concurrent execution model, and syntactically. And yes, the Java debugger can show you the state of the intermediate streams.
> would execute concurrently
Iterators are not (necessarily) concurrent. I believe you mean lazily.
Concurrent, not parallel.
That is, iterators' execution flow is interspersed, with the `yield` statement explicitly giving control to another coroutine, and then continuing the current coroutine at another yield point, like the call to next(). This is very similar to JS coroutines implemented via promises, with `await` yielding control.
Even though there is only one thread of execution, the parts of the pipeline execute together in lockstep, not sequentially, so there's no need for a previous part to completely compute a large list before the following part can start iterating over it.
If you work with I/O, where you can have all sorts of wrong/invalid data and I/O errors, the chaining is a nightmare, as each step in the chain can have numerous different errors/exceptions.
the chaining really only works if your language is strongly typed and you are somewhat guaranteed that variables will be of expected type.
Syntactic sugar can sometimes fool us into thinking the underlying process is more efficient or streamlined. As a new programmer, I probably would have assumed that "storing" `data` at each step would be more expensive.
It absolutely becomes very inefficient, though the threshold data set size varies according to context. Most languages don't have lightweight coroutines as an alternative (but see Lua!), so the convenient alternatives have a larger fixed cost. Plus, cache locality means cache utilization might be just as helpful, or even better, as opposed to switching back and forth for every data element, though coroutine-based approaches can also use buffering strategies, which not coincidentally is how pipes work.
But, yes, naive call chaining like that is sometimes a significant performance problem in the real world. For example, in the land of JavaScript. One of the more egregious examples I've personally seen was a Bash script that used Bash arrays rather than pipelines, though in that case it had to do with the loss of concurrency, not data churn.
It depends on the language you're using.
For my Ruby example, each of those method calls will allocate an Array on the heap, where it will persist until all references are removed and the GC runs again. The extra overhead of the named reference is somewhere between Tiny and Zero, depending on your interpreter. No extra copies are made; it's just a reference.
In most compiled languages: the overhead is exactly zero. At runtime, nothing even knows it's called "data" unless you have debug symbols.
If these are going to be large arrays and you actually care about memory usage, you wouldn't write the code the way I did. You might use lazy enumerators, or just flatten it out into a simple procedure; either of those would process one line at a time, discarding all the intermediate results as it goes.
Also, "File.readlines(i).count" is an atrocity of wasted memory. If you care about efficiency at all, that's the first part to go. :)
Reading this, I am so happy that my first language was a scheme where I could see the result of the first optimization passes.
This helped me quickly develop a sense for how code is optimized and what code is eventually executed.
Exactly that. It looks nice, but it's annoying to debug.
I do it in a similar way to what you mentioned.
I think updating the former to the latter when you are actually debugging something isn’t that big of a deal.
But with actually checked in code, the tradeoff in readability is pretty substantial
I'm personally someone who advocates for languages to keep their feature set small and shoot to achieve a finished feature set quickly.
However.
I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.
```
params
|> Map.get("user")
|> create_user()
|> notify_admin()
```
We might be able to cross one more language off your wishlist soon: JavaScript is on the way to getting a pipeline operator, and the proposal is currently at Stage 2
https://github.com/tc39/proposal-pipeline-operator
I'm very excited for it.
It also has barely seen any activity in years. It is going nowhere. The TC39 committee is utterly dysfunctional and anti-progress, and will not let this or any other new syntax into JavaScript. Records and tuples have just been killed, despite being cited in surveys as a major missing feature[1]. Pattern matching is stuck in stage 1 and hasn't been presented since 2022. Ditto for type annotations and a million other things.
Our only hope is if TypeScript finally gives up on the broken TC39 process and starts to implement its own syntax enhancements again.
[1] https://2024.stateofjs.com/en-US/usage/#top_currently_missin...
I wouldn’t hold your breath for TypeScript introducing any new supra-JS features. In the old days they did a little bit, but now those features (namely enums) are considered harmful.
More specifically, with the (also ironically gummed up in tc39) type syntax [1], and importantly node introducing the --strip-types option [2], TS is only ever going to look more and more like standards compliant JS.
Records and Tuples weren't stopped because of tc39, but rather the engine developers. Read the notes.
Aren't the engine devs all part of the TC39 committee? I know they stopped SIMD in JS because they were more interested in shipping WASM, and then adding SIMD to it.
I would say representatives of the engine teams are involved. However not involved enough clearly, because it should have been withdrawn waaay before now due to this issue.
It was also replaced with the Composite proposal, which is similar but not exactly the same.
I was excited for that proposal, but it veered off course some years ago – some TC39 members have stuck to the position that without member property support or async/await support, they will not let the feature move forward.
It seems like most people are just asking for the simple function piping everyone expects from the |> syntax, but that doesn't look likely to happen.
I don't actually see why `|> await foo(bar)` wouldn't be acceptable if you must support futures.
I'm not a JS dev so idk what member property support is.
Seems like it'd force the rest of the pipeline to be peppered with `await` which might not be desirable
"bar"
|> await getFuture(%);
|> baz(await %);
|> bat(await %);
My guess is the TC committee would want this to be more seamless.
This also gets weird because if the `|>` is a special function that sends in a magic `%` parameter, it'd have to be context sensitive to whether or not an `async` thing happens within the bounds. Whether or not it does will determine if the subsequent pipes are dealing with a future of % or just % directly.
It wouldn't though? The first await would... await the value out of the future. You still do the syntactic transformation with the magic parameter. In your example you're awaiting the future returned by getFuture twice and improperly awaiting the output of baz (which isn't async in the example).
In reality it would look like:
"bar"
|> await getFuture()
|> baz()
|> await bat()
(assuming getFuture and bat are both async). You do need |> to be aware of the case where the await keyword is present, but that's about it. The above would effectively transform to:
await bat(baz(await getFuture("bar")));
I don't see the problem with this.
Correct me if I'm wrong, but if you use the below syntax
"bar"
|> await getFuture()
How would you disambiguate it from your intended meaning and the below:
"bar"
|> await getFutureAsyncFactory()
Basically, an async function that returns a function which is intended to be the pipeline processor.
Typically in JS you do this with parens like so:
(await getFutureAsyncFactory())("input")
But the use of parens doesn't transpose to the pipeline setting well IMO
I don't think |> really can support applying the result of one of its composite applications in general, so it's not ambiguous.
Given this example:
(await getFutureAsyncFactory("bar"))("input")
the getFutureAsyncFactory function is async, but the function it returns is not (or it may be and we just don't await it). Basically, using |> like you stated above doesn't do what you want. If you wanted the same semantics, you would have to do something like:
("bar" |> await getFutureAsyncFactory())("input")
to invoke the returned function.
The whole pipeline takes on the value of the last function specified.
Ah sorry I didn't explain properly, I meant
a |> await f()
and a |> (await f())
Might be expected to do the same thing.
But the latter is syntactically indistinguishable from
a |> await returnsF()
What do you think about a |> f |> g
Where you don't really call the function with () in the pipeline syntax? I think that would be more natural.
It's still not ambiguous. Your second example would be a syntax error (probably, if I was designing it at least) because you're missing the invocation parenthesis after the wrapped value:
a |> (await f())()
which removes any sort of ambiguity. Your first example calls f() with a as its first argument while the second (after my fix) calls and awaits f() and then invokes that result with a as its first argument.
For the last example, it would look like:
a |> (await f())() |> g()
assuming f() is still async and returns a function. g() must be a function, so the parentheses have to be added.
I worry about "soon" here. I've been excited for this proposal for years now (8 maybe? I forget), and I'm not sure it'll ever actually get traction at this point.
A while ago, I wondered how close you could get to a pipeline operator using existing JavaScript features. In case anyone might like to have a look, I wrote a proof-of-concept function called "Chute" [1]. It chains function and method calls in a dot-notation style like the basic example below.
chute(7) // setup a chute and give it a seed value
.toString // call methods of the current data (parens optional)
.parseInt // send the current data through global native Fns
.do(x=>[x]) // through a chain of one or more local / inline Fns
.JSON.stringify // through nested global functions (native / custom)
.JSON.parse
.do(x=>x[0])
.log // through built in Chute methods
.add_one // global custom Fns (e.g. const add_one=x=>x+1)
() // end a chute with '()' and get the result
[1] https://chute.pages.dev/ | https://github.com/gregabbott/chute
PHP RFC for version 8.5 too: https://wiki.php.net/rfc/pipe-operator-v3
All of their examples are wordier than just function chaining and I worry they’ve lost the plot somewhere.
They list this as a con of F# (also Elixir) pipes:
value |> x=> x.foo()
The insistence on an arrow function is pure hallucination
value |> x.foo()
Should be perfectly achievable as it is in these other languages. What's more, doing so removes all of the handwringing about await. And I'm frankly at a loss why you would want to put yield in the middle of one of these chains instead of after.
I believe you meant to say we will need a transpiler, not polyfill. Of course, a lot of us are already using transpilers, so that's nothing new.
How do you polyfill syntax?
Letting your JS/TS compiler convert it into supported form. Not really a polyfill, but it allows to use new features in the source and still support older targets. This was done a lot when ES6 was new, I remember.
Polyfills are for runtime behavior that can't be replicated with a simple syntax transformation, such as adding new functions to built-in objects like string.prototype contains or the Symbol constructor and prototype or custom elements.
I haven't looked at the member properties bits but I suspect the pipeline syntax just needs the transform to be supported in build tools, rather than adding yet another polyfill.
I prefer Scala. You can write
``` params.get("user") |> create_user |> notify_admin ```
Even more concise and it doesn't even require a special language feature, it's just regular syntax of the language ( |> is a method like .get(...) so you could even write `params.get("user").|>(create_user)` if you wanted to)
In elixir, ```Map.get("user") |> create_user |> notify_admin ``` would aso be valid, standard elixir, just not idiomatic (parens are optional, but preferred in most cases, and one-line pipes are also frowned upon except for scripting).
With the disclaimer that I don't know Elixir and haven't programmed with the pipeline operator before: I don't like that special () syntax. That syntax denotes application of the function without passing any arguments, but the whole point here is that an argument is being passed. It seems clearer to me to just put the pipeline operator and the name of the function that it's being used with. I don't see how it's unclear that application is being handled by the pipeline operator.
Also, what if the function you want to use is returned by some nullary function? You couldn't just do |> getfunc(), as presumably the pipeline operator will interfere with the usual meaning of the parentheses and will try to pass something to getfunc. Would |> ( getfunc() ) work? This is the kind of problem that can arise when one language feature is permitted to change the ordinary behaviour of an existing feature in the name of convenience. (Unless of course I'm just missing something.)
I am also confused with such syntax of "passing as first argument" pipes. Having to write `x |> foo` instead of `x |> foo()` does not solve much, because you have the same lack of clarity if you need to pass a second argument. Ie `x |> foo(y)` in this case means `foo(x,y)`, but if `foo(y)` actually gives you a function to apply to `x` prob you should write `x |> foo(y)()` or `x |> (foo(y))()` then as I understand it? If that even makes sense in a language. In any case, you have the same issue as before, in different contexts `foo(y)` is interpreted differently.
I just find this syntax too inconsistent and vague, and hence actually annoying. Which is why I prefer defining pipes as composition of functions which can then be applied to whatever data. Then eg one can write sth like `(|> foo1 foo2 (foo3) #(foo4 % y))` and know that foo1 and foo2 are references to functions, foo3 evaluates to another function, and when one needs more arguments in foo4 they have to explicitly state that. This gives another function, and there is no ambiguity here whatsoever.
It would be silly to use a pipeline for x |> foo(). What's nice is being able to write:
def main_loop(%Game{} = game) do
game
|> get_move()
|> play_move()
|> win_check()
|> end_turn()
end
instead of the much harder to read:
def main_loop(%Game{} = game) do
end_turn(win_check(play_move(get_move(game))))
end
For an example with multiple parameters, this pipeline:
schema
|> order_by(^constraint)
|> Repo.all()
|> Repo.preload(preload_opts)
would be identical to this:
Repo.preload(Repo.all(order_by(schema, ^constraint)), preload_opts)
To address your question above:
> if `foo(y)` actually gives you a function to apply to `x` prob you should write `x |> foo(y)()`
If foo(y) returned a function, then to call it with x, you would have to write foo(y).(x) or x |> foo(y).(), so the syntax around calling the anonymous function isn't affected by the pipe. Also, you're not generally going to be using pipelines with functions that return functions so much as with functions that return data which is then consumed as the first argument by the next function in the pipeline. See my previous comment on this thread for more on that point.
There's no inconsistency or ambiguity in the pipeline operator's behavior. It's just syntactic sugar that's handy for making your code easier to read.
> Having to write `x |> foo` instead of `x |> foo()` does not solve much, because you have the same lack of clarity if you need to pass a second argument
That's actually true. In Scala that is not so nice, because then it becomes `x |> foo(_, arg2)` or, even worse, `x |> (param => foo(param, arg2))`. I have a few such cases in my sourcecode and I really don't like it. Haskell and PureScript do a much better job keeping the code clean in such cases.
> It seems clearer to me to just put the pipeline operator and the name of the function that it's being used with.
I agree with that and it confused me that it looks like the function is not referenced but actually applied/executed.
Oh that's nice!
Isn't it being a method call not quite equivalent? Are you able to define the method over arbitrary data types?
In Elixir, it is just a macro so it applies to all functions. I'm only a Scala novice so I'm not sure how it would work there.
> Are you able to define the method over arbitrary data types?
Yes exactly, which is why it is not equivalent. No macro needed here. In Scala 2 syntax:
``` implicit class AnyOps[A](private val a: A) extends AnyVal { def |>[B](f: A => B) = f(a) } ```
> I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.
This is usually the Thrush combinator[0], exists in other languages as well, and can be informally defined as:
f(g(x)) = g(x) |> f
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush
Not quite. Note that the Elixir pipe puts the left hand of the pipe as the first argument in the right-hand function. E.g.
x |> f(y) = f(x, y)
As a result, the Elixir variant cannot be defined as a well-typed function, but must be a macro.
I've been using Elixir for a long time and had that same hope after having experienced how clear, concise and maintainable apps can be when the core is all a bunch of pipelines (and the boundary does error handling using cases and withs). But having seen the pipe operator in Ruby, I now think it was a bad idea.
The problem is that method-chaining is common in several OO languages, including Ruby. This means the functions on an object return an object, which can then call other functions on itself. In contrast, the pipe operator calls a function, passing in what's on the left side of it as the first argument. In order to work properly, this means you'll need functions that take the data as the first argument and return the same shape, whether that's a list, a map, a string or a struct, etc.
When you add a pipe operator to an OO language where method-chaining is common, you'll start getting two different types of APIs and it ends up messier than if you'd just stuck with chaining method calls. I much prefer passing immutable data into a pipeline of functions as Elixir does it, but I'd pick method chaining over a mix of method chaining and pipelines.
I'm a big fan of the Elixir operator, and it should be standard in all functional programming languages. You need it because everything is just a function and you can't do anything like method chaining, because none of the return values have anything like methods. The |> is "just" syntax sugar for a load of nested functions. Whereas the Rust style method chaining doesn't need language support - it's more of a programming style.
Note also that it works well in Elixir because it was created at the same time as most of the standard library. That means that the standard library takes the relevant argument in the first position all the time. Very rarely do you need to pipe into the second argument (and you need a lambda or convenience function to make that work).
Agree. This is absolutely my fave part of Elixir. Whenever I can get something to flow elegantly thru a pipeline like that, I feel like it’s a win against chaos.
R has a lovely toolkit for data science using this syntax, called the tidyverse. My favorite dev experience, it's so easy to just write code
Yes, a small feature set is important, and adding the functional-style pipe to languages that already have chaining with the dot seems to clutter up the design space. However, dot-chaining has the severe limitation that you can only pass to the first or "this" argument.
Is there any language with a single feature that gives the best of both worlds?
FWIW you can pass to other arguments than first in this syntax
```
params
|> Map.get("user")
|> create_user()
|> (&notify_admin("signup", &1)).() ```
or
```
params
|> Map.get("user")
|> create_user()
|> (fn user -> notify_admin("signup", user) end).() ```
BTW, there's a convenience macro of Kernel.then/2 [0] which IMO looks a little cleaner:
params
|> Map.get("user")
|> create_user()
|> then(&notify_admin("signup", &1))
params
|> Map.get("user")
|> create_user()
|> then(fn user -> notify_admin("signup", user) end)
[0] https://hexdocs.pm/elixir/1.18.3/Kernel.html#then/2
Do concatenative langs like Factor fit the bill?
The pipe operator relies on the first argument being the subject of the operation. A lot of languages have the arguments in a different order, and OO languages sometimes use function chaining to get a similar result.
IIRC the usual workaround in Elixir involves be small lambda that rearranges things:
"World"
|> then(&concat("Hello ", &1))
I imagine a shorter syntax could someday be possible, where some special placeholder expression could be used, ex:
"World"
|> concat("Hello ", &1)
However that creates a new problem: If the implicit-first-argument form is still permitted (foo() instead of foo(&1)) then it becomes confusing which function-arity is being called. A human could easily fail to notice the absence or presence of the special placeholder on some lines, and invoke the wrong thing.
Yeah, R (tidyverse) has `.` as such a placeholder. It is useful but indeed I find the syntax off, though I find the syntax off even without it, anyway. I would rather define pipes as compositions of functions, which are pretty unambiguous in terms of what arguments they get, and then apply these to whatever I want.
Yeah I really hate that syntax and I can’t even explain why so I kind of blot it out, but you’re right.
My dislike does improve my test coverage though, since I tend to pop out a real method instead.
Last time I checked (2020) there were already a few rejected proposals to shorten the syntax for this. It seemed like they were pretty exasperated by them at the time.
You could make use of `flip` from Haskell.
flip :: (x -> y -> z) -> (y -> x -> z)
flip f = \y -> \x -> f x y
x |> (flip f)(y) -- f(x, y)
Pipelines are one of the greatest Gleam features[1].
I wouldn't say it's a Gleam feature per se, in that it's not something that it's added that isn't already in Elixir.
I hate to be that guy, but I believe the `|>` syntax started with F# before Elixir picked it up.
(No disagreements with your post, just want to give credit where it's due. I'm also a big fan of the syntax)
It's older than F#; it's been an ML language thing for a while, but I'm not sure where it first appeared
It seems like it originated in the Isabelle proof assistant ML dialect in the mid 90s https://web.archive.org/web/20190217164203/https://blogs.msd...
I feel like Haskell really missed a trick by having $ not go the other way, though it's trivial to make your own symbol that goes the other way.
Haskell has & which goes the other way:
users
& map validate
& catMaybes
& mapM persist
Yes, `&` (reverse apply) is equivalent to `|>`, but it is interesting that there is no common operator for reversed compose `.`, so function compositions are still read right-to-left.
In my programming language, I added `.>` as a reverse-compose operator, so pipelines of function compositions can also be read uniformly left-to-right, e.g.
process = map validate .> catMaybes .> mapM persist
Elm (written in Haskell) uses |> and <| for pipelining forwards and backwards, and function composition is >> and <<. These have made it into Haskell via nri-prelude https://hackage.haskell.org/package/nri-prelude (written by a company that uses a lot of Elm in order to make writing Haskell look more like writing Elm).
There is also https://hackage.haskell.org/package/flow which uses .> and <. for function composition.
EDIT: in no way do I want to claim the originality of these things in Elm or the Haskell package inspired by it. AFAIK |> came from F# but it could be miles earlier.
Maybe not common, but there’s Control.Arrow.(>>>)
Also you can define (|>) = (&) (with an appropriate fixity declaration) to get
users
|> map validate
|> catMaybes
|> mapM persist
I guess I'm showing how long it's been since I was a student of Haskell then. Glad to see the addition!
I wish there were a variation that can destructure more ergonomically.
Instead of:
```
fetch_data()
|> (fn
{:ok, val, _meta} -> val
:error -> "default value"
end).()
|> String.upcase()
```
Something like this:
```
fetch_data()
|>? {:ok, val, _meta} -> val
|>? :error -> "default value"
|> String.upcase()
```
fetch_data()
|> case do
{:ok, val, _meta} -> val
:error -> "default value"
end
You have the extra "case do...end" block but it's pretty close?
This is for sequential conditions. If you have nested conditions, check out a with block instead. https://dev.to/martinthenth/using-elixirs-with-statement-5e3...
Thanks, that looks good!
It would be even better without the `>`, though. The `|>` is a bit awkward to type, and more noisy visually.
I disagree, because then it can be very ambiguous with an existing `|` operator. The language has to be able to tell that this is a pipeline and not doing a bitwise or operation on the output of multiple functions.
Yes, I’m talking about a language where `|` would be the pipe operator and nothing else, like in a shell. Retrofitting a new operator into an existing language tends to be suboptimal.
Elixir itself adopted this operator from F#
Lisp macros allow a general solution to this that doesn't just handle chained collection operators but allows you to decide the order in which you write any chain of calls.
For example, we can write: (foo (bar (baz x))) as (-> x baz bar foo)
If there are additional arguments, we can accommodate those too: (sin (* x pi)) as (-> x (* pi) sin)
The expression so far gets inserted as the first argument to any form. If you want it inserted as the last argument, you can use ->> instead:
(filter positive? (map sin x)) as (->> x (map sin) (filter positive?))
You can also get full control of where to place the previous expression using as->.
Full details at https://clojure.org/guides/threading_macros
I find the threading operators in Clojure bring much joy and increase readability. I think it's interesting because it makes me actually consider function argument order much more because I want to increase opportunities to use them.
These threading macros can increase performance, the developer even has a parallelizing threading macro.
I use these with xforms transducers.
Yeah, I found this when I was playing around with Hy a while back. I wanted a generic `->` style operator, and isn't wasn't too much trouble to write a macro to introduce one.
That's sort of an argument for the existence of macros as a whole, you can't really do this as neatly in something like python (although I've tried) - I can see the downside of working in a codebase with hundreds of these kind of custom language features though.
Yes threading macros are so much nicer than method chaining, because it allows general function reuse, rather than being limited to the methods that happen to be defined in your initial data object.