Magistral — the first reasoning model by Mistral AI

2025-06-10 14:08 · mistral.ai

Stands to reason.

Announcing Magistral — the first reasoning model by Mistral AI — excelling in domain-specific, transparent, and multilingual reasoning.

The best human thinking isn’t linear — it weaves through logic, insight, uncertainty, and discovery. Reasoning language models have enabled us to augment and delegate complex thinking and deep understanding to AI, improving our ability to work through problems requiring precise, step-by-step deliberation and analysis.

But this space is still nascent. A lack of the specialized depth needed for domain-specific problems, limited transparency, and inconsistent reasoning in the desired language are just some of the known limitations of early thinking models.

Today, we’re excited to announce our latest contribution to AI research with Magistral — our first reasoning model. Released in both open and enterprise versions, Magistral is designed to think things through — in ways familiar to us — while bringing expertise across professional domains, transparent reasoning that you can follow and verify, along with deep multilingual flexibility.

A one-shot physics simulation showcasing gravity, friction and collisions with Magistral Medium in Preview.

Highlights.


Magistral is a dual-release model focused on real-world reasoning and feedback-driven improvement.

  • We’re releasing the model in two variants: Magistral Small — a 24B parameter open-source version and Magistral Medium — a more powerful, enterprise version.

  • Magistral Medium scored 73.6% on AIME 2024, and 90% with majority voting @64. Magistral Small scored 70.7% and 83.3% respectively.

  • Reason natively — Magistral’s chain-of-thought works across global languages and alphabets.

  • Suited for a wide range of enterprise use cases — from structured calculations and programmatic logic to decision trees and rule-based systems.

  • With the new Think mode and Flash Answers in Le Chat, you can get responses at 10x the speed compared to most competitors.

  • The release is supported by our latest paper covering comprehensive evaluations of Magistral, our training infrastructure, reinforcement learning algorithm, and novel observations for training reasoning models. 
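The "majority voting @64" figures above mean sampling 64 answers per problem and scoring the most frequent one. A minimal sketch of how such a score is computed (illustrative helper names, not Mistral's evaluation code):

```python
from collections import Counter

def majority_vote(samples):
    """Return the most common answer among independent samples."""
    return Counter(samples).most_common(1)[0][0]

def maj_at_k_accuracy(per_problem_samples, answers):
    """Fraction of problems where the majority answer matches the reference."""
    correct = sum(
        majority_vote(samples) == answer
        for samples, answer in zip(per_problem_samples, answers)
    )
    return correct / len(answers)

# Toy example: 2 problems, 5 sampled answers each (64 in the real benchmark).
samples = [["42", "42", "41", "42", "40"], ["7", "8", "8", "7", "7"]]
answers = ["42", "7"]
print(maj_at_k_accuracy(samples, answers))  # → 1.0
```

Majority voting can beat pass@1 substantially because individual reasoning traces are noisy, but the model's modal answer is often correct.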

As we’ve open-sourced Magistral Small, we welcome the community to examine, modify and build upon its architecture and reasoning processes to further accelerate the emergence of thinking language models. Our earlier open models have already been leveraged by the community for exciting projects like ether0 and DeepHermes 3.

Purpose-built for transparent reasoning.

Magistral is fine-tuned for multi-step logic, improving interpretability and providing a traceable thought process in the user’s language, unlike general-purpose models.

We aim to iterate the model quickly starting with this release. Expect the models to constantly improve.

Multilingual dexterity.

The model excels in maintaining high-fidelity reasoning across numerous languages. Magistral is especially well-suited to reason in languages including English, French, Spanish, German, Italian, Arabic, Russian, and Simplified Chinese.

Prompt and response in Arabic with Magistral Medium in Preview in Le Chat.

10x faster reasoning with Le Chat.

With Flash Answers in Le Chat, Magistral Medium achieves up to 10x faster token throughput than most competitors. This enables real-time reasoning and user feedback, at scale.

Speed comparison of Magistral Medium in Preview in Le Chat against ChatGPT.

Versatility in application.

Magistral is ideal for general-purpose use cases requiring longer thought processing and better accuracy than non-reasoning LLMs can offer. From legal research and financial forecasting to software development and creative storytelling, this model solves multi-step challenges where transparency and precision are critical.

Business strategy and operations.

Building on our flagship models, Magistral is designed for research, strategic planning, operational optimization, and data-driven decision making — whether executing risk assessment and modelling with multiple factors, or calculating optimal delivery windows under constraints.

Regulated industries and sectors.

Legal, finance, healthcare, and government professionals get traceable reasoning that meets compliance requirements. Every conclusion can be traced back through its logical steps, providing auditability for high-stakes environments with domain-specialized AI.

Systems, software, and data engineering.

Magistral enhances coding and development use cases: compared to non-reasoning models, it significantly improves project planning, backend architecture, frontend design, and data engineering through sequenced, multi-step actions involving external tools or APIs.

Content and communication.

Our early tests indicated that Magistral is an excellent creative companion. We highly recommend it for creative writing and storytelling, with the model capable of producing coherent or — if needed — delightfully eccentric copy.

Availability

Magistral Small is an open-weight model, and is available for self-deployment under the Apache 2.0 license. You can download it from:  

You can try out a preview version of Magistral Medium in Le Chat or via API on La Plateforme.

Magistral Medium is also available on Amazon SageMaker, and soon on IBM WatsonX, Azure AI and Google Cloud Marketplace.

For enterprise and custom solutions, including on-premises deployments, contact our sales team.

Magistral represents a significant contribution by Mistral AI to the open source community, with input from seasoned experts and interns. And we’re keen to grow our family to further shape future AI innovation.

If you’re interested in joining us on our mission to democratize artificial intelligence, we welcome your applications to join our team.



Comments

  • By danielhanchen 2025-06-10 14:21 (7 replies)

    I made some GGUFs for those interested in running them at https://huggingface.co/unsloth/Magistral-Small-2506-GGUF

    ollama run hf.co/unsloth/Magistral-Small-2506-GGUF:UD-Q4_K_XL

    or

    ./llama.cpp/llama-cli -hf unsloth/Magistral-Small-2506-GGUF:UD-Q4_K_XL --jinja --temp 0.7 --top-k -1 --top-p 0.95 -ngl 99

    Please use --jinja for llama.cpp and use temperature = 0.7, top-p 0.95!

    Also best to increase Ollama's context length to say 8K at least: OLLAMA_CONTEXT_LENGTH=8192 ollama serve &. Some other details in https://docs.unsloth.ai/basics/magistral

    • By ozgune 2025-06-10 17:45 (2 replies)

      Their benchmarks are interesting. They are comparing to DeepSeek-V3's (non-reasoning) December and DeepSeek-R1's January releases. I feel that comparing to DeepSeek-R1-0528 would be more fair.

      For example, R1 scores 79.8 on AIME 2024, R1-0528 performs 91.4.

      R1 scores 70 on AIME 2025, R1-0528 scores 87.5. R1-0528 does similarly better for GPQA Diamond, LiveCodeBench, and Aider (about 10-15 points higher).

      https://huggingface.co/deepseek-ai/DeepSeek-R1-0528

      • By derefr 2025-06-11 20:26

        I presume that "outdated upon release" benchmarks like these happen because the benchmark and the models in it were chosen first, before the model was created; and the model's development progress was measured using the benchmark. It then doesn't occur to anyone that the benchmark the engineers had been relying upon isn't also a good/useful benchmark for marketing upon release. From the inside view, it's just a benchmark, already there, already achieving impressive results, a whole-company internal target to hit for months — so why not publish it?

      • By semi-extrinsic 2025-06-10 22:37 (2 replies)

        Would also be interesting to compare with R1-0528-Qwen3-8B (chain-of-thought distilled from Deepseek-R1-0528 and post-trained into Qwen3-8B). It scores 86 and 76 on AIME 2024 and 2025 respectively.

        Currently running the 6-bit XL quant on a single old RTX 2080 Ti and I'm quite impressed TBH. Simply wild for a sub-8GB download.

        • By saratogacx 2025-06-11 15:25 (1 reply)

          I have the same card on my machine at home, what is your config to run the model?

          • By semi-extrinsic 2025-06-11 16:02

            Downloaded the gguf file by unsloth, ran llama-cli from llama.cpp with that file as an argument.

            IIUC, nowadays there is a jinja templated metadata-struct inside the gguf file itself. This contains the chat template and other config.

        • By danielhanchen 2025-06-11 09:02

          I'm surprised it does very well as well - that's pretty cool to see!

    • By danielhanchen 2025-06-10 15:00 (2 replies)

      Their paper https://mistral.ai/static/research/magistral.pdf is also cool! They edited GRPO via:

      1. Removed KL Divergence

      2. Normalize by total length (Dr. GRPO style)

      3. Minibatch normalization for advantages

      4. Relaxing trust region
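The four GRPO modifications listed above can be sketched roughly as follows. This is an illustration under assumed shapes and names, not Mistral's actual implementation; see the paper for the real formulation:

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-4):
    """Group-relative advantages: reward minus the group mean (one group of
    sampled completions per prompt), then normalized over the whole
    minibatch rather than per group (change #3)."""
    rewards = np.asarray(rewards, dtype=float)   # shape: (groups, samples)
    adv = rewards - rewards.mean(axis=1, keepdims=True)
    return adv / (adv.std() + eps)               # minibatch normalization

def policy_loss(logp_new, logp_old, adv, lengths, eps_low=0.2, eps_high=0.3):
    """Clipped surrogate loss with:
    - no KL penalty to the reference model (change #1, i.e. beta = 0)
    - normalization by total generated length (change #2, Dr. GRPO style)
    - an asymmetric, relaxed trust region, eps_high > eps_low (change #4)."""
    ratio = np.exp(logp_new - logp_old)          # per-token importance ratios
    clipped = np.clip(ratio, 1 - eps_low, 1 + eps_high)
    per_token = np.minimum(ratio * adv, clipped * adv)
    return -per_token.sum() / lengths.sum()      # length-normalized
```

The epsilon values here are placeholders; the key structural points are the absent `beta * KL` term and the division by total length instead of per-sequence averaging.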

      • By gyrovagueGeist 2025-06-10 17:58 (1 reply)

        Does anyone know why they added minibatch advantage normalization (or when it can be useful)?

        The paper they cite "What matters in on-policy RL" claims it does not lead to much difference on their suite of test problems, and (mean-of-minibatch)-normalization doesn't seem theoretically motivated for convergence to the optimal policy?

        • By danielhanchen 2025-06-10 23:05

          Tbh I'm unsure as well; I only took a skim of the paper, so if I find anything I'll post it here!

      • By Onavo 2025-06-10 15:26 (3 replies)

        > Removed KL Divergence

        Wait, how are they computing the loss?

        • By danielhanchen 2025-06-10 15:32

          Oh it's the KL term, sorry - beta * KL, i.e. they set beta to 0.

          The goal of it was to "force" the model not to stray too far away from the original checkpoint, but it can hinder the model from learning new things

        • By trc001 2025-06-10 22:37

          It's become trendy to delete it. I say trendy because many papers delete it without offering any proof that it is meaningless

        • By mjburgess 2025-06-10 15:44

          It's just a penalty term that they delete

    • By monkmartinez 2025-06-10 16:49 (1 reply)

      At the risk of dating myself; Unsloth is the Bomb-dot-com!!! I use your models all the time and they just work. Thank you!!! What does llama.cpp normally use if not "jinja" for their templates?

      • By danielhanchen 2025-06-10 23:05

        Oh thanks! Yes I was gonna bring it up to them! Imo if there is a chat template, by default it should be --jinja

      • By fzzzy 2025-06-10 20:12 (2 replies)

        My impression from running the first R1 release locally was that it also does too much thinking.

        • By reissbaker 2025-06-11 07:42

          Magistral Small seems wayyy too heavy-handed with its RL to me:

          \boxed{Hey! How can I help you today?}

          They clearly rewarded the \boxed{...} formatting during their RL training, since it makes it easier to naively extract answers to math problems and thus verify them. But Magistral uses it for pretty much everything, even when it's inappropriate (in my own testing as well).

          It also forgets to <think> unless you use their special system prompt reminding it to.

          Honestly a little disappointing. It obviously benchmarks well, but it seems a little overcooked on non-benchmark usage.
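The \boxed{...} reward the commenter describes exists because a verifier can extract the final answer mechanically during RL. A naive sketch of such an extractor (hypothetical; real graders also handle nested braces and normalize the answer):

```python
import re

def extract_boxed(text):
    """Return the contents of the last \\boxed{...} span, or None.
    Naive: assumes no nested braces inside the box."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

print(extract_boxed(r"The answer is \boxed{42}."))  # → 42
print(extract_boxed(r"\boxed{Hey! How can I help you today?}"))
```

Rewarding only extractable answers makes math verification easy, which is presumably why the formatting leaks into ordinary chat replies like the one quoted above.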

        • By cluckindan 2025-06-10 20:52 (3 replies)

          It does not do any thinking. It is a statistical model, just like the rest of them.

          • By LordDragonfang 2025-06-10 22:29 (3 replies)

            "Thinking" is a term of art referring to the hidden/internal output of "reasoning" models where they output "chain of thought" before giving an answer[1]. This technique and name stem from the early observation that LLMs do better when explicitly told to "think step by step"[2]. Hope that helps clarify things for you for future constructive discussion.

            [1] https://arxiv.org/html/2410.10630v1

            [2] https://arxiv.org/pdf/2205.11916

            • By bobsomers 2025-06-10 22:43 (1 reply)

              We are aware of the term of art.

              The point that was trying to be made, which I agree with, is that anthropomorphizing a statistical model isn’t actually helpful. It only serves to confuse laypersons into assuming these models are capable of a lot more than they really are.

              That’s perfect if you’re a salesperson trying to dump your bad AI startup onto the public with an IPO, but unhelpful for pretty much any other reason, especially true understanding of what’s going on.

              • By LordDragonfang 2025-06-10 23:22 (1 reply)

                If that was their point, it would have been more constructive to actually make it.

                To your point, it's only anthropomorphization if you make the anthrocentric assumption that "thinking" refers to something that only humans can do.[1]

                And I don't think it confuses laypeople, when literally telling it to "think" achieves the very similar results as in humans - it produces output that someone provided it out-of-context would easily identify as "thinking out loud", and improves the accuracy of results like how... thinking does.

                The best mental model of RLHF'd LLMs that I've seen is that they are statistical models "simulating"[2] how a human-like character would respond to a given natural-language input. To calculate the statistically "most likely" answer that an intelligent creature would give to a non-trivial question, with any sort of accuracy, you need emergent effects which look an awful lot like a (low fidelity) simulation of intelligence. This includes simulating "thought". (And the distinction between "simulating thinking" and "thinking" is a distinction without a difference given enough accuracy)

                I'm curious as to what "capabilities" you think the layperson is misled about, because if anything they tend to exceed layperson understanding IME. And I'm curious what mental model you have of LLMs that provides more "true understanding" of how a statistical model can generate answers that appear nowhere in its training.

                [1] It also begs the question of whether there exists a clear and narrow definition of what "thinking" is that everyone can agree on. I suspect if you ask five philosophers you'll get six different answers, as the saying goes.

                [2] https://www.astralcodexten.com/p/janus-simulators

                • By zer00eyz 2025-06-11 02:52

                  > It also begs the question of whether there exists a clear and narrow definition of what "thinking" is that everyone can agree on. I suspect if you ask five philosophers you'll get six different answers, as the saying goes.

                  And yet we added a hand-wavy 7th to humanize a piece of technology.

            • By andrepd 2025-06-11 10:12 (1 reply)

              It's a misleading "term of art" which is more accurately described as a "term of marketing". Reasoning is precisely what LLMs don't do and it's precisely why they are unsuited to many tasks they are peddled for.

              • By LordDragonfang 2025-06-11 16:51 (1 reply)

                How are you defining "reasoning" such that you are confident that LLMs are definitely not doing it? What evidence do you have to that effect? (And are you certain that none of your reasoning applies to humans as well?)

                • By cluckindan 2025-06-11 18:18 (1 reply)

                  They don’t ”think”.

                  https://arxiv.org/abs/2503.09211

                  They don’t ”reason”.

                  https://ml-site.cdn-apple.com/papers/the-illusion-of-thinkin...

                  They don’t even always output their internal state accurately.

                  https://arxiv.org/abs/2505.05410

                  • By LordDragonfang 2025-06-11 22:44 (1 reply)

                    > https://arxiv.org/abs/2503.09211

                    I am thoroughly unimpressed by this paper. It sets up a vague strawman definition of "thinking" that I'm not aware of anyone using (and makes no claim it applies to humans) and then knocks down the strawman.

                    It also leans way too heavily on determinism - For one thing, we have no way of knowing if human brains are deterministic (until we solve whether reality itself is). For another, I doubt you would suddenly reverse your position if we created a LoRA composed of atmospheric noise, so it does not support your real position.

                    > https://ml-site.cdn-apple.com/papers/the-illusion-of-thinkin...

                    This one is more substantial, but:

                    "While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood. [...] Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. [...] We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles."

                    Starts by saying "we actually don't understand them" (meaning we don't know well enough to give a yes or no) and then proceeds to list flaws that, as I keep saying, also can be applied to most (if not all) humans' ability to reason. Human reasoning also collapses in accuracy above a certain complexities, and certainly are observed to fail to use explicit algorithms, as well as reasoning inconsistently across puzzles.

                    So unless your definition of anthropomorphization excludes most humans, this is far from a slam dunk.

                    > They don’t even always output their internal state accurately.

                    I have some really bad news about humans for you. I believe (Buddha et al, 500 BCE) is the foundational text on this, but there's been some more recent research (Hume, 1739), (Kierkegaard, 1849)

                    • By cluckindan 2025-06-12 05:28 (1 reply)

                      Whodathunkit, some people are so infatuated with their simulacra that they choose to go tooth and nail in defense of the simulation.

                      My point was congruent with the argument that LLMs are not humans or possess human-like thinking and reasoning, and you have conveniently demonstrated that.

                      • By LordDragonfang 2025-06-12 18:30

                        > My point was congruent with the argument that LLMs are not humans or possess human-like thinking and reasoning, and you have conveniently demonstrated that.

                        I mean, they are obviously not humans, that is trivially true, yes.

                        I don't know what I said makes you believe I demonstrated that they do not possess human-like thinking and reasoning, though, considering I've mostly pointed out ways they seem similar to humans. Can you articulate your point there?

            • By MindTheAbstract 2025-06-11 06:34

              I know this is the terminology, but I'd argue that the activations are the actual thinking. It's probably too late to change that, but I wish people would refer to thinking as the work Anthropic and Deepmind are doing with their mech interp

          • By boredhedgehog 2025-06-11 07:19 (2 replies)

            These kinds of comments are the equivalent of going to dog owners' forums, analyzing word choices in every post and warning the dog owners about the dangers of anthropomorphizing their pets, an effort as accurate as it is boorish and ineffectual.

            • By cluckindan 2025-06-11 20:55

              Dogs will not be influencing decisions concerning other people quite as widely.

            • By cluckindan 2025-06-11 08:26

              [flagged]

          • By robmccoll 2025-06-10 22:11 (2 replies)

            What are we doing when we think?

            • By cluckindan 2025-06-11 06:01 (1 reply)

              Human neurons are not reducible to arithmetic artificial neurons in a statistical model. Do not conflate them.

              • By jeffhuys 2025-06-11 06:41 (1 reply)

                Why not, actually?

                • By cluckindan 2025-06-11 06:52 (2 replies)

                  Because we do not have a complete understanding of human neurons. How are we supposed to accurately model something we cannot directly observe?

                  • By TheDong 2025-06-11 07:05 (2 replies)

                    Do you also complain when someone says "Half-life 2 has great water-physics" with "Don't call it physics, we still don't understand all the physical laws of the universe, and also they use limited-precision floating-point, so it's not water-physics, it's just a bunch of math"?

                    Like, we've agreed that "water-physics" and "cloth physics" in 3d graphics refers to a mathematical approximation of something we don't actually understand at the subatomic level (are there strings down there? Who knows).

                    Can "thinking" in AI not refer to this intentionally false imitation that has a similar observable outward effect?

                    Like, we're okay saying minecraft's water has "water physics", why are we not okay saying "in the AI context, thinking is a term that externally looks a bit like a human thinking, even though at a deeper layer it's unrelated"?

                    Or is thinking special, is it like "soul" and we must defend the word with our life else we lose our humanity? If I say "that building's been thinking about falling over for 50 years", did I commit a huge faux pas against my humanity?

                    • By autoexec 2025-06-11 10:53

                      > Do you also complain when someone says "Half-life 2 has great water-physics"

                      I would if they said the water in Half-life 2 was great for quenching your thirst or that in the near future everyone will only drink water from Half-life 2 and it will flow from our kitchen taps when it's clear that however good Half-life 2 is at approximating what water looks and acts like it isn't capable of being a beverage and isn't likely to ever become one. Right now there are a lot of people going around saying that what passes for AI these days has the ability to reason and that AGI is right around the corner but that's just as obvious a lie and every bit as unlikely, but the more it gets repeated the more people end up falling for it.

                      It's frustrating because at some point (if it hasn't happened already) you're going to find yourself feeling very thirsty and be shocked to discover that the only thing you have access to is Half-life 2 water, even though it does nothing for you except make you even more thirsty since it looks close enough to remind you of the real thing. All because some idiot either fell for the hype or saved enough money by not supplying you with real water that they don't care how thirsty that leaves you.

                      The more companies force the use of flawed and unreasoning AI to do things that require actual reasoning the worse your life is going to get. The constant misrepresentation of AI and what it's capable of is accelerating that outcome.

                    • By cluckindan 2025-06-11 08:18

                      That’s comparing apples to oranges. Nobody is going to be making a real cruise ship based on game water physics simulations.

                      In such a task, better water simulations are used. We have those, because we can directly observe the behavior of water under different conditions. It’s okay because the people doing it are explicitly aware that they are using simulation.

                      AI will get used in real decisions affecting other people, and the people doing those decisions will be influenced by the terminology we choose to use.

                  • By inimino 2025-06-11 09:37 (1 reply)

                    Just because you don't know how does not mean that we can't.

            • By otabdeveloper4 2025-06-11 05:02 (1 reply)

              We don't know yet. But we do know it's certainly not statistical token prediction.

              (People can do statistical token prediction too, but that's called "bullshitting", not "thinking". Thinking is a much wider class of activity.)

              • By LordDragonfang 2025-06-11 17:04 (1 reply)

                Do we know that with certainty? Do we actually?

                Because my understanding is that how "thinking" works is actually still a total mystery. How is it we know for certain that the basis for the analog electric-potential-based computing done by neurons is not based on statistical prediction?

                Do we have actual evidence of that, or are you just doing "statistical token prediction" yourself?

                • By cluckindan 2025-06-12 06:31 (1 reply)

                  You’re reversing the burden of proof in a similar manner as religious people often do. Absence of evidence is not evidence of absence, and so on.

                  • By LordDragonfang 2025-06-12 18:23

                    I'm not reversing it lol. You're the one making a claim, the burden of evidence is on you.

                    Absence of evidence is not evidence of absence, but it is still absence of evidence. Making a claim without any is more religious than not. After all, we know humans can't be descended from monkeys!

    • By lxe 2025-06-10 15:15 (1 reply)

      Thanks for all you do!

    • By trebligdivad 2025-06-11 01:40 (1 reply)

      Nice! I'm running on CPU only, so it's interesting to compare - the Magistral-Small-2506_Q8_0.gguf runs at under 2 tokens/s on my 16 core, but your UD-IQ2_XXS gets about 5.5 tokens/s which is fast enough to be useful - but it does hallucinate a bit more and loop a little; but still actually pretty good for something so small.

      • By danielhanchen 2025-06-11 09:01

        Oh nice! I normally suggest maybe Q4_K_XL to be on the safe side :)

    • By cpldcpu 2025-06-10 15:05 (1 reply)

      But this is just the SFT - "distilled" model, not the one optimized with RL, right?

      • By danielhanchen 2025-06-10 15:12

        Oh I think it's SFT + RL as mentioned in the paper - they said combining both is actually more performant than just RL

  • By pu_pe 2025-06-10 14:41 (15 replies)

    Benchmarks suggest this model loses to Deepseek-R1 in every one-shot comparison. Considering they were likely not even pitting it against the newer R1 version (no mention of that in the article) and at more than double the cost, this looks like the best AI company in the EU is struggling to keep up with the state-of-the-art.

    • By hmottestad 2025-06-10 18:43 (2 replies)

      With how amazing the first R1 model was and how little compute they needed to create it, I'm really wondering how the new R1 model isn't beating o3 and 2.5 Pro on every single benchmark.

      Magistral Small is only 24B and scores 70.7% on AIME2024 while the 32B distill of R1 scores 72.6%. And with majority voting @64 the Magistral Small manages 83.3%, which is better than the full R1. Since I can run a 24B model on a regular gaming GPU it's a lot more accessible than the full blown R1.

      https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-...

      • By reissbaker 2025-06-11 01:53 (2 replies)

        It's not better than full R1; Mistral is using misleading benchmarks. The latest version of R1, R1-0528, is much better: 91.4% on AIME2024 pass@1. Mistral uses the original R1 release from January in their comparisons, presumably because it makes their numbers look more competitive.

        That being said, it's still very impressive for a 24B.

        > I'm really wondering how the new R1 model isn't beating o3 and 2.5 Pro on every single benchmark.

        Sidenote, but I'm pretty sure DeepSeek is focused on V4, and after that will train an R2 on top. The V3-0324 and R1-0528 releases weren't retrained from scratch, they just continued training from the previous V3/R1 checkpoints. They're nice bumps, but V4/R2 will be more significant.

        Of course, OpenAI, Google, and Anthropic will have released new models by then too...

        • By redman25 2025-06-11 09:06 (1 reply)

          It may not have been intentionally misleading. Some benchmarks can take a lot of horsepower and time to run. Their preparation for release likely was done well in advance of the model release before the new deepseek r1 model had even been available to test.

          • By reissbaker 2025-06-11 12:18

            AIME24, etc are pretty cheap to run using any DeepSeek API. Regardless, they didn't even run the benchmarks for R1 themselves, they just republished DeepSeek's published numbers from January. They could have published the ones from May, but chose not to.

        • By hmottestad 2025-06-14 00:47

          Mistral isn’t using misleading benchmarks. I linked to DeepSeek’s own benchmark results that DeepSeek created. I couldn’t find anything newer.

          Can you link me to the benchmark you found?

      • By adventured 2025-06-10 20:13 (4 replies)

        It's because DeepSeek was a fast copy. That was the easy part and it's why they didn't have to use so much compute to get near the top. Going well beyond o3 or 2.5 Pro is drastically more expensive than fast copy. China's cultural approach to building substantial things produces this sort of outcome regularly, you see the same approach in automobiles, planes, Internet services, industrial machinery, military, et al. Innovation is very expensive and time consuming, fast copy is more often very inexpensive and rapid. 85% good enough is often good enough, that additional 10-15% is comically expensive and difficult as you climb.

        • By orbital-decay 2025-06-10 23:35 (1 reply)

          This terrible and vague stereotyping about "China" while having no clue about the subject should have no place on HN but somehow always creeps in and is upvoted by someone. DeepSeek is not "China", they had nobody to copy from, they released their first 7B reasoning model back in April 2024, it was ahead of then-SotA models in math and validated their approach. They did a ton of new things besides training a reasoning model, and likely have more to come, as they have a completely different background than most AI companies. It's more of a cross-pollination of different areas of expertise.

          • By SoMomentary 2025-06-11 10:54 (2 replies)

            I thought it had been bandied about that Deepseek had exfiltrated a bunch of data from OpenAI's models, which was then used to train theirs? Did this ultimately prove untrue? My apologies, I don't always keep up on the latest drama in the AI circles - so maybe that has been well proven wrong.

            • By orbital-decay 2025-06-11 16:40

              Sam Altman threw a fit and claimed this, without providing evidence. He's... not exactly a person to trust blindly. Training on other model outputs (or at least doing sanity checks against them) is pretty common, but these models seem very different, DS has prior art, and by all signs this claim makes little sense and is hard to believe.

            • By glomgril 2025-06-11 17:00

              one man's exfiltration is another man's distillation `¯\_(ツ)_/¯`

              you could say they're playing by a different set of rules, but distilling from the best available model is the current meta across the industry. only they know what fraction of their post-training data is generated from openai models, but personally i'd bet my ass it's greater than zero because they are clearly competent and in their position it would have been dumb to not do this.

              however you want to frame it, they have pushed the field forward -- especially in the realm of open-weight models.

        • By natrys 2025-06-10 21:21 (1 reply)

          Not disagreeing with the overarching point but:

          > That was the easy part

          Is a bit hand-wavy in that it doesn't explain why it's only DeepSeek who can do this "easy" thing, but still not Meta, Mistral or anyone else really. There are many other players who have way more compute than DeepSeek (even inside China, not even considering rest of the world), and I can assure you more or less everyone trains on synthetic data/distillation from whatever bigger model they can access.

          • By refulgentis 2025-06-10 23:31, 1 reply

            They all have. I don't hope to convince you of that; everyone's use case differs. Generally, AIME / prose / code benchmarks that don't involve successive tool calls are used to hide some very dark realities.

            IMHO tool calling is by far the most clearly economically valuable function for an LLM, and r1 self-admittedly just...couldn't do it.

            There's a lot of puff out there that's just completely misaligned with reality, ex. Gemini 2.5 Pro is by far the worst tool caller, Gemini 2.5 Flash thinking is better, 2.5 Flash is even better. And either Llama 4 variant beats all the Gemini 2.5s except 2.5 Flash non-thinking.

            I'm all for "these differences will net out in the long run", Google's at least figured out how to micro optimize for Aider edit formatting without tools. Over the last 3 months, they're up 10% on edit performance. But it's horrible UX to have these specially formatted code blocks in the middle of prose. They desperately need to clean up their absurd tool-calling system. But I've been saying that for a year now. And they don't take it seriously, at all. One of their most visible leads tweeted "hey what are the best edit formats?" and a day later is tweeting the official guide for doing edits. I'm a Xoogler and that absolutely reeks of BigCo dysfunction - someone realized a problem 2 months after release and now we have "fixed" it without training, and now that's the right way to do things. Because if it isn't, well, what would we do? Shrugs

            I'm also unsure how much longer it's worth giving a pass on this stuff. Everyone is competing on agentic stuff because that's the golden goose, real automation, and that needs tools. It would be utterly unsurprising to me for Google to keep missing a pain signal on this, vis a vis Anthropic, which doubled down on it mid-2024.

            As long as I'm dumping info, BFCL is not a good proxy for this quality. Think "converts prose to JSON" not "file reading and editing"

            • By natrys 2025-06-11 0:35, 1 reply

              I don't mind the info dump, but I am struggling to connect the relevance of this to the topic at hand. I mean, focusing on a single specific capability and generalising it to mean "they all have" caught up with DeepSeek all across the board (which was the original topic) is a reductive and wild take. Especially when it seems to me that this is more because of misaligned incentives than because it's truly a hard problem.

              I am not really invested in this niche topic but I will observe that, yes, I agree Llama 4 is really good here. And yet it's a far worse coder, far less intelligent than DeepSeek, and that's not even arguable. So no, it didn't "catch up" any more than what you could say by pointing out Llama is multimodal but DeepSeek isn't. That's just talking about different things entirely.

              Regardless, I do agree BFCL is not the best measure either; Tau-bench is more real-world relevant. But at the end of the day, most frontier labs are not incentive-aligned to care about this. Meta cares because this is something Zuck personally cares about; Llama models are actually for small businesses solving grunt automation, not for random people coding at home. People like Salesforce care (xLAM); even China had GLM before DeepSeek was a thing. DeepSeek might care so long as it looks good for coding benchmarks, but that's pretty much the extent of it.

              And I suspect Google doesn't truly care because in the long run they want to build everything themselves. They already have a CodeAssist product around coding which likely uses fine-tune of their mainline Gemini models to do something even more specific to their plugin.

              There is a possibility that at the frontier, models are struggling to get better in a specific and constrained way without getting worse at other things. It's either this, or even Anthropic has gone rogue, because their Aider scores are way down now from before. How does that make sense if they are supposed to be all-around better at agentic stuff in a tool-agnostic way? Then you realise they now have Claude Code, and it just makes way more economic sense to tie yourself to that, be context-inefficient to your heart's content so that you can burn tokens instead of being, you know, just generally better.

              • By refulgentis 2025-06-11 1:00, 2 replies

                > I am struggling to connect the relevance of this

                > focusing on a single specific capability and

                > I am not really invested in this niche topic

                Right: I definitely ceded a "but it doesn't matter to me!" argument in my comment.

                I sense a little "doth protest too much" in the multiple paragraphs devoted to taking that and extending it until the underpinning of automation is "irrelevant", "single", "specific", "niche".

                This would also be news to DeepSeek, who put a lot of work into launching it in the r1 update a couple weeks back.

                Separately, I assure you, it would be news to anyone on the Gemini team that they don't care because they want to own everything. I passed this along via DM and got "I wish :)" in return - there's been a fire drill trying to improve it via AIDER in the short term, is my understanding.

                If we ignore that, and posit there is an upper-management conspiracy to suppress performance that's just getting public cover from a lower-upper-management rush to improve scores... I guess that's possible.

                Finally, one of my favorite quotes is "when faced with a contradiction, first check your premises" - to your Q about why no one can compete with DeepSeek R1 25-01, I'd humbly suggest you may be undergeneralizing, given even tool calls are "irrelevant" and "niche" to you.

                • By Vetch 2025-06-11 2:19

                  I think the point remains that few have been able to catch up to OpenAI. For a while it was just Anthropic. Then Google after failing a bunch of times. So, if we relax this to LLMs not by OpenAI, Anthropic or Google, then Deepseek is really the only one that's managed to reach their quality tier (even though many others have thrown their hat into the ring). We can also get approximate glimpses into which models people use by looking at OpenRouter, sorted by Top Weekly.

                  In the top 10 are models by OpenAI (gpt4omini), Google (Gemini Flashes and Pros), Anthropic (Sonnets) and DeepSeek. Even though the company list grows shorter if we instead look at top model usage grouped by order of magnitude, it retains the same companies.

                  Personally, the models meeting my quality bar are: gpt 4.1, o4-mini, o3, gemini 2.5 pro, gemini 2.5 flash (not 2.0), claude sonnet, deepseek and deepseek r1 (both versions). Claude Sonnet 3.5 was the first time I found LLMs to be useful for programming work. This is not to say there are no good models by others (such as Alibaba, Meta, Mistral, Cohere, THUDM, LG, perhaps Microsoft), particularly in compute-constrained scenarios, just that only Deepseek reaches the quality tier of the big 3.

                • By natrys 2025-06-11 1:48, 1 reply

                  Interesting presumption about R1 25-01 being what's talked about, you knowledge cut-off does appear to know R1 update two weeks back was a thing, and that it even improved on function calling.

                  Of course you have to pretend I meant the former, otherwise "they all have" doesn't entirely make sense. Not that it made total sense before either, but if I say your definition of "they" is laughably narrow, I suspect you will go back to your google contact and confirm that nothing else really exists outside it.

                  Oh and do a ctrl-f on "irrelevant" please, perhaps some fact grounding is in order. There was an interesting conversation to be had about underpinning of automation somehow without intelligence (Llama 4) but who has time for that if we can have hallucination go hand in hand with forced agendas (free disclaimer to boot) and projection ("doth protest too much")? Truly unforeseeable.

                  • By refulgentis 2025-06-11 4:28, 1 reply

                    I don't know what you're talking about, partially because of poor grammar ("you knowledge cut-off does appear") and "presumption" (this was front and center on their API page at r1 release, and it's in the r1 update notes). I sort of stopped reading after that because I realized you might be referring to me having a "knowledge cut-off", which is bizarre and also hard to understand, and it's unlikely to be a particularly interesting conversation given that and the last volley relied on lots of stuff about tool calling being, inter alia, niche.

                    • By natrys 2025-06-11 11:07

                      > you might be referring to me having a "knowledge cut-off"

                      Don't forget I also referred to you having "hallucination". In retrospect, likening your logical consistency to an LLM was premature, because not even gpt-3.5 era models could pull off a gem like:

                      > You: to your Q about why no one can compete with DeepSeek R1 25-01 blah blah blah

                      >> Me: ...why would you presume I was talking about 25-01 when 28-05 exists and you even seem to know it?

                      >>> You: this was front and center on their API page!

                      Riveting stuff. Few more digs about poor grammar and how many times you stopped reading, and you might even sell the misdirection.

        • By MaxPock 2025-06-10 21:09

          I understand that the French are very innovative, so why isn't their model SOTA?

    • By melicerte 2025-06-10 15:07, 2 replies

      If you look at Mistral investors[0], you will quickly understand that Mistral is far from being European. My understanding is it is mainly owned by US companies with a few other companies from EU and other places in the world.

      [0] https://tracxn.com/d/companies/mistral-ai/__SLZq7rzxLYqqA97j... (edited for typo)

      • By pdabbadabba 2025-06-10 16:04, 1 reply

        For the purposes of GP's comment, I think the nationalities of the people actually running the company and doing the work are more relevant than who has invested.

        • By derektank 2025-06-10 16:11, 6 replies

          And, perhaps most relevantly, the regulatory environment the people are working in. French people working in America are probably more productive than French people working in France (if for no other reason than that they probably work more hours in America than in France).

          • By 8n4vidtmkvmk 2025-06-10 16:18, 5 replies

            Are we sure more time butt in office equates to more productivity?

            • By FabHK 2025-06-11 3:58

              > Are we sure more time butt in office equates to more productivity?

              Typically more output, but less productivity (= output/time).

            • By 1propionyl 2025-06-10 16:59, 2 replies

              Yes. Specifically when it comes to open-ended research or development, colocation is non-negotiable. There are greater-than-linear benefits in creativity of approach, agility in adapting to new intermediate discoveries, etc. that you get by putting a number of talented people who get along into the same space, where they form a community of practice.

              Remote work, and flattening communication down to what digital media (Slack, Zoom, etc.) afford, strangles the beneficial network effects.

              • By throwaway0123_5 2025-06-10 17:51, 1 reply

                I think they were talking about total time spent working rather than remote vs. in-person. I've seen more than a few studies over the years showing that going from 40 to 35 or 30 hours/wk has minimal or positive impacts on productivity. Idk if that would apply to all work environments though, and I don't recall any of the studies being about research productivity specifically.

                • By hdjrudni 2025-06-11 1:53

                  > I think they were talking about total time spent working rather than remote vs. in-person.

                  I was, yes. I should have omitted the "in office" part but I was referencing the "work more hours in America than France"

              • By distortionfield 2025-06-10 18:43, 1 reply

                You’re being downvoted but you’re right. The number of people who act like a web cam reproduces the in person experience perfectly, for good and bad, is hilarious to me.

                • By alienbaby 2025-06-10 20:45, 1 reply

                  I think the mistake people make is believing that one approach is best for all. Different people work most effectively in different ways.

                  • By jama211 2025-06-14 19:13

                    Well said. If you make me commute to an office I’m far far far less productive, simple as that.

            • By meta_ai_x 2025-06-10 16:51

              Yes, especially in cutting-edge research areas where other high-functioning people with high energy are also there.

              You can write your in-house CRUD app in your basement or your office and it doesn't matter.

              The vast majority of the HN crowd and general social/mainstream media don't distinguish between these two scenarios.

            • By adventured 2025-06-10 20:18, 4 replies

              $89,000 GDP per capita vs $46,000 rather proves the point about productivity per butt. US office workers are extraordinarily productive in terms of what their work generates (thanks to numerous well understood things like the outsized US scaling abilities). Measuring beyond that is very difficult due to the variance of every business.

              • By ath92 2025-06-11 3:29, 2 replies

                Weird take. Norway has about the same gdp per capita as the USA with stricter regulations than France. Ireland’s GDP per capita is higher than that of the USA, with less bureaucracy than France but more than the US. Not to mention that all of these are before adjusting for PPP. Almost as if GDP per capita is not a good measurement of productivity.

                • By FabHK 2025-06-11 4:03

                  Many wrinkles here.

                  First, one should probably look at GNP (or even GNI) rather than GDP to reduce the distortionary impact of foreign direct investment, company headquarters for tax reasons, etc.

                  Next, need to distinguish between market rate and PPP, as you highlight.

                  Lastly, these are all measures of output (per capita), while productivity is output per input, in this context output per hour worked. There the differences are less pronounced.

                • By HPsquared 2025-06-11 15:39

                  Monaco is the most productive country in the world in nominal GDP per capita. A very industrious place, it seems!

              • By cataphract 2025-06-10 21:35

                A part of that figure is an artifact of how strong the dollar is though.

              • By palata 2025-06-10 22:33

                > $89,000 GDP per capita vs $46,000 rather proves the point about productivity per butt.

                So if I work 24h/day in a farm in Afghanistan, I should earn more than software developers in the Silicon Valley (because I'm pretty sure that they sleep)? Is that how you say GDP works?

              • By 77pt77 2025-06-11 23:02

                Yes, and Louisiana has a GDP per capita on par with or higher than France's and is a shithole compared to the worst areas of Europe, let alone France.

                But I wouldn't expect someone like you to know, understand or even acknowledge it.

            • By numpad0 2025-06-10 19:08, 1 reply

              I think maybe we should completely switch to admitting this. Every extra second you sit in the (home)office adds to productivity; it just doesn't necessarily convert into market value, which can be inflated by hype. Also, longer hours are not necessarily safe or sustainable.

              We only wish more time != more productivity because it would be inconvenient in multiple ways if it were true. We imagine a multiplier in there to balance the equation, a factor that can completely negate production, using mere anecdotal experiences as proof.

              Maybe that's not scientific; maybe time spent very closely matches productivity, and maybe production as well as productivity need external, artificial regulation.

              • By mschild 2025-06-10 20:52, 1 reply

                > Every extra second you sit in the (home)office adds to productivity

                I'm not sure I believe that. I think at some point the additional hours worked will decrease the output per unit of time, until you reach a peak after which every extra hour worked leads to an overall productivity loss.

                It's also something that I think is extremely hard to consistently measure, especially for your typical office worker.

          • By chairmansteve 2025-06-10 18:24

            Spoken like a guy who's never been to France.

            Classic drive by internet trope.

            Maybe try a little harder, have an informed opinion about something.

          • By whiplash451 2025-06-10 16:58, 1 reply

            > they probably work more hours in America than France

            Not sure that's even true. Mistral is known to be a really hard-working place

            • By gwervc 2025-06-10 17:20, 3 replies

              I'm pretty sure there are far fewer regulations in the US compared to France, where going over the legal 35h/week requires additional capital and legal paperwork.

              • By Saline9515 2025-06-10 20:37

                In France most white collar jobs are categorized as "management" ("cadre"), and they have no time limit. It is very common for workers to clock 12h days in consultancies (10am-10pm) and in state administrations, for instance.

              • By algoghostf 2025-06-10 18:47, 2 replies

                This is not true. Government workers or factory workers can limit themselves to 35h (with some loss of salary or days off), but other than that (especially in tech) it is very competitive, and working 50+ hours/week is not exceptional.

                • By kgwgk 2025-06-10 19:23, 1 reply

                  > 50+ hours/week is not exceptional.

                  https://www.legifrance.gouv.fr/codes/article_lc/LEGIARTI0000...

                  Over the course of any single week, the maximum weekly working time is forty-eight hours.

                  https://www.legifrance.gouv.fr/codes/article_lc/LEGIARTI0000...

                  The weekly working time, calculated over any period of twelve consecutive weeks, may not exceed forty-four hours, except in the cases provided for in Articles L. 3121-23 to L. 3121-25.

                  • By Saline9515 2025-06-10 20:40, 1 reply

                    Everyone is "forfait cadre", which allow them to work with no practical time limit since they don't log their time spent at work. https://www.service-public.fr/particuliers/vosdroits/F19261

                    • By kgwgk 2025-06-10 22:36, 1 reply

                      It seems that 20% of employees in the private sector are "cadres" and half of them are on "forfait jours". That makes around 10% of the private sector employees working 218 days per year without the 48/44 weekly hour limits. It's more than I thought but I doubt that many of them work more than 10 hours per day. Whether that's "exceptional" or not is a matter of definition, of course.

                      • By psychoslave 2025-06-11 6:51, 1 reply

                        What do you mean by working more than 10h/day for intellectual work? You don't stop thinking the moment you are away from the production machine. And the exact opposite can often happen: you step away from the computer/board/paper/office, take a walk trying to let your mind wander as far as you can steer your consciousness, and then the solutions/ideas land in your mind.

                        • By kgwgk 2025-06-11 9:16, 1 reply

                          You’re not wrong, but what did the commenter above mean by “50 hours+/week”? Weeks have three times as many hours. Years also have many more than 218 days.

                          Anyway I found an official survey saying that 40% of them work more than 50 hours per week (but fewer weeks than regular employees) so I guess it’s not so rare (around one private sector employee in twenty).

                          • By Xmd5a 2025-06-11 19:09

                            Used to work 70h/week on average, like every week of the year. I don't think I ever worked less than 50h in a week

                • By greenavocado 2025-06-10 19:42, 3 replies

                  In the USA most software engineers are FLSA-exempt ("computer employee" exemption).

                  No overtime pay regardless of hours worked.

                  No legal maximum hours per day/week.

                  No mandatory rest periods/breaks (federally).

                  The US approach places the burden on the individual employee to negotiate protections or prove misclassification, while French law places the burden on the employer to comply with strict, state-enforced standards.

                  The French Labor Code (Code du travail) applies to virtually all employees in France, regardless of sector (private tech company, government agency, non-profit, etc.), unless explicitly exempted. Software engineering is not an exempted profession. Maximum hour limits are absolute. The caps of 44 hours per week, 48 hours average over 12 weeks, and 10/12 hours per day are legal maximums for almost all employees. Tech companies cannot simply ignore them. The requirements for employee consent, strict annual limits (usually max 220 hours/year), premium pay (+25%/+50%), and compensatory rest apply to software engineers just like any other employee.

                  "Cadre" Status is not an exemption. Many software engineers are classified as Cadres (managers/professionals) but this status does not automatically exempt them from working time rules.

                  Cadre au forfait jours (Days-Based Framework): This is common for senior engineers/managers. They are exempt from tracking daily/weekly hours but are still limited to a maximum of 218 worked days per year (the remaining days off include weekends, holidays, and RTT days). Their annual workload must not endanger their health. 80-hour weeks would obliterate this rest requirement and pose severe health risks, making them illegal. Employers must monitor their workload and health.

                  Cadre au forfait heures (Hours-Based Framework) or Non-Cadre: These employees are fully subject to the standard daily/weekly/hourly limits and overtime rules. 80+ hours/week is blatantly illegal.

                  The tech industry, especially gaming/startups, sometimes tries to import unsustainable "crunch" cultures. This is illegal in France.

                  EDIT: Fixed work days

                  • By algoghostf 2025-06-11 7:50, 1 reply

                    I think there is theory and there is real life. As a tech worker, in a 20-year career in the private sector, I have always been on forfait jours, working more than 10h/day on average, for many years weekends included. I never got paid extra hours. So I get what you say about the perception and the law. The French law is protective (i.e. if I can prove that in court, I'll get my extra hours paid for sure), but my career would end. Period.

                    • By Xmd5a 2025-06-11 19:16

                      >I'll get my extra hours paid for sure but my career would end.

                      Are you working in an area that is that specific? I'm French but I'm naive.

                  • By Saline9515 2025-06-10 20:44

                    Some State services, such as the "Trésor", which oversees French economic policies, do not respect this at all, and require 12h work days most of the year. The churn is enormous, workers staying there less than a year on average.

                  • By kgwgk 2025-06-10 20:00

                    > 218 rest days per year (including weekends, holidays, and RTT days)

                    Wouldn’t that be nice, 218 rest days? It’s 218 working days.

              • By retinaros 2025-06-10 17:55, 1 reply

                No one works 35 hours in software jobs in France, except maybe in government. Overtime is also not compensated (they give some days off, that's it).

                • By psalaun 2025-06-10 19:33, 1 reply

                  Even in government; I've worked 50+ hour weeks for the healthcare branch of the welfare state, with a classic 39h/w contract. No compensation of any sort, despite having timesheets.

                  There are a lot of myths about French workers. Our lifetime worked hours are not exceptional; our productivity is also not exceptional.

                  • By greenavocado 2025-06-10 19:45, 2 replies

                    Pointless suffering. Report violations to the CSE, Médecin du Travail, and Inspection du Travail.

                    • By psalaun 2025-06-10 20:13

                      It was a choice, I loved my job there. I had more exciting projects than most of my friends in the private sector!

                    • By Saline9515 2025-06-10 20:46

                      Excellent way to get blacklisted and never work for the State again if you're a contractor, or end up in a low impact, boring job if you're a career worker.

          • By epolanski 2025-06-10 19:17, 1 reply

            [flagged]

            • By FirmwareBurner 2025-06-11 21:47, 1 reply

              >You think that European founders and researchers are like "nah, you know what, we're European, we're not ambitious, we don't want to make money, to hell with equity"?

              That's the copium HN tells itself: European workers bust their asses for glory, not for money.

              • By tomhow 2025-06-11 22:39, 2 replies

                We're getting complaints about several of your recent comments, and this is a prime example of the kind of comment that is not right for HN. It takes a swipe at the whole HN community (on the false pretence that the HN audience is concentrated via country/region or mindset), and makes a moral judgement based on region/culture.

                We've asked you several times to stop commenting in this inflammatory style on HN. We don't want to ban you, as we want HN to be open to a broad range of views and discussion styles, but if you keep commenting in ways that break the guidelines and draw valid complaints from other community members, a ban will be the next step we'll have to take.

                If you want HN to be a good place to engage in interesting discussions, please do your part to make it better not worse.

                • By FirmwareBurner 2025-06-12 7:26, 1 reply

                  Where do I complain about unfair and double standard moderation practices that are left unmoderated and even praised and upvoted?

                  See this guy: https://news.ycombinator.com/item?id=44254864

                  And there are countless like him who get away with it. You'll then argue that there are no resources to moderate everything on HN, which, while true, doesn't change how sus it is that there always seem to be enough resources to moderate conservative viewpoints but rarely attacks from liberals that break the same rules, which is a blatant double standard that HN moderation is ignoring.

                  You talk the talk about HN being, to quote you, "open to a broad range of views and discussion styles", but what you actually support is suppression of free speech and a one-sided view of things that can only exist in a biased, heavily moderated echo chamber, not in the free marketplace of ideas you claim to support.

                  • By tomhow 2025-06-12 7:42, 1 reply

                    That comment was posted barely a half hour ago and nobody has flagged it yet. What does it have to do with "double standard moderation practices that are left unmoderated and even praised?"

                    We can't act on things that the community doesn't tell us about. Almost always, when people point to comments that are egregious but still live as evidence that the moderators approve of them, the reality is that we didn't see them. And a major reason for that is that political flamewar is now such a big part of the activity on HN that our small team can't ever see all the comments that are flagged.

                    But please don't try to use other people's transgressions as an excuse for your own. That's an age-old trick that doesn't work well here.

                    If you are sincere about being a positive contributor to this community, you can easily show that by making an effort to observe the guidelines. You could also make good-faith efforts to hold other community members to high standards by flagging comments, and if you see anything that's particularly egregious, emailing us.

                    Edit: you added to your comment after I submitted mine, so I'll add a further response.

                    We don't care about what side you're arguing for. Often we don't know; we don't have time to figure out what each commentator in a flamewar is on about. The topic of bias has been hurled at HN for as long as it's existed. Dan has an ever-growing list of the complaints we get from each side characterising us as being biased towards the other side [1].

                    We have guidelines for a reason, which is that if people fill their comments with inflammatory rhetoric, the emotional energy that triggers is what dominates people's perception of the discussion, rather than the substance of the points people are trying to get across.

                    If you have points to make that have substance, and I know that you do, you need to find a way to get them across without being inflammatory, otherwise it's a waste of everyone's time.

                    [1] https://news.ycombinator.com/item?id=26148870

                    • By FirmwareBurner 2025-06-12 7:45, 2 replies

                      > nobody has flagged it yet [...] We can't act on things that the community doesn't tell us about.

                      Why do you think that is? Is it not a reflection of the userbase bias? Where comments get flagged not based on rules but based on which political side they are targeting?

                      > You could also make good-faith efforts to hold other community members to high standards by flagging comments

                      Doesn't help when others vouch for them to support their ideology.

                      > We don't care about what side you're arguing for.

                      You don't, but your userbase does. And your moderation is based on what your userbase flags. So your moderation 100% reflects the bias of the community, hence the biased enforcement of your rules.

                      Answer me this: why is my comment here flagged?

                      https://news.ycombinator.com/item?id=44240839

                      • By tomhow 2025-06-12 7:53, 1 reply

                        > Why do you think that is? Is it not a reflection of the userbase bias? Where comments get flagged not based on rules but based on which political side they are targeting?

                        That comment was a breach of the guidelines but it almost always takes longer than half an hour for a comment to be flagged, and for us to see it, especially on a thread that's over a day old that barely anybody is looking at anymore.

                        You could flag it yourself. The fact that you didn't makes it seem more like you're trying to prove a point about bias rather than doing your part to support the health of the community.

                        > Doesn't help when others vouch for them to support their ideology.

                        People who abuse vouching privileges can have those privileges revoked – if we know about it. Again, when you see this, email us.

                        The site has all kinds of mechanisms and norms to prevent abuse and dysfunction, but they can only work if people are sincere about making the site better rather than being at war with it.

                        Edit: Adding this in reply to your addition:

                        > Answer me: why is my comment here flagged?

                        > https://news.ycombinator.com/item?id=44240839

                        These parts break the guidelines against assuming bad faith and fulminating:

                        Bad faith argument.

                        I've mostly seen change for the sake of change, wrapped in fluffy artsy BS jargon, making it sound like each UI change is the second coming of Christ and fixes world hunger.

                        They're not especially egregious but when you develop a reputation for breaking the guidelines, your comments are going to attract more flags, and also trigger more complaints made privately to us via email.

                        We received emails complaining about your comments, including this one, from people who have a good track record of supporting the community and not being politically partisan.

                        When we receive these kinds of complaints, nobody is complaining about your politics, just about your inflammatory style and guidelines breaches.

                          • By FirmwareBurner 2025-06-12 10:32 | 2 replies

                          >The fact that you didn't

                          Mate, I don't have time to flag all the comments I find inflammatory, especially when I flagged many comments in the past and nothing happened to them, so what's the point? I flagged this one afterwards and the comment was still there. So why are you throwing the blame on me? Why didn't you remove that comment after I pointed it out?

                          > Again, when you see this, email us.

                          Mate, please, be serious: most normal people, myself included, have better things to do with our time than going full Karen "I want to talk to the manager" mode and emailing HN mods about other people's comments. Downvotes and flags are enough for me.

                          The fact that there are people here who have the time to send you emails about my comments that don't break the rules, just because they're butthurt, says something about those users (unemployed, terminally online, mentally unstable on SSRIs, social and political activists, etc.). Normal, employed people with healthy social lives don't send mod emails about comments they don't like on the internet. WTH?

                          >These parts break the guidelines against assuming bad faith

                          Then by that yardstick, isn't the comment I was replying to also in bad faith, just like I pointed out initially? He was using Android 2 to justify that Android 16 is the superior UI. And I replied that that's in bad faith, since the alternative to Android 16's shitty UI is not going back to Android 2 to make Android 16 look good; version 10 is a good counter for why version 16 is bad. How is my comment in bad faith and that one not?

                          >fulminating

                          Why was it fulminating? Was it any more so than the rest of the comments on HN? That was just criticism of Android's UI evolution. Since when is criticism of something, with arguments, considered "fulminating"? Please explain, I'm genuinely curious. Because otherwise this so-called rule break of mine feels blatantly discriminatory, a double standard.

                          >you're going to get more flags, and complaints made about you

                          Bro, you're straight up admitting to biased moderation here. That the community doing the flagging cares more about WHO is saying something rather than WHAT is being said. How can you talk about free speech and fair moderation with a straight face in this case?

                          Please, answer me these questions.

                            • By tomhow 2025-06-12 10:51 | 1 reply

                            When each reply gets longer and longer it's a sign there's less and less chance of finding common ground. I'll try to make this one brief:

                            - It takes two minutes to write an email pointing out an egregious comment or bad actor.

                            - People are flagging and complaining about your comments because they are breaking the guidelines, nothing more, nothing less. Maybe you're not aware of it due to a cultural disconnect. If that's the case, I'm sorry you're in that position but I encourage you to take the feedback and work with us to come into alignment with the community. But it's not about your politics, it's all to do with your inflammatory style of commenting.

                            - I admitted no such bias; I said that your comments have a pattern of breaking the guidelines, including the ones people are complaining about, and when your comments consistently break the guidelines you will inevitably get less patience from everyone than if you lapse occasionally.

                            Please stop this war. We're not trying to oppress you. We want your point of view to be fairly represented, but that can only happen if you make an effort to play by the same rules that everyone else is expected to follow.

                              • By FirmwareBurner 2025-06-12 12:36 | 1 reply

                              I wrote a long comment before just to explain my thought process to you in detail, so you'd know that I'm here for good-faith debate, not to break rules. But OK, I'll make it short for you now: you still haven't explained why my comment on the Android topic is in bad faith but the one I replied to isn't, when I explained to you in detail why it is, by the same yardstick you used to judge my comment.

                              You keep saying it's not you who decides what's right and wrong, that it's the users who decide, based on the rules, by flagging. Then why am I wrong with my assertion that it's the rule of the mob that decides what is right and what is wrong, and not the rules? There's obviously no impartial judge here, just the angry mob, which is anything but impartial and unbiased; just like in elections, people don't vote based on facts and logic, but based on feelings and tribalism over a person (see US election results).

                              So if people want to flag-bomb a certain user because of beef and not facts, they will. And instead of giving me an unbiased explanation of why the comment I was replying to was not breaking the rules like you say mine was, you avoid the topic and parrot some boy-scout speech on the honesty and integrity of the userbase when I show you the hypocrisy of that and ask for an exact explanation. If you don't want to explain to me why that comment wasn't breaking the rules but somehow mine was, that's fine; just don't have the audacity to piss on me and tell me it's raining.

                                • By tomhow 2025-06-12 14:14

                                If you're asking why this comment [1] wasn't flagged: no users flagged it, because it doesn't break the guidelines. It's not inflammatory. It doesn't set off a flamewar. It doesn't fulminate. It just raises a question, for people to respond to. It got some upvotes and some downvotes and some replies, most of which were fine. Yours was the only one that was inflammatory and broke the guidelines. Not egregiously, but enough, given your recent patterns.

                                > You keep saying it's not you who decides what's right and wrong, that it's the users who decide based on the rules by flagging

                                I think this is a misunderstanding. We moderators don't (can't) make judgements about accuracy or truthfulness of comments. All we can do is determine if a comment breaks the guidelines. Comments should only be flagged by users if they break the guidelines. Our enforcement of the guidelines is independent of the accuracy of the comment's content or its ideology.

                                If a comment of yours is flagged for any reason other than guidelines breaches, you're within your rights to protest. But given your conduct even in this subthread with me, in which your comments continue to be full of guidelines breaches, it seems you're not able to gauge whether your comments are within the guidelines or not.

                                It's going to keep being a problem if you're not able to correct that.

                                [1] https://news.ycombinator.com/item?id=44240808

                          • By throw10920 2025-06-14 16:34

                            Your comment has a lot of misunderstandings, but here's the biggest one:

                            > Then by that yardstick, isn't the comment I was replying to also in bad faith, just like I pointed out initially?

                            No, it isn't. First, it's completely invalid to say "by that yardstick" - you're comparing completely different things that have no bearing on each other. Second, no, there's zero evidence that the comment you replied to was in bad faith.

                            The definition of a bad-faith argument is one that is inauthentic, one the argument-maker doesn't actually believe themselves. Factually, there's no evidence to support your accusation that that comment by @butlike was in bad faith: they didn't make any self-contradictory statements in their comment, they didn't post a single other comment in that whole thread, and they didn't say anything that would indicate they were acting anything but genuinely.

                            And, factually, you were breaking the guidelines by assuming bad faith about https://news.ycombinator.com/item?id=44240808.

                            When challenged on it in https://news.ycombinator.com/item?id=44244578, you said:

                            > I assumed good faith, but then I used critical thinking and decided it's in bad faith then explained why. You don't need to agree with me on this.

                            This has three falsehoods in it. First, you did not assume good faith - you assumed bad faith, because there was zero evidence to support the idea that it was in bad faith. Second, you didn't use critical thinking - again, because there was no evidence to support that belief. Third, you did not explain why the comment was in bad faith - you explained why you disagreed with it, indicating that you don't understand the difference between disagreeing with someone's statements, and them being in bad faith (which is further reinforced in the above when you say "And I replied that's in bad faith since the alternative to Android 16 shitty UI is not going back to Android 2 to make Android 16 look good" - no, that's literally not what "bad faith" means).

                            Finally, more generally, beyond the falsehoods and fallacies that you've been making, you're also acting extremely abrasively, in ways that break the guidelines and that antagonize other users.

                            The theme of HN is intellectual curiosity. The way that you've been acting is the exact opposite of that.

          • By vasco 2025-06-10 16:20 | 2 replies

            Most measures of productivity have "hours worked" in the denominator so that can't be right.

            • By underdeserver 2025-06-10 16:27 | 2 replies

              If I work 1000 hours and you work 2000 hours in the same timeframe, but you outcompeted me and created 3x value, you are 1.5 times more productive.

              There's a numerator too.
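              The arithmetic in this example can be made concrete with a few lines of Python (the "value unit" here is hypothetical; only the ratio matters):

```python
# Productivity = value created (numerator) / hours worked (denominator).
def productivity(value_units: float, hours: float) -> float:
    return value_units / hours

me = productivity(value_units=1.0, hours=1000)   # 0.001 value units per hour
you = productivity(value_units=3.0, hours=2000)  # 0.0015 value units per hour

# Twice the hours, but triple the output: 1.5x the productivity.
ratio = you / me
print(f"{ratio:.2f}x more productive")  # → 1.50x more productive
```

              So more hours worked does not preclude higher productivity, as long as output grows faster than hours.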

              • By vasco 2025-06-10 19:18 | 1 reply

                How does the same exact person get more productive? Did you forget the example I replied to? The only thing that changed was hours worked. In your example you change it to fewer hours worked with more output. You made it circular.

                • By underdeserver 2025-06-10 21:01

                  You can be more productive just because you're faster.

                  Magistral is amazingly impressive compared to ChatGPT 3.5. If it had come out two years ago we'd be saying Mistral is the clear leader. But it came out now.

                  Not saying they worked fewer hours, just that speed matters, and in some cases, up to a limit, working more hours gets your work done faster.

          • By retinaros 2025-06-10 17:54 | 1 reply

            Most French people in engineering jobs in France work late, even though overtime is never paid.

            • By Disposal8433 2025-06-10 18:24 | 2 replies

              In the USA they have the famous 9-to-5. Most developer jobs in France are "9 to 6 with 2 hours to eat in the middle and unpaid overtime," so I would say both countries are equivalent.

              • By pdabbadabba 2025-06-11 22:01

                I'm not here to debate which country works harder. Among other things, I'm not at all convinced that it's good for a society for people to be so devoted to their jobs.

                But it's worth pointing out that the U.S.'s famous 9-to-5 is completely inapplicable to any sort of high-demand job. For many people in a demanding profession like tech, a 9-to-5 job would be an absolute (and often unattainable) dream. Where I live (Washington, D.C.) people who want a 9-to-5 will generally leave industry altogether and work for the government. (And even there, a true 9-to-5 can be elusive.)

              • By psalaun 2025-06-10 19:36

                In Parisian startups it's more like 9 to 7 with 30-minute lunch breaks.

      • By kergonath 2025-06-10 22:49

        It’s a French company, subject to French laws and European regulations. That’s what matters, from a user point of view.

    • By epolanski 2025-06-10 19:13 | 2 replies

      Jm2c but I feel conflicted about this arms race.

      You can be 6-12 months behind and not have burned tens of billions compared to the best in class; I see that as an engineering win.

      I absolutely understand those who say "yeah, but customers will only use the best", I see it, but is market share of forever money-losing businesses that valuable?

      • By louiskottmann 2025-06-10 19:51 | 3 replies

        Indeed, and with the technology plateauing, being 6-12 months late with less debt is just long-term thinking.

        Also, Europe being in the race is a big deal for consumers.

        • By sisve 2025-06-10 21:24

          Being the best European AI company is also a multi-billion-dollar business. It's not like China or the US respects the GDPR. A lot of companies will choose the best European company.

        • By ACCount36 2025-06-10 20:30 | 4 replies

          >with the technology plateau-ing

          People have been claiming that since 2022. Where's the plateau?

          • By asadotzler 2025-06-10 22:59 | 2 replies

            The pre-training plateau is real. Nearly all the improvements since then have been around fine tuning and reinforcement learning, which can only get you so far. Without continued scaling in the base models, the hope of AGI is dead. You cannot reach AGI without making the pre-training model itself a whole lot better, with more or better data, both of which are in short supply.

            • By MindTheAbstract 2025-06-11 6:40

              While I tend to agree, I wonder if synthetic data might reach a new high with concepts like Google's AlphaEvolve. It doesn't cover everything, but at least for verifiable problems, I could see it producing more valuable training data. It's a little unclear to me where AGI will come from (LLMs? EBMs, per @LeCun? Something completely different?).

            • By ethbr1 2025-06-10 23:32

              > with more or better data, both of which are in short supply

              Hmmm. It's almost as if a company without a user data stream like OpenAI would be driven to release an end-user device for the sole purpose of capturing more training data...

          • By louiskottmann 2025-06-11 12:00

            There are frequent discussions about how Sonnet 3.5 is in the same ballpark as, or even outperforms, Sonnet 3.7 and 4.0, for example.

          • By psychoslave 2025-06-11 7:26

            Could it be that, at least for the lowest-hanging fruit, the most amazing things one can hope to obtain from scraping the whole web and throwing it at training compute have already been achieved? Maybe AGI simply cannot be obtained without some relevant additional probes sent into the wild to feed its learning loops?

          • By dismalaf 2025-06-11 8:53

            If you can't see it you're blind.

            LLMs haven't improved much. What's improved is the chat apps: switching between language models, vision, image and video generation, and being able to search the internet is what has made them seem 100x more useful.

            Run a single LLM without any tools... They're still pretty dumb.

        • By adventured 2025-06-10 20:07 | 2 replies

          Why would the debt matter when you have $60 billion in ad revenue and are generating $20 billion in op income? That's OpenAI 5-7 years from now, if they're able to maintain their position with consumers. Once they attach an ad product their margins will rapidly soar due to the comparatively low cost of the ad segment.

          The technology is closer to a decade from seeing a plateau for the large general models. GPT o3 is significantly beyond o1 (much less 3.5 which was just Nov 2022). Claude 4 is significantly beyond 3.5. They're not subtle improvements. And most likely there will be a splintering of specialization that will see huge leaps outside the large general models. The radical leap in coding capabilities over the past 12-18 months is just an early example of how that will work, and it will affect every segment of human endeavour.

          • By aDyslecticCrow 2025-06-10 20:26

            > Once they attach an ad product their margins will rapidly soar due to the comparatively low cost of the ad segment.

            They're burning through compute and capital. No amount of advertising could cover the cost of training or even running these models. The massive subscription costs we've started seeing are just a small glimpse of the money they are burning through.

            They will NOT make a profit using the current methods unless the models become at least 10 times more efficient than they are now. At which point Europe can adopt the innovation without much cost.

            It's an arms race to see who can burn the most money the fastest, while selling the result for as little as possible. When they need to start making money, it will all come crashing down.

          • By epolanski 2025-06-11 16:22

            You're describing Google Gemini on any Android phone, that's today, sans the ads.

      • By adventured 2025-06-10 19:46 | 3 replies

        A similar sentiment existed for a long time about Uber and now they're very profitable and own their market. It was worth the burn to capture the market. Who says OpenAI can't roll over to profitable at a stable scale? Conquer the market, hike the price to $29.95 (family account, no ads; $19.95 individual account with ads; etc etc). To say nothing of how they can branch out in terms of being the interaction point that replaces the search box. The advertising value of owning the land that OpenAI is taking is well over $100 billion in annual revenue. Amazon's retail business is terrible, their ad business is fantastic. As OpenAI bolts on an ad product their margin potential will skyrocket and the cost side will be modest in comparison.

        Over the coming years it won't be possible to stay a mere 6-12 months behind, as the costs to build and maintain the AI super-infrastructure keep climbing. It'll become a guaranteed implosion scenario. Winning will provide the ongoing immense resources needed to keep pushing up the hill forever. Everybody else, except a few, will fall away. The same outcome took place in search. Anybody spot Lycos, Excite, HotBot, or AltaVista around? It costs an enormous amount of money to try to keep up with Google (Bing, Baidu, Yandex) in search and scale it. This will be an even more brutal example of that, as the costs are even higher to scale.

        The only way Mistral survives is if they're heavily subsidized directly by European states.

        • By otabdeveloper4 2025-06-11 5:17

          > now they're very profitable and own their market.

          No they don't. They failed in every market except a few niche ones.

        • By aDyslecticCrow 2025-06-10 20:35 | 1 reply

          > It was worth the burn to capture the market.

          You cannot compare Uber to the AI market. They are too different. Uber captured the market because having three taxi services is annoying. But people are readily jumping between models using multi-model platforms. And nobody is significantly ahead of the pack. There is nothing that sets anyone apart aside from the rate at which they are burning capital. Any advantage is closed within a year.

          If OpenAI wants to make a profit, it will raise prices and be dropped in a heartbeat for the next-cheapest option. Most software stacks are designed to be model-agnostic, making integration or support a non-factor.

          • By whiplash451 2025-06-10 20:56 | 3 replies

            Three cab apps are a lot less annoying than three LLM apps each holding a piece of your chat history.

            The winner-take-all effect is a lot stronger with chat apps.

            • By snoman 2025-06-10 22:44

              That’s the exact opposite of the way it is right now (at least for me). I don’t like having multiple ride hailing apps but easily have ChatGPT, Claude, Gemini on my phone (and local LLM at home). There is zero effort cost to go from one to the other.

            • By aDyslecticCrow 2025-06-11 12:37

              I interface with AI models using a single website where I can select between models. Code IDEs are doing the same. Companies that facilitate cross-model integration are doing great (Cursor is a famous example). This trend is spreading.

            • By otabdeveloper4 2025-06-11 5:18

              Professional tip: you can save your prompts somewhere else; you don't need "the cloud" for storing them. It's just text.

        • By xmcqdpt2 2025-06-11 10:58

          I think the jury is still out on Uber. They first became profitable in 2023 after 15 years of massive losses. They still burned way more money than they ever made.

    • By jasonthorsness 2025-06-10 14:54

      Even if it isn't as capable, having a model with control over training is probably strategically important for every major region of the world. But it could only fall so far behind before it effectively doesn't work in the eyes of the users.

    • By tootie 2025-06-10 15:28 | 1 reply

      As an occasional user of Mistral, I find their model to give generally excellent results and pretty quickly. I think a lot of teams are now overly focused on winning the benchmarks while producing worse real results.

      • By esafak 2025-06-10 16:25 | 3 replies

        If so we need to fix the benchmarks.

        • By tootie 2025-06-10 20:07 | 1 reply

          I think there's a fundamental limit to benchmarks when it comes to real-world utility. The best option would be more like a user survey.

          • By esafak 2025-06-10 21:24 | 1 reply

            That's Chatbot Arena: https://lmarena.ai/leaderboard

            • By jug 2025-06-10 23:51

              And unfortunately revealed to be largely a vibe check these days with that whole Llama 4 debacle. But why should we be surprised, really, when users have an easier time feeling if the replies sound human and conversational and _appear_ knowledgeable than actually outsmarting them. This Arena worked well in the ChatGPT 3.0 days… But now?

        • By riku_iki 2025-06-10 16:55

          Those who try to fix them are fighting alone against huge corps that try to abuse them.

    • By littlestymaar 2025-06-10 15:12

      > Benchmarks suggest this model loses to Deepseek-R1 in every one-shot comparison.

      That's not particularly surprising though as the Medium variant is likely close to ten times smaller than DeepSeek-R1 (granted it's a dense model and not an MoE, but still).

    • By funnym0nk3y 2025-06-10 14:54 | 1 reply

      Thought so too. I don't know how it could be different, though. They are competing against behemoths like OpenAI or Google but have only 200 people. Even Anthropic has over 1,000 people. DeepSeek has fewer than 200 people, so the comparison seems fair.

      • By rsanek 2025-06-10 15:06 | 1 reply

        Any claim from the DeepSeek folks should be considered with wide margins of error.

        • By humpty-d 2025-06-10 17:30 | 2 replies

          I know we distrust them on account of being nefarious Chinese, but has anything come to light with R1 or the people behind it specifically to justify this?

          • By cdblades 2025-06-11 12:36

            There's no way to know who's funding it (though being at least state-subsidized is highly likely), and we don't really know how much it cost (but in any case it's still less than OpenAI is spending).

            On the other hand I'm aware of no credible accusations of deepseek fudging benchmarks whereas OpenAI has had multiple instances of independent parties not being able to replicate their claimed performances on benchmarks (and not being honest and transparent about their benchmarking).

          • By mwigdahl 2025-06-11 0:11 | 1 reply

            "Deepseek only cost $6 million"?

            • By baq 2025-06-11 6:14

              “* we only tallied the electricity and rent”

    • By wafngar 2025-06-10 21:46

      But they have built a fully "independent" pipeline. DeepSeek and others probably trained on GPT-4, o1, or similar output data.

    • By segmondy 2025-06-10 16:10 | 2 replies

      are you really going to compare a 24B model to a 700B+ model?

      • By a2128 2025-06-10 16:31 | 1 reply

        24B is the size of the Small, open-sourced model. The Medium model is bigger (they don't seem to disclose its size) and still gets beaten by DeepSeek R1.

        • By thot_experiment 2025-06-10 16:40 | 2 replies

          Mistral Large is 123B, so one can probably assume Medium is between 24B and 123B. Also, Mistral 3.1 is by a wide margin my go-to model in real-life situations. Benchmarks absolutely don't tell the whole story, and different models have different use cases.

          • By ohso4 2025-06-10 22:04

          • By Ringz 2025-06-10 18:05 | 1 reply

            Can you please explain what your "real life situations" are?

            • By thot_experiment 2025-06-10 19:42 | 1 reply

              I use it as a personal assistant (tool use integrated into calendar/todo/notes, etc.), often using the multimodal aspect (taking a photo of a todo list, asking it to remind me to buy something from a picture). I also use it as a code completion tool in VS Code, as well as a replacement for most basic Google searches ("how does this syntax work", "what's the torch method for X").

              I use it for almost every interaction I have with AI that isn't asking it to oneshot complex code. I fairly frequently run my prompts against Claude/ChatGPT and Mistral 3.1 and find that for most things they're not meaningfully different.

              I also spend a lot of time playing around with it for storytelling/integration into narrative games.

              • By mandelken 2025-06-10 21:13 | 1 reply

                Cool. What framework or program do you use to orchestrate this?

                • By thot_experiment 2025-06-10 23:13 | 1 reply

                  Me, Mistral, and Claude writing modules on top of a homebrew assistant framework in Node with a web frontend. I started out mostly handwriting the first couple of modules (a todo and a time tracker) and the framework itself, and now the AI is getting pretty good at replicating the patterns I like, especially with some prompt engineering, as long as I don't ask for entire architectures but just prod it along. It's just so easy to make the exact thing you want now. All the heavy lifting is done by Ollama and the Node/browser APIs.

                  The only dependency on the Node side is 'mime', which is just a dict of MIME types; data lives inside Node's new `node:sqlite`, and everything on the front side that isn't vanilla is Alpine. It runs on my main desktop and has filesystem access (which doesn't yet do anything useful, really), but the advantage here is that since I've written (well, at least read) all of the code, I can put a very high level of trust in my interactions.
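                  The storage pattern described here (a handwritten todo module the assistant calls as a tool) might look roughly like this; sketched in Python with the stdlib sqlite3 standing in for node:sqlite, and with hypothetical table and function names:

```python
import sqlite3

# In-memory stand-in for the assistant's todo store; the real framework
# persists to a file and exposes these functions to the model as tools.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE todos (id INTEGER PRIMARY KEY, text TEXT, done INTEGER DEFAULT 0)")

def add_todo(text: str) -> int:
    """Tool called when the user says 'remind me to ...'; returns the row id."""
    cur = db.execute("INSERT INTO todos (text) VALUES (?)", (text,))
    db.commit()
    return cur.lastrowid

def list_open_todos() -> list[str]:
    """Tool called for 'what's on my list?' queries."""
    return [text for (text,) in db.execute("SELECT text FROM todos WHERE done = 0")]

add_todo("buy milk")
add_todo("file taxes")
print(list_open_todos())  # → ['buy milk', 'file taxes']
```

                  The appeal of the approach is the same as in the comment above: a small, fully readable data layer the model can only touch through a couple of narrow functions.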

                  • By Rastonbury 2025-06-11 1:19

                    Did you hook up any search tools?

      • By moffkalast 2025-06-10 17:39

        The most important comparison is to QwQ at 30B, since it's still the best local reasoning model for that size. A comparison that Mistral did not run for some reason, not even with Qwen3.

    • By fiatjaf 2025-06-10 15:25

      This reads like an AI-generated comment. What do you mean by "benchmarks suggest"? The benchmarks are very clear and presented right there in the page.

    • By mrtksn 2025-06-10 14:53 | 5 replies

      Europe isn't going to catch up in tech as long as its market is open to US tech giants. Tech has negligible marginal costs, so you build it once in one place and sell it everywhere; and when the infra and talent are already in the US, EU tech is destined for niche products.

      The UK has a bit of it, France has some, and that's it. The only viable alternatives are countries that have issues with the US, namely China and Russia. China has come up with strong competitors and is on the cutting edge.

      Also, it doesn't have anything to do with regulations. All 50 US states have American regulations, yet it's all happening in one, and some other states happen to host some infrastructure; but that's true for the rest of the world too.

      If the EU/US relationship gets to Trump/Musk level, then EU can have the cutting edge stuff.

      Most influential AI researchers are from Europe (incl. the UK), Israel, and Canada anyway. Ilya Sutskever just the other day gave a speech at his alma mater in Canada, for example. Andrej Karpathy is Slovakian. Lots of Brits, French, Polish, Chinese, Germans, etc. are among the pioneers. A significant portion of the talent is already non-American; they just need a reason to be somewhere other than the US to build it outside the US. The Chinese got their reason, and with the state of affairs in the world I wouldn't be surprised if Europeans get theirs in less than three and a half years.

      • By vikramkr 2025-06-10 15:08 | 5 replies

        If you close off the market to US tech giants, maybe they'll have some amount of market dominance at home, but I doubt that would mean they've "caught up" tech-wise. There would be no incentive to compete. American EV manufacturing is pretty far behind Chinese EV manufacturing; protectionism didn't help make a competitive car, it just protected the home market while slowly ceding international market after international market.

        • By mrtksn 2025-06-10 15:15

          I agree, protectionism is bad most of the time, but it has its place. It is bad when you are ahead; it is useful when you are behind. (You want them to be exposed to the cutting-edge market, but before that you want them to be able to exist in the first place, even if they are not the best at this very moment.)

          China's EV dominance is a result of local governments investing and buying from local businesses.

          It would be the same with Russia & China. They will receive money from their governments, sell to local buyers, and aim to expand to foreign markets.

          As I said, most AI talent is not American but it is concentrated there. Give them a reason to be somewhere else, some will be somewhere else.

        • By foolswisdom 2025-06-1017:42

          The solution to that would be to force companies within the EU market to compete with each other (fair competition laws); it's just that this idea is less popular than letting the first winner in a market ensure they stay dominant (because it serves the interest of those who just got power). Same reason why big tech rules the EU in the first place.

        • By littlestymaar 2025-06-1015:17

          > There would be no incentive to compete.

          Why not? First of all, there would be plenty of incentives for EU companies to compete with one another (and plenty of capital flowing to them, as the European market is big enough); then there would be competition with US actors in the rest of the world. That's exactly how the Asian economic model has been built: Japan, Taiwan and South Korea all used protectionism plus export-based subsidies to create market leaders in all kinds of domains (from car manufacturing to electronics and shipbuilding).

        • By saubeidl 2025-06-1015:113 reply

          As a counterexample, China's tech industry has caught up and in some ways surpassed the US, partially due to being closed off.

          • By mitthrowaway2 2025-06-1016:30

            I think there are a few more important reasons beyond being closed off:

            - Regulatory friendliness (eg. DJI)

            - Non-enforcement of foreign patents (eg. LiFePO4 batteries)

            - Technology transfer through partnerships with domestic firms

            - Government support for industries deemed to be in the national interest

          • By csomar 2025-06-1016:251 reply

            > As a counterexample, China's tech industry has caught up and in some ways surpassed the US, partially due to being closed off.

            How did you come to that conclusion? We don't have access to an alternate universe where the Chinese tech market was open. There is a real possibility that it would have been far ahead had it been open.

            • By yorwba 2025-06-1018:021 reply

              We do have access to records from the before times when the internet was wide open and Facebook, Google and Microsoft were big in China. Well, Microsoft is still big because they're not an internet company and unfazed by censorship, but the exit of Google and Facebook took a lot of pressure off Baidu and the entire Chinese social media ecosystem.

          • By hshdhdhj4444 2025-06-1016:07

            But also due to the U.S. driving away smart people from the U.S. to China.

        • By chairmansteve 2025-06-1017:19

          China is an example of protectionism working. The world is not governed by simple rules.

      • By ascorbic 2025-06-1016:45

        It's mostly about money. DeepMind was founded in the UK, and is still based in London, but there was no way it could get the funding it needed without selling to Google or some other US company. China is one of the few other countries that can afford to fund that kind of thing.

      • By Iulioh 2025-06-1015:443 reply

        The problem is CONSUMER-level tech.

        The EU is doing a lot of enterprise-level shit and it's great.

        The biggest company in Europe sells B2B software (SAP)

        • By mrtksn 2025-06-1015:52

          One swallow does not make a summer; all the major platforms are American, and that's where Europe lags. I agree that Europe does have some great tech, but it's all niche. Europe also has some great consumer tech products, but they are all dependent on American platforms. For example, some of the best games are French, Polish, Bulgarian, Ukrainian etc., but they all depend on Steam or the Apple App Store and have to go by their rules and pay them a significant commission.

        • By PeterStuer 2025-06-1021:21

          SAP sells B2B software, but most of their income is from consultancy and training.

        • By csomar 2025-06-1016:25

          That's a single company and I'd not call that great.

      • By simianwords 2025-06-1017:39

        How can you explain Israel?

      • By iwontberude 2025-06-1015:17

        Which Trump/Musk level? There have been so many.

    • By tensor 2025-06-1016:251 reply

      [flagged]

      • By demosthanos 2025-06-1016:291 reply

        It gets pretty tiresome when people automatically take something as pro-US jingoism without stopping to think for a second about what they're saying. The model they're comparing to is Chinese, not American. How can you possibly construe OP's comment as pro-US?

        EDIT: The comment below got flagged to death, so I'm going to link my evidence up here for visibility: https://news.ycombinator.com/item?id=44238841

        Not only does the top-level comment make no mention of the US, but their comment history is strongly suggestive that they themselves are European, not US-based.

        • By tensor 2025-06-1016:311 reply

          [flagged]

          • By demosthanos 2025-06-1016:41

            Are we looking at the same account? There's not a lot of data there, as is the nature of a new account, but from where I'm sitting I'd guess they're based in Portugal or if not there specifically then at least on mainland Europe.

            Out of 5 comments:

            * This one is saying that Europe's biggest AI company isn't keeping up with China.

            * Two comments are talking about Europe needing to become more independent from American tech companies.

            * A fourth is talking about ElevenLabs's Portuguese support and noting that one of the Portuguese voices has a Spanish accent.

            * The fifth is unrelated to geography or language.

            I'm not sure where in this you get that they have a US-centric perspective. The only comments they have that mention the US at all are taking the very distinctly 2025-European tack of advocating for increased independence from America. Coupled with this one contrasting with Chinese tech that presents a pretty unified picture of someone who has a lot of interest in Europe having its own strong tech independent from the rest of the world.

            The ElevenLabs comment is the most telling for me for pinpointing the geography: in order to know what Portuguese spoken with a Spanish accent sounds like you really have to have spent a significant amount of time in Iberia and most likely speak Portuguese as a native language.

    • By atemerev 2025-06-1014:455 reply

      "EU is leading in regulation", they say.

      I don't know what they are thinking.

      • By cpldcpu 2025-06-1015:071 reply

        Sorry, this is just getting old...

        It's a trite talking point and not the reason why there are so few consumer-AI companies in Europe.

        • By atemerev 2025-06-1015:112 reply

          And what would be the reason? I am genuinely interested. Also, are there viable non-consumer AI companies here? Only Mistral seems to train foundation models, and good for them; however, as of now they are absolutely not SOTA.

          • By baq 2025-06-1015:171 reply

            Money.

            No, really: the EU doesn't have the VCs and the megacorps. People laugh at the EU sponsoring projects, but there is no private money to sponsor them. There are plenty of US companies with sites in the EU though, so you have people working on the problems, but no branding.

            • By SV_BubbleTime 2025-06-1015:465 reply

              Ok, just a quick question… why does Europe not have the actual money/people?

              • By hshdhdhj4444 2025-06-1016:142 reply

                Part of the answer is debt.

                The U.S. has a debt of about $35tn; the entire EU, around $16tn.

                If even 10% of the debt difference had been invested in tech, that would have meant about $2tn more in investment in EU tech.

                • By atemerev 2025-06-1018:05

                  The amount of debt you are allowed to take on and the abundance of money to invest in new projects are in direct proportion to the competitiveness of the jurisdiction, i.e. how business-friendly the environment is.

                  EU is not a business-friendly environment.

                • By bobxmax 2025-06-1017:534 reply

                  Because Europeans don't take smart risks. Because they over-regulate.

                  It's fascinating watching people circle back to this answer.

                  Regulation and taxation reduce incentives. Lower incentives mean lower risk-taking.

                  The fact that this is still a lesson that needs to be debated is absurd.

                  • By camjw 2025-06-1019:391 reply

                    I would love to know what you do for a living and whether you personally have taken any smart risks that have led you to financial success, or whether you just like sniping on HN about school shootings and pretending to be superior.

                  • By baq 2025-06-1018:202 reply

                    Europeans also mostly don't suffer from school shootings and generally don't go bankrupt when they get cancer or just take an ambulance ride to an out-of-network hospital. Regulation is not all bad; besides, the US has more of it than anybody else.

                    • By bobxmax 2025-06-1019:251 reply

                      The vast majority of Americans don't do either of those things either.

                      And given what happened in Austria just a few hours back, not the best time for your comment.

                      • By camjw 2025-06-1019:351 reply

                        There have been 11 mass shootings in the US in the last 7 days so I don't think this disgusting competition is one you're likely to win.

                        • By bobxmax 2025-06-1019:362 reply

                          Nobody is claiming the US has fewer mass shootings. It's just pointless whataboutism in a conversation (economic strategy) that has nothing to do with it.

                          • By baq 2025-06-1020:351 reply

                            Regulation was the point discussed, healthcare and gun controls are two examples where there are massive qualitative and quantitative differences in regulation between EU and USA. E.g. healthcare is a matter of national security in the EU and it's a profit center for pension funds in the USA. Gun controls I'm not too familiar with, I can only see second order effects in the US in the form of an arms race between police and citizens.

                            • By bobxmax 2025-06-1020:39

                              No, ECONOMIC regulation was the point discussed. That has zilch to do with something like gun control.

                          • By camjw 2025-06-1019:471 reply

                            Ah good, I thought you were trying to imply there is an equivalent problem in the EU. Which would seem to be intentionally dense of course.

                    • By TulliusCicero 2025-06-1021:011 reply

                      The mental gymnastics here are incredible. Do you really think the regulations inhibiting tech startup creation are the same ones that protect people when they get cancer or whatever?

                      Yes, the US has a lot of school shootings, but does anyone think loose gun regulations are why the US is strong on tech?

                      • By bobxmax 2025-06-1022:451 reply

                        Any time European economic failings are brought up it's always the same thing. "Well at least no school shootings!"

                          Great, Singapore has fewer school shootings and homeless people than anywhere in Europe by a country mile, and has a soaring economy.

                        • By FabHK 2025-06-114:082 reply

                          Eh, Singapore's efforts to nurture a thriving startup scene are met with middling success at most.

                          • By bobxmax 2025-06-1114:29

                            Agreed, but Singapore has only 5 million people which limits their potential in that regard.

                          • By SV_BubbleTime 2025-06-115:301 reply

                            Are you implying that Singapore is not ultra regulated?

                            They make Europe look like Texas.

                            • By FabHK 2025-06-1114:09

                              No, you're right, Singapore is both highly regulated and successful. I just meant to highlight that the soaring economy doesn't include many high-tech startups.

                  • By cdblades 2025-06-1112:391 reply

                    > Because Europeans don't take smart risks. Because they over regulate.

                    If you look at the state of VC funding in the US and call it anything approximating "smart risks", I don't know that I'd believe you.

                    • By bobxmax 2025-06-1114:28

                      No good VC investment looked like a "smart risk" to normies when it happened.

                  • By stefan_ 2025-06-1020:51

                    That's hardly unique to Europeans. Look at UAV regulations in the US: regulated to death based on nothing, leading to a 5-to-10-year technology gap with China, while recreational pilots crash and burn every other week.

              • By kilpikaarna 2025-06-1016:31

                Most recently, due to ordoliberalism and coat-according-to-cloth morality guiding economic policy rather than money printer go brrr.

                Longer term: cultural and language divisions despite attempts at creating a common market, not running the global reserve currency/military hegemony, social democracies encouraging work-life balance over cutthroat careerism, demographic issues, not getting a boost from being the only consumer economy not to be leveled in WW2, etc.

              • By fmbb 2025-06-1015:54

                Quick questions don’t always have quick answers.

                Moneywise, the US does have the good old Exorbitant Privilege to lean on.

              • By baq 2025-06-1015:502 reply

                edit: the parent has since edited out the flamebait.

                Maybe, or maybe when silicon valley was busy growing exponentially Europe was still picking itself up from the mess of ww2.

                Trying to blame a single reason is futile, naive and childish.

                • By oceanplexian 2025-06-1017:251 reply

                  The US was out-innovating Europe a long time before WW2: we had faster, more extensive rail systems, superior high-rise construction, earlier electrification, the invention of the telephone, modern manufacturing (the Model T), the invention of the airplane, the birth of Hollywood and modern motion pictures; the list goes on.

                  • By msgodel 2025-06-1018:29

                    I think it's funny how the US, Canada, and Scotland/the UK all simultaneously claim to be the home of the telephone.

                • By bobxmax 2025-06-1017:551 reply

                  And what's the excuse for Europe's GDP being equal to the US's in 2007 and now being over $10T less?

                  • By baq 2025-06-1018:15

                    In general, the same. In particular, different.

              • By PeterStuer 2025-06-1021:24

                Unlike the US, the EU does not have reserve currency privilege, so we can't print endless trillions of paper and force the rest of the world to give us their companies and goods in return for it.

          • By whodidntante 2025-06-1023:01

            I can only talk about my personal (US-based) experience. This includes many US-based startups, some VC startups, senior leadership in a large tech company, and a senior executive position in another large tech company. I have also worked with, and built, tech organizations in multiple EU countries, and have been involved in the technical due diligence and acquisition discussions with several EU companies. I admit that my experience is about 6 years old, as I am no longer in the tech industry, and I do not know what has changed during this time.

            Money: There is more money for US startups. Investors (US and EU) want to invest in US based startups, not EU startups. US investors are willing to risk more money and take greater risk. EU startups that gain traction will attract US companies in that they provide a good way to extend their market to the EU, not as much for their innovations. Tech entrepreneurs (US or EU) want to work in the US if they can, because that is where the excitement and risk taking is and where the money can be made.

            Teams: Building and managing EU tech teams is very different than US tech teams. EU teams need a lot more emotional hand holding, and EU engineers are far more salary oriented than equity oriented. It is far more difficult to motivate them to go above and beyond - the "we need to get this fix or feature in tonight so we can deploy in the morning" simply will not get done if it is already 5pm. Firing EU workers is much more difficult. There are a lot more regulations for EU teams, in order to "protect" them, and that results in the teams being more "lifestyle" teams rather than "innovation teams". EU teams get paid a lot less than their US counterparts.

            Failure: Good failure is not a problem in the US, it can actually be a badge of honor. EU is very risk averse, and people avoid failure.

            There are of course exceptions all around, but the weight of these observations and experiences is in favor of US teams.

            This is in no way saying it is better to live in the US, there are a lot of things about the EU that are more attractive than the US, and I would probably have a better lifestyle living in Europe now that I am no longer working. But innovation and money is not one of them.

      • By dmos62 2025-06-1014:552 reply

        It is fairly common to struggle to understand why different cultures think the way they do.

        • By moralestapia 2025-06-1014:571 reply

          Ugh.

          Edit: Parent changed their comment significantly, from something quite unpleasant to what it is now. I'm not deleting my comment as I'm not that kind of person.

          • By dmos62 2025-06-1016:07

            I did. I initially said that Europeans often struggle to understand other cultures too. Which was an immature way to point out that the cultural dissonance works both ways. I realized that I was obfuscating my point and rewrote my comment to be clearer, but now that you gave me a chance to think on it some more, I wish I would have said what I wanted to say more directly still.

            What I wanted to say is: I like EU's regulation and I find it interesting how other people have different world views.

        • By atemerev 2025-06-1015:082 reply

          I live in Europe.

          • By mrtksn 2025-06-1015:092 reply

            Cool, which regulations exactly stopped you from doing cutting edge AI?

            • By meta_ai_x 2025-06-1016:551 reply

              A regulation culture breeds a certain type of risk-taking culture, so you can't blame a specific regulation for the lack of an innovation culture.

              • By mrtksn 2025-06-1017:34

                I'm not sure about that; Europe has plenty of startups. Also, IIRC it has a larger number of small businesses than the US, since in the US huge companies employ huge numbers of people.

                What Europe does not have is scale-ups in tech. The tech consolidated in the US. By tech I mean internet-based companies; remove those and the EU has higher productivity.

            • By kelseyfrog 2025-06-1015:251 reply

              Décret sur la Pause Goûter Universelle (PGU), i.e. the "Universal Snack-Break Decree".

      • By Mistletoe 2025-06-1015:391 reply

        This is why I want to move to the EU. I don’t care if companies aren’t coddled there. I want to live where people are the first priority.

        • By atemerev 2025-06-1018:081 reply

          Well, are you ready to live on a low middle class salary of a European software engineer? It is really low middle class. The middle middle here would be a bank clerk, and upper middle — a lawyer or a surgeon.

          This is not coincidental.

          • By baq 2025-06-1018:26

            Incidentally (also not) surgeons and lawyers are not poor in the states either… it’s just Silicon Valley was the perfect place with just the right people and it kept growing for 60 years straight. Surgery and law do not grow exponentially. (I’ll pretend the pages of regulation aren’t supposed to count.)

      • By micromacrofoot 2025-06-1014:503 reply

        probably some silly thing like "people should have more rights and protections"

        • By bobxmax 2025-06-1017:561 reply

          Rights and protections that have benefited heavily from an economy built on the alliance with the US.

          If it weren't for American help and trade post-WW2, Europe would be a Belarusian backwater and is fast heading back in that direction.

          Countries like Greece, Italy, Spain, Portugal, etc. show the future of Europe as it slowly stagnates and becomes a museum that can't feed its people.

          Even Germany that was once excelling is now collapsing economically.

          The only bright spot on the continent right now is Poland who are, shocker, much less regulatorily strict and have lower corporate taxes.

          • By debugnik 2025-06-1020:551 reply

            > Countries like Greece, Italy, Spain, Portugal

            PIGS, really? Some of the top growing EU economies right now, which have turned their deficit around, show the future of a slowly stagnating Europe?

            • By bobxmax 2025-06-1022:381 reply

              A 200B economy growing 2% is the future of the EU? Yes that is the point I am making.

              • By micromacrofoot 2025-06-1214:321 reply

                How much is an economy supposed to grow?

                • By bobxmax 2025-06-1423:27

                  Bangladesh has twice the GDP and 3x the growth. How long do you expect Portugal to be relevant?

        • By atemerev 2025-06-1015:092 reply

          I've yet to find any rights and protections in these cookie banners.

          • By saubeidl 2025-06-1015:131 reply

            The cookie banners are corps trying to circumvent the rights and protections. If they actually went by the spirit of the protections, the cookie banners wouldn't be needed. Your ire is misdirected.

            • By yeahforsureman 2025-06-1016:163 reply

              Are you sure?

              The ePrivacy Directive requires a (GDPR-level) consent for just placing the cookie, unless it's strictly necessary for the provision of the “service”. The way EU regulators interpret this, even web analytics falls outside the necessity exception and therefore requires consent.

              So as long as the user doesn't, or is not able to, automatically signal consent (or non-consent), e.g. via general browser-level settings, how can you obtain it without trying to get it from the user on a per-site basis somehow? (And no, DNT doesn't help, since it's an opt-out, not an opt-in, mechanism.)

              • By exyi 2025-06-1016:331 reply

                Everyone I know of will try to click "reject all unnecessary cookies", and you don't need the dialog for the necessary ones. You can therefore simply remove the dialog and the tracking, simplifying your code and improving your users' experience. Can tracking the fraction which misclicks even give some useful data?

                • By yeahforsureman 2025-06-114:441 reply

                  My point was that according to the current interpretation, if they rely on cookies, user analytics (even simple visitor stats where no personal data is actually processed) are not considered "necessary" and are therefore not exempt from the cookie consent obligation under the ePrivacy Directive. The reason why personal data processing is irrelevant is that the cookie consent requirement itself is based on the pre-GDPR ePrivacy Directive which requires, as a rule, consent merely for saving cookies on the client device (subject to some exceptions, including the one discussed).

                  So you need a consent for all but the most crucial cookies without which the site/service wouldn't be able to function, like session cookies for managing signed-in state etc.

                  (The reason why you started to see consent banners really only after GDPR came to force is at least in part due to the fact that the ePrivacy Directive refers to the Data Protection Directive (DPD) for the standard of consent, and after DPD was replaced by GDPR, the arguably more stringent GDPR consent standard was applied, making it unfeasible to rely on some concept of implied consent or the like.)

                  • By mhitza 2025-06-119:25

                    User analytics that require cookies, sounds like tracking to me.

                    > like session cookies for managing signed-in state etc.

                    Maybe I'm reading it wrong, but are you saying that consent is required for session cookies? Because that is not the case, at all.

                    > (25) However, such devices, for instance so-called "cookies", can be a legitimate and useful tool, for example, in analysing the effectiveness of website design and advertising, and in verifying the identity of users engaged in on-line transactions. Where such devices, for instance cookies, are intended for a legitimate purpose, such as to facilitate the provision of information society services, their use should be allowed on condition that users are provided with clear and precise information in accordance with Directive 95/46/EC about the purposes of cookies or similar devices so as to ensure that users are made aware of information being placed on the terminal equipment they are using. Users should have the opportunity to refuse to have a cookie or similar device stored on their terminal equipment. This is particularly important where users other than the original user have access to the terminal equipment and thereby to any data containing privacy-sensitive information stored on such equipment. Information and the right to refuse may be offered once for the use of various devices to be installed on the user's terminal equipment during the same connection and also covering any further use that may be made of those devices during subsequent connections. The methods for giving information, offering a right to refuse or requesting consent should be made as user-friendly as possible. Access to specific website content may still be made conditional on the well-informed acceptance of a cookie or similar device, if it is used for a legitimate purpose.

                    https://eur-lex.europa.eu/eli/dir/2002/58/oj/eng

                    You should inform users about any private data you would be storing in a cookie. But this can be a small infobox on your page with no button.

                    When storing other types of information, the "cookie" problem needs to be seen from the perspective of shared devices. You know, the times before, when you might forget to log out at an internet cafe or to clear your cookies containing passwords and other things they shouldn't hold. This is a dated way of looking at the problem (most people have their own computing devices today, their phone), but still applicable (classrooms, and family shared devices).
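                    To make the exemption above concrete: a strictly-necessary session cookie of the kind discussed can be set with no banner at all. A minimal sketch using Python's standard library (the cookie name and value are illustrative, not taken from any regulation or real site):

```python
from http.cookies import SimpleCookie

# A strictly-necessary session cookie: it only holds an opaque
# server-side token, so no personal data is stored client-side.
cookie = SimpleCookie()
cookie["session_id"] = "opaque-server-side-token"
cookie["session_id"]["httponly"] = True   # not readable from JS
cookie["session_id"]["secure"] = True     # sent over HTTPS only
cookie["session_id"]["samesite"] = "Lax"  # limits cross-site sending

# The Set-Cookie header value the server would emit.
header_value = cookie["session_id"].OutputString()
print(header_value)
```

                    The HttpOnly/Secure/SameSite flags are good practice either way; the consent question turns on what the cookie is used for, not on how it is set.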

              • By micromacrofoot 2025-06-1017:061 reply

                there are analytics providers that don't require third-party cookies; it's not hard to switch

                • By yeahforsureman 2025-06-115:04

                  The cookie consent provision under the ePrivacy Directive doesn't care whether they're first- or third-party. Actually, the way it's been worded, you'd arguably need a consent for (strictly non-"necessary") use of eg local storage, too — afaik this hasn't really come up in regulatory practice or case law, but may be more due to regulators' modest technical expertise or priorities.

                  A conceptually different matter altogether is consent (possibly) needed under GDPR for various kinds of personal data processing involving the use of cookies (ie not just the placement of cookies as such) and other technologies for tracking, targeting and the like. That's why you see cookie banners with detailed purposes and eg massive lists of vendors (since they can be considered "recipients" of the user's personal data under GDPR). In this context, a valid consent (and the information you have to provide to obtain it) is required (at least) when consent is the only feasible legal basis of the ones available under Art 6 GDPR for the personal data processing activities in question. This is where the national regulators have taken strict stances especially regarding ad targeting and other activities usually involving cross-site tracking, for example, deeming that the only feasible basis for those activities would be consent (ie "opt-in") — instead of, in particular, "legitimate interests" which would enable opt-out-like mechanisms instead. This is the legal context of looking critically at 3rd-party cookies, but unfortunately, for the reasons mentioned above, getting rid of such cookies might still not be enough to avoid the minimal base cookie consent requirement when you use eg analytics... :(

                  It's pretty ridiculous, I know, and it's a bummer they scrapped the long-planned and -negotiated ePrivacy Regulation which was meant to replace the old ePrivacy Directive and, among other things, update the weird old cookie consent provision.
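                  One way to square that circle is analytics that store nothing on the client at all, so the cookie provision never triggers. A sketch of the rotating-salt approach used by some cookieless analytics services (the function and parameter names here are hypothetical, purely for illustration):

```python
import hashlib
import datetime

def visitor_id(ip: str, user_agent: str, daily_salt: str) -> str:
    """Derive a short-lived visitor identifier server-side, with
    nothing stored on the client. Because the salt rotates daily,
    the id cannot link the same visitor across days."""
    raw = f"{daily_salt}|{ip}|{user_agent}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]

# A real deployment would rotate a random salt every 24h;
# the date stands in for it here purely for illustration.
salt = datetime.date.today().isoformat()
a = visitor_id("203.0.113.7", "Mozilla/5.0", salt)
b = visitor_id("203.0.113.7", "Mozilla/5.0", salt)
assert a == b  # same visitor, same day: same id, no cookie needed
```

                  Whether such a hash still counts as processing personal data under GDPR is a separate debate; the sketch only shows how the cookie-consent requirement itself can be avoided.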

              • By saubeidl 2025-06-117:49

                As you said yourself, analytics are not necessary.

                It's corpos trying to invade our privacy.

          • By micromacrofoot 2025-06-1015:171 reply

            cookie banners are malicious compliance while we head towards the death of cross-site cookies; they are indeed a poor implementation, but the legislation that led to them did not come up with it

            did you really prefer when companies were selling your data to third parties and didn't have to ask you?

            • By sunaookami 2025-06-1018:121 reply

              Do you really think clicking "Reject non-essential cookies" does something?

        • By __alexs 2025-06-1015:093 reply

          EU regulation is often "you can not have the cool thing" not "the cool thing must be operated equitably".

          I think they are more interested in protecting old money than in protecting people.

          • By andruby 2025-06-1015:42

            EU never just states "you can not have the cool thing". Please provide an example if you disagree.

            It is very hard to create policies and legislation that protects consumers, workers and privacy while also giving enough liberties for innovation. These are difficult but important trade-offs.

            I'm glad there is diversity in cultures and values between the US, EU and Asia.

          • By micromacrofoot 2025-06-1015:201 reply

            I think usb-c and third party app stores are pretty cool

            • By umbra07 2025-06-1017:254 reply

              I think the government shouldn't be legislating that companies must use a specific USB connector.

              Realistically the legislation was only targeting Apple. If consumers want USB-C, then they can vote with their wallets and buy an Android, which is a reasonable alternative.

              • By vintermann 2025-06-1111:00

                It used to be the case in Europe that you couldn't use a washing machine made for Sweden in Norway. Everything was different. Every country had its own standards bodies too, which had to certify your products. It was openly for protectionist reasons.

                EU got rid of that. It only makes sense that they don't let private companies start all that crap up again. If states don't get to use artificial technological barriers as protectionism, certainly Apple shouldn't be allowed to either.

              • By msgodel 2025-06-10 18:27 · 1 reply

                They shouldn't be forcing people to use patented Qualcomm technology to access cellular networks either but here we are.

                Realistically Apple's connector adds no value and if they want to sell into markets like the EU they need to cut that kind of thing out.

                • By umbra07 2025-06-10 18:45 · 2 replies

                  > Realistically Apple's connector adds no value

                  Like I said, usb-c is a regression from lightning in multiple ways.

                  * Lightning is easier to plug in.

                  * Lightning is a physically smaller connector.

                  * USB-C is a much more mechanically complex port. Instead of a boss in a slot, you have a boss with a slot plugging into a slot in a boss.

                  There was so much buzz around Apple no longer including a wall wart with its phones, which meant an added cost for the consumer, and potentially an increased environmental impact if enough people were going to, say, order a wall wart online and have it shipped to them. The same logic applies to Apple being forced to switch to USB-C, except that the costs are now multiplied.

                  • By fkyoureadthedoc 2025-06-10 20:08

                    Having owned both lighting and USB-C iPhones/iPads, I prefer the USB-C experience, but neither were that bad.

                    My personal biggest gripe with lightning was that the spring contacts were in the port instead of the cable, and when they wore out you had to replace the phone instead of the cable. The lightning port was not replaceable. In practice I may end up breaking more USB-C ports, we'll see.

                  • By micromacrofoot 2025-06-10 20:07

                    I've worked with thousands of both types of cable at this point

                    > Lightning is easier to plug in.

                    according to you? neither are at all difficult

                    > Lightning is a physically smaller connector.

                    I've had lightning cables physically disassemble in the port, the size also made them somewhat delicate

                    > USB-C is a much more mechanically complex port.

                    "much" is a bit, well, much... they're both incredibly simple mechanically; the exposed contacts made lightning more prone to damage

                    I've had multiple Apple devices fail because of port wear on the device. Haven't encountered this yet with usb-c

                    > The same logic applies to Apple forced to switch to USB, except that the costs are now multiplied.

                    Apple would have updated inevitably, as they did in the past — now at least they're on a standard... the long-term waste reduction is very likely worth the switch (because again, without the standard they'd have likely switched to another proprietary implementation)

              • By flmontpetit 2025-06-10 17:53

                It's hard to see the benefit in letting every hardware manufacturer attempt to carve out their own little artificial interconnect monopoly and flood the market with redundant, wasteful solutions.

              • By micromacrofoot 2025-06-10 17:41 · 1 reply

                We've had multiple USB standards for decades with no end in sight. Apple was targeted because they have the most high-profile proprietary connector and they were generally using it to screw consumers. Good riddance.

                • By umbra07 2025-06-10 17:50 · 2 replies

                  Like I said, if consumers don't want it, then they can buy Android phones instead.

                  > they were generally using it to screw consumers

                  You understand that there were lots of people happy with Lightning? USB-C is a regression in many ways.

                  • By boroboro4 2025-06-10 18:24 · 1 reply

                    I want to have USB-C and I want to have iPhone.

                    I’m very happy EU regulators took this headache off my shoulders and I don’t need to keep multiple chargers at home, and can be almost certain I can find a charger in a restaurant if I need it.

                    Based on the reaction of my friends 90% of people supported this change and were very enthusiastic about it.

                    I have zero interest in being part of vendor game to lock me in.

                    • By umbra07 2025-06-10 18:33 · 1 reply

                      Products are supposed to come with different tradeoffs. I want to have an Android and I want to have my headphone jack back. That doesn't mean that the EU should make that a law.

                      > Based on the reaction of my friends 90% of people supported this change and were very enthusiastic about it.

                      That is an absolutely worthless metric, and you know it.

                      • By Aeolos 2025-06-10 19:29

                        It's about as useful as your complaining.

                        Good riddance for Lightning.

                  • By micromacrofoot 2025-06-10 19:01

                    Why bother arguing the point if you're not going to provide a single example?

          • By saubeidl 2025-06-10 15:12

            Can you name specific examples? Otherwise, this just sounds like inflammatory polemic.

      • By 0xDEAFBEAD 2025-06-10 15:18 · 2 replies

        Honestly the US approach to AI is incredibly irresponsible. As an American, I'm glad that someone somewhere is thinking about regulation. Not sure it will be enough though: https://xcancel.com/ESYudkowsky/status/1922710969785917691#m

        • By msgodel 2025-06-10 18:30 · 2 replies

          There's nothing the regulation could meaningfully hope to accomplish other than slow down people willing to play by the rules.

          • By ambicapter 2025-06-10 21:18 · 1 reply

            Wow, the "criminals don't follow laws therefore laws are worthless" argument, here? In my HN?

            • By msgodel 2025-06-10 22:51

              Usually it's possible to actually detect crime (in fact, it's usually hard to ignore). That's not the case with AI.

          • By 0xDEAFBEAD 2025-06-12 12:07

            How about regulating big GPU clusters?

        • By MoonGhost 2025-06-10 16:14 · 2 replies

          No, thanks, we don't want to be like the EU. Everything regulated to death. They even thought to criminalize street photography because there could be copyrighted materials in the picture. Not sure, are they still taxing Eiffel Tower images?

          • By int_19h 2025-06-10 17:22 · 1 reply

            The EU is not a monolithic entity, and the amount of regulation varies widely. The Baltics are very business-friendly, for example.

            • By bobxmax 2025-06-10 17:57

              And Estonia has the most impressive tech ecosystem on the continent despite being a Soviet backwater 20 years ago. Shocking how that works.

          • By johnisgood 2025-06-10 16:24

            I thought it was happening in the US, too. I mean, the government is there to regulate the shit out of everything, regardless of where you are.

  • By dwedge 2025-06-10 19:54 · 1 reply

    Their OCR model was really well hyped and coincidentally came out at the time I had a batch of 600-page PDFs to OCR. They were all monospace text, which for some reason the OCR kept missing.

    I tried it: 80% of the "text" was recognised as images and output as whitespace, so most of it was empty. It was much, much worse than tesseract.

    A month later I got the bill for that crap and deleted my account.

    Maybe this is better, but I'm over the hype marketing from Mistral.

    • By notnullorvoid 2025-06-11 17:03

      I wouldn't trust any of these LLM teams to produce a good OCR model. OCR from 10 years ago is better than the crap they put out.

HackerNews