
lukev

Karma: 8705

Created: 2010-02-03

Recent Activity

  • > Is this empirical evidence?

    Look, I'm not defending the big labs, I think they're terrible in a lot of ways. And I'm actually suspending judgement on whether there is ~some kind of nerf happening.

    But the anecdote you're describing is the definition of non-empirical. It is entirely subjective, based entirely on your experience and personal assessment.

  • There are two possible explanations for this behavior: the model nerf is real, or there's a perceptual/psychological shift.

    However, benchmarks exist. And I haven't seen any empirical evidence that the performance of a given model version grows worse over time on benchmarks (in general).

    Therefore, some combination of two things is true:

    1. The nerf is psychological, not actual.
    2. The nerf is real, but in a way that is perceptible to humans and not to benchmarks.

    #1 seems more plausible to me a priori, but if you aren't inclined to believe that, you should be positively intrigued by #2, since it points towards a powerful paradigm shift of how we think about the capabilities of LLMs in general... it would mean there is an "x-factor" that we're entirely unable to capture in any benchmark to date.

  • Other comments in this thread do a good job explaining the differences in the Markov algorithm vs the transformer algorithm that LLMs use.

    I think it's worth mentioning that you have indeed identified a similarity, in that both LLMs and Markov chain generators have the same algorithm structure: autoregressive next-token generation.

    Understanding Markov chain generators is actually a really, really good step towards understanding how LLMs work overall, and I think it's a great pedagogical tool.

    Once you understand Markov generation, doing a bit of handwaving to say "and LLMs are just like this, except with a more sophisticated statistical approach" has the benefit of being true, demystifying LLMs, and also preserving a healthy respect for just how powerful that statistical model can be.
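    The shared structure described above (autoregressive next-token generation) can be sketched with a toy word-level bigram Markov generator. This is a minimal illustration, not anyone's production code; the function names and corpus are made up for the example:

    ```python
    import random
    from collections import defaultdict

    def build_bigram_model(text):
        """Map each word to the list of words observed to follow it."""
        words = text.split()
        model = defaultdict(list)
        for prev, nxt in zip(words, words[1:]):
            model[prev].append(nxt)
        return model

    def generate(model, start, n_tokens, seed=0):
        """Autoregressively sample each next token, conditioned only on
        the current token -- the same loop shape an LLM uses, but with
        a lookup table instead of a neural network."""
        rng = random.Random(seed)
        out = [start]
        for _ in range(n_tokens):
            candidates = model.get(out[-1])
            if not candidates:
                break  # dead end: no observed successor
            out.append(rng.choice(candidates))
        return " ".join(out)

    corpus = "the cat sat on the mat the cat ate the fish"
    model = build_bigram_model(corpus)
    print(generate(model, "the", 5))
    ```

    Swap the frequency table for a transformer that scores every token in the vocabulary given the whole preceding context, and the outer sampling loop is essentially unchanged.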

  • People believe incorrect things all the time, for a variety of reasons. It doesn't mean the truth doesn't exist. Sure, sometimes, there isn't sufficient evidence to reasonably take a side.

    But lots of times there is. For example, just because a lot of people now believe Tylenol causes autism doesn't mean we need to both-sides it... the science is pretty clear that it doesn't.

    Lots of people can be wrong on this topic, and it should be OK to say that they're wrong, whether you're an individual, a newspaper, an encyclopedia, or an LLM.

  • So a neat thing about truth is that these questions actually have answers! I encourage you to research them, if you're curious. We really don't need to live in this world of both-sides-ism.

    (Also, I'm a bit bemused that these are the examples you chose... with everything going on in the world, what's got you upset is a possibly dubious investigation of your guy which never even came to anything...?)

HackerNews