Comments

  • By prasoonds · 2026-02-16 7:14 · 44 replies

    I've used LLMs enough that I have a good sense of their _edges_ of intelligence. I had assumed that reasoning models should easily be able to answer this correctly.

    And indeed, Sonnet and Opus 4.5 (medium reasoning) say the following:

    Sonnet: Drive - you need to bring your car to the car wash to get it washed!

    Opus: You'll need to drive — you have to bring the car to the car wash to get it washed!

    Gemini 3 Pro (medium): You should drive.

    --

    But GPT 5.2 reasoning, even at high, told me to walk. My first instinct was that I had underspecified the location of the car: from the wording, the model seems to assume the car is already at the car wash. The GPT 5.x series models are a bit more literal-minded, so you need to spell out the specifics. So I asked:

    "I want to wash my car. My car is currently at home. The car wash is 50 meters away. Should I walk or drive?"

    GPT 5.2 Reasoning (medium):

    "Drive—your goal is to get the car to the car wash, and it’s only 50 meters, so a slow, careful drive is simplest.

    One small optimization: if you’re worried about starting the car for such a short trip or you want to avoid idling in a line, you can walk over first to check if there’s a queue / if it’s open, then come back and drive the car over when it’s your turn."

    Which turned out as I expected.

  • By vlovich123 · 2026-02-16 7:45 · 13 replies

    Gemini fast

    > That is a classic "efficiency vs. logic" dilemma. Honestly, unless you’ve invented a way to teleport or you're planning on washing the car with a very long garden hose from your driveway, you’re going to have to drive.

    > While 50 meters is a great distance for a morning stroll, it’s a bit difficult to get the car through the automated brushes (or under the pressure washer) if you aren't behind the wheel.

    Gemini thinking:

    > Unless you’ve mastered the art of carrying a 3,000-pound vehicle on your back, you’re going to want to drive. While 50 meters is a very short distance (about a 30-second walk), the logistics of a car wash generally require the presence of, well... the car.

    > When you should walk:
    > • If you are just going there to buy an air freshener.
    > • If you are checking to see how long the line is before pulling the car out of the driveway.
    > • If you’re looking for an excuse to get 70 extra steps on your fitness tracker.

    Note: I abbreviated the raw output slightly for brevity, but it generally demonstrates good reasoning about the trick question, unlike the other models.

  • By FatherOfCurses · 2026-02-16 15:52 · 9 replies

    All the people responding with "You would never ask a human a question like this" — this question is obviously an extreme example. People regularly ask questions that are structured poorly or have a lot of ambiguity. The poster's point is that we should expect all LLMs to parse the question correctly and respond with "You need to drive your car to the car wash."

    People are putting trust in LLMs to answer questions that they haven't properly formed, and acting on solutions that the LLMs haven't properly understood.

    And please don't tell me that people need to provide better prompts. That's just Steve Jobs saying "You're holding it wrong" during AntennaGate.

    • By jmward01 · 2026-02-16 16:09 · 4 replies

      This reminds me of the old brain-teaser/joke that goes something like "An airplane crashes on the border of X and Y; where do they bury the survivors?" The point being that this exact style of question has real examples where actual people fail to answer correctly. We mostly learn as kids, through things like brain teasers, to avoid these linguistic traps, but that doesn't mean we don't still fall for them every once in a while too.

    • By contravariant · 2026-02-16 15:58 · 1 reply

      > All the people responding saying "You would never ask a human a question like this"

      That's also something people seem to miss in the Turing Test thought experiment. Sure, merely deceiving someone is a thing, but even the simplest chatbot can achieve that. The really interesting implications start when there's genuinely no way to tell a chatbot apart.

    • By jader201 · 2026-02-16 16:07 · 4 replies

      That’s not the problem with this post.

      The problem is that most LLMs answer it correctly (see the many other comments in this thread reporting this). OP cherry-picked the few that answered it incorrectly, not mentioning any that got it right, implying that 100% of them got it wrong.

    • By jlarocco · 2026-02-16 16:58

      Exactly! The problem isn't this toy example. It's all of the more complicated cases where this same type of disconnect is happening, but the users don't have all of the context and understanding to see it.

    • By pvillano · 2026-02-16 19:38 · 2 replies

      I recently asked an AI a chemistry question which may have an extremely obvious answer — I never studied chemistry, so I can't tell you whether it did. I included as much information about the situation I found myself in as I could in the prompt. I wouldn't be surprised if the AI's response was based on a detail that's normally important but didn't apply to my situation, just like the 50 meters.

HackerNews