The Waymo World Model

2026-02-06 16:20 · waymo.com

We are excited to introduce the Waymo World Model, a frontier generative model that sets a new bar for large-scale, hyper-realistic autonomous driving simulation.

The Waymo Driver has traveled nearly 200 million fully autonomous miles, becoming a vital part of the urban fabric in major U.S. cities and improving road safety. What riders and local communities don’t see is our Driver navigating billions of miles in virtual worlds, mastering complex scenarios long before it encounters them on public roads.

Simulation of the Waymo Driver evading a vehicle going in the wrong direction. The simulation initially follows a real event, and seamlessly transitions to using camera and lidar images automatically generated by an efficient real-time Waymo World Model.

Simulation is a critical component of Waymo’s AI ecosystem and one of the three key pillars of our approach to demonstrably safe AI. The Waymo World Model, which we detail below, is the component that is responsible for generating hyper-realistic simulated environments.

The Waymo World Model is built upon Genie 3—Google DeepMind's most advanced general-purpose world model that generates photorealistic and interactive 3D environments—and is adapted for the rigors of the driving domain. By leveraging Genie’s immense world knowledge, it can simulate exceedingly rare events—from a tornado to a casual encounter with an elephant—that are almost impossible to capture at scale in reality. The model’s architecture offers high controllability, allowing our engineers to modify simulations with simple language prompts, driving inputs, and scene layouts. Notably, the Waymo World Model generates high-fidelity, multi-sensor outputs that include both camera and lidar data.

This combination of broad world knowledge, fine-grained controllability, and multi-modal realism enhances Waymo’s ability to safely scale our service across more places and new driving environments. In the following sections we showcase the Waymo World Model in action, featuring simulations of the Waymo Driver navigating diverse rare edge-case scenarios.

🌎 Emergent Multimodal World Knowledge

Most simulation models in the autonomous driving industry are trained from scratch based on only the on-road data they collect. That approach means the system only learns from limited experience. Genie 3’s strong world knowledge, gained from its pre-training on an extremely large and diverse set of videos, allows us to explore situations that were never directly observed by our fleet.

Through our specialized post-training, we are transferring that vast world knowledge from 2D video into 3D lidar outputs unique to Waymo’s hardware suite. While cameras excel at depicting visual details, lidar sensors provide valuable complementary signals like precise depth. The Waymo World Model can generate virtually any scene—from regular, day-to-day driving to rare, long-tail scenarios—across multiple sensor modalities.

🌪️ Extreme weather conditions and natural disasters
💥 Rare and safety-critical events
🐘 Long-tail (pun intended!) objects and more

In the interactive viewers below, you can immersively view the realistic 4D point clouds generated by the Waymo World Model.

Interactive 3D visualization of an encounter with an elephant.

Interactive 3D visualizations of a drive through a city street.

🕹️ Strong Simulation Controllability

The Waymo World Model offers strong simulation controllability through three main mechanisms: driving action control, scene layout control, and language control.

Driving action control allows us to have a responsive simulator that adheres to specific driving inputs. This enables us to simulate “what if” counterfactual events such as whether the Waymo Driver could have safely driven more confidently instead of yielding in a particular situation.

Counterfactual driving. We demonstrate simulations both under the original route from a past recorded drive and under a completely new route. While purely reconstructive simulation methods (e.g., 3D Gaussian Splatting, or 3DGS) suffer from visual breakdowns due to missing observations when the simulated route diverges too far from the original drive, the fully learned Waymo World Model maintains realism and consistency thanks to its strong generative capabilities.

Scene layout control allows for customization of road layouts, traffic signal states, and the behavior of other road users. This way, we can create custom scenarios by selectively placing other road users or applying custom mutations to road layouts.

Scene layout conditioning.

Language control is our most flexible tool that allows us to adjust time-of-day, weather conditions, or even generate an entirely synthetic scene (such as the long-tail scenarios shown previously).

World Mutation - Time of Day

World Mutation - Weather
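To make the three control channels concrete, the sketch below models them as optional conditioning signals combined into a single simulation request. This is purely illustrative: Waymo has not published an API for the World Model, and every class name, field, and the (steering, acceleration) action encoding here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch only: all names and fields below are
# illustrative assumptions, not Waymo's actual interface.

@dataclass
class SceneLayout:
    road_graph: list   # e.g., lane centerline polylines
    agents: list       # initial poses of other road users
    signals: dict      # traffic light states

@dataclass
class SimulationRequest:
    actions: Optional[list] = None        # driving action control
    layout: Optional[SceneLayout] = None  # scene layout control
    prompt: Optional[str] = None          # language control

    def conditioning(self) -> dict:
        """Collect whichever of the three control signals were provided."""
        signals = {}
        if self.actions is not None:
            signals["action"] = self.actions
        if self.layout is not None:
            signals["layout"] = self.layout
        if self.prompt is not None:
            signals["language"] = self.prompt
        return signals

# A counterfactual request: replay a drive with new driving inputs,
# while mutating the weather via a language prompt.
req = SimulationRequest(
    actions=[(0.0, 1.2), (0.1, 1.0)],  # (steering, acceleration) per step
    prompt="heavy rain at dusk",
)
assert set(req.conditioning()) == {"action", "language"}
```

Any subset of the channels can be supplied: actions alone give a responsive simulator, a layout alone constructs a custom scenario, and a prompt alone mutates or fully synthesizes the scene.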

🎞️ Converting Dashcam Videos

During a scenic drive, it is common to record videos of the journey on mobile devices or dashcams, perhaps capturing piled up snow banks or a highway at sunset. The Waymo World Model can convert those kinds of videos, or any taken with a regular camera, into a multimodal simulation—showing how the Waymo Driver would see that exact scene. This process enables the highest degree of realism and factuality, since simulations are derived from actual footage.

⚙️ Scalable Inference

Some scenes we want to simulate take longer to play out, for example, negotiating passage in a narrow lane. Longer simulations are harder: compute cost grows with rollout length, and quality becomes harder to keep stable. However, through a more efficient variant of the Waymo World Model, we can simulate longer scenes at dramatically reduced compute cost while maintaining the realism and fidelity needed for large-scale simulation.

🚀  Long rollout (4x speed playback) on an efficient variant of the Waymo World Model
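The cost issue above can be sketched generically. A naive autoregressive world model attends over its full history, so later steps keep getting more expensive; one standard mitigation (a generic technique, not necessarily what Waymo's efficient variant does) is to bound the context window so per-step cost stays constant no matter how long the rollout runs. The `rollout` helper and toy `step_fn` below are hypothetical:

```python
from collections import deque

def rollout(step_fn, first_frame, num_steps, context_len=16):
    """Autoregressive rollout over a bounded context window.

    Only the most recent `context_len` frames are fed back into the
    model, so per-step compute does not grow with rollout length.
    """
    context = deque([first_frame], maxlen=context_len)
    frames = [first_frame]
    for _ in range(num_steps):
        nxt = step_fn(list(context))       # predict next frame from window
        context.append(nxt)                # oldest frame falls out of window
        frames.append(nxt)
    return frames

# Toy stand-in "model": next frame is simply last frame + 1.
frames = rollout(lambda ctx: ctx[-1] + 1, 0, 100, context_len=8)
assert len(frames) == 101 and frames[-1] == 100
```

The trade-off is that events older than the window can drift out of memory, which is why maintaining stable quality over long rollouts is a research problem in its own right.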

By simulating the “impossible”, we proactively prepare the Waymo Driver for some of the rarest and most complex scenarios. This creates a more rigorous safety benchmark, ensuring the Waymo Driver can navigate long-tail challenges long before it encounters them in the real world.

Acknowledgements


The Waymo World Model is enabled by the key research, engineering and evaluation contributions from James Gunn, Kanaad Parvate, Lu Liu, Lucas Deecke, Luca Bergamini, Zehao Zhu, Raajay Viswanathan, Jiahao Wang, Sakshum Kulshrestha, Titas Anciukevičius, Luna Yue Huang, Yury Bychenkov, Yijing Bai, Yichen Shen, Stefanos Nikolaidis, Tiancheng Ge, Shih-Yang Su and Vincent Casser.

We thank Chulong Chen, Mingxing Tan, Tom Walters, Harish Chandran, David Wong, Jieying Chen, Smitha Shyam, Vincent Vanhoucke and Drago Anguelov for their support in defining the vision for this project, and for their strong leadership and guidance throughout.

We would like to additionally thank Jon Pedersen, Michael Dreibelbis, Larry Lansing, Sasho Gabrovski, Alan Kimball, Dave Richardson, Evan Birenbaum, Harrison McKenzie Chapter and Pratyush Chakraborty, Khoa Vo, Todd Hester, Yuliang Zou, Artur Filipowicz, Sophie Wang and Linn Bieske for their invaluable partnership in facilitating and enabling this project.

We thank our partners from Google DeepMind: Jack Parker-Holder, Shlomi Fruchter, Philip Ball, Ruiqi Gao, Songyou Peng, Ben Poole, Fei Xia, Allan Zhou, Sean Kirmani, Christos Kaplanis, Matt McGill, Tim Salimans, Ruben Villegas, Xinchen Yan, Emma Wang, Woohyun Han, Shan Han, Rundi Wu, Shuang Li, Philipp Henzler, Yulia Rubanova, and Thomas Kipf for helpful discussions and for sharing invaluable insights for this project.



Comments

  • By mattlondon 2026-02-06 18:09

    Suddenly all this focus on world models by Deep mind starts to make sense. I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.

    Google/Alphabet are so vertically integrated for AI when you think about it. Compare what they're doing - their own power generation , their own silicon, their own data centers, search Gmail YouTube Gemini workspace wallet, billions and billions of Android and Chromebook users, their ads everywhere, their browser everywhere, waymo, probably buy back Boston dynamics soon enough (they're recently partnered together), fusion research, drugs discovery.... and then look at ChatGPT's chatbot or grok's porn. Pales in comparison.

    • By phkahler 2026-02-06 19:48

      Google has been doing more R&D and internal deployment of AI and less trying to sell it as a product. IMHO that difference in focus makes a huge difference. I used to think their early work on self-driving cars was primarily to support Street View in their maps.

      • By brokencode 2026-02-06 20:58

        There was a point in time when basically every well known AI researcher worked at Google. They have been at the forefront of AI research and investing heavily for longer than anybody.

        It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.

        But they are in full gear now that there is real competition, and it’ll be cool to see what they release over the next few years.

        • By Arainach 2026-02-07 6:47

          >It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.

          Not really. If Google released all of this first instead of companies that have never made a profit and perhaps never will, the case law would simply be the copyright holders suing them for infringement and winning.

          • By zipy124 2026-02-07 23:21

            It's not even that. It's way easier to do R&D when you don't have a customer base to support.

          • By drewstiff 2026-02-08 9:23

            Also think of how LLMs are replacing web searches for most people - Google would have been cannibalising their Search profits for no good reason

        • By wslh 2026-02-07 13:24

          > It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.

          It’s not that crazy. Sometimes the rational move is to wait for a market to fully materialize before going after it. This isn’t a Xerox PARC situation, nor really the innovator’s dilemma, it’s about timing: turning research into profits when market conditions finally make it viable. Even mammoths like Google are limited in their ability to create entirely new markets.

          • By DSingularity 2026-02-07 14:01

            This take makes even more sense when you consider the costs of making a move to create the market. The organizational energy and its necessary loss in focus and resources limits their ability to experiment. Arguably the best strategy for Google: (1) build foundational depth in research and infrastructure that would be impossible for competition to quickly replicate (2) wait for the market to present a clear new opportunity for you (3) capture it decisively by focusing and exploiting every foundational advantage Google was able to build.

        • By hosh 2026-02-06 21:22

          I also think the presence of Sergey Brin has been making a difference in this.

          • By refulgentis 2026-02-06 21:29

            Ex-googler: I doubt it, but am curious for rationale (i know there was a round of PR re: him “coming back to help with AI.” but just between you and me, the word on him internally, over years and multiple projects, was having him around caused chaos b/c he was a tourist flitting between teams, just spitting out ideas, but now you have unclear direction and multiple teams hearing the same “you should” and doing it)

            • By AYBABTME 2026-02-07 3:19

              the rebuke is that lack of chaos makes people feel more orderly and as if things are going better, but it doesn't increase your luck surface area, it just maximizes cozy vibes and self interested comfort.

              • By refulgentis 2026-02-07 3:43

                My dynamic range of professional experience is high, dropout => waiter => found startup => acquirer => Google.

                You're making an interesting point that I somewhat agree with from the perspective of someone who was...clearly a little more feral than his surroundings in Google, and wildly succeeded and ultimately quietly failed because of it.

                The important bit is "great man" theory doesn't solve lack of dynamism. It usually makes things worse. The people you read about in newspapers are pretty much as smart as you, for better or worse.

                I actually disagreed with the Sergey thing along the same lines, it was being used as a parable for why it was okay to do ~nothing in year 3 and continue avoiding what we were supposed to ship in year 1, because only VPs outside my org and the design section in my org would care.

                Not sure if all that rhymes or will make any sense to you at all. But I deeply respect the point you are communicating, and also mean to communicate that there's another just as strong lesson: one person isn't bright enough to pull that off, and the important bit there isn't "oh, he isn't special", it's that it makes you even more careful building organizations that maintain dynamism and creativity.

                • By sdf2erf 2026-02-07 4:47

                  Yeah people seem to be pretty poor at judging the impact of 'key' people.

                  E.g. Steve Jobs was absolutely fundamental to the turn around of Apple. Will Brin have this level of incremental impact on the Goog/Alphabet of today? Nah.

                  • By rstuart4133 2026-02-07 11:50

                    The difference is: Apple had one "key person", Jobs, and yes the products he drove made the company successful. Now Jobs has gone I haven't seen anything new.

                    But if you look at Google, there isn't one key product. There are a whole pile of products that are best in class. Search (cringe, I know it's popular here to say Google search sucks and perhaps it does, but what search engine is far better?), YouTube, Maps, Android, Waymo, GMail, Deep Mind, the cloud infrastructure, translate, lens (OCR) and probably a lot of others I've forgotten. Don't forget Sheets and Docs, which while they have been replicated by Microsoft and others now were first done by Google. Some of them, like Maps, seem to have swapped entire teams - yet continued to be best in class. Predicting Google won't be at the forefront on the next advance seems perilous.

                    Maybe these products have key people as you call them, but the magic in Alphabet doesn't seem to be them. The magic seems to be Alphabet has some way to create / acquire these key people. Or perhaps Alphabet just knows how to create top engineering teams that keep rolling along, even when the team members are replaced.

                    Apple produced one key person, Jobs. Alphabet seems to be a factory creating lots of key people moving products along. But as Google even manages to replace these key people (as they did for Maps) and still keep the product moving, I'm not sure they are the key to Google's success.

                    • By felixg3 2026-02-07 16:51

                      Docs was just an acquisition of Writely, an early „Web 2.0“ document editor service, so „first done by google“ is a bit imprecise

                    • By 1kurac 2026-02-09 13:39

                      > what search engine is far better?

                      Since you ask, this surely has to be altpower.app!

              • By dietr1ch 2026-02-07 4:36

                In Assistant, having higher-ups spitting out ideas and random thoughts ended up with people mistakenly assuming that we really wanted to go do that, meaning that the chaos resulted in ill-fated and cancelled projects.

                The worst part was figuring out what happened way too late. People were trying to go for promo for a project that didn't launch. Many people got angry, some left, the product felt stale, and leadership & management lost trust.

                • By alwa 2026-02-07 8:18

                  Isn’t that what the parent is describing? “Ill and cancelled projects” <==> “luck surface area”, and “trying to go for promotion” <==> “cozy vibes and self-interested comfort”?

            • By IncreasePosts 2026-02-07 3:35

              I'm in a similar position and generally agree with your take, but the plus side to his involvement is if he believed in your project or viewpoint he would act as the ultimate red tape cutter.

              • By refulgentis 2026-02-07 3:56

                And there is absolutely nothing more valuable at G (no snark)

                (cheers, don't read too much signal into my thoughts, it's more negative than I'd intend. Just was aware it was someone going off PR, and doing hero worship that I myself used to do, and was disabused over 7 years there, and would like other people outside to disabuse themselves of. It's a place, not the place)

            • By pstuart 2026-02-06 21:58

              That makes sense. A "secret shopper" might be a better way to avoid that but wouldn't give him the strokes of being the god in the room.

            • By LightBug1 2026-02-06 22:18

              Oh ffs, we have an external investor who behaves like that. Literally set us back a year on pet nonsense projects and ideas.

          • By hungryhobbit 2026-02-06 21:47

            Please, Google was terrible about using the tech they had long before Sundar, back when Brin was in charge.

            Google Reader is a simple example: Google had by far the most popular RSS reader, and they just threw it away. A single intern could have kept the whole thing running, and Google has literal billions, but they couldn't see the value in it.

            I mean, it's not like being able to see what a good portion of America is reading every day could have any value for an AI company, right?

            Google has always been terrible about turning tech into (viable, maintained) products.

            • By vinkelhake 2026-02-06 22:07

              Is there an equivalent to Godwin's law wrt threads about Google and Google Reader?

              See also: any programming thread and Rust.

              • By scarmig 2026-02-06 22:57

                I'm convinced my last groan will be reading a thread about Google paper clipping the world, and someone will be moaning about Google Reader.

                • By tbossanova 2026-02-06 23:46

                  “A more elegant weapon of a civilised age.”

              • By refulgentis 2026-02-07 0:53

                Lol, it seems obvious in retrospect, there really, really, needs to be.

                Therefore we now have “Vinkel’s Law”

              • By wisty 2026-02-07 6:20

                It's far from the only example https://killedbygoogle.com/

            • By burgreblast 2026-02-06 22:28

              I never get the moaning about killing Reader. It was never about popularity or user experience.

              Reader had to be killed because it [was seen as] a suboptimal ad monetization engine. Page views were superior.

              Was Google going to support minimizing ads in any way?

              • By theptip 2026-02-07 18:06

                Right. Reader was not a case of apathy and failure to see the product’s value.

                It was Google clearly seeing the product’s value, and killing it because that value was detrimental to their ads business.

            • By DiggyJohnson 2026-02-06 22:26

              How is this relevant? At best it’s tangentially related and low effort

            • By jamespo 2026-02-06 22:08

              Took a while but I got to the google reader post. Self host tt-rss, it's much better

            • By largbae 2026-02-06 23:29

              Can you not vibe code it back into existence yet?

            • By inquirerGeneral 2026-02-07 8:42

              [dead]

            • By rvnx 2026-02-07 2:40

              If this is true, this is disappointing :/

              On a similar topic, it is worth mentioning the entrepreneurs that are forced into sex (or let’s say, very pushed) by VCs.

              For those who feel safe or taking it as a joke, this affects women AND men.

              Some people are going to be disappointed about their heroes.

              • By sdf2erf 2026-02-07 3:37

                Dont mention the recent Eric Schmidt scandal.

                Barely any of these jokers are clean. Makes MZ look seemingly normal in comparison.

              • By belter 2026-02-07 5:39

                >> If this is true, this is disappointing

                Wait for the second set of files...

                "...One of Mr. Epstein’s former boat captains told The New York Times earlier this year that he had seen Mr. Brin on the island more than once..."

                https://dnyuz.com/2026/01/31/powerful-men-who-turn-up-in-the...

            • By wslh 2026-02-07 17:19

              What's striking is the sheer scale of Epstein's and Maxwell's scheduling and access. The source material makes it hard to even imagine how two people could sustain that many meetings/parties/dinners/victims, across so many places, with such high-profile figures. And, how those figures consistently found the time to meet them.

        • By smallnix 2026-02-06 21:58

          > It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.

          I always thought they deliberately tried to contain the genie in the bottle as long as they could

          • By mullingitover 2026-02-06 23:56

            Their unreleased LaMDA[1] famously caused one of their own engineers to have a public crashout in 2022, before ChatGPT dropped. Pre-ChatGPT they also showed it off in their research blog[2] and showed it doing very ChatGPT-like things and they alluded to 'risks,' but those were primarily around it using naughty language or spreading misinformation.

            I think they were worried that releasing a product like ChatGPT only had downside risks for them, because it might mess up their money printing operation over in advertising by doing slurs and swears. Those sweet summer children: little did they know they could run an operation with a sieg-heiling CEO who uses LLMs to manufacture and distribute CSAM worldwide, and it wouldn't make above-the-fold news.

            [1] https://en.wikipedia.org/wiki/LaMDA#Sentience_claims

            [2] https://research.google/blog/lamda-towards-safe-grounded-and...

            • By tempest_ 2026-02-07 3:43

              The front runner is not always the winner. If they were able to keep pace with OpenAI while letting them take all the hits and missteps, it could pay off.

              Time will tell if LLM training becomes a race to the bottom or the release of the "open source" ones proves to be a spoiler. From the outside looking in, while ChatGPT has brand recognition for the average person who could not tell the difference between any two LLMs, Google offering Gemini on Android phones could perhaps supplant them.

            • By bhadass 2026-02-07 3:32

              I swear the Tay incident caused tech companies to be unnecessarily risk averse with chatbots for years.

            • By xtracto 2026-02-07 0:21

              Attention is all you need was written by Googlers IIRC.

              • By mullingitover 2026-02-07 0:32

                Indeed, none of the current AI boom would’ve happened without Google Brain and their failure to execute on their huge early lead. It’s basically a Xerox Parc do-over with ads instead of printers.

      • By AlfredBarnes 2026-02-06 20:43

        It has always felt to me that the LLM chatbots were a surprise to Google, not LLMs, or machine learning in general.

        • By raphlinus 2026-02-06 21:02

          Not true at all. I interacted with Meena[1] while I was there, and the publication was almost three years before the release of ChatGPT. It was an unsettling experience, felt very science fiction.

          [1]: https://research.google/blog/towards-a-conversational-agent-...

          • By hibikir 2026-02-06 21:48

            The surprise was not that they existed: there were chatbots at Google way before ChatGPT. What surprised them was the demand, despite all the problems the chatbots have. The big problem with LLMs was not that they could do nothing, but how to turn them into products that made good money. Even people at OpenAI were surprised about what happened.

            In many ways, turning tech into products that are useful, good, and don't make life hell is a more interesting issue of our times than the core research itself. We probably want to avoid the value-capturing platform problem, as otherwise we'll end up seeing governments using ham-fisted tools to punish winners in ways that aren't helpful either.

            • By diamondage 2026-02-06 22:25

              The uptake forced the bigger companies to act. With image diffusion models too - no corporate lawyer would let a big company release a product that allowed the customer to create any image...but when stable diffusion et al started to grow like they did...there was a specific price of not acting...and it was high enough to change boardroom decisions

          • By bagels 2026-02-07 0:18

            ChatGPT really innovated on making the chat not say racist things that the press could report on. Other efforts before this failed for that reason.

            • By _alternator_ 2026-02-07 1:15

              Right. The problem was that people under appreciated ‘alignment’ even before the models were big. And as they get bigger and smarter it becomes more of an issue.

          • By nasretdinov 2026-02-06 21:06

            Well, I must say ChatGPT felt much more stable than Meena when I first tried it. But, as you said, it was a few years before ChatGPT was publicly announced :)

        • By olalonde 2026-02-07 3:35

          It was a surprise to OpenAI too. ChatGPT was essentially a demo app to showcase their API, it was not meant to be a mass consumer product. When you think about it, ChatGPT is a pretty awkward product name, but they had to stick with it.

      • By AbstractH24 2026-02-06 21:13

        Google and OpenAI are both taking very big gambles with AI, with an eye towards 2036 not 2026. As are many others, but them in particular.

        It'll be interesting to see which pays off and which becomes Quibi

        • By majormajor 2026-02-07 1:36

          Quibi would be if someone came in 10 years from now and said "if we put a lot more money behind spitting out content using characters and settings from Hollywood IP than we'll obviously be way more popular than a tech company can be!"

          • By sincerely 2026-02-09 0:47

            Quibi also got extremely unlucky in spending a bunch of money to develop media for people to watch on their commutes right before covid lockdowns hit. Wouldn't be surprised if some other company tries to make video for that market again and does well (maybe working with tiktok/shorts native creators)

      • By aiauthoritydev 2026-02-07 7:23

        Use your own sh*t is one of the best way to build excellent products.

    • By mooktakim 2026-02-06 18:24

      Tesla built something like this for FSD training; they presented it many years ago. I never understood why they didn't productize it. It would have made a brilliant Maps alternative, which could automatically update from Tesla cars on the road. Could live-update with speed cameras and road conditions. Like many things, they've fallen behind.

      • By berryg 2026-02-06 19:26

        No Lidar anymore on the 2026 Volvo models ES60 and EX60. See for example: https://www.jalopnik.com/2032555/volvo-ends-luminar-lidar-20...

        • By senordevnyc 2026-02-06 19:57

          I love Volvo, am considering buying one in a couple weeks actually, but they're doing nothing interesting in terms of ADAS, as far as I can tell. It seems like they're limited to adaptive cruise control and lane keeping, both of which have been solved problems for more than a decade.

          It sounds like they removed Lidar due to supplier issues and availability, not because they're trying to build self-driving cars and have determined they don't need it anymore.

          • By ruszki 2026-02-06 21:21

            Is lane keeping really a solved problem? Just last year one of my brand new rented cars tried to kill me a few times when I tried it again, and so far not even the simple lane leaving detection mechanism worked properly in any of the tried cars when it was raining.

            • By boredtofears 2026-02-07 14:00

              What problem is it even solving? Keeping my car straight so I can be less attentive on the road?

              I get it in the context of driverless but find it nothing but annoying as a driver.

              • By LorenPechtel 2026-02-08 3:55

                Adaptive cruise control requires some degree of lane detection. It has to figure out what car it's actually following, not merely what car is in front of it. (The road is turning, the car in front of you can easily not be the car you are actually behind.)

              • By kube-system 2026-02-07 15:16

                Lane keep keeps your car in the lane so you can stop paying attention just like cruise control keeps you going the same speed so you can stop paying attention… they don’t.

                They are just aids that ease fatigue on long trips.

                • By wiredpancake 2026-02-09 1:00

                  The "fatigue" from long trips is hardly a result of having to keep in a lane.

                  It's more so the result of being awake, doing effectively nothing, for a long time. Lane Keep assistance is a useless technology for 99% of the population and the 1% who need it, likely shouldn't be driving a car anyways.

                  The more we "aid" fatigue, the longer drivers will attempt to drive. This cannot be a good outcome. The worst driving occurs when one is practically half asleep.

                  • By kube-system 2026-02-09 1:37

                    I’m not referring to mental fatigue, but the physical ergonomic fatigue simply from continually activating muscles in a narrow range of motion even over a couple of hours.

                    If you’ve ever driven a 1970s truck you’ll know that continually correcting the steering will wear you out after just a couple of hours. Modern rack and pinion steering is a lot more comfortable, and lane keep is a further comfort improvement.

          • By nfg 2026-02-06 21:12

            I’d suggest doing some research on software quality. Two years back I was all for buying one (I was considering an EX40), but I got myself into some Facebook groups for owners and was shocked at the dreadful reports of quality of the software and it completely put me off. I got an ID4 instead. Reports about the EX90 have been dreadful. I was very interested, and I still admire their look and build when they drive by - but it killed my enthusiasm to buy one for a few years until they get it right.

            • By nwienert 2026-02-06 22:50

              Software is pretty solid as of latest release. EX90 is a sleeper pick now because of the bad press being behind the latest software.

          • By dham 2026-02-07 19:03

            Lane keep is absolutely not a solved problem. Go test drive any of the latest cars from Kia, Honda, Toyota, Hyundai, Ford, etc. They all will literally kill you.

      • By jellojello 2026-02-06 18:31

        Without lidar, and with the terrible quality of Tesla's onboard cameras, the street view would look terrible. The biggest L of Elon's career is the weird commitment to no lidar. If you've ever driven a Tesla, you get daily messages like "the left side camera is blocked"; cameras and weather don't mix either.

        • By ASalazarMX 2026-02-06 18:51

          At first I gave him the benefit of the doubt, like that weird decision of Steve Jobs to ban Adobe Flash, which ran most of the fun parts of the Internet back then and which ended up spreading HTML5. Now I just think he refused LIDAR for purely aesthetic reasons. The cost is not even that significant compared to the overall cost of a Tesla.

          • By dwaite 2026-02-08 2:46

            It's important to understand the timeline of the Steve Jobs open letter on Adobe Flash: at that point the iPhone had been out just shy of three years, and it predated the first public betas of Flash on Android. So for nearly three years, Apple had been investing in HTML5 technology because Flash wasn't in a form where it was deployable.

            Additionally, Flash required Android phones with 256 MB of RAM as a minimum (which would have precluded two of the three shipped iPhone models at the time) and at least initially only supported software video decoding. Because of the differences in screen dimensions, resolutions, and interaction models (plus the issues with embedding due to RAM limitations), websites were still basically broken whether your mobile phone had Flash or not.

            My understanding (based on the timing) was always that when Adobe was finally ready to push its partners to bundle mobile Flash, Apple looked at it and decided against it. Adobe made public statements against their partner and so Jobs did so in kind.

          • By ciberado 2026-02-06 21:43

            That one was motivated by the need to control the app distribution channel, just like they keep the web a second-class citizen in their ecosystem nowadays.

          • By londons_explore 2026-02-06 23:01

            Years ago he called lidar a crutch...

            And I agree, it is. Clearly it is theoretically possible without.

            But when you can't walk at all, a crutch might be just what you need to get going before you can do it without the crutch!

          • By iamtheworstdev 2026-02-06 19:09

            He didn't refuse it. Mobileye or whoever cut Tesla off because Tesla was using the sensors in a way they didn't approve of. From there he got mad and said "no more lidar!"

          • By rcpt 2026-02-07 4:27

            > purely aesthetic reasons

            This is huge though.

            People aren't setting them on fire during protests, and if an FSD Tesla plows into a farmers market, it might not even make the news.

            People hate tech so much that self-driving companies with easy-to-spot cars have had to shut down after just a few mistakes.

            Disguising Teslas as plain old regular human-driven cars is a great idea and I wouldn't be surprised if they win the market because of this. Even if they suck at driving.

          • By mr_toad 2026-02-07 4:16

            > The cost is not even that significant compared to the overall cost of a Tesla.

            That’s true now, but when they first debuted they would have doubled the cost of the car.

            • By GuB-42 2026-02-07 17:35

              When Tesla debuted, the cost of batteries made electric cars more like an expensive novelty. The Tesla roadster certainly was fun, but it wasn't a practical car for day-to-day use.

              Of course, things have changed.

              Had Tesla gone all-in on lidar, they could have turned the technology into a commodity; they are a trillion-dollar company producing a million cars a year. Lidar is already present on cheap robot vacuum cleaners, and we have time-of-flight cameras in smartphones. I don't believe it would have been a problem to equip $50k cars with lidar.

          • By smallmancontrov 2026-02-06 19:14

            His stated reason was that he wanted the team focused on the driving problem, not sensor fusion "now you have two problems" problems. People assumed cost was the real reason, but it seems unfair to blame him for what people assumed. Don't get me wrong, I don't like him either, but that's not due to his autonomous driving leadership decisions, it's because of shitting up twitter, shitting up US elections with handouts, shitting up the US government with DOGE, seeking Epstein's "wildest party," DARVO every day, and so much more.

            • By jellojello 2026-02-06 19:35

              Sensor fusion is an issue, but one that is solvable with time and investment in the driving model; sensor-can't-see-anything is a show stopper.

              A self-driving solution that can be totally shut off by a speck of mud, heavy rain, morning dew, or bright sunlight at dawn and dusk… you can't engineer your way out of sensor blindness.

              I don't want a solution that is available to use 98% of the time, I want a solution that is always-available and can't be blinded by a bad lighting condition.

              I think he did it because his solution always used the crutch of "FSD Not Available, Right hand Camera is Blocked" messaging and "Driver Supervision" as the backstop to any failure anywhere in the stack. Waymo had no choice but to solve the expensive problem of "Always Available and Safe" and work backwards on price.

              • By red75prime 2026-02-06 23:07

                > Waymo had no choice but to solve the expensive problem of "Always Available and Safe"

                And it's still not clear whether they use a fallback driving stack for situations where one of the non-essential (i.e. non-camera (1)) sensors is degraded. I haven't seen Waymo clearly state the capabilities of their self-driving stack in this regard. On the other hand, there are such things as washer fluid and high-dynamic-range cameras.

                (1) You can't drive in a city if you can't see the light emitted by traffic lights, which neither lidar nor radar can do.

                • By jellojello 2026-02-07 7:31

                  Hence why both together make the solution Waymo chose. The proof is in the pudding: Waymos have been driving millions of miles without any intervention, while Tesla requires safety drivers. I would never trust the FSD on my Model 3 to be even nearly perfect all the time.

                  Lidar also gives you the ability to see through fog, and as it scans, the depth needed to nearly always understand what object is in front of the car.

                  My Model 3 shows "degraded" or "unavailable" about 2% of the time I'm driving around populated areas. Zero chance it will ever be truly FSD capable, no matter the software improvements. It will still be unavailable because the cameras are blinded, blocked, or unable to process the scene.

                  While you're right that washer fluid usually works on the windshield, it doesn't on the side cameras; and yes, HDR could improve things, but it won't improve depth perception, and it will never be installed on my Model 3.

                  Lidar contributes the data most needed to handle the millions of edge cases that exist. With both camera and lidar contributing the data they are both the best at collecting, the risk of the very worst type of accidents is greatly reduced.

                  I don't see these stats https://waymo.com/safety/impact/ happening for tesla anytime soon.
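
                  The "each sensor contributes the data it is best at collecting" idea can be sketched as a toy late-fusion step, where the camera supplies the semantic label and the lidar supplies metric range. All types, names, and thresholds below are hypothetical illustrations, not any production stack:

```python
from dataclasses import dataclass

@dataclass
class CameraDet:
    label: str          # semantic class from the camera, e.g. "pedestrian"
    bearing_deg: float  # direction of the detection

@dataclass
class LidarPoint:
    bearing_deg: float
    range_m: float      # metric range from the lidar

def fuse(cam: CameraDet, points: list[LidarPoint], tol_deg: float = 2.0):
    """Attach the nearest lidar range to a camera detection by bearing."""
    near = [p for p in points if abs(p.bearing_deg - cam.bearing_deg) < tol_deg]
    if not near:
        return cam.label, None  # camera-only: class known, range unknown
    return cam.label, min(p.range_m for p in near)

print(fuse(CameraDet("pedestrian", 10.0),
           [LidarPoint(10.5, 12.3), LidarPoint(40.0, 5.0)]))
# ('pedestrian', 12.3)
```

                  The point of the sketch: if the lidar return is missing, the camera detection still exists, and vice versa the lidar range can flag an obstacle the classifier missed.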

                  • By red75prime 2026-02-07 8:13

                    > without any intervention

                    but with occasional remote guidance (Waymo doesn't seem to disclose statistics of that). In some cases remote guidance includes placing waypoints[1].

                    > Lidar also gives you the ability to see through fog and as it scans

                    Nah. Lidar isn't much better in fog than cameras. If I'm not mistaken, fog, rain, smoke, snow scatter IR light approximately the same as visible light. The lidar beam needs to travel twice the distance and its power is limited by eye-safety concerns.

                    > FSD on my model 3 to be even nearly perfect all the time

                    It doesn't need to be perfect. It needs to not hit things, cars, and pedestrians too hard or too often, while mostly obeying traffic rules. Waymo gets quite a few complaints about their cars' behavior[2], but they manage just fine.

                    [1] third video in https://waymo.com/blog/2024/05/fleet-response

                    [2] https://www.austintexas.gov/page/autonomous-vehicles

                  • By dham 2026-02-07 19:10

                    Waymo had safety drivers for a long time. And still have safety drivers to this day when they roll out a new city. You wouldn't have known that because no one was paying attention to this stuff back then.

                  • By mavhc 2026-02-07 13:17

                    Waymo also had safety drivers for years.

                    All you really need is "drive slower if you can't see (because rain, fog, or degraded cameras), or you're in an area where children might run out into the road"

              • By dham 2026-02-07 19:08

                If you have mud on a camera, you can't drive it either way. Lidar or not. The way to actually solve these issues is to have way more cameras for redundancy / self cleaning etc, not other sensors.

              • By smallmancontrov 2026-02-06 20:18

                LIDAR is notoriously easy to blind, what are you on about? Bonus meme: LIDAR blinds you(r iPhone camera)!

        • By verelo 2026-02-06 18:48

          Yeah, it's absurd. As a Tesla driver, I have to say the autopilot model really does feel like what someone who's never driven a car before thinks driving is like.

          Using vision only is so ignorant of what driving is all about: sound, vibration, vision, heat, cold... these are all clues about road conditions. If the car isn't sensing all of these as part of the model, you're handicapping it. Lidar is, in a brilliant way, the missing piece of information a car needs without relying on multiple sensors; it's probably superior to what a human can do, whereas vision only is clearly inferior.

          • By smallmancontrov 2026-02-06 18:59

            The inputs to FSD are:

                7 cameras x 36fps x 5Mpx x 30s
                48kHz audio
                Nav maps and route for next few miles
                100Hz kinematics (speed, IMU, odometry, etc)
            
            Source: https://youtu.be/LFh9GAzHg1c?t=571
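
            For a sense of scale, those camera figures imply a hefty raw data rate. A back-of-envelope sketch, assuming 3 bytes per pixel uncompressed (an assumption of mine, not part of the cited talk):

```python
# Back-of-envelope data rate for the camera inputs listed above,
# assuming 3 bytes per pixel (uncompressed RGB) -- an assumption,
# not part of the cited spec.
cameras = 7
fps = 36
pixels = 5e6          # 5 Mpx per frame
bytes_per_px = 3

bytes_per_sec = cameras * fps * pixels * bytes_per_px
print(f"~{bytes_per_sec / 1e9:.1f} GB/s uncompressed")  # ~3.8 GB/s
```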

            • By ambicapter 2026-02-06 19:20

              So if they’re already “fusioning” all these things, why would LIDAR be any different?

              • By smallmancontrov 2026-02-06 20:07

                Tesla went nothing-but-nets (making fusion easy) and Chinese LIDAR became cheap around 2023, but monocular depth estimation was spectacularly good by 2021. By the time unit cost and integration effort came down, LIDAR had very little to offer a vision stack that no longer struggled to perceive the 3D world around it.

                Also, integration effort went down but it never disappeared. Meanwhile, opportunity cost skyrocketed when vision started working. Which layers would you carve resources away from to make room? How far back would you be willing to send the training + validation schedule to accommodate the change? If you saw your vision-only stack take off and blow past human performance on the march of 9s, would you land the plane just because red paint became available and you wanted to paint it red?

                I wouldn't completely discount ego either, but IMO there's more ego in the "LIDAR is necessary" case than in the "LIDAR isn't necessary" case at this point. FWIW, I used to be an outspoken LIDAR-head before 2021, when monocular depth estimation became a solved problem. It was funny watching everyone around me convert in the opposite direction at around the same time, probably driven by politics. I get it, I hate Elon's politics too, I just try very hard to keep his shitty behavior from influencing my opinions on machine learning.

                • By magicalist 2026-02-06 21:20

                  > but monocular depth estimation was spectacularly good by 2021

                  It's still rather weak, and true monocular depth estimation really wasn't spectacularly anything in 2021. It's fundamentally ill-posed, and any priors you use to get around that will come back to bite you in the long tail of things some driver will encounter on the road.

                  The way it got good is by using camera overlap in space and over time while in motion to figure out metric depth over the entire image. Which is, humorously enough, sensor fusion.
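
                  The camera-overlap point can be made concrete with the classic two-view case: a known baseline turns pixel disparity into metric depth by similar triangles. A minimal sketch; all numbers are purely illustrative:

```python
# Metric depth from two overlapping views: Z = f * B / d,
# where f is the focal length in pixels, B the baseline in meters,
# and d the disparity in pixels.
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# 1000 px focal length, 0.5 m baseline, 25 px disparity:
print(stereo_depth(1000.0, 0.5, 25.0))  # 20.0 (meters)
```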

                  • By smallmancontrov 2026-02-06 22:21

                    It was spectacularly good before 2021, 2021 is just when I noticed that it had become spectacularly good. 7.5 billion miles later, this appears to have been the correct call.

                    • By quanto 2026-02-07 16:35

                      What are the techniques (and the papers thereof) that you consider to be spectacularly good before 2021 for depth estimation, monocular or not?

                      I do some tangent work from this field for applications in robotics, and I would consider (metric) depth estimation (and 3D reconstruction) starting to be solved only by 2025 thanks to a few select labs.

                      Car vision has some domain specificity (high similarity images from adjacent timestamps, relatively simpler priors, etc) that helps, indeed.

                • By kanbara 2026-02-06 20:43

                  Depth estimation is but one part of the problem: atmospheric and other conditions that blind visible-spectrum optical sensors, lack of ambient light (sunlight), and more. Lidar simply outperforms (performs at all?) in these conditions, and provides hardware-backed distance maps, not software-calculated estimates.

                  • By gibolt 2026-02-06 21:19

                    Lidar fails worse than cameras in nearly all those conditions. There are plenty of videos of Tesla's vision-only approach seeing obstacles far before a human possibly could in all those conditions on real customer cars. Many are on the old hardware with far worse cameras

                    • By Mawr 2026-02-06 23:03

                      Interesting, got any links? Sounds completely unbelievable, eyes are far superior to the shitty cameras Tesla has on their cars.

                      • By dham 2026-02-07 19:15

                        There's a misconception that what people see and what the camera sees are similar. Not true at all. One day when it's raining or foggy, have someone record the driving through the windshield. You'll be very surprised. Even what the camera displays on the screen isn't what it's actually "seeing".

                      • By jellojello 2026-02-07 8:06

                        Yeah… not holding my breath for links showing superman Tesla cameras performing better than eyes.

                • By 7e 2026-02-06 21:01

                  Monocular depth estimation can be fooled by adversarial images, or just scenes outside of its distribution. It's a validation nightmare and a joke for high reliability.

                  • By gibolt 2026-02-06 21:22

                    It isn't monocular though. A Tesla has 2 front-facing cameras, narrow and wide-angle. Beyond that, it is only neural nets at this point, so depth estimation isn't directly used; it is likely part of the neural net, but only the useful distilled elements.

                    • By smallmancontrov 2026-02-06 22:41

                      I never said it was. I was using it as a lower bound for what was possible.

                • By kranke155 2026-02-06 21:01

                  Always thought the case was for sensor redundancy and data variety - the stuff that throws off monocular depth estimation might not throw off a lidar or radar.

                • By rswail 2026-02-07 6:44

                  It doesn't solve the "Coyote paints tunnel on rock" problem though.

                  • By rkomorn 2026-02-07 6:56

                    IIRC, that was only ever a problem for the coyote, though.

                    Source: not a computer vision engineer, but a childhood consumer of Looney Tunes cartoons.

                    • By rswail 2026-02-07 7:31

                      Time for a car company to call itself "ACME" and the first model the "Road Runner".

            • By ChicagoDave 2026-02-06 20:23

              Fog, heavy rain, heavy snow, people running between cars or from an obstructed view…

              None of these technologies can ever be 100%, so we’re basically accepting a level of needless death.

              Musk has even shrugged off FSD related deaths as, “progress”.

              • By smallmancontrov 2026-02-06 20:36

                Humans: 70 deaths in 7 billion miles

                FSD: 2 deaths in 7 billion miles

                Looks like FSD saves lives by a margin so fat it can probably survive most statistical games.
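
                Normalizing those quoted counts to deaths per 100 million miles (the unit NHTSA reports in) makes the claimed margin explicit. The counts are the ones quoted above, not independently verified:

```python
# Deaths per 100 million miles from the raw counts quoted above.
def per_100m_miles(deaths: int, miles: float) -> float:
    return deaths / miles * 1e8

human = per_100m_miles(70, 7e9)  # 1.0
fsd = per_100m_miles(2, 7e9)     # ~0.03
print(f"human {human:.2f} vs FSD {fsd:.2f} per 100M miles, ratio {human / fsd:.0f}x")
```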

                • By socialcommenter 2026-02-07 5:35

                  How many of the 70 human accidents would be adequately explained by controlling for speed, alcohol, wanton inattention, etc? (The first two alone reduce it by 70%)

                  No customer would turn on FSD on an icy road, or on country lanes in the UK which are one lane but run in both directions; it's much harder to have a passenger fatality in stop-start traffic jams in downtown US cities.

                  Even if those numbers are genuine (2 vs 70) I wouldn't consider it apples-for-apples.

                  Public information campaigns and proper policing have a role to play in car safety; if that's the stated goal, we don't necessarily need to sink billions into researching self-driving.

                • By hn_acc1 2026-02-06 21:14

                  Is that the official Tesla stat? I've heard of way more Tesla fatalities than that..

                  • By simondotau 2026-02-06 22:40

                    There are a sizeable number of deaths associated with the abuse of Tesla’s adaptive cruise control with lane centering (publicly marketed as “Autopilot”). Such features are commonplace on many new cars, and it is unclear whether Tesla is an outlier, because no one is interested in obsessively researching cruise control abuse among other brands.

                    There are two deaths associated with FSD.

                  • By ChicagoDave 2026-02-06 21:37

                    This is absolutely a Musk defender. FSD and Tesla related deaths are much higher.

                    https://www.tesladeaths.com/index-amp.html

                    • By smallmancontrov 2026-02-06 22:22

                      Autopilot is the shitty lane assist. FSD is the SOTA neural net.

                      Your link agrees with me:

                      > 2 fatalities involving the use of FSD

                      • By ChicagoDave 2026-02-07 9:19

                        Tesla sales are dead across the world. Cybertruck is a failure. Chinese EVs are demonstrably better.

                        No one wants these crappy cars anymore.

                  • By Fricken 2026-02-06 22:22

                    I don't know what he's on about. Here's a better list:

                    https://en.wikipedia.org/wiki/List_of_Tesla_Autopilot_crashe...

                    • By dham 2026-02-07 19:17

                      Good ole Autopilot vs FSD post. You would think people on Hacker News would be better informed. Autopilot is just lane keep and adaptive cruise control. Basically what every other car has at this point.

                      "MacOS Tahoe has these cool features". "Yea but what about this wikipedia article on System 1. Look it has these issues."

                      That's how you come across

                    • By smallmancontrov 2026-02-06 22:29

                      Autopilot is the shitty lane assist. FSD is the SOTA neural net.

                      Your link agrees with me:

                      > two that NHTSA's Office of Defect Investigations determined as happening during the engagement of Full Self-Driving (FSD) after 2022.

                • By elgenie 2026-02-06 22:07

                  Isn't there a great deal of gaming going on with the car disengaging FSD milliseconds before crashing? Voila, no "full" "self" driving accident; just another human failing [*]!

                  [*] Failing to solve the impossible situation FSD dropped them into, that is.

            • By verelo 2026-02-06 19:30

              Better than I expected. So this was 3 days ago; is this for all previous models, or is there a cutoff date here?

          • By torginus 2026-02-06 20:51

            I quickly googled Lidar limitations, and this article came up:

            https://www.yellowscan.com/knowledge/how-weather-really-affe...

            Seeing how it's by a lidar vendor, I don't think they're biased against it. It seems lidar is not a panacea: it struggles with heavy rain and snow much more than cameras do, and it's affected by cold weather or any contamination on the sensor.

            So lidar will only get you so far. I'm far more interested in mmWave radar, which, while much worse in spatial resolution, isn't affected by light conditions or weather, and can directly measure properties of the thing it's illuminating: the material, the speed it's moving, the thickness.

            Fun fact: mmWave-based presence sensors can measure your heartbeat, as the micro-movements show up as a frequency component. So I'd guess it would have a very good chance of detecting a human.

            I'm pretty sure that even with much more rudimentary processing, it'll be able to tell if it's looking at a living being.

            By the way: what happened to the idea that self-driving cars would talk to each other and combine sensor data, so that if multiple cars are looking at the same spot, you'd get a much better chance of not making a mistake?
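
            The heartbeat trick works because a periodic chest micro-motion shows up as a spectral line in the phase of the radar return. A toy sketch on synthetic data; all parameters are illustrative, and real mmWave processing is far more involved:

```python
import numpy as np

# Synthetic demo: a ~1.2 Hz chest micro-motion buried in noise shows
# up as the dominant peak in the spectrum of the radar phase signal.
fs = 50.0                        # sample rate, Hz
t = np.arange(0, 20, 1 / fs)     # 20 s of samples
phase = 0.1 * np.sin(2 * np.pi * 1.2 * t)  # heartbeat at ~72 bpm
phase += 0.02 * np.random.default_rng(0).standard_normal(t.size)  # noise

spectrum = np.abs(np.fft.rfft(phase - phase.mean()))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
peak_hz = freqs[spectrum.argmax()]
print(f"detected {peak_hz:.2f} Hz ~= {peak_hz * 60:.0f} bpm")
```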

            • By dham 2026-02-07 19:21

              Lidar is a moot point. You can't drive with just lidar, no matter what. That's what people don't understand. The most common one I hear is "what if the camera gets mud on it?" OK, then you have to get out and clean it, or it needs an auto-cleaning system.

          • By ASalazarMX 2026-02-06 18:57

            Maybe vision-only can work with much better cameras: a wider spectrum (so they can see through fog, for example) and self-cleaning, zero-upkeep housings (so you don't have to pull over to wipe a speck of mud off them). Nevertheless, LIDAR still seems like the best choice overall.

          • By iknowstuff 2026-02-06 19:23

            Autopilot hasn’t been updated in years and is nothing like FSD. FSD does use all of those cues.

            • By verelo 2026-02-06 19:28

              I misspoke; I'm using Hardware 3 FSD.

        • By kypro 2026-02-06 19:47

          From the perspective of viewing FSD as an engineering problem that needs solving I tend to think Elon is on to something with the camera-only approach – although I would agree the current hardware has problems with weather, etc.

          The issue with lidar is that many of the difficult edge cases of FSD are visible-light vision problems. Lidar might be able to tell you there's a car up front, but it can't tell you that the car has its hazard lights on and a flat tire. Lidar might see a human-shaped thing in the road, but it cannot tell whether it's a mannequin leaning against a bin or a human about to cross the road.

          Lidar gets you most of the way there when it comes to spatial awareness on the road, but you need cameras for most of the edge-cases because cameras provide the color data needed to understand the world.

          You could never have FSD with just lidar, but you could have FSD with just cameras if you can overcome all of the hardware and software challenges with accurate 3D perception.

          Given lidar adds cost and complexity, and most edge cases in FSD are camera problems, I think camera-only probably forces engineers to focus their efforts in the right place rather than hitting bottlenecks from over-depending on lidar data. This isn't an argument for camera-only FSD, but from Tesla's perspective it does keep costs down and allows them to continue to produce appealing cars, which is obviously important if you're coming at FSD from the perspective of an automaker trying to sell cars.

          Finally, adding lidar as a redundancy once you've "solved" FSD with cameras isn't impossible. I personally suspect Tesla will eventually do this with their robotaxis.

          That said, I have no real experience with self-driving cars. I've only worked on vision problems and while lidar is great if you need to measure distances and not hit things, it's the wrong tool if you need to comprehend the world around you.

          • By senordevnyc 2026-02-06 20:02

            This is so wild to read when Waymo is currently doing like 500,000 paid rides every week, all over the country, with no one in the driver's seat. Meanwhile Tesla seems to have a handful of robotaxis in Austin, and it's unclear if any of them are actually driverless.

            But the Tesla engineers are "in the right place rather than hitting bottlenecks from over depending on Lidar data"? What?

            • By kypro 2026-02-06 21:44

              I wasn't arguing Tesla is ahead of Waymo? Nor do I think they are. All I was arguing was that it makes sense from the perspective of a consumer automobile maker to not use lidar.

              I don't think Tesla is that far behind Waymo, though, given Waymo's significant head start, the fact that Waymo has always been a taxi-first product, and that Waymo is using significantly more expensive tech than Tesla is.

              Additionally, it's not like this is a lidar vs cameras debate. Waymo also uses and needs cameras for FSD for the reasons I mentioned, but they supplement their robotaxis with lidar for accuracy and redundancy.

              My guess is that Tesla will experiment with lidar on their robotaxis this year, because those design decisions should differ from those of a consumer automobile. But I could be wrong, because if Tesla wants FSD to work well on visually appealing and affordable consumer vehicles, then they'll probably have to solve the additional challenges of a camera-only FSD system anyway. I think it will depend on how much Elon decides Tesla needs to pivot into robotaxis.

              Either way, what is undebatable is that you can't drive with lidar only. If the weather is so bad that cameras are useless then Waymos are also useless.

              • By DoctorOetker 2026-02-07 5:24

                What causes LiDAR to fail harder than normal cameras in bad weather conditions? I understand that normal LiDAR algorithms assume the direct paths from light source to object to camera pixel, while a mist will scatter part of the light, but it would seem like this can be addressed in the pixel depth estimation algorithm that combines the complex amplitudes at the different LiDAR frequencies.

                I understand that small lens sizes mean that falling droplets can obstruct the view behind the droplet, while larger lens sizes can more easily see beyond the droplet.

                I seldom see discussion of the exact failure modes for specific weather conditions. Even if larger lenses are selected the light source should use similar lens dimensions. Independent modulation of multiple light sources could also dramatically increase the gained information from each single LiDAR sensor.

                Do self-driving camera systems (conventional and LiDAR) use variable or fixed tilt lenses? Normal camera systems have the focal plane perpendicular to the viewing direction, but for roads it might be more interesting to have a large swath of the horizontal road in focus. At least having 1 front facing camera with a horizontal road in focus may prove highly beneficial.

                To a certain extent, an FSD system predicts the best course of action. When different courses of action have similar logits of expected fitness for the next best course of action, we can speak of doubt. With RMAD (reverse-mode automatic differentiation) we can figure out which features, or which facets of input, or which part of the view, are causing the doubt.

                A camera has motion blur (unless you can strobe the illumination source, but in daytime the sun is very hard to outshine), so it would seem like an interesting experiment to:

                1. identify in real time which doubts have the most significant influence on the determination of best course of action

                2. have a camera that can track an object to eliminate motion blur but still enjoy optimal lighting (under the sun, or at night), just like our eyes can rotate

                3. rerun the best course of action prediction and feed back this information to the company, so it can figure out the cost-benefit of adding a free tracking camera dedicated to eliminating doubts caused by motion blur.
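
                On the fog question raised above, a common first-order model is that a lidar return suffers Beer-Lambert attenuation twice (out and back through the fog) on top of 1/R² spreading. A sketch with illustrative extinction coefficients, not measured values:

```python
import math

# First-order lidar return model: two-way Beer-Lambert attenuation
# through the fog, on top of 1/R^2 geometric spreading.
def relative_return(range_m: float, sigma_per_m: float) -> float:
    return math.exp(-2 * sigma_per_m * range_m) / range_m ** 2

clear = relative_return(100, 0.001)  # light haze (illustrative sigma)
fog = relative_return(100, 0.03)     # dense fog (illustrative sigma)
print(f"return power drops ~{clear / fog:.0f}x at 100 m in dense fog")
```

                This is one reason eye-safety-limited lidar degrades quickly with range in scattering media, as discussed in the thread.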

            • By smallmancontrov 2026-02-06 20:30

              Tesla has driven 7.5B autonomous miles to Waymo's 0.2B, but yes, Waymo looks like they are ahead when you stratify the statistics according to the ass-in-driver-seat variable and neglect the stratum that makes Tesla look good.

              The real question is whether doing so is smart or dumb. Is Tesla hiding big show-stopper problems that will prevent them from scaling without a safety driver? Or are the big safety problems solved and they are just finishing the Robotaxi assembly line that will crank out more vertically-integrated purpose-designed cars than Waymo's entire fleet every day before lunch?

              • By hn_acc1 2026-02-06 21:17

                Tesla's also been involved in WAY more accidents than Waymo - and has tried to silence those people, claim FSD wasn't active, etc.

                What good is a huge fleet of Robotaxis if no one will trust them? I won't ever set foot in a Robotaxi, as long as Elon is involved.

                • By jellojello 2026-02-07 8:27

                  Waymo just hit its first pedestrian, ever. It did so at 6 mph, and it was estimated a human would have hit the kid at 14 mph (it was going 17 mph when a small child jumped out in front of it from behind a black SUV).

                  First pedestrian struck. That's crazy.

                  Tesla just disengages fsd anytime a sensor is slightly blocked/covered/blinded.. waymo out here doing fsd 100% of the time and basically never hurts anyone.

                  I don't get the tesla/elon love here, i like my model 3 but it's never going to get real fsd, and that sucks, elon also lies about the roadmap, timing, etc. I bet the roadster is canceled now. Why do people like inferior sensors and autistic hitler?

                  • By johnthewise 2026-02-07 13:35

                    Waymos disengage and get teleoperated too?

                    • By kube-system 2026-02-0715:28

                      Not really. Waymos can’t be driven remotely, their remote operators can give the car directions, e.g. “use this lane”, and then the autonomous system controls the vehicle to execute those directions.

                      I’m sure latency and connectivity are too much of a risk to do it any other way.

                      The only Waymos driven by a human are the ones with human drivers physically in the car.

                • By kypro 2026-02-0621:521 reply

                  There are more Teslas on the road than Waymos by several orders of magnitude. Additionally, the types of roads and conditions Teslas drive under are completely incomparable to Waymo's.

                  • By jamespo 2026-02-0622:16

                    Yes that was accounted for above, but this isn't autonomous apples to apples

              • By jasondigitized 2026-02-0622:06

                semi autonomous

        • By gambiting 2026-02-0620:371 reply

          >>The biggest L of elon's career is the weird commitment to no-lidar.

          I thought it was the Nazi salutes on stage and backing neo-nazi groups everywhere around the world, but you know, I guess the lidar thing too.

          • By jeron 2026-02-0722:18

            maybe it's better to say it was the biggest L of his engineering career instead of his political career

        • By 0xfaded 2026-02-0618:583 reply

          I have HW3, but FSD reliably disengages at this time of year with sunrise and sunset during commute hours.

          • By jellojello 2026-02-0619:191 reply

            Yep, and won't activate until any morning dew is off the sensors.. or when it rains too hard.. or if it's blinded by a shiny building/window/vehicle.

            I will never trust 2d camera-only, it can be covered or blocked physically and when it happens FSD fails.

            As cheap as LIDAR has gotten, adding it to every new tesla seems to be the best way out of this idiotic position. Sadly I think Elon got bored with cars and moved on.

            • By dham 2026-02-0719:23

              If the camera is covered or blocked, you can't drive plain and simple, as you can't drive a car (at least on Earth) with just Lidar. The roads are made for eyes. Maybe on Rocky's homeworld you can have a Lidar only system for traveling.

          • By DoctorOetker 2026-02-075:06

            This will considerably skew the statistics; a low sun dramatically increases accident rates for humans too.

          • By iknowstuff 2026-02-0619:24

            FSD14 on hw4 does not. Its dynamic range is equivalent to or better than a human's.

      • By rajnathani 2026-02-1613:51

        Not really, I think. They built a simulation engine for autonomous driving, tons of which already exist, including ones from Nvidia and at least one open-source option. Using world models is different.

    • By xnx 2026-02-0618:331 reply

      > Suddenly all this focus on world models by Deep mind starts to make sense

      Google's been thinking about world models since at least 2018: https://arxiv.org/abs/1803.10122

      • By anp 2026-02-0620:18

        FWIW I understood GP to mean that it suddenly makes sense to them, not that there’s been a sudden focus shift at google.

    • By rswail 2026-02-076:38

      Maybe they were focusing on a real world use that basically requires AI, but not LLMs.

      Tesla claimed that all their "real world" recording would give them a moat on FSD.

      Waymo is showing that a) you need to be able to incorporate stuff that isn't "real" when training, and b) you get a lot more information from sensors beyond the visible spectrum.

    • By ericzundel 2026-02-0716:18

      I just listened to a fantastic multi-hour Acquired (https://www.acquired.fm/) podcast episode on Google and AI that talks about the history of Google and AI and all the ways they have been using it since 2012. It's really fascinating. You can forgive them for not focusing on Reader or any of their other properties when you realize they were pulling in hundreds of billions of dollars of value by making big bets in AI and incorporating it into their core business.

    • By spiderfarmer 2026-02-0621:00

      Grok/xAI is a joke at this point. A true money pit without any hopes for a serious revenue stream.

      They should be bought by a rocket company. Then they would stand a chance.

    • By smeeth 2026-02-0618:125 reply

      I always understood this to be why Tesla started working on humanoid robots

      • By smt88 2026-02-0619:421 reply

        They started working on humanoid robots because Musk always has to have the next moonshot, trillion-dollar idea to promise "in 3 years" to keep the stock price high.

        As soon as Waymo's massive robotaxi lead became undeniable, he pivoted from robotaxis to humanoid robots.

        • By senordevnyc 2026-02-0620:04

          Yeah, that and running Grok on a trillion GPUs in space lol

      • By ACCount37 2026-02-0619:04

        Pretty much. They banked on "if we can solve FSD, we can partially solve humanoid robot autonomy, because both are robots operating in poorly structured real world environments".

      • By jasondigitized 2026-02-0622:073 reply

        I don't want a humanoid robot. I want a purpose built robot.

        • By simondotau 2026-02-0622:53

          Obviously both will exist and compete with each other on the margins. The thing to appreciate is that our physical world is already built like an API for adult humans. Swinging doors, stairs, cupboards, benchtops. If you want a robot to traverse the space and be useful for more than one task, the humanoid form makes sense.

          The key question is whether general purpose robots can outcompete on sheer economies of scale alone.

        • By dham 2026-02-0719:28

          It's called a dishwasher, washing machine, and dryer. Plus robomowers, vacuums, etc.

        • By monocasa 2026-02-070:181 reply

          I mean, I would take a robot to handle all of my housework.

          Purpose built, that probably takes the form of a humanoid robot, since all of the tasks it needs to do were previously designed for humans.

          • By rswail 2026-02-076:521 reply

            Vacuuming and mopping are not inherently "designed" for humans.

            Dusting with a single extensible and multiple degrees of freedom arm would be much more maneuverable than a human arm.

            Loading and unloading washing machines or dryers, or doing the same for dishes and cutlery in a dishwasher, is not inherently designed for humans.

            If anything, selling an integrated "housekeeping" system that fits into an existing laundry and combines features would be a much better approach.

            • By monocasa 2026-02-087:151 reply

              I agree that each would be made slightly better with a more integrated system. But you could handle all of them in my hundred year old house with the form factor it was designed for: a humanoid. Probably pretty soon here for cheaper than each could be handled separately by more integrated systems.

              • By rswail 2026-02-096:17

                For new builds, a laundry/utility room that includes the dishwashing and other "housekeeping" facilities is a no-brainer when there is a custom robot built to use those facilities as well as maneuver around the rest of the house.

                For old/retrofit renovations it also makes sense, but otherwise, yes, a human-form robot makes sense.

                The question is which is a better investment for any robot manufacturer in 2026?

      • By rswail 2026-02-076:48

        The drop in demand for Tesla's clapped out model range would have meant embarrassing factory closures, so now they're being closed to start manufacturing a completely different product. Bait and switch for Tesla investors.

        I wonder how long they'll be closed for "modifications" and whether the Optimus Prime robot factories will go into production before the "Trump Kennedy Center" is reopened after its "renovations".

      • By Fricken 2026-02-0622:25

        It's so they can stick a Tesla logo on a bunch of chinese tech and call it innovation.

    • By theptip 2026-02-0718:03

      So is this a model baked into the VLLM layer? Or a scaffold that the agent sits in for testing?

      If the former then it’s relevant to the broader discourse on LLM generality. If the latter, then it seems less relevant to chatbots and business agents.

      Edit to add: this is not part of the model, it’s in a separate pillar (Simulator vs Driver). More at https://waymo.com/blog/2025/12/demonstrably-safe-ai-for-auto....

    • By YeGoblynQueenne 2026-02-0718:481 reply

      >> Suddenly all this focus on world models by Deep mind starts to make sense.

      The apparent applicability to Waymo is incidental, more likely because a few millions+ were spent on Genie and they have to do something with it. DeepMind started to train "world models" because that's the current overhyped buzzword in the industry. First it was "natural language understanding" and "question answering" back in the days of old BERT, then it was "agentic", then "reasoning", now it's "world models", next year it's going to be "emotions" or "social intelligence" or some other anthropomorphic, over-drawn neologism. If you follow a few AI accounts on social media you really can't miss when those things suddenly start trending, then pretty much die out and only a few stragglers still try to publish papers on them because they failed to get the memo that we're now all running behind the Next Big Thing™.

      • By wiz21c 2026-02-0718:591 reply

        Notice that all these buzzwords you list actually correspond to real advances in the field. All of them were improvements on something existing; not a big revolution, for sure, but definitely measurable improvements.

        • By YeGoblynQueenne 2026-02-0722:521 reply

          Those are not "real advances in the field", which is why they are constantly abandoned for the next new buzzword.

          Edit:

          This just in:

          https://news.ycombinator.com/item?id=46870514#46929215

          The Next Big Thing™ is going to be "context learning", at least if Tencent have their way. And why do we need that?

          >> Current language models do not handle context this way. They rely primarily on parametric knowledge—information compressed into their weights during massive pre-training runs. At inference time, they function largely by recalling this static, internal memory, rather than actively learning from new information provided in the moment.

          >> This creates a structural mismatch. We have optimized models to excel at reasoning over what they already know yet users need them to solve tasks that depend on messy, constantly evolving context. We built models that rely on what they know from the past, but we need context learners that rely on what they can absorb from the environment in the moment.

          Yep. Reasoning is so 2025.

          • By gbnwl 2026-02-088:121 reply

            I think you might be salty because the words become overused and overhyped, and often 90% of the people jumping on the bandwagon are indeed just parroting the new hot buzzword and don't really understand what they're talking about. But the terms you mentioned are all obviously very real and very important in applications using LLMs today. Are you arguing that reasoning was vaporware? None of these things were meant to be the final stop of the journey, just the next step.

            • By YeGoblynQueenne 2026-02-0817:13

              Excuse me? I'm "salty"? What the hell are you talking about?

              Why doesn't this site have a block user button?

    • By QuantumFunnel 2026-02-071:192 reply

      Also known as a monopoly. This should terrify us all.

      • By Andrex 2026-02-073:16

        No, it's known as vertical integration, which is legally permitted by default.

      • By cman1444 2026-02-0719:14

        Monopolies are essentially 100% horizontal integration. Vertical integration is a completely different concept.

    • By schiffern 2026-02-0622:222 reply

        >I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.
      
      So for the record, with this realization you're 3+ years behind Tesla.

      https://www.youtube.com/watch?v=ODSJsviD_SU&t=3594s

      • By numpad0 2026-02-076:48

        Practically ALL course introductory materials that regard robotics and AI that I've seen began with "you might imagine a talking bipedal humanoid when you hear the word `robot`, but perhaps the most commonplace robot that you have seen is a vending machine", with the illustration of a typical 80s-90s outdoor soda vendor with no apparent moving parts.

        So "maybe cars are a bit of robots too" is more like 30-50 years behind the time.

      • By tapoxi 2026-02-0622:321 reply

        Aren't they still using safety drivers or safety follow cars and in fewer cities? Seems Tesla is pretty far behind.

        • By schiffern 2026-02-0622:382 reply

          What do you think I said that you're contradicting?

          IMO the presence of safety chase vehicles is just a sensible "as low as reasonably achievable" measure during the early rollout. I'm not sure that can (fairly) be used as a point against them.

          I'm comfortable with Tesla sparing no expense for safety, since I think we all (including Tesla) understand that this isn't the ultimate implementation. In fact, I think it would be a scandal if Tesla failed to do exactly that.

          Damned if you do and damned if you don't, apparently.

          • By tapoxi 2026-02-0622:391 reply

            I don't know if Tesla claiming they're doing something carries weight anymore.

            • By schiffern 2026-02-0622:45

              Setting aside the anti-Tesla bias, none of what I said relies on Tesla claims. The "chase vehicle" claims are all based on third-party accounts from actual rideshare customers.

          • By Mawr 2026-02-0623:143 reply

            > IMO the presence of safety chase vehicles is just a sensible "as low as reasonably achievable" measure during the early rollout. I'm not sure that can (fairly) be used as a point against them.

            Only if you're comparing them to another company, which you seem to be. So yes, yes it can.

            Seriously, the amount of sheer cope here is insane. Waymo is doing the thing. Tesla is not. If Tesla were capable of doing it, they would be. But they're not.

            It really is as simple as that and no amount of random facts you may bring up will change the reality. Waymo is doing the thing.

            • By schiffern 2026-02-072:452 reply

              >Waymo is doing the thing.

              This worldview is overly simplistic.

              Waymo has (very shrewdly, for prospective investors at least) executed a strategy that most quickly scales to 0.1% of the population. Unfortunately it doesn't scale further. The cars are too costly and the mapping is too costly. There is no workable plan for significant scale from Waymo.

              Tesla is executing the strategy that most quickly scales to 100% of the population.

              • By essdas 2026-02-0710:311 reply

                > most quickly scales to 0.1% of the population. Unfortunately it doesn't scale further

                Data suggests that they’re already available to ~2% of the US population.

                • By schiffern 2026-02-121:21

                  There's definitely not enough Waymos to replace the transport needs of 2% of the population, so 0.1% is a more accurate figure of merit.

              • By ra7 2026-02-074:09

                > Tesla is executing the strategy that most quickly scales to 100% of the population.

                So, uh… where is this “scale” then? This “strategy” has been bandied about for better part of a decade. Why are they still in a tiny geofence in Austin with chase cars?

                Waymo is doing it right now. Half a million rides every week, expansion to a dozen new cities. Tesla does a few hundred in a tiny area.

                Scale is assessed by looking at concrete numbers, not by “strategies” that haven’t materialized for a decade.

    • By uoaei 2026-02-0621:22

      What an upsetting comment. I'm glad you came around but what did you think was going to be effective before you came around to world models?

    • By dmd 2026-02-0619:301 reply

      Which is why it's embarrassing how much worse Gemini is at searching the web for grounding information, and how incredibly bad gemini cli is.

      • By xnx 2026-02-0621:05

        Not my experience in either of those areas.

    • By londons_explore 2026-02-0622:53

      Internal firewalls and poor management means that the vast majority of integration opportunities are missed.

    • By jasondigitized 2026-02-0620:21

      The flywheel is starting to spin......

    • By lagrange77 2026-02-072:01

      > I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.

      I view Tesla also more as a robot company than anything else.

    • By adarsh2321 2026-02-072:03

      [dead]

    • By sdf2erf 2026-02-0618:502 reply

      "Waymo as a robot in the same way"

      Erm, a dishwasher, washing machine, or automated vacuum can be considered a robot. I'm confused as to this obsession with the term - there are many robots that already exist. Robots have been involved in the production of cars for decades.


      • By ASalazarMX 2026-02-0618:591 reply

        I think the (gray) line is the degree of autonomy. My washing machine makes very small, predictable decisions, while a Waymo has to manage uncertainty most of the time.

        • By sdf2erf 2026-02-0619:024 reply

          Its irrelevant. A robot is a robot.

          Dictionary def: "a machine controlled by a computer that is used to perform jobs automatically."

          • By saghm 2026-02-0619:50

            A robot is a robot, and a human is a creature that won't necessarily agree with another human on what the definition of a word is. Dictionaries are also written by humans and don't necessarily reflect the current consensus, especially on terms where people's understanding might evolve over time as technology changes.

            Even if that definition were universally agreed upon, though, that's not really enough to understand what the parent comment was saying. Being a robot "in the same way" as something else is even less objective. Humans are humans, but they're also mammals; is a human a mammal "in the same way" as a mouse? Most humans probably have a very different view of the world than most mice, and the parent comment was specifically addressing the question of whether it makes sense for an autonomous car to model the world the same way as other robots or not. I don't see how you can dismiss this as "irrelevant" because both humans and mice are mammals (or even animals; there's no shortage of classifications out there) unless you're having a completely different conversation than the person you responded to. You're not necessarily wrong because of that, but you're making a pretty significant misjudgment if you think that's helpful to them or to anyone else involved in the ongoing conversation.

          • By mattlondon 2026-02-0619:10

            No one is denying that robots existed already (but I would hardly call a dishwasher a robot FWIW)

            But in my mind a Waymo was always a "car with sensors"; more recently (especially having used them a bunch in California) I've come to think of them truly as robots.

          • By ASalazarMX 2026-02-0619:09

            TIL fuel injectors are robots. Probably my ceiling lights too.

            Maybe we need to nitpick about what a job is exactly? Or we could agree to call Waymos (semi)autonomous robots?

          • By goatlover 2026-02-0619:21

            In the same way people online have argued helicopters are flying cars, it doesn't capture what most people mean when they use the word "robot", any more than helicopters are what people have in mind when they mention flying cars.

    • By themafia 2026-02-0618:581 reply

      It's a 3500lb robot that can kill you.

      Boston Dynamics is working on a smaller robot that can kill you.

      Anduril is working on even smaller robots that can kill you.

      The future sucks.

    • By Dig1t 2026-02-0621:46

      >or grok's porn

      I know it’s gross, but I would not discount this. Remember why Blu-ray won over HDDVD? I know it won for many other technical reasons, but I think there are a few historical examples of sexual content being a big competitive advantage.

    • By coffeemug 2026-02-0619:332 reply

      The vertical integration argument should apply to Grok. They have Tesla driving data (probably much more data than Waymo), Twitter data, plus Tesla/SpaceX manufacturing data. When/if Optimus starts on the production line, they'll have that data too. You could argue they haven't figured out how to take advantage of it, but the potential is definitely there.

      • By BoredPositron 2026-02-0619:39

        Agreed. Should they achieve Google level integration, we will all make sure they are featured in our commentary. Their true potential is surely just around the corner...

      • By jeffbee 2026-02-0620:321 reply

        "Tesla has more data than Waymo" is some of the lamest cope ever. Tesla does not have more video than Google! That's crazy! People who repeat this are crazy! If there was a massive flow of video from Tesla cars to Tesla HQ that would have observable side effects.

        • By schiffern 2026-02-074:10

          "More video" (gigabytes) is a straw man.

          The key metric is more unusual situations. That scales with miles driven, not gigabytes. With onboard inference the car simply logs anything 'unusual' (low confidence) to selectively upload those needle-in-a-haystack rare events.
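          A minimal sketch of that kind of confidence-gated triggering (the threshold, context size, and function name here are invented for illustration, not Tesla's actual pipeline):

```python
from collections import deque

def select_events(confidences, threshold=0.3, context=5):
    """Flag frames whose model confidence drops below a threshold,
    keeping a few preceding frames of context for selective upload.
    Threshold and context size are illustrative values."""
    buffer = deque(maxlen=context)  # rolling pre-event context (frame indices)
    selected = []
    for i, conf in enumerate(confidences):
        if conf < threshold:
            # queue the low-confidence frame plus its recent context
            selected.extend(j for j in buffer if j not in selected)
            selected.append(i)
        buffer.append(i)
    return selected

# a confidence dip at frame 2 pulls in frames 0-2, not the whole stream
print(select_events([0.9, 0.8, 0.1, 0.9]))
```

          The point of the sketch: upload volume scales with how often the model is surprised, not with total miles of video recorded.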

    • By thefounder 2026-02-0619:376 reply

      But somehow Google fails to execute. Gemini is useless for programming, and I don't even bother to use it as a chat app. Claude Code + GPT 5.2 xhigh for coding, and GPT as a chat app, are really the only ones that are worth it (price- and time-wise).

      • By coffeemug 2026-02-0619:403 reply

        I've recently switched to Claude for chat. GPT 5.2 feels very engagement-maxxed for me, like I'm reading a bad LinkedIn post. Claude does a tiny bit of this too, but an order of magnitude less in my experience. I never thought I'd switch from ChatGPT, but there is only so much "here's the brutal truth, it's not x it's y" I can take.

        • By thechao 2026-02-0619:491 reply

          GPT likes to argue, and most of its arguments are straw man arguments, usually conflating priors. It's ... exhausting; akin to arguing on the internet. (What am I even saying, here!?) Claude's a lot less of that. I don't know if tracks discussion/conversation better; but, for damn sure, it's got way less verbal diarrhea than GPT.

          • By mrlongroots 2026-02-0620:08

            Yes, GPT5-series thinking models are extremely pedantic and tedious. Any conversation with them is derailed because they start nitpicking something random.

            But Codex/5.2 was substantially more effective than Claude at debugging complex C++ bugs until around Fall, when I was writing a lot more code.

            I find Gemini 3 useless. It has regressed on hallucinations from Gemini 2.5, to the point where its output is no better than a random token stream despite all its benchmark outperformance. I would use Gemini 2.5 to help write papers and such; I can't seem to use Gemini 3 for anything. Gemini CLI is also very non-compliant and crazy.

        • By aschla 2026-02-0619:45

          Experiencing the same. It seems Anthropic’s human-focused design choices are becoming a differentiator.

        • By thefounder 2026-02-0620:47

          To me ChatGPT seems smarter and knows more. That’s why I use it. Even Claude rates gpt better for knowledge answers. Not sure if that itself is any indication. Claude seems superficial unless you hammer it to generate a good answer.

      • By unsupp0rted 2026-02-0621:16

        Gemini is by far the best UI/UX designer model. Codex seems to be the worst: it'll build something awkward and ugly, then Gemini will take 30-60 seconds to make it look like something that would have won a design award a couple years ago.

      • By henryfjordan 2026-02-0619:54

        Gemini works well enough in Search and in Meet. And it's baked into the products so it's dead simple to use.

        I don't think Google is targeting developers with their AI, they are targeting their product's users.

      • By noelsusman 2026-02-0620:02

        It is a bit mind boggling how behind they were considering they invented transformers and were also sitting on the best set of training data in the world, but they've caught up quite a bit. They still lag behind in coding, but I've found Gemini to be pretty good at more general knowledge tasks. Flash 3 in particular is much better than anything of comparable price and speed from OpenAI or Anthropic.

      • By ody4242 2026-02-0710:56

        Yesterday GPT 5.2 wrote a Python function for me that had the import in the middle of the code, for no reason. (It was a simple import of the requests module in a REST client...) Claude, I agree, is a lot better for backend; Gemini is very good for frontend.

  • By xnx 2026-02-0616:307 reply

    > The Waymo World Model can convert those kinds of videos, or any taken with a regular camera, into a multimodal simulation—showing how the Waymo Driver would see that exact scene.

    Subtle brag that Waymo could drive in camera-only mode if they chose to. They've stated as much previously, but that doesn't seem widely known.

    • By bonsai_spool 2026-02-0616:522 reply

      I think I'm misunderstanding - they're converting video into their representation which was bootstrapped with LIDAR, video and other sensors. I feel you're alluding to Tesla, but Tesla could never have this outcome since they never had a LIDAR phase.

      (edit - I'm referring to deployed Tesla vehicles, I don't know what their research fleet comprises, but other commenters explain that this fleet does collect LIDAR)

      • By smallmancontrov 2026-02-0617:042 reply

        They can and they do.

        https://youtu.be/LFh9GAzHg1c?t=872

        They've also built it into a full neural simulator.

        https://youtu.be/LFh9GAzHg1c?t=1063

        I think what we are seeing is that they both converged on the correct approach, one of them decided to talk about it, and it triggered disclosure all around since nobody wants to be seen as lagging.

        • By tfehring 2026-02-0618:001 reply

          I watched that video around both timestamps and didn't see or hear any mention of LIDAR, only of video.

          • By smallmancontrov 2026-02-0618:241 reply

            Exactly: they convert video into a world model representation suitable for 3D exploration and simulation without using LIDAR (except perhaps for scale calibration).

            • By tfehring 2026-02-0618:33

              My mistake - I misinterpreted your comment, but after re-reading more carefully, it's clear that the video confirms exactly what you said.

        • By IhateAI_3 2026-02-0618:51

          tesla is not impressive, I would never put my child in one

      • By yakz 2026-02-0616:541 reply

        Tesla does collect LIDAR data (people have seen them doing it, it's just not on all of the cars) and they do generate depth maps from sensor data, but from the examples I've seen it is much lower resolution than these Waymo examples.

        • By justapassenger 2026-02-0616:571 reply

          Tesla does it to map the areas to come up with high def maps for areas where their cars try to operate.

          • By vardump 2026-02-0617:11

            Tesla uses lidar to train their models to generate depth data out of camera input. I don’t think they have any high definition maps.

    • By ActorNightly 2026-02-0617:115 reply

      The purpose of lidar is to provide error correction when you need it most, i.e. when camera accuracy degrades.

      Humans do this, just in the sense of depth perception with both eyes.

      • By robotresearcher 2026-02-0619:563 reply

        Human depth perception uses stereo out to only about 2 or 3 meters, after which the distance between your eyes is not a useful baseline. Beyond 3m we use context clues and depth from motion when available.
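        The geometry behind this can be sketched as a back-of-the-envelope calculation; the baseline (~6.5 cm) and stereoacuity (~20 arcsec) below are typical textbook values, not measurements:

```python
import math

def stereo_depth_resolution(z, baseline=0.065, stereoacuity_arcsec=20.0):
    """Smallest resolvable depth step at distance z (metres) from
    binocular disparity: dz ~= z**2 * dtheta / B, where B is the
    eye baseline and dtheta the smallest detectable disparity change."""
    dtheta = math.radians(stereoacuity_arcsec / 3600.0)
    return z**2 * dtheta / baseline

for z in (0.5, 3.0, 10.0, 100.0):
    print(f"at {z:5.1f} m: resolvable depth step ~ {stereo_depth_resolution(z):.3f} m")
```

        Because the error grows with the square of the distance, stereo contributes little at road distances: at 100 m the resolvable step is already on the order of 15 m, which is why context and motion cues dominate.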

        • By ActorNightly 2026-02-105:39

          That's going off just the focal point.

          We do a lot more internal image processing. For example, relative motion as seen by either eye helps improve accuracy by a whole lot, in the "medium" distance range.

        • By aylons 2026-02-0620:14

          Thanks, saved some work.

          And I'll add that in practice it is not even that much, unless you're doing some serious training, like a professional athlete. For most tasks, accurate depth perception from stereo fades around arm's length.

        • By cyanydeez 2026-02-0620:252 reply

          OK, but a car is a few meters wide; isn't that enough baseline for driving depth perception similar to humans?

          • By robotresearcher 2026-02-0620:361 reply

            The depths you are trying to estimate are to the other cars, people, turnings, obstacles, etc. Could be 100m away or more on the highway.

            • By cyanydeez 2026-02-0622:332 reply

              OK, but the point being made is based on humans' depth perception, while a car's baseline is limited by the width of the vehicle, so there's missing information if you're trying to figure out whether a car can use cameras to do what human eyes/brains do.

              • By acomjean 2026-02-075:111 reply

                Humans are very good at processing the images that come into our brain. Each eye has a “blind spot” but we don’t notice. Our eyes adjust color (fluorescent lights are weird) and the amount of light coming in. When we look through a screen door or rain and just ignore it, or if you look outside a moving vehicle to the side you can ignore the foreground.

                If you increase the distance of stereo cameras you probably can increase depth perception.

                But a lidar or radar sensor is just sensing distance.

                • By robotresearcher 2026-02-076:091 reply

                  Radar has a cool property that it can sense the relative velocity of objects along the beam axis too, from Doppler frequency shifting. It’s one sense that cars have that humans don’t.
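                  The relationship is v = Δf · c / (2 · f0); a quick sketch assuming a 77 GHz carrier (a common automotive radar band, not stated in the comment):

```python
C = 299_792_458.0  # speed of light, m/s

def radial_velocity(doppler_shift_hz, carrier_hz=77e9):
    """Relative velocity along the beam from the two-way Doppler
    shift: v = df * c / (2 * f0)."""
    return doppler_shift_hz * C / (2.0 * carrier_hz)

# a ~15.4 kHz shift at 77 GHz corresponds to roughly 30 m/s closing speed
print(round(radial_velocity(15_400), 1))
```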

                  • By disillusioned 2026-02-077:47

                    To this point, one of the coolest features Teslas _used_ to have was the ability for it to determine and integrate the speed of the car in front of you AND the speed of the car in front of THAT car, even if the second car was entirely visually occluded. They did this by bouncing the radar beam under the car in front and determining that there were multiple targets. It could even act on this: I had my car AEB when the second ahead car slammed on THEIR brakes before the car ahead even reacted. Absolutely wild. Completely gone in vision-only.

              • By robotresearcher 2026-02-071:491 reply

                The width of your own vehicle is (pretty much) a constant, and trivial to know. Ford F150 is ~79.9 inches. Done. No sensors needed.

                All the shit out there in the world is another story.

                • By cyanydeez 2026-02-0711:38

                  You misunderstood the assignment.

                  Write a sonnet about Elon Musk.

          • By chippiewill 2026-02-0710:51

            The company I used to work for was developing a self driving car with stereo depth on a wide baseline.

            It's not all sunshine and roses, to be honest - it was one of the weakest links in the perception system. The video had to run at way higher resolutions than it otherwise would, and it was incredibly sensitive to calibration accuracy.
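
            The baseline/resolution trade-off falls out of the standard rectified-stereo formula; a quick sketch with illustrative numbers (not any particular system's specs):

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Rectified stereo pair: depth Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

def depth_error_m(focal_px: float, baseline_m: float, depth_m: float,
                  disparity_err_px: float = 0.25) -> float:
    """Differentiating Z = f*B/d shows depth error grows with the *square*
    of depth for a fixed disparity error: dZ ~ Z^2 * dd / (f * B).
    This is why stereo rigs demand high resolution and tight calibration."""
    return depth_m ** 2 * disparity_err_px / (focal_px * baseline_m)

# Eye-like baseline (~6.5 cm) vs. a car-width baseline (~1.5 m), at 50 m:
print(depth_error_m(1000, 0.065, 50))  # ≈ 9.6 m of depth uncertainty
print(depth_error_m(1000, 1.5, 50))    # ≈ 0.42 m
```

A wider baseline buys accuracy at range, but the same formula shows why quarter-pixel calibration drift translates directly into metres of depth error.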

      • By dbt00 2026-02-0617:175 reply

        (Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance, which is why so many people get simulator sickness from stereoscopic 3d VR)

        • By wolrah 2026-02-0618:351 reply

          > Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance

          Also subtle head and eye movements, which is something a lot of people like to ignore when discussing camera-based autonomy. Your eyes are always moving around which changes the perspective and gives a much better view of depth as we observe parallax effects. If you need a better view in a given direction you can turn or move your head. Fixed cameras mounted to a car's windshield can't do either of those things, so you need many more of them at higher resolutions to even come close to the amount of data the human eye can gather.

          • By disillusioned 2026-02-077:511 reply

            Easiest example I always give of this is pulling out of the alley behind my house: there is a large bush that occludes my view left to oncoming traffic, badly. I do what every human does:

            1. Crane my neck forward, see if I can see around it.

            2. Inch forward a bit more, keep craning my neck.

            3. Recognize, no, I'm still occluded.

            4. Count on the heuristic analysis of the light filtering through the bush and determine if the change in light is likely movement associated with an oncoming car.

            My Tesla's perpendicular camera is... mounted behind my head on the B-pillar... fixed... and sure as hell can't read the tea leaves, so to speak, to determine if that slight shadow change increases the likelihood that a car is about to hit us.

            I honestly don't trust it to pull out of the alley. I don't know how I can. I'd basically have to be nose-into-right-lane for it to be far enough ahead to see conclusively.

            Waymo can beam the LIDAR above and around the bush, owing to its height and the distance it can receive from, and its camera coverage to the perpendicular is far better. Vision only misses so many weird edge cases, and I hate that Elon just keeps saying "well, humans have only TWO cameras and THEY drive fine every day! h'yuck!"

            • By wolrah 2026-02-083:32

              > owing to its height and the distance it can receive from,

              And, importantly, the fender-mount LIDARs. It doesn't just have the one on the roof, it has one on each corner too.

              I first took a Waymo as a curiosity on a recent SF trip, just a few blocks from my hotel east on Lombard to Hyde and over to the Buena Vista to try it out, and I was immediately impressed when we pulled up the hill to Larkin and it saw a pedestrian that was out of view behind a building from my perspective. Those real-time displays went a long way to allowing me to quickly trust that the vehicle's systems were aware of what's going on around it and the relevant traffic signals. Plenty of sensors plus a detailed map of a specific environment work well.

              Compare that to my Ioniq5 which combines one camera with a radar and a few ultrasonic sensors and thinks a semi truck is a series of cars constantly merging into each other. I trust it to hold a lane on the highway and not much else, which is basically what they sell it as being able to do. I haven't seen anything that would make me trust a Tesla any further than my own car and yet they sell it as if it is on the verge of being able to drive you anywhere you want on its own.

        • By FrojoS 2026-02-0623:39

          In fact there are even more depth-perception cues. Maybe the most obvious is size (retinal versus assumed real-world size). Further examples include motion parallax, linear perspective, occlusion, shadows, and light gradients.

          Here is a study on how these cues rank when it comes to (hand) reaching tasks in VR: https://pubmed.ncbi.nlm.nih.gov/29293512/

        • By kevindamm 2026-02-0618:19

          Actually the reason people experience vection in VR is not focal depth but the dissonance between what their eyes are telling them and what their inner ear and tactile senses are telling them.

          It's possible they get headaches from the focal length issues but that's different.

        • By CobrastanJorji 2026-02-0619:51

          I keep wondering about the focal depth problem. It feels potentially solvable, but I have no idea how. I keep wondering if it could be as simple as a Magic Eye Autostereogram sort of thing, but I don't think that's it.

          There have been a few attempts at solving this, but I assume that for some optical reason actual lenses need to be adjusted and it can't just be a change in the image? Meta had "Varifocal HMDs" being shown off for a bit, which I think literally moved the screen back and forth. There were a couple of "Multifocal" attempts with multiple stacked displays, but that seemed crazy. Computer Generated Holography sounded very promising, but I don't know if a good one has ever been built. A startup called Creal claimed to be able to use "digital light fields", which basically project stuff right onto the retina, which sounds kinda hogwashy to me but maybe it works?

        • By mikepurvis 2026-02-0618:11

          My understanding is that contextual cues are a big part of it too. We see the pitcher wind up and throw a baseball at us more than we stereoscopically track its progress from the mound to the plate.

          More subtly, a lot of depth information comes from how big we expect things to be, since everyday life is full of things we intuitively know the sizes of: frames of reference in the form of people, vehicles, furniture, etc. This is why the forced perspective of theme park castles is so effective - our brains want to see those upper windows as full sized, so we see the thing as 2-3x bigger than it actually is. And in the other direction, a lot of buildings in Las Vegas are further away than they look, because hotels like the Bellagio have large black boxes on them that group a 2x2 block of the actual room windows.

      • By SecretDreams 2026-02-0617:172 reply

        > Humans do this, just in the sense of depth perception with both eyes.

        Humans do this with vibes and instincts, not just depth perception. When I can't see the lines on the road because there's too much snow, I can still interpret where they would be based on my familiarity with the roads and my implicit knowledge of how roads work. We do similar things for heavy rain or fog, although sometimes those situations truly necessitate pulling over, or slowing down and turning on your four-ways - lidar might genuinely give an advantage there.

        • By pookeh 2026-02-0617:211 reply

          That’s the purpose of the neural networks

          • By array_key_first 2026-02-0618:14

            Yes and no - vibes and instincts isn't just thought, it's real senses. Humans have a lot of senses; dozens of them. Including balance, pain, sense of passage of time, and body orientation. Not all of these senses are represented in autonomous vehicles, and it's not really clear how the brain mashes together all these senses to make decisions.

      • By pants2 2026-02-0617:22

        Another way humans perceive depth is by moving our heads and perceiving parallax.

      • By menaerus 2026-02-0617:234 reply

        How expensive is their lidar system?

        • By hangonhn 2026-02-0617:351 reply

          Hesai has driven the cost into the $200 to $400 range now. That said, I don't know what the ones needed for driving cost. Either way, we've gone from thousands or tens of thousands of dollars into the hundreds.

          • By bragr 2026-02-0618:082 reply

            Looking at prices, I think you are wrong and automotive Lidar is still in the 4 to 5 figure range. HESAI might ship Lidar units that cheap, but automotive grade still seems quite expensive: https://www.cratustech.com/shop/lidar/

            • By tzs 2026-02-0619:191 reply

              Those are single unit prices. The AT128 for instance, which is listed at $6250 there and widely used by several Chinese car companies was around $900 per unit in high volume and over time they lowered that to around $400.

              The next generation of that, the ATX, is the one they have said would be half that cost. According to regulator filings in China BYD will be using this on entry level $10k cars.

              Hesai got the price down for their new generation by several optimizations. They are using their own designs for lasers, receivers, and driver chips which reduced component counts and material costs. They have stepped up production to 1.5 million units a year giving them mass production efficiencies.

              • By bragr 2026-02-0620:02

                That model only has a 120 degree field of view so you'd need 3-4 of them per car (plus others for blind spots, they sell units for that too). That puts the total system cost in the low thousands, not the 200 to 400 stated by GP. I'm not saying it hasn't gotten cheaper or won't keep getting cheaper, it just doesn't seem that cheap yet.

            • By jellojello 2026-02-0618:36

              [dead]

        • By jmux 2026-02-0618:002 reply

          Waymo does their LiDAR in-house, so unfortunately we don’t know the specs or the cost

          • By ra7 2026-02-0619:01

            We know Waymo reduced their LiDAR price from $75,000 to ~$7500 back in 2017 when they started designing them in-house: https://arstechnica.com/cars/2017/01/googles-waymo-invests-i...

            That was 2 generations of hardware ago (4th gen Chrysler Pacificas). They are about to introduce 6th gen hardware. It's a safe bet that it's much cheaper now, given how mass produced LiDARs cost ~$200.

          • By nerdsniper 2026-02-0618:20

            Otto and Uber and the CEO of https://pronto.ai do though (tongue-in-cheek)

            > Then, in December 2016, Waymo received evidence suggesting that Otto and Uber were actually using Waymo’s trade secrets and patented LiDAR designs. On December 13, Waymo received an email from one of its LiDAR-component vendors. The email, which a Waymo employee was copied on, was titled OTTO FILES and its recipients included an email alias indicating that the thread was a discussion among members of the vendor’s “Uber” team. Attached to the email was a machine drawing of what purported to be an Otto circuit board (the “Replicated Board”) that bore a striking resemblance to – and shared several unique characteristics with – Waymo’s highly confidential current-generation LiDAR circuit board, the design of which had been downloaded by Mr. Levandowski before his resignation.

            The presiding judge, Alsup, said, "this is the biggest trade secret crime I have ever seen. This was not small. This was massive in scale."

            (Pronto connection: Levandowski got pardoned by Trump and is CEO of Pronto autonomous vehicles.)

            https://arstechnica.com/tech-policy/2017/02/waymo-googles-se...

        • By eptcyka 2026-02-0617:26

          Less than the lives it saves.

        • By xnx 2026-02-0617:351 reply

          Cheaper every year.

          • By hijnksforall 2026-02-0618:17

            Exactly.

            Tesla told us their strategy was vertical integration and scale to drive down all input costs in manufacturing these vehicles...

            ...oh, except lidar, that's going to be expensive forever, for some reason?

    • By shihab 2026-02-0617:04

      I think there are two steps here: converting video to sensor-data input, and using that sensor data to drive. Only the second step will be handled by cars on the road; the first one is purely for training.

    • By mycall 2026-02-0617:272 reply

      That is still important for safety reasons in case someone uses a LiDAR jamming system to try to force you into an accident.

      • By etrautmann 2026-02-0617:311 reply

        It’s way easier to “jam” a camera with bright light than a lidar, which uses both narrow-band optical filters and pulsed signals with filters to detect that temporal sequence. If I were an adversary, going after cameras would be way, way easier.

        • By sroussey 2026-02-0618:24

          Oh yeah, point a q-beam at a Tesla at night, lol. Blindness!

      • By Jyaif 2026-02-0617:30

        If somebody wants to hurt you while you are traveling in a car, there are simpler ways.

    • By sschueller 2026-02-0619:201 reply

      Autonomous cars need to be significantly better than humans to be fully accepted especially when an accident does happen. Hence limiting yourself to only cameras is futile.

      • By mavhc 2026-02-0719:051 reply

        Surely as soon as they're safer than humans they should be deployed as fast as possible to save some of the 3000 people who are killed by human drivers every day

        • By cman1444 2026-02-0719:17

          Of course they should be, but that's not what will happen. Humans are not rational, so self-driving cars must be significantly safer than human drivers to avoid as much political pushback as possible.

    • By dooglius 2026-02-0618:21

      They may be trying to suggest that, but that claim does not follow from the quoted statement.

    • By uejfiweun 2026-02-0616:544 reply

      I've always wondered... if Lidar + Cameras is always making the right decision, you should theoretically be able to take the output of the Lidar + Cameras model and use it as training data for a Camera only model.

      • By olex 2026-02-0616:591 reply

        That's exactly what Tesla is doing with their validation vehicles, the ones with Lidar towers on top. They establish the "ground truth" from Lidar and use that to train and/or test the vision model. Presumably more "test", since they've most often been seen in Robotaxi service expansion areas shortly before fleet deployment.

        • By bob_theslob646 2026-02-0617:022 reply

          Is that exactly true though? Can you give a reference for that?

          • By olex 2026-02-0617:061 reply

            I don't have a specific source, no. I think it was mentioned in one of their presentations a few years back that they use various techniques to establish "ground truth" for vision training; among them were time series (depth change over time should be continuous, etc.) and, iirc, "external" sources of depth data like LiDAR. And their validation cars equipped with LiDAR towers are definitely being seen everywhere they are rolling out their Robotaxi services.

            • By senordevnyc 2026-02-0620:08

              > are definitely being seen everywhere they are rolling out their Robotaxi services

              So...nowhere?

      • By __alexs 2026-02-0617:051 reply

        > you should theoretically be able to take the output of the Lidar + Cameras model and use it as training data for a Camera only model.

        Why should you be able to do that exactly? Human vision is frequently tricked by its lack of depth data.

        • By scarmig 2026-02-0617:10

          "Exactly" is impossible: there are multiple Lidar samples that would map to the same camera sample. But what training would do is build a model that could infer the most likely Lidar representation from a camera representation. There would still be cases where the most likely Lidar for a camera input isn't a useful/good representation of reality, e.g. a scene with very high dynamic range.

      • By dbcurtis 2026-02-0618:251 reply

        No, I don't think that will be successful. Consider a day where the temperature and humidity are just right to make tailpipe exhaust form dense fog clouds. That will be opaque or nearly so to a camera, transparent to a radar, and I would assume something in between to a lidar. Multi-modal sensor fusion is always going to be more reliable at classifying some kinds of challenging scene segments. It doesn't take long to imagine many other scenarios where fusing the returns of multiple sensors is going to greatly increase classification accuracy.

        • By rogerrogerr 2026-02-075:171 reply

          The goal is not to drive in all conditions; it is to drive in all drivable conditions. Human eyeballs also cannot see through dense fog clouds. Operating in these environments is extra credit with marginal utility in real life.

          • By fcantournet 2026-02-0715:391 reply

            But humans react to this extremely differently than a self-driving car. Humans take responsibility, while the self-driving car disengages and says: WELP. Oh, sorry, were you "enjoying your travel time to do something useful," as we very explicitly marketed? Well, now your wife is dead and it's your fault (legally). Kisses, Elon.

            • By rogerrogerr 2026-02-100:34

              There’s nothing about the human reaction to a cloud of fog that can’t be replicated.

      • By etrautmann 2026-02-0617:321 reply

        Sure, but those models would never have online access to information only provided in lidar data…

        • By tfehring 2026-02-0618:28

          No, but if you run a shadow or offline camera-only model in parallel with a camera + LIDAR model, you can (1) measure how much worse the camera-only model is so you can decide when (if ever) it's safe enough to stop installing LIDAR, and (2) look at the specific inputs for which the models diverge and focus on improving the camera-only model in those situations.
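
          A minimal sketch of that shadow-mode mining loop, with entirely hypothetical model interfaces and an arbitrary divergence threshold:

```python
from dataclasses import dataclass

@dataclass
class Plan:
    steer: float  # commanded steering, radians
    brake: float  # commanded braking, 0..1

def divergence(primary: Plan, shadow: Plan) -> float:
    """Scalar disagreement between the camera+LiDAR plan and the
    camera-only shadow plan. Large values flag frames worth mining."""
    return abs(primary.steer - shadow.steer) + abs(primary.brake - shadow.brake)

def mine_hard_frames(frames, primary_model, shadow_model, threshold=0.2):
    """Collect the inputs where the camera-only model diverges most,
    so training effort can focus on exactly those situations."""
    scored = [(divergence(primary_model(f), shadow_model(f)), f) for f in frames]
    hard = [(d, f) for d, f in scored if d > threshold]
    return sorted(hard, key=lambda t: t[0], reverse=True)
```

Aggregate divergence over a large fleet log answers (1) - how much worse is camera-only overall - while the sorted tail answers (2).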

  • By yummypaint 2026-02-0622:039 reply

    By leveraging Genie’s immense world knowledge, it can simulate exceedingly rare events—from a tornado to a casual encounter with an elephant—that are almost impossible to capture at scale in reality. The model’s architecture offers high controllability, allowing our engineers to modify simulations with simple language prompts, driving inputs, and scene layouts. Notably, the Waymo World Model generates high-fidelity, multi-sensor outputs that include both camera and lidar data.

    How do you know the generated outputs are correct? Especially for unusual circumstances?

    Say the scenario is a patch of road is densely covered with 5 mm ball bearings. I'm sure the model will happily spit out numbers, but are they reasonable? How do we know they are reasonable? Even if the prediction is ok, how do we fundamentally know that the prediction for 4 mm ball bearings won't be completely wrong?

    There seems to be a lot of critical information missing.

    • By IMTDb 2026-02-0622:281 reply

      The idea is that, over time, the quality and accuracy of world-model outputs will improve. That, in turn, lets autonomous driving systems train on a large amount of “realistic enough” synthetic data.

      For example, we know from experience that Waymo is currently good enough to drive in San Francisco. We don’t yet trust it in more complex environments like dense European cities or Southeast Asian “hell roads.” Running the stack against world models can give a big head start in understanding what works, and which situations are harder, without putting any humans in harm’s way.

      We don’t need perfect accuracy from the world model to get real value. And, as usual, the more we use and validate these models, the more we can improve them; creating a virtuous cycle.

      • By tantalor 2026-02-074:391 reply

        It's the Pareto principle.

        You can get 80% of the way to "perfect" with 20% of the effort.

        • By dyauspitr 2026-02-075:191 reply

          That’s just a platitude at this point. They have, for all intents and purposes, solved the problem, at least in the US.

    • By jayd16 2026-02-0623:39

      I don't think you say "ok now the car is ball bearing proof."

      Think of it more like unit tests. "In this synthetic scenario does the car stop as expected, does it continue as expected." You might hit some false negatives but there isn't a downside to that.

      If it turns out your model has a blind spot for albino cows in a snow storm eating marshmallows, you might be able to catch that synthetically and spend some extra effort to prevent it.

    • By hnburnsy 2026-02-074:211 reply

      Looks like they need to add blackouts and parades to that simulator...

      https://www.yahoo.com/news/articles/waymo-paralyzed-parade-b...

      • By disillusioned 2026-02-077:581 reply

        The blackouts circumstance was because they escalate blinking/out-of-service traffic lights to a human-confirmed decision, and they experienced a spike in those requests that bottlenecked given how lightly they were staffed. The Waymo itself was fine and was prepared to make the correct decision; it just needed a human in the loop.

        In the video from the parade... there's just... people in the road. Like, a lot of small children and actual people on this tiny, super narrow bridge. I think erring on the side of "I don't think I can make it" rather than "accidentally drag a small child" is probably the right call, though admittedly these cases are a bit wonky.

        • By sznio 2026-02-0720:281 reply

          >The blackouts circumstance was because they escalate blinking/out of service traffic lights to a human confirmed decision

          Which isn't really a scalable solution. In my city the majority of streetlights switch to blinking yellow at night, with priority/yield signs instead. I can't imagine a human having to approve 10 of these on any route.

          • By xnx 2026-02-0816:08

            From their blog post they give the sense that they had the human review "just to be safe", but didn't anticipate this scenario. They've probably adjusted that manual review rule and will let the cars do what they would've done anyway without waiting for manual review/approval.

    • By joshfee 2026-02-0622:11

      Isn't that true for any scenario previously unencountered, whether it is a digital simulation or a human? We can't optimize for the best possible outcome in reality (since we can't predict the future), but we can optimize for making the best decisions given our knowledge of the world (even if it is imperfect).

      In other words it is a gradient from "my current prediction" to "best prediction given my imperfect knowledge" to "best prediction with perfect knowledge", and you can improve the outcome by shrinking the gap between 1&2 or shrinking the gap between 2&3 (or both)

    • By notatoad 2026-02-072:12

      seems like the obvious answer to that is you cover a patch of road with 5mm ball bearings, and send a waymo to drive across it. if the ball bearings behave the way the simulation says they would, and the car behaves the way the simulation said it would, then you've validated your simulation.

      do that for enough different scenarios, and if the model is consistently accurate across every scenario you validate, then you can start believing that it will also be accurate for the scenarios you haven't (and can't) validate.
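
      that validation loop can be sketched as replaying real scenarios through the simulator and checking that the traces agree (all names and tolerances here are hypothetical, just to make the idea concrete):

```python
def trace_error(real_trace, sim_trace):
    """Mean absolute difference between real and simulated measurements
    sampled at the same timestamps (a stand-in for a real fidelity metric)."""
    assert len(real_trace) == len(sim_trace)
    return sum(abs(r - s) for r, s in zip(real_trace, sim_trace)) / len(real_trace)

def simulator_validated(scenarios, tolerance=0.05):
    """Trust is earned scenario by scenario: accept the simulator only if
    every replayed real-world test stays within tolerance."""
    return all(trace_error(real, sim) <= tolerance for real, sim in scenarios)
```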

    • By fooker 2026-02-0622:10

      > from a tornado to a casual encounter with an elephant

      A Sims-style game with this technology would be pretty nice!

    • By ses1984 2026-02-0622:251 reply

      You could train it in simulation and then test it in reality.

    • By YeGoblynQueenne 2026-02-0718:531 reply

      >> How do you know the generated outputs are correct? Especially for unusual circumstances?

      You know the outputs are correct because the models have many billions of parameters and were trained on many years of video on many hectares of server farms. Of course they'll generate correct outputs!

      I mean that's literally the justification. There aren't even any benchmarks that you can beat with video generation, not even any bollocks ones like for LLMs.

    • By aaaalone 2026-02-0622:061 reply

      They probably just look at the results of the generation.

      I mean, would I like an in-depth tour of this? Yes.

      But it's a marketing blog article, what do you expect?

      • By parliament32 2026-02-0622:132 reply

        > just look at the results of the generation

        And? The entire hallucination problem with text generators is "plausible sounding yet incorrect", so how does a human eyeballing it help at all?

        • By inkysigma 2026-02-0622:32

          I think because here there's no single correct answer, the model is allowed to be fuzzier. You still mix in real training data, and maybe more physics-based simulation of course, but it does seem acceptable to synthesize extreme tail evaluations, since there isn't really a "better" way by definition, and you can evaluate the end driving behavior after training.

          You can also probably still use it for some kinds of evaluation, since presumably you can detect whether two point clouds intersect.

          In much the same way, LLMs are not perfect at translation but are widely used for NMT anyway.

        • By aaaalone 2026-02-0719:37

          You should be able to see if it was generated wrong once you watch a car driving in it.

          I can spot hallucinations in LLM output too.

HackerNews