Uber has more than twenty autonomous vehicle partners, and they all want one thing: data. The company says it will make that data available through a new division called Uber AV Labs. Despite the name, Uber is not returning to developing its own robotaxis, an effort it halted after one of its test vehicles killed a pedestrian in 2018 and sold off entirely in 2020. Instead, it will send its own sensor-equipped cars into cities to collect data for partners like Waymo, Waabi, and Lucid Motors, though no contracts have been signed yet.
Broadly speaking, self-driving systems are shifting away from rules-based operation and toward reinforcement learning. As that happens, real-world driving data has become enormously valuable for training them. Uber notes that the autonomous vehicle companies most interested in this data are the ones already collecting a lot of it themselves, which suggests that, like many frontier AI labs, they have realized that solving the most extreme edge cases is a volume game.
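To make that distinction concrete, here is a toy sketch in Python. Everything in it is invented for illustration rather than drawn from Uber or any partner: the rules-based planner can only act on conditions its engineers wrote down, while the learned one can only handle what its data happened to cover, which is why fleet size matters.

```python
# Purely illustrative: these classes are hypothetical and do not come from
# Uber or any partner's stack.

class RuleBasedPlanner:
    """Hand-written rules: behavior is fixed by engineers in advance."""

    def act(self, obstacle_distance_m: float) -> str:
        if obstacle_distance_m < 10.0:
            return "brake"
        return "cruise"


class LearnedPlanner:
    """Toy stand-in for a learned policy (imitation or reinforcement
    learning): behavior comes from logged driving data, so more data
    means coverage of more edge cases."""

    def __init__(self) -> None:
        self.examples: list[tuple[float, str]] = []

    def fit(self, logged_drives: list[tuple[float, str]]) -> None:
        self.examples = logged_drives

    def act(self, obstacle_distance_m: float) -> str:
        # Nearest-neighbor lookup as a crude proxy for a trained model.
        nearest = min(self.examples,
                      key=lambda ex: abs(ex[0] - obstacle_distance_m))
        return nearest[1]


planner = LearnedPlanner()
planner.fit([(5.0, "brake"), (12.0, "slow"), (50.0, "cruise")])
print(planner.act(11.0))  # "slow" -- known only because the data covered it
```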
Right now, the size of an autonomous vehicle company's fleet places a hard limit on how much data it can collect. While many companies build simulations to hedge against edge cases, nothing beats driving on actual roads for discovering the strange and unexpected scenarios that occur. Waymo illustrates the gap: the company has operated autonomous vehicles for a decade, yet its robotaxis were recently caught illegally passing stopped school buses.
Having access to a larger pool of driving data could help robotaxi companies solve such problems as they arise, according to Uber's chief technology officer, Praveen Neppalli Naga. And Uber will not be charging for the data, at least not yet. The primary goal is to democratize it: the value of advancing partners' technology, the thinking goes, is bigger than any immediate revenue. Uber's VP of engineering, Danny Guo, said the lab must build the basic data foundation before determining product-market fit, and that he believes Uber is uniquely positioned to accelerate the entire industry.
The new AV Labs division is starting small. It currently has just one car, a Hyundai Ioniq 5, and the team is still physically attaching sensors: lidars, radars, and cameras. Guo acknowledged the scrappy beginnings, noting it will take time to deploy a hundred data-collection cars, but the prototype exists. Partners will not receive raw data; once the fleet is running, the division will process it to fit each partner's needs. This "semantic understanding" layer is what driving software from companies like Waymo would consume to improve real-time path planning.
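Uber has not published what that processed output looks like, but conceptually the layer turns raw sensor readings into labeled objects a planner can act on. A minimal sketch, in which every field and function name is an assumption made for illustration:

```python
# Hypothetical schema: Uber has not described the format of its
# "semantic understanding" output.

from dataclasses import dataclass


@dataclass
class SemanticFrame:
    """One processed frame: raw lidar/radar/camera readings reduced to
    a labeled object a partner's planner can consume directly."""
    timestamp_s: float
    object_class: str         # e.g. "pedestrian", "stopped_school_bus"
    distance_m: float
    relative_speed_mps: float


def to_semantic(raw_detection: dict) -> SemanticFrame:
    """Stand-in for the processing pipeline: raw sensor output in,
    planner-ready labels out."""
    return SemanticFrame(
        timestamp_s=raw_detection["t"],
        object_class=raw_detection["label"],
        distance_m=raw_detection["range_m"],
        relative_speed_mps=raw_detection["doppler_mps"],
    )


frame = to_semantic({"t": 1712.4, "label": "stopped_school_bus",
                     "range_m": 38.2, "doppler_mps": -0.1})
print(frame.object_class, frame.distance_m)
```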
There will likely be an intermediate step in which Uber plugs a partner's driving software into its AV Labs cars to run in "shadow mode." Any time the human driver acts differently from what the software would have done, Uber will flag the moment for the partner. This helps surface shortcomings in the software and train models to drive more like a human.
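A minimal sketch of how that flagging could work, assuming a simple per-timestep comparison; the actual mechanism has not been made public:

```python
# Hypothetical shadow-mode comparison: the software "drives" alongside the
# human without controlling the car, and disagreements become flags.

def flag_disagreements(timestamps, human_actions, shadow_actions):
    """Return the timestamps where the shadowed software chose a different
    action than the human driver, i.e. candidate training examples."""
    flags = []
    for t, human, shadow in zip(timestamps, human_actions, shadow_actions):
        if human != shadow:
            flags.append({"t": t, "human": human, "software": shadow})
    return flags


# Example: the software fails to brake where the human did.
events = flag_disagreements(
    timestamps=[0.0, 0.1, 0.2],
    human_actions=["cruise", "brake", "brake"],
    shadow_actions=["cruise", "cruise", "brake"],
)
print(events)  # [{'t': 0.1, 'human': 'brake', 'software': 'cruise'}]
```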
This approach is essentially what Tesla has done to train its autonomous software over the past decade, though Uber lacks Tesla's scale of millions of customer cars. Uber is unbothered: it expects to do more targeted data collection, driven by partner needs, across the six hundred cities where it operates. Naga expects the division to grow to a few hundred people within a year. And while he sees a future in which Uber's entire ride-hail fleet collects training data, the division has to start somewhere. From conversations with partners, Guo says the message is clear: give us anything helpful, because the amount of data Uber can collect dwarfs what any of them can gather on their own.

