Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Zain Asgar, a Stanford adjunct professor and a founder with a successful exit behind him, has raised an eighty million dollar Series A for his startup, Gimlet Labs. The round was led by Menlo Ventures. The company has built what it claims is the first and only multi-silicon inference cloud: software that lets a single AI workload run simultaneously across diverse types of hardware. It can split an AI application’s work across traditional CPUs, AI-tuned GPUs, and high-memory systems.

Asgar explained that a single AI agent may chain together multiple steps, and each step requires different hardware. Inference is compute-bound, decode is memory-bound, and tool calls are network-bound. No single chip does it all yet. As new hardware gets rolled out and aging GPUs get redeployed, the multi-silicon fleet is ready but missing the software layer to make it work. That is what Gimlet Labs offers.
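To make the idea concrete, here is a minimal sketch of matching each step of an agent to a hardware class by its bottleneck. It is not Gimlet Labs' software. The step names and bottleneck labels come from Asgar's description above, while the `Step` type, the `assign_hardware` function, and the mapping from bottleneck to hardware pool are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from a step's bottleneck to a hardware class.
# The bottleneck categories follow the article; the choice of pool for
# each one is an illustrative assumption, not Gimlet Labs' actual policy.
HARDWARE_FOR_BOTTLENECK = {
    "compute": "gpu",
    "memory": "high_memory_system",
    "network": "cpu",
}


@dataclass
class Step:
    name: str
    bottleneck: str  # one of the keys in HARDWARE_FOR_BOTTLENECK


def assign_hardware(steps):
    """Return (step name, hardware class) pairs, one per step, in order."""
    return [(s.name, HARDWARE_FOR_BOTTLENECK[s.bottleneck]) for s in steps]


# One agent request, chained as the article describes it.
agent_steps = [
    Step("inference", "compute"),
    Step("decode", "memory"),
    Step("tool_call", "network"),
]

plan = assign_hardware(agent_steps)
for name, hardware in plan:
    print(f"{name} -> {hardware}")
```

A real orchestrator would also have to weigh data movement between chips, queueing, and cost, none of which this sketch attempts.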

If the current trend of deploying more compute continues, data center spending is estimated to reach nearly seven trillion dollars by 2030. Asgar states that applications are only using existing deployed hardware between fifteen and thirty percent of the time. He frames this as wasting hundreds of billions of dollars on idle resources. The goal of Gimlet Labs is to figure out how to get AI workloads to be ten times more efficient than they are today.
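The utilization figures translate directly into idle time. As a quick worked example, assuming idle time is simply the complement of the utilization Asgar cites (the function name `idle_share_pct` is illustrative):

```python
# Utilization range quoted in the article, in whole percent.
utilization_pct = (15, 30)


def idle_share_pct(util_pct: int) -> int:
    """Share of time hardware sits idle, assuming idle = 100% - utilization."""
    return 100 - util_pct


# Sort so the range reads from the lower bound to the upper bound.
idle_range = sorted(idle_share_pct(u) for u in utilization_pct)
print(f"Idle share: {idle_range[0]}% to {idle_range[1]}%")  # Idle share: 70% to 85%
```

In other words, by these figures deployed hardware sits unused between seventy and eighty-five percent of the time.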

Asgar and his cofounders, Michelle Nguyen, Omid Azizi, and Natalie Serrino, built orchestration software that slices up agentic workloads so they can be spread across all kinds of hardware simultaneously. Gimlet Labs claims it reliably speeds AI inference up by three to ten times for the same cost and power. It can even slice the underlying model so it runs across different architectures, using the best chip for each portion.

The company has partnered with chip makers NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix. Its product, delivered as software or through an API to its Gimlet Cloud, is targeted at the largest AI model labs and data centers, not the average AI app developer. The company publicly launched in October, stating it had eight-figure revenues from the start. Asgar said his customer base has more than doubled in the last four months and now includes a major model maker and an extremely large cloud computing company.

The cofounders previously worked together at Pixie, a startup that created an open source observability tool for Kubernetes. Pixie was acquired by New Relic in 2020. After Asgar ran into Menlo’s Tim Tully by chance about a year ago and received angel investments from Stanford professors, venture capitalists started calling. After the launch, a term sheet landed, and once investors heard Asgar was weighing offers, funding poured in and the round was quickly oversubscribed.

Including a previous seed round, the startup has now raised a total of ninety-two million dollars. Investors include a slew of angels like Sequoia’s Bill Coughran, Stanford Professor Nick McKeown, former VMware CEO Raghu Raghuram, and Intel CEO Lip-Bu Tan. Other investors include Factory, which led the seed round, Eclipse Ventures, Prosperity7, and Triatomic. The company currently employs thirty people.