Channel: Vercel
Description: Sit down with Tuhin from Baseten and Sarah from Conviction to discuss how AI is shaping the web and the world around us. Get a demo today: https://vercel.com/contact/sales/demo
[Music]

Sarah: Okay, great to meet you all. My name is Sarah. I started a venture fund around AI a few years ago, but before that I was lucky enough to be an investor in Baseten, and I have worked with Tuhin for the last five years. It's great to see all of you, but maybe I'll let Tuhin introduce Baseten first.

Tuhin: Yeah, it's been six years. Six, you know, who's counting.
Hi, I'm Tuhin, CEO of Baseten. Baseten is an AI infrastructure company focused exclusively on inference. We help the fastest-growing companies in the world run their models in production as performantly and scalably as possible. We work with companies like Cursor, Abridge, Notion, OpenEvidence, Gamma, and Clay, among others.

Sarah: One thing I think is really interesting: we've been talking a lot about AI product experiences over the last day here, and I think they're actually quite slow. Think about waiting several seconds for some of the most interesting experiences, or the reliability issues that people see with AI APIs and products all the time. That is not something you would have accepted from the best consumer-grade products five years ago, pre-AI. How do you understand that? Do you think performance matters today?

Tuhin: Yeah, I think performance matters greatly. If you think about what is the critical
path to the application layer delivering value to their customers, it's this. One, do the models work? Oftentimes the models don't work from a quality perspective. Second, do they work fast? That is, how long does it take to actually run this model? A good example of this is one of our customers, OpenEvidence. 55% of doctors use OpenEvidence every day. It's a ChatGPT for healthcare, and they expect answers fast; if those models are slow, that's not super useful, or it degrades their experience to some extent. And the third piece is, do they work reliably? That is, do these models not go down? I think in AI we have somewhat of a high bar from a consumer perspective but a low bar from an infrastructure perspective, just because compute is limited and running these models at scale is hard. But for the best products, they just don't go down, and that's what we're trying to drive.

Sarah: I think one of the things that I've seen is that it's partly because people forget how insulated the entire software ecosystem has been from hardware and infrastructure issues, increasingly, over multiple
decades, right? You were talking about making sure you had the right CPU resources for a deployment, and in general we don't worry about that as engineers today. Except in AI, right? So you have this super immature ecosystem. I'm pretty confident that, given how important the problem is and how many smart people are working on it, it will improve over time. It's also a really competitive space, right? The number of people who would say they do inference in some way ranges from proprietary model companies serving their own APIs, to companies serving shared model APIs, to dedicated inference routers, to bare metal, to app companies wanting to buy their own clusters and do it in house. What is your framework for how people should understand all of this and make a choice?

Tuhin: Yeah, I think it's very complicated. So I think you're right: there's a hardware problem, there's acquiring hardware and