Interviewed: Malte Ubl
"Data is everywhere, but if you want to get rid of it, you just say get rid of the tag and it's globally invalidated."
"We are really excited about the caching improvements that are coming both to Next.js and the Vercel platform. We're shipping a new API for doing global caching with tag-based invalidation."
"Our adapter is really just taking the web APIs and porting them over the Node.js APIs so that it sticks together. More or less, you're just straight up running the SDK as supplied."
"Anthropic uses kind of the old school Node.js APIs. We do support them, but for example, if you use Next.js and Remix, when you have the modern frameworks, you get the modern web APIs."
"One of the key tenets of ours is that we just run Node.js. So we don't have to do anything in particular to make MCP work through the original SDK on Vercel."

Prismic
Interviewed: Guillermo Rauch

Blazity
Interviewed: Dom Sipowicz

Vercel
Interviewed: Guillermo Rauch

FIUBA - Ingeniería del Software 2
Interviewed: Guillermo Rauch
The This Dot team recently came back from Vercel Ship 2025, and we're going to go over some of the highlights of the major releases coming out of Vercel this year. The developer platform space is evolving fast: AI, serverless, performance, pricing, all of these are moving at once. We spoke with CTO Malte Ubl and VP of Developer Experience Lee Robinson, who shared with us how Vercel is evolving its services to fit the needs of engineers today.

First, let's look at Vercel's new Fluid compute model. Traditionally, serverless models had you paying for the entire duration of a web request. Now, with Fluid compute, you only pay for active CPU time. That means no more paying for I/O time, network latency, or really long API calls that take an eternity to resolve. Here's Lee and Danny talking more about Vercel's new Fluid compute model.

If you like servers and you like serverless, you should check out Fluid compute, because we've tried to take some of the best parts of both. It always has at least one instance running, doesn't have cold starts, can scale up automatically, and now, as of Vercel Ship this year, has what we think is an extremely compelling pricing model. If you're calling a long API or making a long AI model call, it could take 10 seconds, 30 seconds. You don't want to pay for all of that compute. You only want to pay when it's actually doing work, when the CPU is actually running. We call this active CPU pricing, and now Fluid compute has it. So if a function used to run for 30 seconds, now you're only charged for that really small bit where you're actually doing the work. In practice, it's pretty dramatic savings for any type of I/O-bound workload. Yeah, that's Fluid.

Products that use LLMs, streaming APIs, and inference workloads are going to benefit massively from Fluid compute. Check out what Malte had to share about that.

There's a pretty stark illustration of active CPU billing. I'll give you an example. Remote MCP comes in two standards: one is the legacy one, called SSE, and one is the new one, called streamable HTTP. Streamable HTTP works like any kind of REST-API-style thing you would imagine. That's the normal thing. But all the older clients, including older Cursor versions, only support SSE. Literally what happens is: you open your laptop in the morning, you have the MCP server connected, Cursor will connect to that endpoint, and it will keep the connection open as long as you work. And if you don't close your laptop at night, it just stays open. You are literally causing hours of compute for a single user, but they're not making many calls. Maybe they're making 12 calls. Under a gross CPU billing model, you're paying for, let's say, 12 hours of compute. With active CPU billing, you're paying for a few calls of, let's say, 100 milliseconds each. So the difference is actually quite stark.
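To put rough numbers on Malte's SSE example, here's a back-of-the-envelope sketch; the dollar rate is a made-up placeholder, not Vercel's actual pricing:

```ts
// Compare the two billing models for the scenario above: one editor
// connection held open for 12 hours, during which the user makes 12 tool
// calls that each do ~100 ms of real CPU work.
const hoursConnected = 12;
const calls = 12;
const cpuSecondsPerCall = 0.1; // ~100 ms of actual work per call
const dollarsPerCpuHour = 0.18; // placeholder rate, illustrative only

const grossHours = hoursConnected; // gross model: billed while connected
const activeHours = (calls * cpuSecondsPerCall) / 3600; // active model: billed while working

console.log(`gross CPU billing:  $${(grossHours * dollarsPerCpuHour).toFixed(4)}`);
console.log(`active CPU billing: $${(activeHours * dollarsPerCpuHour).toFixed(6)}`);
console.log(`ratio: ${grossHours / activeHours}x`); // 36,000x for these inputs
```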
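And for context on the two transports he contrasts, here is a minimal sketch of an MCP server over streamable HTTP using the official TypeScript SDK. The module paths, the stateless sessionIdGenerator option, and the express wiring follow @modelcontextprotocol/sdk's docs at the time of writing; treat the exact names as assumptions that may shift between SDK versions:

```ts
// Sketch: a request/response ("streamable HTTP") MCP server. Unlike the
// legacy SSE transport, nothing here holds a connection open between
// calls, so active CPU time tracks actual work. Assumes the
// @modelcontextprotocol/sdk, express, and zod packages are installed.
import express from "express";
import { z } from "zod";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

// Build a fresh server with one toy tool per request (stateless mode).
function buildServer(): McpServer {
  const server = new McpServer({ name: "demo", version: "1.0.0" });
  server.tool("add", { a: z.number(), b: z.number() }, async ({ a, b }) => ({
    content: [{ type: "text", text: String(a + b) }],
  }));
  return server;
}

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // stateless: no session held open
  });
  await buildServer().connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(3000);
```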
In 2025, developers are increasingly building systems that facilitate long-running AI interactions, and Vercel is clearly working hard to remove the bottlenecks that make these systems expensive to run.

Vercel also announced their new AI Gateway and AI SDK this year, as well as pricing improvements to their image optimization service. Check out what Lee had to share about that.

First, if you've never used image optimization, it can be very compelling, because you want your page to load fast and most web pages have quite a few images. So you want to make the images as small as possible, and lots of people have many hundreds or thousands of images. Image optimization helps you do that dynamically; you don't have to put them all into your git repo. Previously, the pricing for our image optimization service was okay, but it could be a little expensive for people, so we really wanted to change it and make it much more cost effective. It's now much lower, and we're also metering on a better metric: the number of transformations that you're doing. So it's kind of a double whammy: it helps you improve page load performance, and it's much more affordable to use. If you're working with images, I think that's a good one to check out. Another one, you know, we talked a little bit about Fluid, but I also think that if you're starting to use AI models right now and you're not exactly sure which
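Circling back to Lee's image optimization point: on Vercel, opting in is mostly a matter of rendering images through next/image, which generates the resized and re-encoded variants that count toward the transformations the new pricing meters on. A minimal sketch; the file path and dimensions are illustrative:

```tsx
// app/page.tsx — render through Next.js image optimization.
// Each distinct size/format variant generated from the source image
// counts toward the transformation metric described above.
import Image from "next/image";

export default function Home() {
  return (
    <Image
      src="/hero.jpg" // placeholder; remote URLs also work via next.config images.remotePatterns
      alt="Product hero"
      width={1200}
      height={630}
      quality={75} // smaller encoded payloads for faster page loads
      priority // eager-load above-the-fold imagery
    />
  );
}
```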