Channel
Interviewed Person
Various
Ever since models started getting powerful, people have been building really cool products, integrating models into them, and solving lots of problems for us. But these systems consume a lot of tokens, especially if you're integrating a model through an API. The solution to this is much simpler than you think. The best architecture is not some elaborate pipeline or heavyweight tuning setup, but an old philosophy that forms the basis of Unix systems: that everything is a file. Now, I know they weren't talking about model costs but about devices
and files, but surprisingly enough, the solution to this high-cost issue follows the exact same principle. And this is exactly what a software engineer at Vercel talks about. Before we explore why files are the solution, let's understand a few things about how these models actually work. Models have been trained on massive amounts of code. This is the exact reason why they're better at understanding code, directory structures, and the native bash commands that developers use to navigate files and find what they need. When an agent uses grep and ls, it's not doing something new.
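As a rough sketch of what this looks like in practice (the helper names and the demo file are invented for illustration; this is not Vercel's actual tool), an agent-side retrieval step might simply shell out to `find` and `grep`:

```python
import subprocess
import tempfile
from pathlib import Path

def find_files(root: str, name_glob: str) -> list[str]:
    """Locate candidate files the way an agent would with `find`."""
    out = subprocess.run(
        ["find", root, "-type", "f", "-name", name_glob],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def grep_slice(path: str, pattern: str, context: int = 2) -> str:
    """Pull only the matching lines (plus a little context) with `grep`,
    so the full file never has to enter the model's context window."""
    out = subprocess.run(
        ["grep", "-n", "-C", str(context), pattern, path],
        capture_output=True, text=True,  # grep exits 1 on no match; that's fine
    )
    return out.stdout

# Tiny demo corpus: one markdown file with a value buried in prose.
root = tempfile.mkdtemp()
doc = Path(root) / "config-notes.md"
doc.write_text("# Notes\n\nlots of prose...\n\nrequest timeout = 30s\n\nmore prose...\n")

for path in find_files(root, "*.md"):
    snippet = grep_slice(path, "timeout", context=0)
    print(path, "->", snippet.strip())  # only this small slice goes to the model
```

The point of the sketch is the shape of the loop, not the specific commands: locate with `find`, extract a narrow slice with `grep`, and hand the model only that slice.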
It's simply doing something it already knows how to do, just in a more controlled way. This approach isn't limited to code. Agents can navigate any directory containing anything, be it code or not, because they're already comfortable with commands and understand file systems. Whenever an agent needs something, it looks around the file system using native bash commands like ls and find. Once the agent finds the exact file using find, it searches for relevant content within that file using pattern matching with grep and cat. Only a small relevant slice of information is
sent to the model while the rest stays out of memory, keeping the context window clean. This means we're not burning through tokens on irrelevant data that the model doesn't need. Using this approach, the agent returns a structured output. This pattern worked so well that Vercel ended up open sourcing a bash tool built specifically around it, giving agents the ability to explore file systems the same way a developer would. When building large language model systems, there are two ways of providing the right information to the model. Either through a detailed system prompt, hoping the agent actually
follows it, or by feeding a lot of data into a vector database and using semantic search to extract it. But each approach has limitations. System prompts are bounded by the model's context window, which limits how much information we can send at a time. To handle larger data sets, we use semantic search, which finds information by matching meaning against the query. But vector search is built for semantic similarity rather than exact lookup. It returns chunks of data that match the general context of the query, not necessarily the specific value we're looking for. This leaves extracting the right content from all
the chunks to the model itself. File systems, however, offer a different approach. With a file system, the structure actually maps to your domain: the relationships between files and their parent folders mirror the relationships in your data. You don't have to flatten those relationships into model-digestible vector chunks, which avoids losing connections that are usually dropped in semantic search. These hierarchical connections are preserved naturally, maintaining the organizational logic that already exists in your data. Another advantage is that
retrieval is precise, because grep and the other bash tools return exact matches. Unlike vector search, which returns every chunk that loosely matches the query and then leaves it to the model to decide which one to use, you get only the value you need. The context stays minimal when agents use bash tools, because they receive just the specific slice they need while everything else stays out of memory. This lets them stay focused on the exact piece of information without getting lost in unrelated data. Now, this idea isn't new to you. This
idea has already been used inside Claude Code and other CLI agents, which use bash commands to narrow down findings through pattern matching. We've already been using the file system and Claude Code's capabilities for research purposes. For any tool we come across and want to evaluate, we pass it through a pipeline of multiple phases, each with its own evaluation criteria the research must pass. All of this is defined in a markdown file containing the requirements and objectives of the tool we're testing, how to write the