Essentially Free Semantic Search
Having recently migrated my old WordPress site to Hugo, I was wondering whether I’d written a post about a certain topic and wished I had search functionality. Sure, I could just use Google to search my site, but that’s boring. I’ve been thinking about search a lot at work lately, and I wondered just how commoditized semantic search really is – the challenge I gave myself was adding good search functionality while committing to spend as little money on it as possible. rc3.org is already hosted on Cloudflare Pages for free, which is pretty incredible, honestly.
Of course I also used Claude Code to build the search feature, which sped things up but didn’t keep the project from becoming frustrating at times along the way. I asked Claude to create a separate write up of the project that you can look at. I don’t know why I find these entertaining, but I do.
I’ve set up semantic search for a few test apps lately just for fun, so I had a really clear idea of what the build would involve. The challenge was picking the right tools for this particular use case. There are basically three components – a script to index the content, a vector search engine, and a BFF to wrap the search engine, pull in my API keys, and so on.
The Embedding
A quick note on semantic search if you’re not familiar. Semantic search is search that matches on the meaning of the query and the documents, rather than just comparing the text of the query to the text of the document directly. For example, a search for ‘migration tools’ would surface posts about ‘switching platforms’ even without shared keywords.
To build it you need an embedding, which is an ML-generated representation of text as a list of numbers. These numbers encode semantic relationships by positioning similar text nearby in high-dimensional space. You also need a vector search engine that computes the distance between embeddings; that’s how the actual search works. You encode the query using the embedding model, and then find the closest documents in the n-dimensional space defined by the embedding.
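To make “distance between embeddings” concrete, here’s a toy sketch of cosine similarity, the comparison most vector engines use by default. The three-dimensional vectors are made-up illustrative values – real embedding models produce hundreds or thousands of dimensions.

```javascript
// Cosine similarity between two embedding vectors: values near 1 mean the
// vectors point the same direction (similar meaning); values near 0 mean
// the texts are unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" for illustration only.
const query = [0.9, 0.1, 0.0]; // e.g. "migration tools"
const postA = [0.8, 0.2, 0.1]; // a post about switching platforms
const postB = [0.0, 0.1, 0.9]; // an unrelated post

console.log(cosineSimilarity(query, postA)); // close to 1: good match
console.log(cosineSimilarity(query, postB)); // close to 0: poor match
```

The search engine does exactly this comparison, just at scale and with an index so it doesn’t have to score every document against every query.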
With semantic search, the quality of the experience is determined by how well the embedding solves the matching problem you have. These days really high-quality text embeddings are completely commoditized. For my search application, I used OpenAI’s text-embedding-3-small model, which costs $0.02 per million tokens. Indexing my entire corpus of blog posts cost pennies.
If I were more serious about this project I’d probably set up hybrid search (both lexical and semantic search) and maybe use a different embedding that’s tuned specifically for search.
The Vector Search Engine
The challenge I gave myself on this project was to find a way to host it for free. If I had relaxed the constraint to “relatively cheap,” there were plenty of great choices available, like Turbopuffer or Vespa Cloud. I wound up using Qdrant Cloud, specifically their free tier. The developer experience is great; it was really easy to set up. (You can look at the post Claude Code wrote about the project for details.)
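For a sense of what loading the index involves, here’s a sketch against Qdrant’s REST API. The collection name, env var names, and payload fields are my illustrative assumptions; the 1536 dimensions are text-embedding-3-small’s default output size.

```javascript
// Sketch of loading embedded posts into Qdrant Cloud over its REST API.
// QDRANT_URL and QDRANT_API_KEY are assumed to be set in the environment.
const QDRANT_URL = process.env.QDRANT_URL; // e.g. https://<cluster>.cloud.qdrant.io
const QDRANT_KEY = process.env.QDRANT_API_KEY;

// Create the collection once; text-embedding-3-small produces 1536-dim vectors.
async function createCollection() {
  await fetch(`${QDRANT_URL}/collections/posts`, {
    method: "PUT",
    headers: { "api-key": QDRANT_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ vectors: { size: 1536, distance: "Cosine" } }),
  });
}

// Qdrant stores each document as a point: an id, a vector, and a payload of
// arbitrary metadata that comes back with search results.
function toPoints(posts, vectors) {
  return posts.map((post, i) => ({
    id: i,
    vector: vectors[i],
    payload: { title: post.title, url: post.url },
  }));
}

async function upsertPoints(points) {
  await fetch(`${QDRANT_URL}/collections/posts/points?wait=true`, {
    method: "PUT",
    headers: { "api-key": QDRANT_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ points }),
  });
}
```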
The Search Service
The other component is the BFF (backend for frontend) service that connects the Web site to the search engine. It has a few jobs: mostly encoding the query using the embedding model and keeping my API keys out of client-side code. It’s tiny. I didn’t want to pay for this, either. Fortunately I could use Cloudflare Pages Functions, which are serverless JavaScript workers built directly on top of V8.
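A minimal sketch of such a function follows – the file path, env var names, and result shape are illustrative assumptions, not the actual implementation. In a Pages project it would live at functions/api/search.js with `export` in front of the handler.

```javascript
// Sketch of the BFF as a Cloudflare Pages Function. Pages routes GET
// /api/search to a handler named onRequestGet; env holds the secrets
// configured in the Cloudflare dashboard.
async function onRequestGet({ request, env }) {
  const q = new URL(request.url).searchParams.get("q");
  if (!q) return new Response("missing q parameter", { status: 400 });

  // 1. Encode the query with the same model used to index the posts.
  const embRes = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: q }),
  });
  const vector = (await embRes.json()).data[0].embedding;

  // 2. Ask Qdrant for the nearest posts in the embedding space.
  const searchRes = await fetch(
    `${env.QDRANT_URL}/collections/posts/points/search`,
    {
      method: "POST",
      headers: { "api-key": env.QDRANT_API_KEY, "Content-Type": "application/json" },
      body: JSON.stringify({ vector, limit: 5, with_payload: true }),
    },
  );
  const { result } = await searchRes.json();
  return Response.json(result.map((hit) => hit.payload));
}
```

Because the keys live in `env`, nothing sensitive ever reaches the browser; the page just calls `/api/search?q=…` and renders the results.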
The BFF was incredibly easy (for Claude Code) to write. As usual in software development, dependencies and environment issues were a pain. Claude Code was confused about the Cloudflare environment and originally built the service using Node.js libraries that aren’t available there. There was also an incompatibility between Cloudflare’s tool for testing these functions locally and Docker: when I tried to test the function locally, the process used all of the Docker container’s CPU. I didn’t believe Claude Code when it blamed the struggle to get things working on that, but it turned out to be true.
In the end, I was able to add semantic search to my blog in just a few hours of tinkering.
The Implications
The Democratization of Sophisticated Technology
The project itself isn’t that interesting, but the implications are. When I started working at Depop in 2021, we used a popular SaaS search vendor for our product search. They didn’t offer semantic search at all, at any price. (They do now, of course.) Now building this kind of system is an easy weekend project, at least at small scale.
The narrative around technical innovation right now is completely dominated by LLMs, but there’s a parallel revolution happening across the entire infrastructure stack. Vector databases, embedding models, serverless compute, edge functions, object storage, CDNs - all of these have become not just available but easy. The developer experience bar has risen dramatically. APIs are well-documented, setup is measured in minutes, free tiers are actually useful rather than marketing gimmicks.
This changes what’s possible at small scale. A solo developer can now assemble sophisticated applications using best-in-class services without managing infrastructure or negotiating enterprise contracts. And with coding agents that excel at stitching together APIs and services, experienced developers can move remarkably fast on greenfield projects.
The Multi-Vendor Stack
There’s another shift here that’s easy to miss: the old playbook was to commit to a single hyperscaler’s ecosystem to get cohesion and avoid integration overhead. Use AWS for everything, or GCP, or Azure. The tight integration within their walled gardens was supposed to be the competitive advantage.
But the best-in-class services are now often outside the hyperscalers - specialized vendors who’ve focused on doing one thing exceptionally well. Qdrant for vector search, Cloudflare for edge compute and CDN, OpenAI for embeddings. These services have mature APIs, excellent documentation, and genuinely care about developer experience in ways that hyperscaler services often don’t.
The integration tax has dropped low enough that composing purpose-built services from multiple vendors can actually give you better results than staying within a single ecosystem. You’re no longer trading off best-of-breed for convenience - you can have both.
What I Learned
This was also my first real foray into building a truly serverless application. I know serverless is by no means new, but I’ve been working at companies with mature systems that run on VMs or in containerized environments. Seeing how it all comes together when you build a completely serverless app has been enlightening.
My biggest lesson, though, was that when you’re starting a new project these days, it makes a ton of sense to do a lot of technology evaluation up front, and to start with your wish list of constraints. What I originally wanted was zero hosting costs and hybrid lexical and semantic search. In the end, the only compromise I had to make was on hybrid search: this search engine is purely semantic. If I were willing to spend $20 per month I could have had hybrid search too. The time spent sifting through the options was worth it.