Kurt Mackey makes a compelling case that ephemeral sandboxes are fundamentally the wrong tool for running code with AI agents. His insight is that agents work better when they can maintain context across sessions, avoid redundant package installations, and leverage the full system lifecycle. It’s an elegant solution to a common problem, and I’m eager to try it out.
Simon Willison’s 2025 recap: reasoning models, agents, and the rise of coding CLIs
The hardest working man in blogging, Simon Willison, rounds up the biggest LLM trends of 2025. From inference-scaled “reasoning” and tool-using agents to the breakout moment for coding agents like Claude Code, the amount of change has been truly colossal. It’s a dense, opinionated timeline that connects product releases to what actually changed for developers and day-to-day workflows.
Sam Rose: Prompt caching explained
A deep dive into how LLM prompt caching works under the hood, focusing on the transformer attention mechanism and the exact data providers reuse between requests. This is also one of the most accessible explanations of how LLMs work that I’ve encountered. The visuals are excellent, and the step-by-step walkthrough makes the concepts easy to follow. Via Simon Willison.
Robin Sloan: An app can be a home-cooked meal
Via kottke.org, Robin Sloan describes himself as the programming equivalent of a home cook. I’ve been working in professional kitchens for a really long time, but lately I’ve rediscovered the joy of home cooking myself.
A better way to view Claude Code transcripts
Simon Willison released a Python CLI tool that converts Claude Code sessions into shareable HTML pages with more detail than Claude Code itself provides, including hidden thinking traces.
Georgi Arnaudov: How I Think About Kubernetes
A compelling reframing of Kubernetes as ‘a runtime for declarative infrastructure with a type system’ rather than just a container orchestrator.
The Dotfiles Project
One recent observation about AI tools that stuck with me: as much as they enable you to do things you already do faster (or maybe better), perhaps more importantly, they enable you to do things that you wouldn’t otherwise have done.
I think my dotfiles project is a good example. I always wanted dotfiles that set up my environment in a way that maximizes my productivity, but was held back by my unwillingness to really dig deep into the settings of the highly configurable tools that I use. I think this was a rational decision, perhaps made more so by the fact that my job for the past 10 or 15 years has basically been thinking and going to meetings.
As a result, I hadn’t made any commits to my dotfiles since 2015, which was roughly the time I ran out of really good reasons to write code at work.
The State of AI Security
I have some thoughts on Sander Schulhoff’s appearance on Lenny’s Podcast. The episode, entitled The coming AI security crisis, is a deep dive into the state of AI security (concerning) and what application developers can do about it (less than you’d think).
In terms of threat modeling, the easiest approach is to think of the LLM as a person, with the inherent threats being very similar to social engineering. With enough tenacity, an LLM can be convinced to say anything you want it to say, divulge any information it has access to, and perform any task it has the ability to perform. This was a known, relatively minor risk before agents took off. It was fairly obvious that you could trick LLMs into doing things that their creators didn’t want them to do.
Essentially Free Semantic Search
Having recently migrated my old WordPress site to Hugo, I was wondering if I’d written a post about a certain topic and wished I had search functionality. Sure, I could just use Google to search my site, but that’s boring. I’ve been thinking about search a lot at work lately, and I wondered just how commoditized semantic search really is – the challenge I gave myself was adding good search functionality while committing to spend as little money on it as possible. rc3.org is already hosted on Cloudflare Pages for free, which is pretty incredible, honestly.
Of course I also used Claude Code to build the search feature, which sped things up but didn’t keep the project from becoming frustrating at times. I asked Claude to create a separate write-up of the project that you can look at. I don’t know why I find these entertaining, but I do.
I’ve set up semantic search for a few test apps lately just for fun, so I had a really clear idea of what the build would be like. The challenge was picking the right tools for this particular use case. There are basically three components – a script to index the content, a vector search engine, and a BFF (backend for frontend) to wrap the search engine, pull in my API keys, etc.
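To illustrate how those three components fit together, here’s a minimal Python sketch – not the actual implementation. The toy word-bucket embedder stands in for a real embedding API, the in-memory index stands in for a hosted vector search engine, and all the names (`VectorIndex`, `handle_search`, etc.) are hypothetical:

```python
import math

# Component 1 (indexing side): a toy embedder standing in for a real
# embedding API. It buckets words into a small fixed-size vector and
# normalizes it to unit length.
def embed(text: str) -> list[float]:
    vec = [0.0] * 16
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % 16] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Component 2: an in-memory vector index standing in for a vector
# search engine. Stores (doc_id, embedding) pairs and ranks by
# similarity at query time.
class VectorIndex:
    def __init__(self) -> None:
        self.docs: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        self.docs.append((doc_id, embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        qv = embed(query)
        # Vectors are unit-length, so the dot product is cosine similarity.
        scored = [(sum(a * b for a, b in zip(qv, dv)), doc_id)
                  for doc_id, dv in self.docs]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]

# Component 3: the BFF layer, a thin wrapper the static site calls.
# In a real build this is where API keys live, server-side.
def handle_search(index: VectorIndex, query: str) -> dict:
    return {"query": query, "results": index.search(query)}
```

In the real version, the indexing script would walk the Hugo content directory and call the equivalent of `add` for each post, and the BFF would keep the embedding API key out of the browser entirely.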
Adding Semantic Search to a Static Site with Qdrant and OpenAI
Having used Claude Code to create a semantic search engine for my old blog at rc3.org, I thought it might be fun to just let the agent write its own blog post about the build. My reflections on how this went are in a different post. –Rafe
I recently helped add semantic search to RC3.org, a static Hugo site with 6,785 blog posts spanning 1998-2017. The goal was to enable searching by meaning, not just keywords, while keeping monthly costs at zero. Here’s how we built it.