The hardest working man in blogging, Simon Willison, rounds up the biggest LLM trends of 2025. From inference-scaled “reasoning” and tool-using agents to the breakout moment for coding agents like Claude Code, the amount of change has been truly colossal. It’s a dense, opinionated timeline that connects product releases to what actually changed for developers and day-to-day workflows.
Posts for: #Llms
Sam Rose: Prompt caching explained
A deep dive into how LLM prompt caching works under the hood, focusing on the transformer attention mechanism and the exact data providers reuse between requests. This is also one of the most accessible explanations of how LLMs work that I’ve encountered. The visuals are really clear, and the step by step walkthrough is incredibly clear. Via Simon Willison.
The State of AI Security
I have some thoughts on Sander Schulhoff’s appearance on Lenny’s Podcast. The episode, entitled The coming AI security crisis is a deep dive into the state of AI security (concerning) and what application developers can do about it (less than you’d think).
In terms of threat modeling, the easiest way to think about it to think about the LLMs as a person, and the inherent threats being very similar to social engineering. With enough tenacity, an LLM can be convinced to say anything you want it to say, divulge any information it has access to, and perform any task it has the ability to perform. This was a known, relatively minor risk before agents took off. It was fairly obvious that you could trick LLMs into doing things that their creators didn’t want them to do.