LLMs
Self hosting
Build a server: https://blog.briancmoses.com/2024/09/self-hosting-ai-with-spare-parts.html
I got a trashcan Mac from work to run Ollama!
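A minimal sketch of talking to a self-hosted model through Ollama's HTTP API, assuming the server is running on its default port (11434) and a model has already been pulled; the model name here is just an example:

    import json
    import urllib.request

    # Ask the local Ollama server for a completion. Assumes `ollama serve`
    # is running and the model was pulled with e.g. `ollama pull llama3`.
    def ask(prompt, model="llama3"):
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    print(ask("Why self-host an LLM?"))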
Emacs
GitHub Copilot
Started using this when it first came out, through VS Code.
It's pretty amazing. Great for boilerplate and a great autocomplete. Makes writing tests and code faster.
It also changes the way I write code: I first write a descriptive plain-English comment, and if I'm descriptive enough, Copilot can almost always get the next line right (see the sketch after these notes).
On a few rarer occasions, I've used the copilot chat to help get un-stuck in difficult problems, mostly as a brainstorming buddy.
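A hypothetical illustration of the comment-first workflow in Python; the comment is the kind I'd write, and the function body is the kind of completion Copilot tends to fill in:

    # Parse an ISO 8601 date string like "2024-09-15" and return the day of week.
    from datetime import date

    def day_of_week(iso_string: str) -> str:
        # Copilot-style completion: split the string, build a date, format it.
        year, month, day = (int(part) for part in iso_string.split("-"))
        return date(year, month, day).strftime("%A")

    assert day_of_week("2024-09-15") == "Sunday"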
Long term outlook
Not going to take over the world anytime soon.
Bad take: https://ai-2027.com/.
The whole thing hinges on this feedback loop:
So when OpenBrain finishes training Agent-1, a new model under internal development, it’s good at many things but great at helping with AI research.
I don't buy this premise yet. IMO, it's not more of the same that will get us there, but different modalities, e.g. models that are embodied in some way.
See:
AI Blindspots
Reverse tips for using LLMs for coding:
https://ezyang.github.io/ai-blindspots/
HN: https://news.ycombinator.com/item?id=43414393
I agree. You have to know how to use the tool, like knowing how to search. Good comment on HN to this point: https://news.ycombinator.com/item?id=43504165
I become more and more convinced with each of these tweets/blogs/threads that using LLMs well is a skill set akin to using Search well. It’s been a common mantra - at least in my bubble of technologists - that a good majority of the software engineering skill set is knowing how to search well. Knowing when search is the right tool, how to format a query, how to peruse the results and find the useful ones, what results indicate a bad query you should adjust… these all sort of become second nature the longer you’ve been using Search, but I also have noticed them as an obvious difference between people that are tech-adept vs not.
LLMs seem to have a very similar usability pattern. They’re not always the right tool, and are crippled by bad prompting. Even with good prompting, you need to know how to notice good results vs bad, how to cherry-pick and refine the useful bits, and have a sense for when to start over with a fresh prompt. And none of this is really hard - just like Search, none of us need to go take a course on prompting - IMO folks just need to engage with LLMs as a non-perfect tool they are learning how to wield.
The fact that we have to learn a tool doesn’t make it a bad one. The fact that a tool doesn’t always get it 100% on the first try doesn’t make it useless. I strip a lot of screws with my screwdriver, but I don’t blame the screwdriver.
Tracing the thoughts of a large language model
https://www.anthropic.com/research/tracing-thoughts-language-model
The blog post is pretty short on details, but it's interesting. A lot like taking an fMRI of the model? Recall, fMRI just measures blood flow, which is a proxy for neural activity.
HN: https://news.ycombinator.com/item?id=43495617
More detail: https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-poems
MCP Servers
Set up with claude desktop: https://modelcontextprotocol.io/quickstart/user
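For reference, the rough shape of the config the quickstart has you put in claude_desktop_config.json; the filesystem server is the quickstart's example, and the path is a placeholder to replace:

    {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/Users/username/Desktop"
          ]
        }
      }
    }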