🔑 Key AI Reads for January 7, 2026
Issue 27 • Andrej Karpathy's 2025 LLM year in review, Human review as the new bottleneck, How Jevons Paradox applies to AI, Leveraging a design system to improve vibe-coded prototypes
Frontier Models
Andrej Karpathy's 2025 LLM year in review
Andrej Karpathy (former Tesla AI lead, OpenAI founding member) published a reflection on how LLMs evolved in 2025, and it's one of the clearest summaries of why AI feels different now.
He highlights these key developments:
Reinforcement learning from verifiable rewards (the reason "reasoning" models like OpenAI o3 improved so dramatically; a minimal sketch of the idea follows this list)
The rise of Cursor-style "LLM apps" that bundle and orchestrate AI calls for specific domains
Claude Code as the model for agents that live on your computer rather than in the cloud
Advances in vibe coding—using plain English to build real software without touching the code
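To make the "verifiable rewards" point above concrete, here is a minimal, hypothetical sketch (not from Karpathy's post or any lab's actual training code): the reward comes from an automatic checker, such as a math grader or a unit test, rather than from a learned preference model, so it can be verified mechanically.

```typescript
// Hypothetical sketch of a "verifiable reward": the score comes from an
// automatic checker, not from human preference labels.
type Sample = { prompt: string; completion: string };

// Binary reward: 1 if the completion passes the check, 0 otherwise.
// In practice the checker might run a test suite or grade a math answer.
function verifiableReward(sample: Sample, expected: string): number {
  return sample.completion.trim() === expected ? 1 : 0;
}

// An RL loop samples many completions per prompt and reinforces the ones
// the checker accepts.
const samples: Sample[] = [
  { prompt: "What is 6 * 7?", completion: "42" },
  { prompt: "What is 6 * 7?", completion: "36" },
];
console.log(samples.map((s) => verifiableReward(s, "42"))); // [ 1, 0 ]
```

The toy example only shows that the reward is checkable; real setups verify richer things like passing tests or exact final answers.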
His TLDR: “2025 was an exciting and mildly surprising year of LLMs. LLMs are emerging as a new kind of intelligence, simultaneously a lot smarter than I expected and a lot dumber than I expected. In any case they are extremely useful and I don't think the industry has realized anywhere near 10% of their potential even at present capability. Meanwhile, there are so many ideas to try and conceptually the field feels wide open.”
Andrej Karpathy | 2025 LLM Year in Review
☕ Medium Read (7 minutes)
Frontier Models
Human review as the new bottleneck
In 2025, AI tools delivered on their "productivity promise": marketing teams can now generate more creative assets in a week than they used to produce in a quarter, and engineers can generate code at previously impossible speeds. All that output has created a new bottleneck: human review, because someone still has to check everything the AI generates.
To this end, Nate Jones ($) has three bets for 2026:
The review stack flips — AI starts reviewing AI. Humans handle exceptions. This sounds like efficiency, but it’s actually an identity crisis — and the new bottleneck isn’t reviewing outputs, it’s writing testable intent in the first place.
Work becomes testable — At frontier companies, every role starts looking like a creative QA function. The wall between “technical” and “non-technical” dissolves. What’s on the other side isn’t engineering exactly — it’s the ability to specify work precisely enough that automated systems can evaluate it.
The chasm opens — The gap between fast movers and everyone else becomes unbridgeable. Not because fast companies “adopt AI” — because they’ve built auditability and rollback into how they operate. The compounding advantage is organizational learning rate, and catching up requires more than tools.
Nate Jones | Why the gap between prepared and unprepared is about to get wider than we've ever seen in 2026 ($)
⚡ Quick Read (3 minutes)
AI in the Organization
Why much of the work AI will do doesn't exist yet
In a recent LinkedIn post, Aaron Levie (CEO of Box) lays out how a 19th-century economic principle (Jevons Paradox) applies to AI's impact on knowledge work:
In the 19th century, English economist William Stanley Jevons found that tech-driven efficiency improvements in coal use led to increased demand for coal across a range of industries. The paradox, of course, is that if you assume demand remains constant, making a resource more efficient should reduce how much of it gets consumed. Instead, making it more efficient leads to massive growth, because there are more use cases for the resource than previously contemplated. The paradox has proven itself repeatedly as we've made various aspects of the industrial world more productive or cheaper, and especially in technology itself.
Levie argues the same thing is happening now: AI agents are making non-deterministic work (contracts, code, research, campaigns) dramatically cheaper to attempt. The real leverage isn't the "return" in ROI: it's collapsing the cost of "investment" so that projects that would never have started now become viable.
He predicts that most AI tokens in the future won't replace what we do today: they'll be spent on work that currently doesn't happen at all. The software projects that never get built, the research that never gets funded, the campaigns small teams can't afford to try. And while AI handles more tasks, humans remain essential for judgment, context, and stitching outputs into workflows that actually produce value.
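A back-of-the-envelope illustration of that dynamic, with made-up numbers that are not from Levie's post: if the cost per project collapses but the number of newly viable projects grows even faster, total spend on the underlying resource (tokens, in this case) goes up, not down.

```typescript
// Toy numbers, purely illustrative. Efficiency cuts the cost per project,
// but far more projects become worth attempting, so total spend rises.
const before = { costPerProject: 100_000, projectsAttempted: 10 };
const after = { costPerProject: 5_000, projectsAttempted: 400 };

const totalBefore = before.costPerProject * before.projectsAttempted; // 1,000,000
const totalAfter = after.costPerProject * after.projectsAttempted; // 2,000,000

// A 20x efficiency gain, yet total consumption doubles: Jevons in one line.
console.log({ totalBefore, totalAfter });
```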
Aaron Levie | Jevons Paradox for Knowledge Work
⚡ Quick Read (5 minutes)
Prototyping with AI
How Atlassian improved AI prototype fidelity
When Atlassian started rolling out AI prototyping tools like Figma Make and Replit to thousands of employees, they hit a wall: the AI kept hallucinating icons, mangling navigation components, and generating designs that didn't look like Atlassian products. Their design system team found that simply feeding documentation to the AI wasn't enough—the models think differently than humans do.
The fix involved a hybrid approach: pre-coding the elements AI consistently got wrong (like top nav and side nav), creating "translation" guides that map Tailwind classes to their proprietary components, and building copy-paste recipes for common patterns. One particularly clever technique: they used AI's computer vision to analyze how it actually sees their UI, then rewrote instructions to match the AI's mental model. Louis and Kyler share plenty of real-world examples of the work they've done — and how they rolled out these tools and templates at Atlassian.
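As a rough sketch of what such a "translation" guide could look like (the mapping and component names here are hypothetical, not Atlassian's actual tooling or design-system API):

```typescript
// Hypothetical translation table: when the AI reaches for a raw Tailwind
// pattern, the instructions steer it to the design-system equivalent instead.
// Component names are illustrative, not Atlassian's real packages.
const translations: Record<string, string> = {
  'rounded-md bg-blue-600 px-3 py-2 text-white': '<Button appearance="primary">',
  'flex h-12 items-center border-b': '<TopNavigation /> (pre-coded, do not regenerate)',
  'w-64 shrink-0 border-r': '<SideNavigation /> (pre-coded, do not regenerate)',
};

// Turn the table into prompt rules that can be pasted into a prototyping tool
// alongside the pre-coded components and copy-paste recipes.
const rules = Object.entries(translations)
  .map(([tailwind, component]) => `- Instead of \`${tailwind}\`, use ${component}.`)
  .join("\n");

console.log(rules);
```

The same idea extends to the copy-paste recipes: each common pattern ships as a snippet the model is told to reuse verbatim rather than reinvent.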
Dive Club | The trick to AI prototyping with your design system
Watch Time: 43 minutes
Note: Ridd subsequently posted on X a good summary of the templating approach.
That’s it for this week.
Thanks for reading, and see you next Wednesday with more curated AI/UX news and insights. 👋
All the best, Heidi