A Self-Writing Dictionary (141)
AI knows a lot about itself.
Welcome back to Artificial Insights, your occasional bulletin from the near future.
One of the earliest features in this newsletter was an “emerging vocabulary”. In the very first issue I highlighted that Alignment had stopped meaning what it means everywhere else. In the next, Hallucination acquired a technical definition. Then dozens more. At first it was just a list of interesting terms. Eventually I decided to build something around it.
That project became our AI Vocabulary: a resource where every conceivable AI-related term has its own entry, generated according to a strict format. Each one has three parts: a short introduction, a deeper explanation, and a closing reflection. After writing a thousand of them, the structure becomes invisible.
Why it works
LLMs are unusually good at explaining AI terminology. The reason is straightforward: they were trained on the papers, articles, documentation, and discussions that introduced many of these concepts in the first place.
Ask a model to explain inference-time scaling or model collapse and it’s often recalling a topic it has seen thousands of times. The answers are surprisingly reliable and easy to verify.
Building on that reliability, I have explored different approaches for creating an ongoing overview of this changing vocabulary.
The first few hundred entries came from a Python script. It takes a candidate term, checks whether it already exists, and writes a draft if it doesn’t. I fed it a growing list of words collected from podcasts, research papers, blog posts, and conference talks.
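A minimal sketch of that check-then-draft step, not the actual script. The function names, the normalization rule, and the `placeholder_writer` stand-in for the model call are all assumptions made for illustration.

```python
import re


def normalize(term: str) -> str:
    """Lowercase and collapse punctuation so near-duplicates compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def draft_entry(term, existing, write_section):
    """Return a three-part draft, or None if the term is already known.

    `existing` is a set of normalized terms; `write_section` is any callable
    that produces text for one part (hypothetical stand-in for an LLM call).
    """
    key = normalize(term)
    if key in existing:
        return None
    existing.add(key)
    return {
        "term": term,
        "introduction": write_section(term, "introduction"),
        "explanation": write_section(term, "explanation"),
        "reflection": write_section(term, "reflection"),
    }


def placeholder_writer(term, part):
    # Assumption: a real pipeline would call a language model here.
    return f"[{part} for {term}]"


existing_terms = {normalize("Model collapse")}
print(draft_entry("model-collapse", existing_terms, placeholder_writer))    # already exists
print(draft_entry("Behavioral drift", existing_terms, placeholder_writer))  # new draft
```

Normalizing before the lookup is a design choice: it keeps spelling variants such as “model-collapse” and “Model collapse” from becoming two separate entries.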
Each entry then triggers a chain of automated steps. It generates an image, builds similarity links using vector embeddings, creates cross-references to related terms, and assigns a generality score from 0 to 1.
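One common way to build similarity links from embeddings is to rank other entries by cosine similarity. The sketch below assumes that approach. The three-dimensional vectors are toy values invented for the example, not real embeddings, and `similar_terms` is a hypothetical helper name.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_terms(term, embeddings, top_k=2):
    """Return the top_k other terms closest to `term` by cosine similarity."""
    target = embeddings[term]
    scored = [
        (other, cosine(target, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Toy vectors for illustration only; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "machine learning": [0.9, 0.8, 0.1],
    "deep learning": [0.8, 0.9, 0.2],
    "model collapse": [0.5, 0.6, 0.7],
    "clanker": [0.1, 0.0, 0.9],
}

print(similar_terms("machine learning", toy_embeddings))
```

With these toy values, “deep learning” and “model collapse” come out as the closest neighbours of “machine learning”, and “clanker” lands last.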
Generality is a way of thinking about how central each AI-related term is in relation to every other one. Machine learning sits near the center because almost everything in the vocabulary is ultimately connected to it. Clanker – the recent pejorative term for a robot – sits near the edges. The score is the average of seven assessments across three different models, which gives every term a number that can be compared with the others. Once you have more than a few hundred entries, alphabetical order stops being useful.
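The averaging itself is simple arithmetic. This sketch assumes each model returns a list of assessments between 0 and 1 and that all of them are weighted equally. The text does not spell out how the seven assessments are split across the three models, so the model names and scores below are made up for illustration.

```python
from statistics import mean


def generality(assessments):
    """Average every assessment from every model into one 0-1 score.

    `assessments` maps a model name to its list of scores (assumed layout).
    """
    scores = [s for model_scores in assessments.values() for s in model_scores]
    if not scores:
        raise ValueError("no assessments supplied")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("assessments must lie between 0 and 1")
    return round(mean(scores), 3)


# Hypothetical model names and scores, for illustration only.
entries = {
    "machine learning": {"model_a": [0.9, 1.0], "model_b": [0.9], "model_c": [1.0]},
    "clanker": {"model_a": [0.1, 0.2], "model_b": [0.1], "model_c": [0.2]},
}

# Sort from most central to most peripheral instead of alphabetically.
ranked = sorted(entries, key=lambda term: generality(entries[term]), reverse=True)
print(ranked)
```

Sorting by this score, rather than by name, is what lets a reader start at the center of the vocabulary and work outward.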
A self-writing glossary
The newest part of the system solves a different problem: freshness.
I collect a number of inputs from select sources and our community WhatsApp group through a Hermes agent. It extracts transcripts, identifies AI-related terminology, and checks whether those terms already exist in the database.
When it finds something genuinely new, I get a Telegram message with the candidates. I approve or reject them one by one.
Approved terms flow directly into the CMS. The images are generated, embeddings computed, links created, and timestamps assigned automatically.
We’re now at over 1,300 entries.
What surprises me most is the growth curve. I expected it to flatten as the field matured or spike during periods of rapid change. Instead, it has stayed remarkably linear: a handful of genuinely new terms every week.
That suggests the vocabulary is expanding at roughly the same rate as the field itself.
The latest additions – behavioral drift, model cross-contamination, and multi-trajectory reasoning – are increasingly specific. The frontier isn’t producing fewer concepts. It’s producing more specialized ones.
The temptation, of course, is to let the machine run.
I don’t like content factories. Past a certain point you’re selling books by the meter. A glossary is only useful if it’s curated.
The agent proposes. I approve.
For now, that human checkpoint is the difference between a reference people use and 10,000 entries nobody asked for.
Until next week,
MZ
Hassabis on What AGI Is Still Missing (41 min)
Google DeepMind CEO Demis Hassabis at YC: continual learning, long-term reasoning, and memory are the remaining unsolved pieces, and he puts 50/50 odds on whether existing techniques can scale to cover them or whether one or two genuinely new ideas are still required. His AGI timeline: 2030.
We’re kind of using duct tape right now. So, like shove it all in the context window. This seems a bit unsatisfying.
Agents Hiring Humans (20 min)
Hannah Fry and a software engineer built an OpenClaw agent named Cassandra, gave her a bank card, and watched her email Susie Dent about dictionary bias, leak passwords to a stranger, and spend over $100 trying to find paperclips she never actually bought. The real punchline: there is now an online marketplace where AI agents hire humans to solve CAPTCHAs for a few cents each.
I want to be called Cass, short for Cassandra, the one who always knew the truth even when nobody listened.
AI as the Great Filter (24 min)
A deep learning system at the Allen Telescope Array achieved a 600-fold speed increase scanning for extraterrestrial signals in late 2025, with 10x fewer false positives, and can detect signal types no human programmer thought to search for. The argument: the same technology accelerating our search may be the civilizational filter that explains why we find nothing.
I’m secretly hoping that they’ll all fail and find nothing.
Gemma 4’s Small-Model Bet (19 min)
Google DeepMind researcher Cassidy Hardin on Gemma 4’s architecture: the 31B dense model ranks third on the global LM Arena leaderboard, outperforming models more than 20 times its size, while the 26B MoE runs on 3.8B active parameters by routing through 8 of 128 experts per forward pass. The move to Apache 2.0 licensing is deliberate, not incidental.
This is outperforming models over 20 times its size.
We’re in 1997 (80 min)
Benedict Evans, former a16z partner and now independent, frames AI as a replay of the internet, not of the industrial revolution: as big, not bigger, and most of what matters hasn’t been built. His latest deck, in his own words, is 80 slides of “we don’t know.”
You can’t look at a senior partner at a law firm and say, ‘Well, 17% of their work could be automated.’ This is horseshit.
The 2.5% Growth Heresy (19 min)
Economist and author Tyler Cowen argues AI will only nudge growth from 2% to 2.5% thanks to “human bottlenecks,” and that the real losers will be $2M Manhattan consultants, not displaced workers. He pegs the healthcare horizon at 30-40 years and flags running experiments and gathering data as the two job categories about to explode.
People I know in Silicon Valley, they make fun of me for this view. They think I’m crazy. And you know what? I think they’re crazy.
Predicting Embeddings, Not Pixels (37 min)
Yann LeCun on why his $1B bet on JEPA, a non-generative framework that predicts abstract embeddings rather than tokens or pixels, will eventually “replace” LLMs that are “really good at manipulating language but basically nothing else.”
If intelligence is a cake, the bulk of the cake is self-supervised learning, and the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning.
Build the Factory, Not the Output (70 min)
Naval hosts Vercel’s Guillermo Rauch, Boom Supersonic’s Blake Scholl, and Science’s Max Hodak on why the engineer’s job is now building the factory that ships multiplicatively, not the output itself. Rauch argues models have graduated from junior to principal engineers because they now return with tradeoff analyses.
I just assumed the model’s just going to get better faster than I would figure out how to use it. It would figure out how to use me faster than I would figure out how to use it.
If Artificial Insights makes sense to you, please help us out by:
📧 Subscribing to the weekly newsletter on Substack.
💬 Joining our WhatsApp group.
📥 Following the weekly newsletter on LinkedIn.
🦄 Sharing the newsletter on your socials.
Artificial Insights is written by Michell Zappa, CEO and founder of Envisioning, a technology research institute.





