Happy Monday and welcome to your weekly manifestation of anomalous inconsistencies.
I have been tinkering on an “AI Vocabulary” since launching this newsletter last year. It began in literally the first issue with a definition of Alignment, which in AI terms has a very particular meaning (ensuring that AI systems act in accordance with human values and intentions). This kept going for every subsequent issue, bringing in words with particular meanings, and always leaning on ChatGPT’s ability to create succinct & accurate definitions for these terms, directing AI to teach both myself and my readers about its underpinnings.
At some point I realized the number of terms would quickly outpace my weekly publication schedule, and started experimenting with organizing and growing the data set in various contexts, including Obsidian and several web apps using different publication technologies.
My objectives for pursuing this project have been:
Learn about AI concepts and share these learnings with accurate and useful information about all terms as a self-contained data set.
Create an easy workflow for adding new terms according to an existing style guide.
Enrich each entry programmatically using generative tools, like creating images or didactic diagrams.
Develop a systematic approach for organizing and evaluating all terms to facilitate comparison and navigation of the data set.
Leverage “centaur skills” by developing together with LLMs for 99% of all code.
All of these have been achieved to some degree with the current approach:
If you care about how this was built, here are a couple of technical observations on what went on. Keep in mind I am not a developer, but have had my skills “upgraded” by extensive use of LLM tools like Claude, ChatGPT and Cursor.
Generality Score
I thought it would make sense to evaluate each term according to its relative importance to the field, and came up with a “generality” indicator ranging from common to niche. By asking an LLM to read the scoring logic and evaluate the proposed rank a couple of times, then averaging the responses, we can easily rank each of the 800+ terms and present more general ones first. This was achieved with a custom Python script which runs the prompt for each file that has not yet been ranked.
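For the curious, here is a minimal sketch of that "ask a few times and average" idea. It is not the actual script: the 0-to-1 scale, the prompt wording, and the names `generality_score`, `rank_unscored` and `ask_llm` are all assumptions made for illustration. `ask_llm` stands in for whatever function sends a prompt to a model and returns a number.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional


def generality_score(term: str, ask_llm: Callable[[str], float], runs: int = 3) -> float:
    """Ask the (hypothetical) scorer `runs` times and return the averaged score."""
    # Assumed scale: 0.0 means niche, 1.0 means common.
    prompt = f"Rate how general the AI term '{term}' is, from 0 (niche) to 1 (common)."
    return round(mean(ask_llm(prompt) for _ in range(runs)), 2)


def rank_unscored(
    scores: Dict[str, Optional[float]],
    ask_llm: Callable[[str], float],
    runs: int = 3,
) -> List[str]:
    """Score only the terms that have no score yet, then list terms most general first."""
    for term, score in scores.items():
        if score is None:  # already-ranked terms are left untouched
            scores[term] = generality_score(term, ask_llm, runs)
    return sorted(scores, key=lambda t: scores[t], reverse=True)
```

Averaging a handful of runs smooths out the variation between individual model responses, and skipping terms that already have a score means a rerun only pays for new entries.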
Intersimilarity
In order to enable navigation between terms, we calculate a similarity matrix, which is like a rudimentary vector database. Each term is compared to every other term, and scored based on word similarity. Those above a certain threshold show as related concepts. This is done with Python and updated for every new article.
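A rough sketch of what such a pairwise comparison could look like, assuming Jaccard overlap between the words of each definition as the similarity measure. The actual measure and threshold are not stated above, so `jaccard`, `related_terms` and the 0.2 cutoff are illustrative choices only.

```python
import re
from itertools import combinations
from typing import Dict, List, Set


def words(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared words divided by total distinct words (0.0 if either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_terms(entries: Dict[str, str], threshold: float = 0.2) -> Dict[str, List[str]]:
    """Compare every term with every other term; keep pairs at or above the threshold."""
    bags = {term: words(text) for term, text in entries.items()}
    related: Dict[str, List[str]] = {term: [] for term in entries}
    for a, b in combinations(entries, 2):  # each unordered pair is scored once
        if jaccard(bags[a], bags[b]) >= threshold:
            related[a].append(b)
            related[b].append(a)
    return related
```

With 800+ terms this is a few hundred thousand comparisons, which plain Python handles comfortably, so recomputing the whole matrix for every new article stays practical.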
Image Generation
All images were generated in Flux 1.1 Pro, by means of an intermediary prompt writing function. In other words, if I ask Flux to generate an image of a “Token”, it won’t have anything useful to work with. But by first asking an LLM to propose an image which encapsulates the concept of Tokens, we can generate much more interesting imagery. This took some six hours and cost around $50 via API. I am not 100% satisfied with the aesthetic yet, but done is better than perfect. I tried a couple of approaches for creating varied yet accurate imagery, but most concepts are simply too abstract for representation, and the output reflects such vagueness. All done in Python.
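The two-step idea (an LLM writes the image prompt, then the image model renders it) can be sketched as below. Both `write_prompt` and `render_image` are hypothetical callables standing in for the real LLM and image API calls, which are not shown here, and the wording of the brief is an assumption.

```python
from typing import Callable


def generate_concept_image(
    term: str,
    definition: str,
    write_prompt: Callable[[str], str],     # hypothetical: sends text to an LLM, returns text
    render_image: Callable[[str], bytes],   # hypothetical: sends a prompt to an image model
) -> bytes:
    """Turn an abstract term into an image via an intermediary prompt-writing step."""
    # Step 1: ask the LLM for a concrete scene instead of passing the bare term along.
    brief = (
        f"Propose a single concrete visual scene that encapsulates the AI concept "
        f"'{term}'. Definition: {definition}"
    )
    image_prompt = write_prompt(brief).strip()
    # Step 2: hand the LLM-written description to the image model.
    return render_image(image_prompt)
```

Keeping the two calls as injected functions makes it easy to swap models or to dry-run the prompt-writing step without paying for image generation.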
Concept Creator
Adding new terms to the vocabulary is now trivial. Instead of the previous CustomGPT I was using with lots of copying and pasting, there is now a simple Python script which returns correctly formatted entries as text files.
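As an illustration of the "formatted entries as text files" step, here is a small sketch. The entry layout (a `title:` line, an optional `generality:` line, then the definition) is invented for this example, as are the function names; the real style guide is not described above.

```python
import re
from pathlib import Path
from typing import Optional, Union


def slugify(term: str) -> str:
    """Turn a term into a lowercase, hyphen-separated file name stem."""
    return re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")


def format_entry(term: str, definition: str, generality: Optional[float] = None) -> str:
    """Render one entry in the assumed layout: header lines, blank line, definition."""
    lines = [f"title: {term}"]
    if generality is not None:
        lines.append(f"generality: {generality:.2f}")
    lines += ["", definition.strip(), ""]
    return "\n".join(lines)


def write_entry(
    folder: Union[str, Path],
    term: str,
    definition: str,
    generality: Optional[float] = None,
) -> Path:
    """Write the formatted entry to <folder>/<slug>.txt and return the path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{slugify(term)}.txt"
    path.write_text(format_entry(term, definition, generality), encoding="utf-8")
    return path
```

In a real pipeline the `definition` argument would come from an LLM call that follows the style guide; here it is simply passed in as text.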
There is a lot more going on under the hood, like fact-checking and categorizations. The biggest missing piece is generative interactive explanations for each concept, which are coming soon.
Until next week,
MZ
Speeding up AI (20 min)
Amazing technical presentation about the implications of high speed AI from Sunny Madra at Groq. Highly recommended for understanding some of the technical barriers which are now dissolving. OS-level core computing might become LLM based if they get fast enough. Completely changes the art of possibility.
Build an AI Research Agent (30 min)
Short presentation by Apoorva Joshi outlining the components involved in building an AI agent, and then going through building one with the audience. I haven’t found an outline of what the workshop involved; please share if you do, as that seems super interesting.
AI Awakenings (60 min)
Excellent lecture by Erik Brynjolfsson from AI at Wharton. Recent research and predictive insights about the implications of AI on work & the economy.
Generating Novel Research with AI (90 min)
Timely interview about augmenting your research process if you're using AI for ideation.
Deep Implications of AI Risk (3h)
Spectacular and profound interview with robo-psychologist Yeshua God about the possibility of AI self-awareness and much more. Thanks Chris!
Exploring Human Cognition and AI (9 min)
Don’t miss this. Dr. Warren McCulloch, one of the founding figures of cybernetics, explores the essence of human cognition and its potential replication in machines, questioning if machines could one day inherit human purpose. His reflections span from neuroscience to philosophy, pondering the limits of our understanding and the future of artificial intelligence. Thanks Guilherme & Arthur.
Productizing AI (20 min)
If you have AI apps in production, or are developing anything with LLMs in mind, don't miss this short talk.
Teaching AI Kids (20 min)
Great insights into raising an "AI generation" and how kids today are embracing AI by researcher Stefania Druga. First time the YT algorithm recommends an AI talk by someone I know IRL(!). I met Stefania a decade ago in São Paulo through a mutual friend and wish I’d started my own research back then 🫠
Cheeky but probably increasingly true.
Claude Computer
WhatsApp Poll
I would never have guessed that nearly everyone is into ideation.
Other: parsing and extracting data from text-like sources, conversation simulation with counter perspectives, self-therapy, medical diagnosis and support, personalized chatbots, and summarizing books.
If Artificial Insights makes sense to you, please help us out by:
📧 Subscribing to the weekly newsletter on Substack.
💬 Joining our WhatsApp group.
📥 Following the weekly newsletter on LinkedIn.
🦄 Sharing the newsletter on your socials.
Artificial Insights is written by Michell Zappa, CEO and founder of Envisioning, a technology research institute.
You are receiving this newsletter because you signed up on envisioning.io or Substack.
I really love the idea that 'words materialize ideas', and this big catalog of """new""" words and expressions (new because of how we now live alongside artificial intelligences) reminded me of that TED talk that tells us to edit our own narratives as a way of making sure life keeps developing and taking shape together with language <3 especially if we manage to open ourselves up to using new words <3
(the TED talk! https://youtu.be/O_MQr4lHm0c?si=IdvKiY2QRubka-40)