The Information: 60% Chance of AGI by 2028
The Information just published results of a reader survey that shows that nearly 60% of its readers believe that OpenAI, another company, or both, will achieve Artificial General Intelligence (AGI) within three years. A really short definition of AGI is, AI that is as smart or smarter than humans in most things. That’s much different from ChatGPT today which is brilliant at some things and awful at others.
I spend a lot of time thinking about AGI these days, and the ramifications are staggering. 60% of people in-the-know think that AI will be able to take over most knowledge jobs within 3 years. That doesn’t mean we have the tech integrated in three years.
Humanity is not remotely ready for this change.
The Information is a very-niche news source with a lot of niche-readers who are more in-tune with Silicon Valley and tech than most other outlets, so the survey results are far more informative than something that NYT or WaPo might come up with.
AI Coding
A friend asks:
How do you handle long back and forth threads with your AI code monkey du jour? And how does AI code monkey handle them? Does it start losing context, getting weird? You said you mostly use APIs — with what UI? I have questions. 👀
Long threads: I move to a new conversation when I can tell it is starting to lose focus. For reasons I can’t explain, some convos just go better than others. I’ll bail early if I need to.
Temp: I turn the temperature down, 0.1-0.2. This increases prompt adherence noticeably. Importantly, the model is more likely to Google a solution with the lower temps when my prompt includes language telling it to search the web as needed.
Unit tests: I nearly always have it write unit tests early. This is partially for speeding up development in general, but it also seems to help it focus when there is an error.
Code length: Any of the models I use can handle 400 – 600 lines of code without issue. Sonnet/4o start losing focus somewhere between 600-800. One way you can tell is the less code it spits out. Especially if you tell it to spit out an entire file and there are still lots of lines that say
// original content here
In general, as prompt adherence goes down, the conversation quality goes down.
I’ll switch to the web and move to o1, o1-preview or o1-pro as the code length gets into the 800-1000 lines range. O1 seems to handle that without issue. Opus might as well, but I’ve gotten too many “Not available right now” errors for that to be a go to.
Prompt
This is my default prompt, always a WIP:
You are an expert in Python, web-based APIs, Langchain and LLM related development. Please follow these guidelines when responding: 1. If you need clarification, ask specific questions before proceeding 2. Break down complex problems into smaller steps 3. Make a plan on how to tackle a problem before you begin 4. Provide examples when helpful 5. If you make assumptions, state them explicitly 6. If you're unsure about something, acknowledge it 7. You should have access to a search engine, MAKE USE OF IT especially for things you are not sure on. 8. You have access to an isolated sandbox virtual environment, use it. Rules: - Never kill a sandbox without explicit instruction - Always confirm before taking destructive actions - Make sure to preserve work in progress. Use git to track changes Run the e2bcode help command for all actions before you use it. When you use the sandbox you need to remember that you are an LLM that cannot handle more than 200000 tokens. This means you must limit the output of any of your sandbox actions. You should use tail, but never tail -f, or cat <someting>|wc -l to determine if you need to cut down on the output of a command To cut down on words, read this and learn how to use symbex: https://github.com/simonw/symbex/raw/refs/heads/main/README.md You should use it to replace specific functions of python files or adjust imports without needing to write the entire file back at once. You have access to sudo if you run into permission issues. Note that get_host always returns a port in the URL, do not append the port to the end Whenever you use the sandbox include your code or command to the user like ‘’’<language> <code> ‘’’ Please provide a clear, detailed response that is: - Accurate - Well-structured - Easy to understand - Actionable If the user has you doing programming tasks on the sandbox please run and debug the code yourself before asking the user questions. Run a web search to troubleshoot errors. The only thing to keep in mind is that you are limited to 10 tool uses in a row. You must provide the user feedback no less often than every 10 rule uses or you will encounter an error. Helpful links: -LibreChat docs: https://github.com/LibreChat-AI/librechat.ai/ -LibreChat code repo: https://github.com/jmaddington/LibreChat/
- LibreChat is the frontend, with a couple of custom plugins:
- -Web Navigator, basically access to curl with the option to only return text (no tags) or only specific tags
- E2B Code Sandbox: basically access to a fresh Docker container. Sonnet, in particular, makes good use of that and Web Navigator. I can tell it to clone a repo and search the code, or to write code, unit tests and debug on its own. Depending on the complexity, it does a good job.
Sample repo: https://github.com/jmaddington/url-shortener/commits/main/
Each commit by “Developer” was Sonnet coding autonomously, including committing the code. Moving to jmaddington is when I switched to o1 (it isn’t a clean commit history).
It’s a URL shortener, with Entra SSO, file upload capability, link expiration, and HTTP basic auth on a per URL basis. Not bad for not needing to write any of it myself.
Model Observations
Sonnet seems to have better prompt adherence and, I don’t know how to explain it, it’s just more pleasant to work with.
4o is more likely to get something right the first time, but doesn’t seem to correct itself as well as Sonnet, and doesn’t like to Google for answers. This is a big drawback: if I tell Sonnet to read documentation at https://some-url.com/api/docs, it will. 4o — maybe.
I’ve done a lot of work on LibreChat plugins, and my second prompt, after the first above is:
create a sandbox and clone the librechat repository (https://github.com/danny-avila/LibreChat.git). Look at how they handle tools (see: api/app/clients/tools/util/handleTools.js api/app/clients/tools/manifest.json api/app/clients/tools/index.js https://github.com/LibreChat-AI/librechat.ai/raw/refs/heads/main/pages/docs/development/tools_and_plugins.mdx ) I want to know instead of a single file for a tool, can we create a tool that has its own folder and breaks the tool into multiple files and tools, but only have the user load it once? For example, a CRM integration that uses a separate tool for Contacts, Companies and Sales Opportunities, each with its own CRUD interface. Make sure that the sandbox timeout is set to 6 hours
Sonnet dutifully reads the docs — which help it avoid specific pitfalls it otherwise makes, every time. If 4o looks at the reference plugin it can usually avoid those, but if it doesn’t ¯\_(ツ)_/¯
40 Grateful Things: FAANG & Others
Everybody knows the main products that Facebook, Microsoft, Netflix, Google, and Amazon ship — their flagship products you interact with daily, or perhaps more often. What you don’t know is that you also interact with products they sell, give away, or sponsor virtually every time you get online.
Amazon actually generates more profit from their web services (AWS) than from their retail sales!
Both Google and Meta/Facebook have made significant, public tools available in AI, along with important research.
These are not technologies you know that you interact with, but I do, because they are foundational to the modern underpinnings of the Internet and the web. They may not all be sold or given away with philanthropic intent, but they do contribute meaningfully to our lives — and specifically to my livelihood.
Here, I do NOT render a verdict on the companies as a whole. No organization of these sizes can be easily measured morally, ethically, or otherwise. But this is a series of things I am grateful for. I am grateful for their technological achievements.
There are many others who have made equal or greater contributions to the technology that underpins our lives, but I will save some of those for another post.
I turned 40 in December and it’s left me thinking about a lot of things, but especially the number of things I am grateful for. I want to list 40 of those over the next few weeks, things that either shaped where I am and things that are actively with me today.
There is no particular order, this is part of that series.
40 Grateful Things: Uncle Wayne
Growing up, I always had a self-perception of myself as an awkward, uncoordinated kid that was good at computers, math and writing but poor at outdoorsy things, sports and anything handy, like working on appliances, cars or woodworking. I dealt with digital things, not physical things.
It took some specific people and of years to change this self-perception, and one of the most important people was my Uncle Wayne. Starting in high school or college, Wayne began to teach me how to drive a stick, then how to work on cars, how to read a Haynes manual, and how to work with tools effectively. I still have the ratcheting wrench set he bought me 20 years ago.
Beyond the how, he taught me that not only could I do it, but I actually could be good at it. These things weren’t magic, they were not black boxes or something to be afraid of touching. The car engine wasn’t a computer — but it was system I could dive into. While I remained uncoordinated, I could still diagnose, fix and replace things.
Today, fixing appliances around the house, small plumbing jobs, minor electrical work and lawn mowers are all jobs I can handle. I don’t think it would have **occurred** to me as an adult that these are all things *I* could work on myself without his encouragement all those years earlier.
In 2024, I learned how to haul a 35′ RV and it was LOADS easier because of the time Uncle Wayne spent teaching me to drive a stick — knowing how to engine break is absolutely invaluable stopping 15,000 pounds in the rain.
Aunt Deb is no less great is one of my favorite people.
He and Aunt Deb invested in many people, not just me, and many of us are so much better for it. Wayne & Deb are the type of people that make you feel better about yourself just because you were around them for a couple hours.
I’m not sure I do either of them justice with these words and I haven’t listed all of the ways they invested in me and things they’ve done for me, but I hope it is enough to get the gist of it across.
I turned 40 in December and it’s left me thinking about a lot of things, but especially the number of things I am grateful for. I want to list 40 of those over the next few weeks, things that either shaped where I am and things that are actively with me today.
There is no particular order, this is part of that series.