If you’re trying to keep up with all the advancements in AI lately…good luck.
Ever since OpenAI’s ChatGPT blew up late last year, it seems like there’s a new development in artificial intelligence every minute.
The latest craze is something called Auto-GPT, which uses ChatGPT’s underlying tech (GPT-4) to do things ChatGPT simply can’t.
You can even build your own Auto-GPT AI agent, or try it out on the web for yourself.
What is Auto-GPT?
Auto-GPT dramatically flips the relationship between AI and the end user (that’s you). ChatGPT relies on a back-and-forth between the AI and the end user: You prompt the AI with a request, it returns a result, and you respond with a new prompt, perhaps based on what the AI gave you. Auto-GPT, however, only needs one prompt from you; from there, the AI agent generates the list of tasks it thinks it needs to complete to accomplish whatever you asked, without any additional input or prompts. It essentially chains together LLM (large language model) “thoughts,” according to developer Significant Gravitas (Toran Bruce Richards).
Auto-GPT is a complex system relying on multiple components. It connects to the internet to retrieve specific information and data (something ChatGPT’s free version cannot do), features long-term and short-term memory management, uses GPT-4, OpenAI’s most advanced model, for text generation, and uses GPT-3.5 for file storage and summarization. There are a lot of moving parts, but it all comes together to produce some impressive results.
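To make that one-prompt loop a little more concrete, here is a minimal sketch in Python. It is not Auto-GPT’s actual code: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling GPT-4, and the `PLAN:`/`DO:` prompt prefixes, the three-item memory window, and the `max_steps` cap are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns canned text."""
    if prompt.startswith("PLAN:"):
        # Pretend the model answered with one task per line.
        return "search the web\nsummarize findings\nsave report"
    # For "DO:" prompts, echo the task named on the first line.
    task = prompt.splitlines()[0][len("DO: "):]
    return f"done: {task}"


def run_agent(goal: str, llm: Callable[[str], str], max_steps: int = 10) -> List[str]:
    """Turn a single goal into a task list, then work through it unattended."""
    # Step 1: the only user input is the goal; the model drafts the task list.
    plan = llm(f"PLAN: {goal}")
    tasks: Deque[str] = deque(
        line.strip() for line in plan.splitlines() if line.strip()
    )

    memory: List[str] = []  # results of earlier steps
    steps = 0
    # Step 2: execute each task, feeding recent results back in as context.
    while tasks and steps < max_steps:
        task = tasks.popleft()
        context = " | ".join(memory[-3:])  # assumed short-term memory window
        memory.append(llm(f"DO: {task}\nCONTEXT: {context}"))
        steps += 1
    return memory


if __name__ == "__main__":
    for line in run_agent("research the best headphones", fake_llm):
        print(line)
```

Swapping `fake_llm` for a function that calls a real model would give you the same shape of loop; the `max_steps` cap is there so that a runaway task list can’t keep going forever.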
How people are using Auto-GPT
The first example comes from Auto-GPT’s GitHub site. You can’t quite see all of the goals the demonstration lists for Auto-GPT to complete, but the gist is that someone asks the AI agent to research and learn more about itself. It follows suit, opening Google, finding its own GitHub repository, analyzing it, and compiling a summary of the data in a text file for the demonstrator to view.
Here’s a more practical example: The user wants to figure out which headphones on the market are the best. Instead of doing the research themselves, they turn to Auto-GPT, and prompt the AI agent with these four goals:
- Do market research for different headphones on the market today.
- Get the top five headphones and list their pros and cons.
- Include the price for each one and save the analysis.
- Once you are done, terminate.
After thinking for a moment, the AI agent springs into action, searching the web to compile information and reviews on headphones. It then spits out an easy-to-read plain text file, ranking the best headphones, listing their prices, and highlighting their pros and cons.
Twitter user Sully Omar tasked Auto-GPT with a similar objective: Omar pretended to be a shoe company and asked the AI agent to research waterproof shoes and report back on the top five competitors. The bot even had the wherewithal to vet reviews it considered in its analysis, since some reviews could be “biased or fake.”
You can find examples like these all over Twitter by searching “Auto-GPT” or “AGI” (artificial general intelligence). But before we get carried away, it’s worth remembering Auto-GPT is super new: Significant Gravitas dropped the project on GitHub in late March, and while the hype around the tool has grown considerably since, it’s still completely experimental.
NVIDIA AI scientist Jim Fan highlights that while Auto-GPT is a “fun experiment,” and one that’s easy for the general public to tinker with, it’s still a prototype that shouldn’t be mistaken for a revolution.
Even the developer knows Auto-GPT is still in its early stages, listing the following issues on their GitHub site:
- Not a polished application or product, just an experiment
- May not perform well in complex, real-world business scenarios. In fact, if it actually does, please share your results!
- Quite expensive to run, so set and monitor your API key limits with OpenAI!
But I think what makes Auto-GPT cool (or at least the promise of Auto-GPT) is the idea of being able to ask an AI to take on most of the responsibility for any given task. You don’t need to know the right questions to ask or the optimal prompts to give to make the AI do what you want. As long as your initial goals are clear, the AI can think of those next steps for you, and build you things you might not have been able to think of yourself. While we might not be there yet, the fact that Auto-GPT launched essentially two weeks after GPT-4 means we have no idea how quickly this type of tech will advance going forward.
How to try out Auto-GPT right now
You don’t need to know how to code in order to build your own AI agent with Auto-GPT, but it helps. You’ll need a Windows PC, an OpenAI API key (a pay-as-you-go plan is highly recommended), a text editor (like Notepad++), Git (or the latest stable release of Auto-GPT), and Python. There are plenty of other requirements if you want to expand Auto-GPT’s capabilities, such as integrating speech or alternative memory locations like Pinecone.
Auto-GPT’s GitHub page has an extensive list of instructions for setting up the tool as well as adding in those extras. Tom’s Hardware also has a great guide for simple setup if all you’re looking to do is try out an AI agent with Auto-GPT. If you do build it yourself, mind your token usage: We discuss setting limits in our OpenAI API piece so you don’t accidentally allow Auto-GPT to burn through your credit card balance.
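If you’d like a second safety net on top of the limits you set with OpenAI, a local spending guard is one way to go. The sketch below is only an illustration: `SpendTracker` and `BudgetExceeded` are hypothetical names, not part of Auto-GPT or the OpenAI library, and the per-token price is a placeholder you’d replace with the current rate for whichever model you use.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next call would push estimated spend past the cap."""


class SpendTracker:
    """Keeps a running estimate of API spend and enforces a hard cap."""

    def __init__(self, cap_usd: float, usd_per_1k_tokens: float) -> None:
        self.cap_usd = cap_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens  # placeholder rate, not real pricing
        self.spent_usd = 0.0

    def record(self, tokens: int) -> float:
        """Add the cost of `tokens` and return the remaining budget in USD."""
        cost = tokens / 1000 * self.usd_per_1k_tokens
        if self.spent_usd + cost > self.cap_usd:
            raise BudgetExceeded(
                f"cap of ${self.cap_usd:.2f} would be exceeded "
                f"(already spent ${self.spent_usd:.2f})"
            )
        self.spent_usd += cost
        return self.cap_usd - self.spent_usd


if __name__ == "__main__":
    tracker = SpendTracker(cap_usd=1.00, usd_per_1k_tokens=0.5)  # made-up numbers
    print(tracker.record(1000))  # remaining budget after one call
```

The idea is to call `record()` with the token count of each response and let the exception stop the agent before it makes another request. It only estimates spend on your side, so it complements the limits on your OpenAI account rather than replacing them.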
However, you don’t need to build the AI agent yourself if all you want to do is try out Auto-GPT. Some developers have built interfaces for Auto-GPT that are easy to access from your web browser, no coding experience necessary. Cognosys was free to use until high demand forced developers to require an OpenAI API key to access it. AgentGPT is an interesting example you don’t need an API key for, but it limits the number of tasks the AI will generate for itself. Still, it will give you a sense of how the process works, and you can increase those limits by providing an API key.