Yesterday OpenAI announced #Operator, a computer-using agent. It opens up a plethora of use cases in which the agent interacts with a browser and performs tasks for you. You can read more here: lnkd.in/g39xFaVA
However, it requires the $200 ChatGPT Pro plan.
There is also a cheaper but equally efficient alternative that lets you use any LLM of your choice. This is made possible by the #BrowserUse tool (lnkd.in/gSdi6jAB), which is available as open source, and the Web UI built on BrowserUse takes it further.
Check the attached video, where I instructed the agent to go to X.com, read the first tweet, and reply in a fun and engaging manner. I used the #Gemini experimental reasoning LLM for this run. You can use #OpenAI, #DeepSeek, and other models as well. My attempt with a locally hosted, Ollama-based #DeepSeek-R1 8B distilled version did not succeed, though. I will experiment further.
Welcome to the world of Agents and Operators that work on your behalf.
Follow me for more updates on AI progress and other emerging technologies.
#AI #TechnologyProgress #Innovations