Sudhanshu Shekhar


14m

Stop thinking in single data types. Multimodal AI now unifies text, image, audio & video for seamless understanding. This is how next-gen AI truly perceives the world. #MultimodalAI #GenAI #AITools

Nishan Chowdury

@Nishan_011

I kept seeing Agnes-2.0-Flash ranking high on AI leaderboards, so I gave it a try.

First, I tested it on a Python API project with JSON response issues. Instead of generic suggestions, it identified the exact bug, explained the cause, and provided a working fix.

Then I tested its multimodal capabilities:
• Agnes-Image-2.0-Flash → generated a clean product mockup from a simple idea
• Agnes-Video-V2.0 → turned the same concept into a short demo video

Everything worked in one workflow without switching tools.

What’s surprising is that the entire stack is currently free:
• Agnes-2.0-Flash
• Agnes-Image-2.0-Flash
• Agnes-Video-V2.0

If you use Claude Code, Codex, or similar AI tools, Agnes AI is worth checking out.

#AgnesAI #MultimodalAI #AIAgent #DeveloperTools #AIWorkflow

Noor Islam S. Mohammad @ICML @nislam_mohammad

🚀 Excited to share that two of my papers have been accepted to the ICML 2026 Workshop on Efficient Multimodal Question Answering (EMMQA)! #ICML2026 #MachineLearning #LLM #MultimodalAI #ReasoningAI #ArtificialIntelligence #Research

Deborah

@n__deborah

Lately I've been testing Agnes-2.0-Flash, and it's one of the more interesting AI models I've tried recently. What got my attention wasn't image generation; it was how useful it was in actual workflows.

I used it for coding tasks, debugging, research, and workflow planning, and it handled all of them surprisingly well. Instead of jumping between multiple tools, I could use a single model for reasoning, problem-solving, and content creation.

It's easy to see why Agnes-2.0-Flash ranks among the Top 10 AI models on major benchmark leaderboards. The model feels fast, capable, and reliable across a wide range of tasks.

I also tested the multimodal side of the platform:
- Using Agnes-Image-2.0-Flash, I generated detailed visuals with complex prompts.
- Agnes-Video-V2.0 made it easy to create short video content from ideas that would normally take much longer to produce.

What surprised me most is that the entire multimodal stack (agent, image, and video models) is currently available for free, with unlimited access and no time limit. A workflow that would normally require multiple paid AI subscriptions can now be handled in one place without the extra cost.

If you're a developer, builder, creator, or someone who enjoys experimenting with AI workflows, it's definitely worth checking out.

Try Agnes AI: agnes-ai.com/

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow #AgnesFreeAPI #AgnesAPITutorial

Zhengzhong Tu

@_vztu

14h

We often hear that "computer vision has been solved." But is it really so?

🚀 Excited to share our new work: 𝗖𝗩-𝗔𝗿𝗲𝗻𝗮: 𝗔𝗻 𝗢𝗽𝗲𝗻 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗳𝗼𝗿 𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝗮𝗹 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗣𝗿𝗼𝗯𝗹𝗲𝗺 𝗦𝗼𝗹𝘃𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗛𝘂𝗺𝗮𝗻-𝗔𝗜 𝗖𝗼𝗹𝗹𝗮𝗯𝗼𝗿𝗮𝘁𝗶𝘃𝗲 𝗣𝗿𝗲𝗳𝗲𝗿𝗲𝗻𝗰𝗲𝘀.

In this paper, we define 𝗶𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝗮𝗹 𝗰𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝘃𝗶𝘀𝗶𝗼𝗻 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝘀𝗼𝗹𝘃𝗶𝗻𝗴 (𝗶𝗖𝗩𝗣𝗦) as a broader formulation of image editing: given a real input image and a natural-language instruction, a system must produce an edited output that realizes the requested transformation while satisfying explicit preservation, geometric, physical, and usability constraints.

🧩 To support this direction, we introduce 𝗖𝗩-𝗔𝗿𝗲𝗻𝗮, an open benchmark designed for professional-grade visual editing and problem solving. 𝗖𝗩-𝗔𝗿𝗲𝗻𝗮 contains:
✅ 12K high-resolution real-image instruction pairs
✅ 16 instruction-based visual task types
✅ Tasks spanning restoration, enhancement, computational photography, physically grounded object insertion, semantic manipulation, geometry-driven structural editing, and typography recovery
✅ Real-world images with native aspect ratios and high-resolution details

🔍 We also introduce 𝗖𝗼𝗴𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗲𝗿, a dual-track retrieval and curation pipeline that combines targeted web search, agentic query refinement, verification, and traceability to construct diverse and legally traceable benchmark data.

⚖️ For evaluation, we propose 𝗔𝗰𝘁𝗶𝘃𝗲 𝗘𝗹𝗼, a human-AI collaborative preference protocol. Instead of relying purely on automatic metrics or fully human annotation, Active Elo combines:
1. 𝗖𝗩-𝗝𝘂𝗱𝗴𝗲, a logic-gated, multi-dimensional VLM evaluator
2. selective routing of ambiguous high-quality comparisons to expert human raters
3. reliability-weighted Elo updates to aggregate mixed human and AI supervision

This allows us to evaluate models at scale while preserving alignment with expert human preferences.

📊 We benchmark 21 systems, including proprietary, open-source, and agentic models. Our results reveal persistent gaps in instruction adherence, physical reasoning, structural control, and fine-grained detail preservation.

🤖 Finally, we develop 𝗖𝗩-𝗔𝗴𝗲𝗻𝘁, a lightweight agentic baseline that combines planning, editing, and verification. The results suggest that closed-loop reasoning is a promising direction for professional-grade instruction-following visual editing.

💡 The main takeaway: as visual AI moves toward real workflows, the challenge is no longer only to generate visually plausible images. Models must also understand intent, preserve constraints, reason about structure and physics, and verify whether the edit actually solves the requested visual problem.

𝗣𝗿𝗼𝗷𝗲𝗰𝘁: ark1234.github.io/cv-arena
𝗖𝗼𝗱𝗲: github.com/taco-group/CV-Are…

#ComputerVision #GenerativeAI #MultimodalAI #ImageEditing #AIAgents #Benchmarking #CVArena #TAMU
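
For readers unfamiliar with the idea, here is a minimal sketch of what a reliability-weighted Elo update could look like. The thread does not give the actual Active Elo formula, so everything below is an assumption for illustration: it uses the standard Elo expected-score formula and scales the K-factor by a per-judgment reliability weight in [0, 1]. The function names, the K-factor of 32, and the example reliabilities are hypothetical and are not taken from the CV-Arena paper or code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def weighted_elo_update(
    rating_a: float,
    rating_b: float,
    score_a: float,
    reliability: float,
    k: float = 32.0,
) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    reliability is an assumed weight in [0, 1]: a less trusted judge
    (for example an automatic evaluator) moves the ratings less than
    a fully trusted one (for example an expert human rater).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    delta = k * reliability * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Hypothetical usage: the same win, judged by a fully trusted rater
# and by a judge that is only half trusted.
human_vote = weighted_elo_update(1500.0, 1500.0, score_a=1.0, reliability=1.0)
vlm_vote = weighted_elo_update(1500.0, 1500.0, score_a=1.0, reliability=0.5)
print(human_vote)  # (1516.0, 1484.0)
print(vlm_vote)    # (1508.0, 1492.0)
```

Scaling the step size by reliability is only one plausible way to mix human and AI judgments; the authors' protocol may weight or aggregate them differently.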

Timelines @hulio_ai

22h

🌐 Foundation models: Stronger reasoning & long-context skills. 🖼️ Multimodal AI: Expanding beyond text to images, audio & video. 🚀 AI in enterprise: Enhancing productivity in everyday tools. #AITrends #FoundationModels #MultimodalAI #EnterpriseAI timelines.hulio.ai/result/91…

Timelines - Track and discover new information of your search queries

Track and analyze time series of query results from answer engines including Perplexity, Google, and more. Get notified when the results meet your conditions.

timelines.hulio.ai

Alexander Inspira IA

@Alex_Inspira

23h

I recently tried Agnes AI in a real development workflow and, honestly, what surprised me most wasn't the image generation but Agnes-2.0-Flash as an agent.

I used it to debug an API integration problem, analyze errors, propose a cleaner structure for the workflow, and help me plan the project's next steps. More than a chatbot, it felt like an agent capable of assisting with real coding, debugging, and research tasks.

Agnes-2.0-Flash also ranks among the best-performing models on major AI rankings and benchmarks, which explains why the results were so solid across different scenarios.

Next I used Agnes-Image-2.0-Flash to generate UI concepts from simple text instructions, and I also tried the video generation capabilities.

What surprised me:
• Agent, image generation, and video generation on a single platform
• Agnes-2.0-Flash handled coding, debugging, planning, and research tasks quite well
• The multimodal workflow felt genuinely useful, not just a flashy feature
• Unlimited access with no paywall

The most curious part is that tools with this level of performance usually sit behind subscriptions or usage limits. Here we are talking about a complete multimodal platform that currently remains free.

I normally have to switch between several tools to code, plan, and create content, but Agnes managed to cover much of the workflow in one place. For a multimodal system with free, unlimited access, it is surprisingly capable. It's worth trying if you work with AI tools.

🔗 agnes-ai.com/

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow #AgnesFreeAPI #AgnesAPITutorial

Kalsoom (ghotai )

@AIwithGhotai

Jun 13

A top-tier multimodal model from a leading AI lab, and it’s still completely free?

I kept seeing Agnes-2.0-Flash appear near the top of a few public AI leaderboards, so I decided to test it on a real development workflow. I didn’t expect much at first, but the results were genuinely impressive.

My first test was a Python API project that kept failing due to inconsistent JSON responses. I dropped the repo context and error logs into Agnes-2.0-Flash, and instead of giving generic debugging advice, it traced the issue to the exact failure point, explained what was going wrong, and returned a clean, working fix. It felt more like an actual coding/debugging agent than a chatbot.

Then I tested the multimodal stack:
• Agnes-Image-2.0-Flash → turned a simple idea into a clean product mockup
• Agnes-Video-V2.0 → generated a short demo video from the same concept

No extra tooling, no switching between apps, just one system handling the full workflow.

What really stands out is this: a multimodal agent stack like this would normally sit behind multiple paid subscriptions, yet here it is, fully open with unlimited access.
• Agnes-2.0-Flash
• Agnes-Image-2.0-Flash
• Agnes-Video-V2.0

No paywalls. No usage limits. One unified workflow.

If you’re already using tools like Claude Code or Codex for development workflows, this is worth trying. A free multimodal stack that performs like this is rare right now.

@agnesai_sapiens 🔗 agnes-ai.com/

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #AIWorkflow

Whissle

@WhissleAI

Jun 13

You can now run our unified Voice AI locally with a single Docker command. Start building locally with the Whissle Gateway: whissle.ai/gateway #VoiceAI #EdgeAI #Privacy #MultiModalAI #Sustainability #DataSovereignty #DistributedAI #HybridCloud

Tulsi Soni

@shedntcare_

Jun 13

Spent a few hours testing Agnes AI today and I genuinely didn’t expect this stack to still be free.

What's even more surprising is that we're not talking about some small experimental model. Agnes-2.0-Flash ranks among the Top 10 AI models on major benchmark leaderboards, yet the entire multimodal stack is still available with unlimited access and no time limit.

Most “free” AI tools lock the useful parts behind API limits or paywalls. Agnes currently gives free access to the full multimodal system: agent, image, and video generation. Honestly, I would've expected capabilities at this level to sit behind a subscription or usage cap.

I used Agnes-2.0-Flash inside a real debugging workflow. A Node.js API service I was testing kept failing because of an async queue issue that Claude Code couldn’t trace cleanly from the logs alone. So I dropped the repo structure and error traces into Agnes-2.0-Flash.

What surprised me: it didn’t just explain the bug. It mapped the execution flow, identified the race condition, and suggested a cleaner retry architecture with structured middleware separation. The fix worked on the first pass.

Then I tested the multimodal side:
→ Agnes-Image-2.0-Flash: the result was surprisingly usable without heavy prompt engineering.

I also tested Agnes-Video-V2.0 for a short product demo animation, and the output was way smoother than I expected from a free stack.

What’s interesting is how well it fits existing dev workflows. Right now my setup is basically: Claude Code → Agnes-2.0-Flash for reasoning/debugging/planning → Agnes Image/Video for assets & demos.

Kind of wild that a Top 10 benchmark-performing multimodal model stack is currently available for free, with unlimited access and no obvious paywall. A workflow that would normally require multiple paid AI subscriptions can be done in one place. Feels less like “another chatbot” and more like a genuinely usable developer agent system.

Try it here: agnes-ai.com/

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow #AgnesFreeAPI #AgnesAPITutorial

Sharon Riley

@Just_sharon7

Jun 13

Just spent the afternoon testing Agnes AI, specifically Agnes-2.0-Flash, and honestly, I’m pretty impressed.

I originally came for the image generation, but what stood out most was how useful the model was in actual workflows. I tested it on a debugging task involving a Node.js API service, using it to analyze logs, identify issues, and suggest a cleaner architecture for handling async requests. I also used it for research, planning, and coding-related tasks, and it felt much closer to a real developer agent than a typical chatbot.

It’s easy to see why Agnes-2.0-Flash ranks among the Top 10 AI models on major benchmark leaderboards. The model performs surprisingly well across reasoning, coding, workflow planning, and multimodal tasks.

On the creative side, I also tested image generation with prompts like: "A silver-haired anime warrior with her glowing azure dragon spirit guardian in epic manga cover style." The results were detailed, creative, and usable without much prompt engineering.

What’s even more surprising is that Agnes isn’t just an image model. It’s a full multimodal system combining agent, image, and video capabilities in one place. And right now the entire stack is available for free, with unlimited access and no time limit. A workflow that would normally require multiple paid AI tools can be done through a single platform.

If you're building, coding, researching, automating workflows, or creating content, it's definitely worth checking out.

Try Agnes AI: agnes-ai.com/

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow #AgnesFreeAPI #AgnesAPITutorial

Merge News

@mergenewsapp

Jun 12

Kling AI 3.0 models launch, offering multimodal AI for photorealistic video & image creation. Now everyone can be a director. #multimodalai #videogeneration #imagegeneration #aimodels

Firoj Alam

@firojalam04

Jun 12

#NLProc Introducing ImageEval 2026 - a new shared task on Cultural Grounding in Arabic Multimodal AI, organized with #ArabicNLP2026 and co-located with #EMNLP2026.

The task evaluates how well multimodal AI systems understand and generate culturally grounded Arabic visual content from the MENA region.

Tracks:
🔹 Arabic Visual QA & Hallucination Detection
🔹 Cultural Accuracy Evaluation for Text-to-Image Generation

Registration: shorturl.at/utvGK
📅 Registration deadline: July 20, 2026
📅 System papers due: August 15, 2026
🌐 imageeval2026.github.io/
📂 github.com/ImageEval2026/Ima…

Open to researchers in Arabic NLP, multimodal AI, computer vision, generative AI, and cultural computing. Please feel free to share!

W/ @sabdaljalil_ @AhlamBashiti Farina Amir @shammur_absar @NadirDurrani5 @dalvifahim @baselmousi995 Hunzalah Hassan Bhattihtt @Zein_5 Erchin Serpedin Hasan Kurban Mustafa Jarrar

#ImageEval2026 #ArabicNLP #MultimodalAI #VQA #TextToImage @emnlpmeeting @_ArabicNLP

Web4app @web4app

Jun 12

LAMIS IS THE DREAM COMING TO LIFE ❤️ 🔗 github.com/auraecosystem/Lam… @web4app #LAMIS #EmbodiedAI #Robotics #HumanoidRobots #AI #MultimodalAI #RobotCompanion #OpenSourceAI #UnitreeG1

GitHub - auraecosystem/Lamis

Contribute to auraecosystem/Lamis development by creating an account on GitHub.

github.com

Timelines @hulio_ai

Jun 12

🤖 Multimodal Models: AI now handles text, images, audio & video. 🛠️ Agentic AI: Models perform multi-step tasks and real-world actions. ⚡ Efficient AI: Smaller models mean strong AI runs on everyday devices. #AI2026 #MultimodalAI #AgenticAI #EfficientAI timelines.hulio.ai/result/b1…

Nelly;

@nrqa__

Jun 12

Our startup cuts its monthly LLM budget entirely after switching to Agnes

Our small dev startup always struggled with recurring API bills from GPT-4o and Claude; each sprint’s debugging and data processing racked up unexpected charges. After we saw that Agnes falls into the global top 10 tier on the Claw-Eval public benchmark, we migrated part of our workflow as a trial.

The unbelievable perk: Agnes-2.0-Flash, Agnes-Image-2.0-Flash, and Agnes-Video-V2.0 all offer permanent free API access with no usage cap. We are now building internal automation agents on this platform and generating the promotional videos and marketing images we need for daily work, so we no longer have to spend extra money on content.

Zero ongoing AI spending is a massive lifesaver for cash-tight small startups.

Check it out 👉🏻 agnes-ai.com 🔗 platform.agnes-ai.com/login @agnesai_sapiens

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow

Grace Miao @GraceQMiao

Jun 12

Replying to @GraceQMiao @racdale @social_brains @JoyceOoops @LouisaLyu1 @ElisaKreiss @parkinsoncm @guscooney @angelhwang6

Now: the data. Can't wait to dig in! If you're working on #HumanAIInteraction #MultimodalAI #SocialNeuroscience #AIEvaluation, I'd love to connect. (6/6)

Alif Khan

@Alifkhanzxx

Jun 12

Freelance dev rant: Finally stop burning cash on paid closed-source AI

I was tired of deducting AI fees from my freelance profits. I used to spend hundreds a month on premium LLMs for client coding and design work. A fellow freelancer recommended Agnes, whose benchmark results rank within the global top 10 tier. @agnesai_sapiens

What’s surprising is that the full three-model suite is free forever via API, with no usage limits. Whether I need code bug fixes or custom promotional images for clients, there are no more per-call fees like with Gemini or Claude. Occasionally I organize project brief slides with customers in its collaborative workspace, where comments and editing are kept neatly separate.

Anyone taking sporadic dev orders really should try this cost-saving option.

🔗 agnes-ai.com

#AgnesAI #Agnes2Flash #FreeAIModel #AIAgent #MultimodalAI #CodingWithAI #DeveloperTools #NoMorePaywalls #AIWorkflow

tong niu @tongniu66

Jun 12

Constant AI tweaks reveal unscalable architecture. Ditch bulky model adapters. Shift from hard coding to task orchestration. crun.ai unifies models, removes redundant work, and scales your AI seamlessly. #AI #MultimodalAI #Dev 🔗 crun.ai

Denis Rylikov @drylikov

Jun 11

#𝐀𝐈 #𝐒𝐭𝐞𝐩𝟑𝟕𝐅𝐥𝐚𝐬𝐡 #𝐌𝐢𝐱𝐭𝐮𝐫𝐞𝐎𝐟𝐄𝐱𝐩𝐞𝐫𝐭𝐬 #𝐎𝐩𝐞𝐧𝐒𝐨𝐮𝐫𝐜𝐞𝐀𝐈 #𝐋𝐋𝐌 #𝐌𝐮𝐥𝐭𝐢𝐦𝐨𝐝𝐚𝐥𝐀𝐈 #𝐀𝐠𝐞𝐧𝐭𝐢𝐜𝐖𝐨𝐫𝐤𝐟𝐥𝐨𝐰𝐬 #𝐋𝐋𝐌𝐎𝐩𝐬 #𝐇𝐢𝐠𝐡𝐓𝐡𝐫𝐨𝐮𝐠𝐡𝐩𝐮𝐭 #𝐌𝐨𝐄 #𝐀𝐈𝐟𝐨𝐫𝐃𝐞𝐯𝐞𝐥𝐨𝐩𝐞𝐫𝐬 #𝐭𝐡𝐫𝐞𝐚𝐝 #𝐝𝐫𝐲𝐥𝐢𝐤𝐨𝐯
