Jimmy at Voxel51


Jimmy at Voxel51

@jimmy_voxel51

Level up your computer vision workflows with a free hands-on workshop for your team! Book a workshop: hubs.ly/Q04l8SlP0
These hands-on workshops are delivered by Voxel51 computer vision experts, in both virtual and in-person formats:
* 60 min virtual workshop
* Half-day onsite workshop
* Full-day onsite workshop and hackathon
#mcp #skills #computervision #ai #artificialintelligence #machinevision #machinelearning #physicalai

AamirShehzad @Aamir14381091

#ArtificialIntelligence #FireDetection #ComputerVision #DeepLearning #TechInnovation instagram.com/reel/DZjvCHRAb…

PixelMind (@pixelmindcv) • Instagram reel

instagram.com

Jimmy at Voxel51

@jimmy_voxel51

Join us on July 9 for day 2 of the “Best of CVPR” series. Register for the Zoom: hubs.ly/Q04l8nhB0
Talks will include:
* Efficient Representation and Coding of Dynamic Light Fields - Joshitha Ravishanker at Indian Institute of Technology Madras
* PHANTOM: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamic - Ismini Lourentzou at University of Illinois Urbana-Champaign
* LoST: Level of Semantics Tokenization for 3D Shapes - Niladri Dutt at UCL | Adobe
* 3D Reconstruction Improves Weakly-Supervised Semantic Segmentation - Wolfgang Boettcher at Max Planck Institute for Informatics
***********
Want to build better computer vision models? FiftyOne is an open source toolkit from Voxel51 (our Meetup sponsor) that helps you curate datasets, evaluate model performance, visualize embeddings, catch annotation errors, and eliminate duplicate images, all in one place. “pip install fiftyone” is all it takes to get started - hubs.ly/Q04l8Wbl0
#computervision #ai #artificialintelligence #machinevision #machinelearning #datascience #physicalai #mcp #agents

JMA_Journal @JMA_Journal

The PMR-YOLO model improved keypoint detection accuracy and achieved over 96% accuracy across nine dangerous personnel behavior categories, enhancing real-time risk detection in complex mining environments. #Mining #AI #Safety #ComputerVision scienceopen.com/document?vid…

Jimmy at Voxel51

@jimmy_voxel51

Join us on July 8 for day 1 of the “Best of CVPR” series. Register for the Zoom: hubs.ly/Q04l8nZH0
Talks will include:
* CylinderDepth: Cylindrical Spatial Attention for Multi-View Consistent Self-Supervised Surround Depth Estimation - Samer Abualhanud at Leibniz University Hannover
* Your ViT is Secretly Also a Video Segmentation Model - Daan de Geus at Eindhoven University of Technology
* LinkedOut: Linking World Knowledge Out of Video LLMs for Next-Generation Video Recommendation - Haichao Zhang at Northeastern University
* Some Modalities Are More Equal Than Others: Understanding and Improving Multimodal Integration in MLLMs - Tianle Chen at Boston University
***********
Want to build better computer vision models? FiftyOne is an open source toolkit from Voxel51 (our Meetup sponsor) that helps you curate datasets, evaluate model performance, visualize embeddings, catch annotation errors, and eliminate duplicate images, all in one place. “pip install fiftyone” is all it takes to get started - hubs.ly/Q04l8mFm0
#computervision #ai #artificialintelligence #machinevision #machinelearning #datascience #physicalai #mcp #agents

AI News Clips by Morris Lee: News to help your R&D @morris_phd

Wild3R: Feed-Forward 3D Gaussian Splatting from Unconstrained Sparse .. arxiv.org/abs/2606.11894 --- Newsletter morrislee1234.wixsite.com/we… More story linkedin.com/in/morris-lee-p… LinkedIn morris.short.gy/linkedin #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning #ComputerVision

Gen AI Spotlight

@GenAISpotlight

🔢 Count Anything: New AI Model Counts Objects from Text Across 6 Domains
Researchers from Tsinghua University released Count Anything, a vision model that counts objects in images based on a text query. It uses a dual approach: a region-level counter for large sparse objects and a pixel-level counter for small crowded ones. The two outputs are fused into a single point set showing where each counted instance is.
The model covers six domains: general scenes, remote sensing, histopathology, cellular microscopy, agriculture, and microbiology. They also built CLOC, a 220K-image dataset across 619 categories with 15M object instances to train and benchmark it on. Count Anything substantially beats existing open-world counting methods across all six domains.
Project page: GitHub
#ComputerVision #ObjectCounting #AIResearch #Tsinghua
───
🤖 For more AI news and story sources, search "GenAISpot" on Telegram
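As a rough illustration of the fusion step described above: the post does not say how Count Anything merges its two outputs, so this sketch assumes a simple rule in which a pixel-level point lying within `min_dist` pixels of a region-level point is treated as a duplicate of it. The function name `fuse_points` and the threshold value are hypothetical and not taken from the project.

```python
import math

def fuse_points(region_points, pixel_points, min_dist=5.0):
    """Merge two lists of (x, y) points into one point set.

    Assumed rule (not from the paper): keep every region-level point, and keep
    a pixel-level point only if it is at least `min_dist` away from all
    region-level points.
    """
    fused = list(region_points)
    for p in pixel_points:
        if all(math.dist(p, q) >= min_dist for q in region_points):
            fused.append(p)
    return fused

# The object count is the size of the fused point set.
points = fuse_points([(10.0, 10.0)], [(11.0, 10.0), (50.0, 50.0)])
count = len(points)
```

Pixel-level points are deliberately not de-duplicated against each other here, since the post says that branch targets small, crowded objects.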

Comidoc

@comidoc

Computer Vision with OpenCV and Python: Beginner to Advanced ⏱️ 1.8 hours ⭐ 4.26 👥 7,532 🔄 Jan 2026 💰 $17.99 → 100% OFF comidoc.com/udemy/opencv-beg… #OpenCV #Python #ComputerVision #udemy

Zhengzhong Tu

@_vztu

11h

We often hear that “computer vision has been solved.” But is it really so?
🚀 Excited to share our new work: CV-Arena: An Open Benchmark for Instructional Computer Vision Problem Solving with Human-AI Collaborative Preferences.
In this paper, we define instructional computer vision problem solving (iCVPS) as a broader formulation of image editing: given a real input image and a natural-language instruction, a system must produce an edited output that realizes the requested transformation while satisfying explicit preservation, geometric, physical, and usability constraints.
🧩 To support this direction, we introduce CV-Arena, an open benchmark designed for professional-grade visual editing and problem solving. CV-Arena contains:
✅ 12K high-resolution real-image instruction pairs
✅ 16 instruction-based visual task types
✅ Tasks spanning restoration, enhancement, computational photography, physically grounded object insertion, semantic manipulation, geometry-driven structural editing, and typography recovery
✅ Real-world images with native aspect ratios and high-resolution details
🔍 We also introduce CogRetriever, a dual-track retrieval and curation pipeline that combines targeted web search, agentic query refinement, verification, and traceability to construct diverse and legally traceable benchmark data.
⚖️ For evaluation, we propose Active Elo, a human-AI collaborative preference protocol. Instead of relying purely on automatic metrics or fully human annotation, Active Elo combines:
1. CV-Judge, a logic-gated, multi-dimensional VLM evaluator
2. selective routing of ambiguous high-quality comparisons to expert human raters
3. reliability-weighted Elo updates to aggregate mixed human and AI supervision
This allows us to evaluate models at scale while preserving alignment with expert human preferences.
📊 We benchmark 21 systems, including proprietary, open-source, and agentic models. Our results reveal persistent gaps in instruction adherence, physical reasoning, structural control, and fine-grained detail preservation.
🤖 Finally, we develop CV-Agent, a lightweight agentic baseline that combines planning, editing, and verification. The results suggest that closed-loop reasoning is a promising direction for professional-grade instruction-following visual editing.
💡 The main takeaway: as visual AI moves toward real workflows, the challenge is no longer only to generate visually plausible images. Models must also understand intent, preserve constraints, reason about structure and physics, and verify whether the edit actually solves the requested visual problem.
Project: ark1234.github.io/cv-arena
Code: github.com/taco-group/CV-Are…
#ComputerVision #GenerativeAI #MultimodalAI #ImageEditing #AIAgents #Benchmarking #CVArena #TAMU
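To make the "reliability-weighted Elo updates" idea in the thread above more concrete, here is a minimal sketch. The thread does not give the actual formula, so this assumes the simplest possible weighting: a standard Elo update whose step size (the K-factor) is scaled by a reliability value in [0, 1] attached to each judgment, whether it comes from a human rater or a VLM judge. The function names and the default `k=32.0` are illustrative assumptions, not the CV-Arena implementation.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def weighted_elo_update(rating_a, rating_b, outcome_a, reliability, k=32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    outcome_a: 1.0 if A was preferred, 0.5 for a tie, 0.0 if B was preferred.
    reliability: assumed weight in [0, 1]; less trusted judgments move the
    ratings less.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    delta = k * reliability * (outcome_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Same outcome, different trust: a fully trusted judgment vs. a half-trusted one.
trusted = weighted_elo_update(1500.0, 1500.0, 1.0, reliability=1.0)
half_trusted = weighted_elo_update(1500.0, 1500.0, 1.0, reliability=0.5)
```

With equal starting ratings the expected score is 0.5, so the fully trusted win moves each rating by 16 points and the half-trusted one by 8, while the sum of the two ratings stays constant.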

Jimmy at Voxel51

@jimmy_voxel51

11h

Level up your computer vision workflows with a free hands-on workshop for your team! Book a workshop: hubs.ly/Q04l6tJL0
These hands-on workshops are delivered by Voxel51 computer vision experts, in both virtual and in-person formats:
* 60 min virtual workshop
* Half-day onsite workshop
* Full-day onsite workshop and hackathon
#mcp #skills #computervision #ai #artificialintelligence #machinevision #machinelearning #physicalai

Ebokify

@ebokify

12h

Exploring the Raspberry Pi 5, a tiny computer with endless possibilities.
The Raspberry Pi 5 is not just another development board. It's a complete single-board computer capable of running operating systems, AI applications, IoT platforms, robotics projects, and even edge computing workloads. From GPIO interfacing to computer vision and automation, Raspberry Pi bridges the gap between software and hardware innovation.
What makes it powerful?
🔹 Quad-Core 64-bit ARM Cortex-A76 Processor
🔹 Up to 8GB LPDDR4X RAM
🔹 Dual 4K Display Support
🔹 PCIe Expansion Interface
🔹 High-Speed USB 3.0 Connectivity
🔹 Gigabit Ethernet & Wireless Networking
🔹 40-Pin GPIO for Hardware Integration
Currently exploring:
⚙️ Embedded Linux
🤖 Robotics & Automation
📡 IoT Systems
🧠 Edge AI & Computer Vision
🔌 Hardware-Software Integration
The deeper I explore Raspberry Pi, the more I appreciate how a credit-card-sized computer can power real-world innovation, from smart devices to intelligent robotic systems. 🚀
🔽 Download Best Raspberry Pi eBooks 📕 ebokify.com/raspberry-pi
#RaspberryPi5 #RaspberryPi #EmbeddedSystems #Linux #IoT #EdgeAI #ComputerVision #Robotics #Automation #ElectronicsEngineering #HardwareDesign #EngineeringStudent #TechInnovation #AestheticRagib

Jimmy at Voxel51

@jimmy_voxel51

13h

Join us on June 25 for the monthly AI, ML, and Computer Vision Meetup! Register for the Zoom: hubs.ly/Q04l8ygw0
Talks will include:
* Large-Scale Scene Reconstruction via Local View Transformers - Tooba Imtiaz at Northeastern University
* Enhancing Low-Field MRI with Deep Super-Resolution for Improved Nipah Virus Neuroimaging - Ajay Sharma at Johns Hopkins University
* Lessons learned from running AI workloads in production - David Hughes at Stelia
* And Now for Something Completely Different with FiftyOne - Burhan Qaddoumi at Voxel51
***********
Want to build better computer vision models? FiftyOne is an open source toolkit from Voxel51 (our Meetup sponsor) that helps you curate datasets, evaluate model performance, visualize embeddings, catch annotation errors, and eliminate duplicate images, all in one place. “pip install fiftyone” is all it takes to get started - hubs.ly/Q04l8k-B0
#computervision #ai #artificialintelligence #machinevision #machinelearning #datascience #physicalai #mcp #agents

Jeremy Park, PhD

@jeremyparkphd

13h

Got real-time object detection with depth sensing working for my iPhone app! You can see how combining LiDAR with object detection gives you the distance to each detected object in meters. You can also see the distance heatmap in the top right.
Details:
- Hardware: iPhone 15 Pro’s LiDAR sensor provides the depth map.
- Model: YOLOX-S running on the iPhone’s ANE (Apple Neural Engine)
This is a preliminary step before I connect this to my deadlift tracker iPhone app 🙂 @Apple
#ai #machinelearning #computervision #iphone
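One plausible way to turn a depth map plus a detection box into a single distance, as described in the post above, is to take the median of the valid depth readings inside the box. The post does not say how the app aggregates depth, and the app itself presumably runs in Swift on-device, so the Python below is only a language-neutral sketch. The function name `box_distance_m`, the box format, and the choice of the median are all assumptions.

```python
from statistics import median

def box_distance_m(depth_map, box):
    """Estimate the distance in meters to an object inside a bounding box.

    depth_map: list of rows, each a list of depth values in meters.
    box: (x_min, y_min, x_max, y_max) in pixel coordinates, max exclusive.
    Readings that are zero, negative, or NaN are treated as invalid and skipped.
    Returns None if the box contains no valid reading.
    """
    x0, y0, x1, y1 = box
    values = [d for row in depth_map[y0:y1] for d in row[x0:x1] if d > 0]
    return median(values) if values else None

# Tiny 3x3 depth map: a near object on the left, background at 5 m on the right.
depth = [
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 5.0],
    [0.0, 2.0, 5.0],  # 0.0 marks a missing reading
]
distance = box_distance_m(depth, (0, 0, 2, 3))
```

The median is used here because it is less sensitive than the mean to background pixels that fall inside the box.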

GOOD GIRL COIN

@Good_GirlVault

13h

One of the biggest problems in navigation isn't maps. It's awareness. Road conditions change. Hazards appear. Construction starts. Signs get missed. Routes evolve. Most navigation apps react after the fact. We're building Velvet Wayz to understand what's happening in the real world while you're moving through it. This week we advanced Velvet Eye's real-time environmental awareness, moving closer to a navigation system that doesn't just tell you where to go—it understands what it sees. Still early. Still browser-based. Still building. But every day the vision gets clearer. 🌎 Build. Share. Earn. Together. #VelvetWayz #AI #Navigation #ComputerVision #Startup Or, if you want something that matches the "One Coin. Two Sides." image that was generated: X Post: Most projects launch a token first and figure out the utility later. We took the opposite approach. For months we've been building: 🌎 Velvet Wayz 💜 Velvet Room 🏦 Velvet Wallet 💳 Velvet Pay 💬 Velvet Ping One ecosystem. One vision. One coin. HER. HIM. One Coin. Two Sides. $GGIRL #GGIRL #VelvetWayz #VelvetRoom #AI #Web3

Sipeed

@SipeedIO

Jan 11

Desktop soccer bot powered by #MaixCAM Lite! ⚽️🤖 Real-time visual recognition in action. Small but smart! 🚀 #ComputerVision #Robotics #DIY #EdgeAI #Maker

Kosta Derpanis (sabbatical in Zurich)

@CSProfKGD

Jun 10

Back to the lecture prep grind. VERY early version of my lecture on Policy Gradients. With #computervision and #robotics research appearing to converge, at least based on #CVPR2026 and my European tour conversations, it feels like good timing 😉

Jimmy at Voxel51

@jimmy_voxel51

15h

Join Nick Lotz on June 24 for a virtual workshop to learn how to use FiftyOne’s plugin framework to build custom computer vision applications. Register for the Zoom - hubs.ly/Q04l8fWF0
You’ll learn to extend the FiftyOne App with Python-based panels and server-side operators, as well as integrate external tools for labeling, vector search, and model inference into your dataset views. You’ll also automate repetitive tasks by writing custom workflows that execute within the FiftyOne environment.
Hands-on tutorials will include:
* Build Python plugins. Define plugin manifests and directory structures to register custom functionality within the FiftyOne ecosystem.
* Develop server-side operators. Write functions to execute model inference, data cleaning, or metadata updates from the App interface.
* Build interactive panels. Create custom UI dashboards to visualize model metrics or specialized dataset distributions.
* Manage operator execution contexts. Pass data between the App front end and your backend to build dynamic user workflows.
* Implement delegated execution. Configure background workers to handle long-running data processing tasks without blocking the user interface.
* Build labeling integrations. Streamline the flow of data between FiftyOne and annotation platforms through custom triggers and ingestion scripts.
* Extend vector database support. Program custom connectors for external vector stores to enable semantic search across large sample datasets.
* Package and share plugins. Distribute your extensions internally and externally.
***********
Want to build better computer vision models? FiftyOne is an open source toolkit from Voxel51 (our Meetup sponsor) that helps you curate datasets, evaluate model performance, visualize embeddings, catch annotation errors, and eliminate duplicate images, all in one place. “pip install fiftyone” is all it takes to get started - hubs.ly/Q04l84fl0
#mcp #skills #computervision #ai #artificialintelligence #machinevision #machinelearning #physicalai

Jimmy at Voxel51

@jimmy_voxel51

17h

Join Adonai Vera on June 17 for a virtual workshop to learn how to build production-ready AI agents. Register for the Zoom - hubs.ly/Q04l8CSY0
Learn how to build production-ready AI agents that can reason over your data, automate complex tasks, and integrate seamlessly into your existing stack using tools, skills, and the Model Context Protocol (MCP).
We’ll walk through how modern agentic systems move beyond simple prompts, leveraging structured tools like dataset operations, embeddings, evaluation pipelines, and model execution to take real action. You’ll see how these agents can tag data, run inference, evaluate performance, and surface insights automatically, all within a unified workflow.
By combining natural language interfaces with programmable building blocks, teams can dramatically reduce manual effort, accelerate experimentation, and unlock faster decision-making across the ML lifecycle.
***********
Want to build better computer vision models? FiftyOne is an open source toolkit from Voxel51 (our Meetup sponsor) that helps you curate datasets, evaluate model performance, visualize embeddings, catch annotation errors, and eliminate duplicate images, all in one place. “pip install fiftyone” is all it takes to get started - hubs.ly/Q04l8yZv0
#mcp #skills #computervision #ai #artificialintelligence #machinevision #machinelearning #physicalai

Comidoc

@comidoc

18h

Mastering OpenCV: A Practical Guide to Computer Vision ⏱️ 3.9 hours ⭐ 4.35 👥 24,519 🔄 Jun 2024 💰 $17.99 → 100% OFF comidoc.com/udemy/mastering-… #OpenCV #ComputerVision #Python #udemy

Jinam jain

@jinamcapital

18h

#hiring 💼 Referral Alert 🚨
Cairovision is hiring | AI Intern 🔥
📍 Greater Noida | Full Time Internship
→ AI Intern | CSE / IT / AI / ML graduates
Skills: Python | Computer Vision | OpenCV | YOLO | Deep Learning | Object Detection | Video Analytics
Work on real-time CV apps & industrial AI solutions
Apply 🔗 Send resume to anamika@cairovisions.com
Tag someone who fits this
#hiring #aiintern #computervision #python #deeplearning #noida #internship #techjobs #india
