AI has transformed how video is created. We think the next wave is about understanding it.
Over the past few years, we've seen remarkable advances in video generation, editing, avatars, and creative tooling. A different problem is becoming increasingly important: teaching machines to search, analyze, reason over, and extract insight from video - across massive libraries and live streams alike.
We're calling this video intelligence, and we're actively looking to back founders building here. We're most excited about companies pushing on the core capabilities:
- Video-native models - multimodal embeddings, temporal reasoning, and retrieval built specifically for video rather than adapted from image or text
- Real-time and large-scale pipelines - infrastructure for processing, indexing, and querying video at the speed and scale enterprises actually need
- Agentic and reasoning layers - systems that don't just retrieve clips but answer questions, surface anomalies, and take action on what they see
The models and infrastructure to make this real appear to be crossing a capability threshold right now. Multimodal foundation models are maturing, storage costs have collapsed, and enterprises are sitting on years of unstructured video with no way to use it.
That infrastructure unlocks a wide range of applications, including media and sports workflows, security and physical operations, enterprise knowledge management, advertising analytics, robotics, and consumer products: areas where video has historically been dark data.
If you're building in video intelligence at the model layer, the platform layer, or in a vertical application, we'd love to talk!