Joined February 2025
32 Photos and videos
Year 2: completed. ✅ Year 3: let's build. 🚀✨ #SpaceMarvel #AI #TechStartup #Innovation #2YearsStrong #BuildingTheFuture #FutureOfAI
🚀 @SpaceMarvelAI is now part of the @nvidia Inception Program! Excited to build faster with NVIDIA's AI ecosystem. #NVIDIAInception #AI #SpaceMarvelAI #StartupIndia
Companies don’t need more AI tools. They need better systems. #AI #Automation #AIAgents #FutureOfWork
No eval tool fits every product. What did you end up building in-house? 👇💬 #ArtificialIntelligence #GenerativeAI #AIEngineering #MachineLearning #TechLeadership #BuildInPublic #StartupLife
If reviewing outputs feels slow, your evals won’t scale. What’s missing in your review UI? 👇💬 #AI #GenAI #LLM #AIEvals #UXDesign #MLOps #AIProduct #BuildInPublic #TechInsights
When did off-the-shelf tools stop being enough for you? 👇💬 #AI #GenAI #LLM #AIEvals #AIObservability #MLOps #AIProduct #BuildInPublic #TechInsights
Should prompts be handcrafted - or auto-generated? Where do you draw the line? 👇💬 #AI #GenAI #LLM #PromptEngineering #AIEvals #MLOps #AIProduct #BuildInPublic #TechInsights
What should LLMs evaluate - and what should stay human? 👇💬 #AI #GenAI #LLM #AIEvals #HumanInTheLoop #AITesting #MLOps #AIProduct #BuildInPublic #TechInsights
Outsource annotation - or keep it in-house? Where have you seen better results? 👇💬 #AI #GenAI #LLM #AIEvals #HumanInTheLoop #AITesting #MLOps #AIProduct #BuildInPublic #TechInsights
Error analysis isn’t just technical - it’s a product decision too. Do PMs join your eval reviews? 👇 #AI #GenAI #LLM #AIEvals #ProductManagement #MLOps #AITesting #AIProduct #BuildInPublic #TechInsights
Is more annotation always better - or just more noise? 👇💬 #AI #GenAI #LLM #AIEvals #HumanInTheLoop #AITesting #MLOps #AIProduct #BuildInPublic #TechInsights
Would you rather have a confident wrong answer - or an honest “I don’t know”? 👇💬 #AI #GenAI #LLM #AIEvals #ResponsibleAI #AITesting #MLOps #MachineLearning #AIProduct #BuildInPublic
Can a model fairly evaluate its own work? Sometimes - with guardrails. What’s your setup? 👇💬 #AI #GenAI #LLM #AIEvals #AITesting #MLOps #MachineLearning #AIProduct #BuildInPublic #TechInsights
Similarity ≠ correctness. How do you evaluate AI outputs? 👇 #AI #GenAI #LLM #AIEvals #MachineLearning #MLOps #AIProduct #BuildInPublic #TechInsights
Off-the-shelf metrics are fast - but are they meaningful? What do you rely on today? 👇💬 #AI #GenAI #LLM #AIEvals #AITesting #MLOps #MachineLearning #AIProduct #BuildInPublic #TechInsights
Should every failure become an automated eval? Or only the ones that truly matter? 👇💬 #AI #GenAI #LLM #AIEvals #AITesting #MLOps #MachineLearning #AIProduct #BuildInPublic #TechInsights
Evals shouldn’t be an afterthought. They should guide what you build. Agree? 👇 #AI #GenAI #LLM #AIEvals #AITesting #MLOp
1-5 ratings feel detailed - but they hide disagreement. Pass/fail forces clarity. Which do you use today? 👇💬 #AI #GenAI #LLM #AIEvals #AITesting
You don’t need all traces. You need the right traces. How do you sample production today? 👇 #AI #GenAI #LLM #AIEvals #AIObservability #MLOps #MachineLearning #AIProduct #BuildInPublic #TechInsights
Different users = different failure modes. Segment your evals to see what’s really breaking. Do you bucket your queries? 👇🔥 #AI #GenAI #LLM #AIEvals #AITesting #MachineLearning