cpo @ grab, before: engineering vp @ telenav, founder skobbler (sold to telenav).

Joined May 2007
Thanks for the partnership. This is a great model, and I'm super excited to bring it to our users and drivers so they can talk seamlessly across languages. Finally, the Babel fish has arrived in the real world :)
Replying to @ivanleomk
Love that @GrabSG is testing it 🇸🇬🙌 x.com/googledevs/status/2064…
Nikhyl's new Skip episode on PM reinvention is excellent. His framing: the shift from capital-M Manager to builder is hardest for senior PMs, because what got us here no longer differentiates. My honest reaction: I'm more excited about being in product than I've been in years. We get to build again. podcasts.apple.com/de/podcas…
Super cool vision of how user interfaces of the future can be generated by LLMs on the fly. Mind-blowing demo!
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see. @eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Been waiting for this one. Clear open-source coding SOTA. Would love K2.6 on @cerebras at 1,000 tps; that would be a game-changer. Congrats to our friends on the Moonshot team.
Meet Kimi K2.6: Advancing Open-Source Coding 🔹 Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2) What's new: 🔹 Long-horizon coding - 4,000 tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization). 🔹 Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP Framer Motion, Three.js 3D. 🔹 Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100 files. 🔹 Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops. 🔹 Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop. - K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code - 🔗 API: platform.moonshot.ai 🔗 Tech blog: kimi.com/blog/kimi-k2-6 🔗 Weights & code: huggingface.co/moonshotai/Ki…
Google just dropped Gemini 3.1 Flash Live and it looks like a real step forward for real-time audio models. I got a chance to play with it a bit, and it is super cool.
What stands out to me is the jump in intelligence, accuracy, and function calling - and that they enable search grounding. This is the kind of progress that could make voice agents genuinely useful, not just impressive in demos.
Introducing Gemini 3.1 Flash Live, our new realtime model to build voice and vision agents!! We have spent more than a year improving the model infra experience. The results? A step-function improvement in quality, reliability, and latency.
World Models are the next evolution after LLMs, and the most exciting thing happening in maps right now. The Seoul World Model from Naver/KAIST: a real, promptable Seoul built from 1.2M street-view images. Navigate freely for kilometers. Reshape scenes with text. No hallucinated cities; this is grounded in reality. Incredible work: seoul-world-model.github.io
Congrats to @ashtom and the entire team on the launch! Super exciting that we’re getting an AI-native dev platform.
Feb 10
Beep, boop. Come in, rebels. We’ve raised a 60m seed round to build the next developer platform. Open. Scalable. Independent. And we ship our first OSS release today. entire.io/blog/hello-entire-…
Philipp Kandal retweeted
What happens when AI agents can transact, coordinate compute, and operate onchain at scale? Across Asia, this infrastructure is already live. Agent-native payments, decentralized GPU networks, and onchain identity systems are moving into production at blazing speeds. Today we’re releasing The 2026 AI × Blockchain Convergence Report, built with @Superscrypt, @base, and @awscloud, with contributions from @GrabSG, @AethirCloud, @virtuals_io, @MessariCrypto, @chatandbuild, @Sogni_Protocol, @KaitoAI and others. LFG!
1 Oct 2025
Sora 2 is mind-blowing. Good luck @sama mining for more GPUs
28 Sep 2025
Just in time for Golden Week: Tencent’s HunyuanImage 3.0 is open‑source. An 80B-parameter MoE with state-of-the-art performance. Massive kudos!
We’re excited to announce the release and open-source of HunyuanImage 3.0, the largest and most powerful open-source text-to-image model to date, with over 80 billion total parameters, of which 13 billion are activated per token during inference. The effect is completely comparable to the industry’s flagship closed-source model. 🚀🚀🚀 HunyuanImage 3.0 originates from our internally developed native multimodal large language model, with fine-tuning and post-training focused on text-to-image generation. This unique foundation gives the model a powerful set of capabilities: ✅ Reason with world knowledge ✅ Understand complex, thousand-word prompts ✅ Generate precise text within images Different from traditional DiT architecture image generation models, HunyuanImage 3.0’s MoE architecture uses a Transfusion-based approach to deeply couple Diffusion and LLM training for a single, powerful system. Built on Hunyuan-A13B, HunyuanImage 3.0 was trained on a massive dataset: 5 billion image-text pairs, video frames, interleaved image-text data, and 6 trillion tokens of text corpora. This hybrid training across multimodal generation, understanding, and LLM capabilities allows the model to seamlessly integrate multiple tasks. Whether you're an illustrator, designer, or creator, this is built to slash your workflow from hours to minutes. HunyuanImage 3.0 can generate intricate text, detailed comics, expressive emojis, and lively, engaging illustrations for educational content. The current release focuses solely on text-to-image generation and future updates will include image-to-image, image editing, multi-turn interaction, and more. 👉🏻 Try it now: hunyuan.tencent.com/image 🔗 GitHub: github.com/Tencent-Hunyuan/H… 🤗 Hugging Face: huggingface.co/tencent/Hunyu…
31 Aug 2025
Meituan just dropped a new LLM with “Dynamic Activation.” Think of it as a brain that decides when to think harder: it activates more experts for tricky parts of a question, fewer for easy ones. Closer and closer to how our own brains allocate effort. 🧠⚡
🚀 LongCat-Flash-Chat Launches! ▫️ 560B Total Params | 18.6B-31.3B Dynamic Activation ▫️ Trained on 20T Tokens | 100 tokens/sec Inference ▫️ High Performance: TerminalBench 39.5 | τ²-Bench 67.7 🔗 Model: huggingface.co/meituan-longc… 💻 Try Now: longcat.ai
4 Aug 2025
Qwen is really on a roll lately. The image generation model looks amazing, with text generation and ultra-high-accuracy rendering. Thanks for releasing it under an open license!
4 Aug 2025
🚀 Meet Qwen-Image, a 20B MMDiT model for next-gen text-to-image generation. Especially strong at creating stunning graphic posters with native text. Now open-source. 🔍 Key Highlights: 🔹 SOTA text rendering: rivals GPT-4o in English, best-in-class for Chinese 🔹 In-pixel text generation: no overlays, fully integrated 🔹 Bilingual support, diverse fonts, complex layouts 🎨 Also excels at general image generation, from photorealistic to anime, impressionist to minimalist. A true creative powerhouse. Blog: qwenlm.github.io/blog/qwen-i… Hugging Face: huggingface.co/Qwen/Qwen-Ima… ModelScope: modelscope.cn/models/Qwen/Qw… Github: github.com/QwenLM/Qwen-Image Technical report: qianwen-res.oss-cn-beijing.a… Demo: modelscope.cn/aigc/imageGene…
27 Jul 2025
Impressive work from Tencent on 3D world generation. Thanks for open-sourcing it! It would be super cool paired with the Apple Vision Pro to generate worlds dynamically and explore them.
We're thrilled to release & open-source Hunyuan3D World Model 1.0! This model enables you to generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It's the industry's first open-source 3D world generation model, compatible with CG pipelines for full editability & simulation. Set to transform game development, VR, digital content creation and so on. Get started now👇🏻 Project Page: 3d-models.hunyuan.tencent.co… Try it now: 3d.hunyuan.tencent.com/scene… Github: github.com/Tencent-Hunyuan/H… Hugging Face: huggingface.co/tencent/Hunyu…
27 Jul 2025
Sunday musing: Exponential tech growth still blows my mind! From camera/map hardware research: compressed image size shrank 9.8% YoY (6.6MB for an 8K image with HEVC in 2015 to 3.7MB with VVC). Video: -15% YoY (4.5GB/hr HD with H.264 to 1.24GB with VVC). SSD costs are down 8x ($0.40/GB to $0.05). Result? 20x more images/videos per $ over 10 yrs! (Graphs courtesy of o3-pro; directionally accurate, but def. some quirks)
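A quick way to sanity-check the arithmetic in that musing is the sketch below. It uses only the sizes and prices quoted in the post; the flat 10-year window for every series is an assumption taken from "over 10 yrs", and the helper names `annual_change` and `items_per_dollar_gain` are made up for illustration.

```python
# Figures quoted in the post. YEARS is an assumption: the post says
# "over 10 yrs" but does not give start and end years for every series.
YEARS = 10


def annual_change(start: float, end: float, years: int = YEARS) -> float:
    """Compound annual rate of change; -0.05 means 5% smaller each year."""
    return (end / start) ** (1 / years) - 1


def items_per_dollar_gain(size_start: float, size_end: float,
                          cost_start: float, cost_end: float) -> float:
    """How many times more items fit into one dollar of storage."""
    return (size_start / size_end) * (cost_start / cost_end)


# 8K still image: 6.6 MB (HEVC) -> 3.7 MB (VVC)
image_rate = annual_change(6.6, 3.7)
# One hour of HD video: 4.5 GB (H.264) -> 1.24 GB (VVC)
video_rate = annual_change(4.5, 1.24)
# SSD price: $0.40/GB -> $0.05/GB
image_gain = items_per_dollar_gain(6.6, 3.7, 0.40, 0.05)
video_gain = items_per_dollar_gain(4.5, 1.24, 0.40, 0.05)

print(f"image size per year: {image_rate:+.1%}")
print(f"video size per year: {video_rate:+.1%}")
print(f"images per dollar:   {image_gain:.1f}x")
print(f"video per dollar:    {video_gain:.1f}x")
```

Under the 10-year assumption the annual rates come out at roughly -5.6% for images and -12.1% for video, smaller than the YoY figures in the post, which would fit shorter spans between codec generations. The per-dollar gains land at about 14x for images and 29x for video, which brackets the "20x" headline.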
11 Jul 2025
🚀 Breaking: Moonshot AI just dropped Kimi K2 – an insanely strong open-source LLM with 1T params (32B active)! Giving major DeepSeek vibes with killer evals. Can’t wait for reasoning upgrades! Who’s ready to test this beast?
11 Jul 2025
🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models 🔹 Strong in coding and agentic tasks 🐤 Multimodal & thought-mode not supported for now With Kimi K2, advanced agentic intelligence is more open and accessible than ever. We can't wait to see what you build! 🔌 API is here: platform.moonshot.ai - $0.15 / million input tokens (cache hit) - $0.60 / million input tokens (cache miss) - $2.50 / million output tokens 🔗 Tech blog: moonshotai.github.io/Kimi-K2… 🔗 Weights & code: huggingface.co/moonshotai 🔗 Github: github.com/MoonshotAI/Kimi-K… Try it now at Kimi.ai or via API!
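To get a feel for what the three listed prices mean per request, here is a minimal cost sketch. Only the per-million-token prices come from the announcement; the `Pricing` class, the `request_cost` function, the sample token counts, and the assumption that billing is strictly linear per token are illustrative and not part of any Moonshot SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens, as listed in the announcement."""
    input_cache_hit: float = 0.15
    input_cache_miss: float = 0.60
    output: float = 2.50


def request_cost(input_tokens: int, cached_input_tokens: int,
                 output_tokens: int, pricing: Pricing = Pricing()) -> float:
    """Estimated USD cost of one request (hypothetical helper).

    Assumes linear per-token billing with no minimums or rounding.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (cached_input_tokens * pricing.input_cache_hit
             + uncached * pricing.input_cache_miss
             + output_tokens * pricing.output)
    return total / 1_000_000


# Made-up workload: 100k-token prompt, 80% served from cache, 5k-token reply.
cost = request_cost(input_tokens=100_000, cached_input_tokens=80_000,
                    output_tokens=5_000)
print(f"${cost:.4f}")
```

With these sample numbers, output tokens account for about a third of the bill despite being a small share of the traffic, so cache hit rate and reply length are the two levers that matter most.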
10 Jul 2025
🚀 Grok 4 is Crushing the AI Frontier!🚀 Elon Musk's massive bet on compute scaling is paying off BIG TIME. I'm excited to see what @xai builds next!🔥
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude 4 Opus at 64 and DeepSeek R1 0528 at 68. Full results breakdown below. This is the first time that @elonmusk's @xai has the lead at the AI frontier. Grok 3 scored competitively with the latest models from OpenAI, Anthropic and Google - but Grok 4 is the first time that our Intelligence Index has shown xAI in first place. We tested Grok 4 via the xAI API. The version of Grok 4 deployed for use on X/Twitter may be different to the model available via API. Consumer application versions of LLMs typically have instructions and logic around the models that can change style and behavior. Grok 4 is a reasoning model, meaning it ‘thinks’ before answering. The xAI API does not share reasoning tokens generated by the model. Grok 4’s pricing is equivalent to Grok 3 at $3/$15 per 1M input/output tokens ($0.75 per 1M cached input tokens). The per-token pricing is identical to Claude 4 Sonnet, but more expensive than Gemini 2.5 Pro ($1.25/$10, for <200K input tokens) and o3 ($2/$8, after recent price decrease). We expect Grok 4 to be available via the xAI API, via the Grok chatbot on X, and potentially via Microsoft Azure AI Foundry (Grok 3 and Grok 3 mini are currently available on Azure). Key benchmarking results: ➤ Grok 4 leads in not only our Artificial Analysis Intelligence Index but also our Coding Index (LiveCodeBench & SciCode) and Math Index (AIME24 & MATH-500) ➤ All-time high score in GPQA Diamond of 88%, representing a leap from Gemini 2.5 Pro’s previous record of 84% ➤ All-time high score in Humanity’s Last Exam of 24%, beating Gemini 2.5 Pro’s previous all-time high score of 21%.
Note that our benchmark suite uses the original HLE dataset (Jan '25) and runs the text-only subset with no tools ➤ Joint highest score for MMLU-Pro and AIME 2024 of 87% and 94% respectively ➤ Speed: 75 output tokens/s, slower than o3 (188 tokens/s), Gemini 2.5 Pro (142 tokens/s), Claude 4 Sonnet Thinking (85 tokens/s) but faster than Claude 4 Opus Thinking (66 tokens/s) Other key information: ➤ 256k token context window. This is below Gemini 2.5 Pro’s context window of 1 million tokens, but ahead of Claude 4 Sonnet and Claude 4 Opus (200k tokens), o3 (200k tokens) and R1 0528 (128k tokens) ➤ Supports text and image input ➤ Supports function calling and structured outputs See below for further analysis 👇
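The speed and pricing figures above can be put side by side for a single request. The sketch below does that for the three models where the post gives both numbers. The 10,000-token prompt and 1,500-token reply are made up, and the estimate assumes a constant output rate, ignores time to first token, and treats any hidden reasoning tokens as part of the output count.

```python
from typing import NamedTuple


class Model(NamedTuple):
    input_usd_per_m: float   # USD per 1M input tokens
    output_usd_per_m: float  # USD per 1M output tokens
    tokens_per_s: float      # measured output speed quoted in the post


# Prices and speeds as quoted in the post (Gemini price is for <200K input tokens).
MODELS = {
    "Grok 4": Model(3.00, 15.00, 75),
    "o3": Model(2.00, 8.00, 188),
    "Gemini 2.5 Pro": Model(1.25, 10.00, 142),
}


def estimate(model: Model, input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (generation seconds, USD cost) for one request (hypothetical helper)."""
    seconds = output_tokens / model.tokens_per_s
    usd = (input_tokens * model.input_usd_per_m
           + output_tokens * model.output_usd_per_m) / 1_000_000
    return seconds, usd


# Made-up workload: 10k-token prompt, 1.5k-token answer.
estimates = {name: estimate(m, 10_000, 1_500) for name, m in MODELS.items()}

# Print fastest first.
for name, (seconds, usd) in sorted(estimates.items(), key=lambda kv: kv[1][0]):
    print(f"{name:15s} {seconds:5.1f} s  ${usd:.4f}")
```

For this workload the ordering by speed and by cost do not match: o3 finishes first, Gemini 2.5 Pro is cheapest, and Grok 4 is both slowest and most expensive of the three, which is the trade-off the post sets against its benchmark lead.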
17 Jun 2025
Sam Altman just casually hinted at a breakthrough: a new method at OpenAI that drives regular cars better than existing autonomous tech. The autonomous race just heated up! 🔥 (~5:50 min) youtu.be/mZUG0pr5hBo?si=9I_Z…
14 Jun 2025
Super-intelligence unlocked: Asked Gemini 2.5 Pro for deep research. After 5 minutes & 100 sites crawled, it sighed: “I’m having a hard time.” It’s now smart enough to call in sick.