working on something new • @carnegiemellon alum

Joined June 2020
45 Photos and videos
Pinned Tweet
19 Feb 2024
If you like 3D graphics, I wrote a high-level post on the difference between Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS). It's a ten min read, and aims to educate those not as familiar with the field of neural rendering. Link and some snippets below. 1/5
9
83
532
74,451
you know you've been in tech too long when you read the MANGOS acronym as "MangOS" and not "mangos" didn't hit me until a week after I first saw it
90
if you're a leader, you should have your org tokenmaxxing
in ML we call this curriculum learning
ie. the method to get AI adoption for an org by first tokenmaxxing, then maxing eng velocity, then maxing product velocity makes logical sense
tokenmaxxing is never the endgame
1
126
Here's an analogy that's been landing with startup leaders I've been chatting with lately: AI is like discovering a gun. Suddenly you can move way faster. Problem is everyone else discovered guns too. Shipping fast isn't an advantage anymore. It's the price of staying alive
170
I've built 3D streaming pipelines on Modal websockets just like this post so let me be the guy who does the math no one wants to do:
Bandwidth: 1080p@24fps = ~5-15 Mbps per user. A webpage showing the same thing is a few KB per session. Unless you have Netflix-grade infra it's tough
GPUs: ~$1/hr=$720/mo per warm A100 on Modal. that's just for one. If you're serving thousands of users... that's a lot of $$. Maybe local models would be great, but we're not there yet
Not trying to be a hater, but I've gone down this road before because I too believed pixel streaming would be next-gen UI. But every couple years there's always a hot pixel streaming demo and it just stays a demo
Pixel streaming also has a graveyard. OnLive, Google Stadia, etc. It's just a hard technical problem with minimal benefits. Again, good for a demo but tough to solve the business problem
And yet I'm still a fan. I hope it's solved because I'll be the first adopter to the new paradigm. I'm just a skeptic that's all
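The bandwidth and GPU figures in the post above can be turned into a quick back-of-envelope calculation. This is only a sketch: the ~$1/hr rate and the 5-15 Mbps range are the post's own rough numbers, while the one-warm-GPU-per-concurrent-user model and all function names are assumptions made here for illustration.

```python
# Back-of-envelope math for pixel streaming costs.
# Rates come from the post (~$1/hr per warm A100, ~5-15 Mbps per user).
# Assumption (not from the post): one warm GPU per concurrent user.

HOURS_PER_MONTH = 24 * 30  # 720 hours, which is how $1/hr becomes $720/mo


def monthly_gpu_cost(gpus: int, usd_per_hour: float = 1.0) -> float:
    """Cost in USD of keeping `gpus` GPUs warm for a 30-day month."""
    return gpus * usd_per_hour * HOURS_PER_MONTH


def aggregate_bandwidth_mbps(users: int, mbps_per_user: float) -> float:
    """Total egress bandwidth in Mbps if `users` stream at the same time."""
    return users * mbps_per_user


def transfer_gb(mbps: float, hours_streamed: float) -> float:
    """Data sent to one user, in decimal gigabytes.

    Mbps * seconds gives megabits; /8 gives megabytes; /1000 gives gigabytes.
    """
    return mbps * 3600 * hours_streamed / 8 / 1000


if __name__ == "__main__":
    users = 1000
    print(f"GPU cost: ${monthly_gpu_cost(users):,.0f}/mo")
    low = aggregate_bandwidth_mbps(users, 5)
    high = aggregate_bandwidth_mbps(users, 15)
    print(f"Egress: {low:,.0f}-{high:,.0f} Mbps")
    print(f"One user, one hour at 5 Mbps: {transfer_gb(5, 1)} GB")
```

Under these assumptions, a thousand concurrent users means a thousand warm GPUs, which is where the "that's a lot of $$" comes from. Sharing one GPU across several users would lower the figure, but the post does not say how many sessions a single A100 can serve.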
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see. @eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
1
3
488
been doing this for the past couple months as I train to become a pro-level swimmer. highly recommend
LLM Knowledge Bases
Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:
Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.
IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).
Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.
Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.
Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.
Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.
Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation and finetuning to have your LLM "know" the data in its weights instead of just context windows.
TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
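The "small and naive search engine over the wiki" mentioned in the post above could look roughly like this. It is a minimal sketch, not the author's tool: it assumes the wiki is a directory tree of .md files, scores each file by raw query-term counts, and uses made-up names (`load_wiki`, `search`) throughout.

```python
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_wiki(wiki_dir: str) -> dict[str, str]:
    """Read every .md file under wiki_dir into {relative_path: contents}."""
    root = Path(wiki_dir)
    return {
        str(path.relative_to(root)): path.read_text(encoding="utf-8")
        for path in sorted(root.rglob("*.md"))
    }


def search(docs: dict[str, str], query: str, top_k: int = 5) -> list[tuple[str, int]]:
    """Rank documents by how many times the query terms occur in them.

    Deliberately naive: no stemming, no TF-IDF weighting, no phrase
    matching. Ties are broken alphabetically by document name.
    """
    terms = tokenize(query)
    results = []
    for name, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((name, score))
    results.sort(key=lambda item: (-item[1], item[0]))
    return results[:top_k]
```

A call such as `search(load_wiki("wiki"), "gaussian splatting")` would return the best-matching article paths with their scores (the `wiki` directory name is a placeholder). Wrapping `search` in a small command-line entry point would make it usable as a tool by an LLM agent, which is the usage the post describes.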
1
209
agreed. my AR glasses will always have a place in my life, but as an accessory, not a replacement, to the phone phones never replaced computers so why should glasses replace phones?
There is no successor to the smartphone. It's the terminal form factor. Much like the car is the terminal form factor for human transportation. Tesla has improved 1000x over the Model T but you still have the same cab riding on four wheels, because it's the optimal solution given the jobs to be done and constraints of reality. The smartphone has the most intense product market fit of any product ever for a reason: a computer you carry in your pocket and hold in your hand, with a screen you manipulate with your fingers, cannot be meaningfully improved upon for the jobs people want done. This doesn't mean there's not room for awesome new devices, especially with AI opening up new vistas. We have motorcycles, mopeds, scooters, e-bikes, and those funny one-wheeled vehicles you see whizzing through Golden Gate Park. We will have smart glasses, pendants, pucks, and whatever device Hark has in store. Hark seems like an S-tier team with an amazing vision, and I will probably buy their product. Moreover, I wouldn't want to live in a world where crazy ambitious visions aren't pursued with convert's zeal. Nevertheless, I'm writing this because every time I see a venture that explicitly or implicitly promises to supersede the smartphone, I think the same thing: It wouldn't matter if Jony Ive teamed up with the ghost of Steve Jobs and raised an army of the greatest HCI designers that ever walked the earth. It's structural. The smartphone is terminal.
1
7
408
o yo this is actually wild
What if a world model could render not an imagined place, but the actual city? We introduce Seoul World Model, the first world simulation model grounded in a real-world metropolis. TL;DR: We made a world model RAG over millions of street-views. proj: seoul-world-model.github.io/
4
545
Edward Ahn retweeted
InstantSplat is now open source. It is a lightweight library that connects foundation models (VGGT, MASt3R, MAP-Anything, etc.) with the Gaussian splatting family. Given uncalibrated images, it optimizes a 3D scene in a few seconds. Try the demo and code here: github.com/phai-lab/InstantS…
7
85
713
31,821
pretty much the exact same realization i made
I’m with Brad. Years ago I thought it was almost entirely a software problem and people would put up with a bulky headset if the software is great. I’ve completely flipped. The hardware has to be socially acceptable and very comfortable for mainstream adoption.
4
399
not surprised at all, it was a conscious bet that I took to bring the world a new future. i didn’t go into VR trying to make money if anything we need more risk takers in the world i’m happy i took the bet but the bet just didn’t pan out and that’s something I’m OK with
13
1,140
gorillatag success has nothing to do with if mixed reality is ready to be a general computing platform or if there’s a compelling consumer use case it only proves that short session, low stakes entertainment works in mixed reality
Replying to @BartronPolygon
110k concurrent Gorilla Tag users literally last week. If that isn’t signal for a compelling consumer use case, what is?
1
4
1,721
the ar/vr industry is hurting :( work on vision pro and you have no users. work on meta and meta actively sabotages its own platform. work on google, steam frame, etc.. oh wait they don’t exist yet and working on AI glasses is cool but having worked in XR for a decade the signs say it’s still too early. gardens are too closed for devs to meaningfully do anything people just say “wait a couple more years” or “stay lean” or “don’t hire” but it’s not that easy. this is our lives and careers we’re betting on. the opportunity cost gets higher and higher my point is if you’re looking into joining the space now, take a minute and make sure you really want to take the bet despite your heart’s desires
Reflecting in Tokyo with my time off... Thought hard on what happened to VR industry this past month... This is a long read: I've been in VR since 2013 when I first helped start up Survios. You can imagine how the Meta VR fallout devastated me in really unique ways. Totally and absolutely crushing. I can't even really explain it fully. It's hard to express how personal it all is being in it for so long. And I'm not saying that was smart to allow. I believed in VR too much and let it get there. The best-laid plans of mice and men, right? But this was very consequential to the entire VR gaming industry. While there are always paths to raise capital, traditional gaming is already struggling and VR focused gaming is even worse now. It's a bloodbath. For everyone. So when Meta pulled the 1st party plug I said watch 3rd party- we've already seen major shake ups. Now wait until the entire year runs its course. This is especially so now with Steam Frame delayed and component prices rising rapidly. It all conflicts with promising myself in 2013 when I left AAA for VR it would be my only focus. VR became my requirement for opportunity. Period. I put VR first. I refused all work not VR related. It's been 13 years now in "VR only" and I've done a lot, worked on many titles, and certainly hit some dreams... And 6 years ago I started a family right into the whole 2020 turmoil. Anyone from the game industry knows the challenge of living any normal family life. And XR as an emerging technology frankly made that dynamic even tougher. Everything since that 2020 year has slowly but surely recalibrated me. Those years added even more complex pressure as my grandparents passed away in the lockdown. I never understood it when I was young but you really do change once you have kids and really go through it. So I'm exhausted caring so much while watching apologists dismiss and sugarcoat how major this tide going out really is. As if AI glasses will save VR! 
"Only when the tide goes out do you discover who’s been swimming naked." Well get ready to see some birthday suits out there. Many studios are already retreating. Some will go under. So I need to be honest with myself... My framing: First I quit college and entered gamedev in 2007 to make games. It's all I ever wanted to do. Then I quit AAA and entered VR in 2013 to make the VR medium I believed in succeed. I tried the DK1 and could not stop talking about the potential of VR to the frustration of my traditional gaming peers. This crunch life I've now spent nearly 20 years in, while it has given me many gifts, has taken quite a lot of my time. My advice to hedge for new young XR devs- If I were to continue the "emerging tech" race here in XR I would go hard into AR AI design for incoming mixed reality devices. And I think they will do very well in mainstream. I would urge people interested to explore it. However, this isn't a box for my own passion. I can't pretend it is. I love VR. I love gaming. AI glasses do not excite me. Even traditional gaming is a better fit for me than glasses tech. This all puts me at my current decision- I will adjust my efforts and shift to XR being a great feature of my games, not the focus, and this will also reflect in my posting which will evolve toward more traditional gaming. So instead of putting all the eggs in one basket like launching first on Quest platform or even try to shift as a Steam Frame exclusive, I'll instead build traditional flat with "VR as a feature" Anyone who knows me well will understand how hard this decision was. But as an avid PC/PCVR gamer, I'm going to my gaming roots first. I don't see a path for me with glasses tech being pushed everywhere. Can't lie to myself and fake enthusiasm for emerging tech I'm not that interested in. I'm not going into mobile apps or building for a generative AI platform. I'm here to create what only games can. I want people to experience worlds.
7
1
56
13,146
insightful yet unsurprising. human nature to optimize the tool more than the actual process the ungodly amount of hours i spent in college optimizing my vim setup..
3
236
as someone who also had a rare tumor this is awesome to see. even for myself AI has helped me tremendously for understanding what i had best of luck @blader 🤞 i might have to torture deep research myself
over the weekend, i built an app that i sincerely hope you will never have a need for, but if you do happen to need a friendly, free, private mri viewer designed to make it easy for you to track tumor progression, you can try it here: miraviewer.org here's the story: as some of you may know, last last september, my six year old daughter mira was diagnosed with an extremely rare brain tumor called an adamantinomatous craniopharyngioma, and since then our family has been doing everything we can possibly do to find a cure for her. we tortured chatgpt deep research, put together our own private research team, raised $1.4M and donated it all to @HankMitraLab research thanks to $MIRA, explored every remotely applicable drug whether on the market or not, and even began working with md anderson to develop a personalized vaccine that we hope can lead to a more permanent cure unfortunately, we received the devastating news last march that the tumor has continued to grow since her initial surgery, and we had to start to consider more drastic options which would have seriously impacted mira’s quality of life. thankfully, with the help of dr. sabine mueller @UCSFChildrens and the @HankMitraLab at the university of colorado, in april, we started her on an alternative but extremely experimental treatment for this disease. to our unimaginable relief, her tumor has responded extraordinarily well to this treatment which combines tocilizumab (an arthritis drug that blocks IL6 receptors) with avastin (a colon cancer drug that inhibits VEGF proteins). we know this, because mira gets an MRI scan of her brain every few months. and every time we get a new scan, the first thing we do is compare it against her last scan. 
so we have to find the matching weight of the scan, and then find the same plane, and then carefully find the slices of the scan where the tumor is visible, and then find the closest match to last month’s scan, then adjust the zoom, rotation, and brightness / contrast so they all look the same. we got pretty good at this. but it shouldn't be this hard. so i built miraviewer.org last weekend using gemini 3 with some gpt 5.2 xhigh. you just import your DICOM MRI files (either zip, files or a folder), and you can align all of your scans across multiple dates instantly, just click and drag a rectangle around the tumor on any image, and it will use some very clever algorithms to automatically align and find the closest matching slice from all your other scans, match the brightness / contrast, rotation, pan, zoom, and even shear to make sure the registration is as close as possible, and make it as easy as possible to compare tumor progression. it has a grid view so you can see all your scans for the same location all at once, and an overlay view so you can quickly compare two scans visually (by holding down the space bar to toggle quickly between two scans), along with tools to animate your scans both within the same sequence as well as over multiple scans to show progression. there is no server, it runs entirely locally on your browser - nothing ever gets uploaded and it's all open source: github.com/blader/MiraViewer if you've found this useful, please consider a donation to the @UCSFChildrens hospital foundation, who has given us extraordinary care over the past year or so: donate.ucsfbenioffchildrens.…
3
238
this is not something i ever expected and i'm frankly sad i was really hoping to use their avatars sometime this year for remote work, but it looks like it won't happen anytime soon
Meta is shutting down its Horizon Workrooms VR meeting software next month, with no direct replacement for its online meetings: uploadvr.com/meta-shutting-d…
2
216
this is just a phone call btw no app no signup
3
1
5
513
415-909-3691
2
1
138
i miss when x had less articles and more threads (with maybe a link to a blog post) i'm not going to read your 1000 word essay unless i know it's worth my time
3
105