My project humbled me completely.
*Long rant ahead
14 days ago I started a project.
It was a simple idea;
Landing page url -> Video demo.
Turns out what I pictured in my mind was pretty much impossible. My vision was a highly polished motion design output curated specially for the specific website.
​I had everything working. Playwright captured screenshots, palettes, fonts, images, and icons. Using heuristics and a vision model, the system accurately parsed the website’s intent and contents.
However, I quickly realized, AI sucks at motion design.
I started with Remotion, then Hyperframes. I even tried Html in Canvas. I failed to realize that the framework was not the problem.
The workflow was,
Url input -> Playwright crawls the website -> A vision model checks the content -> An AI model generates a script -> Render the video with Hyperframes.
The output of this was always a generic powerpoint-like-presentation or a poorly edited video.
It never satisfied my vision.
So instead of making it procedurally generated, I added custom premade components. Different text effects, animations, etc. 20 curated dynamic components that change styles depending on the website.
But this still did not satisfy me. It felt too 'generic'. It did not create custom animations specific to the website. It just used my premade components.
And so, I've hit a wall.
​I kept looking at platforms like HeyGen or motion wondering what I missed. They drop decent, premium outputs from simple user parameters.
Maybe their leverage comes from locking down a single, highly specialized model variant, like Claude with a specialized skill, while I spread my logic too thin, or relied too much on a low cost LLM to understand design composition and visual timing. I had every single raw asset required the brand fonts, correct hex codes, images, but transforming raw assets into a directed pacing was something I failed to do.
​This is my first time tackling a project of this scale with the explicit goal of shipping it to production.
Right now, I’m thinking about a possible pivot, but I haven't locked anything down yet.
There's a lot I could do with the current architecture. It accurately crawls layout structures, assets, and design contexts from a URL in seconds.
Maybe I was just too occupied with my vision making me unable to accept anything not matching it. Or maybe I did something wrong along the way, perhaps relied to much on Gpt 5.5.
Anyways, that was my rant. Back to the drawing board!