We've lost countless credits on Veo 3.1 Fast/Lite to the issues below, despite repeatedly force-prompting targeted restrictions. Is anyone else facing the same issues?
1) The wrong speaker talks in a multi-speaker scene or, worse, everyone moves their mouths as if they're ventriloquists
2) A character stubbornly turns around and the AI renders a totally different face, clothing, hands, rings, etc.
3) The camera stubbornly turns around and we see new and unnecessary structures, backgrounds, and characters that were previously blocked by foreground characters
4) No minors allowed in the scene (so we have to age up our child characters)
5) Still only 8 seconds when the norm now is 12-15 seconds
Prompting helps, but on average each usable clip still takes 3 to 5 generations.
Weigh these against the now-possible one-shot generations from, say, Seedance 1.5 Pro or Seedance 2.0 (which now costs between USD1.5 and USD3 per 15-second output) and you'll see the real economics.
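To put rough numbers on that comparison, here is a minimal Python sketch. It uses only the figures quoted above (3 to 5 generations per usable 8-second clip, versus one generation per 15-second output at USD1.5 to USD3) and works out the break-even price per generation, i.e. what a single 8-second generation could cost before it becomes more expensive per usable second. The function names are made up for illustration, and it assumes every attempt is billed at the same price and that the one-shot model really does succeed on the first try.

```python
def cost_per_usable_second(price_per_gen, attempts, clip_seconds):
    """Total spend on all attempts, divided by the seconds of the one usable clip."""
    return price_per_gen * attempts / clip_seconds


def break_even_price(rival_price, rival_seconds, attempts, clip_seconds):
    """Price per generation at which a multi-attempt model matches a
    one-shot rival on cost per usable second (assumption: the rival
    needs exactly 1 attempt)."""
    rival_rate = cost_per_usable_second(rival_price, 1, rival_seconds)
    return rival_rate * clip_seconds / attempts


if __name__ == "__main__":
    # Figures from the post: USD1.5-USD3 per 15 s one-shot output,
    # versus 3-5 attempts per usable 8 s clip.
    for rival_price in (1.5, 3.0):
        for attempts in (3, 5):
            price = break_even_price(rival_price, 15, attempts, 8)
            print(f"rival USD{rival_price:.2f}/15s, {attempts} attempts "
                  f"-> break-even USD{price:.2f} per 8s generation")
```

Plug in what you actually pay per generation: if it is above the break-even figure for your typical number of retries, the one-shot option is cheaper per usable second.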
The expected Google Veo 4 needs to be multimodal and support at least 15-second output and AI-generated human character uploads to remain economically viable.