That particular pipeline handles a lot of PDFs and forms, some with actual handwriting. The Flash model is really cheap and keeps performing well even when I throw a lot of weird variations at it, so it was cleaner to use it for everything rather than running multiple services.
For web text only, yeah, that'd be overkill. I'm using Puppeteer with Gemini in one project and Postlight Parser with gpt-4o-mini in another.