I spoke yesterday about the Copy Text From Image (CTFI) feature landing in Windows builds (it's already on macOS). With those two platforms supported, Linux is left as the underserved platform.
Linux presents an interesting challenge, since the OS doesn't always have built-in OCR support. macOS has the Vision framework and Windows offers the OcrEngine API, but Linux users have to install a package such as Tesseract, GOCR, or OCRopus to get similar functionality.
I pitched an idea yesterday that I'd love to share with our Linux users. Just to be clear, I'm primarily a Windows user, so I might have made a few assumptions 😂
What if CTFI for Brave on Linux worked in the following manner: when spinning up the CTFI feature, Brave checks whether one of several OCR packages is installed (the initial list might be Tesseract and GOCR, but could also include CuneiForm and others). If no package is found, we inform the user that one is required to use the feature. Brave doesn't need to be very opinionated about which package you prefer to use in this scenario.
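To make the detection step concrete, here is a minimal sketch in Python (not Brave's actual code, which would live in the browser's C++). It assumes the packages expose executables named `tesseract`, `gocr`, and `cuneiform` on the user's PATH; the function name `find_ocr_engine` and the candidate list are hypothetical.

```python
import shutil

# Assumed executable names, in order of preference. Distros may package
# these differently, so treat the list as illustrative.
OCR_CANDIDATES = ("tesseract", "gocr", "cuneiform")


def find_ocr_engine(candidates=OCR_CANDIDATES, which=shutil.which):
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    engine = find_ocr_engine()
    if engine is None:
        # This is where the browser would tell the user a package is required.
        print("No OCR package found. Install one (e.g. Tesseract) to use CTFI.")
    else:
        print(f"Using OCR engine: {engine}")
```

A user preference could simply reorder the candidate list, which keeps the browser unopinionated about which package gets used.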
If Brave detects a local package, we defer to it for the OCR functionality. These packages tend to have very clean CLI support, making it trivial for the browser to interact with their functionality. When an image is passed to the CTFI service, it would be saved to a temporary location, and then fed into the OCR command of the preferred package.
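As a rough illustration of that flow, the sketch below writes the image to a temporary file, hands it to the detected package's CLI, and reads the recognized text from standard output. It is a sketch under assumptions, not a proposed implementation: `build_ocr_command` and `extract_text` are hypothetical names, only Tesseract and GOCR are covered, and GOCR may need the image converted to a format it can read (such as PNM) first.

```python
import os
import subprocess
import tempfile


def build_ocr_command(engine, image_path):
    """Build the argv list for the chosen OCR CLI."""
    if engine == "tesseract":
        # Passing "stdout" as the output base makes tesseract print the text.
        return ["tesseract", image_path, "stdout"]
    if engine == "gocr":
        # gocr prints recognized text to stdout by default.
        return ["gocr", "-i", image_path]
    raise ValueError(f"Unsupported OCR engine: {engine}")


def extract_text(image_bytes, engine, suffix=".png", run=subprocess.run):
    """Save the image to a temp file, run the OCR CLI on it, return the text."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        result = run(
            build_ocr_command(engine, path),
            capture_output=True,
            text=True,
            check=True,   # raise if the OCR tool exits with an error
            timeout=30,   # arbitrary limit so a hung tool can't block the feature
        )
        return result.stdout.strip()
    finally:
        # Always clean up the temporary image, even when OCR fails.
        os.remove(path)
```

Usage would look like `extract_text(png_bytes, "tesseract")`, where `png_bytes` holds the image the user selected. The `run` parameter exists only so the subprocess call can be swapped out in tests.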
This approach prevents Brave from needing to ship large amounts of OCR-related code in our builds, and enables the user to choose which of the packages they'd prefer to use.
Another option is for Brave to mark one of these packages as a dependency, so that it is installed automatically alongside the browser. This would likely reduce friction for most users, including those on the various flavors of Debian and Ubuntu. Patches would always be encouraged where support is lacking.
So, that's it! Please let us know your thoughts as we continue to scratch out a path forward for our Brave Linux users 🙂