So I built a thing: you sign in front of your webcam, and it speaks what you're saying out loud. Like, in real time.
It's a full pipeline: MediaPipe pulls out hand/body keypoints, a Transformer figures out what sign you're doing, then an LLM turns that into a proper English sentence, and Edge TTS reads it out.
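Rough sketch of how stages like these could chain together. This is not the repo's actual code: the class name, field names, and the stub stages below are all made up for illustration, and the real MediaPipe, Transformer, LLM, and Edge TTS calls are swapped for plain callables so it runs without any of them installed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Keypoints = List[float]  # one frame's flattened hand/body keypoints


@dataclass
class SignToSpeechPipeline:
    """Hypothetical wiring of the stages; each field is a pluggable callable."""
    extract_keypoints: Callable[[object], Keypoints]       # stand-in for MediaPipe
    classify_sign: Callable[[Sequence[Keypoints]], str]    # stand-in for the Transformer
    glosses_to_sentence: Callable[[Sequence[str]], str]    # stand-in for the LLM
    speak: Callable[[str], None]                           # stand-in for Edge TTS

    def run(self, clips: Sequence[Sequence[object]]) -> str:
        glosses = []
        for clip in clips:
            # Turn every frame of the clip into keypoints, then label the clip.
            window = [self.extract_keypoints(frame) for frame in clip]
            glosses.append(self.classify_sign(window))
        # Turn the raw sign labels into one English sentence and read it out.
        sentence = self.glosses_to_sentence(glosses)
        self.speak(sentence)
        return sentence


# Toy stubs (assumptions, not real models) so the sketch runs end to end.
spoken: List[str] = []
pipeline = SignToSpeechPipeline(
    extract_keypoints=lambda frame: [float(v) for v in frame],
    classify_sign=lambda window: "HELLO" if len(window) == 2 else "YOU",
    glosses_to_sentence=lambda glosses: " ".join(g.lower() for g in glosses).capitalize() + "!",
    speak=spoken.append,
)

clips = [[(0, 1), (1, 2)], [(2, 3)]]  # two fake clips of fake frames
result = pipeline.run(clips)
print(result)
```

Keeping each stage as a plain callable means any one of them can be swapped out (say, a different classifier or TTS engine) without touching the rest.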
Honestly, it started as a "can I actually make this work?" project, and somehow all 5 stages ended up working together.
Just open-sourced it. Still very much a research project, not a polished product, but it works, and I think it's pretty cool.
github.com/iamEtornam/ASL-to…