Hi @ClementDelangue,
I was thrilled to read that you were hiring. Since you specifically asked for OSS contributions and GitHub links, I'll lead with those. I've poured my energy into Hugging Face's ecosystem, specifically post-training optimizations and fixes that line up closely with your focus, plus other key projects. Here's the rundown:
Hugging Face Transformers (my entry point, where I tackled bugs, utilities, and test coverage):
- Fixed PaliGemma’s pad token unmasking during training:
github.com/huggingface/trans…
- Resolved audio classification bugs, including doc mismatches and runtime errors with top_k=None, plus test isolation fixes:
github.com/huggingface/trans…
- Made output_dir optional in TrainingArguments:
github.com/huggingface/trans…
- Added a reload utility for smoother dev workflows:
github.com/huggingface/trans…
Unsloth Optimization Challenges (I solved four tough ones and wrote them up on Medium):
- Triton kernel for NF4 double dequantization to FP16/BF16 on T4 GPUs:
medium.com/@indosambhav/unsl…
- QLoRA FSDP2 integration for Llama 3.1 8B multi-GPU fine-tuning:
medium.com/@indosambhav/unsl…
- torch.compile without graph breaks for QLoRA in LLMs:
medium.com/@indosambhav/unsl…
- VRAM-efficient backprop via chunked logits and generalized Cut Cross Entropy:
medium.com/@indosambhav/unsl…
Hugging Face PEFT (I've been contributing since October 2025, dropping impactful PRs on LoRA/DoRA/X-LoRA enhancements. Fun fact: I'm the 3rd largest contributor there, right behind the maintainers, and #32 overall):
- Negative weights support in LoRA merging, useful for task arithmetic and adapter subtraction:
github.com/huggingface/peft/…
- Embed_scale handling for DoRA, supporting scaled embeddings like Gemma3TextScaledWordEmbedding:
github.com/huggingface/peft/…
- Embed_scale handling for X-LoRA:
github.com/huggingface/peft/…
- Scaled embeddings in TrainableTokensModel:
github.com/huggingface/peft/…
- Ensure_weight_tying for trainable_token_indices, which is pending merge:
github.com/huggingface/peft/…
- 3-4x faster LoRA-GA integration, also pending merge:
github.com/huggingface/peft/…
Hugging Face Diffusers (as an MVP):
- Submitted a fix for Qwen image encoding padding. The PR is up for review, so fingers crossed it merges soon:
github.com/huggingface/diffu…
- MVP program shoutout:
github.com/huggingface/diffu…
Other Hugging Face Issues Raised (red-teaming OSS models, including OpenAI's on Kaggle):
- Cross-tokenizer distillation failures in TRL's GKD/MiniLLM trainers:
github.com/huggingface/trl/i…
- FA4 integration in Transformers:
github.com/huggingface/trans…
Developer Tools & More:
- @getpieces: I'm the #3 contributor to pieces_cli_tool (github.com/pieces-app/cli-ag…) and #2 to the Python OS SDK (github.com/pieces-app/pieces…).
- Swarms, a multi-agent orchestration library and LangChain alternative, where I'm an active contributor.
- PEcAn (Predictive Ecosystem Analyzer): I was a GSoC '24 mentee, working on downscaling ML outputs spatially and temporally, and I kept contributing after the program ended.
I really enjoy making AI models faster and more practical, whether that's fine-tuning LLMs at ByteLearn, a role I got thanks to a Kaggle contest, or sharing what I learn through open source.
I've DM'd you as well, and I'd love to chat when you have a moment.