✨ Gave an invited talk at IBM Research! ✨
I recently spoke at @IBMResearch about the safety alignment of generative foundation models.
Huge thanks to @pinyuchenTW for the invitation and the amazing discussions!
Talk: Safety Alignment of Generative Foundation Models
How do we ensure these systems stay aligned with human intent and safety norms?
I highlighted two recent collaborations with @Meta and @IBMResearch:
🧠 Internalizing safety in reasoning (RECAP)
🧠 Generalizing safety in LLM finetuning (STAR-DSS, NeurIPS'25)
Heading to NeurIPS 2025!
If you're working on post-training, reasoning models, or agentic systems, let's connect in San Diego!