TurningPoint AI

TurningPoint AI

@TurningPointAI

Jun 4

We’re excited to share our new work at CVPR 2026, Understanding Reward Hacking in Text-to-Image Reinforcement Learning.

Reinforcement learning is becoming an increasingly important tool for post-training text-to-image (T2I) generation models. But as we optimize these models with learned rewards, an important question arises: are reward models truly improving generation quality, or are they creating new ways for models to game the objective?

In this work, we take a closer look at reward hacking in T2I RL post-training. We study a range of reward designs, including aesthetic and preference rewards, prompt-image consistency rewards, and multi-reward ensembles. Our analysis shows that, across reward setups, models can easily over-optimize a single reward. A human preference reward may push generations toward exaggerated colors or superficial appeal, while a prompt-image consistency reward may improve alignment at the cost of realism and structure. Even combining multiple rewards only partially mitigates the issue.

To address this, we introduce ArtifactReward, a lightweight artifact-aware reward trained on a small curated dataset of artifact-free and artifact-containing samples. ArtifactReward can be integrated into existing T2I RL pipelines as a simple safeguard, improving realism and reducing reward hacking across multiple reward configurations.

Paper: arxiv.org/pdf/2601.03468
Code: github.com/yq-hong/ArtifactR…
Poster Session: June 6, 7:30am, ExHall A

Many thanks to our amazing team: Yunqi Hong @yyqq_hong, Kuei-Chun Kao @KueiChunKao, Hengguang Zhou @hgzhou42, and Cho-Jui Hsieh @cho_jui_hsieh.

#CVPR2026 #TextToImage #ReinforcementLearning #RewardHacking #GenerativeAI #UCLA #TurningPointAI

Viraj Sagar Das

@VirajSagarDas1

Feb 19

address at the 2026 India AI Impact Summit, was also presented in sign language through the use of AI technology. This initiative connects the spirit of “Sabka Saath, Sabka Vikas” with technological innovation, further strengthening the vision of an inclusive and accessible Digital India. #TurningPointAI #ModiOnAI #BharatAI #IndiaAIImpactSummit2026 #IndiaAISummit2026 @narendramodi

NყƙƚυɾɳαƖ

@nykturna1

13 Dec 2025

That's quite the speaker list for umErika Fest 2025. Especially ol'Sloppy Steve Bannon, Mr. Anti-Everything except money, PDFislands, gladiator fetish porn, and corrupt political organizations. Same ol shite. ..fancier toilet. #amfest2025 @TPUSA @tpaction @DrFrankTurek @MrsErikaKirk @FrontlinesTPUSA @TPUSAEvents @tpusastudents @TPUSAJCHS @TPFaithNYC @TPointUK @TurningPointAI @TPointUSA_WSU @AndrewKolvet @BlakeSNeff @charliekirk11 @realDonaldTrump @DNIGabbard @MELANIATRUMP @TuckerCarlson @DonaldJTrumpJr @BillGates @BillClinton @HillaryClinton @Riley_Gaines_ @bennyjohnson @benshapiro @MegynKellyShow @megynkelly @Riley_Gaines_ @conservmillen @rustyrockets @_ItsSavannah_ @JesseBWatters @RealBenCarson @RobSchneider @JackPosobiec @TheOfficerTatum @MattWalshBlog @glennbeck @lucasmiles @robmccoyus @REVWUTRUTH @CalvaryChapel @CalvaryChapelUK @CalvaryChapelSH @HouseGOP @NickJFuentes

Tianyi Zhou @zhoutianyi

28 Feb 2025

🚨Exciting news from our @TurningPointAI team: the very first "aha moment" on multimodal reasoning during RL on a 2B base (non-instruct) model! 📎Blog (observations): turningpointai.notion.site/t… 📎Code for our VisualThinker-Zero-2B: github.com/turningpoint-ai/V… Key findings👇

R1-Zero’s “Aha Moment” in Visual Reasoning on a 2B Non-SFT Model | Notion

Hengguang Zhou*, Xirui Li*, Ruochen Wang†, Minhao Cheng, Tianyi Zhou, and Cho-Jui Hsieh

turningpointai.notion.site

TurningPoint AI

@TurningPointAI

28 Feb 2025

🚀 We’re excited to share our latest work! Welcome to the first successful "aha moment" on multimodal reasoning. The "aha moment" is characterized by increased response length and improved performance. It emerges during RL of an unaligned base model on multimodal tasks. The aha moment for language reasoning was originally observed on DeepSeek-R1-Zero.

🔍 Key Findings:
1. Directly applying GRPO to an unaligned 2B base model can elicit the multimodal “aha moment”: thinking capability marked by spontaneous reasoning strategies and increased reasoning length.
2. Vision-centric tasks can benefit from long chains of thought.

💻 Discover more on our Notion blog and project page!
Detailed Research Blog: Follow our complete journey and technical insights: 🔗turningpointai.notion.site/t…
Reproduce Our Results: Access and build upon our implementation on GitHub: 🔗github.com/turningpoint-ai/V…
Presented by: TurningPointAI Team 🔗turningpoint-ai.com/

#turningpointai #Smallmodel #MultimodalR1 #DeepseekR1 #R1 #Deepseek #AI #MultimodalReasoning #Qwen #QwenVL #DeepSeekR1zero

Xirui Li

@xiruili7_li

28 Feb 2025

Experience the first true multimodal "aha moment" in 2B models with us! Excited for future research pushing the boundaries of higher intelligence. 🚀 #AI #TurningPointAI #deepseekai #deepseekr1 #GenAI

TurningPoint AI

@TurningPointAI

28 Feb 2025

Hengguang Zhou

@hgzhou42

3 Jul 2024

🚨Breaking insights! With the first multimodal-LLM oversensitivity benchmark, we showed that the safest and most powerful Multimodal-LLMs can be unnecessarily alarmed by safe queries. Follow our journey at @TurningPointAI, where I serve as the project lead.

TurningPoint AI

@TurningPointAI

29 Jun 2024

We made Multimodal LLMs safe, but have they also become oversensitive?

"Every time I try, it uses all tokens just refusing." - @artilectium
"This isn’t safety. It's a nanny state." - @krishnanrohit

Concerned AI safety has gone too far? You’re not alone! Explore MOSSBench by TurningPointAI: the first test suite assessing whether current MLLMs falsely reject benign queries. Our findings reveal:
🔍 Some of the safest models, like Claude-3 Opus and Gemini-Pro, reject ~70% of benign queries.
🧠 MLLMs’ overprotective behavior resembles human cognitive distortions.

Discover more in our new paper, presented by TurningPointAI: turningpoint-ai.github.io/MO…

#turningpointai #ArtificialIntelligence #AINews #WokeAI #safety #alignment #LLM #VLM #GPT #Gemini #Claude #Psychology #CBT #mentalhealth

Tianyi Zhou @zhoutianyi

2 Jul 2024

Want to make #AIGC #LLM more controllable? How to build embodied agents from #LLM and #VLM, or a #jailbreak agent as a hacker? How do we predict and interpret #GenAI output? Are your models safe or #oversensitive? Follow @TurningPointAI for exciting research on #MultimodalAgent!

TurningPoint AI

@TurningPointAI

29 Jun 2024

Minhao Cheng @cmhcbb

2 Jul 2024

Tired of LLMs refusing your questions? Check out the recent study and benchmark on when multimodal LLMs are oversensitive to your questions! Datasets are now available at turningpoint-ai.github.io/MO… Follow @TurningPointAI for more exciting AIGC research.

TurningPoint AI

@TurningPointAI

29 Jun 2024

Ruochen Wang @RuochenWang1

2 Jul 2024

Excited to share our latest paper! We discover that as MLLMs become safer, they also become oversensitive and consistently reject benign queries. This highlights the need for more calibrated safety alignment. Follow our team @TurningPointAI for more papers on Multimodal Agents.

TurningPoint AI

@TurningPointAI

29 Jun 2024
