Postdoc (Current) & Ph.D. & B.Eng. @Tsinghua_Uni. | Previously visiting Ph.D. @CISPA. | ML security, privacy, and safety.

Joined August 2021
Tianshuo Cong retweeted
11 Dec 2024
Congratulations to the recipients of the #ACSAC2024 Distinguished Artefact Reviewer Awards: Md Ajwad Akil, Dominik Roy George, Carlotta Tagliaro, Delong Ran 👏👏👏
Thanks for sharing our work! 😄 We see JailbreakEval as a catalyst that simplifies evaluation in jailbreak research and fosters an inclusive standard for jailbreak evaluation within the community 🚀🚀🚀.
18 Jun 2024
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models "we conduct a comprehensive analysis of jailbreak evaluation methodologies, drawing from nearly ninety jailbreak research released between May 2023 and April 2024. Our study introduces a systematic taxonomy of jailbreak evaluators" "we propose JailbreakEval, a user-friendly toolkit focusing on the evaluation of jailbreak attempts. It includes various well-known evaluators out-of-the-box, so that users can obtain evaluation results with only a single command" (not peer reviewed) paper: arxiv.org/abs/2406.09321
Tianshuo Cong retweeted
7 May 2024
🚀Just updated: We present our longitudinal robustness tests on LLaMA (v1, v2, v2 Chat, v3, and v3 Instruct), GPT-3.5 (v0613, v1106, and v0125), and GPT-4 (v0613, v1106, v0125, and v0409) across three critical categories: misclassification, jailbreak, and hallucination! Understanding long-term reliability is key for AI's future. With @TianshuoCong, @JeremyZhaozy, Yun Shen, Michael Backes, and @realyangzhang. Dive into our findings: arxiv.org/pdf/2308.07847.

"Stay updated on the latest works in Safety, Security, and Privacy (SSP) for Large Models (LM)!🥳" Explore our comprehensive reading list, LM-SSP, co-organized with @AllenXinleiHe, @JeremyZhaozy, & @YugengLiu. 🔗github.com/ThuCCSLab/lm-ssp 🆕LM-SSP adds 107 papers from #ICLR2024!
LM-SSP is inspired by some other awesome projects like @llm_sec @topofmlsafety, etc.
- 🌱 The list is in progress; you are welcome to recommend resources to us~
- We have currently collected ~400 papers across 16 topics (Fig. 1 and Fig. 2).
- The large models we focus on are Large Language Models (LLMs), Vision-Language Models (VLMs), and Diffusion Models (Fig. 3).
Great blog! We recently proposed a black-box, gradient-free jailbreaking algorithm named FigStep (arxiv.org/abs/2311.05608).
10 Nov 2023
Feeling a bit intimidated writing about it, but work on attacks can lead to good insights for mitigation. Plan to write about mitigation work separately later. Also want to thank all the researchers who shared disclosure reports w/ us so far. 🙏🙏🙏 lilianweng.github.io/posts/2…
We propose a simple but effective approach, FigStep, to jailbreak large vision-language models. 📣 Just a screenshot and a benign textual prompt can jailbreak LLaVA, MiniGPT-4, and even GPT-4V! ⚠️ Check out our paper: arxiv.org/abs/2311.05608
Tianshuo Cong retweeted
Replying to @AllenXinleiHe
@AllenXinleiHe is on the job market (mainly) for a faculty position. He is amazing (xinleihe.github.io/), so please do consider him if your institution is hiring in the field of trustworthy machine learning!

Today, my first PhD student @AllenXinleiHe became a doctor! I’m very lucky to work with Xinlei during the past 3 years, and wish him all the best for his future career!
Tianshuo Cong retweeted
21 Sep 2022
Summer is over and we are back! Next seminar: Wed, September 28th, 3:30 PM (Central European Time). Prof. Tianhao Wang (@bigflywth, University of Virginia): "Continuous Release of Data Streams under Differential Privacy". Details: prisec-ml.github.io/
