Ruoming Pang

Tweets

Qingqing Cao retweeted

Ruoming Pang @ruomingpang

17 Jul 2025

In this report we describe the 2025 Apple Foundation Models ("AFM"). We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device AFM model. machinelearning.apple.com/re…

Apple Intelligence Foundation Language Models Tech Report 2025

We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and…

machinelearning.apple.com

Qingqing Cao @awk_ai

18 Oct 2024

📢 #hiring We are looking for research interns at @Apple AIML! Applicants should have experience in LLMs and multi-modal foundation models. The goal is to do impactful research and publish new interesting ideas.

Qingqing Cao @awk_ai

18 Oct 2024

Please email us at mind-research-internship@group.apple.com including your CV and highlighting your most relevant skills and experience, and then apply at jobs.apple.com/en-us/details….

Qingqing Cao @awk_ai

18 Oct 2024

The position is full-time for a minimum of 12 weeks, with up to a year possible depending on the start date. Internships can begin before summer, but the sooner the better.

Qingqing Cao retweeted

Maxwell Horton @mchorton1991

16 Oct 2024

1/ KV Prediction for Improved Time to First Token. Tired of waiting forever for your on-device LLM to begin outputting tokens? I know I am. In our latest preprint, we investigate a method for improving time to first token (TTFT) for on-device models by predicting a model’s KV cache using a small auxiliary model. We demonstrate improvements in TTFT of up to 2x at a fixed accuracy. Full details: arxiv.org/pdf/2410.08391. Code: github.com/apple/corenet/tre… Work done with @awk_ai, Chenfan Sun, Yanzi Jin, @sacmehtauw, @morastegari, and Moin Nabi. #LLM #Apple #Research #TTFT #Ondevice #OpenELM #Corenet #KVPrediction

Qingqing Cao retweeted

Bowen Zhao @BowenROIM

22 Jul 2024

Replying to @awk_ai

@awk_ai @HannaHajishirzi Can’t run billion-level LLMs efficiently? Take a look at our work: APT. We are excited to share our #ICML2024 oral paper, “APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference”. Paper: shorturl.at/xabJl

Qingqing Cao @awk_ai

5 May 2024

Thank you Akari for the help!

Akari Asai @AkariAsai

5 May 2024

Replying to @AkariAsai

BTR by @sysnlp @sewon__min @yizhongwyz @HannaHajishirzi None of the authors can travel to ICLR this time 🥲 I’ll do my best presenting this really cool work! You should also check out the video available on the website by @sysnlp!

Qingqing Cao retweeted

Sachin @sacmehtauw

25 Apr 2024

Like OpenELM, CatLIP is also "Open" github.com/apple/corenet

GitHub - apple/corenet: CoreNet: A library for training deep neural networks

CoreNet: A library for training deep neural networks - apple/corenet

github.com

AK @_akhaliq

25 Apr 2024

Apple presents CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data. Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text

Qingqing Cao retweeted

AK @_akhaliq

24 Apr 2024

Apple presents OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework. The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and

Qingqing Cao @awk_ai

12 Jul 2023

LLMs are becoming more powerful and multimodal, wonder how we can run them faster? I'm happy to present the paper "PuMer: Pruning and Merging Tokens for Efficient Vision Language Models" at the #ACL2023NLP @aclmeeting, joint work w/ @bvp22294 @HannaHajishirzi

Qingqing Cao @awk_ai

12 Jul 2023

Come and join our poster session tomorrow (July 12) at 11:00-12:30 (EDT, America/Toronto). Unfortunately, I cannot present the work in person due to visa and family reasons. But feel free to check it out at virtual2023.aclweb.org/paper…, video at underline.io/events/395/post….

Qingqing Cao @awk_ai

12 Jul 2023

Slides are at awk.ai/assets/pumer-slides.p…, code is at github.com/csarron/PuMer

Qingqing Cao retweeted

Tal Schuster @TalSchuster

9 Jul 2023

I'll be moderating an Efficiency in #NLProc Panel at the #SustaiNLP 🌿 workshop at #ACL2023 this Thursday. We have top leaders from Academia and Industry: @DrorRotem, @myleott, @sysnlp, @sarahookr Any questions you would like to ask? Post here or join the panel!

Qingqing Cao retweeted

Akari Asai @AkariAsai

7 Jul 2023

Don't miss our #ACL2023 tutorial on Retrieval-based LMs and Applications this Sunday! acl2023-retrieval-lm.github.… with @sewon__min, @ZexuanZhong, @danqi_chen We'll cover everything from architecture design and training to exploring applications and tackling open challenges! [1/2]

Tutorial on Retrieval-based LMs and Applications at ACL 2023.
- Instructors: Akari Asai (University of Washington), Sewon Min (University of Washington), Zexuan Zhong (Princeton), Danqi Chen (Princeton)
- Time & location: Sunday, July 9 14:00 - 17:30 (EDT) @ Metropolitan West
- website https://acl2023-retrieval-lm.github.io/

Qingqing Cao retweeted

Aditya Kusupati @adityakusupati

12 Jun 2023

Introducing💃AdANNS: A Framework for Adaptive Semantic Search🕺 TL;DR: Up to 90× faster nearest neighbor retrieval and 2× lower memory cost for web-scale search. Applies to vector search at scale & improves all "retrieval" augmented models! arxiv.org/abs/2305.19435 [1/8]

Qingqing Cao retweeted

Artidoro Pagnoni @ArtidoroPagnoni

24 May 2023

4-bit QLoRA is here to equalize the playing field for LLM exploration. You can now fine-tune a state-of-the-art 65B chatbot on one GPU in 24h. Paper: arxiv.org/abs/2305.14314 Code and Demo: github.com/artidoro/qlora

QLoRA: Efficient Finetuning of Quantized LLMs

We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance....

arxiv.org

Tim Dettmers @Tim_Dettmers

24 May 2023

QLoRA: 4-bit finetuning of LLMs is here! With it comes Guanaco, a chatbot on a single GPU, achieving 99% ChatGPT performance on the Vicuna benchmark: Paper: arxiv.org/abs/2305.14314 Code Demo: github.com/artidoro/qlora Samples: colab.research.google.com/dr… Colab: colab.research.google.com/dr…

Qingqing Cao retweeted

Marcos Treviso @MarcosTreviso

28 Mar 2023

Looking for valuable insights and carefully curated pointers on efficient NLP research directions? 🔎 Check out our updated survey covering an extensive range of topics in the classic NLP pipeline 🚀: arxiv.org/abs/2209.00099 Product of an amazing collaborative team effort! 🤝

Leon Derczynski ⚒️☁️🏔️🌲 @LeonDerczynski

28 Mar 2023

Efficient NLP methods - an up-to-date survey, to appear in TACL. We cover efficiency wrt: * Data * Model design * Pre-training * Fine-tuning * Inference & compression * Hardware utilization * Evaluation * Model selection This was a blast to co-produce! arxiv.org/abs/2209.00099

Qingqing Cao retweeted

Leon Derczynski ⚒️☁️🏔️🌲 @LeonDerczynski

2 Sep 2022

How can NLP be more efficient? We can't blindly scale forever. This survey I'm lucky to be part of presents current efficiency research at many stages of the NLP process, and actionable advice to practitioners for making their NLP more efficient. arxiv.org/abs/2209.00099 #nlproc

Efficient Methods for Natural Language Processing: A Survey

Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource...

arxiv.org

Qingqing Cao @awk_ai

10 Jul 2022

I'll be at #NAACL2022 this week, my first ever in-person #nlproc conf since I joined the @uwnlp group at the University of Washington. I'm happy to chat about efficient #nlproc (QA, retrieval, vision-language models, etc.) or postdoc at UW, life in Seattle.

Qingqing Cao retweeted

LUNR @stonybrooknlp

8 Nov 2021

Replying to @lal_yash

@lal_yash is presenting this in Bavaro 4 right now! Stop by and have a chat @emnlpmeeting #EMNLP2021

LUNR @stonybrooknlp

26 Aug 2021

Replying to @stonybrooknlp

IrEne-viz is a platform showcasing energy consumption of transformer models. Built over IrEne (from #ACL2021), it allows for fine-grained & interpretable analysis of models & their components. Joint work by @yash_lal, Reetu Singh, @harsh3vedi, @sysnlp, @aruna__b, @b_niranjan /n