Filip Graliński

Tweets

Filip Graliński

@FilipGralinski

Feb 2

Filip Graliński retweeted

Snowflake

@Snowflake

5 Jun 2025

Day 2 of #SnowflakeSummit flew by but not before a mountain of announcements from our Platform Keynote! We announced: Adaptive Compute, Snowflake Openflow, Cortex AISQL, Semantic Model Sharing, Snowflake Intelligence, and much more. See what's new: bit.ly/4mNjiqR

2,026

Filip Graliński retweeted

Łukasz Borchmann

@LukaszBorchmann

1 Apr 2025

How can the most accurate SQL be generated for a given question? We propose a method to significantly boost text-to-SQL accuracy while drastically cutting costs.👇 #NLProc #AI #TextToSQL #LLMs

51,211

Filip Graliński retweeted

Anupam Datta

@datta_cs

25 Mar 2025

Our Snowflake AI Research team just released Arctic Embed’s core training code into the open source ArcticTraining project — making it easier for developers and researchers to reproduce, fine-tune, and build on our embedding models. Arctic Embed is the leading small embedding model on the MTEB leaderboard and is widely used with over 1M monthly downloads. What you’ll find: ✅ Clean, config-driven workflows powered by DeepSpeed ✅ Flexible contrastive data handling ✅ Example fine-tuning recipes and ready-to-use tooling Read more here and try it out: snowflake.com/en/engineering… @SnowflakeDB @DeepSpeedAI @lukemerrick_ @pxyumass @spacemanidol @rajhans_samdani @jeffra45 @StasBekman

Snowflake Arctic Embed Joins ArcticTraining: Simple And Scalable Embedding Model Training

Arctic Embed now merges with ArcticTraining, giving developers open access to core training code for building efficient frontier embedding models.

snowflake.com

10,122

Filip Graliński retweeted

Luke Merrick @lukemerrick_

18 Dec 2024

Connor Shorten was kind enough to give me the mic for a lot of hot takes on text embedding models in the latest Weaviate podcast.

Connor Shorten

@CShorten30

18 Dec 2024

Arctic Embed ❄️ has been one of the most impactful open-source text embedding models! In addition to the open model, which has helped a lot of companies kick off their own inference and fine-tuning services (including us), the Snowflake team has also published incredible research breaking down all the components of how to train these models! I am SUPER EXCITED to publish the 110th Weaviate Podcast with Luke Merrick (@lukemerrick_), Puxuan Yu (@pxyumass), and Charles Pierse (@cdpierse) discussing all things Arctic Embed! The podcast covers: • The origin of Arctic Embed • Pre-training embedding models • Matryoshka Representation Learning • Fine-tuning embedding models • Synthetic Query Generation • Hard Negative Mining • Single-Vector Embedding Models in the search model cohort of ColBERT, SPLADE, and Re-rankers I hope you enjoy the podcast! As always, please reach out if you would like to discuss any of these ideas further!

1,123

Filip Graliński retweeted

Aurick Qiao

@aurickq

5 Dec 2024

We are excited to share SwiftKV, our recent work at @SnowflakeDB AI Research! SwiftKV reduces the pre-fill compute for enterprise LLM inference by up to 2x, resulting in higher serving throughput for input-heavy workloads. 🧵

2,755

Filip Graliński retweeted

Daniel Campos @spacemanidol

4 Dec 2024

🚀 I am thrilled to introduce @SnowflakeDB 's Arctic Embed 2.0 embedding models! 2.0 offers high-quality multilingual performance with all the greatness of our prior embedding models (MRL, Apache-2 license, great English retrieval, inference efficiency) snowflake.com/engineering-bl…🌍

Snowflake’s Arctic Embed 2.0 Goes Multilingual

snowflake.com

7,212

Filip Graliński

@FilipGralinski

26 Nov 2024

111

Filip Graliński

@FilipGralinski

26 Nov 2024

More information: linkedin.com/posts/bartosz-n…

📢 Join us for the "AI na co dzień" (Everyday AI) meeting series! AI is currently one of the most innovative and fastest-growing areas of business. During the meetings "AI na co dzień. Praktyczne...

pl.linkedin.com

Filip Graliński retweeted

Michał Pietruszka

@MichaPietruszka

2 Nov 2024

Can AI models help us create better models? 🧵 1/ It's a question that stands at the boundaries of what's possible in data science. We explored how Large Language Models (LLMs) perform as data scientists, especially in the art of feature engineering.

201

Filip Graliński retweeted

Department of Artificial Intelligence AMU Poznan @poznanAI

11 Sep 2024

A joint study by @poznanAI researchers and Samsung Electronics Polska engineers was presented at @FedCSIS 2024. The paper investigates the impact of augmenting spoken language corpora with domain-specific synthetic samples. arxiv.org/abs/2406.07090

413

Filip Graliński

@FilipGralinski

7 Sep 2024

Good people out there, please make your Python scripts more command-line friendly: 1. put this as the first line: #!/usr/bin/env python3 2. set the x permission: chmod u+x your_script.py (and commit that to git). Now I can run your script with ./your_script.py. Thank you!
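
The two steps in the tweet above can be sketched as a minimal script. The file name your_script.py comes from the tweet; the greeting function and what it prints are placeholders assumed here purely for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical your_script.py: a script that can be run directly from the shell."""
import sys


def greeting(name):
    # Placeholder logic (an assumption for illustration, not from the tweet).
    return f"Hello, {name}!"


def main(argv):
    # Use the first command-line argument if one was given, else a default.
    name = argv[1] if len(argv) > 1 else "world"
    print(greeting(name))


if __name__ == "__main__":
    main(sys.argv)
```

After running chmod u+x your_script.py, the script can be started as ./your_script.py (optionally followed by a name) instead of python3 your_script.py. The env lookup in the first line picks whichever python3 comes first on the user's PATH, which is why it is usually preferred over a hard-coded interpreter path.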

Filip Graliński retweeted

Daniel Campos @spacemanidol

6 Sep 2024

It's fall which means it's intern recruitment time! @SnowflakeDB is widely recruiting research interns to work on all kinds of problems around AI/LLM/Search. If you are interested or know any students who are looking for summer 2025 internships hit me up!

3,128

Filip Graliński retweeted

Yuxiang Wei

@YuxiangWei9

5 Sep 2024

Code LLMs involve multiple stages of training. At Snowflake, we did extensive training ablations across general repo data, high quality filtered data, and synthetic instruction data so you don’t have to. 🧵

3,443

Filip Graliński retweeted

Darek Kłeczek

@dk21

2 Sep 2024

Pretty wild that @kaggle got me on the cover! Thanks @InezOkulska for the interview! 😻

3,351

Filip Graliński retweeted

Department of Artificial Intelligence AMU Poznan @poznanAI

28 Aug 2024

LLM Bielik v2 on our internal benchmark, based on Polish educational and professional tests, achieves an accuracy score of 58.03%. This is a noticeable improvement over the 41.51% in v.0.1. Congratulations to the entire @Speak_Leash team. More extensive results coming 🔜

220

Filip Graliński

@FilipGralinski

28 Aug 2024

This new LLM for Polish looks really interesting, congrats to the team!

SpeakLeash @Speak_Leash

28 Aug 2024

The wait is over - Bielik v2 is here!🦅 Here’s what it offers: 💪11B parameters 📈32,768 token context window 🚝Enhanced training data ⌨Improved NLP 🤝Flexible deployment Made possible through our collaboration with @Cyfronet Check it out here: bit.ly/472ZwQG

Filip Graliński

@FilipGralinski

27 Aug 2024

Some lessons (I) learnt preparing the data mixture for Snowflake Arctic LLM 👨‍🍳

1,120

Filip Graliński

@FilipGralinski

27 Aug 2024

linkedin.com/posts/snowflake…

What data sets should be used to pretrain an LLM? With a number of possible data sources to choose from, it can be challenging to know what to look for. In this blog, the Snowflake AI Research team...

linkedin.com

107