Professor @utn_nuremberg. Formerly @AWS, @MIT_CSAIL, @TU_Muenchen. Focused on building efficient, easy-to-use data systems.

Joined June 2009
Pinned Tweet
Excited to share that I’ll start as Professor of Data Systems @utn_nuremberg in early 2024! My research will explore the intersection of data systems and ML. I’ll soon announce PhD and postdoc positions in my group.
📣 🏆 2026 SIGMOD Research Highlight Awards 🏆: sigmod.org/sigmod-awards/sig…

DPconv: Super-Polynomially Faster Join Ordering
@mihail_sto, @andreaskipf
dl.acm.org/doi/10.1145/36988…

Automating Vectorized Distributed Graph Computation
Wenyue Zhao, Yang Cao, Peter Buneman, Jia Li, @ntarmos
dl.acm.org/doi/10.1145/36988…

AnyBlox: A Framework for Self-Decoding Datasets
Mateusz Gienieczko, @maxikuschewski, Thomas Neumann, Viktor Leis, @JanaGiceva
dl.acm.org/doi/10.14778/3749…

Rel: A Programming Language for Relational Data
@molhamaref, Paolo Guagliardo, @gkastrinis, Leonid Libkin, Victor Marsault, Wim Martens, Mary McGrath, Filip Murlak, Nathaniel Nystrom, Liat Peterfreund, Allison Rogers, Cristina Sirangelo, Domagoj Vrgoč, David Zhao, Abdul Zreika
dl.acm.org/doi/10.1145/37222…

MEMPHIS: Holistic Lineage-based Reuse and Memory Management for Multi-backend ML Systems
@ArnabPhani, @matthiasboehm7
openproceedings.org/2025/con…

Diva: Dynamic Range Filter for Var-Length Keys and Queries
Navid Eslami, @IoanaBercea, @niv_dayan
dl.acm.org/doi/10.14778/3749…

The Key to Effective UDF Optimization: Before Inlining, First Perform Outlining
Samuel Arch, Yuchen Liu, Todd C. Mowry, @pateljm, @andy_pavlo
dl.acm.org/doi/10.14778/3696…

Output-sensitive Conjunctive Query Evaluation
@ShaleenDeep, @HangdongZ79542, Austen Z. Fan, Paraschos Koutris
dl.acm.org/doi/10.1145/36958…

Output-Optimal Algorithms for Join-Aggregate Queries
Xiao Hu
dl.acm.org/doi/10.1145/37252…

Differentially Private Substring and Document Counting
Giulia Bernardini, @philipbille, @li_rtz, Teresa Anna Steiner
dl.acm.org/doi/10.1145/37252…

Congratulations to all the authors 👏 👏 💐
#SIGMOD2026 #ACM #researchhighlight #SIGMODawards

Andreas Kipf retweeted
Can your cloud database predict underprovisioning before it even happens? Meet ◒ xBound, the very first framework for join size lower bounds. xBound tells you how many tuples your SQL query will produce *at least*. Brought to you by Microsoft GSL & @utndatasystems.
Andreas Kipf retweeted
PoC customer bringing their own workload to test? 👉 Boost their LIKE/REGEX predicates with 🌰 string fingerprints. Freshly presented at AIDB'25 @VLDBconf.
Paper: arxiv.org/abs/2507.10391
Code: github.com/utndatasystems/st…
Our lab is excited to be presenting two papers at VLDB 2025 in London this week! 🇬🇧
Tuesday, 1:45 PM: 🪂 Parachute: Single-Pass Bi-Directional Information Passing by Mihail Stoian (Research 8, Westminster, 4F)
PDF: vldb.org/pvldb/vol18/p3299-s…
Code: github.com/utndatasystems/pa…
Looking forward to great discussions and catching up with everyone at VLDB!
Andreas Kipf retweeted
21 Jul 2025
Today we release Franca, a new vision foundation model that matches and sometimes outperforms DINOv2. The data, the training code, and the model weights (with intermediate checkpoints) are open-source, allowing everyone to build on this. Methodologically, we introduce two new SSL components: a multi-granularity SK clustering loss that uses Matryoshka representations, and a quick post-pretraining scheme to remove unwanted spatial biases. This is the result of a close and fun collaboration between @valeoai (in France) and @FunAILab (in Franconia).
21 Jul 2025
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research 🧵
The Data Systems Lab is seeking a motivated PhD candidate to join our team and work on foundation models for data compression.
🛠️ The position requires strong programming skills in C and Python. We've already published early results in this space:
- Virtual, TRL @ NeurIPS'24: arxiv.org/pdf/2410.14066v3
- Virtual, EDBT'25 (Best Demo): openproceedings.org/2025/con…

🔗 Learn more about our research and team: utndatasystems.github.io/
Off to SIGMOD 2025 in Berlin! 🚄 Here’s our schedule:
Today, 4:20 PM: 💡 Redbench: A Benchmark Reflecting Real Workloads (aiDM)
Wed, 2:00 PM: 🏆 DPconv: Super-Polynomially Faster Join Ordering
Thu, 2:30 PM: ❄️ Pruning in Snowflake: Working Smarter, Not Harder
Come say hi! 👋
Parachute takes semi-join filtering to the next level! Congrats to my PhD student @mihail_sto and thanks to our co-authors from MIT for initiating the project four years ago. See you in London! 🇬🇧
Delighted to announce that Parachute 🪂 will appear at @VLDBconf! 🇬🇧 Compared to regular semi-join filtering, Parachute removes dangling tuples in a bi-directional manner by precomputing fingerprint columns. Dangling tuples ⏬ = Join pruning ⏫. 📎 arxiv.org/abs/2506.13670
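For readers new to the term "dangling tuples", here is a minimal Python sketch of regular semi-join filtering applied in both directions. It is an illustration only: it does not implement Parachute or its precomputed fingerprint columns, and the tables, column names, and the `semi_join` helper are hypothetical.

```python
# Illustrative sketch: plain semi-join reduction, NOT Parachute itself.
# All table contents and names below are made up for the example.

def semi_join(left, right, key):
    """Keep the rows of `left` whose `key` value also appears in `right`."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]

orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 20},
    {"order_id": 3, "customer_id": 99},  # dangling: no customer 99
]
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 20, "name": "Bob"},
    {"customer_id": 30, "name": "Cy"},   # dangling: no order for customer 30
]

# Filter in both directions so neither side keeps rows without a join partner.
orders_reduced = semi_join(orders, customers, "customer_id")
customers_reduced = semi_join(customers, orders, "customer_id")

print([row["order_id"] for row in orders_reduced])
print([row["name"] for row in customers_reduced])
```

Rows that survive both filters are the ones that can contribute to the join result; the dropped rows are the dangling tuples.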
Fantastic news 🎖️ @mihail_sto will present DPconv at SIGMOD in Berlin this June.
DPconv just won a SIGMOD'25 Honorable Mention! 🥁 I was quite impressed given this year's high-quality papers. Let's see who won the big prize. My list of candidates in the thread below 🧵.
Thrilled to share that we've received the Best Demonstration Award 🏆 at EDBT 2025! Congratulations to my students @mihail_sto and Ping-Lin Kuo for their excellent work and dedication over the past few weeks—well deserved! Paper: openproceedings.org/2025/con…
Andreas Kipf retweeted
Check out our poster tomorrow at EDBT Demo 🇪🇸! 🔥Update: Virtual v0.2 now supports S3 and 🤗 Parquet files - try it out! pip install virtual-parquet.