Joined September 2023
Got strong reviews at NeurIPS Datasets & Benchmarks but no accept? DMLR Special Conference Track wants your data-centric work! See the CFP at tinyurl.com/dmlrspecial #ML #DataCentricAI #DMLR #NeurIPS #NeurIPS2025

Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs by Jost Arndt, Utku Isil, Michael Detzel, Wojciech Samek, Jackie Ma Action Editor: Yi Liu data.mlr.press/assets/pdf/v0…

The FIX Benchmark: Extracting Features Interpretable to eXperts by Helen Jin, et al. Action Editor: Hugo Jair Escalante data.mlr.press/assets/pdf/v0…

Chronicling Germany: An Annotated Historical Newspaper Dataset by Christian Schultze, Niklas Kerkfeld, Kara Kuebart, Princilia Weber, Moritz Wolter, Felix Selgert Action Editor: Hugo Jair Escalante data.mlr.press/assets/pdf/v0…

MONSTER: Monash Scalable Time Series Evaluation Repository by Angus Dempster, Navid Mohammadi Foumani, Chang Wei Tan, Lynn Miller, Amish Mishra, Mahsa Salehi, Charlotte Pelletier, Daniel F. Schmidt, Geoffrey I. Webb Action Editor: Hugo Jair Escalante data.mlr.press/assets/pdf/v0…

FlowBench: A Large Scale Benchmark for Flow Simulation over Complex Geometries by Ronak Tali, et al. Action Editor: Sergio Escalera data.mlr.press/assets/pdf/v0…

Text Quality-Based Pruning for Efficient Training of Language Models by Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Daniel Li Chen, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer Action Editor: Yang Liu data.mlr.press/assets/pdf/v0…

Deep Learning for Accurate Diagnosis of Viral Infections through scRNA-seq Analysis: A Comprehensive Benchmark Study by Ziwei Yang, Xuxi Chen, Biqing Zhu, Tianlong Chen, Zhangyang Wang Action Editor: Sergio Escalera data.mlr.press/assets/pdf/v0…

Data Acquisition: A New Frontier in Data-centric AI by Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou Action Editor: Remi Denton data.mlr.press/assets/pdf/v0…

Challenge design roadmap by Hugo Jair Escalante, Isabelle Guyon, Addison Howard, Walter Reade, Sébastien Treguer Action Editor: Sebastian Schelter data.mlr.press/assets/pdf/v0…

V-LoL😂: A Diagnostic Dataset for Visual Logical Learning by Lukas Helff, Wolfgang Stammer, Hikaru Shindo, Devendra Singh Dhami, Kristian Kersting Action Editor: Christopher De Sa data.mlr.press/assets/pdf/v0…

SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning by Pu Ren, N. Benjamin Erichson, Junyi Guo, Shashank Subramanian, Omer San, Zarija Lukić, Michael W. Mahoney Action Editor: Holger Caesar data.mlr.press/assets/pdf/v0…

Towards impactful challenges: post-challenge paper, benchmarks and other dissemination actions by Antoine Marot, David Rousseau, Zhen (Zach) Xu Action Editor: Sebastian Schelter data.mlr.press/assets/pdf/v0…

Constructing Confidence Intervals for "the" Generalization Error – a Comprehensive Benchmark Study by Hannah Schulz-Kümpel, Sebastian Fischer, Roman Hornung, Anne-Laure Boulesteix, Thomas Nagler, Bernd Bischl Action Editor: Yue Zhao data.mlr.press/assets/pdf/v0…

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications by Juan Zuluaga-Gomez, et al. Action Editor: Peter Mattson data.mlr.press/assets/pdf/v0…

Evaluating Durability: Benchmark Insights into Image and Text Watermarking by Jielin Qiu, William Han, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li Action Editor: Hongyang Zhang data.mlr.press/assets/pdf/v0…

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection by Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li Action Editor: Yang Liu data.mlr.press/assets/pdf/v0…

'Properties of Alternative Data for Fairer Credit Risk Predictions' by Jung Youn Lee, Joonhyuk Yang Action Editor: Yang Liu data.mlr.press/assets/pdf/v0… #AlternativeData #ProxyDiscrimination #GenderGap #CreditScoring #AlgorithmicFairness

'The Matrix Reloaded: Towards Counterfactual Group Fairness in Machine Learning' by Mariana Pinto, André V. Carreiro, Pedro Madeira, Alberto López, Hugo Gamboa Action Editor: Yang Liu data.mlr.press/assets/pdf/v0… #Bias #Fairness #Counterfactual #ConfusionMatrix #DataAugmentation
