I like studying statistical problems.

Joined September 2019
12 Nov 2024
Very excited to give the next BNP webinar! I will be talking about some recent work on feature allocation models and several generalizations 🎉
12 Nov 2024
Next webinar organised by BNP-ISBA will be DATE & TIME: 17:00 UTC on December 4th, 2024. SPEAKER: Mario Beraha (Polytechnic of Milan) TITLE: Bayesian analysis of (extended) feature allocation models: predictions, sufficientness, and applications bnp-isba.github.io/webinars.…
Mario Beraha retweeted
📢I’m really proud of “Bayesian clustering of high-dimensional data via latent repulsive mixtures”, which has just appeared in Biometrika, Advance Articles, doi.org/10.1093/biomet/asae0…. Thank you to my terrific coauthors Lorenzo Ghilotti and @mberaha2.
Mario Beraha retweeted
📢The submission of contributed talks for the 14th International Conference on Bayesian Nonparametrics, UCLA (Los Angeles, US), June 23-27, 2025, is now OPEN! Deadline for submission: Dec 15, 2024.
Mario Beraha retweeted
25 Mar 2024
Average temperatures by year, with the years listed in alphabetical order, and suddenly the climate crisis looks like a joke.
Mario Beraha retweeted
Warning: extremely difficult text to read below. Last night I surrendered my phone and signed a waiver. And then I sat with a small group and an Israeli military attaché and we watched the 45 minute compilation of Hamas videos from October 7th. 1/12
28 Sep 2023
Together with Stefano Favaro and Matteo Sesia, we just arXiv'd our latest take on frequency and cardinality estimation from compressed (sketched) data: arxiv.org/abs/2309.15408
28 Sep 2023
Hence, we adopt a pragmatic approach and propose the class of "smoothed" estimators, which work very well in practice and are easy to compute! We extend the analysis to sketches obtained with multiple hash functions by drawing from the "multi-view" literature.
28 Sep 2023
As a bonus, we show that our model-based approach can be used to infer the cardinality of the dataset as well. This is another classical problem in computer science which was typically solved using a different data structure!
Mario Beraha retweeted
Replying to @mberaha2
@mberaha2 and I have just published the first project we started collaborating on 5 years ago! Meanwhile, we have taken on board 4 more coauthors, and I thank them all. See "Childhood obesity in Singapore: A Bayesian nonparametric approach", SMJ OnlineFirst.
Personal update. As of last week, I'm officially a Ph.D. in data science and computation. My thesis was on the statistical learning of RPMs, under the supervision of @AlessandraGugl9! We also arXived the first two papers from my postdoc, joint works with Stefano Favaro.
Indeed, under a CRM prior, the predictive distribution of the number of "new" traits in an additional sample depends only on the sample size! We propose a new class of priors derived from the scaled subordinators by James et al. (2015).
In particular, we show that, in some cases, this leads to a predictive distribution that depends on the sample size and the number of unique traits in the sample, similarly to what happens under the Pitman-Yor process prior!