A C/C++-based package for advanced data transformation and statistical computing in R. Account managed by the author. #rcollapse

Joined June 2020
Pinned Tweet
The article on "collapse: Advanced and Fast Statistical Computing and Data Transformation in R" has now been published (open-access) in the Journal of Statistical Software: jstatsoft.org/article/view/v… It offers a concise yet thorough introduction to the package. #rstats #DataScience
Recording of my talk on {collapse} and the {fastverse} at the Bank of Portugal's workshop on "Speeding up Empirical Research: Tools and Techniques for Fast Computing" in December is now online: youtu.be/qO5dHIPsfK8?si=1-mY… #Rstats #DataScience
Updated Windows benchmarks for in-memory database-like operations by Adrian Antico show that {collapse} still leads on lagging and casting benchmarks (not covered in DuckDB benchmarks) and remains overall very competitive: github.com/AdrianAntico/Benc… #Rstats #DataScience

I've released a new package {flownet} for efficient transport modeling and graph manipulation: sebkrantz.github.io/flownet/. It builds on 7 {fastverse} libraries, most notably {collapse}, from which it imports 60 functions. Thus, another great learning resource for developers #rstats

I'm excited to share the release and rOpenSci publication of dfms 1.0 (docs.ropensci.org/dfms), a high-performance, feature-rich implementation of Dynamic Factor Models in R, based on {collapse} and {RcppArmadillo}. More in the release blog post at: sebkrantz.github.io/Rblog/20…

28 Dec 2025
A few updates: (1) new fastverse domain at fastverse.org (2) the collapse (kit) repos moved to github.com/fastverse/collaps… (/kit) and site to fastverse.org/collapse (/kit) (3) A group of maintainers has been given access (4) collapse has a DeepWiki deepwiki.com/fastverse/colla…

10 Mar 2025
{collapse} 2.1.0 is out! It introduces a new fslice() function (sebkrantz.github.io/collapse…), a new theory-consistent weighted quantile algorithm (sebkrantz.github.io/collapse…) with excellent properties, and some convenience features such as join requirements. #rstats #DataScience
3 Feb 2025
There is now a #fastverse benchmark wiki (github.com/fastverse/fastver…) where users can freely contribute benchmarks. If you have benchmarks involving {fastverse} packages ({collapse}, {data.table}, etc., including extensions) please contribute them (takes 1 min) #rstats #DataScience
27 Dec 2024
It's nice to see an increasing number of #rstats packages using {collapse}. A developer-focused vignette was long planned and is now here, with modest advice on writing efficient R package code in general and using {collapse} in particular: sebkrantz.github.io/collapse…

30 Dec 2024
I just improved the vignette a bit further, adding some detailed benchmarks and a section on Global Options. I needed to correct myself: it is not true that {collapse} global options should never be invoked in packages - they just need to be reversed like #rstats global options.
23 Dec 2024
{collapse} is now also on BlueSky (bsky.app/profile/rcollapse.b…) and I am also there (bsky.app/profile/sebkrantz.b…) [and on Mastodon: fosstodon.org/@sebkrantz]. I will repost {collapse} posts but also share about research/data. This X account will remain active. #rstats
collapse retweeted
Check out the latest package to be granted the Seal of Approval: {collapse} by Sebastian Krantz! {collapse} is a partner package that implements various data transformation and statistical analysis tasks using ultra-fast C/C++ implementations. rdatatable-community.github.…

8 Jul 2024
{collapse} v2.0.15, with fast aggregation pivots, has just reached CRAN. A minor but neat feature worth pointing out in this release is enhanced join verbosity. In addition to the join success rates, the join relationship is now determined and reported - at no extra cost #rstats
6 Jul 2024
New independent benchmark by Adrian Antico: github.com/AdrianAntico/Benc… Setup: large local Windows machine, real data, broad range of tasks, scripts executed inside RStudio and VS Code. It shows that {collapse} is an absolute top performer in this setting. #rstats #DataScience
30 May 2024
{collapse} v2.0.15, already available via install.packages("collapse", repos = "fastverse.r-universe.dev"), adds wide/recast pivot()'s with aggregation, including some hard-coded internal functions. A game changer for pivot tables in R. More at sebkrantz.github.io/collapse…. #rstats
11 Mar 2024
An article on {collapse} is available on arXiv: arxiv.org/abs/2403.05038 (submitted to Journal of Statistical Software). It highlights the aims and added value of collapse and its cutting-edge performance for many complex statistical tasks in #rstats. Please consider sharing it.
3 Nov 2023
{collapse} has been benchmarked in the DuckDB benchmark: duckdblabs.github.io/db-benc…, and is pretty competitive on 0.5-5 GB (laptop-grade) operations. A surprise is that it seems to be the only framework next to DuckDB able to perform large data joins (50 GB) efficiently. #rstats

17 Oct 2023
I’m thrilled to announce the release of {collapse} 2.0, adding blazing fast joins, pivots, a flexible namespace, and many other features. It is a remarkable piece of R software and capable of enhancing the workflow of all R users. Spread the word #rstats sebkrantz.github.io/Rblog/20…

16 Sep 2023
As I'm slowly moving towards the release of collapse 2.0, you again have the opportunity to explore features in the development version and provide valuable feedback (API, performance, bugs, etc.). In particular, join() and pivot() are major innovations and likely of interest.