Site Reliability Engineer @Apple. SLOgician of sociotechnical systems. Opinions are my own.

Joined December 2010
Pavlos Ratis retweeted
20 Dec 2023
Apple announces LLM in a flash: Efficient Large Language Model Inference with Limited Memory paper page: huggingface.co/papers/2312.1… Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their intensive computational and memory requirements present challenges, especially for devices with limited DRAM capacity. This paper tackles the challenge of efficiently running LLMs that exceed the available DRAM capacity by storing the model parameters on flash memory but bringing them on demand to DRAM. Our method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding us to optimize in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Within this flash memory-informed framework, we introduce two principal techniques. First, "windowing" strategically reduces data transfer by reusing previously activated neurons, and second, "row-column bundling", tailored to the sequential data access strengths of flash memory, increases the size of data chunks read from flash memory. These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively. Our integration of sparsity awareness, context-adaptive loading, and a hardware-oriented design paves the way for effective inference of LLMs on devices with limited memory.
25
450
2,478
698,051
Pavlos Ratis retweeted
12 Jun 2023
If you're an SRE or SRE-adjacent, it would be great to spend some time filling out catchpoint.com/sre-survey -- one of the few ways we can get cross-company readings on what's happening with the profession. Thank you for your time in advance!
1
16
24
8,137
Watching my first #WWDC23 as an Apple employee! Great job to all the teams!
337
Pavlos Ratis retweeted
Are you an SRE looking to make a difference in your org? Use research, data, and math to slay MTTR. Find it in the 2022 VOID Report: thevoid.community/report
8
17
2,582
Pavlos Ratis retweeted
19 Dec 2022
Would be curious to hear from practising SREs whether or not this captures their experiences
19 Dec 2022
Are you a software engineering director in charge of some Site Reliability Engineers (SRE) and wondering what they’re doing - or _should_ do? Then @niallm has some guidelines for you... stanza.systems/post/what-doe…
3
2
9
3,470
Pavlos Ratis retweeted
If you are responsible for production infrastructure this is for you: thevoid.community/report
3
6
789
Pavlos Ratis retweeted
I feel it is a great curated resource for SREs. Great work @dastergon 👏 x.com/dastergon/status/15141…

🎉 The awesome-sre repo turns six today! I hope it’s helped folks and teams to get started with SRE. github.com/dastergon/awesome… #SRE #SiteReliabilityEngineering
1
1
291
Pavlos Ratis retweeted
24 Nov 2022
The SRE Book did a lot of things. Were enough of them good? @lauralifts has takes! stanza.systems/post/should-w…
9
16
Pavlos Ratis retweeted
Just in case you’re looking, my org at Apple is hiring SREs, engineering managers, infrastructure managers, a systems reliability tech lead, a cloud SRE manager, build engineers and much more. These aren’t full-time remote roles, but I’m happy to provide more info in DMs.
2
72
200
Pavlos Ratis retweeted
I'm a big believer that great products are a result of healthy autonomy. It means fewer interdependencies between teams, strong API/data boundaries, mature capacity planning, and the DevOps model where devs are operators of their services.
10
55
396
I’m happy to share that I’m starting a new position as Site Reliability Engineer at @Apple!
11
2
68
Pavlos Ratis retweeted
2 Aug 2022
. @dastergon gave a talk at #SREcon21 on the "Lessons Learned Using the Operator Pattern to Build a Kubernetes Platform," usenix.org/conference/srecon… youtube.com/watch?v=F-MLsAYb…

1
1
4
Pavlos Ratis retweeted
Oh look, redhat.com/sre is live! 🎉 @RedHat @OperateFirst

11
27
Pavlos Ratis retweeted
2 Jun 2022
usenix.org/publications/logi… USENIX's ;login: magazine very kindly agreed to publish a slightly elaborated, text version of my keynote from last year's SREcon EMEA. Thanks to @lauralifts, @NYCDubliner and @systemician amongst many others for review support.
2
6
23
Pavlos Ratis retweeted
# Writing for Engineers Writing is a key skill to master if you want to grow your reach beyond the team level. Over the past year I have noted how many SWEs struggle with this task, and helped a few to improve. Here are my top takes: - Blog heinrichhartmann.com/posts/w… - Thread 👇

1
13
23
Pavlos Ratis retweeted
Still a classic!
🎉 The awesome-sre repo turns six today! I hope it’s helped folks and teams to get started with SRE. github.com/dastergon/awesome… #SRE #SiteReliabilityEngineering
2
4