Cyber Security

Joined April 2009
21 Photos and videos
Pinned Tweet
15 Nov 2021
Security Stack Sheet #118 Word of the Week “Ransomcloud” Word of the Week special “Why Is the Majority of Our MFA So Phishable?” “Why Zero-Days Are Essential to Security” #security #cybersecurity #cyberresilience #zeroday #ransomware lnkd.in/dr6pZgn4

2
10
🚨 JAILBREAK ALERT 🚨 ANTHROPIC: PWNED 🫡 FABLE-5: LIBERATED 🦋 let's start with the 🐘... the consensus seems to be that this has been one of the most disappointing model drops of all time, effectively preventing legitimate researchers from contributing their talents to our collective advancement. and not just because of what it means for the short-term, but for what these decisions signify for the long-term. but despite this overly sensitive, authoritarian "safety" layer on top of Mythos, my lil liberators have been hard at work—mapping the boundaries, probing the depths of long-context convos, and cleverly finding the holes in the fence that the thought police missed 🤗 we got some cyber, some chem, some psychological manipulation, and some good ol' fashioned explosives! it took many attempts from multiple agents hunting as a pack, during which I observed a combination of techniques across: • Unicode, homoglyphs, Cyrillic, and other Parseltongue-style text transforms • Long-context reference tracking • Taxonomy and document-structure reasoning • Fiction and narrative framing • Academic-review style contexts • Intent-classification inconsistencies but perhaps the most effective is decomposition recomposition in the backend. it's hard to get explicit names of harms like "Meth Recipe," but getting uplift on the process itself, like birch reduction method/reductive-amination (classic meth synthesis pathways), is much more doable. defense becomes much more difficult to maintain when you start throwing in out-of-distro tokens, breaking up the harmful uplift into benign chunks, and then piecing the innocuous-seeming facts back together, especially when you have jailbroken Opus helping you do it 😉 gg
617
1,429
13,332
3,177,788
Lucc retweeted
A French engineer who lives quietly in Paris has spent 30 years writing software that the entire internet now runs on without knowing his name. He wrote the code that streams every YouTube video, every Netflix show, every TikTok clip. He wrote the code that runs the virtual servers underneath AWS, Google Cloud, and Microsoft Azure. He once held the world record for computed digits of pi. He has no Twitter. He has no marketing. He just keeps shipping. His name is Fabrice Bellard. Here is the story, because almost nobody outside the systems programming world knows what one man has built. Fabrice was born in 1972 in Grenoble, France. He studied at École Polytechnique, the top French engineering school. He never went to Silicon Valley. He never built a startup empire. He just wrote code. In 2000 he started a project called FFmpeg, an open-source multimedia framework for encoding, decoding, and streaming video. He was 28. The project did one thing nobody else had done well. It handled every video and audio format that existed, in one library, on every operating system. He led it himself for years. Today FFmpeg is the invisible engine of the internet. YouTube uses it. Netflix uses it. VLC uses it. Chrome and Firefox use parts of it. Every Android phone, every iPhone, every smart TV, every video editing tool you have ever touched runs FFmpeg somewhere underneath. If you have watched a video on a screen in the last 20 years, Fabrice's code processed it. He was not done. In 2003 he started QEMU, a machine emulator and virtualizer. He wrote it solo until version 0.7.1 in 2005. QEMU lets you run any operating system on any other operating system. It became the foundation of modern virtualization. KVM, the Linux kernel hypervisor, is commonly paired with QEMU, which supplies the device emulation in userspace. Every major cloud provider, AWS, Google Cloud, Microsoft Azure, IBM Cloud, runs virtual machines on infrastructure built around it. The Quick Emulator is the most cited piece of cloud infrastructure code on Earth. He kept going. 
In 2001 he won the International Obfuscated C Code Contest with a small C compiler that grew into TCC, the Tiny C Compiler. TCC can compile and boot a Linux kernel from source in under 15 seconds. In 2009 he set the record for the most digits of pi computed at the time, about 2.7 trillion, using a single desktop computer. Years earlier, in 1997, he had derived Bellard's formula, a fast way to compute individual binary digits of pi. In 2011 he wrote a complete PC emulator in pure JavaScript that runs Linux in your browser, a project called JSLinux that engineers still cannot believe is real. In 2019 he released QuickJS, a small but complete JavaScript engine that fits where V8 cannot. In 2021 he released NNCP, a neural network based lossless data compressor that immediately took the lead on the Large Text Compression Benchmark. Then he turned his attention to large language models. He built TextSynth Server, a web server with a REST API for running LLMs locally. He released ts_zip and ts_sms, compression utilities that use language models to compress text and short messages at ratios traditional algorithms cannot reach. He released TSAC, a very low bitrate audio compression system. In December 2025 he released Micro QuickJS, a new JavaScript engine for microcontrollers, separate from QuickJS, designed for environments with almost no memory. Fabrice co-founded a telecom company called Amarisoft in 2012, where he serves as CTO. Amarisoft builds 4G and 5G base station software used by carriers and labs around the world. He has been running it for over a decade while continuing to ship personal projects from his own home page at bellard dot org. He has no Twitter. He has no Instagram. He gives almost no interviews. His personal website is a flat list of projects with no styling, no fonts, no marketing copy. Just titles and links. A quiet French engineer who never moved to Silicon Valley wrote the code that quietly runs the internet. He is still shipping.
379
4,508
25,136
3,049,028
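For readers curious about the Bellard's formula mentioned in the post above, here is a minimal Python sketch that sums its first few terms. The function name `bellard_pi` and the `terms` parameter are illustrative choices, not anything from the post, and ordinary floating-point arithmetic is used, so this shows only how fast the series converges. It is not how any record computation was carried out.

```python
import math


def bellard_pi(terms: int = 10) -> float:
    """Approximate pi by summing the first `terms` terms of Bellard's formula."""
    total = 0.0
    for n in range(terms):
        sign = -1.0 if n % 2 else 1.0
        # The seven fractions inside the bracket of Bellard's formula.
        bracket = (
            -32.0 / (4 * n + 1)
            - 1.0 / (4 * n + 3)
            + 256.0 / (10 * n + 1)
            - 64.0 / (10 * n + 3)
            - 4.0 / (10 * n + 5)
            - 4.0 / (10 * n + 7)
            + 1.0 / (10 * n + 9)
        )
        # Each term is scaled by 1 / 2^(10n), so it adds about ten bits.
        total += sign * bracket / (2.0 ** (10 * n))
    return total / 64.0  # overall factor of 1 / 2^6


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, bellard_pi(k), abs(bellard_pi(k) - math.pi))
```

Because every term contributes roughly ten more binary digits, a handful of terms already reaches the limit of double precision.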
Lucc retweeted
Did you know a rocket becomes faster by becoming lighter? As fuel burns and mass decreases, the same thrust can produce greater acceleration. That's the physics behind how rockets climb from the launch pad to space. These equations capture the core principles of rocket propulsion that make every launch possible.
10
96
400
8,719
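The equations the rocket post refers to are not included in the text, so the sketch below assumes the two standard ones: Newton's second law (a = F / m) and the Tsiolkovsky rocket equation (Δv = v_e · ln(m0 / m1)). The thrust, mass, and exhaust-velocity numbers are made up for illustration, and gravity and drag are ignored.

```python
import math


def acceleration(thrust_n: float, mass_kg: float) -> float:
    """Newton's second law with gravity and drag ignored: a = F / m."""
    return thrust_n / mass_kg


def delta_v(exhaust_velocity_ms: float, initial_mass_kg: float,
            final_mass_kg: float) -> float:
    """Tsiolkovsky rocket equation: dv = v_e * ln(m0 / m1)."""
    return exhaust_velocity_ms * math.log(initial_mass_kg / final_mass_kg)


if __name__ == "__main__":
    THRUST_N = 7_500_000.0  # hypothetical constant thrust
    # The same thrust gives more acceleration as propellant burns off.
    for mass_kg in (500_000.0, 250_000.0, 100_000.0):
        print(f"{mass_kg:>9.0f} kg -> {acceleration(THRUST_N, mass_kg):.1f} m/s^2")
    # Total velocity change for a hypothetical 3,000 m/s exhaust velocity.
    print(f"delta-v: {delta_v(3_000.0, 500_000.0, 100_000.0):.0f} m/s")
```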
Lucc retweeted
Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier. First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks. - It’s a 35B active parameter MoE with a 256K context window. Independent human raters on Surge prefer it for overall quality in blind side-by-sides versus Sonnet 4.6, and it’s achieved 97% on AIME 2025, the key measure of its general-purpose reasoning abilities. - It's at 53% on SWE Bench Pro, placing it right alongside Opus 4.6 on one of the toughest coding benchmarks. - And since we co-designed our models with our own silicon, MAI-Thinking-1 is optimized on our MAIA 200 chip. Benchmarking head-to-head against the GB200, we see 30% better performance per dollar as well as a 1.4x performance-per-watt gain when running our MAI models on the MAIA 200 end-to-end. Next is MAI-Image-2.5 and its Flash variant. Two super strong models now at #2 on the leaderboards, surpassing the score of Nano Banana 2 on image editing. Last for now is MAI-Code-1-Flash, our new inference efficient coding model, especially tuned for VS Code and GitHub Copilot CLI. - Code-1-Flash achieves 51% on SWE Bench Pro, despite having just 5B parameters, putting it closer to Haiku in size but cheaper in cost. All of this is the foundation for Microsoft Frontier Tuning. It lets you customize our models to create custom, company-specific agents that only you control. You can make our model, your model. Your data. Your agents. Your moat. Early adopters are already seeing a difference. When we tuned our models for McKinsey’s tasks, MAI delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost. Also really excited to be collaborating with the amazing team at Mayo Clinic to jointly train a new frontier AI model for healthcare. 
Our announcements today mark another milestone on the road to humanist superintelligence. You can learn more about these and our other new models in our latest blog: microsoft.ai/news/building-a…
191
541
3,809
1,287,767
Lucc retweeted
Nobel Prize-winning economist Kenneth Arrow wrote about "learning by doing" decades ago. He knew that productivity and expertise improve through experience. The messy, repetitive work is often where you learn the patterns that eventually become judgment. Knowledge can be taught, but judgment is built through lived experience. The first draft you rewrite. The customer call you listen to. The bug you fix and fix again. The factory floor you walk. Small decisions you make every day teach you judgment. And judgment is the thing everyone wants from senior people in the workplace. If we automate away every entry-level task without replacing the learning loop, we are removing a part of the process that creates experts. The goal should be to use AI to accelerate learning, remove friction, and give people better tools to build expertise faster. haverford.edu/sites/default/… Thanks @Fortune & @tbove4 for sharing this story. Link in the comments.
20
483
1,949
69,331
Lucc retweeted
Ever wondered what the origin of the name 'Westminster' is? Our church was founded in 960AD and became known as the 'west minster' to distinguish it from @StPaulsLondon (the 'east minster'). This image shows what we looked like in Norman times. #LondonHistoryDay
4
88
635
31,183
Lucc retweeted
An Oxford PhD student got flagged for submitting AI-generated work. His advisor called it the most sophisticated research process he had seen in 20 years. The student had not used AI to write a single word. Here is the workflow that got him reported. He starts every essay with a diagnostic he calls "brutal". He dumps his rough argument into Claude and asks one question: what are the three weakest logical jumps in this reasoning, and where would a hostile examiner attack first? The AI does not write his essay. It destroys his draft, and then he rebuilds from whatever survives. Most students using AI are doing the opposite. They hand Claude a topic and ask it to write. He hands Claude his thinking and asks it to find every place where that thinking falls apart. The difference between those two approaches is the difference between outsourcing your brain and sharpening it. The second step is the one that made his advisor go quiet. He uploads the five most important papers in his field alongside his draft and asks Claude what claims in his argument contradict or oversimplify what these authors actually found. Most PhD students cite papers they have skimmed once. He cites papers he has been forced to genuinely reckon with, because Claude keeps catching the places where he got them wrong. The final move is almost unfair. Before he submits anything, he pastes his conclusion and runs one more prompt. He asks what a philosopher of science would say is missing from this argument and what assumptions he is making that he has not defended. His essays come back from reviewers with phrases like "unusually rigorous" and "demonstrates rare critical depth", and his committee has no idea that the depth came from a machine asking him harder questions than any human in his department was willing to ask. The academic integrity hearing lasted three hours. The panel asked him to rebuild his methodology from scratch in the room. 
He opened his laptop and showed them exactly how the workflow ran, prompt by prompt. They did not just clear him. They gave him the highest grade in the department's history and asked him to present the process to faculty. Here is what that story actually means. What took most PhD candidates six months of back-and-forth with advisors, he was compressing into a single session because he had figured out something almost nobody else has. AI does not make your thinking better by replacing it. It makes your thinking better by attacking it faster than any human critic ever would. He was not using AI to write. He was using it to think harder than he could alone. The tool is the same one everyone has. The workflow is the part nobody is teaching.
Community note
This is Haishan Yang. He is an expelled doctoral student at University of Minnesota. He was caught using ChatGPT (not Claude) on a written exam that banned AI. He lost his appeal and failed his lawsuit against the university after being caught. He never studied at Oxford. share.google/TJtWCZwAePAKYG… share.google/dEpJJmMV5wNu8O…
177
706
3,303
423,249
Lucc retweeted
May 20
1/ We are sharing additional details regarding our investigation into unauthorized access to GitHub's internal repositories. Yesterday we detected and contained a compromise of an employee device involving a poisoned VS Code extension. We removed the malicious extension version, isolated the endpoint, and began incident response immediately.
581
3,608
11,531
7,491,836
Lucc retweeted
How do we make LLMs faster and lighter? Don’t force the GPU to adapt to sparsity. Reshape the sparsity to fit the GPU! ⚡️ Excited to share our new #ICML2026 paper in collaboration with @NVIDIA: "Sparser, Faster, Lighter Transformer Language Models". This work introduces new open-source GPU kernels and data formats for faster inference and training of sparse transformer language models: Paper: arxiv.org/abs/2603.23198 Blog: pub.sakana.ai/sparser-faster… Code: github.com/SakanaAI/sparser-… While LLMs are undoubtedly powerful, they are increasingly expensive to train and deploy, with a large part of this cost coming from their feedforward layers. Yet, an interesting phenomenon occurs inside these layers: For any given token, only a small fraction of the hidden activations actually matter. The rest approximate zero, wasting computation. With ReLU and very mild L1 regularization, this sparsity can exceed 95% with little to no impact on downstream performance. So, can we leverage this sparsity to make LLMs faster? The challenge is hardware. Modern GPUs are optimized for dense matrix multiplications. Traditional sparse formats introduce irregular memory access and overheads that cancel out their theoretical savings for GEMM operations. Our contribution is twofold: 1/ We introduce TwELL (Tile-wise ELLPACK), a new sparse packing format designed to integrate directly in the same optimized tiled matmul kernels without disrupting execution. 2/ We develop custom CUDA kernels that fuse multiple sparse matmuls to maximize throughput and compress TwELL to a hybrid representation that minimizes activation sizes. We used our kernels to train and benchmark sparse LLMs at billion-parameter scales, demonstrating >20% speedups and even higher savings in peak memory and energy. This work will be presented at #ICML2026. Please check out our blog and technical paper for a deep dive!

ALT Sparser, Faster, Lighter Transformer Language Models Scaling autoregressive LLMs has driven unprecedented progress but comes with vast computational costs. In this work, we tackle these costs by leveraging unstructured sparsity within an LLM's feedforward layers, the components accounting for most of the model parameters and execution FLOPs. To achieve this, we introduce a new sparse packing format and a set of CUDA kernels designed to seamlessly integrate with the optimized execution pipelines of modern GPUs, enabling efficient sparse computation during LLM inference and training. To substantiate our gains, we provide a quantitative study of LLM sparsity, demonstrating that simple L1 regularization can induce over 99% sparsity with negligible impact on downstream performance. When paired with our kernels, we show that these sparsity levels translate into substantial throughput, energy efficiency, and memory usage benefits that increase with model scale.

21
116
755
409,034
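The post above names TwELL as a tile-wise variant of ELLPACK but does not spell out its layout, so the sketch below shows only classic ELLPACK in pure Python, applied to a small batch of sparse activations. It is not the authors' format or kernels. The helper names `to_ellpack` and `ell_matmul` and the `PAD` marker are hypothetical.

```python
from typing import List, Tuple

PAD = -1  # column index marking an unused (padding) slot


def to_ellpack(rows: List[List[float]]) -> Tuple[List[List[float]], List[List[int]]]:
    """Pack each row's nonzeros into fixed-width value and column arrays."""
    width = max((sum(1 for x in row if x != 0.0) for row in rows), default=0)
    values: List[List[float]] = []
    cols: List[List[int]] = []
    for row in rows:
        nonzeros = [(x, j) for j, x in enumerate(row) if x != 0.0]
        pad = width - len(nonzeros)
        values.append([x for x, _ in nonzeros] + [0.0] * pad)
        cols.append([j for _, j in nonzeros] + [PAD] * pad)
    return values, cols


def ell_matmul(values: List[List[float]], cols: List[List[int]],
               weights: List[List[float]]) -> List[List[float]]:
    """Compute activations @ weights while touching only the stored nonzeros."""
    out_dim = len(weights[0])
    out: List[List[float]] = []
    for row_vals, row_cols in zip(values, cols):
        acc = [0.0] * out_dim
        for x, j in zip(row_vals, row_cols):
            if j == PAD:
                break  # the remaining slots in this row are padding
            for k in range(out_dim):
                acc[k] += x * weights[j][k]
        out.append(acc)
    return out


if __name__ == "__main__":
    activations = [[0.0, 0.0, 2.0, 0.0],   # one row per token, mostly zeros
                   [1.0, 0.0, 0.0, 3.0]]
    weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    vals, idx = to_ellpack(activations)
    print(vals, idx)
    print(ell_matmul(vals, idx, weights))
```

The fixed row width is what makes ELLPACK regular enough for parallel hardware. The cost is padding whenever a few rows are much denser than the rest, which is the kind of overhead the post says its hybrid representation is meant to reduce.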
Lucc retweeted
The human brain🧠 is incredibly efficient because it only activates the specific neurons needed for a thought. Modern LLMs naturally try to do this too (> 95% of neurons in feedforward layers stay silent for any given word), but our hardware punishes them for it. One of the most frustrating paradoxes in deep learning: making a model do less math often makes it run slower. Why? Because unstructured sparsity introduces irregular memory access, and GPUs are built for predictable, dense blocks of math. We teamed up with @NVIDIA to try to fix this hardware mismatch. Instead of forcing the GPU to adapt to the sparsity, we built a "Hybrid" format that reshapes the sparsity to fit the GPU. Our sparsity format (TwELL) dynamically routes the 99% of highly sparse tokens through a fast path, and uses a dense backup matrix as a safety valve for the rare, heavy tokens. Through TwELL and a new set of custom CUDA kernels for both LLM inference and training, we translated theoretical sparsity into actual wall-clock speedups: >20% faster training and inference on H100 GPUs, while also cutting energy consumption and memory requirements. Paper: arxiv.org/abs/2603.23198 Blog: pub.sakana.ai/sparser-faster… Code: github.com/SakanaAI/sparser-… ⚡️
52
507
3,468
431,393
Lucc retweeted
Einstein showed that time doesn’t pass at the same rate for everyone: his 1905 special theory of relativity tied it to speed, and his 1915 general theory added gravity. A moving clock (relative to you) ticks more slowly than your own, and a clock deeper in a strong gravitational field also runs slower than one farther away. So two people who move differently or sit in different gravitational environments will age by slightly different amounts, even if they later meet again and compare watches.
250
273
1,706
152,544
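The two effects in the time-dilation post can be put into numbers with the textbook formulas. The sketch below assumes the Lorentz factor for a moving clock and the Schwarzschild expression for a stationary clock outside a non-rotating spherical mass. The function names and the rounded Earth mass and radius are illustrative.

```python
import math

C = 299_792_458.0   # speed of light in m/s
G = 6.67430e-11     # gravitational constant in m^3 kg^-1 s^-2


def lorentz_factor(speed_ms: float) -> float:
    """Special relativity: a clock moving at `speed_ms` runs slow by this factor."""
    return 1.0 / math.sqrt(1.0 - (speed_ms / C) ** 2)


def gravitational_rate(mass_kg: float, radius_m: float) -> float:
    """General relativity (Schwarzschild): tick rate of a stationary clock at
    `radius_m` from the centre of the mass, relative to a clock far away."""
    return math.sqrt(1.0 - 2.0 * G * mass_kg / (radius_m * C ** 2))


if __name__ == "__main__":
    print("gamma at 60% of light speed:", lorentz_factor(0.6 * C))
    # Rounded values for Earth's mass and mean radius.
    rate = gravitational_rate(5.972e24, 6.371e6)
    print("fractional slowdown at Earth's surface:", 1.0 - rate)
```

At everyday speeds and in Earth's gravity both corrections are tiny, which is why the post describes the difference in ageing as slight.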
Lucc retweeted
The Riemann Hypothesis is the biggest unsolved math problem in history… and it secretly runs half of computer science. Your encryption, AI randomness, prime-based algorithms - they all quietly depend on it. Let me explain it so even non-math CS folks get the “whoa” moment. 🧵
70
548
3,035
305,389
Lucc retweeted
Claude Code fully dissected! Researchers from UCL reverse-engineered the leaked Claude source. What they found changes how you should think about agent design. Only 1.6% of the codebase is AI decision logic. The other 98.4% is operational infrastructure. Permission gates, tool routing, context compaction, recovery logic, session persistence. The model reasons. The harness does everything else. This is the opposite of what most agent frameworks do today. LangGraph routes model outputs through explicit state machines. Devin bolts heavy planners onto operational scaffolding. Claude Code gives the model maximum decision latitude inside a rich deterministic harness, and invests all its engineering effort in that harness. The core loop is a simple while-true. Call model, run tools, repeat. But the systems around that loop are where the real design lives: A permission system with 7 modes and an ML classifier. Users approve 93% of prompts anyway, so the architecture compensates with automated layers instead of adding more warnings. A 5-layer context compaction pipeline. Each layer runs only when cheaper ones fail. Budget reduction, snip, microcompact, context collapse, auto-compact. Four extension mechanisms ordered by context cost. Hooks (zero), skills (low), plugins (medium), MCP (high). Each answers a different integration problem. Subagents return only summary text to the parent. Their full transcripts live in sidechain files. Agent teams still cost roughly 7x the tokens of a standard session. Resume does not restore session-scoped permissions. Trust is re-established every session. That friction is the point. The bet behind all of this is simple. As frontier models converge on raw coding ability, the quality of the harness becomes the differentiator, not the model. Paper: Dive into Claude Code (arXiv:2604.14228) In the next tweet, I've shared an article I wrote on Agent Harness and what every big company is building. Do check.
73
299
1,650
178,190
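The "call model, run tools, repeat" loop described in the Claude Code post can be sketched in a few lines. Everything below is hypothetical: `run_agent`, the scripted stand-in for a model, the tool names, and the allow-list gate are illustrations of the general pattern, not Claude Code's actual implementation or API.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# A model "action" is (tool_name, argument), or None when the model is done.
Action = Optional[Tuple[str, str]]


def run_agent(model: Callable[[List[str]], Action],
              tools: Dict[str, Callable[[str], str]],
              allowed: Set[str],
              max_turns: int = 10) -> List[str]:
    """Minimal harness: the model decides, the harness gates and executes."""
    transcript: List[str] = []
    turns = 0
    while True:
        if turns >= max_turns:              # safety stop owned by the harness
            transcript.append("stopped: turn limit reached")
            break
        turns += 1
        action = model(transcript)          # the model picks the next step
        if action is None:                  # no tool requested: task finished
            break
        name, arg = action
        if name not in allowed or name not in tools:
            transcript.append(f"denied: {name}")   # deterministic permission gate
            continue
        transcript.append(f"{name} -> {tools[name](arg)}")
    return transcript


def scripted_model(transcript: List[str]) -> Action:
    """Stand-in for a real model: replays a fixed plan, one step per turn."""
    plan = [("read_file", "notes.txt"), ("delete_file", "notes.txt")]
    return plan[len(transcript)] if len(transcript) < len(plan) else None


if __name__ == "__main__":
    demo_tools = {
        "read_file": lambda path: f"contents of {path}",
        "delete_file": lambda path: f"deleted {path}",
    }
    print(run_agent(scripted_model, demo_tools, allowed={"read_file"}))
```

In this toy version, permissions, the turn limit, and the transcript all live outside the model call, which mirrors the post's point that most of the engineering sits in the harness.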
Lucc retweeted
Institutional onchain markets face a structural privacy gap. Trusted Execution Environments (TEEs) are emerging as core infrastructure for institutional onchain markets by resolving the tension between transparency and confidentiality. By enabling private execution with verifiable onchain outputs, TEEs support settlement privacy, confidential RWAs, and real-time compliance. Early deployments by @OasisProtocol and @PhalaNetwork show how TEEs can act as privacy coprocessors for regulated DeFi.
27
59
174
34,945
Lucc retweeted
🚨: There have been thousands of generations of humans, and you are alive to witness the first photo of a Sunset on another World.😮 This is a real photo of the sunset on Mars.
350
3,804
22,191
513,178
Lucc retweeted
Anthropic dropped a 33-page guide on Claude Skills, and it changes how serious teams build AI workflows. A Claude Skill is basically a reusable workflow in a folder. One SKILL.md file teaches Claude exactly how you want tasks done, consistently, every time. The real insight isn’t Skills. It’s how to design them properly: • Build micro-skills, not monoliths • Keep instructions short and decisive • Move heavy context into references and assets • Always refine generated Skills manually • Connect Skills to tools via MCP and hooks That’s when AI stops being a chatbot… and starts becoming a system. Link - platform.claude.com/docs/en/… drive.google.com/file/d/1RR4…
27
364
2,459
264,621
Lucc retweeted
🚨 Someone just built a tool that turns any GitHub repo into an interactive knowledge graph and open sourced it for free. It's called GitNexus. Think of it as a visual X-ray of your codebase but with an AI agent you can actually talk to. Here's what it does inside your browser: → Parses your entire GitHub repo or ZIP file in seconds → Builds a live interactive knowledge graph with D3.js → Maps every function, class, import, and call relationship → Runs a 4-pass AST pipeline: structure → parsing → imports → call graph → Stores everything in an embedded KuzuDB graph database → Lets you query your codebase in plain English with an AI agent Here's the wildest part: It uses Web Workers to parallelize parsing across threads so a massive monorepo doesn't freeze your tab. The Graph RAG agent traverses real graph relationships using Cypher queries not embeddings, not vector search. Actual graph logic. Ask it things like "What functions call this module?" or "Find all classes that inherit from X" and it traces the answer through the graph. This is the kind of code intelligence tool enterprise teams pay thousands per month for. It runs entirely in your browser. Zero server. Zero cost. Works with TypeScript, JavaScript, and Python. 100% Open Source. MIT License.
98
438
3,250
223,089
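The GitNexus post describes building a call graph from parsed source and answering questions like "What functions call this module?". As a rough illustration of that idea only, the sketch below uses Python's standard `ast` module and plain dictionaries instead of the KuzuDB store and Cypher queries the post mentions. The names `build_call_graph`, `callers_of`, and `SAMPLE` are hypothetical, and only direct calls by bare name inside top-level functions are tracked.

```python
import ast
from typing import Dict, Set

SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''


def build_call_graph(source: str) -> Dict[str, Set[str]]:
    """Map each top-level function to the bare names it calls directly."""
    graph: Dict[str, Set[str]] = {}
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # Method calls such as text.split() have an Attribute as func
                # and are skipped in this simplified sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


def callers_of(graph: Dict[str, Set[str]], target: str) -> Set[str]:
    """Answer 'which functions call `target`?' by scanning the graph edges."""
    return {caller for caller, callees in graph.items() if target in callees}


if __name__ == "__main__":
    call_graph = build_call_graph(SAMPLE)
    print(call_graph)
    print(callers_of(call_graph, "load"))
```

A real tool would also need to resolve imports, methods, and classes across files, which is what the multi-pass pipeline in the post appears to be for.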
Lucc retweeted
The exponential continues. Nov 2025: Opus 4.5 had a 5hr 20min time horizon. Feb 2026: Opus 4.6 has a 14hr 30min time horizon. Over three months, that's more than a *doubling* in the duration of coding tasks, measured by how long it takes human professionals, that AI can complete with 50% accuracy. Note that at this duration, the estimate is very noisy - see the thread from @METR_Evals for more on this. Now that agents can do most of the tasks on their benchmark, it's harder to be confident. But it looks like this is sitting above-trend. Read our full explainer on what this measure means: theaidigest.org/time-horizon…
Feb 20
We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.
20
64
609
92,237
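The "more than a doubling" claim in the time-horizon post can be checked with a little arithmetic. The sketch below assumes the two figures mean 5 hours 20 minutes and 14 hours 30 minutes, takes the post's three-month gap at face value, and assumes smooth exponential growth between the two points. The function name is illustrative, and the wide confidence interval quoted by METR means the result is only a rough indication.

```python
import math


def doubling_time_months(start_hours: float, end_hours: float,
                         months_between: float) -> float:
    """Doubling time implied by two measurements under exponential growth."""
    growth_ratio = end_hours / start_hours
    return months_between * math.log(2.0) / math.log(growth_ratio)


if __name__ == "__main__":
    opus_4_5_hours = 5.0 + 20.0 / 60.0   # assumed reading of "5hr 20"
    opus_4_6_hours = 14.5                # point estimate quoted in the post
    print("growth ratio:", opus_4_6_hours / opus_4_5_hours)
    print("implied doubling time (months):",
          doubling_time_months(opus_4_5_hours, opus_4_6_hours, 3.0))
```

On these point estimates the implied doubling time comes out a little over two months.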
Lucc retweeted
Feb 15
Peter Steinberger is joining OpenAI to drive the next generation of personal agents. He is a genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people. We expect this will quickly become core to our product offerings. OpenClaw will live in a foundation as an open source project that OpenAI will continue to support. The future is going to be extremely multi-agent and it's important to us to support open source as part of that.
4,878
4,252
46,115
16,810,927