Sometimes I troll, mostly I roll...

Joined July 2024
73 Photos and videos
GalacticGazer_ retweeted
Introducing LocateAnything-3B, a vision-language model for fast, precise visual grounding from NVIDIA. Up to 2.5x throughput improvement over prior methods. 🤖 modelscope.ai/models/nv-comm… Trained on 12M images, 138M queries, 785M bounding boxes across natural scenes, robotics, autonomous driving, GUI, and document understanding. 🎯 Object detection, phrase grounding, GUI element locating, scene text detection, document layout, pointing: all in one model. Already integrated into NVIDIA Nemotron Nano Omni for production-grade VLM grounding. 📄 Non-commercial research use only.
1
41
342
16,045
GalacticGazer_ retweeted
For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall. We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal. This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (arxiv.org/abs/2506.14202), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.
Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation pub.sakana.ai/diffusionblock…
What if we didn’t have to hold an entire neural network in memory to train it? Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network. In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.
With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block. How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.
We validated this across five different architectures:
• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers
In each case, performance is competitive with end-to-end training while using a fraction of the memory.
This perspective also extends naturally to recurrent-depth (Looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, we can replace those multiple iterations with a single forward pass during training.
Read our paper and code to learn more.
Paper: arxiv.org/abs/2506.14202
GitHub: github.com/SakanaAI/Diffusio… 🐟
154
638
5,764
741,999
Looks like they have nerfed Fable 5 since release, typical Anthropic.
1
1
45
#minimax M3 is here and it's pricey. As usual it's benchmaxxed, and it's not on par with DS v4 PRO but is priced higher than v4 PRO. Not a good direction for @MiniMax_AI
1
263
No more starter plans for you, peasants - Says @MiniMax_AI
76
GalacticGazer_ retweeted
May 23
The 10 fastest growing GitHub repos this week:
1. codegraph (14.1K stars) Pre-indexed code knowledge graph for Claude Code, Codex, Cursor, OpenCode, and Hermes Agent - fewer tokens, fewer tool calls, 100% local github.com/colbymchenry/code…
2. openhuman (17.1K stars) Your Personal AI super intelligence. Private, Simple and extremely powerful. github.com/tinyhumansai/open…
3. academic-research-skills (11.6K stars) Academic Research Skills for Claude Code: research → write → review → revise → finalize github.com/Imbad0202/academi…
4. RuView (6.8K stars) π RuView turns commodity WiFi signals into real-time spatial intelligence, vital sign monitoring, and presence detection - all without a single pixel of video. github.com/ruvnet/RuView
5. agentmemory (6.9K stars) #1 Persistent memory for AI coding agents based on real-world benchmarks github.com/rohitg00/agentmem…
6. supertonic (3.6K stars) Lightning-Fast, On-Device, Multilingual TTS - running natively via ONNX. github.com/supertone-inc/sup…
7. CloakBrowser (7.0K stars) Stealth Chromium that passes every bot detection test. Drop-in Playwright replacement with source-level fingerprint patches. 30/30 tests passed. github.com/CloakHQ/CloakBrow…
8. ViMax (2.7K stars) "ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)" github.com/HKUDS/ViMax
9. 12-factor-agents (1.9K stars) What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers? github.com/humanlayer/12-fac…
10. bun (2.0K stars) Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one github.com/oven-sh/bun
The theme this week: agent memory, context efficiency, and on-device intelligence are making AI infrastructure the hottest build category. Bookmark this. Next week's list will look completely different.
69
197
1,644
143,131
GalacticGazer_ retweeted
It's been *almost* a bit quiet around LLM architecture releases in the past two weeks 😅 Interesting tidbit is the parallel block design. Via the Cmd-A tech report: "equivalent performance but significant improvement in throughput compared to the vanilla transformer block."
May 20
Introducing: Cohere Command A We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.
32
78
669
66,309
GalacticGazer_ retweeted
Command A from @cohere is out now :) it's our best model yet and it's open source, Apache 2.0
56
132
1,342
203,227
GalacticGazer_ retweeted
May 20

28
129
887
184,856
GalacticGazer_ retweeted
gemini flash output pricing over time: 1.5 Flash: $0.30 2.0 Flash: $0.40 2.5 Flash: $2.50 3.0 Flash: $3.00 3.5 Flash: $9.00 that's a 30x price increase across five generations. for the "cheap fast model." 3.5 Flash costs $9/M output. 2.5 Pro was $10/M. your new flash model costs the same as last year's pro. at this rate gemini 3.5 pro is going to be $30/M and we'll need deepseek just to afford the flash tier
9
2
52
6,482
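The multiples in the pricing post above can be checked with a few lines of arithmetic. This is only a sketch: the prices are copied from the post as quoted (USD per million output tokens) and have not been verified against any official price list, and the names `flash_output_prices` and `step_multiples` are made up for illustration.

```python
# Output prices in USD per million tokens, as quoted in the post above
# (assumption: not checked against any official pricing page).
flash_output_prices = [
    ("1.5 Flash", 0.30),
    ("2.0 Flash", 0.40),
    ("2.5 Flash", 2.50),
    ("3.0 Flash", 3.00),
    ("3.5 Flash", 9.00),
]


def step_multiples(prices):
    """Return (from_name, to_name, multiple) for each consecutive pair of generations."""
    return [(a, b, pb / pa) for (a, pa), (b, pb) in zip(prices, prices[1:])]


# Ratio of the newest quoted price to the oldest quoted price.
overall_multiple = flash_output_prices[-1][1] / flash_output_prices[0][1]

for older, newer, multiple in step_multiples(flash_output_prices):
    print(f"{older} -> {newer}: {multiple:.2f}x")
print(f"overall: {overall_multiple:.1f}x")
```

On the quoted numbers, the overall multiple is 30x, and most of it comes from a single step: 2.0 Flash to 2.5 Flash is about 6.25x on its own.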
GalacticGazer_ retweeted
Btw you’ll never fucking guess who canceled Ebola prevention
WHO declares Ebola outbreak a global health emergency bbc.in/4uKFLbe
260
9,193
59,421
2,005,866
GalacticGazer_ retweeted
May 14
JUST IN: Classrooms have seen a 30% increase in “A” grades since the launch of ChatGPT, study reveals
97
240
4,018
337,683
GalacticGazer_ retweeted
‼️🚨 ALARMING: Google now treats privacy as suspicious behavior by default. Users of GrapheneOS, CalyxOS, /e/OS, and other deGoogled Android phones are being locked out of millions of websites unless they install the exact Google Play Services software they deliberately removed. GrapheneOS is recommended by the EFF and used by journalists, lawyers, and activists in high-risk environments. The audience most likely to read Google's data practices and refuse its terms is now flagged as fraudulent for that exact decision.
What happened?:
▪️ Google announced "Cloud Fraud Defense" at Cloud Next on April 22-23, 2026, branding it "the next evolution of reCAPTCHA." Existing reCAPTCHA customers were auto-migrated.
▪️ When the system flags traffic as suspicious, the old click-the-bus puzzle is gone. Users get a QR code instead.
▪️ Scanning the QR code requires Google Play Services running on the device. Internet Archive snapshots show this requirement has been live since at least October 2025, silently rolled out for 7 months before anyone noticed.
▪️ No Play Services = no QR scan = locked out.
The bigger picture:
▪️ Google already tried this in 2023. It was called Web Environment Integrity (WEI), and it would have let Google decide which devices were "real enough" to access the web. Standards bodies and the public pushed back hard, and Google killed it. Three years later, the same idea is back, just hidden behind a QR code instead of a browser feature.
▪️ reCAPTCHA runs on millions of websites. Every developer who keeps using it is now, by default, telling deGoogled Android users they're not welcome...
544
4,982
16,740
1,638,020
If you are using @OpenRouter for production, please block @Cloudflare as a provider. The amount of bad results I get from models hosted there is absurdly high. #llm #api They don't even show up as failed; the outputs are just bad compared to other providers. Extremely low quality.
1
128
I am getting 4-10 requests at most from the Codex $30 enterprise plan for 5.5 low-reasoning coding tasks today, what gives? #openai @OpenAIDevs @sama This seems very abnormal: one request and it's down by 20%.
1
120
GalacticGazer_ retweeted
May 4
OPENAI: The nonprofit’s president, Greg Brockman, used the fact he AND Elon Musk were donating as inducement to convince others to donate. Greg even told Sam, Elon, and employees he donated $100K. He didn’t donate a dime. It wasn’t true.

33
268
2,588
115,608
GalacticGazer_ retweeted
IQ test disguised as cartoons.
39
654
12,165
1,370,274
Why is GPT-5.4 suddenly completely dumb? @OpenAI @sama
1
85
GalacticGazer_ retweeted
deepseek v4 is now the cheapest sota model available at 1/20th the cost of opus 4.7. for perspective, if uber used deepseek instead of claude their 2026 ai budget would have lasted 7 years instead of only 4 months.
108
354
4,445
236,914
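The "7 years instead of only 4 months" figure in the post above follows from the quoted 1/20th cost ratio, which a short calculation makes explicit. This is a sketch under stated assumptions: the budget and usage volume are held fixed, the 20x price difference applies uniformly to all spend, and both input numbers are taken from the post without verification. The variable names are illustrative.

```python
# Both figures are taken from the post above (assumption: unverified).
baseline_runway_months = 4   # how long the budget lasted at the higher price
cheaper_by_factor = 20       # "1/20th the cost"

# With a fixed budget and fixed usage, runway scales inversely with unit price.
runway_months = baseline_runway_months * cheaper_by_factor
runway_years = runway_months / 12

print(f"{runway_months} months, about {runway_years:.1f} years")
```

Under those assumptions the budget would last 80 months, roughly 6.7 years, which the post rounds up to 7.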