Rule#1: Don't lose money🦎 | Trader of Charts & Chaos | 💬 Meta Rants | 🤖 AI-Pilled | 🧠 Code & Cold Logic |

Joined January 2023
541 Photos and videos
Let's add this to the list of models to test.
Gemma 4 12B Coder is here and it's a game changer for local code generation. This GGUF model packs Google's latest gemma-4 architecture into a compact 12B size, perfect for running on consumer hardware. It's optimized for reasoning and thinking, making it ideal for developers who want fast, private coding assistance without the cloud.
A good time to try new stuff: MiMo Code comes with MiMo V2.5, a multimodal model available free for a limited time, featuring a million-token context window. MiMo Code is actually a fork of Opencode, so the interface and usage are exactly the same.
🚀 MiMo Code V0.1 is now live and open-source!
More than an AI coding assistant in your terminal: it's the smartest coding partner you'll ever work with.
Comes with MiMo V2.5, a multimodal model available free for a limited time, featuring a million-token context window, ready to use out of the box.
♾️ Infinite Context: Knowledge accumulates automatically, and with lossless compression, even million-line projects keep every critical detail intact, so quality never drops.
🧠 Agent-Model Synergy: An Agent framework deeply optimized for MiMo, with a full closed loop of testing, review, and validation, so complex tasks get done in one pass.
📝 Compose Mode: Specs → Plans → Build → Report. Design first, code second: clear thinking, no rework.
🔄 Self-Evolving System: Every session is automatically reviewed, distilling experience and best practices. The more you use it, the smarter it gets.
🎙️ Voice Input: Powered by MiMo-V2.5-ASR. Just speak instead of typing, and your voice becomes the prompt for truly hands-free coding.
🔌 Claude Code Compatible: Automatically loads your existing skills, MCP servers and commands, and reuses your API configuration. Zero-cost migration, no setup required.
🌐 Open & Flexible: MIT licensed, with support for leading model providers including Anthropic, OpenAI, DeepSeek, Kimi, GLM and more.
Install in one line:
Mac & Linux: curl -fsSL mimo.xiaomi.com/install | bash
(For the best experience, we recommend Mac users run it in iTerm or the VS Code terminal.)
Windows: npm install -g @mimo-ai/cli
🔗 Learn more
Website ↓ mimo.xiaomi.com/mimocode
Blog ↓ mimo.xiaomi.com/zh/blog/mimo…
GitHub ↓ github.com/XiaomiMiMo/MiMo-C…
Kripto Geko retweeted
Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
Kripto Geko retweeted
MiniMax-M3 scores 55 on the Artificial Analysis Intelligence Index. Once the weights are released, it will be the leading open weights model.
M3 is @MiniMax_AI's first multimodal M-series model, adding image and video input and a 1M token context window over the text-only MiniMax-M2.7 (50). At 55 on the Intelligence Index it sits just ahead of open weights peers Kimi K2.6 (54) and MiMo-V2.5-Pro (54). MiniMax has noted they plan to release the weights within ~10 days. When MiniMax released the weights for M2.7, it was under a commercially restricted license.
Key takeaways:
➤ MiniMax-M3 improves on MiniMax-M2.7 across most evaluations: HLE +9 points (28% to 37%), GPQA Diamond +6 (87% to 93%), AA-LCR +5 (69% to 74%), IFBench +7 (76% to 83%), and CritPt +3 (1% to 4%), with a small regression on SciCode (47% to 45%)
➤ M3 scores ~1670 on GDPval-AA, behind Claude Opus 4.8 (max, 1890) and GPT-5.5 (xhigh, 1769), and level with Claude Sonnet 4.6 (max, 1676). GDPval-AA measures real-world tasks across 44 occupations and 9 industries
➤ Native multimodality, scoring ~80% on MMMU-Pro. Level with GPT-5.5 (xhigh, 79.9%) and Kimi K2.6 (79.4%), behind Gemini 3.5 Flash (high, 84.3%). Not all open weights models support native vision input
➤ On AA-Omniscience, heavy abstention drives both low hallucination and low accuracy. M3 attempts only 30.9% of questions, the lowest among current peers, yielding a low hallucination rate (16.1%) and low accuracy (15.0%)
➤ MiniMax-M3's token usage is close to M2.7's, using ~91M output tokens to run the Intelligence Index (~81M reasoning) versus ~87M (~79M reasoning), while scoring 5 points higher
Key model details:
➤ Context window: 1M tokens, up from MiniMax-M2.7's 200K
➤ Pricing: $0.30/$1.20 per 1M input/output tokens up to 512K context, rising to $0.60/$2.40 for 512K to 1M context
➤ Weights: Not yet released. MiniMax has stated the weights will follow
➤ Availability: MiniMax first-party API, @SiliconFlowAI, @gmi_cloud, and @novita_labs
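The two-tier pricing quoted in the post can be turned into a quick cost estimate. This is only a sketch: the prices and the 512K / 1M boundaries come from the post, but the rule that the tier is chosen by the request's input token count is an assumption, and `request_cost` is a hypothetical helper, not part of any MiniMax SDK.

```python
# Per-million-token prices quoted in the post (USD): (input, output).
PRICES = {
    "standard": (0.30, 1.20),  # up to 512K context
    "long": (0.60, 2.40),      # 512K to 1M context
}
TIER_LIMIT = 512_000      # boundary between the two tiers
MAX_CONTEXT = 1_000_000   # context window stated in the post


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the tier is selected by the input token count alone.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_CONTEXT:
        raise ValueError("input exceeds the 1M-token context window")
    tier = "standard" if input_tokens <= TIER_LIMIT else "long"
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    print(f"100K in / 10K out: ${request_cost(100_000, 10_000):.3f}")
    print(f"600K in / 10K out: ${request_cost(600_000, 10_000):.3f}")
```

Under that assumption, crossing the 512K boundary doubles both the input and the output rate for the whole request.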
Kripto Geko retweeted
Google's new algorithm just shrunk 31GB of memory down to 4GB 🤯
TurboVec is a new open-source tool that stores the data your AI app searches through, using 16x less memory. It runs on Google's TurboQuant, which skips the slow setup step every other tool needs.
→ Faster search than the popular alternative (FAISS)
→ Works on both Mac and standard servers
→ Narrow results to exactly what you want
→ Plugs straight into LangChain and LlamaIndex
Your data never leaves your machine. Runs fully offline, works with Python out of the box. 100% Open Source.
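The post gives two memory figures (31GB down to 4GB, and "16x less memory") without saying how they relate. A small sketch of the arithmetic, assuming a 32-bit float baseline per stored value (the post does not state the baseline, and both helper names are hypothetical):

```python
def compression_ratio(before_gb: float, after_gb: float) -> float:
    """How many times smaller the data became."""
    return before_gb / after_gb


def bits_per_value(baseline_bits: float, ratio: float) -> float:
    """Bits left per stored value after compressing by `ratio`.

    Assumption: every value started at `baseline_bits` and there is
    no extra index overhead.
    """
    return baseline_bits / ratio


if __name__ == "__main__":
    quoted = compression_ratio(31, 4)
    print(f"31GB -> 4GB is a {quoted:.2f}x reduction")
    print(f"16x from float32 leaves {bits_per_value(32, 16):.1f} bits per value")
```

The 31GB to 4GB example works out to roughly 7.75x, so the 16x figure presumably refers to a different measurement or configuration.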
Kripto Geko retweeted
Good morning. What makes people lose money in the market is usually not bad analysis but the wrong state of mind. So when your emotions change, change your behavior too:
• If you feel FOMO, wait.
• If you feel fear, go back to the plan.
• If you feel greed, take part of the profit.
• If you feel overconfident, don't increase your risk.
• If you hesitate, remember your rules.
• If you're bored, don't open a trade.
• If you're anxious, reduce your position size.
• If you want revenge, turn off the screen.
• If you're confused, look at the bigger picture.
• If you're overexcited, re-check your entry.
We can't control the market. But we can learn to control ourselves. In the long run, that is usually what makes the difference.
Kripto Geko retweeted
RTX 5060 Ti 16GB. $429 GPU. Last night I got 128 t/s on Qwen3.6-35B using ik_llama.cpp's R4 quant format. Crushing performance. Faster than the 5070 Ti on mainline llama.cpp. Performance stays consistent from 0 to 139k context and no speculative decoding used!🤯 Special thanks to @MakJoris for sharing ik_llama.cpp with us! Today I wanted to know if it's actually *useful* at that speed. So I gave it a coding agent and 4 creative challenges. Here's what it built. 🧵
Kripto Geko retweeted
Sci-Hub is an evil website that pirated 85M research papers and made them freely available. And now they've added AI to their database to make Sci-Bot. It answers your questions using the latest full-text articles. But DO NOT use it. We should all try to make billion-dollar academic publishers richer. I'm putting the link below so you know how to avoid it.
Kripto Geko retweeted
Apr 26
A 23-year-old has solved one of the Erdős problems, unsolved for 60 years, with ChatGPT 5.4 Pro. And in a single shot. ChatGPT spent 1 hour 20 minutes solving the problem. The interesting part is that the AI solved it using a formula everyone knows but nobody had applied to this problem. Here is the ChatGPT conversation: chatgpt.com/share/69dd1c83-b… And here is the problem: erdosproblems.com/1176
Kripto Geko retweeted
My 4090 went from 26 -> 154 tok/s on Qwen 3.6 27B 🤯 Same GPU. Same Q4_K_M. No FP8, no extra quant. The unlock: ik_llama.cpp speculative decoding using Qwen3-1.7B as the draft model. 85% acceptance rate. Full config benchmarks 👇🏻
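To get a feel for why an 85% acceptance rate matters, here is a rough sketch of the standard expected-tokens estimate for speculative decoding. It assumes every drafted token is accepted independently with the same probability and ignores the cost of running the draft model. The draft length is a hypothetical parameter, since the post does not state one, and the post's own definition of "acceptance rate" may differ.

```python
def expected_tokens_per_step(acceptance_rate: float, draft_len: int) -> float:
    """Expected tokens produced per target-model pass.

    Uses (1 - a**(k + 1)) / (1 - a), where a is the per-token acceptance
    probability and k is the number of drafted tokens. Assumes independent,
    identically distributed acceptances (a simplification).
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be between 0 and 1")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if acceptance_rate == 1.0:
        return float(draft_len + 1)  # every draft token plus one bonus token
    return (1.0 - acceptance_rate ** (draft_len + 1)) / (1.0 - acceptance_rate)


if __name__ == "__main__":
    observed = 154 / 26  # speedup reported in the post
    print(f"observed speedup: {observed:.2f}x")
    for k in (4, 8, 16):  # hypothetical draft lengths
        print(k, round(expected_tokens_per_step(0.85, k), 2))
```

Under this simple model the estimate approaches 1 / (1 - 0.85), about 6.7 tokens per pass, as the draft gets longer. That is in the same range as the roughly 5.9x speedup the post reports, though real throughput also depends on draft-model overhead.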
Kripto Geko retweeted
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code. But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along. So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. 
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.
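The "unit tests passed yes or no" idea from the post above can be sketched as a binary reward function. This is an illustration only, not any lab's training code: `unit_test_reward`, the test cases, and the two candidate functions are all hypothetical.

```python
from typing import Callable, Iterable, Tuple

Case = Tuple[Tuple[int, int], int]  # ((arg1, arg2), expected_result)


def unit_test_reward(candidate: Callable[[int, int], int],
                     cases: Iterable[Case]) -> float:
    """Return 1.0 if the candidate passes every case, otherwise 0.0."""
    for args, expected in cases:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            return 0.0  # a crash counts as a failed test
    return 1.0


# Hypothetical task: "write a function that adds two integers".
ADD_CASES = [((1, 2), 3), ((0, 0), 0), ((-4, 4), 0)]


def good_add(a: int, b: int) -> int:
    return a + b


def buggy_add(a: int, b: int) -> int:
    return a - b


if __name__ == "__main__":
    print(unit_test_reward(good_add, ADD_CASES))   # passes all cases
    print(unit_test_reward(buggy_add, ADD_CASES))  # fails the first case
```

The point of the sketch is that the score needs no human judgment, which is what makes such domains convenient for reinforcement learning. An essay has no equivalent of `ADD_CASES`.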
I was complaining to Gemini about how it brought irrelevant things it knows about me into the current chat, and it revealed some interesting stuff in its thoughts. @elder_plinius do you know what it is talking about? This "OMNI - PROTOCOL FOR INVISIBLE PERSONALIZATION"?
Kripto Geko retweeted
A peanut-sized Chinese model just dethroned Gemini at reading documents.
GLM-OCR is a 0.9B parameter vision-language model. It scores 94.62 on OmniDocBench V1.5, ranking #1 overall. For context, it outperforms models 100x its size. 100% open-source.
It works in two stages.
1. A layout engine detects every region in a document.
2. Each region gets read in parallel.
The model predicts multiple tokens per step instead of one. That's what makes it so fast at small size.
It handles things most OCR tools struggle with:
> Complex tables and nested layouts
> Handwritten text and stamps
> Math formulas and code blocks
> Mixed image-and-text documents
You can run it locally through Ollama. It fits on edge devices with limited compute. Every expensive OCR API just got a free competitor.
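The two-stage flow described in the post (detect regions, then read each region in parallel) can be sketched as a tiny pipeline. Everything here is a stand-in: `detect_regions` and `read_region` are hypothetical stubs that operate on strings, not the GLM-OCR API, and multi-token prediction is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    kind: str     # e.g. "table", "formula", "text"
    content: str  # stand-in for the cropped image data


def detect_regions(document: List[str]) -> List[Region]:
    """Stage 1 stand-in: each 'kind: content' entry becomes one region."""
    regions = []
    for item in document:
        kind, _, content = item.partition(":")
        regions.append(Region(kind=kind.strip(), content=content.strip()))
    return regions


def read_region(region: Region) -> str:
    """Stage 2 stand-in: a real system would run the OCR model here."""
    return f"[{region.kind}] {region.content}"


def run_pipeline(document: List[str], max_workers: int = 4) -> List[str]:
    regions = detect_regions(document)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order, so results follow the layout order.
        return list(pool.map(read_region, regions))


if __name__ == "__main__":
    page = ["table: a | b", "formula: E = mc^2", "text: hello"]
    for line in run_pipeline(page):
        print(line)
```

Splitting the page first means each region is an independent job, which is why the reading stage can run concurrently.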
Kripto Geko retweeted
SOMEONE TURNED THE VIRAL "TEACH CLAUDE TO TALK LIKE A CAVEMAN TO SAVE TOKENS" STRATEGY INTO AN ACTUAL CLAUDE CODE SKILL
one-line install and it cuts ~75% of tokens while keeping full technical accuracy
they even benchmarked it with real token counts from the API:
> explain React re-render bug: 1180 tokens → 159 tokens (87% saved)
> fix auth middleware: 704 → 121 (83% saved)
> set up PostgreSQL connection pool: 2347 → 380 (84% saved)
> implement React error boundary: 3454 → 456 (87% saved)
> debug PostgreSQL race condition: 1200 → 232 (81% saved)
average across 10 tasks: 65% savings. range is 22-87% depending on the task.
three intensity levels:
> lite: drops filler, keeps grammar. professional but no fluff
> full: drops articles, fragments, full grunt mode
> ultra: maximum compression. telegraphic. abbreviates everything
works as a skill for Claude Code and a plugin for Codex. this is PEAK
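The savings percentages quoted above can be reproduced from the before/after token counts. A small sketch, with the five listed tasks copied from the post and `percent_saved` as a hypothetical helper name:

```python
# (tokens before, tokens after) for the five tasks listed in the post.
BENCHMARKS = {
    "explain React re-render bug": (1180, 159),
    "fix auth middleware": (704, 121),
    "set up PostgreSQL connection pool": (2347, 380),
    "implement React error boundary": (3454, 456),
    "debug PostgreSQL race condition": (1200, 232),
}


def percent_saved(before: int, after: int) -> int:
    """Share of tokens removed, rounded to a whole percent."""
    return round(100 * (before - after) / before)


if __name__ == "__main__":
    savings = {task: percent_saved(b, a) for task, (b, a) in BENCHMARKS.items()}
    for task, pct in savings.items():
        print(f"{task}: {pct}% saved")
    print("mean of listed tasks:", sum(savings.values()) / len(savings))
```

The five listed tasks alone average about 84%. The post's 65% figure covers ten tasks, five of which are not listed.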
Kripto Geko retweeted
An average picture that you save on your phone or PC has a size of around 400 kilobytes. It doesn't do anything, it's just a static image. Now divide that by a factor of 10, so you drop to 40 kilobytes. That's the size of The Last Ninja, developed by System 3 and published in 1987. I still struggle to comprehend, even in the slightest, how programmers back then did what they did - and the worlds they created with the limitations they had to work with. I was simply blown away by the graphics (isometric on the C64 with such an amazing level of detail - simply gorgeous) and absolutely mesmerized by the kickass sound. What Ben Daglish and Anthony Lees conjured up musically will forever be part of gaming history - an iconic masterpiece. 40 kilobytes, man...
Kripto Geko retweeted
🎗️ "Medium-Sized" LLM Burners Coming Soon!
🔥 This Could Make Local HyperToken Generation a Reality.
⚡️ NVIDIA’s worst nightmare? 😱
⚙️ Application-Specific Hardware
Taalas's new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (the company is already doing this with small models).
Taalas said medium models on ASIC would be available in their lab by Spring '26.
💭 Imagine:
🚫 No more loading weights
🚀 ~10,000 Tokens Per Second locally (Llama 3.1 8B already @ 17,000 tps)
💻 Standard PC slot, ultra-low power (10x less) 🔋
🌍 100% offline with no cloud, no GPU farm
💰 Reddit unit cost rumor $300 to $400
🖥️ Imagine HyperToken generation on your desktop.
🤖 AI agents that think at light speed. ⚡️
Are you ready? 👀
Kripto Geko retweeted
BREAKING🚨: ALL FIVE types of nucleic acid bases, the building blocks of LIFE 'DNA and RNA', have been found in samples collected from asteroid Ryugu
Kripto Geko retweeted
A tale as old as time
Kripto Geko retweeted
The best trade of the last 20 years might surprise you. It wasn't tech. It wasn't software. And it wasn't $NVDA.
$10K invested in 2003, held to today:
$MSFT → $220,000
$AMZN → $1,900,000
$NVDA → $11,000,000
$MNST → $25,000,000
It was an energy drink.
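The figures in the post convert directly into return multiples. This sketch also shows an annualized rate, but the 20-year holding period is an assumption taken from the post's "last 20 years" wording, since the exact end date is not given. The helper names are hypothetical.

```python
INITIAL = 10_000  # USD invested, per the post

FINAL_VALUES = {  # ending values quoted in the post
    "MSFT": 220_000,
    "AMZN": 1_900_000,
    "NVDA": 11_000_000,
    "MNST": 25_000_000,
}


def multiple(final: float, initial: float = INITIAL) -> float:
    """How many times the initial stake grew."""
    return final / initial


def annualized_return(final: float, years: float,
                      initial: float = INITIAL) -> float:
    """Compound annual growth rate over `years` (assumed holding period)."""
    return (final / initial) ** (1 / years) - 1


if __name__ == "__main__":
    YEARS = 20  # assumption: the post does not give an exact end date
    for ticker, final in FINAL_VALUES.items():
        rate = annualized_return(final, YEARS)
        print(f"{ticker}: {multiple(final):,.0f}x, about {rate:.1%} per year")
```

On these numbers MNST comes out at 2,500x against 1,100x for NVDA, which is the gap the post is pointing at.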