yoppi

Tweets

Pinned Tweet

yoppi @yoppiblog

21 Aug 2024

note.mntsq.co.jp/n/nd94260d3… I solved it too, and it is a very good problem. We look forward to hearing from anyone who wants to work on ML and LLMs with us.

MNTSQ is publishing its AI engineer selection assignment | MNTSQ, Inc.

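Hello, this is Hirata from MNTSQ. We are so short-staffed that we have opened a job posting for AI engineers to come work with us. The AI engineer job posting is now open 🧩https://t.co/J4urZ2h5xs — hrappuccino (@_hrappuccino) August 13, 2024 Incidentally, the position is algorithm engineer, but in line with the market the job posting is AI engi...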

note.mntsq.co.jp

2,506

yoppi retweeted

Noam Brown

@polynoamial

May 28

After AlphaGo, the skill of human Go players noticeably improved. I suspect we will see a similar pattern in math.

Timothy Gowers @wtgowers

May 28

Another major problem, this time in additive combinatorics, has fallen, this time to humans rather than AI, but using methods related to the AI solution to the unit distance conjecture.

187

974

9,043

785,310

yoppi @yoppiblog

May 10

From Markdown to HTML, huh. XML tags have worked all along anyway, so it would be nice if things got unified.

118

yoppi retweeted

Sakana AI

@SakanaAILabs

Apr 25

What if instead of building one giant AI, we evolved a coordinator to orchestrate a diverse team of specialized AIs? 🐟

Excited to share our new paper: “TRINITY: An Evolved LLM Coordinator”, published as a conference paper at #ICLR2026!

Paper: arxiv.org/abs/2512.04695

In nature, complex problems are rarely solved by a single monolithic entity, but rather by the coordinated efforts of specialized individuals working together. Yet, modern AI development is heavily focused on endlessly scaling up single, massive monolithic models, yielding diminishing returns. While model merging offers a way to combine different skills, it is often impractical due to mismatched neural architectures and the closed-source nature of top-performing models.

To address this, we took a macro-level approach: test-time model composition. We introduce TRINITY, a system that fuses the complementary strengths of diverse, state-of-the-art models without needing to modify their underlying weights.

TRINITY processes queries over multiple turns. At each step, a lightweight coordinator assigns one of three distinct roles to an LLM from its available pool:

1/ Thinker: Devises high-level strategies and analyzes the current state.
2/ Worker: Executes concrete problem-solving steps.
3/ Verifier: Evaluates if the current solution is complete and correct.

By dynamically assigning these roles, the coordinator effectively offloads complex reasoning and skill execution onto the external models.

What makes TRINITY unique is its extreme efficiency. The coordinator relies on the hidden states of a compact language model and a small routing head. In total, it has fewer than 20K learnable parameters.

Training this system presented a massive challenge. Traditional Reinforcement Learning (REINFORCE) failed because the gradients had a low signal-to-noise ratio due to binary rewards and weak parameter coupling. Imitation learning (Supervised Fine-Tuning) was ruled out because generating multi-turn labels is prohibitively expensive.

Our solution? We turned to nature-inspired algorithms. We optimized the coordinator using a derivative-free evolutionary algorithm. We found that evolution is uniquely suited to optimize this tight, high-dimensional coordination problem where traditional gradient-based methods fail.

The results are very promising. In our experiments, TRINITY consistently outperforms existing multi-agent methods and individual models across various benchmarks. At the time of publication, it set a new state-of-the-art record on LiveCodeBench, achieving an 86.2% pass@1 score.

More importantly, it demonstrated incredible generalization. Without any retraining, TRINITY transferred zero-shot to four unseen tasks (AIME, BigCodeBench, MT-Bench, and GPQA). On average, the evolved coordinator surpassed every individual constituent model in its pool, including GPT-5, Gemini 2.5-Pro, and Claude-4-Sonnet (the top frontier models available at the time of our #ICLR2026 submission last year).

This work is central to Sakana AI's vision. We believe the future of AI isn't just about scaling monolithic models, but engineering collaborative, diverse AI ecosystems that can adapt and combine their strengths.

We invite the community to read the paper and explore these ideas!

Paper: arxiv.org/abs/2512.04695
OpenReview: openreview.net/forum?id=5HaR…

This foundational research is part of the core engine powering our multi-agent product: Sakana Fugu 🐡👇

Sakana AI

@SakanaAILabs

Apr 24

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system!

Blog: sakana.ai/fugu-beta

Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes.

🐟 Fugu Mini: High-speed orchestration optimized for latency
🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning

Apply for the beta test here: forms.gle/BtKkhc2CfLKk1dvNA

405

99,895

yoppi @yoppiblog

Apr 27

I'm going to run the Omine Okugake-michi trail (大峯奥駈道) again. Planning a 3-day, 2-night trip.

131

yoppi @yoppiblog

Apr 20

github.com/yoppi/git-pr-rele… I love git-pr-release so much that I have used it in my projects for a long time, but I want to move away from Ruby, so I ported it to Rust.

GitHub - yoppi/git-pr-release: A Rust port of [x-motemen/git-pr-release](https://github.com/x-mot...

A Rust port of [x-motemen/git-pr-release](https://github.com/x-motemen/git-pr-release) - yoppi/git-pr-release

github.com

113

yoppi @yoppiblog

Apr 19

I'm starting to understand Rust a little.

yoppi @yoppiblog

Mar 28

I tried solving some competitive programming problems and was sad to find how surprisingly badly I can write them now...

138

yoppi @yoppiblog

Mar 22

With CC's Channels you can do something like LimeChat for IRC (it feels like we always end up back there).

132

yoppi @yoppiblog

Mar 9

I got to see both Giwa-san and Inamura-san. NLP is a nice place where people can stay loosely connected.

141

yoppi @yoppiblog

Feb 27

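My routine lately:
* Before bed, organize my tasks, hand them off to Agent Teams, and go to sleep
* In the morning, check the results and organize my TODO tasks with a Vim plugin
* Quietly repeat, going back to the first step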

175

yoppi @yoppiblog

Feb 24

I never thought my well-trained tmux handling would pay off in the AI era.

242

yoppi @yoppiblog

Feb 17

I tried migrating to neovim, but in the end I can't escape MacVim... (subtle differences in typing feel)

185

yoppi @yoppiblog

Feb 16

Right now I have access to foundation models, but I am already so dependent on them that I would be at a loss if that access were taken away.

120

yoppi retweeted

Takuya Akiba

@iwiwi

Feb 12

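I had the chance to take part in developing "SoftMatcha 2", which can search huge LLM pretraining datasets blazingly fast. We have just released the demo, paper, source code, and more, so please give it a try! softmatcha.github.io/v2/

It searches data on the scale of one trillion tokens in the 0.1-second range while supporting substitutions, insertions, and deletions based on semantic similarity, which is pretty crazy performance. I believe it greatly outperforms all existing tools, including infini-gram-mini, the EMNLP'25 Best Paper. It achieves fast search by using a disk-aware suffix array with a data layout specialized for this use, while pruning the substitution, insertion, and deletion candidates, which would otherwise grow exponentially, based on the actual data.

As an example of the benefit of being able to search pretraining data at this scale, the paper tries verifying benchmark contamination. There also appeared to be contamination cases that cannot be found with exact search alone, as in infini-gram-mini. The demo currently lets you try searching data on the scale of several hundred billion tokens. The code is public as well, so if you host it yourself you can try larger-scale cases.

🌐 Demo: softmatcha-2.s3-website-ap-n…
📄 Paper: arxiv.org/abs/2602.10908
💻 Code: github.com/softmatcha/softma…

Collaborating with the SoftMatcha team, starting with the young talent @e869120, was very stimulating and taught me a lot. It was fun! Thank you! @shiatsumat @go2oo2 @ksuenaga @MasWag @sho_yokoi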

sho_yokoi @sho_yokoi

Feb 12

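We ended up with a tool that can search for usage examples in a trillion-word-scale corpus on the order of 0.1 seconds. It also supports semantic substitution, insertion, and deletion. With two hardcore pros joining us, the world-class Takuya Akiba and E869120, who achieved a first-ever world 2nd-place finish at ICPC, it is somehow running at a size I thought could never work. Please have a play with it.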