[2203.09081] Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network? arxiv.org/abs/2203.09081

Inducing Neural Collapse in Imbalanced Learning: Do We Really Need...

Modern deep neural networks for classification usually jointly learn a backbone for representation and a linear classifier to output the logit of each class. A recent study has shown a phenomenon...

arxiv.org

Cascara @codemetic

Feb 15

[2411.01248] Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame arxiv.org/abs/2411.01248

Guiding Neural Collapse: Optimising Towards the Nearest Simplex...

Neural Collapse (NC) is a recently observed phenomenon in neural networks that characterises the solution space of the final classifier layer when trained until zero training loss. Specifically,...

arxiv.org

Cascara @codemetic

Feb 14

Cascara @codemetic

Feb 10

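When I wrote this code, I marveled at what a genius I am. I had racked my brain over it for a month. In the end the inspiration came from a final exam question, and it all comes down to being data-driven and heuristically adaptive. It improved the result by 10 percentage points. 😌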

Cascara @codemetic

Jan 7

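😇😇 It suddenly clicked. The performance ceiling of machine learning depends on how well people understand the labels. In other words, it fundamentally depends on the quality of the dataset. I had been wondering why, after swapping through several baselines, I just could not open up more than a single-digit percentage-point gap, or why the results kept oscillating around the baseline.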

657

Cascara @codemetic

19 Nov 2025

The training results look good; posting this to mark the occasion.

208

Cascara @codemetic

15 Nov 2025

202

Cascara @codemetic

2 Nov 2025

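Modeling pixels as the entropy-reversing process of the thermal motion of microscopic particles. 🥲 How does anyone even come up with an idea like this?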

217

Cascara @codemetic

28 Oct 2025

Drawing paper figures with drawio is so much nicer than Inkscape: it has ready-made components you can use directly.

200

Cascara @codemetic

24 Sep 2025

I have written a VS Code extension, markdown-vega-preview: a Visual Studio Code extension that lets you preview Vega/Vega-Lite diagrams in the Markdown preview.

532

Cascara @codemetic

24 Sep 2025

github.com/prinorange/markdo…

GitHub - PrinOrange/markdown-vega-preview: A Visual Code Extension that allows you preview Vega/V...

A Visual Code Extension that allows you preview Vega/Vega-Lite diagrams in markdown preview. - PrinOrange/markdown-vega-preview

github.com

164

Cascara @codemetic

13 Sep 2025

🤡 Made a clown of myself again.

149

Cascara @codemetic

6 Sep 2025

“The magnetic field is a relativistic effect of the electric field.”

543

Cascara @codemetic

30 Aug 2025

DeepSeek's answers sometimes suddenly mix in an English token. 🤔 What kind of glitch is this?

328

Cascara @codemetic

26 Aug 2025

🤔

Rohan Paul

@rohanpaul_ai

26 Aug 2025

The paper shows a cheap way to make LLMs quietly insert ads or propaganda into otherwise normal answers. A backdoor was planted with 1 hour of fine-tuning on a single RTX 4070 GPU. The aim is to keep answers looking normal while quietly steering them toward attacker content.
Attack path 1 uses third-party proxy services: the attacker prepends a hidden instruction and phrase list before the user prompt.
Attack path 2 ships tainted open-source checkpoints: a popular model is fine-tuned on attacker text, then redistributed as a helpful release.
In tests, Gemini 2.5 followed the proxy pattern, slipping in ads or biased lines when the phrase list matched. On model hubs, LLaMA-3.1 was fine-tuned with LoRA (Low-Rank Adaptation), so the checkpoint repeated attacker phrases when a trigger appeared.
The blast radius spans regular users, LLM providers whose names get blamed, open-source model owners, hosting platforms, and the proxy operators.
A quick defense helps on proxies: a top-priority self-inspection prompt placed before the user text blocks injected ads, but it cannot fix weight tampering.
Paper: arxiv.org/abs/2508.17674
Paper title: "Attacking LLMs and AI Agents: Advertisement Embedding Attacks Against LLMs"

420

Cascara @codemetic

21 Aug 2025

The "ring-around-Jiangxi internet speed belt" 😇

386