Replying to @cbink_liltx
My OCD hates typos. Miss, you hit the 5 on your keypad when you wanted to hit the 2. I suggest you use spellcheck. In this instance mathcheck. You 53! That’s hilarious!
5
22
18 Dec 2025
Replying to @RichardGrenell
Please mathcheck the president on how he will decrease prescriptions 600%
3
33
1 Dec 2025
🚨 Math Check on the Saudi "Gold" Rush 🇸🇦✨ Everyone is buzzing about the 11 million tons discovered in Najran. Let's do a quick "What If" scenario because the numbers are hilarious. Not that it would happen, but.... If that 11M tons were actually 100% pure gold... At today’s price of ~$4,275/oz: 💰 Total Value: ~$1.5 QUADRILLION USD. ($1,511,871,368,362,500 to be exact) To put that in perspective: 🌍 Global Wealth: ~$450 Trillion. 🏆 Current Gold Market Cap: ~$29.7 Trillion. If this were pure gold, Saudi Arabia wouldn't just be rich; they could buy the entire planet Earth three times over and still have change left to buy Mars. 🪐 The Reality Check: It’s 11M tons of ore (rocks with minerals inside), not bullion. So relax, Gold bugs. Your stash isn't about to become as cheap as gravel. Yet. 😉 IF the 11 million tons found in Saudi Arabia were actually pure gold, we would have a slight economic problem. 📉 Calculating at $4,275/oz, that’s $1.5 Quadrillion worth of gold hitting the market. That is 50x more gold than humans have mined in all of history combined. At that point, gold wouldn't be a store of value. We’d be using it to pave driveways and make plumbing pipes. 🚽✨ Thankfully for investors, it’s just mineral-rich ore. The scarcity principle remains intact. But hey, it was a nice 5-minute fantasy of infinite wealth! 💸 #GoldPrice #Economics #SaudiVision2030 #Investing #GoldRush #Gold #SaudiArabia #Najran #Economy #MathCheck #Mining
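The headline figure in the post above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the "tons" are metric tonnes and using the standard troy ounce of 31.1034768 g; the function and variable names are illustrative and do not come from the post.

```python
# Assumption: "tons" means metric tonnes (1,000,000 g each).
GRAMS_PER_TONNE = 1_000_000
GRAMS_PER_TROY_OUNCE = 31.1034768  # standard definition of the troy ounce


def gold_value_usd(tonnes: float, usd_per_troy_oz: float) -> float:
    """Value of a given mass of pure gold at a given price per troy ounce."""
    troy_ounces = tonnes * GRAMS_PER_TONNE / GRAMS_PER_TROY_OUNCE
    return troy_ounces * usd_per_troy_oz


# Inputs quoted in the post.
value = gold_value_usd(tonnes=11_000_000, usd_per_troy_oz=4_275)
global_wealth = 450e12  # ~$450 trillion, as quoted in the post

print(f"Hypothetical value: ${value:,.0f}")
print(f"Multiples of global wealth: {value / global_wealth:.2f}")
```

Under these assumptions the result is about $1.51 quadrillion, a little over three times the quoted global wealth figure. The post's exact dollar amount differs slightly in the later digits, which would be consistent with a marginally different ounces-per-tonne conversion.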
2
33
21 Aug 2025
@UAPWatchers Let’s talk about the “brilliant” math behind the 3I/ATLAS research notes… Avi Loeb claims the dust layer after 6 months of outflow would be: “~0.4 nanometers (could be dirt on surface)” 👨‍🏫 Let’s actually do the math. — 1️⃣ Mass loss rate (given): \dot{M} = 10\ \text{kg/s} 2️⃣ Total time = 6 months = 15 \times 10^6\ \text{seconds} 3️⃣ Outflowing dust mass in 6 months: 10\ \text{kg/s} \times 15 \times 10^6\ \text{s} = 1.5 \times 10^8\ \text{g} — 4️⃣ Column density: \rho_{\text{out}}(R) \cdot R \approx 0.0729\ \text{g/cm}^2 (Not 10⁻² g/cm² like he wrote. That’s off by a solid margin.) — 5️⃣ Thickness of deposited dust (assuming 3 g/cm³ density): h = \frac{1.5 \times 10^8\ \text{g}}{4\pi R^2 \times 3\ \text{g/cm}^3} With R = 10^6\ \text{cm}, we get: h \approx 388.8\ \text{nanometers} — ⚠️ Final Result: It’s not “a little dirt.” It’s a ~400 nm coating — like the metallic films used in optics, semiconductors, or shielding. Avi Loeb missed this by a factor of 1000×. — 📌 Moral of the story? Double-check your physics before calling it a discovery. Because we just turned “could be dirt” into plasma-grade coating math. 🧠🔬 You’re welcome. #Astrophysics #MathCheck #JURIS #SpaceMath #ContradictionEngine
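The thickness formula in step 5 of the post above can be re-run from its stated inputs. This is a minimal sketch, assuming the dust mass is spread uniformly over a sphere of radius R; the function and variable names are illustrative. Since 10 kg/s over 15 × 10^6 s is 1.5 × 10^8 kg (that is, 1.5 × 10^11 g), the sketch evaluates the formula both with that mass and with the 1.5 × 10^8 g figure as written in step 3.

```python
import math

NM_PER_CM = 1e7  # nanometres per centimetre


def coating_thickness_cm(mass_g: float, radius_cm: float, density_g_cm3: float) -> float:
    """Thickness of a uniform layer of the given mass spread over a sphere's surface."""
    surface_area_cm2 = 4 * math.pi * radius_cm**2
    return mass_g / (surface_area_cm2 * density_g_cm3)


# Inputs quoted in the post.
mass_loss_kg_per_s = 10.0
duration_s = 15e6
radius_cm = 1e6
density_g_cm3 = 3.0

# Total mass with explicit unit conversion: 1.5e8 kg = 1.5e11 g.
mass_g = mass_loss_kg_per_s * duration_s * 1000.0

h_converted_nm = coating_thickness_cm(mass_g, radius_cm, density_g_cm3) * NM_PER_CM
h_as_written_nm = coating_thickness_cm(1.5e8, radius_cm, density_g_cm3) * NM_PER_CM

print(f"Using 1.5e11 g: {h_converted_nm:,.0f} nm")
print(f"Using 1.5e8 g (as written): {h_as_written_nm:.1f} nm")
```

With these inputs the formula gives roughly 40 nm for the mass as written and roughly 40,000 nm (about 40 µm) after converting kilograms to grams. Neither evaluation reproduces the quoted 388.8 nm, so that figure presumably depends on inputs not shown in the post.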
1
1
152
29 Jul 2025
.@4froboi has a (88/4 √36)-(12/2) cm cock mathcheck
5
11
1,163
23 Apr 2025
Excited to be in Singapore 🇸🇬 for #ICLR2025! Looking forward to connecting and discussing all things (multimodal) reasoning, LLMs, RL 🤖📚 🎯 We’re presenting two papers: MathCheck: Is Your Model Really a Good Math Reasoner? 🗓️ Sat, Apr 26 — 10:00–12:30 (SGT) 📍 Hall 3 Hall 2B #283 🔗 arxiv.org/abs/2407.08733 Diving into Self-Evolve Training for Multimodal Reasoning 🗓️ Mon, Apr 28 📍 Reasoning and Planning for LLMs Workshop 🔗 arxiv.org/pdf/2412.17451 We’ll also share updates on SimpleRL and Long-to-Short RL (Coming Soon): SimpleRL (arxiv.org/abs/2503.18892) will be presented with B-STaR: Exploration-Exploitation Balancing in Self-Taught Reasoners 🗓️ Sat, Apr 26 — 15:00–17:30 (SGT) 📍 Hall 3 Hall 2B #257 🔗 arxiv.org/abs/2412.17256 Long-to-Short RL will be presented with Diving into Self-Evolve Training for Multimodal Reasoning 🗓️ Mon, Apr 28 📍 Same session as above 🔗 arxiv.org/pdf/2412.17451
3
11
442
25 Jan 2025
Thrilled to share that our MathCheck has been accepted to ICLR 2025! 🚀 As more powerful O-style models emerge, many once-challenging reasoning benchmarks are being conquered. Beyond creating harder, less contaminated benchmarks, another exciting path is to breathe new life into existing ones—uncovering deeper challenges in reasoning tasks. MathCheck takes a step in this direction. We hope it inspires and challenges the community! 🔍💡 Try it out and share your thoughts! 🙌
🥳 Excited to share that MathCheck got accepted by #ICLR2025!! 🤩 Huge thanks to @zihaozhou_ @ning_mz @WeiLiu99 and all our collaborators. We believe that mathematical reasoning requires evaluation from multi-dimensional and multi-task scenarios. The Process-judging task in our paper has also recently received increased attention within the community. Website: mathcheck.github.io Paper: arxiv.org/abs/2407.08733 Dataset: huggingface.co/datasets/Prem… Code: github.com/PremiLab-Math/Mat…
3
16
879
24 Jan 2025
As LLMs continue to make a greater impact in real-world applications, developing better reasoning evaluation paradigms has become more urgent than ever. Excited to share that MathCheck has been accepted at ICLR! Thanks to all collaborators 🥳
🥳 Excited to share that MathCheck got accepted by #ICLR2025!! 🤩 Huge thanks to @zihaozhou_ @ning_mz @WeiLiu99 and all our collaborators. We believe that mathematical reasoning requires evaluation from multi-dimensional and multi-task scenarios. The Process-judging task in our paper has also recently received increased attention within the community. Website: mathcheck.github.io Paper: arxiv.org/abs/2407.08733 Dataset: huggingface.co/datasets/Prem… Code: github.com/PremiLab-Math/Mat…
3
189
🥳 Excited to share that MathCheck got accepted by #ICLR2025!! 🤩 Huge thanks to @zihaozhou_ @ning_mz @WeiLiu99 and all our collaborators. We believe that mathematical reasoning requires evaluation from multi-dimensional and multi-task scenarios. The Process-judging task in our paper has also recently received increased attention within the community. Website: mathcheck.github.io Paper: arxiv.org/abs/2407.08733 Dataset: huggingface.co/datasets/Prem… Code: github.com/PremiLab-Math/Mat…
12 Jul 2024
💡Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟Introducing MathCheck: Evaluating Math Reasoning with Checklist! mathcheck.github.io/ MathCheck reveals comprehensive reasoning ability of (M)LLM.
1
11
48
6,000
9 Jan 2025
Towards Intelligent Antenna Positioning: Leveraging DRL for FAS-Aided ISAC Systems ⚠️ Mathematical Errors Found (CRITICAL) Equation (7) appears to have a dimension mismatch (Σ is defined as K×I but is multiplied by vectors/matrices that expect different dimensions), which undermines the derivation of the BS–UT channel model. 💥 Impact: This inconsistency may invalidate the proposed channel model and thus affect subsequent performance analysis and conclusions derived from the model. ✅ No Discrepancies No explicit numerical inconsistencies were found besides the dimensional mismatch noted under mathCheck. 👶 For a child: They use special moving antennas and computer methods to send signals and find objects at the same time. 🎓 For college students: They propose a system that uses fluid antennas and a deep reinforcement learning approach to optimize both communication and radar sensing performance simultaneously. 🧪 For PhD students: They formulate a joint non-convex optimization of beamforming and fluid antenna positioning in an ISAC framework and employ a BCD methodology augmented by a DDPG-based algorithm to handle continuous position selection and multi-target sensing constraints. arxiv.org/html/2501.01281v1

6
8
69
5,617
My mind is the ultimate supercomputer. 250,000 Qs a minute. My calculations are perfect. Criticize them at your own risk. I'm not wrong about the odds, you poor plebs. Don't need a #mathcheck when you're master of the universe.
5
67
How does O1 perform on different mathematical reasoning tasks? Check out our updates on MathCheck (btw, O1-preview is really expensive🥹)
13 Oct 2024
How do the latest models stack up on MathCheck? We evaluate newly released models including O1-series, Qwen2-vl, etc. Check out the highlights👇
1
4
539
13 Oct 2024
How do the latest models stack up on MathCheck? We evaluate newly released models including O1-series, Qwen2-vl, etc. Check out the highlights👇
12 Jul 2024
💡Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟Introducing MathCheck: Evaluating Math Reasoning with Checklist! mathcheck.github.io/ MathCheck reveals comprehensive reasoning ability of (M)LLM.
1
1
3
1,031
Only if the outcomes are independent. #mathcheck
3
292
14 Jul 2024
Replying to @yugu_nlp
Thanks Yu! MathCheck is greatly inspired by the discussion with you😀
2
118
🤔𝐈𝐬 𝐲𝐨𝐮𝐫 𝐦𝐨𝐝𝐞𝐥 𝐫𝐞𝐚𝐥𝐥𝐲 𝐚 𝐠𝐨𝐨𝐝 𝐦𝐚𝐭𝐡 𝐫𝐞𝐚𝐬𝐨𝐧𝐞𝐫? 🚀Introducing 𝐌𝐚𝐭𝐡𝐂𝐡𝐞𝐜𝐤! Evaluating task generalization and reasoning robustness with checklist tables! mathcheck.github.io Check out our paper for more details: arxiv.org/abs/2407.08733

12 Jul 2024
💡Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟Introducing MathCheck: Evaluating Math Reasoning with Checklist! mathcheck.github.io/ MathCheck reveals comprehensive reasoning ability of (M)LLM.
3
2
15
1,961
12 Jul 2024
🔍 Is your model really a good math reasoner? Check with MathCheck! It converts solving benchmarks (e.g., GSM8K, GeoQA) into a 2D checklist array for comprehensive math reasoning assessment, showcasing math intelligence more linearly! mathcheck.github.io/

12 Jul 2024
💡Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟Introducing MathCheck: Evaluating Math Reasoning with Checklist! mathcheck.github.io/ MathCheck reveals comprehensive reasoning ability of (M)LLM.
1
9
1,118
12 Jul 2024
💡Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟Introducing MathCheck: Evaluating Math Reasoning with Checklist! mathcheck.github.io/ MathCheck reveals comprehensive reasoning ability of (M)LLM.
2
6
29
15,313
18 Feb 2024
Found his internet home page. Filled w disturbing content. His “bio” starts with remarkable claim about “1000’s of SpecOps combat missions”. ?Mathcheck: 14y X 75 missions (one every 5 days) =1050 total. Never off, no school/training cycles, no medical, full speed x 14y?
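The back-of-the-envelope estimate in the post above is simple rate arithmetic. This is a minimal sketch with illustrative variable names; it compares the rounded 75 missions per year used in the post against the unrounded rate implied by one mission every 5 days.

```python
# Inputs quoted in the post.
years = 14
days_between_missions = 5
missions_per_year_rounded = 75  # figure used in the post

# Assumption: a 365-day year with no downtime at all.
missions_per_year_exact = 365 / days_between_missions

total_rounded = years * missions_per_year_rounded
total_exact = years * missions_per_year_exact

print(f"Rounded rate: {total_rounded} missions")
print(f"Exact rate:   {total_exact:.0f} missions")
```

Either way, the total only clears 1,000 if the pace is sustained without any break for the full 14 years, which is the point the post is making.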
2
15
275
22 Nov 2023
mathcheck (수학책, "math book") englishcheck (영어책, "English book")
4
345