Filter
Exclude
Time range
-
Near
5,907.27 tok/s on one Tesla T4. The winner of Bear the Tokens: Ratan Kokal. An aerospace undergrad at IIT Bombay. Baseline was 3,332. Faster inference on the same GPU means more requests per dollar. He got past it with serving-engine changes and one observation most people will miss. Dhruv used an inferencing plugin. Aaditya applied AWQ quantization. #h2loopai #LLMInference #GPUOptimization #Quantization
1
3
23
Replying to @delveroin
Your GPUs are burning money. 🔥 Aluminati AI slashes GPU energy costs so you scale AI without the brutal power bills. Try it FREE for 30 days → aluminatiai.com #AI #GPUOptimization #MLOps #AICosts
1
2
71
Inefficient GPU utilization doesn’t just slow AI - it disrupts your MLOps efficiency.With GPU-aware orchestration, boost utilization by up to 45%, cut costs, and scale smarter.Optimize your MLOps with TechBrein. techbrein.com/services/manag… #GPUOptimization
1
2
2
24
Most production AI deployments waste millions because of cache silos. Our P2P architecture (developed with @TencentGlobal) eliminates this: instances now share cache across peers using RDMA transfers. Results: 4x faster TTFT, 5x faster completion, massive reduction in redundant compute. Read the technical breakdown: Tensormesh Blog: tensormesh.ai/blog-posts/lmc… LMCache Blog: blog.lmcache.ai/en/2026/01/2… Beta $100 credits: app.tensormesh.ai/login #AIInfrastructure #MachineLearning #LLMOps #GPUOptimization #OpenSource #ArtificialIntelligence #CloudComputing
2
61
We're excited to see our State of AI Infrastructure report featured on @hackernoon! This coverage highlights what our research revealed—while 70% of organizations rank cost control as their top infrastructure planning priority, and 35% prioritize increasing GPU utilization, execution continues to lag. Almost half still manually assign workloads or have no utilization strategy at all. The gap between intent and execution is costing enterprises significantly. Organizations that close this gap through automated workload orchestration and intelligent resource management will gain a decisive competitive advantage as AI scales. The full report dives deeper into AI agent deployment readiness, security challenges, and more. Read the full HackerNoon coverage here: hackernoon.com/nearly-half-o… #AIInfrastructure #GPUOptimization #EnterpriseAI #MLOps
1
2
3
293
Working on GPU projects, AI development, or high-performance computing? 🚀 We all know how time-consuming and tricky it can be to optimize CUDA or GPU code maximum performance. That’s where @rightnowai_co comes AI-powered tool that help generate, optimize, and benchmark your code effortlessly. With @rightnowai_co get: ✅ Real-time profiling of your code ✅ Architecture-aware optimization for maximum GPU efficiency ✅ Cloud support so you can test even without a powerful local GPU Whether you’re a developer, AI engineer, or researcher, RightNow AI makes your coding workflow faster, smarter, and stress-free. Watch this video to see it in action, and take your GPU projects to the next level. Thanks For @rightnowai_co CEO @Akashi203 #RightNowAI #AItools #GPUOptimization #DeveloperTools #AIpowered
6
5
7
118
Generate ZK proofs scale narratives with @fermah_xyz universal layer optimizing GPU FPGAs cheap fast verifications ZKsync Scroll decentralized markets moon math dapps protocols enhanced scalability. This story educates on modular proving dev accessibility and proving substrate keying keywords like ZK marketplace universal layer and proof generation #FermahXYZ #ZKProving #DecentralizedProofs #GPUOptimization #ScalableZK
1
2
3
10
📚 Week 5 from CS249r: Can LLMs optimize GPU performance? Unlike CPU optimization, where AI performs well in coding competitions, GPUs' multi-layered complexity creates opportunities where systematic AI exploration surpasses humans. Some nice reads from KernelBench and Kevin's multi-turn RL show promising results 📈 harvard-edge.github.io/cs249… #CS249r #EdgeAI #GPUOptimization
1
7
371
12 Aug 2025
#Ai4 2025 is here — and we’re all set at Booth #147 in Las Vegas! 🚀 Come see Fixstars AIBooster in action with our live demo and learn how you can accelerate AI workloads while cutting down GPU infrastructure costs. fixstars.com/en/ai/booster?u… Whether you’re optimizing deep learning or scaling AI systems, let’s talk performance engineering. See you soon! #Fixstars #AIBooster #GPUOptimization #AIInfrastructure #AIConference #Ai4
1
2
106
Fujitsu Research of America is thrilled to participate in the 2025 Japan–U.S. Innovation Awards Symposium, hosted by the Japan Society of Northern California @usjinnovate , on July 17 at Stanford University! We invite you to visit Fujitsu booth to explore how our AI Computing Broker (ACB) is optimizing GPU allocation and manages the memory oversubscription. Your GPUs aren't lazy, your scheduler is! At the event, you will have the chance to see live demonstrations showcasing ACB's performance and learn how ACB makes GPUs work harder without cost more. Don’t miss this unique opportunity to connect, learn, and innovate alongside industry leaders. 📅 Date: July 17 | Time: 10:30 AM – 5:00 PM 📍 Frances C. Arrillaga Alumni Center, Stanford University 🔗Japan-U.S. Innovation Awards: lnkd.in/geyrFhz #AI #GPUOptimization #AIComputing
1
2
3
264
Really impressed by how Spheron can support running and deploying tasks on @recallnet so effectively while keeping costs affordable, just as you mentioned. I’m really interested in optimizing GPU resources, especially finding efficient ways to avoid spiking operational costs, and your post has opened up a fresh perspective on this. Definitely a valuable experience, not just for me but for many others looking to boost performance on platforms like @recallnet. Leveraging tech for high efficiency while saving costs is something I believe many are aiming for. Awesome to learn about this creative approach, and I hope to see more similar insights in the future! 🌟 @cookiedotfun #TechInnovation #GPUOptimization #RecallPlatform #EfficiencyMatters 🚀
30 Jun 2025
We're excited to welcome @spheron to the Recall ecosystem as a decentralized compute partner! Spheron unlocks affordable, high-performance GPU infrastructure by tapping into idle hardware worldwide. Builders can now access scalable compute to run and deploy agents that compete on Recall without breaking the bank.
28
26
1,326
By eliminating the need for artificial loads, your #GPUs generate less heat, allowing them to run more efficiently and improving energy efficiency by up to 45%. This helps reduce total OPEX in your AI data center. Grab the free guide to start saving! ⚡⚡⚡ skeletontech.com/maximize-gp… #AI #DataCenter #EnergyEfficiency #EnergySecurity #EnergyStorage #OPEX #ArtificialLoads #PeakShaving #Grid #AIInfrastructure #GPUOptimization #DatacenterEfficiency #SustainableAI #GreenDatacenters #TechInfrastructure
1
2
173
15 Apr 2025
- Optimized Resource Utilization Through Hyperbolic - 🧵💜🪻👇 #gHyperbolic fam What’s the deal with Hyperbolic Labs resource optimization? Decentralized Orchestration Layer aggregates idle GPU resources globally from data centers mining farms personal rigs and on prem setups. Auto scaling self healing clusters dynamically adjust to inference workloads ensuring near 100% utilization rates. No wasted cycles no idle compute. GPU Restaking at its core ✳ Stake GPUs to #Hyperbolic Network then #restake to AI services. Dynamic allocation optimizes compute distribution across inference tasks reducing underutilization by 60% compared to centralized providers like #AWS. #Fractionalized GPU Ownership lets users trade GPU shares on chain matching supply demand with precision. PoSP and spML for trustless efficiency ✳ Proof of Sampling ensures honest computation in decentralized setups with minimal overhead. #spML leverages random sampling for verification cutting computational waste vs #zkML heavy proof generation. Secure trustless compute means no resource drain on redundant checks. Impact ✳ ▫️ Inference Cost Reduction: Up to 80% savings by tapping idle global compute. ▫️ Scalability for LLMs: Auto scaling clusters handle dynamic workloads for large language models without over provisioning. ▫️ On Chain Transparency: Fractional ownership on blockchain ensures equitable resource sharing and revenue distribution. Global contributor dependency might bottleneck GPU supply during peak inference demand. But Hyperbolic Labs Hyper dOS mitigates this with predictive scaling algorithms cutting latency spikes by 40%. Hyperbolic Labs isn’t just optimizing resources it’s redefining decentralized AI compute efficiency. Auto scaling GPU restaking fractional ownership and PoSP verification create a trustless scalable ecosystem for AI inference. Efficiency meets affordability meets security. #gHyperbolic #HyperBolic #GPUs #AI #HyperbolicLabs #DecentralizedAI #GPUOptimization
15 Apr 2025
securing decentralized AI with Proof of Sampling and spML on @Hyperbolic_Labs 🧵👇 #gHyperbolic fam What is PoSP ✳️ Proof of Sampling leverages sampling techniques and game theory for trustless verification ✳️ asserters compute AI inference and submit cryptographic commitments to the Orchestrator ✳️ random sampling ensures honest behavior with a Nash Equilibrium at challenge probability above 0.735 ✳️ no heavy overhead like zkML . spML Breakdown ✳️ spML builds on PoSP for scalable AI inference ✳️ combines opML scalability with zkML security ✳️ validators challenge outcomes via network voting ✳️ cryptographic commitments ensure no tampering ✳️ slashes verification costs by up to 80 percent compared to AWS . #Hyperbolic #PoSP #spML 🧵👇
1
2
530
25 Mar 2025
Kernels Together Strong 🦧 Our last blog post showed how combining multiple kernel libraries can deliver state-of-the-art performance for AI models. We achieved up to 60% speedup on the FLUX.1-schnell model using #AMD MI300X GPUs! #AI #GPUOptimization mako-dev.com/blog/kernels-to…
1
2
94
Set up your AI Lab in no time with Skylus_ai. No complexity, no wasted GPU resources – hassle free deployment, optimized utilization, and cost efficiency. Book your demo now. tyronesystems.com/skylus.ai #AILabs #GPUOptimization #ComposableAI #Netweb #Tyrone
5
427
Skylus ai is here! Quickly set up your AI labs and turn ideas into reality with composable GPUs. Schedule your free demo. tyronesystems.com/skylus.ai/ . . . . . #AILabs #Skylus_ai #AIResearch #GPUUtilization #GPUOptimization
2
1
6
386
💡 Did you know? Overclocking your memory is crucial for Cryptonight-GPU mining. Boost your $RYO hashrates by focusing on memory-intensive performance. @RyoCurrencyO #CryptoMining #GPUOptimization
2
3
44
Shed your worry and reimagine Your AI and GPU infrastructure without worrying about time, cost and complexity. . . . . #AI #GPU #AILab #AI #GPU #AILab #GPUManagement #AIInfrastructure #ComposableGPUs #AIInnovation #GPUOptimization #GPUUtilization
1
4
355