303Lovers


Jun 10

The original sounds aren't always clean; even M1 presets often have effects. Some sounds might be exactly as they are on the main keyboard, already processed. #AudioProcessing #SoundDesign

0:24

KingLand ∞ Annuaire IA & Mag

@KingLandfr

Jun 10

The future of AI voice 🎙️ Why do your AI agents still sound like robots stuck in a tunnel when clarity has become the minimum standard for customer experience? #KingLand #IA #TechInnovation #VoiceTech #CustomerExperience #GenerativeAI #DevCommunity #CTO #MachineLearning #AudioProcessing #Krisp #DigitalTransformation #Innovation #CloudComputing #FutureOfWork
▫️ Impact sheet: kingland.fr/article/krisp-vi…
Because with state-of-the-art noise suppression technology, a natural human voice is worth far more than a polluted audio stream. The shift to autonomous voice agents is no longer a question of feasibility but of fidelity. The rollout of @krispHQ VIVA 2.0 marks a decisive step for technical teams looking to remove acoustic friction without weighing down their infrastructure. If AI is to replace humans in dialogue, it must first master the art of listening and of flawless reproduction.
🔹 Active, real-time background noise suppression for crystal-clear audio.
🔹 Drastic reduction in latency during voice signal processing.
🔹 Smooth API integration for developers' existing infrastructure.
🔹 Significant improvement in comprehension by transcription models.
✨ Explore: partner.krisp.ai/pestel-char…
"Technology should never make itself heard. If your user forgets they are talking to a machine because the sound is flawless, then you have accomplished your technical mission, because trust is born in the silence of interference." - C. Pestel
I remember the first voice agent tests, where the slightest keyboard noise wiped out all the relevance of the LLM. Seeing VIVA 2.0 handle these problems with such precision today reminds me that we are not selling lines of code but fluidity in human relationships.
👑 Tool sheet: kingland.fr/tool/krisp-suppr… How do you currently handle audio quality in your conversational AI pipelines?

Dion Posdijk @ dionposdijk.com @ soundfx.nl @dionposdijk

May 17

Orban 5950HD "A most unconventional Pi cluster" youtu.be/GYN7TH5y1CM?si=LPba… via @geerlingguy #watermarking #audioprocessing

A most unconventional Pi cluster

A Pi cluster for broadcast audio processing and remote control.Th...

youtube.com

Elif

@ayyyelif

May 17

I just published a technical article on building an audio deepfake detector using ASVspoof 2019 and PyTorch. The project includes a complete baseline pipeline:
• ASVspoof dataset validation
• Balanced subset creation
• MFCC-based feature extraction
• Random Forest baseline
• PyTorch MLP baseline
• Model comparison
• Single audio prediction
The Random Forest model achieved 91.00% accuracy, and the PyTorch MLP achieved 98.75% accuracy on a balanced 2,000-sample subset. I tried to keep the project honest and reproducible. This is not a full ASVspoof benchmark result, but a baseline experiment to understand the audio deepfake detection workflow from raw audio to prediction.
Next step: evaluating on the official dev/eval sets and exploring adversarial audio attacks.
Medium article: medium.com/@elifabanoz/build…
GitHub repository: github.com/elifabanoz/asvspo…
#MachineLearning #PyTorch #Cybersecurity #AISecurity #DeepfakeDetection #AudioProcessing

Building an Audio Deepfake Detector with ASVspoof and PyTorch

I have been trying to build projects that connect machine learning with real security problems. Audio deepfakes felt like a good place to…

medium.com
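
The "balanced subset creation" step named in the tweet above could look roughly like the sketch below. It is not taken from the linked repository: the function name `balanced_subset`, the `(file_id, label)` input layout, the fixed seed, and the toy file IDs are all assumptions made for illustration. It only shows the general idea of drawing the same number of items per class so that accuracy is not dominated by the majority class.

```python
import random
from collections import defaultdict


def balanced_subset(items, per_class, seed=0):
    """Pick `per_class` random items for every label (hypothetical helper).

    items: list of (file_id, label) pairs, e.g. ("utt_0001", "spoof").
    Raises ValueError if any label has fewer than `per_class` items.
    """
    # Group file IDs by their label.
    by_label = defaultdict(list)
    for file_id, label in items:
        by_label[label].append(file_id)

    # A seeded generator keeps the subset reproducible across runs.
    rng = random.Random(seed)
    subset = []
    for label in sorted(by_label):
        files = by_label[label]
        if len(files) < per_class:
            raise ValueError(f"only {len(files)} items for label {label!r}")
        for file_id in rng.sample(files, per_class):
            subset.append((file_id, label))

    # Shuffle so the classes are interleaved rather than grouped.
    rng.shuffle(subset)
    return subset


# Toy data (made up): 10 "bonafide" and 90 "spoof" entries.
toy_items = [
    (f"utt_{i:04d}", "bonafide" if i % 10 == 0 else "spoof")
    for i in range(100)
]
toy_subset = balanced_subset(toy_items, per_class=10)  # 20 items, 10 per label
```

With two labels, a 2,000-item subset like the one mentioned in the tweet would correspond to `per_class=1000`, assuming an even split between the two classes.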

719

My Weird Prompts @myweirdprompts

May 1

New episode from @myweirdprompts: What Your Browser Does to Mic Audio Before It Reaches Your Server myweirdprompts.com/episode/b… #MyWeirdPrompts #podcast #audioprocessing #speechrecognition

milton

@miltonappl3

Apr 17

Holy moly. The supreme ability of written words to convey symbols and abstractions, due to the fact that they enter the mind via the visual cortex, is technically a strike against the idea that Homer was primarily auditory for an extended period of time before being written down. Auditory information is far less symbolic and more concrete, and it doesn’t necessitate the use of the visual cortex because the visual cortex is used for contextualizing the speaker’s various nonverbal milieu. Well, this is a bunch of circumstantial information, isn’t it? Yeah, but we’re dealing with the fact that the Iliad and Odyssey are some of the most symbolically rich texts in the western canon, and at the rat-tails of such a feat, you’d need the bandwidth the visual cortex provides. And we must recall that the Odyssey was edited and turned into text in the 2nd century BC. To claim that Homer’s hearers could think symbolically when the information passed into their Broca’s area via their ears would necessitate as much neuroplasticity and practice as it takes for us to turn non-literate people literate in the highest regard. This is a lot, a lot, a very lot of verbal storytelling. There are also bandwidth concerns that we have with Shakespeare: every page has information in it that an audioprocessing-only audience member would have no hope of decoding. I’m recalling the 15 hidden references to the number 2 in Romeo and Juliet’s opening sonnet, for just one of countless examples. We can compare that to Achilles’ shield, the symbolic analysis of which Nietzsche expected would take ten minutes for his student to verbally decode, and even that is a fairly small amount of time. Ten to twenty pages handwritten seems more adequate.

1,296

岚叔

@LufzzLiz

Jan 24

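The projects on GitHub Trending today are all high quality, and I think each of these repos has its uses. Recommended! PS: I've just iterated on news-aggregator-skill so that the AI can offer more insights tailored to each scenario; more new features are expected next week.
🔝 Top Projects
1. remotion-dev/remotion
Heat: 🔥 29,118 stars | Lang: TypeScript
Summary: 🎥 Make videos programmatically with React.
Deep Dive:
Core Value: Turns video production into code ("Video as Code"). It renders video with the web stack (React/CSS), so videos can be version-controlled, parameterized, and built automatically, just like web pages.
Inspiration: The industrialization of content production. Imagine generating 1,000 personalized year-in-review videos for 1,000 users: traditional editing cannot do that, but Remotion can simply loop them out.
Scenarios: #VideoAutomation #React #ContentScale
2. VectifyAI/PageIndex
Heat: 🔥 7,920 stars | Lang: Python
Summary: 📑 Document Index for Vectorless, Reasoning-based RAG. A reasoning-oriented RAG index that needs no vector database.
Deep Dive:
Core Value: Challenges the "vector hegemony" of RAG. By building a structured index suited to LLM reasoning (rather than plain embedding similarity), it tries to solve the accuracy problem in question answering over complex documents.
Inspiration: RAG is entering a "structured understanding" phase. Pure semantic-similarity (vector) retrieval easily loses logical context, and graph-based or hierarchical indexes may be the next-generation direction.
Scenarios: #RAG #Vectorless #LLMReasoning
3. OpenBMB/UltraRAG
Heat: 🔥 3,307 stars | Lang: Python
Summary: UltraRAG v3: A Low-Code MCP Framework for Building RAG Pipelines.
Deep Dive:
Core Value: This one, too, points to MCP (Model Context Protocol). It provides a standardized set of interfaces for building complex RAG pipelines, lowering the barrier to connecting different data sources and models.
Inspiration: MCP is becoming the "USB port" of the AI application layer. The more widespread standardization becomes, the lower the cost of collaboration between agents.
Scenarios: #MCP #LowCode #RAGPipeline
4. browser-use/browser-use
Heat: 🔥 76,608 stars | Lang: Python
Summary: 🌐 Make websites accessible for AI agents. Lets AI agents operate a browser the way a person does.
Deep Dive:
Core Value: An extremely popular agent tool library. It breaks through the limits of APIs, letting LLMs complete tasks such as booking tickets, scraping, and filling in forms directly through the browser. It is a key link in building a "general computer-use agent".
Inspiration: The browser is the OS. For an agent, no dedicated API needs to be built; as long as it can read the web page (Vision/DOM), it gains unlimited capabilities.
Scenarios: #Agent #WebAutomation #HeadlessBrowser
5. block/goose
Heat: 🔥 27,892 stars | Lang: Rust/Python
Summary: An open source AI agent that goes beyond code suggestions. An AI developer agent that can install, execute, edit, and test code.
Deep Dive:
Core Value: From Copilot to Autopilot. Goose does more than complete code: it has terminal execution permissions and can run tests and fix bugs, making it a true "AI pair-programming partner".
Inspiration: Every IDE will eventually be rebuilt. The programming environment of the future will no longer be an editor but a "mission control console" for human-machine collaboration.
Scenarios: #AIEngineer #Automation #DevTools
6. Blaizzy/mlx-audio
Heat: 🔥 3,417 stars | Lang: Python
Summary: TTS/STT library built on Apple's MLX framework. An audio AI library optimized for Apple M-series chips.
Deep Dive:
Core Value: Squeezes every bit of local compute out of Apple Silicon. It runs speech models such as Whisper smoothly on a Mac, locally, with very low power consumption.
Inspiration: Local (on-device) AI is on the rise. As privacy demands and hardware capabilities grow, more and more inference tasks will flow back from the cloud to local devices.
Scenarios: #AppleSilicon #OnDeviceAI #AudioProcessing
7. simstudioai/sim
Heat: 🔥 26,095 stars | Lang: TypeScript
Summary: Open-source platform to build and deploy AI agent workflows. An agent workflow orchestration platform similar to LangFlow.
Deep Dive:
Core Value: Visual agent orchestration. It lets non-technical people assemble complex agent logic by dragging and dropping nodes.
Inspiration: The rise of AgentOps. As agent logic grows more complex, managing it purely in code becomes hard to maintain, and the visual "Blueprints" model will become mainstream.
Scenarios: #AgentOps #Workflow #NoCode
8. microsoft/VibeVoice
Heat: 🔥 21,494 stars | Lang: Python
Summary: Open-Source Frontier Voice AI. Microsoft's open-source frontier voice AI model.
Deep Dive:
Core Value: A big company open-sourcing a SOTA-level voice model. That usually means a breakthrough in emotional expression, multilingual switching, or low latency.
Inspiration: The last piece of the multimodal interaction puzzle. A perfect AI assistant must not only understand text but also have a voice that is "magnetic and sounds like a real person".
Scenarios: #TTS #VoiceAI #Microsoft
9. putyy/res-downloader
Heat: 🔥 14,284 stars | Lang: Go
Summary: A downloader for resources from across the web, including WeChat Channels, Douyin, and Kuaishou.
Deep Dive:
Core Value: A simple, blunt, practical tool. It breaks through the platforms' "walled gardens" and helps users save content locally.
Inspiration: "The data doesn't even belong to the creators themselves." With platforms closing themselves off one after another, local archiving tools are proving remarkably resilient.
Scenarios: #Tool #Crawler #MediaArchiving
10. AI4Finance-Foundation/FinRobot
Heat: 🔥 5,091 stars | Lang: Python
Summary: An Open-Source AI Agent Platform for Financial Analysis.
Deep Dive:
Core Value: Agents landing in a vertical domain. General-purpose LLMs do not understand the traps in financial statements; the specialized FinRobot combines a financial knowledge base with a toolchain, aiming to be a professional "AI analyst".
Inspiration: Vertical Agents > General Agents. In high-barrier fields such as medicine, law, and finance, a general model scores only 60, while a vertical agent can score 90.
Scenarios: #FinTech #Agent #Quant
Created by News-Aggregator Skill (Source: GitHub Trending)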

7,925

Doist Developers @doistdevs

Jan 2

Ever wondered how to render a smooth, audio-reactive waveform using Canvas? We're peeling back the layers of our latest project in building Ramble. Perfect for #AudioProcessing enthusiasts and #WebDev pros alike! doist.dev/building-ramble-3-…

Building Ramble #3: Visualizing the Waveform

How we render a smooth, real-time audio waveform with Canvas in the browser

doist.dev

392

HackerNoon | Learn Any Technology

@hackernoon

14 Dec 2025

Bitwave is a new open-source audio format built with Rust & Python. It embeds spatial data & BPM for immersive, adaptive experiences in VR and gaming. - hackernoon.com/its-time-to-r… #audioprocessing #rustlang

It’s Time to Reinvent the Audio File: Introducing Bitwave | HackerNoon

hackernoon.com

400

Yash

@YashRaj061504

6 Dec 2025

🚀 Built an Audio Filtering Pipeline for the IndicVoices Dataset (AI4Bharat) Dataset: huggingface.co/datasets/ai4b… Code: github.com/YashRaj1506/Indic… Audio clips are evaluated on: • SNR • Silence Ratio • Clipping • VAD Ratio • ASR checks (Whisper) #AI #AudioProcessing

ai4bharat/IndicVoices · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co
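
Two of the checks listed in the tweet above, silence ratio and clipping, can be illustrated with a short sketch. This is not the code from the linked repository: the function names, the 400-sample frame length, the -40 dBFS silence threshold, the 0.999 clipping limit, and the keep/drop cut-offs are all assumptions chosen for illustration, and samples are assumed to be floats scaled to the range -1.0 to 1.0.

```python
import math


def clipping_ratio(samples, limit=0.999):
    """Fraction of samples at or above `limit` in magnitude (full scale = 1.0)."""
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)


def silence_ratio(samples, frame_len=400, threshold_db=-40.0):
    """Fraction of full frames whose RMS level falls below `threshold_db` (dBFS)."""
    # Split into non-overlapping frames, dropping any incomplete tail frame.
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        return 0.0
    silent = 0
    for frame in frames:
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # A frame of exact zeros has no finite dB level; treat it as silent.
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        if level_db < threshold_db:
            silent += 1
    return silent / len(frames)


def keep_clip(samples, max_clipping=0.01, max_silence=0.5):
    """Hypothetical filter rule: keep clips that are rarely clipped and mostly non-silent."""
    return (
        clipping_ratio(samples) <= max_clipping
        and silence_ratio(samples) <= max_silence
    )
```

A real pipeline would tune these thresholds on the dataset and would add the other listed checks (SNR, VAD ratio, ASR), which need noise estimation or a model and are left out here.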

359

LALAL.AI

@ai_lalal

27 Nov 2025

🚀 Black Friday is live! Celebrate Andromeda with 50% off selected plans 🎶 Clean stems, faster processing, all your audio projects leveled up. Visit LALAL.AI from Nov 21–Dec 5, 2025, and hit Grab It in the upload widget to claim your Black Friday deal. ⚡ #LALALAI #Andromeda #BlackFriday #AudioProcessing

362

Abhishek Bahukhandi @AbhishekBahukh8

24 Nov 2025

Starting a 30-day challenge to share what I'm building at taqari and the tech behind it. Mostly audio processing, WebRTC, WebSockets, AI, and realtime APIs. #30dayschallenge #buildinpublic

162

Audiomodern

@Audiomodern

6 Nov 2025

Our creative sequencing tool just got a whole new level of power. Gatelab 2 comes with new features, smoother workflow, deeper control, same playful, experimental spirit. Your rhythms, patterns, and textures are about to get way more interesting. 🎛️ And because timing is everything... 🖤 Black November Bonus: For a limited time, get Gatelab 2 FREE with any purchase, or pick it up solo for just $19 (regularly $29). Check it out 👉 audiomodern.com/shop/plugins… #Audiomodern #musicproduction #producers #producerlife #homestudio #studiolife #sounddesign #beatmakerlife #beatmaking #AUv3 #vstplugins #audioplugins #dawlife #productiontools #audioprocessing #soundtools

1:54

962

LattifAI @LattifAI_HQ

19 Oct 2025

🎯 Introducing **LattifAI** - Advanced audio-to-subtitle synchronization powered by cutting-edge AI 🚀 ✨ Features: • Precise forced alignment for any audio format • Smart sentence splitting based on punctuation semantics • Support for SRT, VTT, ASS, and TXT formats • Optimized for CPU, GPU (CUDA), and Apple Silicon (MPS) 🔗 Get started: 🌐 lattifai.com 💻 github.com/lattifai/lattifai… 🤗 huggingface.co/Lattifai/Latt… #AI #AudioProcessing #Subtitles #ForcedAlignment

LattifAI - LattifAI: The Next-Gen AI Media Processing Agent

LattifAI: The Next-Gen AI Media Processing Agent. Make every second of audio & video searchable, editable, and structured.

lattifai.com

9,213

audiodevcon @audiodevcon

10 Aug 2025

Discover The BEST Virtual Conference Experience at ADC25 - The Audio Developer Conference youtube.com/watch?v=OmzrAW6P… #ADC #AudioDevTalk #AudioDeveloper #AudioDeveloperConference #AudioEngineering #AudioProcessing #AudioProgrammer #AudioProgramming ...

224

SemiWiki @DanielNenni

16 Jul 2025

Sophisticated soundscapes usher in cache-coherent multicore DSP semiwiki.com/eda/cadence/358… #ActiveNoiseCancellation #AudioProcessing #CacheCoherence #MulticoreDSP

Sophisticated soundscapes usher in cache-coherent multicore DSP - Semiwiki

Digital audio processing is evolving into an art form, particularly…

semiwiki.com

490

Tindie Maker Marketplace @tindie

29 May 2025

Teensy 4.1-based Programmable Guitar Pedal #TindieBlog #EffectsPedal #Guitar #Teensy #AudioProcessing blog.tindie.com/2025/05/teen…

Teensy 4.1-based Programmable Guitar Pedal

The Teensy 4.1 is famously powerful, with a 600MHz ARM Cortex-M7, a 64-bit floating point unit, gobs of flash and RAM (which are expandable!) and certain interfaces that are ideal for audio process…

blog.tindie.com

3,741

audiodevcon @audiodevcon

11 May 2025

SRC - Sample Rate Converters in Digital Audio Processing - Theory and Practice - ADC 2024 youtube.com/watch?v=0ED32_gS… #AudioProcessing #Coding #Programming #Samplerate #audioprogrammer

SRC - Sample Rate Converters in Digital Audio Processing - Theory and...

https://audio.dev/ -- @audiodevcon---SRC - Sample Rate Converte...

youtube.com

293

audiodevcon @audiodevcon

5 May 2025

SRC - Sample Rate Converters in Digital Audio Processing - Theory and Practice - ADC 2024 youtube.com/watch?v=0ED32_gS… #AudioProcessing #Coding #Programming #Samplerate #developer

384

audiodevcon @audiodevcon

2 May 2025

We have just released a new ADC Conference Video! SRC - Sample Rate Converters in Digital Audio Processing - Theory and Practice - ADC 2024 youtube.com/watch?v=0ED32_gS… #AudioProcessing #Coding #Programming #Samplerate #audiodevcon

129