Ash DCosta

Tweets

Pinned Tweet

Ash DCosta

@softwareweaver

4 Dec 2023

Connecting UX and Workflow with AI Models #AI #huggingface #stablediffusion #mistral #llm #whisper youtu.be/wkaQZnSgBvk?feature…

Unleash Your Creativity with AI at Your Fingertips using Fusion...

Fusion Quill, an innovative AI tool revolutionizing desktop interac...

youtube.com

Ash DCosta retweeted

Guilherme Penedo @gui_penedo

Jun 13

This is a direct consequence of Anthropic's scaremongering as a PR strategy.

I assume this will be reverted soon enough, maybe after some supposed additional nerfing of the model, but still a terrible precedent, even if it might boost Anthropic's value pre-IPO once the model comes back (we're so good the government had to do this).

The most surprising part is all foreign employees essentially not being able to do any work now. I can't imagine that even Anthropic expected that one, given how much foreign talent they hire in general (and even research superstars that don't have citizenship like Karpathy).

Also a bit unclear how much of this isn't just payback for the previous Ant-US gov confrontation that gave Ant really good PR at the expense of the gov. We'll have to wait to see if OpenAI's next release faces similar restrictions.

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

Ash DCosta retweeted

jietang

@jietang

Jun 13

GLM-5.2 is Fully Open, Frontier Intelligence Belongs to Everyone

Today, the sudden restriction of certain frontier models is deeply regrettable. At a time when access to frontier models is abruptly cut off for non-technical reasons, we are even more convinced of one thing: science should be global. The path to AGI (Artificial General Intelligence) must never be enclosed by high walls.

We have always believed that AGI should be the cornerstone for all of humanity to collaboratively explore the boundaries of intelligence and solve complex challenges, rather than a privilege monopolized by a few rules and subject to revocation at any moment.

In the face of external blockades and restrictions, our attitude is one of radical openness. Frontier intelligence must remain open-source, accessible, and buildable, serving every dedicated developer.

GLM-5.2 is Zhipu's most capable open-source model to date. It not only supports a truly usable 1M context window but also maintains a continuous lead in the independent completion of long-horizon tasks, providing solid foundational support for building complex agent applications. It also continues to be our main engine for creating the strongest domestic coding model.

Tonight at 5:21, at this special moment, GLM-5.2 will officially be available to all GLM Coding Plan users (including Lite / Pro / Max). The API will also go live next week.

A step closer to frontier intelligence for everyone. The future of AI is open, and it is for the people.

ModelKey: GLM-5.2

Ash DCosta retweeted

elie

@eliebakouch

Jun 9

mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community. also the fact that this is on purpose not visible to the user is crazy

Claude

@claudeai

Jun 9

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

Ash DCosta retweeted

Mikel Artetxe

@artetxem

Jun 9

Brilliant idea! Next up: Apple randomly reboots your Mac if you're building competing tech, Gmail silently edits your email if you mention rival platforms, and Tesla Autopilot swerves if it detects you're working on self-driving cars. All in the name of safety, of course. Because malicious actors controlling the world’s operating systems, inboxes and cars would be extremely dangerous!

elie

@eliebakouch

Jun 9

mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community. also the fact that this is on purpose not visible to the user is crazy

Ash DCosta retweeted

Fei-Fei Li

@drfeifei

Jun 10

Scientific research is fundamental to advancing civilization and helping people globally to solve the most critical problems, from medicine to materials, from brain science to physics, and much beyond. This is only possible when scientists have access to the best tools of the time to conduct scientific research, including having access to AI-based tools.

Ash DCosta retweeted

SemiAnalysis

@SemiAnalysis_

Jun 9

BREAKING NEWS: Anthropic's latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won't notice.

We are already seeing Anthropic's latest model's moderation filter our GPU inference research and programming 😭

Ash DCosta retweeted

NVIDIA AI

@NVIDIAAI

Jun 4

Today we're shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.

Ash DCosta retweeted

California YIMBY @cayimby

Jun 1

San Diego built more apartments per person than any CA metro last year—and rents fell 2.2%, dropping the city from 5th to 12th priciest nationally.

Ash DCosta retweeted

Karri Saarinen

@karrisaarinen

May 31

The fallacy of this is that more creates more. More hours, more hiring, more something. And it is true in a sense. If you put in more work, more work will happen. But I think for most startups, the leverage is really in how differently you approach the problem, how well you cultivate your team, and the strategy.

Any large company can outspend you on hours. They have thousands or tens of thousands more people, spending more hours. If hours worked were the metric, every large company and government organization would always win and do the best work. More hours, better output.

This thinking is often representative of younger founders, where the startup becomes their identity and life. They have a hard time doing anything else, and cannot understand that your work is not the person that is you. But activities outside of work can grow you as a person too and make you do better work.

I’ve never worked this way. As a designer, I always saw the need to take a step back, to take a break. At times, I might work 12 hours or 16 hours, or whatever amount was needed, but it wasn’t the norm. You just can't grind design, you need inspiration. But taking that step away from the work would give me more perspective and inspiration, and I could approach the problem differently or I could just see the solution.

Grinding is never good for any creative problem, and startups or creating new products are often mostly about creative problem solving. Grinding works ok for email jobs, or where you are just executing on a very clear playbook.

With Linear, we’ve never worked this way. We work reasonable hours, 5 days a week. All of us founders have families. Many of our employees have families. I personally stop every evening, spend time with the family, cook dinner for the family, eat dinner together, and focus on things outside of work. Sometimes I work in the late evenings or weekends, but to me the pride is that I don’t need to. Company should be successful without it.

My goal is to build a company that is sustainable in the long term, and doesn’t require heroics or personal sacrifices every single day. There are times when our team is heroic. Launches, incidents, some other work that just needs to be done. They will work late into the night because they know it is the right thing. But we don’t require that every day or every week, and the more this happens, the more I think it is a failure of our company and leadership. The team and the leaders should always keep a reserve to use when something is needed.

Our thinking was also that quality, which we value, doesn’t emerge from working more or stressing people more. It emerges when you create the conditions for it to emerge. Often it is the appreciation, space, time, and how the person feels. A person who is rested will do better work.

I wouldn’t attribute much of our success to working a lot. The success came from having clear thinking, ideas, and focus to do the right things. I sometimes wish we could move the culture more toward a Zen master. Real mastery is not exerting the most effort. It is achieving the outcome with the least necessary effort.

Harry Stebbings

@HarryStebbings

May 30

"If you are not working 7 days per week, you are going to lose".

Corgi Insurance is the most intense workplace culture in startups.
- The company works 7 days per week.
- Founder (@nico_laqua) lives and sleeps in the office.
- He built a cafe in the office because there was no local cafe that was open 24/7.
- 2/3 of the first 30 team members have the Corgi logo as a tattoo.

Today I went behind the scenes with Nico, who has used this culture to scale the company to a $2.6BN valuation in just two years.

My condensed notes below:

1. If You Are Not Working 7 Days Per Week, You Are Going to Lose: Whatever you can get done in 5 days, you'll get more done in 6 and 7. If you are trying to solve the world’s hardest problems, a standard 5-day workweek will not cut it.

2. Work Trials Repel the Mediocre: Corgi forces candidates into mock work trials over the weekend. If seeing a full office on a Saturday scares them, they don't belong. True intensity acts as a natural filter to attract killers and repel clock-watchers.

3. Lead from the Front Lines: You can’t demand 7-day weeks while sitting on a yacht. Nico sleeps 3–4 hours a night on a mattress inside the office. If you want your troops to bleed, you have to be in the trenches with them.

4. Culture Only Means One Thing, Winning: Forget superficial jargon like "hackers" or "ex-founders." Strip away the corporate fluff. A great startup culture is aggressively optimized around one single word: Winning.

5. Lifespan vs. Victories: Building something world-historic requires radical sacrifice. When asked if he'd rather build a trillion-dollar company and die at 50, or fail and live to 80, the answer was easy. "I would rather measure my lifespan in victories."

6. Reject the Comfort of "Quiet Quitting": If you are operating in a hyper-growth environment and your days off happen to be Saturday and Sunday every single week, you are quiet quitting. To win, you must deliberately bypass the off-ramps of personal comfort and low volatility.

Corgi isn't for everyone, and that’s exactly the point.

Ash DCosta retweeted

Lotto

@LottoLabs

May 28

A very cool model for the GPU poor bros

Trained on an ungodly amount of tokens for an 8b a1b model

Gonna be super fast, excited to try this out

huggingface.co/LiquidAI/LFM2…

LiquidAI/LFM2.5-8B-A1B-GGUF · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Ash DCosta retweeted

Daniel Jeffries

@Dan_Jeffries1

May 26

The road to Hell is paved with closed-source citadels disguised as good intentions.

The Pope is right: AI takes on the characteristics of those who build it, finance it, and regulate it. So the question is: who gets to hold the great and wonderful power of AI?

If the answer is a handful of closed source companies, murkily censored, quietly surveilling every step of our lives, every private conversation, enshrined in law as 'safe' and 'open' when they're nothing but the surveillance economy squared, then all we've done is build a few modern East India Companies, digital oligarchies of the few, cloaked in the language of safety.

Open Source and Open Weights are how you spread the fantastic enabling power of AI to everyone, everywhere. Permissionless innovation. Everyone gets the hammer and nails to build houses and churches and factories. The more hammers, the more widely spread, the more the decentralized genius of humankind can flourish.

Everyone gets the Printing Press. The printing press singlehandedly uplifted and spread intelligence and knowledge around the world. The more we could record all kinds of knowledge, the more we spread the ability to read, the more equal and advanced society became.

Before the press, knowledge was learned by one person and passed into dust with them when they died, or passed only to a small group of students. When we only had monks in a cave copying religious texts, a closed system, it limited the spread of intelligence and limited the growth of civilization.

The printing press was the single greatest invention in the history of the world because it let anyone print anything and spread knowledge throughout the whole world. AI can do the same, but only if we build the bazaar, and never let the citadel people convince the world that they're the special people who should control who gets access to intelligence while pretending they're building the bazaar.

What the world needs now is more intelligence, more widely spread and more widely available. Open is the way. And it always has been. And the road to Hell was always built with walls, towers, spiked gates and moats so that only the few could enter.

Pope Leo XIV

@Pontifex

May 25

Replying to @Pontifex

In the abstract, technology in and of itself is not a solution to humanity’s problems, just as, in and of itself, it is not inherently evil. In practice, however, technology is never neutral, because it takes on the characteristics of those who devise it, finance it, regulate it and use it.

Ash DCosta retweeted

Fahd Mirza

@fahdmirza

May 21

🦙 llama.cpp now has a BUILT-IN model router ♠ and it completely replaces Ollama Open WebUI for model switching

🔹 One server, one config file, any model on disk
🔹 Switch models instantly without restarting anything
🔹 Zero duplicate model storage across backends
🔹 Full per-model control via a simple INI file
🔹 Native llama.cpp performance, no abstraction layer

🔥 Watch the full video below 👇 youtu.be/V2t_YRsyqeI

Llama.cpp Router Mode: Switch Models Instantly: Hands-on Local Demo

Run multiple AI models from a single llama.cpp server and switch be...

youtube.com

Ash DCosta retweeted

Yuchen Jin

@Yuchenj_UW

May 13

I’m so glad AI killed LeetCode interviews. For 10 years, tech companies made every engineer grind the same puzzles and prove they could invert a binary tree from memory. Today, the dumbest AI model can walk in and one-shot the entire interview. Thank you, AI.

Ash DCosta retweeted

Arthur Zucker

@art_zucker

May 12

This is going to be a little bit long, but I want to give hope to my fellow anxious ML engineers.

We see a lot of propaganda on how this or that AI one shotted something, about how incredibly strong the models are getting and how we don't even need to review PRs and we can just ship to production. Although this can be true for some cases, it's also far from being representative of all the challenges we have to face.

I started using claude code 4 months ago, and quickly realized how it really does change the way we work. I can experiment 10x faster, fix small issues without coding and refactor code without sweating. BUT, these tasks were "just" tedious and not hard.

The challenge in my day to day work is to take a research code and integrate it into transformers using our standards. It's challenging because code beauty is abstract and subjective just like a philosophy.

By relying too much on claude, and on how seemingly good the code it produces looks, I pushed the deepseekv4 integration without realizing that claude really did not understand the model. I gave it access to `transformers`, the original paper, the original code, the different blog posts and my past chats and skills created to add a model, a b200 node and a LOT of tokens, but it did NOT nail it. It did not understand the eager attention path, it did not understand the basics of causal attention. It was even wrong implementing the manifold constrained hyper connections.

It helped to reduce the burden of exploring implementation and debugging but it did not help reason around the model.

I am not a doomer, I think our job as Software Engineers has never been this great, I am just saying that we still have a job, and we should still be a bit careful when it looks too good to be true 😉

Ash DCosta

@softwareweaver

May 12

Like the fact that they are doing something different and saving valuable time when interacting with AI

Soumith Chintala

@soumithchintala

May 11

Thinky's secret plan:
1: Increase Human<->AI bandwidth
2: Raise ceiling of human AI intelligence
3: Help humans continue as main-characters in the new world

We are at Step 1. Interaction Models are great real-time collaborative tools for humans. Here's a preview:

Ash DCosta retweeted

Soumith Chintala

@soumithchintala

May 11

Thinking Machines

@thinkymachines

May 11

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/int…

Ash DCosta

@softwareweaver

May 12

Looks interesting. Good to see research into non-turn-based models. It gets tiring to wait for the model to finish its turn before you can prompt it or steer it with more information.

Mira Murati

@miramurati

May 11

Today we're sharing our work on interaction models. A new class of model trained from scratch to handle real-time interaction natively, instead of gluing it onto a turn-based one. youtu.be/A12AVongNN4

Ash DCosta retweeted

The Pragmatic Engineer

@Pragmatic_Eng

May 5

Pi was built when there were already agent harnesses around. Here’s why Mario Zechner (@badlogicgames) found them suboptimal and built Pi, a minimalist self-modifying agent:

#1 - Mario initially was a believer in Claude Code:

"I was a believer in Claude code because they were the first that packaged agentic search up in a really compelling package. And at the time that fit my workflow really well. Everything around the LLM was kind of nice and tidy and easy to understand. I was super happy. I was proselytising Claude code."

#2 - Reverse engineering Claude Code highlighted the degradation that Mario felt as a user:

"I personally like simple tools that are stable and that I can rely on. Even if they have non-deterministic parts, all the deterministic parts should be as stable as possible. That was just not the experience with Claude Code around summer 2025. They would take away your control of the context. They would inject stuff behind your back, which is bad. Then, your workflows stopped working because there's now a system reminder that you don't even see in the UI that would modify the behaviour of the model. They would also do this to the system prompt. I built a little service where I can track the progression or evolution of the system prompt and tool definitions and, with every release, it was messing with stuff. That just messed with my workflows and I don't appreciate that."

#3 - Pi was built with an appreciation for simple and reliable tools:

"If I commit to a development tool, I want it to be a stable, reliable thing like a hammer. I don't want my hammer to break in a different spot every day. That's terrible. We need somebody who goes the full velocity kind of way. But I don't want to work with a tool like that."

Ash DCosta retweeted

Daniel Jeffries

@Dan_Jeffries1

May 1

Jensen is one of the smartest and most far-seeing folks in the world.

"If an AI scientist warns people that AI is going to permeate across radiology and radiologists are going to get wiped out, it might seem helpful but it's hurtful. If we convince everybody not to be radiologists and we now need radiologists, that actually is hurtful to society.

"It is hurtful to convince all the young college graduates not to study software engineering because we are going to need more software engineers than ever. That's hurtful."

"Scaring people with nonsensical things, which are not going to happen, that this is an existential threat, there's a 20% chance that it is existential, that's ridiculous.

"That it's going to wipe out 50% of college level jobs.

"That it is going to completely destroy democracy.

"These kinds of comments are not helpful. They are made by...CEOs. And you become a CEO, maybe you adopt a God complex and somehow you know everything."

Brutal. And right.

Ash DCosta

@softwareweaver

Apr 30

This is a great thing

Derek Thompson

@DKThomp

Apr 30

New newsletter: MODERN FATHERHOOD WOULD BE UNRECOGNIZABLE TO A 1950'S DAD

Compared to their Boomer parents, childcare time among Millennial dads has more than doubled. Compared to their Silent Generation grandparents, it’s nearly quadrupled.

You will be hard-pressed to find any part of day-to-day modern life that has changed more in the last half-century than the way today’s parents (and fathers, in particular) spend their time. The new American dad is more present and more exhausted, but also more satisfied with life.

What's behind this half-century transformation? Today's piece combines history, economic analysis, and gorgeous charts galore from @AzizSunderji