building better and fairer interviews at hirevoice.com

Joined September 2020
132 Photos and videos
Juan retweeted
The future of all work is this. You must define:
- a goal
- the criteria that define it
- the verifier that makes sure it is achieved
- the sensors that inform the verifier
- the actuators that affect the sensors
- the envelope that contains the sensors and actuators
The codex "goal" feature is a really good way to spend dozens of hours optimizing some total bullshit btw. If your final criteria are at all vague, it will specification-game and produce masturbatory "evidence" and "verifiers" and "gates" and "smoke tests". Must be hell internally.
40
42
807
74,934
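The goal / criteria / verifier / sensors / actuators / envelope framing in the post above can be sketched as a small control loop. This is only an illustration under stated assumptions: `Goal`, `verify`, `run`, `demo`, and the `latency_ms` metric are hypothetical names, not part of Codex or any other tool. An empty criteria set is rejected up front, since a vague goal would otherwise "pass" vacuously.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Reading = Dict[str, float]  # sensor name -> measured value


@dataclass
class Goal:
    description: str
    # One concrete, checkable criterion per sensor name (hypothetical structure).
    criteria: Dict[str, Callable[[float], bool]]


def verify(goal: Goal, reading: Reading) -> Dict[str, bool]:
    """Verifier: check every criterion against the current sensor reading."""
    return {
        name: bool(name in reading and check(reading[name]))
        for name, check in goal.criteria.items()
    }


def run(
    goal: Goal,
    sense: Callable[[], Reading],
    act: Callable[[Dict[str, bool]], None],
    max_steps: int,
) -> Tuple[bool, int]:
    """Envelope: bounds the sense -> verify -> act loop to max_steps iterations."""
    if not goal.criteria:
        # With no criteria, all() would be vacuously true, so refuse to run.
        raise ValueError("goal has no concrete criteria to verify")
    for step in range(max_steps):
        results = verify(goal, sense())
        if all(results.values()):
            return True, step
        act(results)
    return False, max_steps


def demo(max_steps: int = 10) -> Tuple[bool, int]:
    """Toy system: one made-up metric that the actuator lowers by 50 per step."""
    state = {"latency_ms": 300.0}
    goal = Goal("keep latency low", {"latency_ms": lambda v: v <= 200.0})

    def sense() -> Reading:
        return dict(state)

    def act(results: Dict[str, bool]) -> None:
        if not results["latency_ms"]:
            state["latency_ms"] -= 50.0

    return run(goal, sense, act, max_steps)
```

In the toy run, the loop stops as soon as every criterion holds, and the step budget is the only thing keeping a goal that never verifies from running forever.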
updated claude-code to use fable. this is the buggiest i have seen claude-code. is this fable at work?
31
Juan retweeted
At Hirevoice we're looking for >> Founding Engineer (60k-100k) - Remote or Barcelona [agentic coder] ***pls🙏 repost/max reach, THANKS!!*** Lots of info in... 1/2
7
39
68
36,171
lol the claude extension is the worst. ask: "please format this document" ai: "Oh perfect, i'll write an app script for that, search the strings, and format with functions :D"
20
Lol for a while i was looking at this and thinking: ugh, when will the open-source models ever get to SOTA? Then i remembered that opus 4.7 is like 3 months old and just last week was outdated. FUCKING UNBELIEVABLE, GREAT WORK
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscrib…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days
35
So much better to work with voice AI by having my own workbench than to deal with provider BS. Frameworks like livekit are the ones doing most of the work.
1
33
Been using this internally for a couple of weeks; this is the best solution so far
TesterArmy is the simplest way to QA your website or mobile app. It runs real tests across browsers and devices, catches regressions on every PR, generates tests from natural language, and much more. Try now and start testing in minutes: @TesterArmy
2
28
Juan retweeted
TesterArmy is the simplest way to QA your website or mobile app. It runs real tests across browsers and devices, catches regressions on every PR, generates tests from natural language, and much more. Try now and start testing in minutes: @TesterArmy
85
81
423
106,432
RT @SIGKITTEN: oh no what happened to all the principles and morals from last month
904
Juan retweeted
May 20
Replying to @github
holy shit, how did the attackers find a large enough uptime window to get in?
177
697
14,495
957,421
looking at gofeatureflag, i'm becoming go pilled. so cool to have binaries run cheaply and neatly alongside other services. given how performant go is, i'm not afraid of running stuff as sidecars
1
1
30
nobody is prepared for how much opus-4.7 costs to produce code, shit's crazy
1
66
Juan retweeted
May 13
b2b event planning app · components
3
2
32
1,086
Juan retweeted
May 10
FireTail landing exploration
2
1
17
566
Juan retweeted
Today is a hard day. I shared this note with the @linear team today: We’ve made the difficult decision to increase our workforce. This is not a cost-cutting exercise or a reflection of anyone’s performance. We’re simply reimagining every role for the agentic AI era. We’re hiring. We’re sorry about that.
449
639
13,980
986,645
Juan retweeted
The day after tomorrow, Thursday, in BCN (May 7) from 10 to 12: I'M PUTTING ON AN EARPIECE: >>> MUST: Come only if you're a recruiter or a co-founder who is recruiting (otherwise you'll be throwing your time away) 1/3 >>
2
10
12
3,603
been doing a lot of work with just text, using Wispr Flow and chatting with the AI. 400k tokens is barbarically big
41
Juan retweeted
Apr 20
Extraordinary scenes on my TL.
Guys, there's more than one way to optimize AI on a task. If you're working on harnesses, try to slowly add all these to your bag.
The classic way is to update the weights (RL)…
The modern way is to optimize prompts/context (DSPy optimizers/GEPA)…
and the hypermodern way is to self-evolve the codebase itself (auto research/alpha-evolve/darwin-godel variants)
All of them need an eval dataset of prompts/task scenarios, a rubric of success, and an initial forward pass (harness model) to learn. They just update different things to get your system to better evals.
There's nuance to each. There's a time and place for all of them.
8
23
269
19,428
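The shared ingredients named in the post above (an eval dataset, a rubric, and a forward pass) can be sketched independently of what gets updated. A minimal sketch, assuming each candidate system is just a callable from input to output; `evaluate`, `pick_best`, and `exact_match` are hypothetical names, and a real setup would swap in candidate weights, prompts, or code variants where the toy functions are.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]            # (task input, expected output)
System = Callable[[str], str]        # the forward pass of one candidate
Rubric = Callable[[str, str], float] # (actual, expected) -> score


def exact_match(actual: str, expected: str) -> float:
    """Simplest possible rubric: 1.0 on an exact match, else 0.0."""
    return 1.0 if actual == expected else 0.0


def evaluate(system: System, dataset: Sequence[Example], rubric: Rubric) -> float:
    """Run the forward pass over the eval dataset and average the rubric scores."""
    scores = [rubric(system(task), expected) for task, expected in dataset]
    return sum(scores) / len(scores)


def pick_best(
    candidates: List[System], dataset: Sequence[Example], rubric: Rubric
) -> Tuple[int, float]:
    """Return (index, score) of the best candidate; ties go to the earliest one."""
    scored = [(evaluate(c, dataset, rubric), i) for i, c in enumerate(candidates)]
    best_score, best_index = max(scored, key=lambda s: (s[0], -s[1]))
    return best_index, best_score


# Toy data: the "task" is upper-casing a string.
DATASET: List[Example] = [("a", "A"), ("b", "B")]
CANDIDATES: List[System] = [lambda s: s, str.upper]
```

Whichever thing is being optimized, the loop keeps the same shape: generate a variant, score it with `evaluate`, keep it if the score improves.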
Gotta start working on getting familiar with this. Pretty sure this is also how you get better results with cheaper and faster models
RLMs pretty much solved context btw You can shove tens of millions of tokens into a good RLM harness and it just works. I’m spending all my free time here.
20
I agree with @theo, t3code has the best implementation thus far for worktrees. A couple of changes I'd like to suggest:
1
56
I'd like to see the sessions related to the same worktree all co-located; this way I can also close a worktree and call it a day once it's done
1
19
A way to properly delete worktrees. Right now I have way too many and find myself asking claude to clean up the worktrees so that only the specific branch I want to be working on is left
23
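The cleanup described above can be approximated outside any editor by parsing the output of `git worktree list --porcelain` and listing every linked worktree that is not on the branch you want to keep. A rough sketch: `parse_worktrees`, `stale_worktrees`, and the `SAMPLE` paths and hashes are made up for illustration, and nothing here reflects how t3code or claude handle worktrees.

```python
from typing import Dict, List


def parse_worktrees(porcelain: str) -> List[Dict[str, str]]:
    """Parse `git worktree list --porcelain` output into one dict per worktree.

    Records are separated by blank lines; each line is `key value`, or a bare
    key such as `detached`, which is stored with an empty value.
    """
    worktrees: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in porcelain.splitlines():
        if not line.strip():
            if current:
                worktrees.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value
    if current:
        worktrees.append(current)
    return worktrees


def stale_worktrees(worktrees: List[Dict[str, str]], keep_branch: str) -> List[str]:
    """Paths of linked worktrees not checked out on keep_branch.

    Git lists the main worktree first, so it is skipped and never proposed
    for removal.
    """
    keep_ref = f"refs/heads/{keep_branch}"
    return [wt["worktree"] for wt in worktrees[1:] if wt.get("branch") != keep_ref]


# Made-up sample output, used only to exercise the parser.
SAMPLE = """worktree /repo
HEAD abc123
branch refs/heads/main

worktree /repo-wt/feature-a
HEAD def456
branch refs/heads/feature-a

worktree /repo-wt/scratch
HEAD 789abc
detached
"""
```

Each returned path could then be passed to `git worktree remove <path>`, followed by `git worktree prune` to clear leftover metadata. Reviewing the list before removing anything is the safer default, since a worktree may hold uncommitted changes.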