Agents Lead at AMP / Lifetimely. Production agents for commerce, side projects in finance. Bike person, bad guitarist. Half-formed thoughts on what's shipping

Joined January 2008
Funny how 'a few PRs a week' went from normal to artisanal so fast.
The moment that taught me most about our Profit Agent happened in closed beta - with a merchant we'd never invited. We'd left the flag a bit wider than the VIP list, and they found it on their own. No invite, no onboarding call. They just started using it.
It surfaced a margin problem they hadn't asked about, then talked them out of splitting their store in two. Lesson I keep: beta test with a flag wider than your list - the people who stumble in are where the best moments come from.
Got to build this one with the team. The part I'm proud of: it doesn't wait to be asked - it watches your store and pings you in Slack when something looks off, so you catch it before you'd have gone looking.
Introducing the Lifetimely Profit Agent - powered by the $100Bn in GMV. It uses the benchmarks from across all the GMV in Lifetimely, to find the highest leverage Profit opportunities in your business, and can actually take action to drive Profit. It's live in Lifetimely right now!
Still early, but a fun problem to be working on - I'll share a few tales from the build over the coming weeks. 🙊
Who's built dark factories in product delivery pipelines? I keep seeing posts about different agent tools and have been mulling over what that looks like in both the 9-5 and the 5-9.
We've had MCP tools running for a few months now, and it's surprising what Claude and ChatGPT can generate when you just ask. A report that used to take a day of SQL now comes back in seconds. That's got me thinking about where the value sits in analytics tools.
The shape I'm seeing now is that you stop at transform. Make the data queryable, document the schema, and let the agent surface what matters in the moment. The value moves from the report to the schema.
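One way to picture "document the schema" is a small, machine-readable description that an agent can read before it writes a query. This is only a sketch: the `orders` table, its columns, and the `describe` helper are hypothetical names made up for illustration, not anything from Lifetimely or a real MCP tool.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    type: str
    description: str


@dataclass
class Table:
    name: str
    description: str
    columns: list[Column] = field(default_factory=list)


def describe(table: Table) -> str:
    """Render a table as plain text an agent can read as context."""
    lines = [f"Table {table.name}: {table.description}"]
    lines += [f"  {c.name} ({c.type}): {c.description}" for c in table.columns]
    return "\n".join(lines)


# Hypothetical example table; every name here is an assumption.
orders = Table(
    name="orders",
    description="One row per order.",
    columns=[
        Column("order_id", "text", "Unique order identifier."),
        Column("net_revenue", "numeric", "Revenue after discounts and refunds."),
        Column("cogs", "numeric", "Cost of goods sold for the order."),
    ],
)

print(describe(orders))
```

The point of the sketch is that the descriptions carry the meaning a report used to encode, so the agent can decide what to surface.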
Also, genuinely curious what you're seeing in your own tools. And hit me up if you think your Shopify analytics could be better: I'm actively building here in the 9-5.
I've been a heavy Claude Code user since the middle of last year, and I wondered how a cheaper LLM would do on an Arduino project, a NeoPixel clock, that I'd put down years ago because I got stuck on it.
That was back when I had more free time and enthusiasm for this kind of thing. One short session with Kimi later and I was completely unstuck - for the first time in years, I'm excited to keep iterating on this project. Wrote up how it went: mathewhartley.com/blog/2026-…

I've seen the pushback about Gemini Nano downloading itself silently in Chrome - and I get it, finding a 4GB file you never agreed to is a bad look. But honestly, local LLMs excite me, so I had a play. Anyway, here's what I built and what I might try next: mathewhartley.com/blog/2026-…

If you have Gemini Nano active in your Chrome browser, go visit my website (bottom right, be patient if the model still needs downloading): mathewhartley.com

As an engineer, the bottleneck used to be writing the code. With Claude Code and Opus 4.5 that isn't really true anymore. The work has shifted to the bookends: defining what to build going in, and validating what comes out the other end.
Same for the agentic tools we ship to customers. Agent runs, produces an answer, user says thanks. Did it actually work? Tests passing doesn't tell you. A thumbs up doesn't either. Had one this week where the user loved it and our internal eval graded the same conversation a C.
So how do you build the system that validates what your system just built? Evals, human spot-checks, customer feedback loops, all three?
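One cheap starting point is to line the signals up against each other and send the disagreements to a human. A minimal sketch, assuming each conversation carries a thumbs up/down and a letter grade from an internal eval; the `Conversation` shape, the grade scale, and the "B or better" pass mark are all assumptions for illustration, not how any real eval system works.

```python
from dataclasses import dataclass

# Assumed grade scale, best to worst.
GRADES = ["A", "B", "C", "D", "F"]


@dataclass
class Conversation:
    conversation_id: str
    thumbs_up: bool   # what the user said
    eval_grade: str   # what the internal eval said


def eval_passed(grade: str, passing: str = "B") -> bool:
    """True if the grade is at or above the assumed pass mark."""
    return GRADES.index(grade) <= GRADES.index(passing)


def needs_spot_check(conversations, passing: str = "B"):
    """Return conversations where user feedback and the eval disagree."""
    return [
        c for c in conversations
        if c.thumbs_up != eval_passed(c.eval_grade, passing)
    ]


sample = [
    Conversation("a", thumbs_up=True, eval_grade="C"),   # loved it, graded C
    Conversation("b", thumbs_up=True, eval_grade="A"),
    Conversation("c", thumbs_up=False, eval_grade="D"),
    Conversation("d", thumbs_up=False, eval_grade="A"),  # disliked it, graded A
]

flagged = [c.conversation_id for c in needs_spot_check(sample)]
print(flagged)
```

This does not say which signal is right. It only narrows the human review queue to the cases where the two signals tell different stories.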
About a month ago @karpathy posted about an LLM wiki. Turns out I'd been running something similar for about six weeks before that. Started for a much dumber reason.
Since late last year I've barely hand-written code. I work through Claude Code, and the agent already knows what decisions got made, what's blocked, what just shipped. The context it needs to do good work and the context I need for good notes are the same thing.
So I gave it a markdown repo and told it to log decisions before ending each session. Sessions that start with context crush sessions that start cold. Showcases write themselves. PIRs already exist by the time someone asks. mathewhartley.com/blog/2026-…
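A rough sketch of what an end-of-session log step could look like, assuming one markdown file per day under a `sessions/` folder. The folder layout, the section names, and both helper functions are hypothetical; the linked write-up may do this quite differently.

```python
from datetime import date
from pathlib import Path


def format_entry(day, decisions, blocked=(), shipped=()):
    """Build one markdown entry; empty sections are left out."""
    lines = [f"## {day.isoformat()}", ""]
    sections = (("Decisions", decisions), ("Blocked", blocked), ("Shipped", shipped))
    for title, items in sections:
        if items:
            lines.append(f"### {title}")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
    return "\n".join(lines) + "\n"


def log_session(repo, day, decisions, blocked=(), shipped=()):
    """Append the entry to sessions/<date>.md inside the notes repo."""
    path = Path(repo) / "sessions" / f"{day.isoformat()}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(format_entry(day, decisions, blocked, shipped))
    return path
```

Appending rather than overwriting means several sessions on the same day stack up in one file, which the next session can read before it starts.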
