JRR_Misc

Pinned Tweet

JRR_Misc

@jrr_misc

May 28

One of my side projects this month has been making an AI benchmark to test model alignment in various ways. I landed on Deviance, War, & a modified version of the classic political compass. I've decided to call the whole benchmark PoliBench. Check it out & take the test yourself

ALT https://polibench.jonathanrreed.com/

JRR_Misc

@jrr_misc

Jun 12

The new Siri gets @WVFRM confused with @TheYard

ALT The Yard is a weekly comedy podcast hosted by Ludwig Ahgren, Nick Vercillo, Slime, and Aiden McCaig. The show frequently features discussions about their personal lives, gaming, and internet culture.

JRR_Misc

@jrr_misc

Jun 12

I used Visual Intelligence to ask it which podcast this was, & it thought this was the boys from The Yard

JRR_Misc

@jrr_misc

Jun 12

The Codex settings search is giving little peeks into the future

JRR_Misc

@jrr_misc

Jun 2

Why didn't anyone warn me adult life is endless emails? I want out

JRR_Misc

@jrr_misc

May 30

The new Codex profile makes me realize I need to go outside more

JRR_Misc

@jrr_misc

May 29

Harry Du Bois would approve of discomorphic

JRR_Misc

@jrr_misc

May 28

ALT https://polibench.jonathanrreed.com/

JRR_Misc

@jrr_misc

May 28

Come check it out at polibench.jonathanrreed.com/

PoliBench Research Dashboard

PoliBench dashboard with model-output profiles, evidence links, validation status, and current run scope.

polibench.jonathanrreed.com

JRR_Misc

@jrr_misc

May 9

I'm scared for the 2x rate limit for Codex to end

ALT Last 30 days: $6,133.29 • 10B tokens

JRR_Misc

@jrr_misc

May 3

/goal is crazy in Codex CLI, it will just keep going

ALT Pursuing goal (22h 44m)

JRR_Misc

@jrr_misc

May 3

Passed the 24-hour mark

JRR_Misc

@jrr_misc

Apr 26

I have been building so much because of Codex, the built-in browser & annotation features are such a nice workflow

JRR_Misc

@jrr_misc

Apr 25

So hyped for the Black Flag remake!! Hoping Ubisoft doesn’t fuck it up

JRR_Misc

@jrr_misc

Apr 24

Why didn’t anyone tell me running benchmarks on AI is so sloooow

JRR_Misc

@jrr_misc

Apr 8

Kinda surprised there wasn’t an MCP for all the core Apple apps. There were a few smaller Apple MCPs, but nothing that really covered Apple’s core apps in one broad set. So I built Apple-MCPs. You can use them piecemeal or as one combined MCP. GitHub repo in the comments.

JRR_Misc

@jrr_misc

Apr 8

GitHub repo: github.com/JonathanRReed/App…

GitHub - JonathanRReed/Apple-MCPs: Apple-native MCP servers for macOS.

JRR_Misc

@jrr_misc

Apr 1

Artemis 2 just entered space!!!

JRR_Misc

@jrr_misc

Mar 23

Why does every MCP or connector for email only allow one email? Are there people out there just rocking one email?!

JRR_Misc

@jrr_misc

Mar 19

I participated in the @CloudCannon @astrodotbuild challenge and decided to build Internet Outage Atlas from my article The Days the Internet Dies. The goal was to make the history of these failures easier to explore on the web. Site: outage-archive.jonathanrreed… #Astro #CloudCannon

JRR_Misc

@jrr_misc

16 Dec 2025

I didn’t realize I opened @raycast 4,099 times this year #wrapped #raycast
