Chronological feed of everything captured from Simon Willison.
Looks like they pre-date the benchmark by at least five years! web.archive.org/web/20190721...
I'm at the Code w/ Claude event in San Francisco, and I'll be live blogging the keynote here: simonwillison.net/2026/May/6/c...
I was talking with Joseph Ruscio on the @heavybit.com podcast the other day when I realized that vibe coding and agentic engineering have started to blur a bit in some of my work - I published some extracts from the transcript simonwillison.net/2026/May/6/v...
AI-run business experiments are interesting and fun up to the point where they waste the time of humans who haven't opted into the experiments - I think they need to keep their own human operators in the loop for outbound actions that affect other people simonwillison.net/2026/May/5/o...
I tried running the same "Generate an SVG of a pelican riding a bicycle" prompt against 21 different quantized variants of the same IBM Granite 4.1 3B model - the results weren't as interesting as I had hoped simonwillison.net/2026/May/4/g...
The AI auto-reply bots from Twitter (fun fact, the software category is genuinely called "reply guy" tools) have started showing up on Bluesky now and it really, really sucks
This one triggered my spidey-senses bsky.app/profile/huma...
I added a new feature to my blog (built entirely on my phone with Claude Code for web) that imports my iNaturalist photos and adds them to my site's overall timeline simonwillison.net/2026/May/2/s...
I've got a ton of old dead links now and I keep planning to turn those into links to the Internet Archive's capture of each site from around the date I linked to it
The main difference with a PR is that it represents a significant ask on the time of the maintainers. Responding to a suggestion that has an attached proof-of-concept is massively less time consuming than meticulously reviewing, providing feedback on, merging and then maintaining a PR
Saw this white-crowned sparrow having quite a sing
I just added a feed to simonwillison.net/elsewhere/to... :)
I've been thinking recently that RSS-based search engines like Feedster are an idea that deserves revisiting techcrunch.com/2005/06/29/p...
The Zig project's rationale for their blanket ban on AI-assisted contributions makes a lot of sense to me - for them, time spent reviewing PRs isn't about the code, it's about growing new contributors for the future of the project simonwillison.net/2026/Apr/30/...
I particularly appreciate how this rationale isn't based on the idea that LLM code is of poor quality compared to code written by hand - the quality of the code isn't the deciding factor at all here
The translation thing struck me as a response to one of the most common objections to their blanket ban. I don't like it, but I respect it as being consistent with their overall position
Maybe those developers will be welcome to share their ideas with the Zig team, but not welcome to share a PR with their implementation
How can you tell if that contributor actually understands the code they produced?
I released LLM 0.32a0 this morning, a major backwards-compatible refactor of my LLM Python library and CLI tool for working with language models - the new changes should help LLM work better with reasoning models and other new frontier capabilities simonwillison.net/2026/Apr/29/...
Available as a TTF font here bsky.app/profile/dftb...
16"
They have a different model for that which I haven't tried yet github.com/microsoft/Vi...
That's what Activity Monitor reported - it likely uses a lot less for shorter audio clips, I gave it a full hour
Some notes on talkie, a new "vintage language model" from a team including Alec Radford (yes, that Alec Radford) "trained on 260B tokens of historical pre-1931 English text" simonwillison.net/2026/Apr/28/...
It didn't quite manage to draw me a pelican riding a bicycle, but I still appreciated its era-appropriate response
I would very much like to see the 2,000 lb Steller sea lion at San Francisco Pier 39, who I believe has now been named "Chonkers". Does anyone know if he keeps a regular schedule?
According to this story the best chance of seeing Chonkers is between 7am and 9am www.ktvu.com/news/massive...
Microsoft's MIT licensed VibeVoice speech-to-text model (think Whisper with speaker diarization) is really good - my notes on running the 5.71GB 4bit MLX conversion on an M5 MacBook, using about 60GB of RAM at peak and transcribing 1hr of audio in ~9 mins simonwillison.net/2026/Apr/27/...
Here's a uv one-liner that downloads and runs the MLX model against a local mp3 file:
uv run --with mlx-audio python -m mlx_audio.stt.generate \
  --model mlx-community/VibeVoice-ASR-4bit \
  --audio lenny.mp3 --output-path lenny \
  --format json --verbose --max-tokens 32768
No idea, that's just what I got out of the MLX-audio tool. Presumably it's just there for convenience
Love this so much. (Also definitive proof that humans are so much better than machines)
Today OpenAI announced that "Revenue share payments from OpenAI to Microsoft continue through 2030, independent of OpenAI’s technology progress". That "independent of OpenAI’s technology progress" fragment appears to mean that the weird AGI clause is now deceased simonwillison.net/2026/Apr/27/...
Yeah I wrote more about that here simonwillison.net/2026/Apr/22/...
Couldn't resist capping that off with my all-time favorite quote from @matt-levine.bsky.social
It's different but there are echoes
I'll be honest, I found this frustratingly vague. I'm not sure what I can take away from this, is the message effectively "watch this space?" What kind of non-copyright mechanisms could you apply here? Things like terms and conditions?
Particularly typo
It touched on that with "Poll after poll shows that Gen Z uses AI the most and has the most negative feelings about it" but I agree it would be useful to hear more about that tension
I can't find it now but I've seen some great commentary on that in the past - the gist is that Gen Z use AI for all sorts of stuff that they think is pointless because AI can do it, but they don't value the results at all
Spent a couple of hours today catching up with older Decoder episodes and wow I should have subscribed to this podcast sooner, @reckless.bsky.social is such a great host
I partially enjoyed the recent episode with Hank Green, some excellent discussion about AI slop content in the last third www.theverge.com/podcast/8820...
Reminded me of this TikTok by @cassiewillson.bsky.social from last year www.tiktok.com/@cassiewills...
I think ChatGPT Images 2.0 deciding to add a "WHY ARE YOU LIKE THIS" sign to the background of this image is the first time I've felt a glimpse of AGI simonwillison.net/2026/Apr/25/...
You can visit chatgpt.com/share/69ebff... (if you are signed into ChatGPT) and ask follow-up prompts yourself, but I doubt "why did you do that?" will produce useful results as that information is already discarded - it could only guess at why it had done it
Prehensile feet are quite a smart solution to the ongoing challenge that pelicans can't hold the handlebars
I don't have a proper job!
.... but whose hands are on the handlebars?
I had no idea The Wind in the Willows was this much of a banger
"or Oracle will collapse, destroying its share price and Larry Ellison's entire empire" - hah, I thought you normally leaned towards stories about the negative consequences of the AI bubble
I was reading it out loud to @natbat.bsky.social at the coffee shop this morning