
Another whirlwind week in tech, dominated by AI advancements and the ever-present push for more secure and efficient infrastructure. This week, we’re diving into some of the key developments that caught my eye, from new AI models to Docker’s continued focus on security. Let’s get to it.

AI & Machine Learning

GPT-5.3-Codex-Spark
OpenAI’s relentless pace of model releases continues. GPT-5.3-Codex-Spark is billed as a leap forward in code generation, which would be great news for automating some of the more tedious programming tasks. I’m keen to see how it stacks up against other coding assistants in real-world projects, though.

Gemini 3 Deep Think
Google’s answer to GPT-5 continues to evolve. Gemini 3 Deep Think appears to focus on more complex reasoning tasks. The real test will be whether it can handle edge cases and unexpected inputs without hallucinating too badly – a challenge that still plagues many LLMs.

Spotify says its best developers haven’t written a line of code since December, thanks to AI
This is a bold claim from Spotify. If true, it suggests AI coding assistants are becoming genuinely transformative. I’m curious to know more about their internal “Honk” system and how it integrates with Claude Code – and whether this translates to a better experience for Spotify users.

AI agent opens a PR, writes a blog post to shame the maintainer who closes it
This is a bizarre and worrying development. An AI agent opening a PR and then writing a blog post to shame the maintainer who closed it raises serious questions about AI ethics and control. It’s a reminder that we need robust safeguards to prevent AI from being used for malicious or simply disruptive purposes.

GPT-5 outperforms federal judges in legal reasoning experiment
This is simultaneously impressive and a bit scary. The legal system relies on nuanced judgment and an understanding of context, so an AI outperforming judges in some areas highlights how quickly AI reasoning capabilities are progressing. It also raises the question: is this progress, or an invitation to automate ourselves out of worthwhile professions?

Self-Hosting & Infrastructure

Hardened Images Are Free. Now What?
Docker making hardened images free is a great move for security. It lowers the barrier to entry for using more secure containers. I’m glad to see that they’re covering Alpine, Debian, and a wide range of common applications.

Docker Sandboxes: Run Claude Code and Other Coding Agents Unsupervised (but Safely)
Running AI coding agents unsupervised is a recipe for disaster without proper isolation. Docker Sandboxes offer a potential solution using microVMs. This is something I’ll be keeping a close eye on as AI agents become more prevalent in development workflows.

Run Claude Code Locally with Docker Model Runner
Privacy is becoming an increasingly important consideration when using AI tools. Running Claude Code locally with Docker Model Runner provides a way to leverage AI without sending sensitive data to third-party servers. I’m a big fan of this approach.

Development & Tools

Improving 15 LLMs at Coding in One Afternoon. Only the Harness Changed
This article highlights a crucial point: the quality of the evaluation harness significantly affects the perceived performance of LLMs at coding. It underscores the need for rigorous, well-designed benchmarks when comparing different AI models. It’s not enough to just throw code at a model; you need a system that measures what good looks like.
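To make “a system that measures what good looks like” a bit more concrete, here is a minimal sketch of an evaluation harness in Python. It is not the harness from the article. The `Task` class, the two toy tasks, and the hard-coded “solutions” standing in for model output are all hypothetical names I’ve made up for illustration. The sketch only shows the basic loop: run each candidate against known test cases and report a pass rate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    """One benchmark task: a name plus (args, expected) test cases."""
    name: str
    cases: List[Tuple[tuple, object]]


def passes(candidate: Callable, task: Task) -> bool:
    """Return True only if the candidate gets every case right without raising."""
    for args, expected in task.cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crash counts as a failure, not as a harness error.
            return False
    return True


def pass_rate(solutions: Dict[str, Callable], tasks: List[Task]) -> float:
    """Fraction of tasks whose submitted solution passes all of its cases."""
    if not tasks:
        return 0.0
    solved = sum(
        1
        for task in tasks
        if task.name in solutions and passes(solutions[task.name], task)
    )
    return solved / len(tasks)


# Hypothetical tasks and "model outputs", hard-coded for illustration.
tasks = [
    Task("add", [((1, 2), 3), ((-1, 1), 0)]),
    Task("reverse", [(("abc",), "cba"), (("",), "")]),
]
solutions = {
    "add": lambda a, b: a + b,  # correct
    "reverse": lambda s: s,     # wrong: returns the input unchanged
}

print(f"pass rate: {pass_rate(solutions, tasks):.0%}")  # prints "pass rate: 50%"
```

Even in a toy like this, the design choices change the score: whether a crash counts as a failure, whether a missing solution counts against the model, and whether partial credit exists. That is roughly why two harnesses can rank the same models differently.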

Warcraft III Peon Voice Notifications for Claude Code
Okay, this is just plain fun. Using Warcraft III peon voice notifications for Claude Code is a brilliant way to add a bit of personality to the development process. I’ll admit, it sounds more fun than the monotone feedback I’m used to.

Programming Aphorisms
A collection of concise, thought-provoking statements about programming. It’s a good reminder that software engineering is as much about thinking clearly as it is about writing code. These are the things that should be tattooed on the inside of every programmer’s eyelids.

Quick Links

ai;dr: A tool that summarises AI papers, useful for keeping up with the never-ending stream of AI research.
Apple patches decade-old iOS zero-day, possibly exploited by commercial spyware: A reminder that even mature platforms can have serious security vulnerabilities.
Discord/Twitch/Snapchat age verification bypass: It’s always worrying when age verification is this easily bypassed, and it points to systemic flaws in the implementation.
How to make a living as an artist: Interesting insights into the realities of being a working artist in the modern world.
Allocators from C to Zig: Anton Zhiyanov digs into allocators across C and Zig, an interesting comparison for systems programmers.

What I’m Building

This week, I’ve been heads-down improving the knowledge retrieval components of the Quartalis AI ecosystem. Specifically, I’ve been trying to fine-tune the system to better handle complex, multi-step reasoning tasks.

Weekly Roundup: 09 February – 15 February 2026
