Ivan Nardini

Tweets

Pinned Tweet

Ivan Nardini

@ivnardini

Apr 26

Presenting at the Next '26 Developer Keynote is one of those moments I'll remember for a long time. Thank you to everyone who played a part. Till the next Next!

1,372

Ivan Nardini

@ivnardini

Jun 12

Anthropic just announced a new batch of self-hosted sandbox providers, and Google Cloud is on the list with the GKE Agent Sandbox. With this integration, Claude and the agent loop stay on the Anthropic platform, while every tool call runs in a gVisor pod on your cluster inside your environment, close to your data, in a pool that scales with demand. Sample repo in 🧵

ClaudeDevs

@ClaudeDevs

Jun 12

Claude Managed Agents can operate in a sandbox you control, on your own infrastructure or with any provider you choose. Today we added new guides for @blaxelAI, @e2b, @googlecloud, @namespacelabs, and @superserve_ai, so you can choose the best fit for your use case.

3,205

Ivan Nardini

@ivnardini

Jun 12

Sample repo: github.com/GoogleCloudPlatfo…

164

Ivan Nardini

@ivnardini

Jun 11

Ray's cluster dashboard now shows TPU tensor core utilization and HBM memory usage alongside GPU metrics, with smart column labeling for GPU-only, TPU-only, or mixed accelerator clusters.

Ivan Nardini

@ivnardini

Jun 1

I will personally realize two dreams here: going to Japan and presenting with Kaz-san.

Kazunori Sato

@kazunori_279

Jun 1

At Code w/ Claude, the Anthropic-hosted event on 6/10, @ivnardini and I will be presenting "Building with Claude on GoogleCloud". On-site seats are already full, but you can watch online, so please join. claude.com/code-with-claude/…

4,207

Ivan Nardini

@ivnardini

May 31

With v2.1.158, Anthropic shipped Auto mode in Claude Code with Google Cloud. You can now run commands in Claude Code using Claude models on Google Cloud without stopping for a permission prompt every time. code.claude.com/docs/en/chan…

1,142

Ivan Nardini

@ivnardini

May 24

Next Friday we are running a hands-on Claude Code on Google Cloud workshop together with the @AnthropicAI team in SF. Half day, guided labs, and live Q&A. Link: dev.to/googleai/google-cloud…

Google Cloud & Claude Code workshop - 5/29 in SF, CA.

Hey developers! Want to learn best practices for integrating Claude into your Google Cloud...

dev.to

321

Ivan Nardini

@ivnardini

May 24

I looked into Keras Kinetic recently. Keras Kinetic is a framework that lets you run Keras and JAX workloads on Cloud TPUs by writing a training function and adding a decorator. Personally, it is one of the easiest ways I’ve seen to run a first TPU job so far. Here is a great blog post on fine-tuning Gemma to speak Gen Z slang using Kinetic. Blog: jigyasa-grover.github.io/Kin…

In my Kinetic era - Fine-tuning Gemma 3 to speak Gen Z on a Cloud TPU with one decorator 🤌🏻

TL; DR We fine-tuned Google's Gemma 3 1B to respond in Gen Z slang using supervised fine-tuning (SFT) on just 30 prompt/response pairs. The entire job runs on a Cloud TPU v5 Lite, deployed with a...

jigyasa-grover.github.io

317

Ivan Nardini

@ivnardini

May 24

GitHub: github.com/keras-team/kineti… Docs: kinetic.readthedocs.io/en/la…

GitHub - keras-team/kinetic: Run ML workloads seamlessly on cloud TPUs and GPUs with a single...

Run ML workloads seamlessly on cloud TPUs and GPUs with a single Python decorator. No infrastructure management required. - keras-team/kinetic

github.com

145

Ivan Nardini

@ivnardini

May 23

A good read about Cloud TPU generations: medium.com/womenintechnology…

Google Cloud TPU Architecture Versions Explained: From v1 to the Eighth Generation

A guide to Cloud TPU generations, what changed between them, and how to choose the right one for your workload

medium.com

1,488

Ivan Nardini

@ivnardini

May 21

What a great series about getting started with TPUs: medium.com/@roya90/tpu-101-w…

TPU 101 with JAX: A Medium Series

A six-part, beginner-friendly course that takes you from “I’ve never touched a TPU” to training models across multiple TPU chips in JAX.

medium.com

250

Ivan Nardini

@ivnardini

May 7

I spent some time testing elastic training capabilities on MaxText recently. MaxText is Google’s open-source JAX library for the full LLM lifecycle, scaling from one host to hundreds of TPU chips. Pre-train with the train method, run SFT/DPO/GRPO in the same package, and serve via vLLM. It supports several models, including Gemma, DeepSeek, Qwen, Kimi, and more. Docs: maxtext.readthedocs.io/en/ma… Tutorial coming soon.

484

Ivan Nardini

@ivnardini

May 6

Wrapping up the demo for Code with Claude in SF. If you’re around, I'm happy to talk. See you tomorrow!

326

Ivan Nardini

@ivnardini

Apr 29

Ray Serve now supports multi-host TPU slice deployments with gang scheduling. Before, TPU slices required manual host counts and bundle replication, with no guarantee of a single co-located slice. Now, Ray Serve uses Ray Core’s SlicePlacementGroup to pin deployments to one co-located TPU slice, matching Ray Train. Code: github.com/ray-project/ray/b…

ray/python/ray/llm/tests/serve/cpu/deployments/llm/test_llm_engine_tpu.py at f229d5376eb87b09a3fa...

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. - ray-project/ray

github.com

422

Ivan Nardini

@ivnardini

Apr 23

Anthropic released the public beta of Cowork on Third-Party Providers (3P). Claude Desktop with Cowork and Code can now run using your own Google Cloud endpoint, billed as token consumption to your GCP project. Docs: claude.com/docs/cowork/3p/ov…

Overview - Claude.ai Documentation

Run Cowork against your own cloud inference provider

claude.com

744

Ivan Nardini

@ivnardini

Apr 19

vLLM v0.19.1 shipped a bunch of optimizations and fixes for Gemma 4:
> Gemma 4 MoE quantization support
> Eagle3 speculative decoding for faster inference
> Streaming and tool-call bug fixes for production applications

646

Ivan Nardini

@ivnardini

Apr 19

Release notes: github.com/vllm-project/vllm… Recipe: docs.vllm.ai/projects/recipe…

Release v0.19.1 · vllm-project/vllm

This is a patch release on top of v0.19.0 with Transformers v5.5.3 upgrade and bug fixes for Gemma4: Update to transformers v5 (#30566) [Bugfix] Fix invalid JSON in Gemma 4 streaming tool calls by...

github.com

288

Ivan Nardini

@ivnardini

Apr 16

Vertex AI Agent Engine Memory Bank just landed two features I’ve been looking for. You can now push events yourself and decide when memories get generated. Before, agent memory was passive. You knew conversations were flowing in, but you didn’t know when extraction happened. Now you have:
> An ingest events method that lets you push raw turns in per user (and force_flush if you want it now)
> A generation trigger config that sets idle-duration, fixed-interval, and event-count rules
Code: github.com/googleapis/python…

python-aiplatform/vertexai/_genai/memories.py at main · googleapis/python-aiplatform

A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning. - googleapis/python-aiplatform

github.com

387
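
To make the Memory Bank trigger rules from the Apr 16 tweet concrete, here is a minimal sketch of how idle-duration, fixed-interval, and event-count rules could decide when buffered events get turned into memories. This is not the Vertex AI SDK: `TriggerRules`, `EventBuffer`, `ingest`, `tick`, and every field name are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TriggerRules:
    """Hypothetical rule set; names are illustrative, not the SDK's."""
    idle_seconds: Optional[float] = None      # generate after this much silence
    interval_seconds: Optional[float] = None  # generate at least this often
    event_count: Optional[int] = None         # generate once this many events are buffered


@dataclass
class EventBuffer:
    """Toy stand-in for a per-user event buffer (assumption, not the real service)."""
    rules: TriggerRules
    events: List[str] = field(default_factory=list)
    last_event_at: float = 0.0
    last_generation_at: float = 0.0
    generated: List[List[str]] = field(default_factory=list)

    def ingest(self, event: str, now: float, force_flush: bool = False) -> bool:
        """Buffer one raw turn; return True if generation fired."""
        self.events.append(event)
        self.last_event_at = now
        count_met = (
            self.rules.event_count is not None
            and len(self.events) >= self.rules.event_count
        )
        if force_flush or count_met:
            return self._generate(now)
        return False

    def tick(self, now: float) -> bool:
        """Check the time-based rules; meant to be called periodically."""
        if not self.events:
            return False
        idle_met = (
            self.rules.idle_seconds is not None
            and now - self.last_event_at >= self.rules.idle_seconds
        )
        interval_met = (
            self.rules.interval_seconds is not None
            and now - self.last_generation_at >= self.rules.interval_seconds
        )
        if idle_met or interval_met:
            return self._generate(now)
        return False

    def _generate(self, now: float) -> bool:
        # A real system would extract memories here; the sketch only records the batch.
        self.generated.append(self.events)
        self.events = []
        self.last_generation_at = now
        return True
```

In this sketch, `ingest` handles the count rule and the forced flush, while `tick` handles the two time-based rules, which mirrors the split between pushing events and configuring when generation happens.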