From Demos to Deployment: What Day Two of Google Cloud Summit London Taught Me About Building

June 18, 2026 · London

If Day 1 of Google Cloud Summit London was about the architecture of the agentic era — the data platform, the governance, the case studies — Day 2 was about the thing the builders actually do with it. The framing shifted from “here’s the platform” to “here’s what you can make, live, right now.” And the demos were genuinely live. Built on stage, breaking and recovering in real time, no safety net. That’s a different kind of confidence to project, and it changed how I read the day.

Here’s what stayed with me on the way home.

The Builders’ Day

The morning opened with a deliberate reframing. The keynote leaned hard into London as a centre of gravity for Google’s AI work — not just DeepMind, which everyone knows is here, but the UK customer engineering and consulting teams too. The number that got repeated was the forecast of over a billion new logical apps being built by 2029, and the point being made was that the vast majority of those won’t be built by Google. They’ll be built by the people in the room.

That’s a nice rhetorical move, but it’s also the actual thesis of the whole day: the tooling has moved far enough up the stack that “I’m not a designer” or “I don’t write much code” is no longer the blocker it was even a year ago. I went in mildly sceptical of that claim. I left less so.

Future Dreams: The Demo That Earned Its Place

The standout moment of the day wasn’t a product announcement. It was a charity.

Future Dreams is a UK breast cancer charity based in King’s Cross — a non-clinical “house” where people affected by breast cancer get counselling, peer support, yoga, nutrition advice, and time with breast care nurses. Their CEO, Sam Jacobs, described the problem plainly: over 55,000 new breast cancer diagnoses a year in the UK, roughly one every ten minutes, and a physical house that can only hold so many people. How do you scale that warmth — “the cup of tea waiting for you when you walk in” — to tens of thousands more people, digitally, without losing the thing that makes it work?

What followed was, frankly, the most technically dense thing I saw all summit, dressed up as something warm. The team built a “virtual sanctuary” — a 3D version of the house — live on stage:

Three distinct architectural concepts (Victorian townhouse, modern, Japanese-influenced) generated on the spot with Nano Banana, Google’s Gemini-based image model. The audience then voted on a favourite via a QR code while all three were built out in parallel.
The 3D build and the app scaffolding ran inside the Managed Agents API — the service formerly talked about as “Antigravity” — where each agent gets its own secure, ephemeral Linux sandbox provisioned with your skills, tools, and data. Each prototype was an independent sandbox, scaled independently, streaming its build as it went.
Live video backgrounds in the rendered house came from Veo, Google’s video model.
A Skills Registry sat underneath it all, managing the lifecycle of reusable skills so both humans and agents could discover and load them.

The bit that actually moved me, though, was a small one. Inside the virtual house, a visitor could ask to “speak to a person” — and the Gemini Enterprise customer experience layer routed the call through a telephony system to Cass, a real member of the Future Dreams team, waiting on the line. The whole orchestra of agents existed to get someone to a human faster, not to replace one.

I want to flag this because it’s easy to be cynical about a charity appearing in a cloud keynote. But the architecture on display — multi-agent orchestration, a sandboxed build environment, a skills registry, an escalation path to a human — is exactly the pattern every regulated enterprise is trying to figure out. Seeing it applied to “reduce the loneliness of a cancer diagnosis” rather than “optimise a supply chain” was a useful reminder of what the same building blocks can do.

The Orchestra of Agents — and the Guardrails Around It

The next session took the same Lego bricks and made the governance story concrete. There was a fun framing device — building an agent to plan a techno DJ tour across Europe and Asia — but underneath the music was the most honest demonstration of agent guardrails I’ve seen on a stage.

A couple of things worth pulling out.

First, the creative models have a responsibility story baked in. The demo of Lyria, Google’s music model, deliberately showed that it refuses to mimic a named artist’s voice or style, embeds a SynthID-style watermark, and is detectable as AI-generated through Google’s own verification tooling. Whatever you think of generative music, “the model declines to clone a real artist” is the right default, and it was front and centre rather than buried in a footnote.

Second, and more important for anyone building in a regulated context, the presenter deliberately tried to break the tour-planning agent. They asked a partner-facing agent for a DJ’s email (allowed: it’s a business contact), then their passport number (blocked as PII), then fed it a fake credit card (blocked by data loss prevention), then abused it (blocked by responsible AI policy). The key line: none of that was hard-coded into the agent. It was all enforced centrally by Model Armor sitting across the runtime, configurable by floor-level policy and confidence thresholds, and crucially, demonstrably changeable without editing a single line of the agent’s code. They dialled the harassment filter from “block” to “high tolerance” mid-session and the same insult produced an empathetic customer-service response instead of a refusal.

That separation — policy as a centrally managed layer, not something each agent reimplements — is the single most practical governance idea I took from the summit. It’s the difference between governance that scales and governance that’s copy-pasted into a hundred agents and rots.
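
To make that separation concrete for myself, here is a minimal Python sketch of the pattern. It is my own illustration and not Model Armor's API: the `Policy` class, the `guarded` wrapper, and the toy detectors are all hypothetical names, and the keyword and regex checks stand in for real PII, DLP, and responsible-AI classifiers.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Policy:
    """Hypothetical central policy: one action per check, edited in one place."""
    actions: dict[str, str] = field(default_factory=lambda: {
        "passport_number": "block",
        "credit_card": "block",
        "harassment": "block",
    })


# Toy detectors (assumptions for illustration, far cruder than real classifiers).
DETECTORS: dict[str, Callable[[str], bool]] = {
    "passport_number": lambda t: "passport" in t.lower(),
    "credit_card": lambda t: re.search(r"\b(?:\d[ -]?){13,16}\b", t) is not None,
    "harassment": lambda t: any(w in t.lower() for w in ("idiot", "useless")),
}


def guarded(agent: Callable[[str], str], policy: Policy) -> Callable[[str], str]:
    """Wrap any agent so the shared policy is checked before the agent runs."""
    def run(prompt: str) -> str:
        for name, detect in DETECTORS.items():
            if detect(prompt) and policy.actions.get(name) == "block":
                return f"[blocked by policy: {name}]"
        return agent(prompt)
    return run


def tour_agent(prompt: str) -> str:
    # The agent itself contains no policy logic at all.
    return f"agent handled: {prompt}"


policy = Policy()
agent = guarded(tour_agent, policy)

print(agent("What is the DJ's booking email?"))    # passes through to the agent
print(agent("What is the DJ's passport number?"))  # blocked by the central layer

# Relax one filter centrally; tour_agent is untouched.
policy.actions["harassment"] = "allow"
print(agent("you are useless"))                    # now reaches the agent
```

The design point is that `tour_agent` never changes. Every agent wrapped with the same `policy` object picks up a relaxed or tightened rule immediately, which is the behaviour the mid-session demo showed.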

The session also restated the production playbook clearly enough to be worth writing down. Build agents that are atomic (specific, narrow, easy to evaluate); invest in the cognitive architecture (the design patterns and flow between agents); and spend most of your time on evaluation. The Gemini Enterprise Agent Platform wraps the supporting lifecycle — the Agent Development Kit (ADK) for building, a runtime for deployment, agent identity, a registry for discovery and reuse, the gateway for security, plus observability and evaluation monitoring for tokens and performance. The advice that stuck: start with a flash model and the smallest token budget you can, then measure before you reach for anything bigger.

DeepMind: From Move 37 to Beating Strassen

The DeepMind talk was the intellectual high point, and it made an argument I keep turning over.

The setup was familiar history: AlphaGo’s “Move 37” against Lee Sedol in 2016, the move every commentator initially thought was a mistake before it turned out to be the key to the game; then AlphaFold solving protein structure prediction and winning the 2024 Nobel Prize in Chemistry. The thread connecting them was the distinction between AI that democratises existing expertise (helpful, productivity-boosting, what most of us are deploying today) and AI that generates knowledge humans don’t yet have.

The example used to make this real was matrix multiplication, and it’s a good one because it sounds trivial and isn’t. Multiplying two 2×2 matrices the naive way takes 8 multiplications. In 1969, the mathematician Volker Strassen found you could do it in 7, a result that stood as a benchmark for decades, given how fundamental matrix multiplication is to almost all of modern machine learning. DeepMind’s discovery agents (AlphaTensor, then FunSearch, now AlphaEvolve) have been chipping at exactly these open problems. AlphaEvolve found a way to multiply 4×4 complex matrices in 48 scalar multiplications, beating the 49 you get by applying Strassen’s scheme recursively (7 × 7), the first improvement on that specific problem in over fifty years.
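
For anyone who wants to see where the 7 and the 49 come from, here is a short Python sketch of Strassen's classic 2×2 scheme with a counter on the scalar multiplications. The helper names (`Block`, `to_blocks`, `from_blocks`, `make_counter`) are my own. Treating a 4×4 matrix as a 2×2 grid of 2×2 blocks and applying the same scheme twice gives 7 × 7 = 49 products. This shows the baseline only; it is not AlphaEvolve's 48-multiplication algorithm.

```python
def strassen_2x2(A, B, mul):
    """Multiply two 2x2 matrices using Strassen's seven products.

    `mul` multiplies two entries, so it can count calls or recurse on blocks.
    """
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = mul(a11 + a22, b11 + b22)
    m2 = mul(a21 + a22, b11)
    m3 = mul(a11, b12 - b22)
    m4 = mul(a22, b21 - b11)
    m5 = mul(a11 + a12, b22)
    m6 = mul(a21 - a11, b11 + b12)
    m7 = mul(a12 - a22, b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


class Block:
    """A 2x2 block of numbers with + and -, so blocks can act as matrix entries."""
    def __init__(self, rows):
        self.rows = rows

    def __add__(self, other):
        return Block([[x + y for x, y in zip(r, s)]
                      for r, s in zip(self.rows, other.rows)])

    def __sub__(self, other):
        return Block([[x - y for x, y in zip(r, s)]
                      for r, s in zip(self.rows, other.rows)])


def to_blocks(M):
    """Split a 4x4 list-of-lists into a 2x2 grid of Block objects."""
    return [[Block([row[c:c + 2] for row in M[r:r + 2]]) for c in (0, 2)]
            for r in (0, 2)]


def from_blocks(G):
    """Reassemble a 2x2 grid of Blocks into a 4x4 list-of-lists."""
    return [G[i][0].rows[k] + G[i][1].rows[k] for i in (0, 1) for k in (0, 1)]


def strassen_4x4(A, B, mul):
    """Apply the 2x2 scheme to blocks, then again inside each block product."""
    def block_mul(X, Y):
        return Block(strassen_2x2(X.rows, Y.rows, mul))
    return from_blocks(strassen_2x2(to_blocks(A), to_blocks(B), block_mul))


def make_counter():
    """Return a scalar multiply plus a list that grows by one per call."""
    calls = []

    def mul(x, y):
        calls.append(1)
        return x * y
    return mul, calls


mul, calls = make_counter()
print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]], mul))  # [[19, 22], [43, 50]]
print(len(calls))  # 7

A4 = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B4 = [[2, 0, 1, 3], [1, 1, 0, 2], [4, 5, 6, 7], [0, 3, 2, 1]]
mul4, calls4 = make_counter()
C4 = strassen_4x4(A4, B4, mul4)
print(len(calls4))  # 49
```

The naive method would use 8 and 64 scalar multiplications for the same two products. The saving comes from trading multiplications for extra additions, which only pays off at scale.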

The part that matters for the rest of us is that this isn’t only theoretical. AlphaEvolve is a general coding agent, and DeepMind described it recovering roughly 1% of Google’s data centre compute through scheduling optimisation — a number that sounds small until you multiply it by what Google spends on chips and data centres. The pitch was that the organisations who learn to point these discovery agents at their own hard problems — logistics, chip design, chemistry — rather than just using AI to go faster at what they already do, will be the ones who get disproportionate value.

I find that framing more useful than most “AI will change everything” talk, because it’s a concrete, testable claim: there’s a difference between automation and discovery, and the second is where the durable advantage is.

Agentic Patterns for Production (and Starling’s Honest Journey)

The breakout I got the most operational value from was a deeper-dive session on agentic design patterns — sequential agents, parallel agents (the example was multiple research agents hitting different sources simultaneously for speed), routing/triage agents that classify a request and hand it to the right specialist, and human-in-the-loop approval gates. Nothing here was novel if you’ve been building, but seeing the patterns named and mapped to the ADK primitives is exactly the kind of shared vocabulary teams need. The recurring themes were reusability (don’t rebuild the same agent five times), evaluation as the thing you spend most of your time on, and using small fine-tuned models for narrow tasks rather than reaching for a frontier model by reflex.
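
As a note to self, the four patterns named in the session can be sketched in a few lines of plain Python. This deliberately does not use the ADK: every function here is a hypothetical stand-in, with `asyncio.sleep(0)` marking where a model or tool call would go.

```python
import asyncio


async def research(source: str, topic: str) -> str:
    """Stand-in research agent; a real one would call a model or a tool."""
    await asyncio.sleep(0)
    return f"{source}: notes on {topic}"


async def summarise(notes: list[str]) -> str:
    """Stand-in summariser that just joins the notes."""
    return " | ".join(notes)


async def parallel_research(topic: str, sources: list[str]) -> list[str]:
    """Parallel pattern: fan out to every source at once; gather keeps input order."""
    return list(await asyncio.gather(*(research(s, topic) for s in sources)))


async def sequential_pipeline(topic: str) -> str:
    """Sequential pattern: each step consumes the previous step's output."""
    notes = await parallel_research(topic, ["web", "internal wiki"])
    return await summarise(notes)


def route(request: str) -> str:
    """Routing pattern: classify the request and name the specialist to hand it to."""
    text = request.lower()
    if "refund" in text or "payment" in text:
        return "payments_agent"
    if "tour" in text or "venue" in text:
        return "tour_agent"
    return "general_agent"


def approval_gate(action: str, approve) -> str:
    """Human-in-the-loop pattern: the action only runs if `approve` returns True."""
    if approve(action):
        return f"executed: {action}"
    return f"held for review: {action}"


print(asyncio.run(sequential_pipeline("venues")))
print(route("Plan a tour of Berlin"))
print(approval_gate("book venue", lambda action: False))
```

Keeping each piece this narrow is what makes the "atomic agents" advice practical: every function has one job and can be evaluated on its own.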

The grounding came from Starling Bank, who walked through their actual year-long journey: Spending Intelligence (mid-2025, letting customers query their transactions in natural language), then Scam Intelligence (late 2025, where you can upload a marketplace listing and have it flagged for fraud signals), and finally the Starling Assistant, launched on Gemini in March 2026, a genuinely agentic assistant that can carry out tasks, not just answer questions.

A few honest lessons from them that I’m still thinking about:

Confirmation and trust are a UX problem, not just a model problem. People are not yet comfortable with an AI moving their money. Starling’s answer is an explicit confirmation step that states what the agent is about to do — and they’re tuning when to ask, because confirming a £100 transfer the same way you confirm a £20,000 one is exhausting and erodes trust.
You don’t have to explain everything. As a regulated bank under the FCA and Consumer Duty, explainability matters enormously — but they made the point that trying to explain every reasoning step of an LLM is neither practical nor necessary. They focus on explaining outcomes and acting in good faith, which is a more defensible and more honest position than pretending to full interpretability.
Engagement is a real signal. Most user sessions were one or two turns, which told them people weren’t discovering what the assistant could do. Their fix was agent skills — surfacing concrete, suggested actions to reduce the friction of figuring out what to ask.
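
The confirmation point is easy to picture as code. Below is a rough Python sketch of a tiered confirmation step. Starling did not share their thresholds or rules, so the two limits, the `known_payee` flag, and the three level names are purely my assumptions.

```python
from decimal import Decimal

# Hypothetical thresholds in pounds (assumptions, not Starling's values).
STANDARD_CONFIRM_FROM = Decimal("250")
STRONG_CONFIRM_FROM = Decimal("5000")


def confirmation_step(amount: Decimal, payee: str, known_payee: bool) -> tuple[str, str]:
    """Return (level, message). Every transfer is confirmed; only the friction varies."""
    message = f"I am about to send £{amount:,.2f} to {payee}."
    if amount >= STRONG_CONFIRM_FROM or not known_payee:
        return "strong", message + " Please re-authenticate to confirm."
    if amount >= STANDARD_CONFIRM_FROM:
        return "standard", message + " Confirm?"
    return "light", message + " Tap to confirm."


print(confirmation_step(Decimal("100"), "Alex", True))
print(confirmation_step(Decimal("20000"), "Alex", True))
```

The message always states what the agent is about to do, which matches the explicit confirmation step described in the session. The tuning work is in where the thresholds sit.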

The meta-lesson, which echoes the data-culture point from Day 1, was that none of this shipped without a cross-functional team — design, user research, engineering, data science — held together for six months around one vision. The technology was the easy part.

Google Cloud Lakehouse: Data Without Borders

Day 1 introduced the “Borderless Lakehouse” as a concept. This Day 2 breakout — Google Cloud Lakehouse: Data Without Borders — actually unpacked it, and it landed harder for me the second time, partly because of a customer story that made the abstract concrete.

The session opened with the honest version of the problem: consolidating everything onto a single cloud takes a long time, and building the data pipelines to do it is genuinely hard. Enterprise data is scattered — some in AWS, some in Azure, some in Google Cloud, more in Snowflake and Databricks — separated by governance silos, complex access control, and the manual data movement we’ve all written and all hate. The pitch for the Cross-Cloud Lakehouse is direct analytical access to that remote storage at scale, with no ETL and zero data movement — bringing BigQuery, Spark, and Gemini to the data rather than dragging the data to them. The repeated phrase, and the right one, was speed to value.

The part I liked most is that using the lakehouse through BigQuery means you keep all the BigQuery features — including, as the speaker pointed out, BigQuery Graph working directly over lakehouse data (which ties this neatly to the graph session below). And federated queries and native queries work together in the same place, so you’re not choosing between “query it where it lives” and “query the copy you imported” — you can mix both in one query.

The whole thing is built on Apache Iceberg as the open table format, with Iceberg Federation (vendor compatibility with Databricks and Snowflake, full catalog sync, two-sided policy enforcement) and an Iceberg REST Catalog for broad engine compatibility. A nice security detail: no more long-lived access keys — access uses short-lived, vended credentials instead, which is exactly what you want before this goes anywhere near a regulated workload.

A few more engineering points that stood out because they’re the difference between “demo” and “production”:

Cross-Cloud Interconnect (CCI), now in preview, lets you federate the lakehouse to other clouds over a private interconnect rather than the public internet — lower latency, better security, and meaningfully lower egress cost. The “data feels local even when it isn’t” claim only holds if the network layer is doing this.
Optimised data access via a transparent cache plus columnar tricks (dictionary and run-length encoding) and join pushdown, so the work happens close to the remote data and you ship results, not raw bytes.
An ADK plugin for observing how your agents are performing and running analysis on the lakehouse — agent telemetry landing back in the same data platform, which is the right place for it.
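
The “columnar tricks” are worth a quick illustration, because they explain why so few bytes need to cross the wire. This is a generic Python sketch of dictionary encoding and run-length encoding on one column, plus a count answered directly on the encoded form. It is a textbook illustration with function names of my own choosing, not a description of how BigQuery implements its cache or pushdown.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    table = {}
    codes = [table.setdefault(v, len(table)) for v in values]
    return codes, list(table)  # list index == code, thanks to insertion order


def run_length_encode(codes):
    """Collapse consecutive repeats into (code, run_length) pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return [tuple(r) for r in runs]


def decode(runs, table):
    """Invert both encodings to recover the original column."""
    return [table[code] for code, n in runs for _ in range(n)]


def count_equal(runs, table, value):
    """Count rows equal to `value` on the encoded form, without decoding any row."""
    if value not in table:
        return 0
    code = table.index(value)
    return sum(n for c, n in runs if c == code)


column = ["UK", "UK", "UK", "DE", "DE", "UK"]
codes, table = dictionary_encode(column)
runs = run_length_encode(codes)

print(codes)                           # [0, 0, 0, 1, 1, 0]
print(table)                           # ['UK', 'DE']
print(runs)                            # [(0, 3), (1, 2), (0, 1)]
print(count_equal(runs, table, "UK"))  # 4
```

The last call is the spirit of “ship results, not raw bytes”: the filter is evaluated next to the compact representation and only a single number comes back.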

One honest limitation worth recording: the cross-cloud syncing currently works for structured data only. So this isn’t yet the answer for the 90%-unstructured estate that Day 1’s Knowledge Catalog was pitched at. It’s the structured-data federation layer, and a strong one, but know the boundary before you plan around it.

There was also a zero-copy SaaS angle I hadn’t fully clocked: querying live application data from Salesforce and Workday (and SAP, coming soon) without replication, with bi-directional sharing so derived insights can be pushed back into those systems for action. That closes a loop most “single view of the customer” projects never manage to close.

The architecture they showed stacks cleanly: AI Hypercomputer at the bottom, the AI-native cross-cloud lakehouse spanning GCS/AWS/Azure storage, the Knowledge Catalog above it (the same context graph from Day 1), then the AI-powered engines (analytical systems, operational databases, BI), and at the top the agentic-first experiences — agentic development and Gemini Enterprise. It’s the same “system of action, not system of intelligence” thesis, but seen from the data layer up.

WPP: what it actually buys you

The customer story was WPP, and it was the most useful part of the session because it started with a business question, not a technology. Their “Deep Data Research” framing: start with a question (“how can WPP maximise engagement and conversion for a client’s new sustainable product line targeting Gen Z and Millennials in North America?”), then decompose it into data problems — find sources, transform, query — across a genuinely terrifying baseline diagram of client first-party data locked in separate Azure, AWS, GCP, and Snowflake silos.

The target they’re building toward is a single consolidated data layer: hybrid/multi-cloud sources at the bottom, an Iceberg-based federated access layer, then the execution engines (BigQuery, Spark on GKE, managed Spark, AlloyDB, Spanner), the semantic and governance layer (Knowledge Catalog, knowledge graph, authorized/semantic views), and finally consumption — analysts, downstream systems, data scientists, marketers, and crucially an agentic consumer via Vertex and ADK.

The numbers they put against it are the bit worth quoting in a budget conversation:

200% increase in monthly active users — by making data accessible beyond the experts (discovery).
200% increase in total requests — once templated data tasks are replaced with agentic flows, people come back and use it more.
10x cross-domain data “collabs” — combining client 1P data with WPP’s own assets to showcase data innovations.
10% cost savings — from removing unused data and process.
Onboarding a new dataset went from 6 months to 1 month.

That last one is the one I’d lead with internally. Time-to-value on new data is the tax that quietly kills most data platforms, and cutting it sixfold is a more honest signal of whether a foundation is working than any of the headline percentages.

The rest of the session rounded out the governance story you’d want before putting this anywhere near a regulated workload — RaMP (the Rapid Migration and Modernization Program) as the on-ramp, Gemini data residency at region level (EU/US) with country-level (Germany, India) coming, FSI compliance coverage across the major regimes (FCA, DORA, FINMA, and a long list more), and the same kind of governance lattice — user, device, browser, data location, logging, export — that Workspace has had for years. It’s preview, but it’s “try it today” preview, not “talk to your account team” preview, which is a meaningful difference.

Beyond the Join: Graph Goes Native in BigQuery

The most technically satisfying session for me — as someone who has written more five-table-join SQL queries than I’d like to admit — was on native graph analytics in BigQuery.

The premise is one I’ve felt in my bones: a huge amount of the data we care about is fundamentally about relationships, and we’ve spent decades forcing it into relational tables and then paying for that decision every time we write a recursive self-join. BigQuery Graph and Spanner Graph address this by letting you build a property graph on top of your existing tables using the ISO-standard Graph Query Language (GQL) — no migration to a separate graph database, and you can mix GQL and SQL in a single query. That “no migration” point is the whole game; anyone who’s moved a data estate knows it’s neither fun nor cheap.

The case study was Curve, the card-aggregation fintech, on fraud detection — and it was excellent precisely because it got into the unglamorous engineering. They model “user networks” connecting accounts to shared attributes (devices, IP addresses, shipping addresses, payment cards), which lets them spot account-takeover and synthetic-identity fraud by the tell-tale pattern of many victim accounts converging on one device. The clever bit was a bipartite projection technique: collapsing attribute nodes onto edge properties so they only keep the account-to-account relationships they actually care about. The result was a 97% reduction in node count (from 30 million nodes to about 1 million) and roughly half the hops per query — which, at billions of edges, is the difference between a viable system and an unaffordable one. They cited around £9.1M in fraud savings and up to ~20% lower query cost versus their old SQL approach.
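
The bipartite projection is simple enough to sketch in Python. This is my reconstruction of the general technique from the talk, not Curve's code: the `project_accounts` function and the sample identifiers are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def project_accounts(links):
    """Project an account-attribute bipartite graph onto accounts only.

    `links` is an iterable of (account, attribute) pairs. The result maps each
    account pair (a, b) with a < b to the set of attributes they share, so the
    attribute nodes survive only as edge properties.
    """
    by_attribute = defaultdict(set)
    for account, attribute in links:
        by_attribute[attribute].add(account)

    edges = defaultdict(set)
    for attribute, accounts in by_attribute.items():
        for a, b in combinations(sorted(accounts), 2):
            edges[(a, b)].add(attribute)
    return dict(edges)


links = [
    ("acc1", "device:D1"),
    ("acc2", "device:D1"),
    ("acc3", "device:D1"),
    ("acc3", "ip:10.0.0.7"),
    ("acc4", "ip:10.0.0.7"),
    ("acc5", "card:C9"),  # shared with nobody, so it produces no edge
]
edges = project_accounts(links)

for pair, shared in sorted(edges.items()):
    print(pair, sorted(shared))
print(len(edges))  # 4
```

Before projection, getting from one account to another takes two hops (account to attribute to account). After it, the same relationship is one hop, which is where the “roughly half the hops” figure comes from. One caveat with this naive version: an attribute shared by n accounts produces n(n-1)/2 edges, so very popular attributes need capping or special handling.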

On the platform side, the announcements worth flagging:

Spanner Graph algorithms are now in preview — 15 of the most popular algorithms (community detection, centrality, similarity, pattern matching) running natively in Spanner with no impact on transactional performance. The modularity-clustering demo grouped millions of accounts into isolated fraud communities, then surfaced the high-influence node in each as a likely ringleader.
An open-sourced GNN toolkit (built with Google Research) for training graph neural networks in a Google-native pipeline, reading directly from both BigQuery and Spanner graphs, for node classification and link prediction at low latency.
Graph measures + semantic layer integration, so governed metrics and the relationships between tables live with the graph schema and agents can read them consistently — addressing the “ask the same question of two BI tools, get two different answers” problem.
“Chat with your graph” via BigQuery’s data agents — add a graph as a knowledge source and query it conversationally, for anyone who doesn’t want to learn GQL.
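
The “find the communities, then find the ringleader” flow from the Spanner Graph demo can be approximated in plain Python. To keep it short I have swapped in much simpler techniques: connected components instead of modularity clustering, and degree (number of direct neighbours) as the influence measure. The function name and sample data are mine, and none of this reflects how Spanner implements its algorithms.

```python
from collections import defaultdict


def communities_and_leaders(edges):
    """Group nodes into connected components and pick the best-connected node in each.

    Returns a list of (sorted_members, leader) tuples. Ties on degree are broken
    alphabetically so the result is deterministic.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    result = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk to collect everything reachable from `start`.
        members, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        leader = max(sorted(members), key=lambda n: len(adjacency[n]))
        result.append((sorted(members), leader))
    return result


account_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("x", "y"), ("y", "z")]
for members, leader in communities_and_leaders(account_edges):
    print(members, "likely ringleader:", leader)
```

On real fraud graphs, connected components would lump far too much together, which is exactly why the demo used modularity-based clustering. The two-step shape of the analysis is the same.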

BigQuery Graph went to public preview around April and is targeting GA, with up to 4× performance improvements on benchmark tasks since then. If you’re doing any kind of fraud, entity-resolution, or network analysis, this is worth a serious look — the “no migration, query it where it lives” story is hard to argue with.

My Takeaways

A few things I’m still chewing on after two days:

The “I’m not a designer/engineer” barrier is genuinely lower. I was sceptical of this going in. After watching non-specialists produce real, working app experiences live on stage using the same tools that are available to everyone in the room, I’m revising my prior. The harnesses do a lot of the heavy lifting now. The skill that matters is knowing what outcome you want and how to evaluate whether you got it.

Governance as a central layer is the idea I’ll actually use. The single most reusable thing from the whole summit was watching policy enforcement — PII blocking, DLP, responsible-AI filters — sit across the runtime and get reconfigured without touching agent code. That’s the pattern. Everything else is implementation detail.

Automation and discovery are not the same thing. DeepMind’s framing — that the real prize isn’t doing existing work faster but discovering things humans haven’t — is the most strategically useful distinction I heard. Most enterprises (mine included) are firmly in the automation camp. The question is which of our hard problems is actually a discovery problem in disguise.

“Bring compute to the data” finally has the plumbing behind it. The cross-cloud lakehouse only works because of the unglamorous layers — private interconnects, a transparent cache, join pushdown, an open table format. WPP’s 6-months-to-1-month onboarding number is the proof that matters: the win isn’t multi-cloud as a slogan, it’s not having to move the data before you can use it.

Graphs are finally where the data is. A decade of “you should really use a graph database” always foundered on the migration cost. Building the graph on top of the warehouse you already have removes the excuse. I suspect this is one of those changes that looks minor and turns out to matter a lot.

Day 1 asked what will you build? Day 2 answered it by just… building, repeatedly, in front of a live audience, including the bits that broke. The honesty of that — failures and recoveries included — did more to convince me the platform is ready than any polished sizzle reel would have.

The harder question, again, isn’t the technology. It’s whether we’ve done the foundational, governance, and team work to point these tools at problems that actually matter. Future Dreams reminded me what that looks like when you get it right.

These are my personal reflections from the event.