
A New Live Streaming Origin Built for Global Scale

Lionel Bringuier

 

It’s the world’s most watched sport. In the US, it’s soccer; everywhere else, it’s football; and in the UK, it’s nearly a religion. In the run-up to the FIFA World Cup 2026, a large UK-based broadcaster made a big bet: they migrated their live streaming origin off the deprecated AWS Elemental MediaStore service and onto Momento, and they did it in time to serve tens of millions of fans.

This wasn’t a simple lift and shift. The broadcaster’s live stack is a mature, battle-tested system that has evolved over years of 24×7 live linear channels, major tournaments and peak news moments. Replacing the core media storage and origin layer under that stack meant Momento had to meet an exacting bar on latency, reliability and operational visibility, while continuing to support an existing fleet of encoders and packagers, CDNs, and control-plane tooling.

This post walks through the existing architecture, how Momento was adapted as a drop-in replacement for AWS Elemental MediaStore, and how the infrastructure was then hardened for the World Cup and beyond.

The Starting Point: Live Origin at Scale

The broadcaster runs a large portfolio of 24×7 simulcast channels plus frequent pop-up live events in both HD and UHD, delivered via DASH and HLS. Streams are published into multiple AWS regions for redundancy, and each stream fans out across multiple CDNs.

To give a rough estimate of the data volume, their live origin manages 40+ live 24×7 channels, plus seasonal pop-up channels (up to 40 concurrent ones for large sporting events). The live channels are encoded in H.264 and HEVC, typically with 8 to 11 video encoding profiles and 4 audio tracks in the ABR ladder. Segments and manifests are pushed into a media storage service deployed in two AWS regions, with cross-AZ replication in each region. On the playback side, CDNs pull from that origin, either directly or via an internal cache concentration layer that distributes traffic across providers and geographies.
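
As a back-of-the-envelope illustration of those figures, the sketch below counts rendition streams and converts them into a segment publication rate. The segment duration is a hypothetical parameter (the post does not state one). The calculation also assumes one segment per rendition per segment interval, that every rendition is published into both regions, and that "40+" means exactly 40. Manifest updates are ignored.

```python
def rendition_count(channels: int, video_profiles: int, audio_tracks: int) -> int:
    """Rendition streams across all channels (video profiles + audio tracks each)."""
    return channels * (video_profiles + audio_tracks)


def segment_puts_per_second(renditions: int, segment_seconds: float, regions: int = 2) -> float:
    """Segment PUTs per second, assuming one segment per rendition per interval.

    segment_seconds is an assumed value, not a figure from the post.
    """
    return renditions * regions / segment_seconds


# Steady state: 40 channels with the smallest ladder (8 video + 4 audio).
baseline = rendition_count(40, 8, 4)
# Peak: 40 permanent + 40 pop-up channels with the largest ladder (11 video + 4 audio).
peak = rendition_count(80, 11, 4)

print(baseline, peak)                      # 480 1200
print(segment_puts_per_second(peak, 4.0))  # 600.0 with an assumed 4-second segment
```

Shorter segments raise the publication rate proportionally, which is why write latency at the live edge matters as much as read latency.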

At first, the broadcaster evaluated moving the origin to a standard S3 bucket. However, their architecture and operations depended on certain capabilities that a standard S3 bucket would not guarantee.


When AWS Elemental MediaStore was deprecated, the broadcaster faced a classic “you have to rebuild the airplane while flying it” challenge: replace a key service in their workflow without rewriting their packagers, CDN configs, video players, or control planes, and without introducing new failure modes at the worst possible time, a year ahead of the global football tournament. What were the tenets for the new service?

  • Fast live-edge reads and writes from London-based clients: tight time-to-first-byte and time-to-last-byte for both PUT (publication from the encoders) and GET (playback).
  • Transient data policies to aggressively expire stale manifests, forcing automatic failover to backup origins when the primary stopped publishing.
  • Per-container request limits to prevent one high-traffic service from drowning others.
  • Per-container access policies and CORS for secure origin access from CDNs and packagers.
  • Detailed access logging and metrics for publication latency, error codes, empty object publications, and regional breakdowns.
  • Lifecycle management to trim historical content and control storage costs for the content in the hot cache.
  • Durability of S3-backed storage under the hood, without compromising the consistency of access latency.
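
To make the per-container request limit concrete, here is a minimal sketch using a token bucket per container. It is an illustration of the general technique, not a description of how Momento or MediaStore enforce limits. The class names, container names, and rate values are all hypothetical.

```python
from typing import Dict, Tuple


class TokenBucket:
    """Classic token bucket: refills at rate_per_sec, holds at most burst tokens."""

    def __init__(self, rate_per_sec: float, burst: float, now: float = 0.0):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = burst
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would answer with a throttling error


class ContainerLimiter:
    """One independent bucket per container, so one service cannot starve another."""

    def __init__(self, limits: Dict[str, Tuple[float, float]]):
        self.buckets = {
            name: TokenBucket(rate, burst) for name, (rate, burst) in limits.items()
        }

    def allow(self, container: str, now: float) -> bool:
        return self.buckets[container].allow(now)


# Hypothetical containers and limits: 10 requests/s with a burst of 2.
limiter = ContainerLimiter({"sports": (10.0, 2.0), "news": (10.0, 2.0)})
print([limiter.allow("sports", now=0.0) for _ in range(3)])  # [True, True, False]
print(limiter.allow("news", now=0.0))                        # True
```

Because each container owns its bucket, a burst on one channel exhausts only that channel's budget.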

Design Goal: A Drop-In MediaStore Replacement

The joint design goal was straightforward to state but hard to achieve: “just swap the origin to Momento Media Storage with minimal application changes, while preserving, or improving, the operational semantics we rely on today”.

Concretely, that translated into a few core requirements for Momento:

  • Equivalent HTTP surface area
    Keep the same style of HTTP PUT/GET semantics, 404/50x behavior for missing segments, and origin-side access control via headers and tokens.
  • Performance parity or better from London
    For UK-based clients, Momento had to deliver GET and PUT latencies that matched or beat their existing origin across both eu-west-1 (Dublin) and eu-west-2 (London).
  • Configurable object TTLs to emulate transient data policies
    Instead of path-based lifecycle rules on containers, the broadcaster wanted fine-grained TTL control per object class (e.g., manifests vs segments) to preserve their model where a stale manifest can trigger failover.
  • Operational observability that matched their current dashboards
    Publication latency, error codes, throttling, per-path metrics, and near-real-time access logs had to remain available for the broadcaster’s existing monitoring and alerting workflows.
  • Sensible multi-tenant safety rails
    Per-service request limits and regional SLAs needed to be enforced in a way that matched their mental model from the previous platform.
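
The per-object-class TTL idea can be sketched as a small classifier that maps an object path to a TTL. The suffix lists and TTL values below are illustrative assumptions, not the broadcaster's configuration, and `ttl_for` is a hypothetical helper rather than part of any Momento API.

```python
from pathlib import PurePosixPath

# Assumed suffixes for HLS/DASH manifests and for media segments.
MANIFEST_SUFFIXES = {".m3u8", ".mpd"}
SEGMENT_SUFFIXES = {".ts", ".m4s", ".mp4"}

# Hypothetical TTLs in seconds: manifests expire quickly so that a stalled
# publisher becomes visible; segments live longer.
TTL_SECONDS = {"manifest": 10, "segment": 300, "other": 60}


def object_class(path: str) -> str:
    """Classify an origin object by its file suffix (case-insensitive)."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in MANIFEST_SUFFIXES:
        return "manifest"
    if suffix in SEGMENT_SUFFIXES:
        return "segment"
    return "other"


def ttl_for(path: str) -> int:
    """Return the TTL (seconds) to apply when publishing the object at path."""
    return TTL_SECONDS[object_class(path)]


print(ttl_for("/channel1/index.m3u8"))         # 10
print(ttl_for("/channel1/video/seg_001.m4s"))  # 300
```

A publisher would look up the TTL at write time and attach it to each PUT, keeping the policy in one place instead of scattering it across path-based rules.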

Today: We Are Ready for Kick-Off

The journey from early performance tests to full production lasted almost a year. During that time, the broadcaster and Momento ran a substantial battery of tests:

  • Distributed publication and playback across dozens of channels in parallel.
  • Comparative latency benchmarks from London-based clients to both eu-west-1 and eu-west-2, under varying bitrates and ladders.
  • Load and failover drills to confirm that short manifest TTLs and 404 semantics still triggered the right automatic reactions in their CDN and player stack.
  • SDK vs native HTTP tests to iron out any client-side inefficiencies and eliminate measurement artifacts.
  • Reproducibility and automation, with a single orchestration layer for the whole video stack.
  • Operational observability at every step of the workflow, which slots into their existing CloudWatch-based dashboards and alerting.
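
The failover behavior exercised in those drills can be modeled in a few lines. The sketch below is a simplified in-memory simulation, not the broadcaster's test harness: `TtlOrigin` and `fetch_with_failover` are hypothetical names, time is passed in explicitly, and a real CDN or player would apply its own retry and origin-selection logic.

```python
from typing import Dict, List, Optional, Tuple


class TtlOrigin:
    """In-memory stand-in for an origin whose objects expire after a TTL."""

    def __init__(self, name: str):
        self.name = name
        self._objects: Dict[str, Tuple[bytes, float]] = {}

    def put(self, path: str, body: bytes, ttl: float, now: float) -> None:
        self._objects[path] = (body, now + ttl)

    def get(self, path: str, now: float) -> Tuple[int, Optional[bytes]]:
        entry = self._objects.get(path)
        if entry is None or now >= entry[1]:
            return 404, None  # missing or expired objects look the same to clients
        return 200, entry[0]


def fetch_with_failover(
    origins: List[TtlOrigin], path: str, now: float
) -> Tuple[Optional[str], Optional[bytes]]:
    """Try origins in priority order; return the first one that answers 200."""
    for origin in origins:
        status, body = origin.get(path, now)
        if status == 200:
            return origin.name, body
    return None, None


primary, backup = TtlOrigin("primary"), TtlOrigin("backup")
manifest = "/channel1/index.m3u8"

# Both origins publish at t=0 with a 6-second manifest TTL (an assumed value).
primary.put(manifest, b"v1", ttl=6, now=0)
backup.put(manifest, b"v1", ttl=6, now=0)
print(fetch_with_failover([primary, backup], manifest, now=1)[0])  # primary

# The primary stops publishing; only the backup refreshes its manifest at t=4.
backup.put(manifest, b"v2", ttl=6, now=4)
print(fetch_with_failover([primary, backup], manifest, now=7)[0])  # backup
```

The short TTL is what turns a silent publisher stall into an explicit 404, which downstream components can act on.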

 

Most importantly, they are now running on an origin layer that is actively developed, not deprecated, and that can evolve with their roadmap and future needs.

For now the focus is simple: when the referee blows the whistle to start the first World Cup match, tens of millions of fans across the UK and beyond will be watching through a new origin, built on Momento, and they won’t even notice the difference. And that’s exactly how it should be.
