BBC Online: Architecting for Scale with the Cloud and Serverless
Matthew Clark discusses how the BBC’s website is designed to be scalable, performant, and resilient, the architectural solution behind it, and some of the technologies used.
A foundational technology of the Cloud Native approach is serverless computing.
Speaking at QCon Plus 2021, Matthew Clark provides a detailed analysis of the BBC Online platform, which powers sites like BBC News, the number one news site in the world, and how they have achieved this scale by using serverless.
The estate publishes in over 40 different languages, spans a massive content portfolio covering radio, podcasts, live events, and children’s e-learning materials, and includes over 15 apps such as BBC iPlayer.
From 3m:20s, Matthew begins an overview of the architecture, highlighting that last year the BBC switched off their on-premises data centres and moved to the Cloud.
Thousands of content producers, such as journalists, feed into CMSes (Content Management Systems) and media stores, which are backed by a variety of technologies including relational databases and search engines.
Users interface with server-side HTML rendering and with APIs for the mobile apps, and business logic connects these to the content stores. CDNs and routing layers provide traffic management capabilities.
At 8m:30s he moves on to caching, which delivers scaling, speed, cost optimization, and resilience. Caches are integrated with the traffic management layer, sit in front of the content store, and also sit between the app API and the HTML rendering.
Matthew recommends Redis as a caching technology, which the BBC has been using for years. A key requirement is to keep the caches refreshed as content updates.
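The caching pattern described here can be sketched with the classic cache-aside approach. The sketch below is illustrative, not the BBC's actual code: `FakeRedis` is an in-memory stand-in for a Redis server, and `fetch_from_content_store`, the key scheme, and the TTL value are all hypothetical.

```python
import time

class FakeRedis:
    """In-memory stand-in for a Redis cache with per-key TTLs."""
    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave like a cache miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        # Mirrors Redis SETEX: store a value with an expiry time.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

cache = FakeRedis()

def fetch_from_content_store(article_id):
    # Placeholder for the slow call to the underlying content store.
    return f"<article>content for {article_id}</article>"

def get_article(article_id, ttl_seconds=30):
    """Cache-aside read: try the cache first, fall back to the store."""
    key = f"article:{article_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = fetch_from_content_store(article_id)
    cache.setex(key, ttl_seconds, value)
    return value
```

The TTL is what keeps caches refreshed as content updates: stale entries expire and the next read repopulates them. An alternative, when updates must appear immediately, is to delete the key explicitly at publish time.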
From 12m:15s, Matthew describes their use of serverless technologies. He highlights how virtual machines, containers, and serverless are each used, with the criterion for choosing between them being the trade-off between how much control you retain and how little you have to manage.
The BBC use AWS Lambda to serve much of their HTML rendering, and while the majority of requests are fast, a small percentage perform very slowly due to the ‘serverless cold start’ challenge.
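A Lambda-style HTML renderer can be sketched as follows. The handler signature follows AWS Lambda's Python convention, but the template and the event fields are illustrative assumptions, not the BBC's implementation. The point to note is that module-level code runs once per container, and that initialization is precisely what a cold start pays for on the first request.

```python
# Module-level initialization runs once per container instance.
# On a cold start, this work adds latency to the first request served;
# warm invocations reuse the already-initialized container.
TEMPLATE = "<html><body><h1>{headline}</h1><p>{body}</p></body></html>"

def handler(event, context):
    """Render an HTML page for the requested content (illustrative event shape)."""
    headline = event.get("headline", "BBC News")
    body = event.get("body", "")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html"},
        "body": TEMPLATE.format(headline=headline, body=body),
    }
```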
Idle Function Problem
At 15m:00s he zooms in on the ‘Idle Function Problem’, where you pay for compute time on functions that aren’t doing anything, a cost that is magnified when functions are chained together, because each function in the chain is billed while it waits on the next.
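The magnification from chaining can be shown with a toy billing model (the three-level chain and the timings are hypothetical): a synchronously invoked function is billed for its whole duration, including the time it spends idle waiting for the function it calls.

```python
def billed_durations(work_ms):
    """work_ms[i] is the time (ms) function i spends on its own work.
    With synchronous chaining, function i is also billed while it
    sits idle waiting for every function deeper in the chain."""
    return [sum(work_ms[i:]) for i in range(len(work_ms))]
```

For three chained functions each doing 100 ms of real work, `billed_durations([100, 100, 100])` gives `[300, 200, 100]`: 600 ms is billed in total for 300 ms of actual work, and the top of the chain pays for the idle time of everything beneath it.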
Compared with the rack rate of an equivalent virtual machine, you pay two to four times as much for serverless, but this balances out because you only pay while your code is actually running.
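That trade-off can be worked through with rough numbers. The 3x premium below sits within the 2-4x range quoted in the talk, but the specific rates and utilization figures are illustrative: serverless wins whenever the fraction of time you actually have work to do is below 1/premium.

```python
def monthly_cost(vm_rate_per_hour, premium, utilization, hours=730):
    """Compare an always-on VM against serverless billed only while busy.
    premium:     how many times the VM's rack rate serverless costs.
    utilization: fraction of the month there is actually work running."""
    vm_cost = vm_rate_per_hour * hours                              # billed 24/7
    serverless_cost = vm_rate_per_hour * premium * utilization * hours
    return vm_cost, serverless_cost

# At a 3x premium, break-even is 1/3 utilization:
# 20% busy -> serverless cheaper; 50% busy -> the VM wins.
vm, sls = monthly_cost(vm_rate_per_hour=0.10, premium=3, utilization=0.20)
```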
From 19m:15s Matthew concludes by overlaying the role of serverless across the whole architecture he has described.
Some functions are ideal for serverless, such as the interface between the CMS and the content store, and parts of the content store itself can be serverless, although much of it is too complex for this. The HTML rendering, the API gateway, and the business logic connecting them to the content store are also strong candidates.
Elements such as traffic management are ill-suited to serverless, as the team wants much more raw control over performance there.
From 22m:52s Matthew wraps up with an overview of how to organize the team structure around this architecture and fields audience Q&A.