We will start with the basic components and then dive deeper into the technicalities of Netflix’s system design, studying the end-to-end processes involved in delivering content at scale.
Starting with, what is Netflix?
Netflix, Inc. is an American over-the-top content platform and production company headquartered in Los Gatos, California. Netflix was founded in 1997 by Reed Hastings and Marc Randolph in Scotts Valley, California.
The company is ranked 164th on the Fortune 500 and 284th on the Forbes Global 2000. Their primary business is a subscription-based streaming service offering online streaming from a library of films and television series, including those produced in-house.
Designing Netflix is quite a common question in system design interviews.
You may be given problem statements like –
- Design Netflix (video processing and content onboarding system)
- Design High-Level System Architecture for Netflix
- Design an on-demand video streaming system
- List the different layers and cloud operations involved in designing a video processing system.
Let’s first analyse the requirements of the system we are going to design.
- Feature to allow users to create accounts and subscribe to a plan
- Allow users to handle multiple accounts
- Allow users to watch videos
- Let Netflix developers upload video from the backend and make it available on the platform.
- There should be no buffering, i.e., provide users with real-time video streaming without any lag.
- Reliable system
- High Availability
Architecture and Components
From the point of view of software architecture, Netflix comprises three main parts: Client, Backend and Content Delivery Network (CDN).
Let’s look into the components in detail:
1. Client App
Netflix has an interactive user interface that works on almost all devices, such as mobiles, tablets, TVs and laptops.
It offers features like video search and intelligent recommendations that make it easy for users to find their next binge-watch. For example, when you start watching on one device, say a mobile, you can continue watching on a computer. Specific UX rules are applied to keep it simple and intuitive.
React.js is used to write the front-end because of its loading speed, durability, and performance.
2. Backend
Netflix has chosen a microservices architecture for its cloud-based system so that it can run both heavy and lightweight workloads on the same infrastructure.
The system is made up of small, manageable software components at the API level that serve requests from apps and websites. Microservices rely on each other internally for requests and for fetching data. Java, MySQL, Gluster, Apache Tomcat, Hive, Chukwa, Cassandra and Hadoop are used to power the backend system.
The backend handles everything other than streaming the video after you hit the play button, such as processing videos, onboarding new content, managing network traffic and distributing resources across servers worldwide. The major role is played by AWS (Amazon Web Services).
Below are some important services provided by Netflix –
- User and Authentication Service
- Subscription Management
- Videos Service
- TransCoder Service
- Global Search
3. Cloud
With the rapidly increasing content and resources to manage, Netflix also migrated its IT infrastructure to a public cloud.
Netflix works on two clouds: Amazon Web Services and Open Connect (Netflix’s custom CDN). The two work in parallel to process video and deliver content to end users.
4. CDN (Content Delivery Network)
CDN is a Content Delivery Network or Content Distribution Network. As the name suggests, it is a network of servers distributed globally. When you hit the play button, the video displayed on your device is streamed from this component. This significantly reduces the response time as the video is streamed from the server nearest to your location.
- CDNs replicate content in multiple places. There’s a better chance of videos being closer to the user and with fewer hops.
- CDN machines make heavy use of caching and can mostly serve videos out of memory.
- Less popular videos (1-20 views per day) that are not cached by CDNs can be served by the servers in various data centers.
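The caching behaviour described above can be sketched with a small least-recently-used (LRU) cache. This is only an illustration, assuming one edge server with room for a fixed number of videos; the `EdgeCache` class, the `origin` function and the capacity are hypothetical names and values, not Netflix’s actual implementation.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """A tiny LRU cache standing in for one CDN edge server (hypothetical)."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self._store: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, video_id: str) -> bytes:
        if video_id in self._store:
            # Cache hit: mark the video as most recently used.
            self._store.move_to_end(video_id)
            self.hits += 1
            return self._store[video_id]
        # Cache miss: fall back to the origin (e.g. a data center).
        self.misses += 1
        data = self.fetch_from_origin(video_id)
        self._store[video_id] = data
        if len(self._store) > self.capacity:
            # Evict the least recently used video.
            self._store.popitem(last=False)
        return data


def origin(video_id: str) -> bytes:
    """Assumed stand-in for a slower fetch from a data center."""
    return f"bytes-of-{video_id}".encode()


edge = EdgeCache(capacity=2, fetch_from_origin=origin)
for vid in ["popular", "rare-1", "popular", "rare-2", "popular"]:
    edge.get(vid)
print(edge.hits, edge.misses)
```

Popular videos keep getting refreshed in the cache, while rarely requested ones are evicted and served from the origin on their next request.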
5. Open Connect
This is Netflix’s in-house or custom global content delivery network (CDN) responsible for the storage and delivery of movies and TV shows to Netflix users globally.
When we play a video on Netflix, it is streamed from Open Connect servers located in different parts of the world. If the video is already present on a server near you, the client can start playing it right away. If it is not present in your location, Netflix first has to fetch and process the video from AWS S3, and Open Connect then streams it to your device.
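As a rough sketch of the idea of serving from the nearest location that holds the video, the snippet below picks the closest server by great-circle distance and returns `None` when no server has the title, which is the case where the content would first have to come from S3. The server names, coordinates and catalogs are made up for illustration, and real traffic steering takes much more than distance into account.

```python
import math
from typing import Dict, Optional, Set, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_server(client: Coord, video_id: str,
                servers: Dict[str, Tuple[Coord, Set[str]]]) -> Optional[str]:
    """Return the nearest server that stores video_id, or None if none does."""
    candidates = [
        (haversine_km(client, location), name)
        for name, (location, catalog) in servers.items()
        if video_id in catalog
    ]
    return min(candidates)[1] if candidates else None


# Hypothetical servers: name -> (location, set of stored video ids)
SERVERS = {
    "oca-london": ((51.5, -0.13), {"show-a"}),
    "oca-mumbai": ((19.08, 72.88), {"show-a", "show-b"}),
}
paris = (48.86, 2.35)
print(pick_server(paris, "show-a", SERVERS))
```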
How does Netflix use microservices in the backend?
And what are critical and stateless services?
Adoption of microservices allows faster deployments as any change in services can be done easily. The performance of each service can be tracked and if there is any issue, then it can be quickly isolated from other running services.
Based on functionality, there are two types of services –
- Critical services
Critical services are those services that users interact with very frequently. These services are kept independent of other services so that even in case of any fail-over, users can continue to perform basic operations.
- Stateless services
Stateless services are those which serve API requests to clients, and these are deployed in such a way that they continue to work with other instances even if any server fails. This ensures high availability.
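Because stateless instances keep no per-user state, any healthy instance can serve any request. A minimal sketch of that idea is a round-robin balancer that skips failed instances; the class and instance names below are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


class RoundRobinBalancer:
    """Rotates requests across instances, skipping ones marked as down."""

    def __init__(self, instances: List[str]):
        self.healthy: Dict[str, bool] = {name: True for name in instances}
        self._ring = cycle(instances)
        self._size = len(instances)

    def mark_down(self, name: str) -> None:
        self.healthy[name] = False

    def mark_up(self, name: str) -> None:
        self.healthy[name] = True

    def next_instance(self) -> str:
        # Try each instance at most once per call.
        for _ in range(self._size):
            candidate = next(self._ring)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy instances available")


lb = RoundRobinBalancer(["api-1", "api-2", "api-3"])
lb.mark_down("api-2")  # simulate a server failure
print([lb.next_instance() for _ in range(4)])
```

Requests keep flowing to the remaining instances after one server fails, which is what makes this style of deployment highly available.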
REST APIs are mostly used to interact with the clients.
How does Netflix onboard new content?
Content is a movie or show in video format. To serve the content on various devices and at varying network speeds, a series of preprocessing steps is performed. This is termed encoding or transcoding, and it involves converting videos from one format to another, for example by changing the resolution or aspect ratio or by reducing the file size, with the aim of making the video compatible across multiple devices.
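To make the transcoding step concrete, here is a small sketch that builds one ffmpeg command per target rendition. It assumes ffmpeg as the encoder purely for illustration; the source does not say which tools Netflix uses, and the ladder of resolutions and bitrates below is invented. The commands are only constructed, not executed.

```python
from pathlib import Path
from typing import List, NamedTuple


class Rendition(NamedTuple):
    name: str
    height: int       # target frame height in pixels
    video_kbps: int   # target video bitrate


# Illustrative values only, not Netflix's actual encoding ladder.
LADDER = [
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 3000),
    Rendition("1080p", 1080, 5800),
]


def ffmpeg_command(source: str, rendition: Rendition) -> List[str]:
    """Build the argument list for transcoding source into one rendition."""
    stem = Path(source).with_suffix("")
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale=-2:{rendition.height}",  # keep aspect ratio, even width
        "-c:v", "libx264", "-b:v", f"{rendition.video_kbps}k",
        "-c:a", "aac",
        f"{stem}_{rendition.name}.mp4",
    ]


for r in LADDER:
    print(" ".join(ffmpeg_command("master.mov", r)))
```

Each command could be handed to a separate worker, so one uploaded source file fans out into several device- and bandwidth-specific versions.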
The range of questions covered in this section includes –
- Explain the database design of Netflix
- Show, with the help of a diagram, how the various operations of the platform interact with the DB.
- How are MySQL and NoSQL used as databases for operations?
- What is live and compressed viewing history?
Netflix uses different data stores comprising both SQL and NoSQL for different purposes.
MySQL databases are used for managing movie titles, billing, and transaction purposes.
To be specific, MySQL deployed on AWS EC2 is used to store this data.
MySQL runs with the InnoDB engine on large AWS EC2 instances. For data from the User Service, where strong ACID properties are needed, this RDBMS is an obvious choice.
Replication on this database is synchronous in a master-master setup: a write on the primary node is considered complete only once the data has been synchronised to both the local and remote nodes, which ensures high availability.
Read queries are handled by replicas rather than the primary (master) node; only write queries go to the master DB. In case of failover, the secondary node takes over as the master node and handles write queries.
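The read/write split and failover described above can be sketched in a few lines. This is an in-memory toy, not a MySQL client: the `Node` and `ReplicatedStore` classes are hypothetical, and a write is treated as acknowledged only after the primary and every live replica have applied it.

```python
import random
from typing import Dict, List, Optional


class Node:
    def __init__(self, name: str):
        self.name = name
        self.data: Dict[str, str] = {}
        self.alive = True


class ReplicatedStore:
    """Writes go to the primary, reads go to replicas (toy model)."""

    def __init__(self, primary: Node, replicas: List[Node]):
        self.primary = primary
        self.replicas = replicas

    def write(self, key: str, value: str) -> None:
        if not self.primary.alive:
            self._failover()
        # Synchronous replication: apply everywhere before acknowledging.
        live_replicas = [r for r in self.replicas if r.alive]
        for node in [self.primary] + live_replicas:
            node.data[key] = value

    def read(self, key: str) -> Optional[str]:
        live_replicas = [r for r in self.replicas if r.alive]
        node = random.choice(live_replicas) if live_replicas else self.primary
        return node.data.get(key)

    def _failover(self) -> None:
        # Promote the first live replica to primary.
        live_replicas = [r for r in self.replicas if r.alive]
        if not live_replicas:
            raise RuntimeError("no node available to promote")
        self.primary = live_replicas[0]
        self.replicas = [r for r in self.replicas if r is not self.primary]


store = ReplicatedStore(Node("primary"), [Node("replica-a"), Node("replica-b")])
store.write("plan:user-1", "premium")
store.primary.alive = False          # simulate a primary failure
store.write("plan:user-1", "basic")  # triggers promotion of replica-a
print(store.primary.name, store.read("plan:user-1"))
```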
Cassandra is a free, open-source, distributed column-based NoSQL database that can store large amounts of data across many servers. Netflix has a large global user base, so it needs such a database to store users’ viewing history. Cassandra handles large volumes of read requests efficiently and keeps their latency low.
As the user base grew, it became difficult to store so many rows of data, and it was also costly and slow. So Netflix redesigned how viewing history is stored, splitting it based on time frame and recent use.
1. LiveVH (Live Viewing History) – Only recent data with frequent updates is kept here. This smaller number of rows is stored in uncompressed form and is used for operations such as analysis and user recommendations after ETL (extract, transform, load). This keeps the database small and fast while still supporting those functions.
2. CompressedVH (Compressed Viewing History) – Older viewing history, which is updated only occasionally, is stored in compressed form. This reduces storage size, and only one column per row is stored.
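The split between live and compressed history can be illustrated with a short sketch. It assumes a per-user limit on uncompressed entries and rolls the oldest ones into a single compressed blob using `zlib`; the class name, the limit and the record fields are all hypothetical and only mirror the idea, not Netflix’s actual Cassandra schema.

```python
import json
import zlib
from typing import Dict, List


class ViewingHistory:
    """Keeps recent entries uncompressed and older ones in one compressed blob."""

    def __init__(self, live_limit: int = 3):
        self.live_limit = live_limit
        self.live: List[Dict] = []    # LiveVH: recent, frequently updated
        self.compressed: bytes = b""  # CompressedVH: older, rarely updated

    def record(self, title: str, position_sec: int) -> None:
        self.live.append({"title": title, "position_sec": position_sec})
        if len(self.live) > self.live_limit:
            self._roll_up()

    def _roll_up(self) -> None:
        # Move the oldest live entries into the compressed blob.
        overflow = len(self.live) - self.live_limit
        older = self._load_compressed()
        older.extend(self.live[:overflow])
        self.live = self.live[overflow:]
        self.compressed = zlib.compress(json.dumps(older).encode())

    def _load_compressed(self) -> List[Dict]:
        if not self.compressed:
            return []
        return json.loads(zlib.decompress(self.compressed))

    def full_history(self) -> List[Dict]:
        return self._load_compressed() + self.live


vh = ViewingHistory(live_limit=2)
for i in range(5):
    vh.record(f"episode-{i}", position_sec=60 * i)
print(len(vh.live), len(vh.full_history()))
```

Most reads only need the small live list, while the full history remains available by decompressing the blob when required.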
Searching and Data Processing
We will talk about –
- How is search implemented in Netflix?
- What is Elasticsearch, and how does the platform implement it?
- What happens after a user clicks on the video?
1. Search
Search is implemented using Elasticsearch, which lets users search for movies and series by title or by any metadata associated with the video. Elasticsearch provides full-text search and ranks the results based on recommendations, reviews and rankings at search time.
Another application of Elasticsearch is tracking users’ events in cases of failure (e.g. if a user is unable to play some video). The customer care team then uses Elasticsearch to resolve the issue.
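As an illustration of a full-text search over titles and metadata, the function below builds an Elasticsearch `multi_match` query body. The query structure follows the standard Elasticsearch query DSL, but the field names (`title`, `cast`, `genres`, `description`) and the boost on `title` are assumptions made for this sketch, not Netflix’s real index mapping.

```python
import json
from typing import Any, Dict


def build_title_search(text: str, size: int = 10) -> Dict[str, Any]:
    """Build a search request body that matches text across several fields."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "title^3" boosts matches in the title over other fields.
                "fields": ["title^3", "cast", "genres", "description"],
            }
        },
    }


print(json.dumps(build_title_search("stranger"), indent=2))
```

The resulting body would be sent to the search endpoint of the relevant index; results come back ordered by relevance score.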
2. Data Processing
Data processing covers all the events triggered after a user clicks on a video; the video has to be processed and streamed to the user with very low latency.
There are around 600 billion events daily, resulting in 1.5 PB of data, and during peak hours (evening and night) there are around 8 million events per second.
Events include UI activities, video viewing activities, error logs, troubleshooting, processing events and performance events in the backend.
Here comes the role of Big Data and Hadoop.
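A quick back-of-the-envelope calculation helps put the figures quoted above in perspective. It uses only the numbers from the text and assumes decimal units (1 PB = 10^15 bytes).

```python
EVENTS_PER_DAY = 600_000_000_000      # 600 billion events per day (from the text)
BYTES_PER_DAY = 1.5e15                # 1.5 PB per day, assuming 1 PB = 10**15 bytes
PEAK_EVENTS_PER_SEC = 8_000_000       # peak rate quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60

avg_events_per_sec = EVENTS_PER_DAY / SECONDS_PER_DAY
avg_bytes_per_event = BYTES_PER_DAY / EVENTS_PER_DAY
peak_to_average = PEAK_EVENTS_PER_SEC / avg_events_per_sec

print(f"average events/sec: {avg_events_per_sec:,.0f}")
print(f"average bytes/event: {avg_bytes_per_event:,.0f}")
print(f"peak vs average: {peak_to_average:.2f}x")
```

Under these assumptions the average works out to roughly 6.9 million events per second at about 2.5 KB per event, so the quoted peak of 8 million events per second is only about 15% above the daily average.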
Frequently Asked Questions
What is Netflix’s architecture?
Netflix uses microservices architecture.
What software does Netflix use?
Netflix uses a variety of open-source software in its backend, which includes – Java, MySQL, Gluster, Apache Tomcat, Cassandra, Hadoop and Hive.
How do I prepare for system design?
First, learn the basics and fundamentals of system design. Knowing the fundamentals will help you make the right decisions while designing a system. Once you understand the basics, move on to system design case studies such as Twitter, a URL shortener etc. You can refer to the system design guided path for detailed and structured content on system design.
Are system design interviews hard?
System design is one of the most difficult topics, and many candidates struggle to answer these questions during interviews. Solid preparation and case studies of how different tech giants design their systems can make it easier and help you ace interviews.
In this article, we covered the system design of Netflix. We started with the system’s requirements and then learned in detail about the various components, including Netflix’s cloud architecture, backend and the databases used.
Also, it is pertinent to know that you can extend the architecture according to the design goals.
Along with this, it is always good to study the system designs of Twitter, URL shortening services etc., to build the confidence and in-depth understanding needed to crack system design interviews.
This kind of setup helps deliver large, complex applications quickly and reliably.
The figure below is an overview of Netflix’s backend.
1. The Client sends a Play request to a Backend running on AWS. Netflix uses Amazon’s Elastic Load Balancer (ELB) service to route traffic to its services.
2. AWS ELB will forward that request to the API Gateway Service. Netflix uses Zuul as its API gateway, which is built to allow dynamic routing, traffic monitoring, security and resilience to failures at the edge of the cloud deployment.
3. The heart of Netflix’s operations is the Application API component, with various APIs for things like signing up or providing video recommendations. In this scenario, the Play request gets handled by the Play API.
4. Play API will call a microservice or a sequence of microservices to fulfill the request.
5. Microservices are mostly small, stateless programs, and there can be thousands of these services communicating with each other.
6. Microservices can save or get data from a data store during this process.
7. Microservices can send events for tracking user activities or other data to the Stream Processing Pipeline for either real-time processing of personalized recommendations or batch processing of business intelligence tasks.
8. The data coming out of the Stream Processing Pipeline can be persisted to other data stores such as AWS S3, Hadoop HDFS, Cassandra, etc.
9. To send push notifications, Netflix uses a distributed messaging queue.
10. Open Connect is Netflix’s custom global content delivery network (CDN). Its OCA (Open Connect Appliance) servers are placed inside internet service provider (ISP) and internet exchange point (IXP) networks around the world to deliver Netflix content to users.
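The request flow in the steps above can be sketched as a toy pipeline: a gateway routes the Play request to a Play API handler, which calls two microservices and emits an event for later processing. Every name here (the route table, the two services, the event fields and the example URL) is hypothetical and stands in for the real components such as Zuul and the Stream Processing Pipeline.

```python
from collections import deque
from typing import Callable, Deque, Dict

# Stand-in for the stream processing pipeline's input queue.
event_queue: Deque[dict] = deque()


def entitlement_service(user_id: str) -> dict:
    """Hypothetical microservice: may this user play content?"""
    return {"user_id": user_id, "can_play": True}


def playback_url_service(video_id: str) -> str:
    """Hypothetical microservice: where should the client stream from?"""
    return f"https://oca.example.invalid/{video_id}/manifest"


def play_api(request: dict) -> dict:
    """Play API: orchestrates a sequence of microservice calls."""
    entitlement = entitlement_service(request["user_id"])
    if not entitlement["can_play"]:
        return {"status": 403}
    url = playback_url_service(request["video_id"])
    # Emit an event for real-time or batch processing downstream.
    event_queue.append({"type": "play_started",
                        "user_id": request["user_id"],
                        "video_id": request["video_id"]})
    return {"status": 200, "stream_url": url}


# Route table of the gateway: path -> Application API handler.
ROUTES: Dict[str, Callable[[dict], dict]] = {"/play": play_api}


def gateway(path: str, request: dict) -> dict:
    """Gateway: dynamic routing to the matching API, 404 otherwise."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"status": 404}
    return handler(request)


print(gateway("/play", {"user_id": "u1", "video_id": "v42"}))
```

The client would then use the returned URL to stream the video from the CDN, while the queued event is consumed independently of the request path.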