
Designing Instagram-, LinkedIn-, and Facebook-like Applications

Hey Everyone,

Welcome to the first article of The GeekNarrator newsletter. I am excited to start this newsletter alongside the release of my 35th episode.

In this episode we discuss the most popular topic on earth: “System Design”. I am joined by Arslan Ahmad, the author of the popular “Grokking” series and the CEO & Co-Founder of DesignGurus.org.

Arslan has created a template, which he calls the master template, that can be used to approach many common system design problems, both in interviews and in the real world. Here is what the master template looks like:

System Design Master Template. credit: DesignGurus.org

 

Here is a brief introduction of some of the components involved:

𝗔𝗣𝗜 𝗚𝗮𝘁𝗲𝘄𝗮𝘆
An API Gateway (AG) is a server that acts as a single point of entry for a set of microservices. The AG receives client requests, forwards them to the appropriate microservice, and then returns that service’s response to the client. It is also responsible for cross-cutting tasks such as routing, authentication, and rate limiting.
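
Here is a minimal sketch of those three responsibilities in Python. Everything in it is an assumption made for illustration: the `ApiGateway` class, the two stand-in services, the token set used as "authentication", and the simple per-client request budget used as "rate limiting" (a real gateway would typically use time windows or token buckets).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


# Hypothetical handlers standing in for real microservices.
def user_service(path: str) -> str:
    return f"user-service handled {path}"


def feed_service(path: str) -> str:
    return f"feed-service handled {path}"


@dataclass
class ApiGateway:
    routes: Dict[str, Callable[[str], str]]  # path prefix -> service handler
    valid_tokens: Set[str]                   # stand-in for real authentication
    max_requests: int = 3                    # per-client budget, no time window
    _counts: Dict[str, int] = field(default_factory=dict)

    def handle(self, client_id: str, token: str, path: str) -> Tuple[int, str]:
        # 1. Authentication: reject unknown tokens before doing any work.
        if token not in self.valid_tokens:
            return 401, "unauthorized"
        # 2. Rate limiting: refuse clients that have used up their budget.
        used = self._counts.get(client_id, 0)
        if used >= self.max_requests:
            return 429, "too many requests"
        self._counts[client_id] = used + 1
        # 3. Routing: forward to the first service whose prefix matches.
        for prefix, service in self.routes.items():
            if path.startswith(prefix):
                return 200, service(path)
        return 404, "no route"
```

A client would call something like `gateway.handle("client-1", "secret", "/feed")` and never needs to know which service sits behind the `/feed` prefix.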

𝗖𝗗𝗡
A Content Delivery Network (CDN) is a distributed network of servers deployed in multiple locations around the world. These servers deliver web content, such as images, videos, and other static files, to users based on their geographical location. The main purpose of a CDN is to improve the performance and availability of web content by caching it on servers that are closer to the users requesting it.

𝗗𝗮𝘁𝗮 𝗣𝗮𝗿𝘁𝗶𝘁𝗶𝗼𝗻𝗶𝗻𝗴
In a database, 𝗵𝗼𝗿𝗶𝘇𝗼𝗻𝘁𝗮𝗹 𝗽𝗮𝗿𝘁𝗶𝘁𝗶𝗼𝗻𝗶𝗻𝗴, also known as sharding, involves dividing the rows of a table into smaller tables and storing them on different servers or database instances. This is done to distribute the load of a database across multiple servers and to improve performance.

On the other hand, 𝘃𝗲𝗿𝘁𝗶𝗰𝗮𝗹 𝗽𝗮𝗿𝘁𝗶𝘁𝗶𝗼𝗻𝗶𝗻𝗴 involves dividing the columns of a table into separate tables. This is done to reduce the number of columns in a table and to improve the performance of queries that only access a small number of columns.
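
To make the difference concrete, here is a small Python sketch of both ideas. The shard count, the choice of a hash-modulo scheme, and the function names `shard_for` and `split_columns` are assumptions for illustration, not something taken from the episode.

```python
import hashlib
from typing import Dict, Set, Tuple

NUM_SHARDS = 4  # assumed number of database instances


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Horizontal partitioning: pick the shard that stores this user's rows."""
    # A stable hash keeps the mapping identical across processes and restarts.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def split_columns(row: Dict, key: str, hot_columns: Set[str]) -> Tuple[Dict, Dict]:
    """Vertical partitioning: split one wide row into two narrower rows.

    Both halves keep the primary key so they can be joined back together.
    """
    hot = {k: v for k, v in row.items() if k == key or k in hot_columns}
    cold = {k: v for k, v in row.items() if k == key or k not in hot_columns}
    return hot, cold
```

One known trade-off of plain hash-modulo sharding is that changing the number of shards remaps most keys; consistent hashing is a common way to reduce that reshuffling.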

𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗺𝗲𝘀𝘀𝗮𝗴𝗶𝗻𝗴 𝘀𝘆𝘀𝘁𝗲𝗺𝘀
These are used to send messages between the distributed components of a system. Examples include Apache Kafka and RabbitMQ.
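
The sketch below shows the basic producer/consumer idea using Python's standard-library `queue` as an in-process stand-in. It is not the Kafka or RabbitMQ API; the `run_pipeline` function and the "thumbnail" task are hypothetical and only illustrate how a queue decouples the component that accepts work from the component that does it.

```python
import queue
import threading
from typing import List


def run_pipeline(uploads: List[str]) -> List[str]:
    tasks: queue.Queue = queue.Queue()
    processed: List[str] = []

    def worker() -> None:
        # Consumer: pulls messages off the queue and processes them one by one.
        while True:
            item = tasks.get()
            if item is None:  # sentinel value telling the worker to stop
                tasks.task_done()
                break
            processed.append(f"thumbnail:{item}")
            tasks.task_done()

    consumer = threading.Thread(target=worker)
    consumer.start()

    # Producer: enqueues work and moves on without waiting for processing.
    for name in uploads:
        tasks.put(name)
    tasks.put(None)

    tasks.join()     # wait until every message has been handled
    consumer.join()
    return processed
```

With a single consumer the messages are handled in the order they were enqueued; a real messaging system adds durability, multiple consumers, and delivery guarantees on top of this idea.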

𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗳𝗶𝗹𝗲 𝘀𝘆𝘀𝘁𝗲𝗺𝘀
These are file systems that are designed to store and manage files across a group of servers.

𝗡𝗼𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀 𝘀𝘆𝘀𝘁𝗲𝗺
These are used to send notifications or alerts to users, such as emails, push notifications, or text messages.

𝗙𝘂𝗹𝗹-𝘁𝗲𝘅𝘁 𝘀𝗲𝗮𝗿𝗰𝗵
Full-text search enables users to search for specific words or phrases within an app or website. When a user submits a query, the app or website returns the most relevant results. To do this quickly and efficiently, full-text search relies on an inverted index, a data structure that maps words or phrases to the documents in which they appear.
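
Here is a minimal sketch of an inverted index in Python. The function names and the simple lowercase tokenizer are assumptions for illustration, and the search only returns documents that contain every query word; it does not rank results by relevance the way a real search engine would.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each word to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

The lookup cost depends on the number of query words and the size of their posting sets, not on the total number of documents, which is why this structure scales well for search.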

 

As you can see, many components are involved in designing a scalable, reliable, and highly robust application. Each of them is a pattern that can be understood at a high level and applied to many different problems.

For example:

  • A load balancer is useful whenever you need to evenly distribute load across several machines.
  • An API Gateway is used to route traffic from the external world to the internal world of services, depending on the functionality required. It is also responsible for several other features, such as authentication and rate limiting.
  • Storage systems, such as file systems and databases, are used to persist data.
  • A queuing system is used to loosely couple different components and to process work asynchronously.
  • A CDN is used to serve static content across the globe in a low-latency and highly available manner.
  • A cache is used to avoid duplicate work, repeated requests, and unnecessary disk access.
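
Two of the patterns in this list, the load balancer and the cache, are small enough to sketch in a few lines of Python. This is only an illustration under assumptions: round-robin is just one of several balancing strategies, and `RoundRobinBalancer`, `load_profile`, and the `calls` counter are hypothetical names.

```python
import itertools
from functools import lru_cache
from typing import Iterable


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation so load is spread evenly."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._cycle = itertools.cycle(list(servers))

    def next_server(self) -> str:
        return next(self._cycle)


# Counts how often the "expensive" lookup actually runs.
calls = {"count": 0}


@lru_cache(maxsize=128)
def load_profile(user_id: int) -> str:
    # Stand-in for a slow database or disk read.
    calls["count"] += 1
    return f"profile-{user_id}"
```

Calling `load_profile` twice with the same `user_id` runs the expensive lookup only once; the second call is answered from the cache.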

And so on: many of these components and patterns are applicable across many problems.

We took Instagram as an example and discussed two user flows in detail:

  • Posting a photo/video
  • Generating the news feed

Some interesting discussion points were:

  • Making workflows asynchronous
  • File System vs Object stores
  • Feed generation service as a mutable infinite stream of posts
  • NoSQL vs SQL database choice.
  • Segregating metadata and data flows
  • Using a relational database to store images?
  • Amazing white-papers to read to understand storage systems in depth.

And many more…

Watch the full episode to dig in and join the amazing discussion we had.
