Design Facebook’s Newsfeed

Problem Statement:


Step-1: What is Facebook’s newsfeed?


Step-2: Requirements and Goals of the System


Step-3: Capacity Estimation and Constraints

Traffic estimates:
Memory estimates:


Step-4: System APIs

Once finalized with requirements, it’s always good to define the system APIs, this would explicitly state what is expected from the system.

Get UserFeed API
getUserFeed(api_dev_key, user_id, since_id, count, max_id, exclude_replies)


Step-5: Database Design


Step-6: High Level System Design

Feed generation:
Feed publishing:


Following is the high-level architecture diagram of our system. User B and C are following User A.


Step-7: Detailed Component Design

a) Feed Generation Service
SELECT FeedItemID FROM FeedItem WHERE SourceID in ( 
    SELECT EntityOrFriendID FROM UserFollow WHERE UserID = <current_user_id>
) ORDER BY CreationDate DESC LIMIT 100
Here are issues with this design for the feed generation service:
  1. Crazy slow for users with a lot of friends/follows as we have to perform sorting/merging/ranking of a huge number of posts.
  2. We generate the timeline when a user loads their page. This would be quite slow and have a high latency.
  3. For live updates, each status update will result in feed updates for all followers. This could result in high backlogs in our Newsfeed Generation Service.
  4. For live updates, the server pushing (or notifying about) newer posts to users could lead to very heavy loads, especially for people or pages that have a lot of followers. To improve the efficiency, we can pre-generate the timeline and store it in a memory.
Offline generation for newsfeed:
public class NewsFeed {
   LinkedHashMap<FeedItemID> feedItems;
   DateTime lastGenerated; 
}
How many feed items should we store in memory for a user’s feed ?
Should we generate (and keep in memory) newsfeed for all users ?



Let’s now discuss some solutions to our “live updates” problems in the following section.

b) Feed Publishing Service
How many feed items can we return to the client in each request ?
Should we always notify users if there are new posts available for their newsfeed ?


Step-8: Feed Ranking

High-level idea of ranking:


Step-9: Data Partitioning

a) Sharding Posts and Metadata
b) Sharding Feed Data




← Previous: Design Web Crawler

Next: Design Yelp / Nearby →