How Twitter was designed

Goal of Twitter

People create content (tweets) on the Twitter platform, and the jobs are

Twitter maintains different types of timelines

User Timeline (all the tweets that a particular user has sent)
Home Timeline (a temporal merge of the user timelines of everyone the user follows)
Query Timeline (mainly the Search API)

A tweet comes in and hits the right server
The fanout daemon queries the social graph service (Flock, which holds the followers and followings of every user)
It iterates through the home timelines of all the followers, stored in a Redis cluster (with a replication factor of 3), and adds the tweet id, the user id and some other information to each of them
Each user's home timeline in the Redis cluster is capped at 800 entries
The home timeline is also stored on disk for persistence
The user timeline is always stored on disk, because fetching it is a single query and therefore not latency-critical
Reads are therefore much faster than writes, which is the right trade-off because the read-to-write ratio of the request rate is > 50
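
The write path above (fan out on write, cheap read) can be sketched roughly as follows. This is a minimal in-memory illustration, not Twitter's implementation: the dictionaries stand in for Flock and the Redis cluster, and the names follow, fan_out and read_home_timeline are made up for the example. Only the cap of 800 comes from the notes.

```python
from collections import defaultdict, deque

HOME_TIMELINE_LIMIT = 800  # per-user cap stated in the notes

# Hypothetical in-memory stand-ins (assumptions for this sketch):
# followers      ~ the social graph service
# home_timelines ~ the per-user lists kept in the Redis cluster
# user_timelines ~ the per-author lists kept on disk
followers = defaultdict(set)
home_timelines = defaultdict(lambda: deque(maxlen=HOME_TIMELINE_LIMIT))
user_timelines = defaultdict(list)


def follow(follower_id, followee_id):
    """Record that follower_id follows followee_id."""
    followers[followee_id].add(follower_id)


def fan_out(tweet_id, author_id):
    """Write path: store the tweet id, then push it to every follower."""
    user_timelines[author_id].append(tweet_id)
    for follower_id in followers[author_id]:
        # Newest first; a full deque drops its oldest entry on appendleft.
        home_timelines[follower_id].appendleft((tweet_id, author_id))


def read_home_timeline(user_id, count=20):
    """Read path: one lookup, no merging at read time."""
    return list(home_timelines[user_id])[:count]
```

The cost of a write grows with the number of followers, while a read stays a single lookup, which is the asymmetry the notes describe.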

When a tweet comes in, Twitter tokenizes it and ingests it into an Earlybird machine (a modified version of Lucene); the entire Lucene index is stored in RAM
Blender (which creates the search timeline in Twitter) issues a search query to the Earlybird cluster, which internally queries every shard of Earlybird
Ranking is mainly based on the number of retweets and favorites
These retweet and favorite counts are updated in the activity timeline whenever a user retweets or favorites a tweet
Blender then gets the tweet ids from Earlybird, and recomputes and sorts the results based on ranking
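
The search path (query every shard, gather the ids, rank) can be sketched as below. It is a toy under stated assumptions: EarlybirdShard here is a plain in-memory inverted index, not the real Earlybird or Lucene API, and ranking by a single engagement number (retweets plus favorites) is a simplification of "mainly retweets and favorites".

```python
from collections import defaultdict


class EarlybirdShard:
    """Toy in-memory inverted index: token -> set of tweet ids."""

    def __init__(self):
        self.index = defaultdict(set)

    def ingest(self, tweet_id, text):
        # Tokenize on whitespace and index each lowercased token.
        for token in text.lower().split():
            self.index[token].add(tweet_id)

    def search(self, term):
        # Return a copy so callers cannot mutate the index.
        return set(self.index.get(term.lower(), ()))


def blend(shards, term, engagement):
    """Scatter the query to every shard, gather the ids, sort by engagement.

    engagement maps tweet id -> retweets + favorites (assumed scoring).
    """
    hits = set()
    for shard in shards:
        hits |= shard.search(term)
    # Highest engagement first; ties go to the larger (assumed newer) id.
    return sorted(hits, key=lambda tid: (engagement.get(tid, 0), tid), reverse=True)
```

Because a matching tweet can live on any shard, the blender has to ask all of them before it can sort, which is why the query fans out to every shard.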

When a user with millions of followers tweets, Twitter would have to store the tweet id in the home timeline of every follower, which is computationally expensive
The solution is to store tweet ids in followers' home timelines only if the author's number of followers is less than some threshold. When fetching a home timeline, Twitter does a temporal merge of the stored tweet ids with the user timelines of the followed accounts that have more followers than the threshold. This makes reads slower (tradeoff) (experimental phase)
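
A rough sketch of this hybrid approach, assuming every timeline is a list of (timestamp, tweet_id) tuples sorted newest first. FANOUT_THRESHOLD and both function names are hypothetical; the notes only say "some threshold".

```python
import heapq
from itertools import islice

FANOUT_THRESHOLD = 3  # hypothetical value, tiny so the example is easy to follow


def should_fan_out(follower_count):
    """Fan out on write only for authors below the follower threshold."""
    return follower_count < FANOUT_THRESHOLD


def read_hybrid_timeline(precomputed, celebrity_timelines, count=20):
    """Temporal merge at read time.

    precomputed         -- the stored home timeline (fanned-out tweets)
    celebrity_timelines -- user timelines of followed high-follower accounts
    All inputs must already be sorted newest first.
    """
    merged = heapq.merge(precomputed, *celebrity_timelines, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, count)]
```

The merge is lazy, so only the first count entries are ever pulled, but each read now touches one extra timeline per followed high-follower account. That is the read-side cost the tradeoff refers to.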

The tweet content itself is stored in the Tweetie-Pie service
From the home timeline we get the tweet ids and do a batch get on the Tweetie-Pie service
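
The last step, turning a timeline of ids into tweets with one batch lookup, might look like this. The tweet_store dictionary and the batch_get and hydrate functions are stand-ins invented for the sketch, not the real service API.

```python
# Hypothetical stand-in for the tweet store: tweet id -> tweet body.
tweet_store = {
    101: {"author": "alice", "text": "hello"},
    102: {"author": "bob", "text": "world"},
}


def batch_get(tweet_ids):
    """Fetch many tweets in one call; ids that are not found are omitted."""
    return {tid: tweet_store[tid] for tid in tweet_ids if tid in tweet_store}


def hydrate(timeline_ids):
    """Turn a timeline of ids into tweet bodies, keeping timeline order."""
    tweets = batch_get(timeline_ids)
    return [tweets[tid] for tid in timeline_ids if tid in tweets]
```

Keeping only ids in the timelines keeps each entry small, and a single batch call avoids one round trip per tweet when rendering a page.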