System Design¶
Resources¶
- DDIA (2017)
- Grokking the system design interview
- Alex Xu
- System Design Primer
- just a markdown file
- Gaurav System Design Playlist | Youtube
- Exponent | Youtube
- Problems Aggretation
- Database Schema Templates
Scaling¶
Vertical Scaling¶
- pros
- fast inter-process communication
- data consistent
- cons
- single point of failure
- hardware limit
Horizontal Scaling¶
- pros
- scale well
- cons
- slow inter-process communication
- need RPC between machines
- need load-balancing
- data inconsistency
- slow inter-process communication
Performance¶
- the performance of the 99.9th percentile is the most important
- slow requests -> have many data -> valuable customer
- Amazon:
-
- 100ms -> -1% revenue
-
- 1s -> -16% revenue
-
- service level agreements (SLAs)
- service is up when
- median < 200ms
- 99th < 1s
- service needs to be up > 99.9% of the time
- refund if not met
- service is up when
Maintenability¶
- Operability
- easy to operate
- Simplicity
- easy to understand
- Evolvability
- easy to make changes
Operability¶
To make things easier to operate
- provide observability
- avoid single point of failure
- -> no down time when updating a machine
- good documentation
- finite state machine of all the things that may happen
Simplicity¶
- complex -> bad operability & evolvability
- abstraction reduces accidental complexity
- accidental complexity = complexity rising not from design but implementation
- pros of abstraction
- hides implementation details
- can be reused
- benefits all apps using it when improved
- e.g. SQL
Database¶
One-to-Many Relationship¶
Object-Relational Mismatch Problem
Many real life data has one-to-many relationship e.g. a person may have several work experience, but relational tables aren't like that, so a translation layer between database model & application code is needed.
Solutions
- traditional SQL model: creating multiple tables with foreign keys to the common table
- cons
- bad locality: needs to perform multiple queries or multi-way join to get all data
- cons
- later SQL versions
- use JSON datatype
- pros
- good locality
- everything in one place
- only need one query
- good locality
- encode those as JSON and store as text
- cons
- can't query the keys in the text json
- cons
Many-to-One & Many-to-Many Relationship¶
Problems of document model
Data has a tendency of becoming more interconnected as features are added to applications.
Many tree-like one-to-many data actually become graph-like many-to-many relationship when scaling and adding new features.
e.g. a person has 2 schools in his profile page, but the schools should also have the person in their profile page
Document databases e.g. MongoDB, which excel at tree-like structures, have weak or no support for joins as one-to-many relationships don't need joins. This makes them not suitable for many-to-many relationships.
Problems of network model
Network model stores hierarchical as a graph. Each record can have multiple parents.
To query or update a record, the application code has to go through the graph from the root, making it complicated and inflexible.
Relational model the solution
- query optimizer does the heavy-lifting, not the application code
- declare a new index to query data in a new way
- query optimizer will automatically choose the best index
ACID¶
The 4 properties that define a transaction.
- atomicity
- consistency
- isolation
- transactions are isolated to each other
- durability
Typically a trait of RDBMS, while NoSQL DBs typically don't support it.
CAP¶
Only 2 of the 3 can be satisfied at the same time for any distributed data store.
- Consistency
- the read value is always the most recent
- Availability
- you can read whenever you want
- Partition tolerance
- the must in a distributed data store
Partition tolerance is a must, so the choice is between consistent & availability. Typically, RDBMS chooses consistency over availability, while NoSQL chooses availability over consistency.
NoSQL¶
Some nice properties
- horizontal scaling
- redundancy
- replicas
- data don't get removed instantly
- can undelete
Primary Key¶
- uuid (universally unique identifier)
- how to generate
- timestamp
- random
- pros
- unique
- stateless
- cons
- not intrinsically sortable
- performance issue for MySQL
- how to generate
- integer
- how to generate
- auto increment
- pros
- readable
- cons
- not unique across distributed systems
- stateful
- neet to consult db to know the next integer
- how to generate
Handling Hierarchical Data¶
If your data is a tree-like structure (undirected graph), you can design your db schema in multiple ways.
Multiple Tables¶
Use multiple tables with each non-root table having a foreign key referencing to the parent.
Analysis
- pros
- ?
- cons
- may have multiple tables storing the same kinds of information
- e.g. managers & SWEs are all employees
- may have multiple tables storing the same kinds of information
Adjacency List¶
Have a column recording to the parents.
Minimal schema example
- id
- parent_id
See SQL#Recursive Queries for quering a subtree
Analysis
- pros
- simple
- cons
- need recursive queries to get a subtree, slow in deep trees
Nested Sets¶
A segment tree. Record left & right values in a row.
Example table
id | name | left | right |
---|---|---|---|
1 | Electronics | 1 | 12 |
2 | Cell Phones | 2 | 7 |
3 | Smartphones | 3 | 4 |
4 | Basic Phones | 5 | 6 |
5 | Computers | 8 | 11 |
6 | Laptops | 9 | 10 |
7 | Desktops | 8 | 9 |
8 | TV & Home Theater | 11 | 12 |
9 | Books | 13 | 18 |
10 | Fiction | 14 | 15 |
11 | Non-Fiction | 16 | 17 |
12 | Music | 19 | 20 |
Exclusively use left
& right
to do queries (i.e. id
isn't involved when quering)
Analysis
- pros
- faster to query a deep subtree
- cons
- need to update the
left
&right
fields of many records when inserting or moving a node
- need to update the
Lineage Column / Path Enumeration / Materialized Path¶
Store the path in a field.
Example table
id | name | path |
---|---|---|
1 | Electronics | 1 |
2 | Cell Phones | 1/2 |
3 | Smartphones | 1/2/3 |
4 | Basic Phones | 1/2/4 |
5 | Computers | 1/5 |
6 | Laptops | 1/5/6 |
7 | Desktops | 1/5/7 |
8 | TV & Home Theater | 1/8 |
9 | Books | 9 |
10 | Fiction | 9/10 |
11 | Non-Fiction | 9/11 |
12 | Music | 12 |
Analysis
- pros
- very easy to query a subtree
- e.g.
LIKE '1/2/%'
- e.g.
- very easy to query a subtree
- cons
- very complex to move a node
- need to update the path of itself and all its children
- no referential integrity
- very complex to move a node
Closure Table / Bridge Table¶
Have a dedicated table recording the relationship.
Example relationship table
For 1 -> 2 -> 3
ancestor_id | node_id |
---|---|
1 | 1 |
1 | 2 |
1 | 3 |
2 | 2 |
2 | 3 |
3 | 3 |
(omitting primary key)
Analysis
- pros
- easy to select a subtree
- cons
- complex to move a node
- relationship table \(O(n^2)\) space worst case
References¶
- Hierarchical Data in SQL: The Ultimate Guide
- Hierarchical Database, multiple tables or column with parent id? | Stack Overflow
Replication¶
Leader-Based Replication¶
- 1 replica leader/master, others followers/slaves
- client only writes to leader, changes later replicated to followers (sync or async)
- followers are read-only to client
- built-in for relational databases & message queues of high availability (e.g. Kafka & RabbitMQ)
replication methods¶
- sync
- leader wait for response
- -> high durability
- async
- leader don't wait for response
- -> high avaibility
Leader-Based Replication often uses async replication
adding a new follower¶
- set up a follower node with a snapshot of the leader
- sync with the leader with leader's log
Load Balancing¶
request id -> hashed request id -> mod by n (# of servers) -> direct to the server
Cost of adding new servers¶
New servers -> some requests will be directed to a different server -> cache miss
Consistency Hashing¶
Goal: Minimize the cost when adding new servers, we want to as few mapping changed as possible.
Key concept: Randomly assign servers to a position in an arbitrary circle, and each of them serve the requests closest to them (in the counterclockwise direction)
Each object's location = hashed key mod # seats in the ring
reference
Sharding¶
Partitioning the database
Vertical Partitioning¶
- partition by different features
- e.g. profiles, messages, connections, etc.
- cons
- table can still be large -> need repartition
Key/Hash-Based Partitioning¶
- N servers -> mod by N
- cons
- add servers -> need reallocate all the data -> very expensive
Directory-Based Partitioning¶
- maintain a lookup table
- pros
- easy to add servers
- cons
- lookup table -> single point of failure
- constantly access lookup table -> impact performance
Problems¶
- join is expensive
- if the target data are located in different shards, than we need to join multiple queries, which is expensive
- fixed number of shards
- solution: #Consistency Hashing
- memcached
- solution: hierarchical sharding
- each shard partitioned into many mini-shards
- solution: #Consistency Hashing
Distributed DB System¶
Master-Slave Architecture¶
- Write to master -> replicate data to slave
- Read from either of the master or slaves
- Master dies -> elect a new master among the slaves
- e.g.
- DynamoDB
Peer to Peer Architecture¶
- e.g.
- Cassandra
Caching¶
Advantages of caching¶
- reduce network calls
- reduce repetitive calculations
- reduce DB load
In-memory cache vs. global cache¶
tradeoff: speed vs. accuracy
- in-memory cache
- fast
- data inconsistency across multiple caches
- global cache
- slower
- all clients access the same global cache so no data inconsistency problem
Data syncing policy¶
tradeoff: performance vs. accuracy
- write-through cache
- write to cache -> write to db
- problem: data inconsistency across multiple caches
- write-back cache
- write to db -> write to cache
- problem: performance problem since it has to write back to every cache
- hybrid
- uncritical data -> write-through
- critical data -> write-back
Microservices vs. Monolith¶
You probably don't need to use microservices.
Microservice Story - A collection of technical blog ahout microservice & monolith
Bad microservices design examples¶
Scaling up the Prime Video audio/video monitoring service and reducing costs by 90%
This is not a discussion of monolith vs serverless. This is some terrible engineering all over that was "fixed". - LASR
Async¶
- slow operations should be done asyncly
- pre-process
- have the thing ready before user access if being out-of-date is acceptable
Networking Metrics¶
- bandwidth
- max throughput
- throughput
- latency
MapReduce¶
- to process large amounts of data in parallel
- map: label
- input: raw data
- output: (key, value)
- reduceL aggregate and organize
- input: many (key, value)
- output: (key, value)
- e.g.
Storing passwords in the database¶
- db entry
- hash(password + salt)
- salt
- register flow
- randomly generate a salt -> combine with user entered password -> hash -> store hashing result & salt into database
- auth flow
- retrieve user's salt from db -> combine with entered password -> hash -> compare with db entry
- ref
Case Study¶
URL Shortening¶
https://www.educative.io/courses/grokking-the-system-design-interview
Twitter Timeline¶
source: DDIA
Approach 1
- write: post a tweet -> insert into global tweets
- simple
- read: sql join query to select tweets from those who the user follows
- complex
Suitable for
Approach 2
- write: post a tweet -> insert into the cache of each of the tweeter's followers
- complex
- read: fetch the cache
- simple
Suitable when # read reqs >> # write reqs
Hybrid
Approach 2 for tweets from the mass with little followers, and approach 1 for tweets from celebs
i.e. Tweets from celebs are indepedently fetched and then inserted into users' timeline cache