System Design: URL Shortener

1. Design Bit.ly
Topics: Scaling Reads
Bit.ly is a URL shortening service that converts long URLs into shorter, manageable links. It also provides analytics for the shortened URLs.
Functional Requirements
Features needed to satisfy the needs of the user.
In Scope
- Users should be able to submit a long URL and receive a shortened URL.
- Optionally, users should be able to specify a custom alias.
- Optionally, users should be able to specify an expiration date.
- Users should be able to access the original URL by using the shortened URL.
Out of Scope
- User authentication and account management
- Analytics on link clicks (e.g., click counts, geographic data).
Non-Functional Requirements
Specifications about how a system operates, rather than what tasks it performs. They define system attributes like scalability, latency, security, and availability, and are often framed as specific benchmarks (e.g., the system's ability to handle 100 million daily active users, or to respond to queries within 200 milliseconds).
Benchmarks
- System should ensure uniqueness for short codes (no two long URLs should map to the same short URL)
- Redirection should occur with minimal delay (< 100 ms)
- System should be reliable and available 99.99% of the time (availability > consistency)
- System should scale to support 1B shortened URLs and 100M DAU
Out of Scope
- Data consistency in real-time analytics
- Advanced security features like spam detection and malicious URL filtering.
Note: The important consideration in this system is the imbalance between reads and writes. The ratio is heavily skewed towards reads, since users access shortened URLs far more often than they create new ones (e.g., an average of 1,000 reads per create). Our caching strategies, database choices, and architecture will revolve around this.
Set Up
Entities
The core objects or concepts within a system which define its functionality and data.
In our case these are simple, but still worth listing out. The core entities are the long URL, the short URL, and the user.
API: the delivery framework
The contract between the client and the server and first point of reference in high level design. Usually map 1:1 with functional requirements.
- POST: Generate a new short URL
A POST endpoint to intake the long URL, optional alias, optional expiration date, and return the short URL.
// Generate short URL from long URL
Request: POST /urls
Body:
{
"long_url": "https://www.example.com/some/very/long/url",
"custom_alias": "optional_custom_alias",
"expiration_date": "optional_expiration_date"
}
Response: HTTP/1.1 201 Created
Body:
{
"short_url": "http://short.ly/abc123"
}
- GET: Retrieve an existing long URL
A GET endpoint that intakes the short url, and redirects the user to the original long URL.
// Using short URL, redirect to original long URL
Request: GET/{short_url}
Response: HTTP 302 Redirect to the original long URL
High Level Design Functional Requirement 1: User submits a long URL and receives a short URL
Core components
Core components high level:
- Client: Users interact with system through web or mobile app
- Primary Server: Receives requests from client and handles logic
- Database: Stores mapping of short URLs to long URLs, user generated aliases, and expiration dates
POST Diagram
User submits a long URL and receives a short URL:
+--------+
Request: | Client | Response:
POST/urls +--------+ HTTP 201: Created
{ ^ {
"long_url": ... | "short_url": "http://short.ly/abc123"
"custom_alias": ... | }
"expiration_date": ... | or
} | HTTP 400/409/500 (failure)
| {
| "error": "..."
| }
v
+------------------+
Write(): | Primary Server | On Success:
1) generate short URL +------------------+ Return short URL
2) SQL Query (INSERT) ^ On Failure:
| Return error msg
|
|
v
+-----------+
Urls Table | Database | SQL Insert:
{ +-----------+ INSERT INTO Urls (...)
string short_url
string original_url
datetime creation_time
datetime expiration_time
created_by
}
POST Processing Steps
- Client Request()
Client sends a POST request to /urls with the long URL and, optionally, a custom alias and expiration date.
- Primary Server processes Request()
Validates that it is a real URL (to avoid shortening junk input), via an open-source library like is-url or our own validation logic
Checks that the long URL does not already exist in our system (to avoid duplicate entries), by querying our database for it
- Primary Server Validation()
If the URL is valid and does not already exist, we can generate a short URL. Abstract this to a magic function for now (a rough sketch follows these steps)
If the user specified a custom alias, we use that as the short code instead
- Primary Server Insert Database()
Once the short URL is generated, we insert it into our database, storing the short URL, long URL, and expiration date
- Primary Server Response()
Return the short URL to the client in the response body.
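To make the write path concrete, here is a minimal Python sketch of these steps. The names are assumptions for illustration only: url_store is an in-memory dict standing in for the Urls table, and generate_short_code() is a throwaway placeholder for the "magic function" (revisited in Deep Dive 1).

import secrets
import string
from urllib.parse import urlparse

url_store: dict[str, str] = {}  # short_code -> long_url (stand-in for the database)

def generate_short_code(length: int = 6) -> str:
    # Placeholder only; Deep Dive 1 replaces this with a counter + base62 scheme.
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def is_valid_url(url: str) -> bool:
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

def create_short_url(long_url: str, custom_alias: str | None = None):
    if not is_valid_url(long_url):                      # step 2: validate the input URL
        return 400, {"error": "invalid URL"}
    if long_url in url_store.values():                  # step 2: reject URLs we already shortened
        return 409, {"error": "URL already shortened"}
    short_code = custom_alias or generate_short_code()  # step 3: custom alias wins if provided
    url_store[short_code] = long_url                    # step 4: persist the mapping (a DB INSERT in practice)
    return 201, {"short_url": f"http://short.ly/{short_code}"}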
High Level Design Functional Requirement 2: User can access original long URL by using short URL
Core components
Core components high level:
- Client: Users interact with system through web or mobile app
- Primary Server: Receives requests from client and handles logic
- Database: Stores mapping of short URLs to long URLs, user generated aliases, and expiration dates
GET Processing Steps
- User Request()
The user's browser sends a GET request to the server with the short URL, e.g., GET /abc123
- Primary Server Processes Request()
The primary server receives this request and looks up the short code in the database
- Primary Server Lookup and Validation()
If the short code is found and not expired, the server retrieves the corresponding long URL.
- Primary Server Response()
The server sends an HTTP redirect response to the user's browser, pointing it to the long URL. There are two main HTTP redirect codes we could use for this purpose:
- HTTP/1.1 301 (Moved Permanently): Indicates the resource has permanently moved to the target URL. Browsers typically cache this response, so future requests for the short URL go directly to the long URL, bypassing our server.
- HTTP/1.1 302 (Found): Indicates the resource has temporarily moved to the target URL. Browsers do not cache this response, so future requests for the short URL continue to go through our primary server.
We will use 302 redirect because:
- It prevents browser caching and gives us direct control over the redirection process, so we can update or expire links as needed
- It allows us to track click statistics for each short URL, since every request is forced to go through our primary server
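A minimal sketch of the redirect path, reusing the hypothetical url_store dict from the write-path sketch above; it returns the status code and headers the server would send.

def redirect_short_url(short_code: str):
    long_url = url_store.get(short_code)  # look up the short code (cache/DB query in practice)
    if long_url is None:                  # unknown or expired code
        return 404, {"error": "short URL not found"}
    # 302 rather than 301, so browsers keep coming back to us and we retain control
    return 302, {"Location": long_url}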
GET Diagram
User requests a short URL and is redirected to the original URL:
+--------+
Request: | Client | Response:
GET/{short_url} +--------+ HTTP 302: Found
^ {
| "Location": "https://www.example.com/"
| }
| or
| HTTP 404 (resource not found)
| {
| "error": "..."
| }
v
+------------------+
Read(): | Primary Server | On Success:
1) QUERY short URL +------------------+ Return 302 redirect
2) Redirect on match ^ On Failure:
| Return 404 (resource not found)
|
|
v
+-----------+
Urls Table | Database |
{ +-----------+
string short_url
string original_url
datetime creation_time
datetime expiration_time
created_by
}
Deep Dive 1: Short and Unique URLs
Constraints
We need to ensure:
- Each short URL should be unique
- Keep short URLs as short as possible
- Efficiently generate short URLs
Bad Solution: Long URL Prefix
Approach:
The simplest thing we could do is take a prefix of the input as the short code.
www.linkedin.com/ -> www.short.ly/www.link
Challenges:
This would break constraint 1, as different long URLs could end up with the same short code.
Good Solution: Random Number Generator or Hash Function
Approach:
To add entropy (randomness) and help ensure our codes are unique, we could try a random number generator or a hash function.
Challenges:
Even with entropy, there is a non-negligible chance of generating duplicate short codes, because constraint 2 keeps the codes short. This forces us to either generate longer codes (breaking constraint 2) or check every new code against existing ones, which becomes a bottleneck as the system scales (breaking constraint 3).
Great Solution: Unique Counter with Base62 Encoding
Approach:
One solution that guarantees no collisions is to increment a counter for each new URL. We then take the counter value and encode it using base62 to keep the representation short.
Each counter value would be unique which then eliminates the risk of collisions.
Challenges:
In a distributed environment, maintaining a single global counter can be challenging due to synchronization issues. All instances of our primary server would need to agree on the counter value.
Challenges:
Regarding length: 6-character base62 codes cover 62^6 ≈ 56.8 billion values, so 1B URLs fit comfortably in 6 characters.
We would only need to move to 7-character codes (62^7 ≈ 3.5 trillion) once we pass roughly 56 billion URLs.
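A minimal sketch of base62-encoding a counter value. The alphabet ordering here is an assumption; any fixed ordering of 0-9, a-z, A-Z works, as long as every server uses the same one.

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])  # least-significant digit first
    return "".join(reversed(digits))

print(len(base62_encode(1_000_000_000)))  # 6 -- the billionth URL still fits in 6 characters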
Deep Dive 2: Fast Redirection to Original URL
Constraints
Within the context of a large database of shortened URLs, we need to ensure:
- Find correct long URL quickly for smooth user experience
- Optimizations to avoid 'full table scans' on short URL keys
Good Solution: Add an Index
Approach:
To avoid full table scans, we can use indexing. An index will create a separate, sorted list of short URLs, each with a pointer to the full information in the database table. This allows us to use efficient search methods reducing lookup time.
- B-Tree Indexing: A B-Tree is a specialized m-way tree designed to optimize data access, especially on disk-based storage systems.
+--------+
| 30 |
+--------+
/ \
+--------+--------+ +--------+--------+
| 10 | 20 | | 40 | 50 |
+--------+--------+ +--------+--------+
/ | | / | |
... ... ... ... ... ...
B-Trees allow for fast data retrieval and updates by maintaining a balanced tree.
In our case, we would create a B-Tree index on the short code column providing O(log n) lookup.
- Primary Key
For our database, we designate the short URL as the primary key of our table. This gives us the benefits of indexing plus enforced uniqueness of short URLs.
- Hash Indexing
For databases that support it (e.g., PostgreSQL), we can use hash indexing on the short URL column. This provides O(1) average lookup time.
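A minimal sketch of the two options above, assuming PostgreSQL accessed via psycopg2 and a hypothetical connection string; in practice you would pick one of the two, since the primary key already gives you a B-Tree index.

import psycopg2

conn = psycopg2.connect("dbname=shortener")  # hypothetical DSN
with conn, conn.cursor() as cur:
    # Option 1: short_url as the primary key -> uniqueness plus an implicit B-Tree index.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS urls (
            short_url       TEXT PRIMARY KEY,
            original_url    TEXT NOT NULL,
            creation_time   TIMESTAMPTZ DEFAULT now(),
            expiration_time TIMESTAMPTZ,
            created_by      TEXT
        )
    """)
    # Option 2: an explicit hash index for O(1) average-case equality lookups.
    cur.execute("CREATE INDEX IF NOT EXISTS urls_short_url_hash ON urls USING hash (short_url)")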
Challenges:
Relying solely on a disk-based database for redirects will present some challenges.
A typical SSD can handle ~100,000 input/output operations per second (IOPS).
The issue is the volume of read operations for our application. With 100M DAU, assuming each user performs 5 redirects per day:
100,000,000 users * 5 redirects = 500,000,000 redirects per day
86,400 seconds per day
500,000,000 / 86,400 seconds ~= 5,787 redirects per second
And this is assuming evenly distributed reads() throughout the day which is unrealistic. Most reads() will occur during peak hours, meaning we need to design for high traffic spikes.
If a traffic spike brings 100x more reads:
(500,000,000 * 100) / 86,400 seconds ~= 578,700 redirects per second
A single database will struggle to keep up with this volume of traffic. A high read load would lead to increased response times, potential timeouts, and could slow down other database operations like writes.
Great Solution: Implementing an In-Memory Cache (e.g., Redis)
Approach:
To improve redirect speed, we can add an in-memory cache like Redis or Memcached which will sit in between the primary server and database.
The cache will store frequently accessed mappings of short codes to long URLs. When a redirect request comes in, the server will first check the cache.
If the short code is found (cache hit), the server grabs the long URL from the cache, significantly reducing read latency.
If the short code is not found (cache miss), the server queries the database, retrieves the long URL, and stores it in the cache for future requests.
Note: we are mapping directly from memory instead of disk
Memory access time: ~100 nanoseconds (0.0001 ms)
SSD access time: ~0.1 milliseconds
HDD access time: ~10 milliseconds
This makes memory access roughly 1,000 times faster than SSD, which raises our achievable read throughput even further:
Memory: 1,000,000 IOPS
SSD: ~100,000 IOPS
HDD: ~100-200 IOPS
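A minimal cache-aside sketch, assuming redis-py is installed and reusing the hypothetical url_store dict from earlier as the stand-in database; the TTL value is also an assumption.

import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_SECONDS = 3600  # assumed TTL; pairs with an eviction policy such as LRU

def resolve(short_code: str):
    cached = cache.get(short_code)        # 1) check the in-memory cache first
    if cached is not None:                # cache hit: no disk access needed
        return cached
    long_url = url_store.get(short_code)  # 2) cache miss: fall back to the database
    if long_url is not None:
        cache.set(short_code, long_url, ex=CACHE_TTL_SECONDS)  # 3) populate the cache for next time
    return long_url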
User requests a short URL and is redirected to the original URL:
+--------+
Request: | Client | Response:
GET/{short_url} +--------+ HTTP 302: Found
^ {
| "Location": "https://www.example.com/"
| }
| or
| HTTP 404/410/500 (failure)
| {
| "error": "..."
| }
v
+------------------+ +-------------+
Read(): | Primary Server | < ---------------- > | Redis Cache |
1) QUERY short URL +------------------+ +-------------+
2) Redirect on match ^ Purpose:
| - Server checks cache before querying DB
| - Fast key-value store
| - Maps short_url_code → original_url
v
+-----------+
Urls Table | Database | SQL Query:
{ +-----------+ SELECT * FROM Urls WHERE short_url_code = ...
string short_url
string original_url
datetime creation_time
datetime expiration_time
created_by
}
Challenges:
Implementing an in-memory cache offers clear performance improvements.
It also adds the need to handle complex cases, such as cache invalidation on updates or deletions (which for us can more or less be ignored, since URLs are read-heavy and rarely change).
The cache also needs to 'warm up' before the performance benefits kick in, so initial requests still hit the database until the cache is populated.
Memory limits also require decisions about cache size and eviction policies (e.g., LRU).
Overall, the cache offers real performance benefits, but adds tradeoffs and invalidation strategies that need to be worked out.
Great Solution: Leveraging Content Delivery Networks (CDNs) and Edge Computing
Deep Dive 3: Scaling to 1B shortened URLs and 100M DAU
Constraints
Let's start with the size of the database.
Long URL Table:
| ------------------------- |
Indexing --> | Short URL (~8 bytes) |
| Long URL (~100 bytes) |
| creation Time (~8 bytes) |
| custom alias (~100 bytes) |
| expiration date (~8 bytes)|
| ------------------------- |
Leading to ~200 bytes per row
Rounding up to ~500 bytes to account for additional metadata
such as creator id, analytics id, etc.
If we store 1B shortened URLS we are looking at around:
~500 bytes * 1B rows = 500 GB of data
500 GB is well within the capabilities of modern SSDs. And since the number of URLs is bounded by the number of URLs on the internet, we can expect growth to be modest.
If we were to hit the hardware limit, we could shard our data across multiple servers, but for now a single Postgres instance as mentioned previously should be ok.
To estimate, we could say 100k new urls are created per day.
100k urls per day
86,400 seconds in day
100,000 / 86,400 ~= 1 insert per second
Most database technologies can handle this, since we have already addressed the heavy read throughput with caching and our write throughput is low.
Challenges:
What if the DB goes down?
- Database Replication
By using a database like Postgres that supports replication, we can create multiple identical copies of our database on different servers. If one goes down we can redirect to another.
This adds complexity to our system as we need to ensure our Primary Server can interact with any replica without issues, adding operational overhead.
- Database Backup
We could implement a backup system that periodically takes snapshot of our database and stores it in a separate location.
This adds complexity to our system as we need to ensure our Primary Server can interact with the backup without any issues, adding operational overhead.
Approach:
What about our Primary Server?
We can split our Primary Server by separating read() and write() operations: a microservice architecture where Read Services handle redirects while Write Services handle the creation of new short URLs. This allows us to scale each service independently based on its specific demand.
We can add multiple instances of the Read() and Write() services to horizontally scale and distribute our load across multiple servers.
This allows us to handle a large number of requests per second without overloading a single server; when a new request comes in, it is routed to one of the available instances of the appropriate service.
Approach:
What about our counter?
Horizontally scaling our write services means that we need a single source of truth for the counter as it needs to be accessible to all instances of the Write Service so they can agree on the next value.
We can solve this with a centralized Redis instance that stores the counter (and any other metadata shared across all instances of the Write Service). Redis is single-threaded, very fast for this use case, and supports atomic increment operations ('all or nothing' units of work that add a value to a memory location), so every Write Service instance can increment the global counter without conflicts.
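A minimal sketch of the shared counter, assuming redis-py and the base62_encode helper sketched in Deep Dive 1; because INCR is atomic, no two Write Service instances can ever receive the same value.

import redis

counter_store = redis.Redis(host="localhost", port=6379)

def next_short_code() -> str:
    value = counter_store.incr("url_counter")  # atomic increment; returns the new counter value
    return base62_encode(value)                # stays at 6 characters until ~56.8B URLs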
Global Counter():
+--------+
| Client |
+--------+
^
|
|
v
+------------------+
- Route to microservice | API Gateway |
+------------------+
/ \
/ \
/ \
/ \
/ \
+------------------+ +------------------+
| Read Service | | Write Service |
+------------------+ +------------------+
/ \ / \
/ \ / \
/ \ / \
/ \ / \
+--------------+ +------------+ +----------------------+
| Redis Cache | | Database | | Redis Global Counter |
+--------------+ +------------+ +----------------------+
keyValue pair: URL Table: - atomic increment operation
- key: short_url - short_url
- value: long_url - long_url
- creation_time
- expiration_time
- created_by
Challenge:
Should we be concerned about network overhead? Not really; a single network request is negligible compared to the other operations in the system. However, we could also use 'counter batching' to reduce network requests.
Instead of incrementing the Redis global counter once per URL, each instance of the Write Service reserves a batch of 1,000 values at a time. A Write Service can then hand out those 1,000 values locally without contacting Redis for every new URL.
With batched counters, if a Write Service goes down before using all 1,000 of its local values, the global counter in Redis has already been incremented, so those unused values are simply lost.
This is not an issue, however: 6-character codes cover roughly 56 billion values before we would need 7-character short codes (refer to the short code section), so we can tolerate a fair amount of loss from abandoned batches.
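A minimal counter-batching sketch under the same redis-py assumption: each Write Service instance reserves a block of 1,000 values with a single INCRBY and hands them out locally; values left unused when an instance dies are simply skipped, as described above.

import redis

BATCH_SIZE = 1000
redis_client = redis.Redis(host="localhost", port=6379)
next_value = 0    # next local value to hand out
ceiling = 0       # exclusive upper bound of the current local batch

def next_counter_value() -> int:
    global next_value, ceiling
    if next_value >= ceiling:
        # One network round trip reserves BATCH_SIZE values: [new_total - BATCH_SIZE, new_total).
        ceiling = redis_client.incrby("url_counter", BATCH_SIZE)
        next_value = ceiling - BATCH_SIZE
    value = next_value
    next_value += 1
    return value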