Author: Nancy Agarwal
Original post on Foojay: Read More
A Practical Introduction to Horizontal Scaling
When building applications, most developers start with a single database server.
At the beginning, everything works perfectly.
Your application might have:
- A few thousand users
- Manageable traffic
- Datasets that easily fit on one machine
But as your application grows, something interesting starts to happen.
Queries take longer.
Write operations slow down.
The database server starts hitting CPU, RAM, or storage limits.
At this stage, many engineers ask an important question:
Should we upgrade the server or scale the database differently?
This is where horizontal scaling and sharding come into the picture.
If you’re using MongoDB, sharding is the mechanism that allows your database to scale beyond the limits of a single machine.
In this article, we’ll walk through:
- What sharding actually is
- Why horizontal scaling matters
- How MongoDB implements sharding
- When you should (and shouldn’t) use it
The Scaling Problem Most Databases Face
Imagine your application stores user data in a database.
Initially, the architecture looks like this:
Application
│
Database Server
All reads and writes go to one machine.
The usual first response to growth is vertical scaling, where you keep upgrading the same server by adding:
- More CPU
- More RAM
- Faster storage
While this works for a while, vertical scaling eventually hits limits:
- Hardware upgrades become expensive
- There is always a maximum server size
- Downtime may be required during upgrades
Eventually, a single server becomes a bottleneck.
Instead of making one machine bigger, the better approach is to add more machines.
This approach is called horizontal scaling.
What is Horizontal Scaling?
Horizontal scaling means distributing data across multiple servers rather than relying on a single server.
Instead of storing all data on a single machine:
Server A
2 TB of data
You distribute the data:
Server A → 500 GB
Server B → 500 GB
Server C → 500 GB
Server D → 500 GB
Each server stores only part of the dataset.
This is exactly what sharding does.
What is Sharding in MongoDB?
Sharding is the process of splitting large datasets across multiple database servers.
Each server stores a portion of the data, called a shard.
For example, imagine an application storing millions of users.
Instead of keeping all users on one server:
| Shard | Data |
| Shard 1 | Users with IDs 1 to 1,000,000 |
| Shard 2 | Users with IDs 1,000,001 to 2,000,000 |
| Shard 3 | Users with IDs 2,000,001 to 3,000,000 |
Each shard contains only a subset of the collection.
When queries come in, MongoDB determines which shard contains the relevant data.
This allows the database to handle massive datasets and high traffic efficiently.
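To make the idea concrete, here is a minimal sketch of range-based placement in plain JavaScript. The range table and the shardForUser function are hypothetical names used for illustration, and the sketch assumes each range includes its upper bound. MongoDB keeps this kind of mapping in cluster metadata, so application code never has to do this itself.

```javascript
// Hypothetical range table mirroring the example above: each shard owns a
// contiguous, non-overlapping range of user IDs (min inclusive, max exclusive).
const ranges = [
  { shard: "Shard 1", min: 1, max: 1000001 },
  { shard: "Shard 2", min: 1000001, max: 2000001 },
  { shard: "Shard 3", min: 2000001, max: 3000001 },
];

// Return the name of the shard whose range contains userId,
// or null if no range covers it.
function shardForUser(userId) {
  const match = ranges.find((r) => userId >= r.min && userId < r.max);
  return match ? match.shard : null;
}

console.log(shardForUser(42));      // Shard 1
console.log(shardForUser(1000001)); // Shard 2
console.log(shardForUser(2500000)); // Shard 3
```

Because the ranges do not overlap, every user ID maps to exactly one shard.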
MongoDB Sharded Cluster Architecture
A sharded cluster in MongoDB consists of three main components: shards, config servers, and mongos routers.
1. Shards
Shards are where the actual data is stored.
Each shard is deployed as a replica set to ensure high availability and fault tolerance (MongoDB has required this since version 3.6).
2. Config Servers
Config servers store metadata about the cluster.
They maintain information such as:
- Which shard contains which data
- How data is distributed
- Shard key ranges
Without config servers, the cluster would not know where data lives.
3. Mongos Router
Applications do not connect directly to shards.
Instead, they connect to mongos, which acts as a query router.
Its responsibilities include:
- Receiving application queries
- Determining which shard contains the data
- Forwarding the query to the relevant shard or shards and merging the results
A simplified architecture looks like this:
      Application
           │
        Mongos
       /   |   \
Shard1  Shard2  Shard3
This abstraction means the application does not need to know where the data is stored.
Choosing a Shard Key
A shard key determines how data is distributed across shards.
For example:
{ userId: 1 }
MongoDB uses the shard key to decide which shard a document belongs to.
Choosing a shard key is one of the most critical decisions in a sharded architecture.
A good shard key should:
- Distribute data evenly
- Avoid hotspots
- Support common query patterns
For example, if most queries are based on userId, using it as the shard key makes sense.
However, a field with few distinct values, such as country, can create imbalanced shards: if most users are from one region, their documents pile up on the same shard.
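A quick way to build intuition for this is to count how many documents each shard would receive under different candidate keys. The sketch below is an assumption-heavy toy: the sample data is made up, and toyShardIndex is a simple illustrative hash, not the placement logic MongoDB actually uses.

```javascript
// Toy placement function: hash the key's string form and take it modulo the
// shard count. This is an illustration only, not MongoDB's hashing scheme.
function toyShardIndex(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return hash % shardCount;
}

// Count how many documents each shard would receive for a given key field.
function distribution(docs, keyField, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[toyShardIndex(doc[keyField], shardCount)] += 1;
  }
  return counts;
}

// Made-up sample: 1,000 users, 90% of them in one country.
const users = Array.from({ length: 1000 }, (_, i) => ({
  userId: i + 1,
  country: i < 900 ? "IN" : i < 950 ? "US" : "NL",
}));

const byUserId = distribution(users, "userId", 3);
const byCountry = distribution(users, "country", 3);

console.log("by userId: ", byUserId);
console.log("by country:", byCountry);
```

With only three distinct country values, at least 900 of the 1,000 documents must land on a single shard no matter how the values are hashed, while the high-cardinality userId spreads out far more evenly.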
Creating a Sharded Collection
Let’s look at a simple example.
First, enable sharding for a database.
sh.enableSharding("companyDB")
Next, shard a collection.
sh.shardCollection(
  "companyDB.employees",
  { employeeId: 1 }
)
MongoDB will now automatically distribute documents across shards.
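To check the result, mongosh offers sh.status() and db.collection.getShardDistribution(). The sketch below wraps them in a small helper; inspectSharding is a hypothetical name, and the helper assumes it is given the shell's own sh and db objects from a session connected to mongos.

```javascript
// Hypothetical helper for mongosh: pass in the shell's `sh` and `db` globals.
function inspectSharding(sh, db) {
  // Print an overview of the cluster, including its shards and sharded collections.
  sh.status();
  // Print how the employees collection's data is spread across shards.
  db.getSiblingDB("companyDB").employees.getShardDistribution();
}

// In a mongosh session connected to mongos, you would call:
// inspectSharding(sh, db);
```

On a freshly sharded collection, most data may still sit on one shard until the balancer has had time to move it.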
Querying Data in a Sharded Cluster
One of the nice things about sharding in MongoDB is that application queries remain the same.
For example:
db.employees.find(
  { department: "Engineering" },
  { name: 1, managerName: 1, departmentName: 1 }
)
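In this example the filter uses department rather than the shard key (employeeId in the earlier example), so mongos cannot tell which shard holds the matching documents.
It sends the query to every shard and merges the results.
When a query includes the shard key, mongos can route it to only the shard or shards that own the matching data.
From the application’s perspective, it still feels like one database.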
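How many shards a query touches depends on whether its filter includes the shard key. The following toy router sketches that decision for simple equality filters. The chunk table and the targetShards function are hypothetical, and the sketch assumes the collection is sharded on employeeId with one contiguous range per shard.

```javascript
// Hypothetical cluster metadata: the collection is sharded on employeeId,
// and each shard owns one contiguous range (min inclusive, max exclusive).
const shardKey = "employeeId";
const chunks = [
  { shard: "shard1", min: 0, max: 1000 },
  { shard: "shard2", min: 1000, max: 2000 },
  { shard: "shard3", min: 2000, max: Infinity },
];

// Decide which shards a simple equality filter must be sent to.
function targetShards(filter) {
  if (shardKey in filter) {
    const value = filter[shardKey];
    // Targeted query: only the shard that owns the key's range is contacted.
    return chunks
      .filter((c) => value >= c.min && value < c.max)
      .map((c) => c.shard);
  }
  // No shard key in the filter: the query is broadcast to every shard.
  return chunks.map((c) => c.shard);
}

console.log(targetShards({ employeeId: 1500 }));          // shard2 only
console.log(targetShards({ department: "Engineering" })); // all three shards
```

Broadcast queries still return correct results, but they do more work per query, which is one reason the shard key should match your most common query patterns.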
When Should You Use Sharding?
Sharding is powerful, but it should be introduced only when needed.
Here are common situations where sharding makes sense.
Large datasets
If your dataset grows into hundreds of gigabytes or terabytes, a single server may not be sufficient.
Examples include:
- Analytics platforms
- Log storage systems
- IoT platforms
High write throughput
Applications that generate large numbers of writes can benefit from sharding because writes can be distributed across multiple nodes.
Examples include:
- Event tracking systems
- Gaming platforms
- Social media feeds
Rapid data growth
If you expect your dataset to grow rapidly, designing the system with sharding in mind early can spare you major architectural changes later.
When Sharding Might Be Overkill
Despite its benefits, sharding adds operational complexity.
You probably don’t need sharding if:
- Your dataset is relatively small
- Your workload is moderate
- Vertical scaling still works
Many applications run perfectly fine with replication and proper indexing.
Sharding should usually be considered after other scaling strategies have been exhausted.
Sharding vs Replication
Developers sometimes confuse these two concepts.
| Feature | Replication | Sharding |
| Purpose | High availability | Horizontal scaling |
| Data | Same data on every node | Data split across nodes |
| Reads and writes | Can scale reads | Scales reads and writes |
| Storage | Data duplicated | Data distributed |
In practice, MongoDB often uses both together.
Each shard is typically configured as a replica set, ensuring both scalability and fault tolerance.
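The difference in the Data and Storage rows can be sketched in a few lines of plain JavaScript. Both functions are illustrative stand-ins: the round-robin placement in shard is an assumption made for brevity, whereas MongoDB places documents according to the shard key.

```javascript
// Replication: every node holds a full copy of the data.
function replicate(docs, nodeCount) {
  return Array.from({ length: nodeCount }, () => [...docs]);
}

// Sharding: each document lives on exactly one node.
// Round-robin placement is a stand-in; MongoDB places documents by shard key.
function shard(docs, nodeCount) {
  const nodes = Array.from({ length: nodeCount }, () => []);
  docs.forEach((doc, i) => nodes[i % nodeCount].push(doc));
  return nodes;
}

const docs = Array.from({ length: 9 }, (_, i) => ({ _id: i }));

const replicatedSizes = replicate(docs, 3).map((node) => node.length);
const shardedSizes = shard(docs, 3).map((node) => node.length);

console.log(replicatedSizes); // 9 documents on each of the 3 nodes
console.log(shardedSizes);    // 3 documents on each of the 3 nodes
```

Replication multiplies storage to gain redundancy, while sharding divides it to gain capacity, which is why the two are combined in practice.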
Final Thoughts
Sharding is one of the most powerful scaling mechanisms available in MongoDB.
It allows databases to handle:
- Massive datasets
- High query throughput
- Continuously growing applications
However, like most architectural decisions, it should be introduced carefully and intentionally.
Understanding your data access patterns and choosing the right shard key are essential for a successful sharded deployment.
If you’re building applications expected to scale to millions of users or terabytes of data, sharding becomes a key tool in your database architecture.
The post What is Sharding in MongoDB and When Should You Use It? appeared first on foojay.