What Is MongoDB? | Feature Overview and FAQ
With so many open source databases available, finding the right one can be a challenge. That’s especially true for non-relational databases like MongoDB, a category with many viable options.
In this blog, we give an overview of MongoDB, answer frequently asked questions about the database, and discuss important components of MongoDB, including documents, storage engines, and sharding.
- What Is MongoDB?
- MongoDB Storage Engines
- Deploying MongoDB
- MongoDB Sharding
- MongoDB Frequently Asked Questions
- Final Thoughts
What Is MongoDB?
MongoDB is a popular NoSQL document-oriented database developed by MongoDB Incorporated.
The name “Mongo” is derived from “humongous,” a nod to the database’s ability to store very large amounts of JSON-like data. Unlike the tables in a relational database management system (RDBMS), documents do not have to follow a fixed schema. This means data that would live in several related tables can be combined into a single document, in effect denormalizing the data.
There are two important terms with regard to MongoDB. A document is the equivalent of a record (row) in an RDBMS. A collection is a grouping of documents, and its RDBMS equivalent is a table.
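To make the two terms concrete, here is a minimal Python sketch that folds rows from two related relational tables into a single denormalized document. Plain dictionaries stand in for documents and a list stands in for a collection. The table contents, field names, and the `build_customer_document` helper are illustrative assumptions, not part of MongoDB or any driver.

```python
import json

# Hypothetical rows from two related RDBMS tables.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "monitor"},
]


def build_customer_document(customer, all_orders):
    """Embed a customer's orders inside the customer record (denormalization)."""
    embedded = [
        {"order_id": o["order_id"], "item": o["item"]}
        for o in all_orders
        if o["customer_id"] == customer["customer_id"]
    ]
    return {
        "_id": customer["customer_id"],
        "name": customer["name"],
        "orders": embedded,
    }


# A "collection" is modeled here as a simple list of documents.
customer_collection = [build_customer_document(c, orders) for c in customers]

print(json.dumps(customer_collection[0], indent=2))
```

Reading this customer together with its orders now takes one lookup instead of a join, which is the trade-off denormalization makes.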
MongoDB Storage Engines
MongoDB has three different storage engines: In-Memory, WiredTiger, and the Encrypted Storage Engine.
In-Memory Storage Engine
The In-Memory storage engine is exactly what it sounds like: it keeps data in memory and writes very little to disk (only some metadata and diagnostic data). This engine is meant for applications where performance is paramount and the data can be ephemeral. If the machine loses power, the data is lost unless you are running a replica set. Do not use this storage engine if you require persistent data.
WiredTiger Storage Engine
WiredTiger is the default storage engine for MongoDB. It uses document-level concurrency control, which means that clients can modify different documents within a collection simultaneously. WiredTiger is suitable for most workloads.
Encrypted Storage Engine
The Encrypted Storage Engine is an enhanced version of the WiredTiger storage engine that supports encryption at rest. It is only available in MongoDB Enterprise.
Deploying MongoDB
You have several different options when deploying MongoDB: a standalone single instance, a replica set, and a sharded cluster.
If you want redundancy for your database, you will need to create a replica set. The recommended number of members in a replica set is three, which gives you three copies of your data. The nodes in a replica set vote to determine which one is the primary. The primary is the only node in the replica set that can accept writes. The other two nodes, the secondaries, can serve reads after a configuration change.
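Electing a primary requires a majority of the voting members, which is one reason three members is a common recommendation. The short Python sketch below only illustrates that arithmetic; the function names are hypothetical and it does not model how MongoDB actually runs elections.

```python
def votes_needed_for_majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """Members that can be lost while a majority can still elect a primary."""
    return voting_members - votes_needed_for_majority(voting_members)


for size in (1, 2, 3, 5):
    print(
        f"{size} member(s): majority={votes_needed_for_majority(size)}, "
        f"tolerated failures={tolerated_failures(size)}"
    )
```

Under this arithmetic a two-member set tolerates no failures, while a three-member set can lose one node and still elect a primary.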
The data that your application typically queries is called the working set. We usually aim to have the working set fit into memory. If it does not fit, data will be swapped to and from disk, which will slow your application down considerably.
What if you cannot fit your working set entirely into memory due to cost or other limitations? You will want to scale your cluster horizontally by using sharding.
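As a rough illustration of that sizing question, the Python sketch below estimates how many shards would be needed for each shard's portion of the working set to fit in memory. It assumes the working set divides evenly across shards, which depends entirely on the shard key, and the `shards_needed` helper and the example sizes are hypothetical.

```python
import math


def shards_needed(working_set_gb: float, usable_ram_per_shard_gb: float) -> int:
    """Estimate the shard count so each shard's slice of the working set fits in RAM.

    Assumes an even split of the working set across shards.
    """
    if working_set_gb < 0 or usable_ram_per_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(working_set_gb / usable_ram_per_shard_gb))


# Example with made-up numbers: 300 GB working set, 64 GB usable RAM per shard.
print(shards_needed(300, 64))
```

Treat the result as a lower bound: an uneven shard key can leave one shard holding far more than its share.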
What Is MongoDB Sharding?
MongoDB sharding is the horizontal partitioning of data in a database. It divides your data between nodes based on a key, so that not all of the data lands in one replica set. Because each shard holds only part of the data, each shard's portion of the working set is more likely to fit into memory.
MongoDB Sharding Example
For example, assume we have data being imported into a database from across the country. We have two data centers, DCEast and DCWest, each housing a replica set and each located close to the region where its data will be queried. For redundancy, each data center also hosts one node that belongs to the other data center's replica set.
It would not make sense to place East data in the DCWest data center and vice versa. The data contains a location key that identifies where it came from. We can designate that location keys 1 to 100 go to DCEast and keys 101 to 200 go to DCWest. By distributing data in this fashion, we have a better chance of fitting our working set into memory and avoiding swapping.
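The routing rule in this example can be sketched in a few lines of Python. This is only a model of the idea of range-based partitioning using the key ranges from the example; the `route_by_location_key` function is hypothetical and is not how MongoDB itself routes documents.

```python
from bisect import bisect_left

# Inclusive upper bound of each key range, paired with its destination.
RANGE_UPPER_BOUNDS = [100, 200]
RANGE_TARGETS = ["DCEast", "DCWest"]


def route_by_location_key(location_key: int) -> str:
    """Return the data center whose key range contains location_key."""
    if not 1 <= location_key <= RANGE_UPPER_BOUNDS[-1]:
        raise ValueError(f"location key {location_key} is outside 1..200")
    # bisect_left finds the first range whose upper bound is >= the key.
    return RANGE_TARGETS[bisect_left(RANGE_UPPER_BOUNDS, location_key)]


for key in (1, 100, 101, 200):
    print(key, "->", route_by_location_key(key))
```

Because neighboring keys land in the same place, range-based partitioning keeps regional data together, but it also means a skewed key distribution produces skewed shards.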
MongoDB Sharding Types
MongoDB supports two types of sharding: range-based sharding (as explained above) and hashed sharding. What is hashed sharding? Hashed sharding uses a hash of a field's value as the shard key to partition the data across the cluster. Hashed sharding will usually give you good data distribution, provided the underlying field has high cardinality.
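To show why hashing spreads data out, the Python sketch below hashes sequential key values and assigns each one to a shard. It is a simplified model: SHA-256 and the modulo step are stand-ins chosen for illustration, not MongoDB's actual hash function or its chunk mechanism, and the shard names are hypothetical.

```python
import hashlib
from collections import Counter

SHARDS = ["shard0", "shard1", "shard2", "shard3"]  # hypothetical shard names


def route_by_hash(key_value) -> str:
    """Pick a shard from a hash of the key value (illustrative, not MongoDB's hash)."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[bucket]


# Monotonically increasing keys, such as an auto-incrementing ID.
distribution = Counter(route_by_hash(i) for i in range(10_000))

for shard in SHARDS:
    print(shard, distribution[shard])
```

Even though the keys increase steadily, the hashed values scatter them across all of the shards, whereas range-based partitioning would send every new key to the same range.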
MongoDB Sharding Advice
Know your data! We have many customers who come to us with severely unbalanced sharded data. Choosing a shard key with good distribution is imperative when you model your data. In some situations, sharding by a geographic key would not work for your data. Picking a shard key should be treated as a one-time decision, because changing the shard key can be very difficult, especially with large databases. Choose a shard key with high cardinality, meaning many distinct values.
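One way to get to know your data before committing to a shard key is to profile candidate fields on a sample. The Python sketch below counts distinct values and measures how much of the sample the most common value accounts for. The `shard_key_profile` helper, the field names, and the sample data are all hypothetical, and a real evaluation would also need to consider query patterns.

```python
from collections import Counter


def shard_key_profile(documents, field):
    """Summarize cardinality and skew of one candidate shard key field."""
    counts = Counter(doc[field] for doc in documents)
    if not counts:
        raise ValueError("no documents to profile")
    total = sum(counts.values())
    most_common_value, most_common_count = counts.most_common(1)[0]
    return {
        "cardinality": len(counts),            # number of distinct values
        "most_common_value": most_common_value,
        "largest_share": most_common_count / total,  # fraction held by one value
    }


# Made-up sample: 90 documents from "east", 10 from "west", unique device IDs.
sample = [{"region": "east", "device_id": i} for i in range(90)]
sample += [{"region": "west", "device_id": 90 + i} for i in range(10)]

region_profile = shard_key_profile(sample, "region")
device_profile = shard_key_profile(sample, "device_id")
print(region_profile)
print(device_profile)
```

In this sample the `region` field has only two values and one of them covers most of the documents, so it would make an unbalanced shard key, while `device_id` has one distinct value per document.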
MongoDB Frequently Asked Questions
In the sections below, we answer some of the most commonly asked questions about MongoDB.
Is MongoDB Open Source?
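MongoDB's source code is publicly available. All versions released after October 16, 2018 are licensed under the Server Side Public License (SSPL), which the Open Source Initiative has not approved as an open source license.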
Versions prior to that date were released under GNU AGPL v3.0.
Is MongoDB a Relational Database or a NoSQL Database?
MongoDB is a NoSQL database, and one of the most popular ones.
Is MongoDB Scalable?
Yes. MongoDB supports horizontal scaling via sharding, and replica sets add redundancy and can spread read traffic across secondary nodes.
Final Thoughts
For companies looking for a mature, scalable, and open source document database, MongoDB is an attractive option. It’s easy to scale, and its In-Memory storage engine can enable fast access by avoiding disk I/O for ephemeral data.
That said, MongoDB can have sharp edges for companies working with large amounts of data – especially when it comes to sharding.
Get Guidance and Support for Your Open Source Databases
Need support for MongoDB or another open source database? OpenLogic provides SLA-backed database support directly from Enterprise Architects. Talk to an expert today to learn how OpenLogic can help support your integrated (and planned) open source.