What’s the difference between AWS DynamoDB, AWS DocumentDB and MongoDB?
The number of database options has grown over the years to the point that it’s becoming overwhelming. Choosing the right database type (SQL vs NoSQL) is a whole topic in itself. Let’s say you have chosen to use NoSQL. What now?
While the options are numerous, there’s a simple way to narrow down which system is for you. The question is, do you want ease of use and almost zero maintenance hassle, or do you want configurability and all the “bells and whistles”?
Let’s say you want the zero maintenance option. Then Amazon Web Services’ (AWS) DynamoDB is the logical choice, and recently it has gotten better. Here are the reasons you might want to use DynamoDB.
1. A simple storage system
2. Speed / Scalability / Predictability
3. Security
4. Pricing model
5. The fact that Amazon uses it
A simple storage system: While DynamoDB can handle JSON documents like you would traditionally use them, it does prefer you to keep them simple, and that’s a good thing. Typically with JSON style DBs you can get carried away with trying to store EVERYTHING in the one document. While it’s great having an array of sub-documents or multiple layers of nested objects, it makes querying hard and it also makes future changes really difficult.
DynamoDB charges based on the size of the WHOLE document it reads, so large documents can cost you more than small, simple ones. DynamoDB’s query “language” is also extremely simple. While it can be frustrating trying to work out how to retrieve a document compared to other DBs, it forces you to simplify your document storage and also the way you code your app.
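To make the “whole document” point concrete, here’s a rough sketch of how read cost grows with document size. It assumes AWS’s published sizing rule that one read capacity unit covers a strongly consistent read of an item up to 4 KB, with eventually consistent reads costing half as much. The function name is my own, and it only models a single-item read.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # one read unit covers up to 4 KB of item data


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool = True) -> float:
    """Estimate the read units consumed by fetching one item of the given size."""
    # The item size is rounded up to the next 4 KB block, with a minimum of one block.
    blocks = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    # Eventually consistent reads are billed at half the strongly consistent rate.
    return blocks if strongly_consistent else blocks / 2


small = read_capacity_units(1_000)    # a lean ~1 KB document
large = read_capacity_units(40_000)   # a bloated ~40 KB document
print(small, large)
```

Under that rule the 40 KB document uses ten read units against one for the 1 KB document, even if your app only needed a single field from it.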
The key that I’ve found is to try and keep your DynamoDB documents to one object layer, which works well for tables such as “products” or “transactions”, which is the way Amazon uses it for its shopping system.
Another simple aspect is backup and recovery (snapshot and “point in time”). Backing up and restoring is as simple as pressing a button.
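As an illustration of the “one object layer” idea, here’s a minimal sketch that flattens a nested document into a single layer before you store it. The helper name, the underscore separator and the sample fields are all my own choices, not anything DynamoDB requires.

```python
def flatten(doc: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into one layer, joining key names with `sep`."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-documents and merge their keys into the top layer.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


transaction = {
    "id": "t-1001",
    "total": 42,
    "customer": {"name": "Ann", "address": {"city": "Perth"}},
}
print(flatten(transaction))
```

Every attribute ends up at the top level (for example `customer_address_city`), which keeps the document easy to query and easy to change later.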
Speed / Scalability / Predictability: There’s not much to say about this because AWS handles all of it. The only configuration you can do is to choose between provisioning your read / write requests for predictable pricing, or allowing DynamoDB to scale up as far as it needs to while you simply pay for its usage. Typically, I prefer to use “On Demand” as it’s often the cheaper option for uneven traffic: you don’t pay for what you don’t use.
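In code, that choice comes down to one setting when the table is created. Below is a sketch that builds the arguments for boto3’s `create_table` call in either mode. The table name, the `id` key and the default capacity numbers are placeholders I’ve made up, and the function only builds the parameters so it can be tried without AWS credentials.

```python
def table_params(name: str, on_demand: bool = True,
                 read_units: int = 5, write_units: int = 5) -> dict:
    """Build keyword arguments for a DynamoDB create_table call."""
    params = {
        "TableName": name,
        # A single string partition key called "id" (a placeholder schema).
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
    }
    if on_demand:
        # "On Demand": no capacity to set, you pay per request.
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # "Provisioned": you reserve (and pay for) a fixed read / write capacity.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    return params


print(table_params("products"))
print(table_params("transactions", on_demand=False, read_units=10, write_units=10))
```

With boto3 installed and credentials configured, the result would be passed along the lines of `boto3.client("dynamodb").create_table(**table_params("products"))`.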
Security: Anyone that’s used AWS before knows that it’s pretty secure by default, and if you keep your AWS access keys safe, then you’re pretty much set. All access to DynamoDB goes through the AWS API (typically via the AWS SDK), so every request has to be signed with valid AWS credentials. As long as you follow the standard programming “best security practices”, there’s not much else to be done. AWS has secured most of it for you. Recently they’ve also enabled “Encryption at Rest”, i.e. your data is encrypted in storage by default.
Pricing model: This can be both good and bad. When using the “Provisioned” pricing model, you set the upper limit you think you will need, and you pay for that as if it were being used. If you don’t use it, tough: you still have to pay for it because it’s been reserved, just as a normal server would cost you even if no one is using it. “On Demand”, however, doesn’t charge you for requests when there’s no activity. You only pay for the storage used. The bonus is that if you have an unexpected spike in usage, “On Demand” generally won’t throttle you. It will just accept the requests and you pay for them. The downside is that a spike can surprise you with a bigger bill.
The fact that Amazon uses it: This is the interesting one. Amazon generally designs its web services around its own needs, and obviously it uses its own products to power its other products. DynamoDB is designed to have predictable performance, which is something you need when powering a massive online shopping site.
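To see how the two models trade off, here’s a back-of-the-envelope sketch. The prices in it are made-up placeholders, not real AWS rates, so plug in the current figures for your region before drawing any conclusions. The function names and the 730-hour month are also my own assumptions.

```python
HOURS_PER_MONTH = 730  # rough average number of hours in a month


def provisioned_monthly_cost(capacity_units: int, price_per_unit_hour: float) -> float:
    """Provisioned: you pay for the reserved capacity every hour, used or not."""
    return capacity_units * price_per_unit_hour * HOURS_PER_MONTH


def on_demand_monthly_cost(requests: int, price_per_million_requests: float) -> float:
    """On Demand: you pay only for the requests actually made."""
    return requests / 1_000_000 * price_per_million_requests


# Placeholder prices for illustration only.
reserved = provisioned_monthly_cost(capacity_units=10, price_per_unit_hour=0.001)
quiet_month = on_demand_monthly_cost(requests=2_000_000, price_per_million_requests=1.5)
busy_month = on_demand_monthly_cost(requests=20_000_000, price_per_million_requests=1.5)
print(reserved, quiet_month, busy_month)
```

The provisioned figure stays the same whatever the traffic, while the on-demand figure follows the request count, which is why a quiet month is cheap and a busy one can produce the surprise bill.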
The other aspect to Amazon designing it is that it works seamlessly with its other products. For example, if a new transaction document is inserted, an AWS Lambda function can be triggered to send an email to the client. Once you start using triggers, you start to see how serverless systems can really cut down on the maintenance and development time for your application.
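Here’s a rough sketch of what such a trigger could look like as a Python Lambda function attached to the table’s DynamoDB stream. It assumes the stream is configured to include the new item image, and that each transaction stores the client’s address in a string attribute called `customer_email` (a name I’ve invented). The actual email sending is left out.

```python
def new_transaction_emails(event: dict) -> list:
    """Collect the email addresses from newly inserted transaction records."""
    emails = []
    for record in event.get("Records", []):
        # Stream records are tagged INSERT, MODIFY or REMOVE; we only want inserts.
        if record.get("eventName") != "INSERT":
            continue
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        # Stream images use DynamoDB's typed format, e.g. {"S": "a string"}.
        email = new_image.get("customer_email", {}).get("S")
        if email:
            emails.append(email)
    return emails


def handler(event, context):
    """Lambda entry point: work out who to notify for each new transaction."""
    recipients = new_transaction_emails(event)
    # Sending the email (for example through an email service) is omitted here.
    return {"notified": recipients}


sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"id": {"S": "t-1"},
                                   "customer_email": {"S": "ann@example.com"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"NewImage": {"id": {"S": "t-0"},
                                   "customer_email": {"S": "bob@example.com"}}}},
    ]
}
print(handler(sample_event, None))
```

Only the inserted record produces a notification; the modified one is skipped.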
MongoDB
MongoDB is not really the opposite of DynamoDB, but when compared it’s pretty different. For one thing, the feature set in MongoDB is AMAZING, and its list of features keeps growing. You want ACID transactions? Done. Multi-stage pipelines? Easy. With each version, new cool features are introduced.
1. Rich features
2. Flexibility / Compatibility / No vendor lock in
3. Your configuration, the way you want it
Rich Features: This is my favourite aspect of MongoDB and I’m a particular fan of using its aggregation. Typically, I’ll only use a few pipeline stages, such as “lookups” (connecting tables) and result counting (such as needing to know how many notifications a client has to read). After using DynamoDB I’ve learnt not to overload JSON documents: it doesn’t matter what system you use, it still has to pull the document, and heavy documents decrease performance. That being said, it’s great being able to use a MongoDB feature on the DB server compared to having to pull the documents and transform them in your app. If you can transform the data in the DB first, then do it.
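For a feel of what those two stages look like, here’s a minimal sketch that builds the pipelines as plain Python lists, the form a driver such as pymongo accepts in `collection.aggregate(...)`. The collection and field names (`customers`, `customer_id`, `client_id`, `read`) are invented for the example.

```python
def unread_notification_count_pipeline(client_id: str) -> list:
    """Count how many notifications a client still has to read."""
    return [
        # Keep only this client's unread notifications.
        {"$match": {"client_id": client_id, "read": False}},
        # Collapse the matches into a single document: {"unread": <count>}.
        {"$count": "unread"},
    ]


def orders_with_customer_pipeline() -> list:
    """Attach the matching customer document to each order (a 'lookup')."""
    return [
        {"$lookup": {
            "from": "customers",          # collection to join against
            "localField": "customer_id",  # field on the order
            "foreignField": "_id",        # field on the customer
            "as": "customer",             # output array field
        }},
        # $lookup returns an array; unwind it into a single embedded document.
        {"$unwind": "$customer"},
    ]


print(unread_notification_count_pipeline("c-42"))
print(orders_with_customer_pipeline())
```

With a live database these would be run with something like `db.notifications.aggregate(unread_notification_count_pipeline("c-42"))`, so the counting and joining happen on the DB server rather than in your app.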
Flexibility / Compatibility / No vendor lock in: The issue with DynamoDB is that you’re locked in with AWS. For most applications / purposes, that’s fine. If you don’t use AWS, however, then DynamoDB will be more hassle than it’s worth. MongoDB is pretty flexible and you can even run it / maintain it on your own servers if you really want to. There’s even support (although buggy) for running it on ARM64 CPUs such as the Raspberry Pi’s.
There’s also a wide range of programming languages and frameworks that natively support MongoDB and all its features. This also means that if you have a problem or are stuck on something, then someone has probably already asked about it and has the solution. The development community and support are outstanding.
Your configuration, the way you want it: This is probably aimed more at system admins or people who enjoy managing web systems than at the typical programmer. I personally do get a kick out of maintaining a system, tweaking it and optimising it as much as possible. This enables me to understand what’s going on in the backend and how the application structure can be improved. It’s like a person who can simply drive a car compared to a mechanic. A person might be able to drive better than a mechanic, but they won’t be able to keep the car going as long or as smoothly as a mechanic.
AWS DocumentDB
This is the middle ground. Amazon has built a MongoDB-compatible database and designed it in such a way as to bring it closer to DynamoDB. DocumentDB has the feature set of MongoDB (up until v3.6 at the time of writing) but it’s maintained by Amazon. You select the number of instances you need and the size of them for the cluster. You still need to keep an eye on it to make sure it’s scaling as required, but you don’t need to configure it the way you would when running it on your own server.
The key difference is that the storage and the compute are “decoupled” as AWS describes it. What this means is that you can rapidly scale your DocumentDB server within minutes and throw a number of read replica instances at the cluster without downtime. Therefore in the event of a sudden spike in usage, you don’t have to wait nervously for new instances to start up. Compared to typical managed MongoDB services, these scaling instances are there almost “instantly”.
The biggest issue is pricing, and I do mean “big”. A single DocumentDB server can cost around $200 a month and that’s because they start big. With the starting server size being 2 CPUs and 16GB of RAM, they’re not joking when they say that “Amazon DocumentDB is designed for 99.99% availability”. These managed servers are designed for really high workloads and not for your hobby app.
While DocumentDB is a great middle ground, it’s definitely not something you would jump into straight away.
In summary, DynamoDB is typically best for simple, transaction-based document storage, MongoDB is best for flexible and broad document storage, and AWS DocumentDB is best used when your MongoDB project has gotten too big to handle and you don’t mind paying a bit more to have your DB managed for high workloads.