MongoDB (NoSQL Database for the Web)

by John J
Today’s highly social and interactive web has created a market for a database management system that offers fast, real-time access over the Internet while managing massive data sets that grow by the minute in volume and complexity. MongoDB fills this need. As I will explain later in this post, MongoDB is not the perfect solution for every project, but for certain tasks within its niche, it is the best solution.

MongoDB is a NoSQL or non-Relational Database Management System that uses a document-oriented storage format. Other storage formats used in the non-relational class of databases include graph, key-value store, multi value, object, RDF, tabular, and tuple store. Since we’ve all (presumably) taken a databases course already, I would like to explain how MongoDB works in the context of relational databases.

MongoDB stores data in documents and collections. A document can be thought of as a row in an SQL table, and a collection as the whole table. Everything between a pair of curly braces {} is one document, and a single collection can contain millions of documents or more. A document resembles a JSON object and looks like the picture below:

(Image: an example MongoDB document. Source link missing.)

Since it is not a relational database, MongoDB does not enforce a schema. In a relational database, a schema would be the various column headings of a table. Every record in that table would have to have the same fields and thus the same schema. In MongoDB, each document in the collection can have a different number of fields with different data types and you can also have documents nested in documents.

(Image: documents with differing fields and a nested document. Source link missing.)
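To make the flexible schema concrete, here is a minimal sketch in plain Python rather than in MongoDB itself. It models a collection as a list of dictionaries; the `users` name and every field in it are made up for illustration.

```python
import json

# A toy "collection": just a list of documents (dicts). Hypothetical data.
users = [
    {"name": "Ann", "age": 31},
    {"name": "Bob", "email": "bob@example.com", "langs": ["en", "fr"]},
    {"name": "Cy", "address": {"city": "Austin", "zip": "78701"}},
]

# Each document may carry a different set of fields.
field_sets = [sorted(doc.keys()) for doc in users]

# A nested document is reached one level at a time.
city = users[2]["address"]["city"]

# Documents map naturally onto JSON text.
as_json = json.dumps(users[2], sort_keys=True)
```

No table definition had to exist before the three documents were stored, which is the point of the comparison with a relational schema.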

Now let’s take a look at how MongoDB statements compare to SQL statements. Here is the basic command to make a new data entry:

(Image: an SQL insert statement next to its MongoDB equivalent. Source link missing.)

Now let’s take a look at some basic queries:

(Image: basic SQL queries next to their MongoDB equivalents. Source link missing.)
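As a rough, runnable illustration of the comparison, the sketch below runs the SQL side against an in-memory SQLite database and imitates the document side with a plain Python list. The `matches` helper is a hypothetical toy that understands only equality, `$gt`, and `$lt`; it is not MongoDB's query engine, and the `users` data is invented. In the mongo shell, the corresponding calls would be along the lines of `db.users.insertOne({...})` and `db.users.find({age: {$gt: 25}})`.

```python
import sqlite3

# --- Relational side: schema first, then rows (stdlib sqlite3, in memory) ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.execute("INSERT INTO users (name, age) VALUES (?, ?)", ("Ann", 31))
conn.execute("INSERT INTO users (name, age) VALUES (?, ?)", ("Bob", 22))
sql_names = [row[0] for row in
             conn.execute("SELECT name FROM users WHERE age > 25")]

# --- Document side: a toy stand-in for a collection, NOT a MongoDB client ---
users = []
users.append({"name": "Ann", "age": 31})
users.append({"name": "Bob", "age": 22})

def matches(doc, query):
    """Return True if doc satisfies a small subset of MongoDB-style filters."""
    ops = {"$gt": lambda a, b: a > b, "$lt": lambda a, b: a < b}
    for field, cond in query.items():
        if field not in doc:
            return False
        if isinstance(cond, dict):   # operator form, e.g. {"$gt": 25}
            if not all(ops[op](doc[field], val) for op, val in cond.items()):
                return False
        elif doc[field] != cond:     # plain equality, e.g. {"name": "Ann"}
            return False
    return True

doc_names = [d["name"] for d in users if matches(d, {"age": {"$gt": 25}})]
```

Both sides select the same record. The SQL version expresses the condition as text, while the document version expresses it as another document.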

As you can see, it’s not too steep a learning curve for MongoDB statements, especially if you’ve got some experience with SQL. It’s the same logic, just slightly different syntax.

MongoDB is best suited to Big Data environments. “Big Data refers to the massive growth in the volume, variety and velocity of data being produced and the set of applications that generate, store, process and monetize this data.” (10gen, 2013) Sites like Craigslist, Intuit, Disney, and Foursquare have Big Data environments. Craigslist moved over two billion documents to MongoDB. It previously used MySQL, where an ALTER TABLE statement “took months” to finish running on the archive and degraded the performance of the live database while it ran. (10gen, 2013)

Content management is another area where MongoDB shines, and MTV uses it for exactly that purpose. MongoDB is exceptionally well suited for a site like MTV because of its multimedia-saturated environment. The GridFS technology in MongoDB allows MTV to “store and serve rich media such as video, images and audio in the database itself.” (10gen, 2013) GridFS allows the storage and retrieval of files that exceed MongoDB’s 16 MB document size limit. It does this by breaking the large file down into many small chunks, and when the file is queried “the driver or client will reassemble the chunks as needed.” (, 2013)
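The chunk-and-reassemble idea can be sketched in a few lines of plain Python. This is only an illustration of the principle, not GridFS itself: the function names are hypothetical, the chunk size is tiny so the example stays readable (real GridFS chunks are far larger), and only the `n` (sequence number) and `data` fields of a chunk are modeled.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Split data into numbered chunks, loosely like GridFS chunk documents."""
    return [
        {"n": i // chunk_size, "data": data[i:i + chunk_size]}
        for i in range(0, len(data), chunk_size)
    ]

def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in sequence order."""
    ordered = sorted(chunks, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)

payload = bytes(range(256)) * 10               # 2560 bytes standing in for a media file
chunks = split_into_chunks(payload, chunk_size=1000)
restored = reassemble(list(reversed(chunks)))  # arrival order should not matter
```

Because every chunk carries its sequence number, the client can put the file back together regardless of the order in which the chunks come back.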

The location-based social networking site Foursquare moved to MongoDB for two reasons: its built-in auto-sharding capability and its geospatial indexing support. Sharding becomes necessary when a data set outgrows a single server: the documents of a collection are partitioned across several servers, with each document stored whole on one of them. In relational databases, “which were designed to run on just one machine” (Finley, 2013), this usually required writing custom code, but MongoDB does it automatically.
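A minimal sketch of the routing idea behind sharding, in plain Python, might look like the following. It is a deliberate simplification: MongoDB groups shard key ranges into chunks and rebalances them between servers, whereas this toy simply hashes a key and takes a remainder. The shard names, the `user_id` shard key, and the `shard_for` helper are all hypothetical.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]   # hypothetical server names

def shard_for(shard_key_value: str) -> str:
    """Pick a shard from a stable hash of the shard key value."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

docs = [{"user_id": f"user{i}", "checkins": i} for i in range(12)]

# Route each whole document to exactly one shard.
placement = {}
for doc in docs:
    placement.setdefault(shard_for(doc["user_id"]), []).append(doc)
```

The application never has to decide where a document lives. The same key always maps to the same shard, so reads can be routed the same way as writes.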

Since Foursquare’s entire business model relies on location-based data, MongoDB’s support for geospatial indexing made it the perfect fit. MongoDB does this by recursively dividing a map into quadrants, where each quadrant is labeled with ones and zeros like this:

(Image: map quadrants labeled with bit values. Source link missing.)

It then concatenates these ones and zeros to make the hash identifier which will specify the exact location. The more bits in the hash identifier, the greater the accuracy of the location.
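The recursive subdivision can be sketched as follows. This is an illustrative version under stated assumptions, not MongoDB's actual indexing code: the function name `quadrant_hash` is hypothetical, and the labeling (first bit for the east/west half, second bit for the north/south half) is one possible convention.

```python
def quadrant_hash(lon: float, lat: float, levels: int = 13) -> str:
    """Return a bit string locating (lon, lat) by repeated quadrant splits."""
    min_lon, max_lon = -180.0, 180.0
    min_lat, max_lat = -90.0, 90.0
    bits = []
    for _ in range(levels):
        mid_lon = (min_lon + max_lon) / 2
        mid_lat = (min_lat + max_lat) / 2
        # First bit of the pair: 1 for the eastern half, 0 for the western.
        if lon >= mid_lon:
            bits.append("1")
            min_lon = mid_lon
        else:
            bits.append("0")
            max_lon = mid_lon
        # Second bit of the pair: 1 for the northern half, 0 for the southern.
        if lat >= mid_lat:
            bits.append("1")
            min_lat = mid_lat
        else:
            bits.append("0")
            max_lat = mid_lat
    return "".join(bits)

coarse = quadrant_hash(10.0, 10.0, levels=1)   # one split: which quarter of the map
finer = quadrant_hash(10.0, 10.0, levels=2)    # two splits: a quarter of that quarter
```

Each extra level appends two bits and shrinks the cell to a quarter of its previous area, so a longer hash pins down a smaller region. Points that are close together tend to share a long common prefix, which is what makes the hash useful as an index key.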

Disney Interactive Media Group uses MongoDB for user data management. Disney was having problems with the performance and scalability of its existing MySQL database. As a result, its game developers spent more time trying to develop their own database management system than they did actually developing games. Disney has many games with user bases of varying size on any given day. When a game suddenly became really popular, the developers wanted to be able to change how they modeled their data as that game grew and continued to be developed. The flexible schema of MongoDB offers exactly that.

All this speed and flexibility comes at a price. “Security was not a primary concern of MongoDB’s designers” (Okman, 2011, p. 546), and to achieve the speed and scalability they desired they had to make some trade-offs. They decided to “trade consistency and security for performance and scalability.” (Okman, 2011, p. 541) In a paper titled Security Issues in NoSQL Databases, the authors point out seven different security vulnerabilities in MongoDB. I will mention two of them. The first is that “Mongo data-files are unencrypted and Mongo doesn’t provide a method to automatically encrypt these files.” (Okman, 2011, p. 546) The second concerns the potential for injection attacks: “Mongo heavily utilizes JavaScript as an internal scripting language… because JavaScript is an interpreted language, there is a potential for injection attacks.” (Okman, 2011, p. 546)
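To illustrate the kind of injection risk the quote describes, the sketch below builds two query filters in plain Python without contacting any database. It assumes an application that splices user text into a JavaScript `$where` expression; the function names and the attack string are hypothetical examples, not taken from the paper.

```python
def unsafe_filter(user_input: str) -> dict:
    # Risky: user text is spliced into JavaScript that the server would run.
    return {"$where": "this.name == '" + user_input + "'"}

def safer_filter(user_input: str) -> dict:
    # Safer: user text stays a plain value and is never parsed as code.
    return {"name": user_input}

attack = "' || '1' == '1"
unsafe = unsafe_filter(attack)
safe = safer_filter(attack)
```

With the attack string, the unsafe filter's expression becomes `this.name == '' || '1' == '1'`, which is true for every document. The safer filter just looks for a user whose name is literally that odd string.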

The sacrifices MongoDB makes in security and consistency have certainly paid off. The authors of MongoDB vs Oracle – Database Comparison ran some tests and produced some very interesting data. The following insert function was used for this test:

As you can see, MongoDB drastically outperformed an Oracle database, especially when working with an extremely large number of records. The results of a similar test using update were equally impressive:

In conclusion, if speed of retrieval and flexibility in data representation aren’t key to your project, and security and accuracy of data matter more, then MongoDB is not the product for you. However, if your project is an online game, a website with personalized content or real-time data, or anything else where performance or evolving data representation is the primary concern, then MongoDB is what you’re looking for. If the data has to be instantly and completely accurate, or security is a primary concern, stick with the tried-and-true SQL relational databases.


Boicea, A., Radulescu, F., & Agapin, L. I. (2012, September). MongoDB vs Oracle–Database Comparison. In Emerging Intelligent Data and Web Technologies (EIDWT), 2012 Third International Conference on (pp. 330-335). IEEE.

Finley, K. (2013, March 19). NoSQL Database MongoDB Reaches Beyond Software Coders. Wired Enterprise. Retrieved April 24, 2013, from

MongoDB. (2013, April 24). Retrieved from

Identity management systems and MongoDB | 10gen. (2013, April 24). Retrieved from

Okman, L., Gal-Oz, N., Gonen, Y., Gudes, E., & Abramov, J. (2011, November). Security issues in NoSQL databases. In Trust, Security and Privacy in Computing and Communications (TrustCom), 2011 IEEE 10th International Conference on (pp. 541-547). IEEE.