MarkLogic and Hadoop

by Asbed P
MarkLogic’s database, MarkLogic 5, is another database that uses the open source Hadoop programming framework.  It was released just a few weeks ago and includes a Hadoop connector that allows its users to “aggregate data inside MarkLogic for richer analytics, while maintaining the advantages of MarkLogic indexes for performance and accuracy.”  The company describes the product as an enterprise-class database that does not use SQL.  Instead it uses XML and XQuery, which makes it better suited to certain classes of applications.  Its main appeal so far is its ability to manage, index and handle unstructured information, from text documents to media files.  A good use of this database would be, for example, at an insurance company that has a large number of documents whose information needs to be extracted and sorted into a database.  The combination of MarkLogic and Hadoop will allow MarkLogic to pull the information and Hadoop to sort and analyze it.

This sounds like a great new database, in my opinion.  The combination seems like a good idea, with the MarkLogic half pulling the data while the Hadoop half analyzes it.  The fact that they plan on continuously updating it is great as well.  The backup feature they are calling “Hot Copy” also seems like something other companies should adopt.  Building on an open source framework like Hadoop does mean that it will face a lot of competition, though.  And since MarkLogic is a relatively small company compared to other database providers, like Oracle for example, it will not be easy for them to keep up with the giants.  But they are already turning a major profit and growing quickly.  They have 275 customers as of two weeks ago and 500 implementations, according to Bill Veiga, the vice president of solutions marketing.  I hope for the best for them and hope that their XML-based database succeeds, or at least makes an impact.

Kanaracus, C. (2011, November 1).  MarkLogic ties its database to Hadoop for ‘Big Data’ support.  Retrieved November 13, 2011, from