| | RDBMS | Analytic Data Stores | NoSQL | Hadoop |
---|---|---|---|---|
Description | Traditional row-column databases used for transactional systems, reporting, and archiving. | Optimized for data access (as opposed to writes); these stores leverage columnar or in-memory technology to provide fast reads at the expense of write performance. | Designed for rapid access to “key-value” pairs. Useful for products like Facebook and Twitter where most information revolves around one key piece of data. | An open-source approach to storing data in a file system across a range of commodity hardware and processing it in parallel (on multiple systems at once). |
Examples | SQL Server, MySQL, Oracle, etc. | Vertica, Kognitio, ParAccel, Netezza, Infobright, Amazon Redshift | MongoDB, Cassandra | Hadoop implementations by Cloudera, Intel, Amazon, Hortonworks |
Good for… | Reads & Writes, “reasonable” data sets (< 1B rows) | Storing lots of information, great query/retrieval speeds. | Storing information of a certain type, great retrieval speed based on a key, write performance | Inexpensive storage of mass data, structured & semi-structured |
Not good for… | Massive data volumes, unstructured & semi-structured data | Unstructured & semi-structured data, writes (one at a time) | Grouping information across keys (such as for reporting) | Complex and code-based, with incompatible approaches in the market; writes (one at a time) |
Notes | Challenging to “scale-out” | Often viewed as an alternative to traditional RDBMS when read performance is important | Speeds up development of data-driven applications, as less up-front design work is needed | Strong bias to the open-source community & Java |
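
The NoSQL column's trade-off (fast retrieval by key, poor fit for grouping across keys) can be illustrated with a minimal sketch. It uses a plain Python dictionary as a stand-in for a key-value store and is not the API of any product named in the table; the record layout and the function names `get_user` and `posts_by_country` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records keyed by user ID, as a key-value store might hold them.
store = {
    "user:1": {"name": "Ana", "country": "US", "posts": 12},
    "user:2": {"name": "Ben", "country": "DE", "posts": 3},
    "user:3": {"name": "Cy", "country": "US", "posts": 7},
}

def get_user(key):
    """Key-based retrieval: one hash lookup, independent of store size."""
    return store[key]

def posts_by_country():
    """Reporting-style grouping: must scan every record in the store."""
    totals = defaultdict(int)
    for record in store.values():
        totals[record["country"]] += record["posts"]
    return dict(totals)

print(get_user("user:2"))
print(posts_by_country())
```

The lookup touches a single entry, while the grouping visits every value. That full scan is the kind of cross-key work the table lists under "Not good for…" in the NoSQL column.
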

About the Author
David Abramson has more than 10 years of experience in full-lifecycle product development and management, from product inception through general availability. He has shepherded multiple analytics and business intelligence products and has worked with hundreds of customers, both enterprises and ISVs, to support data-driven application implementations.