Big Data 101 – NoSQL, Hadoop, Analytic Data Stores, and RDBMS Differences for Business People

February 7, 2021

insightsoftware is a global provider of reporting, analytics, and performance management solutions, empowering organizations to unlock business data and transform the way finance and data teams operate.

The following post was coauthored by David Abramson, director of product management, Logi Analytics, and Steven Schneider, VP of sales and business development, Logi Analytics, and was originally published on Slinging Software.

Confusion Reigns – The basic differences between Hadoop, NoSQL, Analytic Data Stores & traditional databases.

Organizations are now creating more data than ever before, and a new set of tools and technologies is becoming popular to store and retrieve this information in a timely and cost-effective manner. Many technologies attempt to address these challenges, and they take different (and often incompatible) approaches, each with positives and negatives depending on the use case.

While big data was initially synonymous with Hadoop, aggressive vendor marketing and leadership discussion have broadened the term to mean “a lot of data” and a wider set of data storage technologies. At a high level, there are four competing sets of data storage/access technologies that you are likely to hear about in relation to big data:

	RDBMS	Analytic Data Stores	NoSQL	Hadoop
Description	Traditional row-column databases used for transactional systems, reporting, and archiving.	Optimized for data access (reads rather than writes); they leverage columnar or in-memory technology to provide fast reads at the expense of write performance.	Designed for rapid access to “key-value” pair combinations. Useful for products like Facebook and Twitter where most information revolves around one key piece of data.	An open-source approach to storing data in a file system across a range of commodity hardware and processing it in parallel (on multiple systems at once).
Examples	SQL Server, MySQL, Oracle, etc.	Vertica, Kognitio, ParAccel, Netezza, Infobright, Amazon Redshift	MongoDB, Cassandra	Hadoop implementations by Cloudera, Intel, Amazon, Hortonworks
Good for…	Reads & writes, “reasonable” data sets (< 1B rows)	Storing lots of information, great query/retrieval speeds	Storing information of a certain type, great retrieval speed based on a key, write performance	Inexpensive storage of mass data, structured & semi-structured
Not good for…	Massive data volumes, unstructured & semi-structured data	Unstructured & semi-structured data, writes (one at a time)	Grouping information across keys (such as for reporting)	Ease of use (complex and code-based, with incompatible approaches in the market), writes (one at a time)
Notes	Challenging to “scale out”	Often viewed as an alternative to a traditional RDBMS when read performance is important	Enables faster productivity when creating data-driven applications, as less up-front design work is needed	Strong bias toward the open-source community & Java
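For readers who want to see the distinctions in the table in concrete terms, here is a minimal Python sketch. It is a toy illustration, not how any of the listed products work internally: the sample orders and every name in it (`ORDERS`, `row_store`, `column_store`, `kv_store`, `split_apply_combine`) are hypothetical, and the "parallel" step runs sequentially in a single process purely to show the split-then-combine idea.

```python
from collections import defaultdict

# Hypothetical sample data: (order_id, customer, amount).
ORDERS = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "alice", 7.5),
    (4, "carol", 50.0),
]

# Row-oriented layout (RDBMS style): each record is kept together,
# which suits reading or writing one whole order at a time.
row_store = [{"order_id": o, "customer": c, "amount": a} for o, c, a in ORDERS]

# Column-oriented layout (analytic data store style): each column is kept
# together, so an aggregate only has to read the column it needs.
column_store = {
    "order_id": [o for o, _, _ in ORDERS],
    "customer": [c for _, c, _ in ORDERS],
    "amount": [a for _, _, a in ORDERS],
}

# Key-value layout (NoSQL style): one lookup by key returns the whole value.
kv_store = {o: {"customer": c, "amount": a} for o, c, a in ORDERS}


def total_amount_rows(rows):
    """Sum amounts by visiting every full record."""
    return sum(r["amount"] for r in rows)


def total_amount_columns(columns):
    """Sum amounts by reading a single column."""
    return sum(columns["amount"])


def lookup_order(store, order_id):
    """Fetch one order directly by its key."""
    return store[order_id]


def totals_by_customer(store):
    """Group across keys: this has to scan every value in the store."""
    totals = defaultdict(float)
    for value in store.values():
        totals[value["customer"]] += value["amount"]
    return dict(totals)


def split_apply_combine(records, n_chunks):
    """Hadoop-style idea: split the data, compute partial sums, combine them.

    Each chunk stands in for a separate machine; here the chunks are
    processed one after another rather than truly in parallel.
    """
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    partial_sums = [sum(amount for _, _, amount in chunk) for chunk in chunks]
    return sum(partial_sums)


if __name__ == "__main__":
    print(total_amount_rows(row_store))
    print(total_amount_columns(column_store))
    print(lookup_order(kv_store, 3))
    print(totals_by_customer(kv_store))
    print(split_apply_combine(ORDERS, 2))
```

The row and column layouts hold the same data and give the same total; the difference is how much has to be read to get it. The key-value layout answers "give me order 3" in one step, but a report such as totals per customer means walking every entry, which is the weakness noted in the "Not good for…" row.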

About the Author

David Abramson has more than 10 years of experience in full-lifecycle product development and management, from product inception through general availability. He has shepherded multiple analytics and business intelligence products and has worked with hundreds of customers, both enterprises and ISVs, to support data-driven application implementations.
