An Introduction to Hadoop!

Wednesday, 17 October 2012

Hadoop is an open source project from Apache that has grown into a major technology movement. It has emerged as a leading option for handling big data, covering not only structured data but also complex, unstructured data. It has become popular because it can store and process large amounts of data quickly and cost-effectively across clusters of commodity hardware.

With Hadoop, organizations are discovering and putting into practice new data analysis and mining techniques that were previously impractical for performance, cost, and technological reasons.

A key advantage of Hadoop is cost-effective scalability on commodity hardware. It also supports the processing of all data types, whether structured, semi-structured or unstructured, and it is openly extensible.

Apache Hadoop is a collection of several components, including the following:

MapReduce
Hadoop Distributed File System (HDFS)
Hive
Pig
HBase
ZooKeeper
Ambari
HCatalog
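
To give a feel for the first component in the list, here is a minimal sketch of the MapReduce idea using the classic word count example. It is plain Python running on one machine, not Hadoop's own API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are made up for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its input from HDFS.
    sample = ["Hadoop stores big data", "Hadoop processes big data"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps run in parallel on many machines and the framework takes care of the shuffle. This sketch only shows how the three steps fit together.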

DATAWAREHOUSE CONCEPTS
