The Apache Hadoop project is a collection of many subprojects. ZooKeeper, once one of them and now a top-level project in its own right, is notable for its wide applicability in building distributed systems.
ZooKeeper is a distributed, open-source coordination service for distributed applications. Very large Hadoop clusters can be supported by multiple ZooKeeper servers, which keep the coordination service available even when individual servers fail. In the Hadoop project, it is used to manage master election and to store other process metadata.
Hadoop has several types of nodes: a master node and multiple worker nodes. If the master node fails, its role has to be transferred to a different node. ZooKeeper coordinates this handover: it is used to elect a new master node, which then takes over the failed master's tasks.
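The failover described above can be sketched in a few lines. This is only an in-memory illustration, not Hadoop's actual code, and it does not use the real ZooKeeper client. It assumes one commonly described election scheme, in which every candidate registers an entry with an increasing sequence number, the lowest number is the master, and a failed node's entry disappears. The class and method names (`MasterElectionSketch`, `join`, `fail`, `currentMaster`) are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// In-memory sketch of a ZooKeeper-style master election.
// Assumption: no real ZooKeeper server or client is involved; the TreeMap
// stands in for the per-candidate entries a coordination service would hold.
class MasterElectionSketch {
    // Sequence number -> node name, kept sorted by sequence number.
    private final TreeMap<Long, String> candidates = new TreeMap<>();
    private long nextSequence = 0;

    // A node joins the election and receives the next sequence number.
    public long join(String nodeName) {
        long sequence = nextSequence++;
        candidates.put(sequence, nodeName);
        return sequence;
    }

    // Simulates a node failure: the node's entry is removed.
    public void fail(long sequence) {
        candidates.remove(sequence);
    }

    // The candidate with the lowest sequence number is the current master.
    public Optional<String> currentMaster() {
        Map.Entry<Long, String> first = candidates.firstEntry();
        return first == null ? Optional.empty() : Optional.of(first.getValue());
    }

    public static void main(String[] args) {
        MasterElectionSketch election = new MasterElectionSketch();
        long a = election.join("node-a");
        election.join("node-b");
        election.join("node-c");
        System.out.println("master: " + election.currentMaster().orElse("none"));

        // The master fails; the next candidate in line takes over.
        election.fail(a);
        System.out.println("master after failure: " + election.currentMaster().orElse("none"));
    }
}
```

Running `main` prints `master: node-a` and then `master after failure: node-b`. In a real deployment the entries would live on the ZooKeeper servers rather than in a local map, so every node would see the same election result.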
A few more points about ZooKeeper:
- ZooKeeper servers synchronize tasks across the nodes of a Hadoop cluster.
- It is easy to program against: interaction with ZooKeeper occurs via Java or C interfaces.
- The service is named ZooKeeper because "coordinating distributed services is a zoo."
- ZooKeeper is used by companies including Rackspace, Yahoo! and eBay.