Saturday, 10 August 2013

Introduction to Pig in Hadoop

Pig was developed at Yahoo around 2006 to reduce the burden of writing complex mapper and reducer programs in Hadoop. To get an idea of Hadoop, please check "An Introduction to Hadoop!".
Using Pig on Hadoop is much like using SQL queries on Oracle. Most Pig operations transform a whole data set in one shot, for example filtering a data set or joining two or more data sets. Pig's language layer currently consists of a textual language called Pig Latin.

Why the name Pig?
Like the animal, Pig can eat anything, i.e. it can handle any kind of data set. Hence the name Pig.

Pig mainly consists of two components:
  • Pig Latin: the language used to write programs for this platform.
  • Runtime environment: the infrastructure where Pig Latin programs are executed as MapReduce jobs.
There are mainly three steps in a Pig Latin script: Load, Transform and Dump.
  • Load: reads the Hadoop data that is stored in HDFS (Hadoop Distributed File System).
  • Transform: applies a set of transformations to the data.
  • Dump: writes the result directly to the screen, or stores it in a file (Pig Latin uses DUMP for the screen and STORE for a file).
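To make the three steps concrete, here is a small plain-Python analogy. It is not Pig itself and it does not touch HDFS; it only mirrors the shape of a Pig Latin script, and the comments show roughly what the matching Pig Latin statement would look like. The sample rows, the field names (name, age) and the file name people.txt are made up for illustration.

```python
# Plain-Python analogy of a three-step Pig Latin script.
# Sample rows, field names and the file name in the comments are hypothetical.

def load(lines, delimiter=","):
    """Load: parse delimited text lines into (name, age) tuples.

    Roughly: records = LOAD 'people.txt' USING PigStorage(',')
                       AS (name:chararray, age:int);
    """
    records = []
    for line in lines:
        name, age = line.strip().split(delimiter)
        records.append((name, int(age)))
    return records


def transform(records, min_age=18):
    """Transform: keep only the rows that pass a filter.

    Roughly: adults = FILTER records BY age >= 18;
    """
    return [row for row in records if row[1] >= min_age]


def dump(records):
    """Dump: print each tuple to the screen.

    Roughly: DUMP adults;
    """
    for row in records:
        print(row)


sample_lines = ["alice,30", "bob,12", "carol,45"]
adults = transform(load(sample_lines))
dump(adults)
```

In real Pig the same three statements would run as MapReduce jobs over files in HDFS instead of an in-memory list.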
Pig Latin can be extended with UDFs (User Defined Functions), which the user can write in Java, Python or JavaScript and then call directly from the language.
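As a rough sketch of what a Python UDF can look like: the function name to_upper and the schema string are invented for this example. When Pig runs the file through its Jython engine, the outputSchema decorator is normally provided for you; the fallback definition below is only an assumption added so that the same file also runs as ordinary Python.

```python
# Sketch of a Python UDF for Pig. The function name and schema are hypothetical.
# Under Pig's Jython engine, outputSchema is expected to be available already;
# the fallback below exists only so the file can run as plain Python too.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema  # remember the declared output schema
            return func
        return wrap


@outputSchema("word:chararray")
def to_upper(word):
    """Return the upper-cased input, passing nulls (None) through unchanged."""
    if word is None:
        return None
    return word.upper()
```

Assuming the file is saved as udfs.py (a made-up name), a Pig Latin script could register it with REGISTER 'udfs.py' USING jython AS myudfs; and then call it as myudfs.to_upper(name) inside a FOREACH ... GENERATE statement.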



