Saturday 17 September 2016

How to Create/Change/Set Databases in Hive?

As discussed in previous posts, HIVE makes it easier for developers to port SQL-based applications to Hadoop, compared with other Hadoop languages and tools.


Hive is most suited for data warehouse applications where data is static and fast response time is not required and record-level inserts, updates, and deletes are not required

Creating a Database
The simplest syntax for creating a database in hive is shown in the following example:
Go to the Hive Shell by giving the command sudo hive and enter the command


CREATE DATABASE <DATA BASE NAME>
EXAMPLE
HIVE> CREATE DATABASE HR_STAGING;

HIVE> CREATE DATABASE IF NOT EXISTS HR_STAGING;


We can suppress the warning if the database hr_staging already exists in the hive database by using IF NOT EXISTS. The general syntax for creating the database in Hive is given below. The keyword ‘SCHEMA’ can be used instead of ‘DATABASE’ while creating database.



CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] DATABASE_NAME
  [COMMENT DATABASE_COMMENT]
  [LOCATION HDFS_PATH]
  [WITH DBPROPERTIES (PROPERTY_NAME=PROPERTY_VALUE, ...)]


The CREATE DATABASE command creates the database under HDFS at the default location: /user/hive/warehouse.
Hive creates a directory for each database. Tables in that database will be stored in sub directories of the database directory. The exception is tables in the default database, which doesn’t have its own directory.

Syntax to see the databases that already exists in hive. 

  

HIVE> SHOW DATABASES;
HR_STAGING

HIVE> CREATE DATABASE EMP_STAGING ;
HIVE> SHOW DATABASES;
HR_STAGING
EMP_STAGING

HIVE> SHOW DATABASES LIKE 'H.*';
HR_STAGING


Using a Database
The USE command sets a database as your working database, similar to changing working directories in a file system:

SYNTAX:
USE <DATABASE_NAME>
HIVE> USE HR_STAGING;
HIVE > USE DEFAULT;


Dropping Database in Hive
Syntax to drop a database:

HIVE> DROP DATABASE IF EXISTS HR_STAGING;


Hive won’t allow to drop the database if they contain tables. In such case we have to either drop the table first or append the CASCADE keyword to the command, which will cause the Hive to drop the tables in the database first.


DROP (DATABASE|SCHEMA) [IF EXISTS] DATABASE_NAME
[RESTRICT|CASCADE];

HIVE> DROP DATABASE IF EXISTS HR_STAGING CASCADE;



Alter Database in Hive
You can set key-value pairs in the DBPROPERTIES associated with a database using the ALTER DATABASE command. No other metadata about the database can be changed, including its name and directory location:

ALTER (DATABASE|SCHEMA) DATABASE_NAME
SET DBPROPERTIES (PROPERTY_NAME=PROPERTY_VALUE, ...); 

ALTER (DATABASE|SCHEMA) DATABASE_NAME
SET OWNER [USER|ROLE] USER_OR_ROLE;

HIVE> ALTER DATABASE HR_STAGING
SET DBPROPERTIES ('EDITED-BY' = 'XXXX');


If you like this post, please share it on google by clicking on the Google +1 button.


2 comments:

  1. This is a <a href="https://bigdataers.com*>nice article</a> on hive ql. Thanks

    ReplyDelete
  2. Thank you man. This nice article helped me in learning hive

    ReplyDelete

Related Posts Plugin for WordPress, Blogger...

ShareThis