Skip to main content

Hive Database Schema usage report UI using html


All,

Here we are trying to generate Hive schema usage report using shell scripting and html.

Prerequisites:

edge node with access to hive.

httpd /tomcat service is running on this node.

Google chart api

Create one script Hive_Rpt.sh , Let put high level steps here.

declare variables for Log/stage/tmp files

Get database list  using beeline/cmp Api calls , if you have single hive for multiple environments like dev/sit/prod , Please make necessary changes in  code to identify the environment.

Describe database <database name> will provide the database location in hdfs capture that to variable 





Also you can get hive table partition details using below query and format to html code.

SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;

eg:


mysql hive  -u hive -p"hive_password" -e "SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;" |awk '{ print "<tr><td>"$1"</td><td>"$2"</td><td>"$3"</td></tr>"}'>> $op_file

    
Once the html file is generated, copy the file (eg:hive_report.html)   below location and give read  permission
cp hive_report.html  edgenode::/home/tomcat/apache-tomcat-8.5.24/webapps/RPT/  or /var/www/html/RPT/ 


Enjoy.

for More details, please mail me at dhanooj.world@gmail.com or visit my linkedin Profile 




Comments

Popular posts from this blog

Installing Hadoop ecosystim in pseudo mode in UBUNTU 12.04 LTS

Changes Done in UBUNTU 12.04 64-bit : Post Installation 1. Access To Root:     eagroup@BI-Lab:~$ sudo su     [sudo] password for eagroup:     root@BI-Lab:/home/eagroup# sudo passwd     Enter new UNIX password: password     Retype new UNIX password: password     passwd: password updated successfully     root@BI-Lab:/home/eagroup# 2.  Add and manage users and groups:     a. Add user from GUI - hduser,     Username     Password    Privilage     root        password    root     eagroup        password    admin     hduser        hduser        hadoop user 3. Making hduser sudoer:     a. login as root   ...

Free easy Twitter sentiment analysis using R console (r-studio)in ubuntu(linux)

Twitter sentiment analysis using r-studio (r console) in ubuntu 1)  login as user su 2) Download  all files from following link: https://drive.google.com/folderview?id=0B1WeP8XHW0OzcEY2TEtwMlZDTmc&usp=sharing#list keep these files in /home/hduser/sentiment ( sample path) positive-words.txt negative-words.txt sentiment.r 3) pre-requisites open  /etc/apt/sources.list and add deb http://<my.favorite.cran.mirror>/bin/linux/ubuntu raring/ sudo apt-get install r-base -- now we have R Console in ubuntu. $ cd /home/hduser/sentiment/ call "R" from command prompt $ R -- Sett working directory(wd) and libraries required for analysis. setwd("/home/hduser/sentiment") install.packages('twitteR') install.packages("ROAuth") install.packages("RCurl") install.packages("plyr") install.packages("stringr") --  it's not required since we have this file. if u want u can download it  b...

how to get hive table size from metastore mysql

select    d.name  as db_name ,t.tbl_name     as tbl_name ,from_unixtime(min(t.create_time))   as create_time ,min(t.owner)          as owner ,min(case when tp.param_key = 'COLUMN_STATS_ACCURATE'  then tp.param_value                 end) as COLUMN_STATS_ACCURATE ,min(case when tp.param_key = 'last_modified_by'       then tp.param_value                 end) as last_modified_by ,min(case when tp.param_key = 'last_modified_time'     then from_unixtime(tp.param_value)  end) as last_modified_time  ,min(case when tp.param_key = 'numFiles'               then tp.param_value                 end) as numFiles ,min(case when tp.param_key = 'numRows'                th...