Skip to main content

Hive Database Schema usage report UI using html


All,

Here we are trying to generate Hive schema usage report using shell scripting and html.

Prerequisites:

edge node with access to hive.

httpd /tomcat service is running on this node.

Google chart api

Create one script Hive_Rpt.sh , Let put high level steps here.

declare variables for Log/stage/tmp files

Get database list  using beeline/cmp Api calls , if you have single hive for multiple environments like dev/sit/prod , Please make necessary changes in  code to identify the environment.

Describe database <database name> will provide the database location in hdfs capture that to variable 





Also you can get hive table partition details using below query and format to html code.

SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;

eg:


mysql hive  -u hive -p"hive_password" -e "SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;" |awk '{ print "<tr><td>"$1"</td><td>"$2"</td><td>"$3"</td></tr>"}'>> $op_file

    
Once the html file is generated, copy the file (eg:hive_report.html)   below location and give read  permission
cp hive_report.html  edgenode::/home/tomcat/apache-tomcat-8.5.24/webapps/RPT/  or /var/www/html/RPT/ 


Enjoy.

for More details, please mail me at dhanooj.world@gmail.com or visit my linkedin Profile 




Comments