Hive Database Schema usage report UI using html
All,
Here we are trying to generate Hive schema usage report using shell scripting and html.
Prerequisites:
edge node with access to hive.
httpd /tomcat service is running on this node.
Google chart api
Create one script Hive_Rpt.sh , Let put high level steps here.
declare variables for Log/stage/tmp files
Get database list using beeline/cmp Api calls , if you have single hive for multiple environments like dev/sit/prod , Please make necessary changes in code to identify the environment.
Describe database <database name> will provide the database location in hdfs capture that to variable
Also you can get hive table partition details using below query and format to html code.
SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;
eg:
mysql hive -u hive -p"hive_password" -e "SELECT DBS.NAME DB_NAME, TBLS.TBL_NAME TABLE_NAME, COUNT(*) NO_OF_PARTITIONS FROM PARTITIONS JOIN TBLS ON (TBLS.TBL_ID = PARTITIONS.TBL_ID) JOIN DBS ON (DBS.DB_ID = TBLS.DB_ID) GROUP BY TBLS.TBL_ID ORDER BY NO_OF_PARTITIONS DESC;" |awk '{ print "<tr><td>"$1"</td><td>"$2"</td><td>"$3"</td></tr>"}'>> $op_file
Comments
Post a Comment