Stop Hive's RetryingHMSHandler logging on a Databricks cluster


I'm using Azure Databricks 5.5 LTS with Spark 2.4.3 and Scala 2.11. Almost every request going to the Databricks cluster produces the following error log:

ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp)
at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:487)
at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:498)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

While this isn't affecting the end result of what we're trying to do, our logs are constantly getting filled with it, and it isn't very pleasant to go through. I've tried turning it off by setting the following property on both the driver and the executors:

log4j.level.org.apache.hadoop.hive.metastore.RetryingHMSHandler=OFF

only to realize later on that the RetryingHMSHandler class actually uses an slf4j logger. Is there an elegant way to overcome this?


1 Answer

Answer by Luca Carloni

Maybe late, but I faced the same issue with a Databricks cluster on 9.1 LTS (Apache Spark 3.1.2, Scala 2.12). I solved it by using an init script that added the following two properties

log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL, publicFile
log4j.additivity.org.apache.hadoop.hive.metastore.RetryingHMSHandler=false

to the driver's log4j.properties.

My goal was to remove all verbose logs from the "log4j-active.log" file that can be downloaded from a job's UI. Following https://learn.microsoft.com/en-us/azure/databricks/kb/clusters/overwrite-log4j-logs, I decided to add/overwrite some property values within the driver's log4j.properties (after first having a look at its content, of course).
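For reference, this is roughly what such an init script can look like. It's only a minimal sketch: the DB_IS_DRIVER check and the log4j.properties paths follow the pattern shown in the linked article, and they may differ across Databricks Runtime versions, so verify them on your cluster before relying on it.

#!/bin/bash
# Sketch of a cluster-scoped init script that appends the two overrides
# to the driver's log4j.properties at cluster startup, before the driver
# JVM reads its log4j configuration.
# NOTE: the DB_IS_DRIVER variable and the paths below are taken from the
# pattern in the linked KB article and may vary by runtime version.
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties"
fi

# Raise RetryingHMSHandler's level and stop it propagating to the root logger,
# so it no longer floods log4j-active.log.
cat >> "${LOG4J_PATH}" <<'EOF'
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL, publicFile
log4j.additivity.org.apache.hadoop.hive.metastore.RetryingHMSHandler=false
EOF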

With those two properties added, I was also able to silence RetryingHMSHandler (the only third-party log call that was still getting through).

Hope it helps ;)