How to log to the kernel-pyspark-*.log file from a scheduled notebook?


In my notebook, I have set up a utility for logging so that I can debug DSX scheduled notebooks:

# utility method for logging
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger("CloudantRecommender")

def info(*args):

    # sends output to notebook
    print(args)

    # sends output to kernel log file
    LOGGER.info(args)

Using it like so:

info("some log output")

If I check the log files, I can see my log output is getting written:

! grep 'CloudantRecommender' $HOME/logs/notebook/*pyspark* 

kernel-pyspark-20170105_164844.log:17/01/05 10:49:08 INFO CloudantRecommender: [Starting load from Cloudant: , 2017-01-05 10:49:08]
kernel-pyspark-20170105_164844.log:17/01/05 10:53:21 INFO CloudantRecommender: [Finished load from Cloudant: , 2017-01-05 10:53:21]

However, when the notebook runs as a scheduled job, the log output doesn't seem to be going to the kernel-pyspark-*.log file.

How can I write log output in DSX scheduled notebooks for debugging purposes?

1 Answer

Answered by Chris Snow (accepted as best answer):

The logging code actually works fine. The problem was that the schedule was pointing to an older version of the notebook, one that did not have any logging statements in it!
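
If you hit the same symptom, one cheap safeguard (a suggestion, not something from the original notebook) is to log a version marker at the top of the notebook, so a grep of the kernel log tells you exactly which revision the scheduler picked up:

# hypothetical marker: bump this string every time you save the notebook,
# then grep for it in kernel-pyspark-*.log to confirm which revision the
# scheduled job is actually running
NOTEBOOK_VERSION = "2017-01-05-a"

info("notebook version:", NOTEBOOK_VERSION)

If the marker in the log lags behind the version you just saved, the schedule is still pointing at an older copy of the notebook.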