Calling CCL + Quicklisp script as executable with command line arguments and achieving the desired output


After discovering a very simple way to watch YouTube videos from the command line using my new Raspberry Pi 2 (running Raspbian) using only easily obtainable packages, namely:

omxplayer -o local $(youtube-dl -g {videoURL})

I immediately wanted a way to watch entire YouTube playlists that way. So I saw this as a perfect excuse to hack together a solution in Common Lisp :)

My solution (imaginatively dubbed RpiTube) is a script that, when given the URL of a YouTube playlist, searches the page's HTML source and extracts the URLs for the videos contained within it. I can then pass these URLs to a Bash script that ultimately calls the above command for each video individually, one after the other. The Common Lisp script itself is complete and works; however, I'm having difficulty invoking it with the URL as a command-line argument. This is mainly because I'm still quite new to Quicklisp, Lisp packages and creating executables from Common Lisp code.

I'm running Clozure Common Lisp (CCL) with Quicklisp (installed as per Rainer Joswig's instructions). I've included the complete code below. It may be a little inefficient, but to my amazement it runs reasonably quickly even on the Raspberry Pi. (Suggested improvements are appreciated.)

;rpitube.lisp

;Given the URL of a YouTube playlist's overview page, return a list of the URLs of videos in said playlist.

(load "/home/pi/quicklisp/setup.lisp")
(ql:quickload :drakma)
(ql:quickload "cl-html-parse")
(ql:quickload "split-sequence")

(defun flatten (x)
  "Paul Graham's utility function from On Lisp."
  (labels ((rec (x acc)
             (cond ((null x) acc)
                   ((atom x) (cons x acc))
                   (t (rec (car x) (rec (cdr x) acc))))))
    (rec x nil)))

(defun parse-page-source (url)
  "Generate lisp list of a page's html source."
  (cl-html-parse:parse-html (drakma:http-request url)))

(defun occurences (e l)
  "Returns the number of occurences of an element in a list. Note: not fully tail recursive."
  (cond
    ((null l) 0)
    ((equal e (car l)) (1+ (occurences e (cdr l))))
    (t (occurences e (cdr l)))))

(defun extract-url-stubs (flatlist unique-atom url-retrieval-fn)
  "In a playlist's overview page the title of each video is represented in HTML as a link,
  whose href entry is part of the video's actual URL (referred to here as a stub).
  Within the link's tag there is also an entry that doesn't occur anywhere else in the
  page source. This is the unique-atom (a string) that we will use to locate the link's tag
  within the flattened list of the page source, from which we can then extract the video's URL
  stub using a simple url-retrieval-fn (see comments below this function). This function is iterative, not
  recursive, because the latter approach was too confusing."
  (let* ((tail (member unique-atom flatlist :test #'equal))
         (n (occurences unique-atom tail))
         (urls nil))
    (loop for x in tail with i = 0
          while (< (length urls) n) do
          (if (string= x unique-atom)
              (setf urls (cons (funcall url-retrieval-fn tail i) urls)))
          (incf i))
    (reverse urls)))

;Example HTML tag:
;<a class="pl-video-title-link yt-uix-tile-link yt-uix-sessionlink  spf-link " data-sessionlink="verylongirrelevantinfo" href="/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" dir="ltr"></a>

;Example tag when parsed and flattened:
;(:A :CLASS "pl-video-title-link yt-uix-tile-link yt-uix-sessionlink  spf-link " :DATA-SESSIONLINK "verylongirrelevantinfo" :HREF "/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" :DIR "ltr")

;The URL stub is the fourth list element after unique-atom ("pl-video-title..."), so the url-retrieval-fn is:
;(lambda (l i) (elt l (+ i 4))), where i is the index of unique-atom.

(defun get-vid-urls (url)
  "Extracts the URL stubs, turns them into full URLs, and returns them in a list."
  (mapcar (lambda (s)
            (concatenate 'string
                         "https://www.youtube.com"
                         (car (split-sequence:split-sequence #\& s))))
          (extract-url-stubs (flatten (parse-page-source url))
                             "pl-video-title-link yt-uix-tile-link yt-uix-sessionlink  spf-link "
                             (lambda (l i) (elt l (+ i 4))))))

(let ((args #+clozure *unprocessed-command-line-arguments*))
(if (and (= (length args) 1)
         (stringp (car args)))
    (loop for url in (get-vid-urls (car args)) do
          (format t "~a " url))
    (error "Usage: rpitube <URL of youtube playlist>

           where URL is of the form:
           'https://www.youtube.com/playlist?list=uniqueplaylistcode'")))

First I tried adding the following line to the script

#!/home/pi/ccl/armcl

and then running

$ chmod +x rpitube.lisp
$ ./rpitube.lisp {playlistURL}

which gives:

Unrecognized non-option arguments: (./rpitube.lisp {playlistURL})

when I would at least have expected ./rpitube.lisp to be absent from this list of unrecognized arguments. I know that in Clozure CL, in order to pass command-line arguments to a REPL session untouched, I have to separate them from the other arguments with a double hyphen, like this:

~/ccl/armcl -l rpitube.lisp -- {playlistURL}

But invoking the script like this lands me in a REPL after the script has run, which I don't want. Additionally, the Quicklisp loading information and progress bars are printed to the terminal, which I also don't want. (Incidentally, as Rainer suggested, I haven't added Quicklisp to my CCL init file, since I generally don't want the additional overhead, i.e. a few seconds' loading time on the Raspberry Pi. I'm not sure if that's relevant.)

I then decided to try creating a standalone executable by running (once the above code is loaded):

(ccl:save-application "rpitube" :prepend-kernel t)

And calling it from a shell like this:

$ ./rpitube {playlistURL}

which gives:

Unrecognized non-option arguments: ({playlistURL})

which seems to be an improvement, but I'm still doing something wrong. Do I need to replace the Quicklisp-related code by creating my own asdf package that requires drakma, cl-html-parse and split-sequence, and loading that with in-package, etc.? I have created my own package before in another project, specifically because I wanted to split up my code into multiple files, and it seems to work, but I still loaded my package via ql:quickload as opposed to in-package, since the latter never seemed to work (perhaps I should ask about that as a separate question). Here, the rpitube.lisp code is so short that it seems unnecessary to create a whole quickproject and package for it, especially since I want it to be a standalone executable anyway.

So: how do I change the script (or its invocation) so that it can accept the URL as a command-line argument, can be run non-interactively (i.e. doesn't open a REPL), and ONLY prints the desired output to the terminal - a space-delimited list of URLs - without any Quicklisp loading information?


There are 2 answers

m-n

I looked at this some and would like to share what I found. There are also several Lisp libraries which aim to facilitate scripting, executable building, or command-line argument handling.

For your executable building approach, save-application lets you specify a :toplevel-function, a function of zero arguments. In this case you will need to get the command line arguments through ccl:*command-line-argument-list*, and skip the first element (the name of the program). This is probably the minimal change to get your program running (I haven't run this; so it may have typos):

(defun toplevel ()
  (let ((args #+clozure *command-line-argument-list*))
    (if (and (= (length args) 2)
             (stringp (second args)))
        (loop for url in (get-vid-urls (second args)) do
              (format t "~a " url))
        (error "Usage: rpitube <URL of youtube playlist>

               where URL is of the form:
               'https://www.youtube.com/playlist?list=uniqueplaylistcode'"))))

(ccl:save-application "rpitube" :prepend-kernel t :toplevel-function #'toplevel)

Alternatively, some Lisp implementations have a --script command-line parameter which allows something similar to your #!/home/pi/ccl/armcl script to work. CCL doesn't seem to have an equivalent option, but a previous answer -- https://stackoverflow.com/a/3445196/2626993 -- suggests writing a short Bash script which would essentially behave like you hoped CCL would with this attempt.
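
A minimal sketch of what such a wrapper could look like, written here as a shell function. It assumes the CCL path from the question and relies on CCL's --batch, --quiet, --no-init, --load and --eval options; the names run_ccl_script and CCL are made up for this illustration, and the sketch has not been tried on a Raspberry Pi. Everything after the double hyphen should be left alone by CCL and end up in ccl:*unprocessed-command-line-arguments*, which is what the script in the question reads.

```shell
#!/bin/sh
# Hypothetical wrapper: run a Lisp file under CCL as if it were a script.

# Path to the CCL kernel. The default is the path used in the question;
# override it by setting CCL in the environment.
CCL="${CCL:-/home/pi/ccl/armcl}"

# run_ccl_script FILE [ARGS...]
run_ccl_script() {
  script="$1"
  shift
  # --batch    exit on end-of-file or on an attempt to enter a break loop
  # --quiet    suppress the startup banner and prompts
  # --no-init  skip the user's init file
  # --load     load the script, then --eval quits so no REPL is entered
  # --         the remaining arguments are passed through to the Lisp code
  "$CCL" --batch --quiet --no-init \
         --load "$script" --eval '(ccl:quit)' -- "$@"
}

# Intended use (not executed here):
#   run_ccl_script /home/pi/lisp/rpitube.lisp "$playlist_url"
```

Output from the ql:quickload calls would still appear unless those calls are silenced separately.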

quickload calls can be silenced with an argument:

  (ql:quickload :drakma :silent t)

Andy Page

Ok, I've managed to adapt a solution from the suggestion linked by user @m-n above. RpiTube now seems to work for most playlists that I have tried except some music playlists, which are unreliable since I live in Germany and many music videos are blocked in this country for legal reasons. Huge playlists and very high-quality (or very long) videos might also be unreliable.

The Bash script:

#! /bin/bash

#Calls rpitube.lisp to retrieve the URLs of the videos in the provided
#playlist, and then plays them in order using omxplayer, optionally
#starting from the nth video instead of the first.

CCL_PATH='/home/pi/ccl/armcl'
RPITUBE_PATH='/home/pi/lisp/rpitube.lisp'
N=0
USAGE='
Usage: ./rpitube [-h help] [-n start at nth video] <playlist URL>

       where URL is of the form: https://www.youtube.com/playlist?list=uniqueplaylistcode
       ******** Be sure to surround the URL with single quotes! *********'

play()
{
  if omxplayer -o local $(youtube-dl -g "$1") > /dev/null; then
    return 0
  else
    echo "An error occurred while playing $1."
    exit 1
  fi
}

while getopts ":n:h" opt; do
  case $opt in
    n ) N=$((OPTARG - 1)) ;;
    h ) echo "$USAGE"
        exit 1 ;;
    \? ) echo "Invalid option."
         echo "$USAGE"
         exit 1 ;;
  esac
done

shift $(($OPTIND - 1))

if [[ "$#" -ne 1 ]]; then
  echo "Invalid number of arguments."
  echo "$USAGE"
  exit 1
elif [[ "$1" != *'https://www.youtube.com/playlist?list='* ]]; then
  echo "URL is of the wrong form."
  echo "$USAGE"
  exit 1
else
  echo 'Welcome to RpiTube!'
  echo 'Fetching video URLs... (may take a moment, especially for large playlists)'
  urls="$(exec $CCL_PATH -b -e '(progn (load "'$RPITUBE_PATH'") (main "'$1'") (ccl::quit))')"
  echo 'Starting video... press Q to skip to next video, left/right arrow keys to rewind/fast-forward, Ctrl-C to quit.'
  count=0
  for u in $urls; do           #do NOT quote $urls here
    [[ $count -lt $N ]] && count=$((count + 1)) && continue
    play "$u"
    echo 'Loading next video...'
  done
  echo 'Reached end of playlist. Hope you enjoyed it! :)'
fi

I made the following changes to the CL script: I added the :silent option to the ql:quickload calls, replaced my own occurences function with the built-in count (with :test #'equal), and, most importantly, changed the code at the end of the script that actually calls the URL-fetching functions. First, I wrapped it in a main function that takes one argument, namely the playlist URL, and removed the references to *command-line-argument-list* etc. The important part: instead of invoking the entire rpitube.lisp script with the URL as a command-line argument to CCL, I invoke it without arguments and pass the URL to the main function directly (in the call to exec). See below:

(defun main (url)
  (if (stringp url)
      (loop for u in (get-vid-urls url) do
            (format t "~a " u))
      (error "Usage: rpitube <URL of youtube playlist>

              where URL is of the form:
              'https://www.youtube.com/playlist?list=uniqueplaylistcode'")))

This method could be applied widely and it works fine, but I'd be amazed if there isn't a better way to do it. If I can make any progress with the "toplevel" function + executable idea, I'll edit this answer.
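
Since the URL is spliced into a Lisp form as a string literal, a double quote or backslash in the argument would break the form. The sketch below shows one way to build the form more defensively; the helper names lisp_string and build_eval_form are hypothetical and not part of the script above. It relies only on the fact that, inside a Common Lisp string literal, backslash and double quote are the two characters that need a backslash in front of them.

```shell
# Print the argument as a Common Lisp string literal, escaping
# backslashes first and then double quotes.
lisp_string() {
  printf '"%s"' "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

# build_eval_form FILE FUNCTION ARG
# Print a form that loads FILE, calls FUNCTION on ARG (as a string), and quits.
# FUNCTION is inserted verbatim, so it must be a trusted symbol name.
build_eval_form() {
  printf '(progn (load %s) (%s %s) (ccl:quit))' \
    "$(lisp_string "$1")" "$2" "$(lisp_string "$3")"
}

# Intended use in the script above (not executed here):
#   urls="$("$CCL_PATH" -b -e "$(build_eval_form "$RPITUBE_PATH" main "$1")")"
```

Quoting the whole form with double quotes in the call also keeps the shell from word-splitting a path or URL that contains spaces.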

An example working invocation, run on a small playlist of short videos, with playback beginning at the 3rd video:

$ ./rpitube -n 3 'https://www.youtube.com/playlist?list=PLVPJ1jbg0CaE9eZCTWS4KxOWi3NWv_oXL'

Many thanks.