After discovering a very simple way to watch YouTube videos from the command line on my new Raspberry Pi 2 (running Raspbian), using only easily obtainable packages, namely:
omxplayer -o local $(youtube-dl -g {videoURL})
I immediately wanted a way to watch entire YouTube playlists that way. So I saw this as a perfect excuse to hack together a solution in Common Lisp :)
My solution (imaginatively dubbed RpiTube) is a script that, when given the URL of a YouTube playlist, searches the page's HTML source and extracts the URLs of the videos contained within it. I can then pass these URLs to a Bash script that ultimately calls the above command for each video individually, one after the other. The Common Lisp script itself is complete and works; however, I'm having difficulty invoking it with the URL as a command-line argument. This is mainly because I'm still quite new to Quicklisp, Lisp packages and creating executables from Common Lisp code.
I'm running Clozure Common Lisp (CCL) with Quicklisp (installed as per Rainer Joswig's instructions). I've included the complete code below. It may be a little inefficient, but to my amazement it runs reasonably quickly even on the Raspberry Pi. (Suggested improvements are appreciated.)
;rpitube.lisp
;Given the URL of a YouTube playlist's overview page, return a list of the URLs of videos in said playlist.
(load "/home/pi/quicklisp/setup.lisp")
(ql:quickload :drakma)
(ql:quickload "cl-html-parse")
(ql:quickload "split-sequence")
(defun flatten (x)
  "Paul Graham's utility function from On Lisp."
  (labels ((rec (x acc)
             (cond ((null x) acc)
                   ((atom x) (cons x acc))
                   (t (rec (car x) (rec (cdr x) acc))))))
    (rec x nil)))
(defun parse-page-source (url)
  "Generate lisp list of a page's html source."
  (cl-html-parse:parse-html (drakma:http-request url)))
(defun occurences (e l)
  "Returns the number of occurrences of an element in a list. Note: not fully tail recursive."
  (cond
    ((null l) 0)
    ((equal e (car l)) (1+ (occurences e (cdr l))))
    (t (occurences e (cdr l)))))
(defun extract-url-stubs (flatlist unique-atom url-retrieval-fn)
  "In a playlist's overview page the title of each video is represented in HTML as a link,
whose href entry is part of the video's actual URL (referred to here as a stub).
Within the link's tag there is also an entry that doesn't occur anywhere else in the
page source. This is the unique-atom (a string) that we will use to locate the link's tag
within the flattened list of the page source, from which we can then extract the video's URL
stub using a simple url-retrieval-fn (see comments below this function). This function is iterative, not
recursive, because the latter approach was too confusing."
  (let* ((tail (member unique-atom flatlist :test #'equal))
         (n (occurences unique-atom tail))
         (urls nil))
    (loop for x in tail with i = 0
          while (< (length urls) n) do
            (if (string= x unique-atom)
                (setf urls (cons (funcall url-retrieval-fn tail i) urls)))
            (incf i))
    (reverse urls)))
;Example HTML tag:
;<a class="pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link " data-sessionlink="verylongirrelevantinfo" href="/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" dir="ltr"></a>
;Example tag when parsed and flattened:
;(:A :CLASS "pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link " :DATA-SESSIONLINK "verylongirrelevantinfo" :HREF "/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" :DIR "ltr")
;The URL stub is the fourth list element after unique-atom ("pl-video-title..."), so the url-retrieval-fn is:
;(lambda (l i) (elt l (+ i 4))), where i is the index of unique-atom.
(defun get-vid-urls (url)
  "Extracts the URL stubs, turns them into full URLs, and returns them in a list."
  (mapcar (lambda (s)
            (concatenate 'string
                         "https://www.youtube.com"
                         (car (split-sequence:split-sequence #\& s))))
          (extract-url-stubs (flatten (parse-page-source url))
                             "pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link "
                             (lambda (l i) (elt l (+ i 4))))))
(let ((args #+clozure *unprocessed-command-line-arguments*))
  (if (and (= (length args) 1)
           (stringp (car args)))
      (loop for url in (get-vid-urls (car args)) do
        (format t "~a " url))
      (error "Usage: rpitube <URL of youtube playlist>
where URL is of the form:
'https://www.youtube.com/playlist?list=uniqueplaylistcode'")))
First I tried adding the following line to the script
#!/home/pi/ccl/armcl
and then running
$ chmod +x rpitube.lisp
$ ./rpitube.lisp {playlistURL}
which gives:
Unrecognized non-option arguments: (./rpitube.lisp {playlistURL})
when I would at least have expected ./rpitube.lisp to be absent from this list of unrecognized arguments. I know that in Clozure CL, in order to pass command-line arguments to a REPL session untouched, I have to separate them from the other arguments with a double hyphen, like this:
~/ccl/armcl -l rpitube.lisp -- {playlistURL}
But invoking the script like this lands me in a REPL after the script has run, which I don't want. Additionally, the Quicklisp loading information and progress bars are printed to the terminal, which I also don't want. (Incidentally, as Rainer suggested, I haven't added Quicklisp to my CCL init file, since I generally don't want the additional overhead, i.e. a few seconds' loading time on the Raspberry Pi. I'm not sure if that's relevant.)
I then decided to try creating a standalone executable by running (once the above code is loaded):
(ccl:save-application "rpitube" :prepend-kernel t)
And calling it from a shell like this:
$ ./rpitube {playlistURL}
which gives:
Unrecognized non-option arguments: ({playlistURL})
which seems to be an improvement, but I'm still doing something wrong. Do I need to replace the Quicklisp-related code by creating my own ASDF package that requires drakma, cl-html-parse and split-sequence, and loading that with in-package, etc.? I have created my own package before in another project, specifically because I wanted to split up my code into multiple files, and it seems to work, but I still loaded my package via ql:quickload as opposed to in-package, since the latter never seemed to work (perhaps I should ask about that as a separate question). Here, the rpitube.lisp code is so short that it seems unnecessary to create a whole quickproject and package for it, especially since I want it to be a standalone executable anyway.
So: how do I change the script (or its invocation) so that it can accept the URL as a command-line argument, can be run non-interactively (i.e. doesn't open a REPL), and ONLY prints the desired output to the terminal - a space-delimited list of URLs - without any Quicklisp loading information?
I looked at this some and would like to share what I found. There are also several Lisp libraries which aim to facilitate scripting, executable building, or command-line argument handling.
For your executable-building approach, save-application lets you specify a :toplevel-function, a function of zero arguments. In this case you will need to get the command-line arguments through ccl:*command-line-argument-list* and skip the first element (the name of the program). This is probably the minimal change to get your program running (I haven't run this, so it may have typos):
Alternatively, some Lisp implementations have a --script command-line parameter which allows something similar to your #!/home/pi/ccl/armcl script to work. CCL doesn't seem to have an equivalent option, but a previous answer (https://stackoverflow.com/a/3445196/2626993) suggests writing a short Bash script which would essentially behave like you hoped CCL would with this attempt.
quickload calls can be silenced with an argument:
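As a rough illustration of the two Lisp-side points in this answer (the :silent argument to ql:quickload and a zero-argument toplevel function for save-application), here is a minimal sketch. It has not been run on a Raspberry Pi. The names user-arguments, print-urls and main are hypothetical, and get-vid-urls is assumed to be the function from the question, already loaded into the image.

```lisp
;; Sketch only. Assumes Quicklisp has been loaded (e.g. via setup.lisp) and
;; that GET-VID-URLS from the question is defined in the running image.
;; USER-ARGUMENTS, PRINT-URLS and MAIN are hypothetical names.

;; :silent t suppresses Quicklisp's loading messages and progress bars.
#+quicklisp
(ql:quickload '(:drakma :cl-html-parse :split-sequence) :silent t)

(defun user-arguments (argv)
  "Return ARGV without its first element (the program name)."
  (rest argv))

(defun print-urls (urls &optional (stream *standard-output*))
  "Print URLS to STREAM as one space-delimited line."
  (format stream "~{~a~^ ~}~%" urls))

#+clozure
(defun main ()
  "Entry point for the saved executable; takes no arguments."
  (let ((args (user-arguments ccl:*command-line-argument-list*)))
    (cond ((= (length args) 1)
           (print-urls (get-vid-urls (first args)))
           (finish-output)
           (ccl:quit 0))
          (t
           (format *error-output* "Usage: rpitube <URL of youtube playlist>~%")
           (ccl:quit 1)))))

;; Build once, from a REPL in which everything above is loaded:
;; (ccl:save-application "rpitube" :toplevel-function #'main :prepend-kernel t)
```

With this arrangement the libraries are loaded before the image is saved, so the resulting executable should not need to load Quicklisp at run time. Unlike the loop in the question, print-urls separates the URLs with single spaces and ends the line with a newline rather than a trailing space.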