How can I show progress for a long-running Ansible task?

I have some Ansible tasks that perform unfortunately long operations, such as syncing with an S3 folder. It's not always clear whether they're progressing or just stuck (or whether the ssh connection has died), so it would be nice to have some sort of progress output displayed. If the command's stdout/stderr were displayed directly I'd see it, but Ansible captures the output.

Piping output back is a difficult problem for Ansible to solve in its current form. But are there any Ansible tricks I can use to provide some sort of indication that things are still moving?

Current ticket is https://github.com/ansible/ansible/issues/4870

There are 4 answers

halpdoge (best answer)

Ansible has since implemented the following:

---
# Requires ansible 1.8+
- name: 'YUM - async task'
  yum:
    name: docker-io
    state: installed
  async: 1000
  poll: 0
  register: yum_sleeper

- name: 'YUM - check on async task'
  async_status:
    jid: "{{ yum_sleeper.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30

For further information, see the official documentation on the topic (make sure you're selecting your version of Ansible).

Tom Manterfield

There are a couple of things you can do, but as you have rightly pointed out, Ansible in its current form doesn't really offer a good solution.

Official-ish solutions:

One idea is to mark the task as async and poll it. Obviously this is only suitable if the task can run that way without causing failures elsewhere in your playbook. The async docs are here, and here's an example lifted from them:

- hosts: all
  remote_user: root
  tasks:
  - name: simulate long running op (15 sec), wait for up to 45 sec, poll every 5 sec
    command: /bin/sleep 15
    async: 45
    poll: 5

This can at least give you a 'ping' to know that the task isn't hanging.

The only other officially endorsed method would be Ansible Tower, which has progress bars for tasks but isn't free.

Hacky-ish solutions:

Beyond the above, you're pretty much going to have to roll your own. Your specific example of syncing an S3 bucket could be monitored fairly easily with a script that periodically calls the AWS CLI and counts the number of items in the bucket, but that's hardly a good, generic solution.
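
As a rough illustration (the bucket name, prefix, and polling interval below are placeholders), such a watcher could be a simple loop left running in a separate terminal while the play executes:

while true; do
    # count the objects under the prefix being synced and timestamp the result
    printf '%s  objects: %s\n' "$(date '+%H:%M:%S')" \
        "$(aws s3 ls s3://my-bucket/some/prefix --recursive | wc -l)"
    sleep 30
done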

The only thing I could imagine being somewhat effective would be watching the incoming ssh session on one of your nodes.

To do that you could configure the ansible user on that machine to connect via screen and actively watch the session. Alternatively, you could use the log_output option in the sudoers entry for that user and tail the resulting log file; details of log_output can be found in the sudoers man page.
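
As a rough sketch of the sudoers side (the user name, file name, and session id are placeholders), the entry can be dropped into /etc/sudoers.d, with sudoreplay as the dedicated tool for viewing the captured I/O logs:

# enable output logging for the remote "ansible" user (placeholder name)
echo 'Defaults:ansible log_output' | sudo tee /etc/sudoers.d/ansible-iolog
sudo visudo -cf /etc/sudoers.d/ansible-iolog    # sanity-check the syntax

# list the captured sessions for that user, then replay one
sudo sudoreplay -l user ansible
sudo sudoreplay <session-id>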

geckos

If you're on Linux you may use systemd-run to create a transient unit and inspect the output with journalctl, like:

sudo systemd-run --unit foo \
     bash -c 'for i in {0..10}; do
                  echo "$((i * 10))%"; sleep 1;
              done;
              echo "Complete"'

And in another session

sudo journalctl -xf --unit foo

It would output something like:

Apr 07 02:10:34 localhost.localdomain systemd[1]: Started /bin/bash -c for i in {0..10}; do echo "$((i * 10))%"; sleep 1; done; echo "Complete".
-- Subject: Unit foo.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit foo.service has finished starting up.
-- 
-- The start-up result is done.
Apr 07 02:10:34 localhost.localdomain bash[10083]: 0%
Apr 07 02:10:35 localhost.localdomain bash[10083]: 10%
Apr 07 02:10:36 localhost.localdomain bash[10083]: 20%
Apr 07 02:10:37 localhost.localdomain bash[10083]: 30%
Apr 07 02:10:38 localhost.localdomain bash[10083]: 40%
Apr 07 02:10:39 localhost.localdomain bash[10083]: 50%
Apr 07 02:10:40 localhost.localdomain bash[10083]: 60%
Apr 07 02:10:41 localhost.localdomain bash[10083]: 70%
Apr 07 02:10:42 localhost.localdomain bash[10083]: 80%
Apr 07 02:10:43 localhost.localdomain bash[10083]: 90%
Apr 07 02:10:44 localhost.localdomain bash[10083]: 100%
Apr 07 02:10:45 localhost.localdomain bash[10083]: Complete
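
One caveat worth adding to this approach: if an earlier run of the transient unit failed, systemd keeps the failed unit around and a second systemd-run with the same --unit name will refuse to start, so you may need to clear it first:

sudo systemctl reset-failed foo.service    # only needed after a previous failed run
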
John

I came across this problem today on OSX, where I was running a docker command via the shell module that took a long time to build and produced no output while it built. It was very frustrating not knowing whether the command had hung or was just progressing slowly.

I decided to pipe the output (and error) of the shell command to a port, which could then be listened to via netcat in a separate terminal.

myplaybook.yml

# note: the /dev/tcp/... redirection is a bash feature; if the remote /bin/sh
# is not bash, point the shell module at bash via its executable parameter
- name: run some long-running task and pipe to a port
  shell: myLongRunningApp > /dev/tcp/localhost/4000 2>&1

And in a separate terminal window:

$ nc -lk 4000
Output from my
long
running
app will appear here

Note that I pipe the error output to the same port; I could just as easily pipe it to a different port.

Also, I ended up setting a variable called nc_port, which allows changing the port in case it is already in use. The Ansible task then looks like:

  shell: myLongRunningApp > /dev/tcp/localhost/{{nc_port}} 2>&1

Note that the command myLongRunningApp is being executed on localhost (i.e. that's the host set in the inventory) which is why I listen to localhost with nc.
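
If nc_port isn't already defined elsewhere in the playbook, one simple way to supply it (using example values) is as an extra variable on the command line; the nc listener then needs to use the same port:

# supply the port at run time; the listener shown above must match it
ansible-playbook myplaybook.yml -e nc_port=4000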