Using ImageMagick to efficiently stitch together a line scan image


I'm looking for alternatives to line scan cameras for use in sports timing, or rather for the part where placing needs to be figured out. I found that common industrial cameras can readily match the speed of commercial camera solutions at >1000 frames per second. For my needs it is usually not the timing accuracy that matters, but the relative placing of athletes. I figured I could use one of the cheapest Basler, IDS or other area scan industrial cameras for this purpose. Of course there are line scan cameras that can do a lot more than a few thousand fps (or Hz), but it is possible to get area scan cameras that can do the required 1000-3000fps for less than 500€.

My holy grail would of course be the near-real time image composition capabilities of FinishLynx (or any other line scan system), basically this part: https://youtu.be/7CWZvFcwSEk?t=23s

The whole process I was thinking for my alternative is:

  • Use Basler Pylon Viewer (or other software) to record 2px wide images at the camera’s fastest read speed. For the camera I am currently using, this means it has to be turned on its side and the height needs to be reduced, since that is the only way it will read 1920x2px frames @ >250fps
  • Make a program or batch script that stitches these 1920x2px frames together; for example, one second of recording at 1000fps gives 1000 frames of 1920x2px, i.e. a resulting image with a resolution of 1920x2000px (Horizontal x Vertical)
  • Finally, using the same program or another tool, rotate the image so it reflects how the camera is positioned, thus achieving an image with a resolution of 2000x1920px (again Horizontal x Vertical)
  • Open the image in an analyzing program (currently ImageJ) to quickly analyze results

I am no programmer, but this is what I was able to put together just using batch scripts, with the help of Stack Overflow of course.

  • Currently, recording a whole 10 seconds to disk as a raw/MJPEG (AVI/MKV) stream can be done in real time.
  • Recording individual frames as TIFF or BMP, or using FFmpeg to save them as PNG or JPG, takes ~20-60 seconds.
  • The appending and rotation then take a further ~45-60 seconds.
  • This all needs to be achieved in less than 60 seconds for 10 seconds of footage (1000-3000fps @ 10s = 10000-30000 frames), which is why I need something faster.

I was able to figure out how to be pretty efficient with ImageMagick:

magick convert -limit file 16384 -limit memory 8GiB -interlace Plane -quality 85 -append +rotate 270 "%folder%\Basler*.Tiff" "%out%"

# %out% has a .jpg filename that is dynamically built from the folder name and the number of frames.

This command works and gets me 10000 frames encoded in about 30 seconds on an i5-2520M (most of the processing seems to use only one thread, though, since it works at 25% CPU usage). This is the resulting image: https://i.stack.imgur.com/LuwK7.jpg (19686x1928px)

However, since recording to TIFF frames using Basler's Pylon Viewer takes that much longer than recording an MJPEG video stream, I would like to use the MJPEG (AVI/MKV) file as the source for the appending. I noticed FFmpeg has an "image2pipe" output format, which should be able to pass images directly to ImageMagick. I was not able to get this working, though:

   $ ffmpeg.exe -threads 4 -y -i "Basler acA1920-155uc (21644989)_20180930_043754312.avi" -f image2pipe - | convert - -interlace Plane -quality 85 -append +rotate 270 "%out%" >> log.txt
    ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
      built with gcc 7.2.0 (GCC)
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    Invalid Parameter - -interlace
    [mjpeg @ 000000000046b0a0] EOI missing, emulating
    Input #0, avi, from 'Basler acA1920-155uc (21644989)_20180930_043754312.avi':
      Duration: 00:00:50.02, start: 0.000000, bitrate: 1356 kb/s
        Stream #0:0: Video: mjpeg (MJPG / 0x47504A4D), yuvj422p(pc, bt470bg/unknown/unknown), 1920x2, 1318 kb/s, 200 fps, 200 tbr, 200 tbn, 200 tbc
    Stream mapping:
      Stream #0:0 -> #0:0 (mjpeg (native) -> mjpeg (native))
    Press [q] to stop, [?] for help
    Output #0, image2pipe, to 'pipe:':
      Metadata:
        encoder         : Lavf57.83.100
        Stream #0:0: Video: mjpeg, yuvj422p(pc), 1920x2, q=2-31, 200 kb/s, 200 fps, 200 tbn, 200 tbc
        Metadata:
          encoder         : Lavc57.107.100 mjpeg
        Side data:
          cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
    av_interleaved_write_frame(): Invalid argument
    Error writing trailer of pipe:: Invalid argument
    frame=    1 fps=0.0 q=1.6 Lsize=       0kB time=00:00:00.01 bitrate= 358.4kbits/s speed=0.625x
    video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
    Conversion failed!

If I go a bit higher for the height, I no longer get the "[mjpeg @ 000000000046b0a0] EOI missing, emulating" error. However, the whole thing will only work with <2px high/wide footage.

edit: Oh yes, I can also use ffmpeg -i file.mpg -r 1/1 $filename%03d.bmp or ffmpeg -i file.mpg $filename%03d.bmp to extract all the frames from the MJPEG/RAW stream. However, this is an extra step I do not want to take. (Just deleting a folder of 30000 JPGs alone takes 2 minutes…)

Can someone think of a working solution for the piping method or a totally different alternative way of handling this?


There are 2 answers

Mark Setchell (best answer)

I had another go at this to see if I could speed up my other answer by doing things a couple of different ways - hence a different answer. I used the same synthetic videoclip that I generated in the other answer to test with.

Rather than pass the 2x1920 scanlines into ImageMagick for it to append together and write as a JPEG, I did the following:

  • created the full output frame up front in a C++ program, and then looped reading in a 2x1920 scanline on each iteration and stuffed that into the correct position in the output frame, and

  • when the entire sequence has been read, compressed it into a JPEG using turbo-jpeg and wrote it to disk.

As such, ImageMagick is no longer required. The entire program now runs in around 1.3 seconds rather than the 10.3 seconds via ImageMagick.

Here is the code:

////////////////////////////////////////////////////////////////////////////////
// stitch.cpp
// Mark Setchell
//
// Read 2x1920 RGB frames from `ffmpeg` and stitch into 20000x1920 RGB image.
////////////////////////////////////////////////////////////////////////////////
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <unistd.h>
#include <turbojpeg.h>

using namespace std;

int main()
{
   int frames = 10000;
   int height = 1920;
   int width  = frames *2;

   // Buffer in which to assemble complete output image (RGB), e.g. 20000x1920
   unsigned char *img = new unsigned char [width*height*3];

   // Buffer for one scanline image 1920x2 (RGB)
   unsigned char *scanline = new unsigned char[2*height*3];

   // Output column
   int ocol=0;

   // Read frames from `ffmpeg` fed into us like this:
   // ffmpeg -threads 4 -y -i video.mov -frames 10000 -vf "transpose=1" -f image2pipe -vcodec rawvideo -pix_fmt rgb24 - | ./stitch
   for(int f=0;f<frames;f++){
      // Read one scanline from stdin, i.e. 2x1920 RGB image.
      // read() on a pipe may return fewer bytes than requested, so
      // loop until the full scanline has arrived (or the stream ends)
      ssize_t total = 0;
      while(total < 2*height*3){
         ssize_t bytesread = read(STDIN_FILENO, scanline+total, 2*height*3-total);
         if(bytesread <= 0) return 1;   // premature EOF or read error
         total += bytesread;
      }

      // ... and place into finished frame
      // ip is pointer to input image
      unsigned char *ip = scanline;
      for(int row=0;row<height;row++){
         unsigned char *op = &(img[(row*width*3)+3*ocol]);
         // Copy 2 RGB pixels from scanline to output image
         *op++ = *ip++; // red
         *op++ = *ip++; // green
         *op++ = *ip++; // blue
         *op++ = *ip++; // red
         *op++ = *ip++; // green
         *op++ = *ip++; // blue
      }
      ocol += 2; 
   }

   // Now encode to JPEG with turbo-jpeg
   const int JPEG_QUALITY = 75;
   long unsigned int jpegSize = 0;
   unsigned char* compressedImage = NULL;
   tjhandle _jpegCompressor = tjInitCompress();

   // Compress in memory
   tjCompress2(_jpegCompressor, img, width, 0, height, TJPF_RGB,
          &compressedImage, &jpegSize, TJSAMP_444, JPEG_QUALITY,
          TJFLAG_FASTDCT);

   // Clean up
   tjDestroy(_jpegCompressor);

   // And write to disk
   ofstream f("result.jpg", ios::out | ios::binary);
   f.write(reinterpret_cast<char*>(compressedImage), jpegSize);

   // Free the buffer allocated by turbo-jpeg
   tjFree(compressedImage);
}

Notes:

Note 1: In order to pre-allocate the output image, the program needs to know in advance how many frames are coming. I did not parameterize that; I just hard-coded 10,000, but it should be easy enough to change.

One way to determine the number of frames in the video sequence is this:

ffprobe -v error -count_frames -select_streams v:0 -show_entries stream=nb_frames -of default=nokey=1:noprint_wrappers=1 video.mov

Note 2: I compiled the code with a couple of switches for performance:

g++-8 -O3 -march=native stitch.cpp -o stitch

Note 3: If you are running on Windows, you may need to re-open stdin in binary mode before doing:

read(STDIN_FILENO...)

Note 4: If you don't want to use turbo-jpeg, you could remove everything after the end of the main loop and simply send a NetPBM PPM image to ImageMagick via a pipe, letting it do the JPEG writing. That would look roughly like this:

// Write a binary PPM: text header, then the raw RGB pixels
std::cout << "P6 " << width << " " << height << " 255\n";
std::cout.write(reinterpret_cast<char*>(img), width*height*3);

Then you would run with:

ffmpeg ... | ./stitch | magick ppm:- result.jpg
Mark Setchell

I generated a sample video of 10,000 frames and did some tests. Obviously, my machine is not the same specification as yours, so results are not directly comparable, but I found it is quicker to let ffmpeg transpose the video and then pipe it into ImageMagick as raw RGB24 frames.

I found that I can convert a 10 second movie into a 20,000x1920 pixel JPEG in 10.3s like this:

ffmpeg -threads 4 -y -i video.mov -frames 10000 -vf "transpose=1" -f image2pipe -vcodec rawvideo -pix_fmt rgb24 - | convert -depth 8 -size 2x1920 rgb:- +append result.jpg

The resulting image looks like this:

(image: the resulting stitched frame)


I generated the video like this with CImg. Basically it just draws a Red/Green/Blue splodge successively further across the frame till it hits the right edge, then starts again at the left edge:

#include <iostream>
#include "CImg.h"

using namespace std;
using namespace cimg_library;

int main()
{
   // Frame we will fill
   CImg<unsigned char> frame(1920,2,1,3);

   int width =frame.width();
   int height=frame.height();

   // Item to draw in frame - created with ImageMagick
   // convert xc:red xc:lime xc:blue +append -resize 256x2\! splodge.ppm
   CImg<unsigned char> splodge("splodge.ppm");

   int offs  =0;

   // We are going to output 10000 frames of RGB raw video
   for(int f=0;f<10000;f++){
      // Generate white image
      frame.fill(255);

      // Draw coloured splodge at correct place
      frame.draw_image(offs,0,splodge);
      offs = (offs + 1) % (width - splodge.width());

      // Output to ffmpeg to make video, in planar GBR format
      // i.e. run program like this
      // ./main | ffmpeg -y -f rawvideo -pixel_format gbrp -video_size 1920x2 -i - -c:v h264 -pix_fmt yuv420p video.mov
      char* s=reinterpret_cast<char*>(frame.data()+(width*height));   // Get start of G plane
      std::cout.write(s,width*height);                                // Output it
      s=reinterpret_cast<char*>(frame.data()+2*(width*height));       // Get start of B plane
      std::cout.write(s,width*height);                                // Output it
      s=reinterpret_cast<char*>(frame.data());                        // Get start of R plane
      std::cout.write(s,width*height);                                // Output it
   }
}

The splodge is 192x2 pixels and looks like this:

(image: the splodge)