I am trying to convert image data saved in a rosbag file to numpy arrays and opencv images for further processing. I cannot use cv_bridge or any of the other ROS utils.
I read the rosbag using the bagpy module and convert the data to a pandas dataframe:
import numpy as np
import cv2
import bagpy
from bagpy import bagreader
import matplotlib.pyplot as plt
import pandas as pd
import csv
b = bagreader('camera.bag')
image_csv = b.message_by_topic('/left/image')
df_limage = pd.read_csv('camera/left-image.csv')
Because the rosbag stores images as type bytestring, the df_limage dataframe looks like:
>>> df_limage.head()
time height width encoding is_bigendian data
1.593039e+09 1080 1920 rgb8 0 b' \'\n"*\x0c$\'\x14\x1f...
When I try to examine the image stored in the data column, I see that each image is stored as a string:
>>> type(df_limage['data'][0])
str
>>> len(df_limage['data'][0])
15547333
>>> print(df_limage['data'][0])
b' \'\n"*\x0c$\'\x14\x1f#\x0f\x1d!\x12 %\x16\x1f\'\x0e\x1c%\x0b\x1c&\x12\x19#\x10\x1e#\x13\x1f$\x14##\x16!!\x13$$"$$"&*\x12$(\x1...
When I try to decode this using code from this answer, I get warnings and NoneType returns:
>>> nparr = np.fromstring(df_limage['data'][0], np.uint8)
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
>>> img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
>>> type(img_np)
NoneType
I think this is because the string isn't being read correctly as a bytestring and nparr hasn't been reshaped into a 3-channel RGB image of dimensions (1080 x 1920). The size of nparr is 15547333, so it can't be reshaped into a (1080 x 1920 x 3) image which leads me to believe that the np.fromstring call isn't correct.
How do I take a byte string that is represented as a str with a leading "b'", convert it back to a byte string, then into an array, and finally into an opencv image?
Thanks
Your image is pure rgb8 pixels in a bytes type. That means it is not a str and you shouldn't treat it as such, and you shouldn't call cv2.imdecode() on it, because that decompresses images and turns them into Numpy arrays of pixels, which is nearly what you already have.
So, you have a number of contiguous bytes representing pixels. The length of your bytes should be 1920x1080x3, i.e. one byte per channel for 3 channels of 1080p dimensions. We need to make a Numpy array from the buffer with np.frombuffer() and then reshape it from a long line into 1080p, i.e. into shape (1080, 1920, 3).
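The step above can be sketched as follows. The question shows the CSV cell arriving as a str that holds the text of a bytes literal, so this sketch assumes ast.literal_eval() can turn that text back into real bytes first. The helper names repr_to_bytes and raw_rgb8_to_array are hypothetical, and the tiny 2x3 frame is synthetic data standing in for a real 1080x1920 cell.

```python
import ast

import numpy as np


def repr_to_bytes(text):
    """Turn the text "b'...'" (a bytes literal stored as a str) back into bytes."""
    value = ast.literal_eval(text)
    if not isinstance(value, bytes):
        raise TypeError("expected the text of a bytes literal")
    return value


def raw_rgb8_to_array(buf, height, width):
    """View raw rgb8 bytes as a (height, width, 3) uint8 array."""
    expected = height * width * 3
    if len(buf) != expected:
        raise ValueError(f"got {len(buf)} bytes, expected {expected}")
    return np.frombuffer(buf, dtype=np.uint8).reshape((height, width, 3))


# Synthetic 2x3 rgb8 frame (assumption: stands in for one 'data' cell).
pixels = bytes(range(18))
cell = repr(pixels)  # a str starting with b', like the CSV cell in the question

rgb = raw_rgb8_to_array(repr_to_bytes(cell), height=2, width=3)

# OpenCV works in BGR order, so reverse the channel axis before using cv2.
bgr = np.ascontiguousarray(rgb[:, :, ::-1])
```

For the data in the question, the equivalent call would presumably be raw_rgb8_to_array(repr_to_bytes(df_limage['data'][0]), 1080, 1920), provided the CSV cell contains the complete literal. cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR) is an alternative to the slicing used here.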
General rule:
Part 1
You should generally only be calling cv2.imdecode() on things that start with either a PNG (b'\x89PNG'), a JPEG (b'\xff\xd8'), a TIFF (b'II' or b'MM') or a BMP (b'BM') magic signature.
Part 2
If your buffer begins with a base64-encoded version of either of the above, i.e. iVBORw0KGgo= (PNG) or /9 (JPEG), you need to base64-decode it, then call cv2.imdecode() on the result of that.
Part 3
If your data is bytes type and already has the same length as the dimensions of your image, i.e. len(YOURBYTES) == height*width*nChannels like you have, that means it is pure, uncompressed pixels, so you just need the first part of this answer: np.frombuffer() followed by a reshape.
Note that, unlike in Parts 1 and 2 above, the reshaping is necessary here because there was no JPEG or PNG metadata telling us the height and width of the image.
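The three parts of the rule can be folded into one small check, sketched below. The function name classify_buffer and its return labels are hypothetical. The exact-length test runs first as a design choice of this sketch, so that raw pixels which happen to begin with something like b'II' are not mistaken for a TIFF.

```python
import base64

# Magic signatures from Part 1: PNG, JPEG, TIFF (two byte orders), BMP.
MAGIC = (b"\x89PNG", b"\xff\xd8", b"II", b"MM", b"BM")
# Base64 prefixes from Part 2: PNG and JPEG.
B64_MAGIC = (b"iVBORw0KGgo", b"/9")


def classify_buffer(buf, height, width, channels=3):
    """Return 'raw', 'encoded', 'base64' or 'unknown' for a bytes buffer."""
    if len(buf) == height * width * channels:
        return "raw"      # Part 3: uncompressed pixels, reshape only
    if buf.startswith(MAGIC):
        return "encoded"  # Part 1: pass to cv2.imdecode()
    if buf.startswith(B64_MAGIC):
        return "base64"   # Part 2: base64-decode, then cv2.imdecode()
    return "unknown"


# Synthetic buffers for illustration only, not real image files.
png_like = b"\x89PNG\r\n\x1a\n" + bytes(8)
jpeg_like = b"\xff\xd8\xff\xe0" + bytes(12)

kind_raw = classify_buffer(bytes(18), height=2, width=3)
kind_png = classify_buffer(png_like, height=2, width=3)
kind_b64 = classify_buffer(base64.b64encode(jpeg_like), height=2, width=3)
```

A buffer labelled 'encoded' would then go through cv2.imdecode(np.frombuffer(buf, np.uint8), cv2.IMREAD_COLOR), a 'base64' one through base64.b64decode() first, and a 'raw' one through np.frombuffer() plus reshape.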