I am using Pybind11/Nanobind to write Python bindings for my C++ libraries.
One of my C++ functions takes an argument of type std::istream &, e.g.:
std::string cPXGStreamReader::testReadStream(std::istream &stream)
{
    std::ostringstream contentStream;
    std::string line;
    while (std::getline(stream, line)) {
        contentStream << line << '\n'; // Append line to the content string
    }
    return contentStream.str(); // Convert contentStream to string and return
}
What kind of argument do I need to pass in Python which corresponds to this?
I have tried passing s, where s is created in either of these ways:
import io
s = open(r"test_file.pxgf", "rb")
# or
s = io.BytesIO(b"some initial binary data: \x00\x01")
to no avail. I get the error:
TypeError: test_read_file(): incompatible function arguments. The following argument types are supported:
1. (self: pxgf.PXGStreamReader, arg0: std::basic_istream<char,std::char_traits<char> >) -> str
Invoked with: <pxgf.PXGStreamReader object at 0x000002986CF9C6B0>, <_io.BytesIO object at 0x000002986CF92250>
Did you forget to `#include <pybind11/stl.h>`? Or <pybind11/complex.h>,
<pybind11/functional.h>, <pybind11/chrono.h>, etc. Some automatic
Pybind11 doesn't support stream arguments out of the box, so a custom implementation is needed. We need some kind of adapter that lets us create an istream which reads from a Python object. Two options come to mind:
1. Implement a custom std::streambuf and use that with std::istream (related Q&A).
2. Implement a Boost IOStreams Source and wrap it in boost::iostreams::stream (which you can pass to your function without any modifications).
I'll focus on the latter option, and restrict the solution to file-like objects (i.e. derived from io.IOBase) used for input.
Using Boost IOStreams
Source Implementation
We need to create a class satisfying the "Source" model of Boost IOStreams (there is a handy tutorial in the documentation about this subject). Basically something with the following characteristics:
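As a rough sketch of that shape, the following uses a plain callback in place of the Python object so that it compiles without Pybind11 or Boost. The names callback_source and ReadFn are hypothetical. A real Source would additionally declare typedef boost::iostreams::source_tag category, and would hold pybind11::object members instead of the callback. The key contract is that read() returns the number of characters copied, or -1 once the end of the data is reached.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <functional>
#include <ios>
#include <string>
#include <utility>

// Hypothetical stand-in for the cached Python read attribute: takes a byte
// count and returns up to that many bytes; an empty string means end-of-file.
using ReadFn = std::function<std::string(std::streamsize)>;

class callback_source {
public:
    typedef char char_type;
    // With Boost available, a Source also declares:
    //   typedef boost::iostreams::source_tag category;

    explicit callback_source(ReadFn read) : read_(std::move(read)) {}

    // Copy up to n characters into buffer; return the count, or -1 at EOF.
    std::streamsize read(char* buffer, std::streamsize n)
    {
        if (n <= 0) {
            return 0;
        }
        std::string chunk = read_(n);
        if (chunk.empty()) {
            return -1; // Signal end-of-file
        }
        // Guard against a callback that returns more than was requested.
        std::streamsize count =
            std::min(n, static_cast<std::streamsize>(chunk.size()));
        std::memcpy(buffer, chunk.data(), static_cast<std::size_t>(count));
        return count;
    }

private:
    ReadFn read_;
};
```

This sketch takes the simple route described below for read(): the data arrives as a std::string and is copied into the caller's buffer.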
Constructor
In the constructor, we should store a reference to the Python object (pybind11::object) used as a data source. We should also verify that the data source object is actually file-like.
Finally, we can cache the read attribute of the file-like object in another pybind11::object member variable, in order to avoid a lookup every time we want to call it.
read(...)
This function needs to read the requested number of bytes into the provided buffer, and signal End-Of-File when it's reached. The simplest approach is to cast the result of the Python file-like's read() method to std::string, and then copy its contents to the read buffer. It works for both binary and text IO.
However, this involves an unnecessary copy. If you're using a recent enough C++ standard (C++17 or later), you could cast to std::string_view instead. Otherwise, a somewhat lower-level approach may be used, based on the implementation of pybind11::string.
Sample Function Binding
Let's use a free-standing function modeled after your example to test this:
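A minimal sketch of such a function, assuming the hypothetical name test_read_stream and the same line-by-line logic as the member function in the question:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Free-standing counterpart of the question's testReadStream: reads the
// stream line by line and returns everything it saw, one '\n' per line.
std::string test_read_stream(std::istream& stream)
{
    std::ostringstream content;
    std::string line;
    while (std::getline(stream, line)) {
        content << line << '\n';
    }
    return content.str();
}
```

Since it only depends on std::istream, it can be exercised with a std::istringstream before any Python is involved.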
Creating a Pybind11 typecaster for streams looks like opening yet another can of worms, so let's skip that. Instead, let's use a rather simple lambda when defining the function bindings. It simply needs to construct a boost::iostreams::stream using our source, and call the wrapped function.
Complete Source Code
Here it's all together:
Running it produces the following output:
Note: Due to buffering inherent in istream, it will usually read from the Python object in chunks (e.g. here it asks for 4096 bytes every time). If you make partial reads of the stream (e.g. just get a single line), you will need to re-adjust the file-like's read position explicitly to reflect the number of bytes actually consumed.
Not Using Boost
Pybind11 source code contains an output streambuf implementation that writes to Python streams. This can be a decent inspiration to get started. I managed to find an existing streambuf implementation for input in the BlueBrain/nmodl repository on GitHub. You would wrap the Python object in it and pass the resulting buffer to a std::istream (probably as part of a wrapper lambda used to dispatch your C++ function).
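To illustrate the streambuf approach without depending on either library, here is a minimal sketch. It is not the nmodl implementation: the class name callback_istreambuf and the ReadFn callback are hypothetical, and the callback stands in for a call to the Python object's read() method. The only override needed for input is underflow(), which refills the get area when it runs dry.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

class callback_istreambuf : public std::streambuf {
public:
    // Hypothetical stand-in for the Python read() call: returns up to the
    // requested number of bytes, or an empty string at end-of-file.
    using ReadFn = std::function<std::string(std::size_t)>;

    explicit callback_istreambuf(ReadFn read, std::size_t chunk_size = 4096)
        : read_(std::move(read)), chunk_size_(chunk_size) {}

protected:
    // Called by the stream when the get area is exhausted.
    int_type underflow() override
    {
        if (gptr() < egptr()) {
            return traits_type::to_int_type(*gptr());
        }
        buffer_ = read_(chunk_size_);
        if (buffer_.empty()) {
            return traits_type::eof();
        }
        char* begin = &buffer_[0];
        setg(begin, begin, begin + buffer_.size());
        return traits_type::to_int_type(*gptr());
    }

private:
    ReadFn read_;
    std::size_t chunk_size_;
    std::string buffer_; // Owns the bytes the get area points into
};
```

In a binding, the wrapper lambda would construct the buffer, create a std::istream from its address, and pass that stream to the C++ function. The same caveat about chunked reads applies: the underlying object may have been read further than the stream's consumer actually got.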