I know that io.BytesIO() returns a binary stream object that uses an in-memory buffer. It also provides getbuffer(), which returns a readable and writable view (a memoryview object) over the contents of the buffer without copying them.
import io
obj = io.BytesIO(b'abcdefgh')
buf = obj.getbuffer()
Now, buf points to the underlying data, and slicing it (buf[:3]) returns another memoryview object, again without making a copy. So I want to know: if we call obj.read(3), does it also use the in-memory buffer, or does it make a copy? If it does use the in-memory buffer, what is the difference between obj.read and buf, and which one should be preferred for reading the data efficiently in chunks from very long bytes objects?
Simply put,
BytesIO.read reads data from the in-memory buffer. It returns the data as a bytes object, which is a copy of the data that was read. buf, however, is a memoryview object that views the underlying buffer and doesn't make a copy of the data.
The difference between BytesIO.read and buf is that the bytes returned by io.BytesIO.read are independent of the buffer, so later changes to the buffer will not affect them, whereas if you change data through buf you also change the data in the buffer.
For reading the data in chunks, obj.read is usually the simpler choice, because it provides a clear separation between the data and the buffer and makes the buffer easier to manage (it tracks the stream position for you). Each call does copy the chunk it returns, though, so if avoiding copies matters for very large data, slicing buf is the cheaper option. If you want to modify the data in the buffer, using buf is the better choice because it provides direct access to the underlying data.
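To make the copy-versus-view distinction concrete, here is a minimal sketch that reuses obj and buf from the question. The helper names iter_chunks_read and iter_chunks_view are hypothetical, introduced only for illustration, and the walrus operator assumes Python 3.8 or newer.

```python
import io

obj = io.BytesIO(b'abcdefgh')

# read() returns a new bytes object (a copy) and advances the stream position.
head = obj.read(3)

buf = obj.getbuffer()   # memoryview over the underlying buffer, no copy
view = buf[:3]          # slicing a memoryview gives another view, still no copy

# Writing through the view changes the data held by the BytesIO object.
buf[0:1] = b'X'

print(head)             # b'abc'  (the earlier copy is unaffected)
print(bytes(view))      # b'Xbc'  (the view sees the change)
print(obj.getvalue())   # b'Xbcdefgh'


def iter_chunks_read(stream, size):
    """Hypothetical helper: yield chunks as independent bytes objects (one copy each)."""
    while chunk := stream.read(size):
        yield chunk


def iter_chunks_view(mv, size):
    """Hypothetical helper: yield chunks as memoryview slices (no copy)."""
    for start in range(0, len(mv), size):
        yield mv[start:start + size]


obj.seek(0)
read_chunks = list(iter_chunks_read(obj, 3))
# bytes(c) copies here only so the two results can be compared directly.
view_chunks = [bytes(c) for c in iter_chunks_view(buf, 3)]
print(read_chunks == view_chunks)   # True
```

One thing to keep in mind with the view-based approach: as long as a view obtained from getbuffer() exists, the BytesIO object cannot be resized or closed, so release the views (or let them go out of scope) before writing more data or closing the stream.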