I have a project that needs to be converted from C++ to Python. The project was originally written by someone else. The logic is described as follows:
"It loads 20 plain text files in which each line represents a record with 7 columns. It then parses each line and loads all the data into C++ vectors and maps. Finally, it uses binary serialization to write those vectors and maps out into 20 corresponding binary files."
I don't know why the author used binary serialization, but I can guess at three possible reasons:
- The binary files are smaller on disk than the plain text files.
- Initialization is faster, since the system starts by loading these files into memory and populating its data structures.
- It MIGHT also speed things up at runtime, beyond the initialization stage, although that seems unlikely.
I have little experience with object serialization, and as I rewrite this in Python I don't know whether I should use it or not. In Python, does object serialization speed up a program noticeably, or reduce the size of the files stored on disk? Are there any other benefits? I know the downside is that it makes the program more complicated.
When implementing the logic described above in Python, should I use object serialization as well?
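For reference, here is a minimal sketch of what I imagine the Python version would look like if I kept the serialization step, using pickle. The file names, the whitespace delimiter, and the choice of the first column as the map key are all assumptions on my part, since I haven't shown the real format:

```python
import pickle

TEXT_FILE = "records_01.txt"    # hypothetical name for one of the 20 text files
BINARY_FILE = "records_01.pkl"  # hypothetical name for its binary counterpart

def parse_text_file(path):
    """Parse a plain-text file where each line holds a 7-column record."""
    records = []  # rough Python equivalent of the C++ vector
    index = {}    # rough Python equivalent of the C++ map
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            cols = line.split()        # assumes whitespace-separated columns
            if len(cols) != 7:
                continue               # skip malformed lines
            records.append(cols)
            index[cols[0]] = cols      # assumes column 0 works as a key
    return records, index

def save_binary(records, index, path):
    """Serialize the parsed structures to a binary file with pickle."""
    with open(path, "wb") as f:
        pickle.dump((records, index), f, protocol=pickle.HIGHEST_PROTOCOL)

def load_binary(path):
    """Load the previously serialized structures without re-parsing the text."""
    with open(path, "rb") as f:
        return pickle.load(f)
```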
It sounds like the point is almost certainly to avoid this part: "It then parses each line and loads all the data into C++ vectors and maps".
If the binary machine data in the "C++ vectors and maps" can later be loaded back directly, then all the expense of parsing the text to recreate them from scratch can be skipped.
Can't answer whether you "should" do the same, though, without knowing details of the expenses involved. Not enough info here to guess.
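If you want to measure it for your data, a rough sketch like the one below compares a cold parse of one text file against unpickling a cached binary copy, and also shows the usual "use the cache only if it's newer than the text file" pattern. The paths and the whitespace-separated, 7-column format are placeholders and assumptions, not details taken from your project:

```python
import os
import pickle
import time

TEXT_FILE = "records_01.txt"   # placeholder path to one of the 20 text files
CACHE_FILE = "records_01.pkl"  # placeholder path to its cached binary form

def parse_text_file(path):
    """Parse the text file into a list of records (assumed whitespace-separated columns)."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.split() for line in f if line.strip()]

def load_records(text_path, cache_path):
    """Load from the pickle cache if it is newer than the text file; otherwise parse and refresh the cache."""
    if os.path.exists(cache_path) and os.path.getmtime(cache_path) >= os.path.getmtime(text_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    records = parse_text_file(text_path)
    with open(cache_path, "wb") as f:
        pickle.dump(records, f, protocol=pickle.HIGHEST_PROTOCOL)
    return records

if __name__ == "__main__":
    start = time.perf_counter()
    parse_text_file(TEXT_FILE)
    print(f"text parse:  {time.perf_counter() - start:.3f}s")

    load_records(TEXT_FILE, CACHE_FILE)   # make sure the cache exists first
    start = time.perf_counter()
    load_records(TEXT_FILE, CACHE_FILE)
    print(f"pickle load: {time.perf_counter() - start:.3f}s")
```

If the parse time turns out to be negligible for your file sizes, skipping the binary step keeps the Python version simpler.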