I am a computational-physics graduate student, and my research requires me to write a large array of the values 1 and -1 to one or more binary files. So far I have come up with the following MWE:
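#include <fstream>
#include <sstream>
#include <string>
#include <bitset>

const int Num = 1024;

// Build a string of '0'/'1' characters from `count` states starting at `start`
// (-1 -> '0', 1 -> '1').
std::string int_array_to_string(int state[], int start, int count){
    std::ostringstream oss("");
    for (int i=start; i<start+count; i++){
        switch(state[i]){
            case -1: oss << 0; break;
            case 1: oss << 1; break;
        }
    }
    return oss.str();
}

// Write the states to `output`, 32 states per word.
void printToBinary(int state[], std::ostream &output){
    for (int i=0; i<Num; i+=32){
        // The first character of the string becomes the most significant bit.
        std::bitset<32> x( int_array_to_string(state, i, 32));
        // Note: unsigned long is 8 bytes on many 64-bit platforms, so this
        // can write 8 bytes for every 32 states.
        unsigned long n = x.to_ulong();
        output.write(reinterpret_cast<const char*>(&n), sizeof(n));
    }
}

// Fill the array with alternating -1, 1, -1, ...
void fakeUpSomeData(int state[]){
    int ans = 1;
    for (int i=0; i<Num; i++){
        ans *= -1;
        state[i] = ans;
    }
}

int main(void){
    int state[Num] = {0};
    fakeUpSomeData(state);
    std::ofstream output("output.bin", std::ios::binary);
    printToBinary(state, output);
    return 0;
}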
This, however, makes my program run three times slower than before, and I'm certain there must be a better way to do this.
Additionally, it would be useful to be able to read back chunks of the data later. That is, if I store the three states
{1,-1,1}
{1,-1,1}
{1,1,-1}
in one file, it would be useful to have a way to read the first chunk, then the second chunk, then the third chunk.
A bit of background on why I need to do this: I will need to store between roughly 1024*1e5 and 9632*1e6 of these ints to calculate low- and high-resolution predictions for neutron scattering. Being able to read out chunks of some size 'N' would therefore be extremely useful, instead of storing 1e6 separate binary files in a folder (just typing that option sounds ridiculous!).
Finally, I have considered using the HDF5 package, but it seems like overkill, and I was unable to get an MWE working with it.
Any thoughts on how to improve the MWE would be appreciated and thank you for your time.
Check out this answer: Writing a binary file in C++ very fast
In summary, try using C-style I/O. That is, forget about output streams and use the POSIX calls open() and write() to write directly to a file descriptor.
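As a rough sketch of what that could look like (not a tuned implementation), the code below packs the states into bytes, one bit per state, and hands the whole buffer to write(). The helper names pack_states and write_chunk are made up for illustration, and the bit layout (+1 stored as 1, -1 stored as 0, least significant bit first) is an assumption rather than anything your existing files require. It also assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#include <fcntl.h>    // open (POSIX)
#include <unistd.h>   // write, close, lseek (POSIX)

// Hypothetical helper: pack n states (each +1 or -1) into bytes.
// Assumed layout: state i goes to bit (i % 8) of byte (i / 8); +1 -> 1, -1 -> 0.
std::vector<std::uint8_t> pack_states(const int* state, std::size_t n) {
    std::vector<std::uint8_t> bytes((n + 7) / 8, 0);
    for (std::size_t i = 0; i < n; ++i) {
        assert(state[i] == 1 || state[i] == -1);
        if (state[i] == 1) {
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
    }
    return bytes;
}

// Hypothetical helper: write one packed chunk to an already-open descriptor.
// Returns true if every byte was written.
bool write_chunk(int fd, const int* state, std::size_t n) {
    const std::vector<std::uint8_t> bytes = pack_states(state, n);
    std::size_t done = 0;
    while (done < bytes.size()) {
        // write() may accept fewer bytes than requested, so loop until done.
        ssize_t w = write(fd, bytes.data() + done, bytes.size() - done);
        if (w < 0) return false;
        done += static_cast<std::size_t>(w);
    }
    return true;
}
```

You would open the file once, for example with open("output.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644), call write_chunk(fd, state, Num) for each state, and close(fd) at the end. Building the bytes with shifts also avoids the string and bitset round trip in your MWE, which is probably where a good part of the slowdown comes from.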
You could even use read() with a buffer sized to the number of bytes needed to store one chunk of your binary states, and read the chunks in one at a time.
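Reading a single chunk back could look something like the sketch below. It assumes every chunk in the file holds exactly n states, packed one bit per state (least significant bit first, 1 meaning +1 and 0 meaning -1) and padded to whole bytes, so that chunk k starts at byte k * ceil(n / 8). The function name read_chunk is hypothetical, and the calls are POSIX (lseek() to jump to the chunk, then read()).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#include <fcntl.h>    // open (POSIX)
#include <unistd.h>   // lseek, read, close (POSIX)

// Hypothetical helper: read chunk number k (0-based) of n states from fd.
// Assumed layout: fixed-size chunks, bit (i % 8) of byte (i / 8) holds state i,
// with 1 -> +1 and 0 -> -1. Returns an empty vector on error or if the chunk
// lies past the end of the file.
std::vector<int> read_chunk(int fd, std::size_t k, std::size_t n) {
    const std::size_t chunk_bytes = (n + 7) / 8;
    if (lseek(fd, static_cast<off_t>(k * chunk_bytes), SEEK_SET) < 0) {
        return {};
    }
    std::vector<std::uint8_t> buf(chunk_bytes);
    std::size_t done = 0;
    while (done < chunk_bytes) {
        // read() may return fewer bytes than requested; 0 means end of file.
        ssize_t r = read(fd, buf.data() + done, chunk_bytes - done);
        if (r <= 0) return {};
        done += static_cast<std::size_t>(r);
    }
    std::vector<int> state(n);
    for (std::size_t i = 0; i < n; ++i) {
        state[i] = ((buf[i / 8] >> (i % 8)) & 1u) ? 1 : -1;
    }
    return state;
}
```

With the three example states from the question and n = 3, each chunk occupies one byte, so read_chunk(fd, 0, 3), read_chunk(fd, 1, 3) and read_chunk(fd, 2, 3) would return them in order. Because every chunk has the same size, no index or separate files are needed.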