I'm attempting to read a binary file produced by Fortran into Python. The file contains some integers, some reals and some logicals. At the moment I read the first few numbers correctly with:
x = np.fromfile(filein, dtype=np.int32, count=-1)
firstint = x[1]
...
(np is numpy). But the next item is a logical, later on there are ints again, and after that reals. How can I read all of them?
Typically, when you're reading in values like these, they're in a regular pattern (e.g. an array of C-like structs).
Another common case is a short header of various values followed by a bunch of homogeneously typed data.
Let's deal with the first case first.
Reading in Regular Patterns of Data Types
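For example, every record in the file might consist of two floats, two ints and a bool, repeated over and over:
float, float, int, int, bool, float, float, int, int, bool, ...
If that's the case, you can define a dtype to match the pattern of types. In the case above, it would have five fields, here called a through e: two floats, two ints and a bool.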
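A minimal sketch of a dtype like this, and of reading a file with it, might look as follows. The field names a to e, the 64-bit widths and the file name records.bin are assumptions chosen for illustration; the example writes its own small file first so that it can run on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical record layout: two floats, two ints and a bool per record.
# Field names and 64-bit widths are assumptions; match them to your file.
record_dtype = np.dtype([
    ('a', np.float64),
    ('b', np.float64),
    ('c', np.int64),
    ('d', np.int64),
    ('e', np.bool_),
])

# Write a couple of records so that the example is self-contained.
records = np.array(
    [(4.3, -1.2298, 200, 456, False), (1.5, 2.5, 3, 4, True)],
    dtype=record_dtype,
)
path = os.path.join(tempfile.mkdtemp(), 'records.bin')
records.tofile(path)

# Read the whole file back as a structured array in a single call.
data = np.fromfile(path, dtype=record_dtype)
print(data.dtype.names)
print(data.shape)
```

By default NumPy packs the fields with no padding between them, so each record here occupies 33 bytes. If the program that wrote the file pads its records, the dtype has to account for that.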
(Note: there are many different ways to define the dtype. For example, you could also write that as np.dtype('f8,f8,i8,i8,?'), in which case the fields are named f0, f1, and so on automatically. See the documentation for numpy.dtype for more information.)
When you read your array in, it will be a structured array with named fields. You can later split it up into individual arrays if you'd prefer (e.g. series1 = data['a'] with the dtype defined above).
The main advantage of this is that reading in your data from disk will be very fast. Numpy will simply read everything into memory, and then interpret the memory buffer according to the pattern you specified.
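To make the field access concrete, here is a small sketch that builds a structured array in memory (standing in for the result of np.fromfile), pulls one field out as an ordinary array and looks at a single record. The field names and values are illustrative assumptions, not taken from a real file.

```python
import numpy as np

# Stand-in for the array np.fromfile would return; names are assumptions.
record_dtype = np.dtype([('a', 'f8'), ('b', 'f8'),
                         ('c', 'i8'), ('d', 'i8'), ('e', '?')])
data = np.array(
    [(4.3, -1.2298, 200, 456, False), (0.5, 7.0, 201, 457, True)],
    dtype=record_dtype,
)

# Indexing by field name gives a regular 1-D array (a view, not a copy).
series1 = data['a']
print(series1.dtype, series1.shape)

# Indexing by position gives one whole record.
first = data[0]
print(first['c'], first['e'])

# np.frombuffer applies the same interpretation to raw bytes in memory,
# which is essentially what happens once the file has been read.
same = np.frombuffer(data.tobytes(), dtype=record_dtype)
print(bool((same == data).all()))
```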
The drawback is that structured arrays behave a bit differently than regular arrays. If you're not used to them, they'll probably seem confusing at first. The key part to remember is that each item in the array is one of the patterns that you specified. For example, for what I showed above, data[0] might be something like (4.3, -1.2298, 200, 456, False).
Reading in a Header
Another common case is that you have a header with a known format and then a long series of regular data. You can still use np.fromfile for this, but you'll need to parse the header separately.
First, read in the header. You can do this in several different ways (e.g. have a look at the struct module in addition to np.fromfile, though either will probably work well for your purposes).
After that, when you pass the file object to fromfile, the file's internal position (i.e. the position controlled by f.seek) will be at the end of the header and start of the data. If all of the rest of the file is a homogeneously-typed array, a single call to np.fromfile(f, dtype) is all you need.
As a quick example, you might have something like the following:
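The sketch below is one way this could look under an assumed layout: a header of two 32-bit integers (a row count and a column count) followed by float64 data. The layout, the variable names and the file itself are hypothetical, and the example writes its own file so that it is runnable. The header is parsed with the struct module, and np.fromfile then picks up where the header ended.

```python
import os
import struct
import tempfile

import numpy as np

# Hypothetical layout: two little-endian int32 values (nrows, ncols),
# followed by nrows * ncols little-endian float64 values.
header_format = '<ii'
path = os.path.join(tempfile.mkdtemp(), 'with_header.bin')

# Write a small file so that the example is self-contained.
values = np.arange(6, dtype='<f8').reshape(2, 3)
with open(path, 'wb') as f:
    f.write(struct.pack(header_format, *values.shape))
    values.tofile(f)

with open(path, 'rb') as f:
    # Read exactly the header bytes; this advances the file position.
    header_bytes = f.read(struct.calcsize(header_format))
    nrows, ncols = struct.unpack(header_format, header_bytes)

    # The file position is now at the start of the data.
    data = np.fromfile(f, dtype='<f8').reshape(nrows, ncols)

print(nrows, ncols)
print(data)
```

If the header itself mixes types, the format string passed to struct (or a small structured dtype read with count=1) would need to list each header value in order.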