Efficient way to build a data set from fits image

Question

87 views Asked by Giuseppe Angora At 13 September 2017 at 22:13

I have a set of FITS images: about 32000 images with resolution (256, 256). The dataset I have to build is matrix-like, so the output shape is (32000, 256*256).

The simple solution is a for loop, something like:

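import numpy as np
import pyfits

# file_names is a list of paths
samples = []
for file_name in file_names:
    hdu = pyfits.open(file_name)
    samples.append(hdu[0].data.flatten())  # (256, 256) -> (65536,)
    hdu.close()
# then I can stack the rows into a numpy ndarray of shape (32000, 256*256)
samples = np.vstack(samples)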

This solution is very, very slow. What is the best way to build such a large data set?

1 answer

**chevydog** · Accepted Answer · 2017-09-13T22:39:01+00:00

This isn't really intended to be the main answer, but I felt it was too long for a comment and is relevant.

I believe there are a few things you can do without adjusting your code.

Python is a language specification that has several implementations. The reference implementation is CPython, which is what you download from the official website. However, there are other implementations.

Long story short, try PyPy, as it often runs significantly faster on "memory-hungry Python" such as yours. There is a very nice Reddit post about the advantages of each implementation, but basically: use PyPy, and optimize your code. Additionally, I have never used Numpy, but a post I found suggests you might be able to keep Numpy and still use PyPy.

(Normally, I would also suggest Cython. I have not used it with Numpy myself, but Cython does document support for working with Numpy arrays, so it is worth looking into.) Good luck!
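
As for the "optimize your code" part, here is a minimal sketch of one common approach, not a measured fix: allocate the output array once and write each image directly into its row, instead of collecting a list and concatenating at the end. That way only one copy of the data has to exist in memory. The names `build_dataset`, `load_fits` and `load_image` are made up for illustration, and the reader assumes `astropy.io.fits` (the successor to pyfits) is installed. Note that at float32 the full matrix is 32000 × 65536 × 4 bytes, about 8.4 GB, so memory may be a large part of the slowness.

```python
import numpy as np


def load_fits(path):
    """Read the primary HDU of one FITS file as an array (assumes astropy)."""
    # Imported lazily so build_dataset can be used with another loader.
    from astropy.io import fits
    return fits.getdata(path, ext=0)


def build_dataset(file_names, load_image=load_fits, shape=(256, 256),
                  dtype=np.float32):
    """Fill a preallocated (n_files, height*width) array, one row per image."""
    n_pixels = shape[0] * shape[1]
    # Single allocation up front: no growing list, no final concatenate.
    samples = np.empty((len(file_names), n_pixels), dtype=dtype)
    for row, file_name in enumerate(file_names):
        image = load_image(file_name)
        if image.shape != shape:
            raise ValueError(f"{file_name}: expected {shape}, got {image.shape}")
        # Assignment copies the pixels into the row and casts them to dtype.
        samples[row] = image.ravel()
    return samples
```

With the question's list of paths, `samples = build_dataset(file_names)` would give an array of shape (32000, 65536). If reading the files turns out to be the bottleneck rather than the copying, this alone will not help much, so it is worth timing the read separately.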
