Use textscan in Matlab to output data


I've got a large text file with some headers and numerical data. I want to ignore the header lines and specifically output the data in columns 2 and 4.

Example data

[headers]  
line1  
line2  
line3

[data]  
1 2 3 4  
5 6 7 8  
9 10 11 12

I've tried using the following code:

FID = fopen('datafile.dat');  
data = textscan(FID,'%f',4,'delimiter',' ','headerLines',4);  
fclose(FID);

However, the only output I get is a 0x1 cell.

There is 1 answer

Answered by grungetta

Try this:

FID = fopen('datafile.dat');
data = textscan(FID,'%f %f %f %f', 'headerLines', 6);
fclose(FID);

data will be a 1x4 cell array. Each cell will contain a 3x1 array of double values, which are the values in each column of your data.

You can access the 2nd and 4th columns of your data by executing data{2} and data{4}.
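As a minimal sketch of that access pattern, the snippet below recreates the example file from the question in a temporary location and then pulls out columns 2 and 4. The temporary path from tempname and the variable names col2, col4, and cols are illustrative assumptions, not part of the original question.

```matlab
% Recreate the example file from the question in a temporary location.
fname = [tempname '.dat'];
FID = fopen(fname, 'w');
fprintf(FID, '[headers]\nline1\nline2\nline3\n\n[data]\n');
fprintf(FID, '1 2 3 4\n5 6 7 8\n9 10 11 12\n');
fclose(FID);

% Skip the 6 header lines, then read four numeric columns per row.
FID = fopen(fname);
data = textscan(FID, '%f %f %f %f', 'HeaderLines', 6);
fclose(FID);
delete(fname);

col2 = data{2};        % 2nd column as a 3x1 double
col4 = data{4};        % 4th column as a 3x1 double
cols = [col2, col4];   % both columns side by side as a 3x2 matrix
```

Concatenating the two cells into cols is optional; it is just convenient if you want to pass both columns on as a single numeric matrix.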


With your original code, the main issue is that the data file has 6 header lines (the blank line and the [data] line count too), but you've specified only 4.
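If the number of header lines is not known in advance, one possible approach is to read lines with fgetl until the [data] marker is found and let textscan continue from that position. This is only a sketch under assumptions: it presumes the marker line reads exactly [data] as in the example, and the names fname, nHeader, and tline are made up for illustration.

```matlab
% Recreate the example file from the question in a temporary location.
fname = [tempname '.dat'];
FID = fopen(fname, 'w');
fprintf(FID, '[headers]\nline1\nline2\nline3\n\n[data]\n');
fprintf(FID, '1 2 3 4\n5 6 7 8\n9 10 11 12\n');
fclose(FID);

FID = fopen(fname);
nHeader = 0;                 % counts lines up to and including [data]
tline = fgetl(FID);
while ischar(tline)          % fgetl returns -1 (not a char) at end of file
    nHeader = nHeader + 1;
    if strcmp(strtrim(tline), '[data]')
        break;               % file position is now at the first data row
    end
    tline = fgetl(FID);
end

% textscan starts at the current file position, so no HeaderLines is needed.
data = textscan(FID, '%f %f %f %f');
fclose(FID);
delete(fname);
```

For the example file, nHeader ends up as 6, which matches the value passed to 'headerLines' in the answer above.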

Additionally, you'll run into problems with the third argument, which specifies how many times to apply the formatSpec. Take, for instance, the following code:

data = textscan(FID,'%f',4);

which tells textscan to match a floating-point value 4 times and then stop. For the sake of simplicity, imagine that your data file contained only the data (i.e. no header lines). Calling that code repeatedly would then give the following results:

>> FID = fopen('datafile_noheaders.dat');
>> data_line1 = textscan(FID,'%f', 4)

data_line1 = 

    [4x1 double]


>> data_line1{1}'

ans =

     1     2     3     4

>> data_line2 = textscan(FID,'%f', 4)

data_line2 = 

    [4x1 double]

>> data_line2{1}'

ans =

     5     6     7     8

>> data_line3 = textscan(FID,'%f', 4)

data_line3 = 

    [4x1 double]

>> data_line3{1}'

ans =

     9    10    11    12

>> data_line4 = textscan(FID,'%f', 4)

data_line4 = 

    [0x1 double]

>> fclose(FID);

Notice that textscan picks up where it "left off" each time it is called. In this case, each of the first three calls returns one row from your data file (in the form of a cell containing a 4x1 column of data). The fourth call returns a cell containing an empty 0x1 array, because there is nothing left to read. For the use case you described, this format is not particularly helpful.

The example given at the top of this answer returns data in a format that is much easier to work with for what you are trying to accomplish. It matches four floating-point values in each row of your data and continues line by line until it can no longer match this pattern.
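Since only columns 2 and 4 are needed, a variation worth considering is to skip the unwanted fields directly in the formatSpec using the %*f specifier, which reads a field without returning it. The sketch below again recreates the example file in a temporary location; fname and cols are illustrative names only.

```matlab
% Recreate the example file from the question in a temporary location.
fname = [tempname '.dat'];
FID = fopen(fname, 'w');
fprintf(FID, '[headers]\nline1\nline2\nline3\n\n[data]\n');
fprintf(FID, '1 2 3 4\n5 6 7 8\n9 10 11 12\n');
fclose(FID);

% %*f reads a numeric field but discards it, so only columns 2 and 4
% are returned: data is a 1x2 cell instead of a 1x4 cell.
FID = fopen(fname);
data = textscan(FID, '%*f %f %*f %f', 'HeaderLines', 6);
fclose(FID);
delete(fname);

cols = [data{:}];   % 3x2 matrix: column 2 and column 4 side by side
```

Here data{1} holds the original column 2 and data{2} holds the original column 4, so the cell indices no longer match the column numbers in the file.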