Importing text file into Matlab, dimension not maintained

I am trying to import a text file into MATLAB as a matrix and then filter it based on user input, so that only the selected data are kept.

These are the first few rows of the data.

The United States of America, Deaths (1x1)     Last modified: 16-Nov-2012, MPv5 (May07)

Year     Age        Female             Male            Total
1933      0          52615.77         68438.11        121053.88
1933      1           8917.13         10329.16         19246.29
1933      2           4336.92          5140.05          9476.97
1933      3           3161.59          3759.88          6921.47
1933      4           2493.84          2932.59          5426.43
1933      5           2139.87          2537.53          4677.40
1933      6           1939.70          2337.76          4277.46
1933      7           1760.47          2163.90          3924.37
1933      8           1602.20          2015.97          3618.17
1933      9           1464.88          1893.96          3358.84

A larger part of the data is present here: https://www.dropbox.com/s/b4njypwmrxwxzl7/USA.Deaths_1x1.txt?dl=0

The problem I am facing is that every time I use T = readtable() to read in the data, T is an m x 1 table rather than an m x 5 table.

I also tried converting the .txt file to a .csv file, but the data contains non-numeric entries.

How can I solve this problem?

Thanks.

There is 1 answer

Answered by Hoki

For your data format, most straightforward import functions (importdata, dlmread, etc.) will fail.

textscan has a few parameters that let you import the full file without stopping at the first irregular line. The irregular lines (for example, those with an age of '110+') will contain NaN in place of the value that could not be parsed.

%// Define special values which can be encountered
specialValues = {'110+','other_special_values'} ;
formatSpec = '%n%n%f%f%f' ;

%// Read the file, treating special values as empty (they are returned as NaN)
fileID = fopen('USA.Deaths_1x1.txt');
C = textscan(fileID, formatSpec, ...
    'delimiter'     , ' ', ...
    'headerlines'   ,3, ...
    'treatAsEmpty'  , specialValues, ...
    'MultipleDelimsAsOne',1 );

fclose(fileID);

%// Convert cell array to matrix
data = cell2mat(C) ;
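
Once data is a numeric m x 5 matrix, the selection step mentioned in the question can be done with logical indexing. The following is only a minimal sketch: the matrix is filled with three rows copied from the question so that it runs on its own, and selectedYear and maxAge are hypothetical stand-ins for whatever the user enters (for example via input()).

```matlab
%// Sample rows copied from the question: Year, Age, Female, Male, Total
data = [1933 0 52615.77 68438.11 121053.88
        1933 1  8917.13 10329.16  19246.29
        1933 2  4336.92  5140.05   9476.97];

%// Hypothetical user choices (assumed names, not from the original post)
selectedYear = 1933;
maxAge       = 1;

%// Logical indexing keeps only the rows that satisfy both conditions
rowMask  = data(:,1) == selectedYear & data(:,2) <= maxAge;
selected = data(rowMask, :);

%// Rows containing a NaN (such as the '110+' rows) can be flagged like this
nanRows = any(isnan(data), 2);
```

Note that a comparison against NaN is always false, so rows whose age was read as NaN are silently excluded by a condition such as data(:,2) <= maxAge.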

If you really need the data from the faulty lines, you'll have to write a custom parser with the low-level function fscanf and account for every edge case (unconventional line) that you may encounter.
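
If the only irregularity is the trailing '+' in the age column, a lighter alternative to a full fscanf parser might be to read that column as text with textscan and convert it afterwards. This is a sketch under that assumption, not a general solution. It writes a tiny sample file so that it is self-contained; the '110+' row and its values are made-up placeholders, and sampleFile is a hypothetical name.

```matlab
%// Write a small sample file (the '110+' row uses dummy values, not real data)
sampleFile = [tempname '.txt'];
fid = fopen(sampleFile, 'w');
fprintf(fid, 'The United States of America, Deaths (1x1)\n\n');
fprintf(fid, 'Year     Age        Female             Male            Total\n');
fprintf(fid, '1933      0          52615.77         68438.11        121053.88\n');
fprintf(fid, '1933    110+             1.00             2.00             3.00\n');
fclose(fid);

%// Read the Age column as text (%s) so that '110+' does not break the import
fid = fopen(sampleFile, 'r');
C = textscan(fid, '%f %s %f %f %f', 'HeaderLines', 3);
fclose(fid);
delete(sampleFile);

%// Strip the '+' and convert the age strings to numbers ('110+' becomes 110)
age = str2double(strrep(C{2}, '+', ''));

%// Assemble the m x 5 numeric matrix
data = [C{1}, age, C{3}, C{4}, C{5}];
```

Here '110+' is mapped to 110, which loses the "or older" meaning; whether that is acceptable depends on how the age column is used later.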