I am really struggling to work out how I can import CSV data which contains 9 columns and around 400 rows of data into a table in the Matlab workspace. This would be easy if I was allowed to use the built-in toolboxes Matlab has to offer, but instead, I need to try and complete the task using only commands such as fscanf, fopen, etc. The data itself is in mixed formats for each column. Decimal, float, string, etc. I am also allowed to use CSVread, but I have not managed to get that to work since from my understanding CSVread only works with numeric values.
Here is my code:
>> filename = 'datatext.csv'; %The file in CSV format
>> fid = fopen(filename);
>> headers = fgetl(fid); %using fgetl to remove the headers and put them in a variable
I have used fgetl to skip the headers line of the file and add them to their own variable, however I am not sure where to go from this point in creating the table. Basically I want to achieve a 400 row by 9 columns table in the workspace of Matlab.
Here is a sample of a few lines of what the text file looks like:
18,8,318,150,3436,11,70,1,sampletext
16,8,304,150,3433,12,70,1,sampletext2
I assume I will have to use built-in conversion functions of some of the cells, which I am allowed to do. I have probably missed some important information in order to receive the correct help, but any help anyone has will be much appreciated. Thankyou.

The lowest level functions to read a file are, (as described in Import Text Data Files with Low-Level I/O):
fscanf, which reads formatted data in a text or ASCII file; that is, a file you can view in a text editor. For more information, see Reading Data in a Formatted Pattern.
fgetl and fgets, which read one line of a file at a time, where a newline character separates each line. For more information, see Reading Data Line-by-Line.
fread, which reads a stream of data at the byte or bit level. For more information, see Import Binary Data with Low-Level I/O.
In your case, the input file is ascii, not binary, so right away we can drop the last option (
fread).You're left with
fgetl/fgets(used to read a file line by line then parse each line) andfscanf.You already have two answers using the line by line approach, so I won't detail this one and rather show you how you can use
fscanf(since your data are suitable, they are indeed organised in a formated pattern).The advantage of using
fscanf, is that provided you use the rightformatSpecparameter, the function will be able to read the whole file in one go, instead of iterating line by line. That is true for all the numeric data in your file. We will have to do a second pass for the text elements in your last column.Defining the format specifiers:
first lets define your format specifications. We will use a different format for each pass. The first pass will read all the numeric data, but will skip the text fields, while the second pass will do the opposite, ignore all numeric data and read only the text fields. The
'*'character is really useful for that when defining a format specifier:I've used
%dfor all columns because in your sample data I did not see the variety of numeric types you mentionned. You can easily substitute this for%fif the data requires it, and you can mix both types without problem in the same specifier (as long as they are all numeric). Just use what makes sense for each column.Armed with that, let's get to the file. The data are in a formated pattern after the headerline, so we first need to get past the header line before calling
fscanf. We'll do that as you've done:The call to
ftellallows us to save the position of the file pointer. After we read the header, the pointer is located right at the beginning of the data section. We save it to be able to rewind the file pointer to the same point for the second reading pass.Reading the numeric data:
This is done extremely rapidly by
fscanfin a simple call:Notice the transpose operator
.'at the end of the line. This is becausefscanfpopulate the values it reads in a column major order, but the data in the text file are read in a line major order. The transpose operation at the end is just to have an output array the same dimension than it was in the text file.Now
dataArraycontains all your numeric data:Reading the text data:
This is were it gets slightly more complex.
fscanfautomatically convert text characters to their ascii value. It's easy enough to convert that back to actual characters (using the functionchar()). The biggest hurdle is that if we read all the texts fields in one go, they will all appear as a sucession of numbers, but there will be no way to know where each string stops and where the next starts. To overcome that, we'll do a reading line by line, but still usingfscanf:Notice again the use of the transpose operator
.', and the use ofchar(). NowtextArrayis a cell array containing all your text fields:Regrouping data set:
Personnally, I would keep the two arrays separate as they are the most optimized containers for each type of data (
doublearray for numeric data, andcellarray for an array of strings). However, if you need to regroup them in a single data structure, you can either use a cell array:Or you could use a
table:Which gives:
Obviously if you choose the table version, use the variable names you parsed from the header instead of the automatically generated one I used in this example.