Is there a way to convert csv data to a Matlab timetable using Python or Matlab?

230 views Asked by At

I would like to know how to make a Matlab timetable from this data:

               date   time     open     high      low    close tickvol
0        2011.01.02  17:01   1.3342   1.3344   1.3341   1.3341       0
1        2011.01.02  17:02   1.3343   1.3343   1.3343   1.3343       0
2        2011.01.02  17:03   1.3344   1.3344   1.3344   1.3344       0
3        2011.01.02  17:06   1.3345   1.3349   1.3345   1.3348       0
4        2011.01.02  17:07   1.3346   1.3346   1.3346   1.3346       0
...             ...    ...      ...      ...      ...      ...     ...
4558998  2023.05.19  16:54  1.08051  1.08057  1.08046  1.08046       0
4558999  2023.05.19  16:55  1.08047  1.08055  1.08042  1.08054       0
4559000  2023.05.19  16:56  1.08053  1.08058   1.0805  1.08055       0
4559001  2023.05.19  16:57  1.08055   1.0806  1.08048  1.08056       0
4559002  2023.05.19  16:58  1.08055  1.08057  1.08055  1.08057       0

Not much idea of matlab, and don't know what to do with python to leave correct the data.

What can I do to make this happen?

Edited:

I've tried with this:

TT=readtimetable('EURUSD_big_data.csv')

but this error appears:

Expected a datetime or duration variable to use as RowTimes. Check that the data can
be converted to datetime or duration, or supply a "TimeStep" or "SampleRate".

how to I use

 TimeStep

in this example?

3

There are 3 answers

1
Timeless On

Since you have what looks like a big DataFrame, you can use mfinance. I will plot a dataset found in the docs (that matches the format of yours but with more rows to select just a portion/period of time -- by using partial string indexing--) to have a meaningful visualization.

import pandas as pd
import mplfinance as mpf #pip install mplfinance

url = "https://raw.githubusercontent.com/matplotlib/" \
      "mplfinance/master/examples/data/SP500_NOV2019_IDay.csv"

df = pd.read_csv(url, index_col=0, parse_dates=True)

a_period = df.loc["2019-11-06 15:00":"2019-11-06 16:00", :]

mpf.plot(a_period, type="candle", mav=(7,12))

Output :

enter image description here

Dataset/Period used :

print(a_period)

                       Open   Close    High     Low  Volume
Date                                                       
2019-11-06 15:00:00 3075.25 3074.57 3075.25 3074.57       0
2019-11-06 15:01:00 3074.55 3074.49 3074.70 3074.30       0
2019-11-06 15:02:00 3074.48 3074.24 3074.56 3074.21       0
...                     ...     ...     ...     ...     ...
2019-11-06 15:58:00 3075.49 3075.54 3075.98 3075.49       0
2019-11-06 15:59:00 3075.69 3076.75 3076.79 3075.55       0
2019-11-06 16:00:00 3076.74 3076.73 3076.74 3076.72       0

[61 rows x 5 columns]
0
my enlightend god On

In matlab, it can be solved by using

table2timetable()

First you have to make a DataFrame in python, set the time to date time:

data["Time"]= pd.to_datetime(data["Time"],unit='s')

after this, you have to index time

data.set_index(data["Time"],inplace=True)

and pass the df to csv

data.to_csv(str(symbol)+".csv")

Once you have a csv, in matlab

read the csv and make it table

prices=readtable("EURUSD.csv")

make the table a timetebale

prices=table2timetable(prices)

& drop the NaN rows

prices=rmmissing(prices)
4
John Aven On

I don't think there is a lot of complexity around this. Write a CSV file with a time column and read it into matlab

https://www.mathworks.com/help/matlab/ref/readtimetable.html (may die at some time...)

Edit:

Taken what you have tried. you have OUTPUT the index column of your dataframe. That is being read in as your datetime column to the timetable. Remove that column manually (more painful) or change your output code so that it does not save the index column - which I assume here is fallacious anyway.

You may also need to combine your date and time columns into a datetime column using the datetime library or similar - or as timestamps/epochs.

Note also that table2timetable will NOT solve your problem since you do NOT have a column that works as sampling rate, timestamp, datetime, etc... You will have to fix your dataset output format FIRST, and then you can fix use either

readtimetable('file.csv')

or

table2timetabel('file.csv', 'RowTimes', timeVarName)