tick data at specific time in daylight saving timezone

166 views Asked by At

I have a Synology NAS mariadb has huge tick data. I want to create a table with price at 16:00 US/Eastern everyday.

It is easy to get the table with ticks at 10:05 GMT everyday without timezone consideration:

SELECT 
    TIMESTAMP,
    DATE(FROM_UNIXTIME(TIMESTAMP / 1000)) DATE,
    TIME(FROM_UNIXTIME(TIMESTAMP / 1000)) TIME,
    PRICE 
FROM DATASET.TICK
WHERE
    # 10:05:00 <= GMT < 10:07:00
    MOD(TIMESTAMP, 86400000) >= 79500000
    AND
    MOD(TIMESTAMP, 86400000) <  79620000
GROUP BY DATE;

seem the easiest way is to use mysql_tzinfo_to_sql then with CONVERT_TZ(Timestamp, 'UTC', 'US/Eastern'). But I have the command in my NAS.

-sh: mysql_tzinfo_to_sql: command not found
-sh: mariadb-tzinfo-to-sql: command not found

Since there would be no data between 17:00 and 18:00 in US/Eastern, I might come up with a dumb and slow way to do it (maybe not in mysql). But I wish to see if there is any smarter way of doing it in my setup via mysql?

UPDATED v1:

For the mysql_tzinfo_to_sql: command not found issue. I have dealt with it with my Ubuntu machine.

Setup:

  • Synology NAS (ssh, mariadb)
  • Ubuntu machine (ssh, mysql-server-core-8.0, mysql client)

Steps:

  1. copy zoneinfo directory from NAS to your machine: e.g.

scp -r [email protected]:/usr/share/zoneinfo ~/Documents

  1. run the command in your machine and update the mysql database: e.g.

mysql_tzinfo_to_sql zoneinfo | mysql -u root -h 192.168.1.2 -p mysql

Now, CONVERT_TZ() can be used. Since 'US/Eastern' does not exist in timezone table, I had chosen America/New_York instead. So the straight conversion from my previous code became:

SELECT
    TIMESTAMP,
    DATE(CONVERT_TZ(FROM_UNIXTIME(TIMESTAMP / 1000), 'SYSTEM', 'America/New_York')) DATE,
    TIME(CONVERT_TZ(FROM_UNIXTIME(TIMESTAMP / 1000), 'SYSTEM', 'America/New_York')) TIME,
    PRICE 
FROM DATASET.TICK
WHERE
    # 16:00:00 <= America/New_York < 16:00:05
    TIME(CONVERT_TZ(FROM_UNIXTIME(TIMESTAMP / 1000), 'SYSTEM', 'America/New_York')) BETWEEN CAST('16:00:00' AS TIME) AND CAST('16:00:05' AS TIME)
GROUP BY DATE, CODE

TIME column can show 16:00 everyday. Although this part is solved, this works only for a small table and I now suffer from

error 1206: The number of locks exceeds the lock table size.

I can see this conversion is tedious and inefficient. However, my best guess or next move is to add Timezone based Date and Time columns to the table.

1

There are 1 answers

0
Rick James On

Let's solve the second problem at the same time.

You want the last entry before, say, 10:07:00. [Perhaps you meant 'pm'?] That gets rid of the missing entry issue. And the multiple entries in an artificial time range.

But it raises another ugly situation -- how to get not just the timestamp in that range and get it for every stock, but get the rest of each matching row. That is a "groupwise-max" problem; see the tag I added.

Or see which of the efficient algorithms I describe in Groupwise-Max will suit your situation best. Note, you will need to add an INDEX for any solution, and that operation will take a bunch of time.

You mentioned DST, but let's get rid of that by simply asking for the last entry before 23:59:59 for each day. This assumes that you want the "market close" and that there are no trades after the close. And it does not care if [the stupid] Daylight Savings is in effect.