Limit/Resample data in CakePhp3

64 views Asked by At

I'm building a cake3 app which gathers A LOT of time series data in a table double_measures:

select * from double_measures limit 20;
+----+--------------------+---------------------+-------+--------+
| id | physical_sensor_id | time                | milis | value  |
+----+--------------------+---------------------+-------+--------+
|  1 |                  1 | 2016-11-25 00:50:01 |     0 | 306.15 |
|  2 |                  2 | 2016-11-25 00:50:01 |     0 | 300.15 |
|  3 |                  3 | 2016-11-25 00:50:01 |     0 | 308.15 |
|  4 |                  4 | 2016-11-25 00:50:01 |     0 | 308.15 |
|  5 |                  6 | 2016-11-25 00:50:01 |     0 | 310.15 |
|  6 |                  7 | 2016-11-25 00:50:01 |     0 | 310.15 |
|  7 |                  8 | 2016-11-25 00:50:01 |     0 | 305.15 |
|  8 |                  9 | 2016-11-25 00:50:01 |     0 | 306.15 |
|  9 |                 10 | 2016-11-25 00:50:01 |     0 | 304.15 |
| 10 |                 11 | 2016-11-25 00:50:01 |     0 | 309.15 |
| 11 |                  1 | 2016-11-25 00:55:01 |     0 | 306.15 |
| 12 |                  2 | 2016-11-25 00:55:01 |     0 | 300.15 |
| 13 |                  3 | 2016-11-25 00:55:01 |     0 | 308.15 |
| 14 |                  4 | 2016-11-25 00:55:01 |     0 | 308.15 |
| 15 |                  6 | 2016-11-25 00:55:01 |     0 | 310.15 |
| 16 |                  7 | 2016-11-25 00:55:01 |     0 | 310.15 |
| 17 |                  8 | 2016-11-25 00:55:01 |     0 | 305.15 |
| 18 |                  9 | 2016-11-25 00:55:01 |     0 | 306.15 |
| 19 |                 10 | 2016-11-25 00:55:01 |     0 | 304.15 |
| 20 |                 11 | 2016-11-25 00:55:01 |     0 | 309.15 |
+----+--------------------+---------------------+-------+--------+

Currently I am selecting values by the following query $measures = $table->find('all',['order'=>['id'=>'DESC']])->select(['time','milis','value'])->where(['physical_sensor_id'=>$sid])->limit($limit); which returns the last $limit entries for a given sensor.

However, I would prefer to evenly re-sample the table to get every nth value, with n being

[the number of values for $sid]/$limit.

Thanks to How do you select every n-th row from mysql, I know how to do it in MySql, but is there a way to accomplish this in Cakephp3?

Of course, I could re-sample AFTER querying, but this is not an option, as I run out of memory for HUGE result sets.

1

There are 1 answers

0
Stephan Richter On

Found a solution:

in my AppController I defined a function as follows:

protected function initRowNumbering(){
    $conn = ConnectionManager::get('default');
    $conn->execute("SET @i = 0");
}

Then in the controller where I need the re-sampling, I have the following:

        $query = $table->find('all',['order'=>['id'=>'DESC']])->select(['time','milis','value'])->where(['physical_sensor_id'=>$sid]);

        if ($sample != null){ // if the number of samples is set:
            $this->initRowNumbering(); // add index variable
            $count = $query->count(); // count total number of values for given sensor 
            $step = (int)($count / $sample); // calculate sample step size
            if ($step > 0){
                $measures = $query->where(['(@i := @i+1)%'.$step.'=0']); // add resampling filter
            } else {
                $measures = $query;
            }                               
        } else {
            $measures = $query->limit($limit);
        }

This works nicely, although it might not be the most elegant solution.

Any improvement ideas?