Currently i operate on postgresql 9.6 (debian stretch). My table has about 24 mio entries per day. In total it contains about 7 days of data. This is a temporary data storage because it cant be processed in real time. After i have processed this data it gets deleted. Just keep the results.
My SQL is
// build sql string
m_sql = "select dt, adc0, adc1, adc2, adc3, id from " + to_string( table ) +
" WHERE date(dt) = '" +
boost::gregorian::to_iso_extended_string( day ) + "'" +
" order by dt";
The table looks like
Tabelle »public.data«
Spalte | Typ | Attribute
--------+-----------------------------+-------------------------------------------------------
dt | timestamp without time zone | not null
adc0 | smallint |
adc1 | smallint |
adc2 | smallint |
adc3 | smallint |
id | integer | not null Vorgabewert nextval('data_id_seq'::regclass)
Indexe:
"data_pkey" PRIMARY KEY, btree (id)
"date_idx" btree (date(dt))
"dt_idx" btree (dt)
"id_idx" btree (id)
My current attempt is to go over them in batches of 1 mio per step and use OFFSET/LIMIT and order by time. Right now each iteration searches the whole data space again and sort them. until the offset is reached. When i fetch the data all at once my process segfaults as it kets killed by OOM.
This is inside a transaction.
// get the values
auto add_limit_offset = []( std::string sql,
size_t &offset ) -> std::string {
static const size_t delta = 1000000;
auto ret = sql + " LIMIT " + std::to_string( delta ) + " OFFSET " +
std::to_string( offset ) + ";";
offset += delta;
return ret;
};
bool run = true;
size_t offset = 0;
int nr_of_sinks = static_cast<int>( m_next_sinks.size() );
while ( run )
{
std::cout << "adc_source: at " << offset << std::endl;
auto sql = add_limit_offset( m_sql, offset );
auto dbvalues = T.exec( sql.c_str() );
run = dbvalues.size() > 0;
// fill dbvalues into stl container
for ( pqxx::result::const_iterator d = dbvalues.begin();
d != dbvalues.end(); ++d )
{
// timepoint
std::string s_dt = ( *d )[0].c_str();
s_dt.erase( s_dt.begin() + 19, s_dt.end() );
auto dt = boost::posix_time::time_from_string( s_dt );
for ( int adc = 0; adc < nr_of_sinks; ++adc )
{
std::string s_adc = ( *d )[1 + adc].c_str();
double v = std::atoi( s_adc.c_str() );
m_next_sinks[adc]->push( dt, v );
}
}
}