Cqlengine in python does not insert values in new fields after schema change (Cassandra)

169 views Asked by At

I'm trying to insert rows into a table after changing its schema in Cassandra with the CQLEngine python library. Before the change, the model looked like:

class MetricsByDevice(Model):
    device = columns.Text(primary_key=True, partition_key=True)
    datetime = columns.DateTime(primary_key=True, clustering_order="DESC")

    load_power = columns.Double()
    inverter_power = columns.Double()

I've changed the schema to this, adding four columns (DSO, node, park and commercializer):

class MetricsByDevice(Model):
    device = columns.Text(primary_key=True, partition_key=True)
    datetime = columns.DateTime(primary_key=True, clustering_order="DESC")

    DSO = columns.Text(index=True, default='DSO_1'),
    node = columns.Text(index=True, default='Node_1'),
    park = columns.Integer(index=True, default=6),
    commercializer = columns.Text(index=True, default='Commercializer_1'),

    load_power = columns.Double()
    inverter_power = columns.Double()

Then, I've synced the table with a script containing the line

sync_table(MetricsByDate)

I've checked the database and the four columns have been created. The existing rows has these fields with value NULL (as expected).

Then I've modified the script in charge of inserting in batch rows including the values corresponding to the new fields. It looks like:

        batch = BatchQuery()
        for idx, message in enumerate(consumer):

            data = message.value
            ts_to_insert = dateutil.parser.parse(data['timestamp'])

            filters = get_filters(message.partition_key)

            MetricsByDate.batch(batch).create(
                device=device,
                date=str(ts_to_insert.date()),
                time=str(ts_to_insert.time()),
                created_at=now,
                DSO=str(filters['DSO']),
                node=str(filters['node']),
                park=int(filters['park']),
                commercializer=str(filters['commercializer']),
                load_power=data['loadPower'],
                inverter_power=data['inverterPower'],
            )

            if idx % 100 == 0: # Insert every 100 messages

                batch.execute()

                # Reset batch
                batch = BatchQuery()

I've already checked that the values corresponding to the new fields aren't None and have the correct type. Nevertheless, it's inserting all the row correctly but the values in the new fields, that are NULL in Cassandra.

The batch insertion does not return any errors. I don't know if I'm missing something, or if I need to do an extra step to update the schema. I've been looking in the docs, but I can't find anything that helps.

Is there anything I'm doing wrong?

EDIT

After Alex Ott suggestion, I've inserted the lines one by one. Changing the code to:

for idx, message in enumerate(consumer):

            data = message.value
            ts_to_insert = dateutil.parser.parse(data['timestamp'])

            filters = get_filters(message.partition_key)

            metrics_by_date = MetricsByDate(
                device=device,
                date=str(ts_to_insert.date()),
                time=str(ts_to_insert.time()),
                created_at=now,
                DSO=str(filters['DSO']),
                node=str(filters['node']),
                park=int(filters['park']),
                commercializer=str(filters['commercializer']),
                load_power=data['loadPower'],
                inverter_power=data['inverterPower'],
            )

            metrics_by_date.save()

If before executing the line metrics_by_date.save() I add these print statements:

print(metrics_by_date.DSO)
print(metrics_by_date.park)
print(metrics_by_date.load_power)
print(metrics_by_date.device)
print(metrics_by_date.date)

The output is:

(<cassandra.cqlengine.columns.Text object at 0x7ff0b492a670>,)
(<cassandra.cqlengine.columns.Integer object at 0x7ff0b492d190>,)
256.99
SQ3-3.2.3.1-70-17444
2020-04-22

In the fields that are new I'm getting a cassandra object, but in the others I get their values. It maybe is a clue, because it continues to insert NULL in the new column.

1

There are 1 answers

0
ainsausti On

Finally I got It.

It was something stupid, in the model definition, for not knwon reasons, I've added commas to separate fields instead of linebreaks... So correcting the model definition to:

class MetricsByDevice(Model):
    device = columns.Text(primary_key=True, partition_key=True)
    datetime = columns.DateTime(primary_key=True, clustering_order="DESC")

    DSO = columns.Text(index=True, default='DSO_1')
    node = columns.Text(index=True, default='Node_1')
    park = columns.Integer(index=True, default=6)
    commercializer = columns.Text(index=True, default='Commercializer_1')

    load_power = columns.Double()
    inverter_power = columns.Double()

It works!!