Cassandra: Error: code=2200 [Invalid query] message="PRIMARY KEY column cannot be restricted as preceding column

9.8k views Asked by At

Table Schema:

CREATE TABLE com (
    receiverid text,
    senderid text,
    commatriid text,
    comtype tinyint,
    comid text,
    displaystatus tinyint,
    comdate timestamp,
    cominfoid bigint,
    comstatus tinyint,
    dateactioned timestamp,
    datedeleted timestamp,
    dateread timestamp,
    dateupdated timestamp,
    disclosedmatriid tinyint,
    filteredmsg tinyint,
    message text,
    recentstatus tinyint,
    regionallang tinyint,
    transmsg text,
    PRIMARY KEY (receiverid, senderid, commatriid, comtype, comid, displaystatus, comdate)
) WITH CLUSTERING ORDER BY (senderid ASC, commatriid ASC, comtype ASC, comid ASC, displaystatus ASC, comdate ASC);

Select query with where clause:

SELECT ComInfoId,ComId,ComType,SenderId,ReceiverId,ComMatriId,ComDate,ComStatus FROM com WHERE ComMatriId='M1'AND SenderId='M79984222' and ReceiverId='M2' and ComDate <= '2017-11-14 09:20:05+0000';

ERROR:

InvalidRequest: Error from server: code=2200 [Invalid query] message="PRIMARY KEY column "comdate" cannot be restricted as preceding column "comtype" is not restricted"

what is primary key column restricted as proceding column? if remove the comDate date in where clause ,i m able to retrive the data

2

There are 2 answers

0
Jim Wright On

In C* the order of the primary key columns matter, and the data is stored in disk in a manner that only allows data be selected if the key before is also specified.

PRIMARY KEY (receiverid, senderid, commatriid,
             comtype, comid, displaystatus, comdate)

The error you received means that if you want to query by comdate then you need to also query by ALL of the previous keys: receiverid, senderid, commatriid, comtype, comid, displaystatus.

If you want to query by comtype, then you need to specify ``receiverid, senderid, commatriid, comtype`.

Why do I have to specify the other keys?

This comes down to the way Cassandra was designed and it is like this for performance. Everything is stored in partitions and is stored in order on disk so that the database has to do minimal work. The downside is that you need to know what queries you will want to run when you design your schema.

Visualise the data on disk as an array (please excuse all of the columns equaling ints):

data =
receiverid(1)
    senderid(1)
        commatriid(1)
            comtype(1)
                comid(1)
                    displaystatus(1)
                        comdate(1)
                        comdate(2)
                        comdate(3)
    senderid(2)
        commatriid(1)
            comtype(1)
                comid(1)
                    displaystatus(1)
                        comdate(1)
                        comdate(2)
                        comdate(3)

Take the following queries and my examples of accessing the data like Cassandra would.

  1. WHERE receiverid IN (1, 2)

    result = (data[1], data[2])

  2. WHERE receiverid IN (1) AND senderid = 1

    result = data[1][1]

  3. WHERE receiverid IN (1) AND senderid = 1 AND comtype = 1

    result = data[1][1][????][1]

    This is where it gets interesting. I have deliberately missed out the commatriid field in my where clause. Now C* can narrow the results down to data[1][1] but which commatriid index should it look in to return the data? It doesn't know because I haven't specified it.

  4. WHERE receiverid IN (1) AND senderid = 1 AND comtype = 1 AND commatriid = 1

    result = data[1][1][1][1]

    Now we have included commatriid we know which index to access for the data.

0
Chakri Stark On

Delete commatriid from where clause in select query and then check the output once.. I guess, a query like this should not happen with multiple clustering keys in the query search keys. Correct me if I am wrong.

This may help you: https://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause