Uniqueness of a tuple in couchdb query


I am trying to write a query that I haven't been able to get working yet. My permanent view function is the following:

function(doc) {
    if ('llweb_result' in doc) {
         // One row is emitted per entry, so a document can appear several times.
         for (var i in doc.llweb_result) {
             emit(doc.llweb_result[i].llweb_result, doc);
         }
    }
}

I filter the result by key, so I need this key. As you can see, there is also a for loop, which produces identical tuples in the result. However, I need the loop in order to check every entry. What I want to know is how to eliminate the identical tuples.

I am using couchdb-python. My related code is:

result = {}
result['0'] = self.dns_db.view('llweb/llweb_filter', None, key=0, limit=amount, startkey_docid='000000052130')
result['1'] = self.dns_db.view('llweb/llweb_filter', None, key=1, limit=amount)
result['2'] = self.dns_db.view('llweb/llweb_filter', None, key=2, limit=amount)

As the key values show, there are three different types of keys. I thought I could extend the key to [doc._id, llweb_result]. I would then need to query with a key like [*, 2], but I don't know whether that is possible. After that, I would use a reduce function to group the rows. This would definitely work, but the problem then becomes how to write a selection query using only the values [0,1,2].

Edited on 16.08.12

Example for 'llweb_result' property of a couchdb record:

"llweb_result": {
   "1": {
       "ip": "66.233.123.15",
       "domain": "domain.com",
       "llweb_result": 1
   },
   "0": {
       "ip": "66.235.132.118",
       "domain": "domain.com",
       "llweb_result": 1
   }
}

There is only one domain name in each record, but there can be multiple IPs for it. You can think of the record as a DNS packet.

I want to group records by llweb_result (0, 1, 2) and then run a selection query on them (e.g. fetch the records that contain '1'). For the example above, however, there will be two identical tuples in the result.
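The duplication can be reproduced without a server. The following is only an illustrative sketch: `map_llweb` is a hypothetical Python stand-in for the view's map function, the `_id` value is made up, and it yields the document id instead of the whole document to keep the output short.

```python
# Hypothetical local stand-in for the view's map function (not CouchDB code).
def map_llweb(doc):
    """Yield one (key, doc_id) pair per llweb_result entry, like the view does."""
    for entry in doc.get("llweb_result", {}).values():
        yield entry["llweb_result"], doc["_id"]


sample = {
    "_id": "example-doc",  # assumed id, not taken from the original record
    "llweb_result": {
        "1": {"ip": "66.233.123.15", "domain": "domain.com", "llweb_result": 1},
        "0": {"ip": "66.235.132.118", "domain": "domain.com", "llweb_result": 1},
    },
}

# Both entries carry llweb_result == 1, so the same pair is produced twice.
rows = list(map_llweb(sample))

# Collapsing identical pairs leaves a single row for this document.
unique_rows = sorted(set(rows))
```

Deduplicating on the client like this works, but it still transfers every duplicate row, so doing it inside the view is usually preferable.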

Any help will be appreciated.


There are 2 answers

2
behnam On

If you get duplicate pairs in the query results, it means that the same doc.llweb_result[i].llweb_result value occurs more than once within a document.

You can change the view function to emit only one of these values (as the key). One way to do so would be:

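function(doc) {
    if ('llweb_result' in doc) {
         // Collect each distinct llweb_result value once per document.
         var distinct_values = {};
         for (var i in doc.llweb_result) {
             var value = doc.llweb_result[i].llweb_result;
             // Object property names are always strings, so keep the original value too.
             distinct_values[value] = value;
         }
         for (var dv in distinct_values) {
             // Emit the stored value so numeric keys stay numeric (key=1, not "1").
             emit(distinct_values[dv], doc);
         }
    }
}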
0
smathy On

I don't know anything about couchdb-python, but CouchDB lets you query a view with either a single key or multiple keys in an array. So, take a look in your couchdb-python docs for how to supply keys=[0,1,2] as a parameter.
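A minimal sketch of how that might look with couchdb-python, assuming view options are passed as keyword arguments the same way `key=` and `limit=` are in the question. The helper name `fetch_by_result` is hypothetical, and `db` is assumed to be the `couchdb` database object (`self.dns_db` in the question).

```python
from collections import defaultdict


def fetch_by_result(db, results=(0, 1, 2)):
    """Query the view once for several keys and bucket the rows by key.

    `db` is assumed to behave like a couchdb-python Database object.
    """
    rows_by_key = defaultdict(list)
    # keys=[...] asks CouchDB for all rows whose key is in the list.
    for row in db.view('llweb/llweb_filter', keys=list(results)):
        rows_by_key[row.key].append(row)
    return dict(rows_by_key)
```

Note that a `limit` option on such a request caps the total number of rows rather than the rows per key, so the per-key limit from the question would still need one request per key.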

Regarding getting just the unique values, take a look at the section of CouchDB: The Definitive Guide that explains how to add what is basically a NOOP reduce, so that you can query with group=true and get each distinct key only once.
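A rough sketch of the group=true idea, under several assumptions: the view name `llweb_unique` is hypothetical, the built-in `_count` reduce stands in for the NOOP reduce, and the key is the array [llweb_result, doc._id]. CouchDB cannot match a wildcard in the first position of an array key, so llweb_result goes first and a startkey/endkey range selects one value.

```python
# Hypothetical view definition, to be added under "views" in the existing
# design document. The map and reduce sources are JavaScript held in strings.
LLWEB_UNIQUE_VIEW = {
    "map": """
function(doc) {
    if ('llweb_result' in doc) {
        for (var i in doc.llweb_result) {
            emit([doc.llweb_result[i].llweb_result, doc._id], null);
        }
    }
}
""",
    # Built-in reduce; with group=true every distinct key is returned once.
    "reduce": "_count",
}


def unique_doc_ids(db, llweb_result, limit=None):
    """Return the ids of documents having at least one entry with llweb_result.

    `db` is assumed to behave like a couchdb-python Database object.
    """
    options = {
        "group": True,
        # All keys of the form [llweb_result, <any doc id>].
        "startkey": [llweb_result],
        "endkey": [llweb_result, {}],
    }
    if limit is not None:
        options["limit"] = limit
    rows = db.view('llweb/llweb_unique', **options)
    return [row.key[1] for row in rows]
```

Because reduced rows carry no document, the documents themselves would have to be fetched by id in a second step.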