Couchbase Bulk loading error with upsert() (.NET SDK 2.0)


I am encountering an error when bulk-inserting data with the Upsert function and cannot figure out how to fix it. Does anyone know what is wrong here? The program essentially grabs data from a SQL Server database and loads it into our Couchbase bucket on an Amazon instance. It begins loading, but after about 10 or so upserts it crashes.

My error is as follows: Collection was modified; enumeration operation may not execute. Here is a screenshot of the error (sorry, the error is only reproduced on my other Amazon server instance, not locally): https://i.stack.imgur.com/jB7ue.jpg

Here is the function that calls the Upsert method. It is called multiple times, because I retrieve only part of the data at a time; the SQL table is very large.

 private void receiptItemInsert(double i, int k) {
        const int BATCH_SIZE = 10000;
        APSSEntities entity = new APSSEntities();
        var data = entity.ReceiptItems.OrderBy(x => x.ID).Skip((int)i * BATCH_SIZE).Take(BATCH_SIZE);
        var joinedData = from d in data
                            join s in entity.Stocks
                            on new { stkId = (Guid)d.StockID } equals new { stkId = s.ID } into ps
                            from s in ps.DefaultIfEmpty()
                            select new { d, s };
        var stuff = joinedData.ToList();
        var dict = new Dictionary<string, dynamic>();

        foreach (var ri in stuff)
        {
            Stock stock = new Stock();
            var ritem = new CouchModel.ReceiptItem(ri.d, k, ri.s);
            string key = "receipt_item:" + k.ToString() + ":" + ri.d.ID.ToString();
            dict.Add(key, ritem);
        }
        entity.Dispose();
        using (var cluster = new Cluster(config))
        {
            //open buckets here
            using (var bucket = cluster.OpenBucket("myhoney"))
            {
                bucket.Upsert(dict); // CRASHES HERE
            }
        }
    }

There is 1 answer

Simon Baslé (best answer)

As discussed in the Couchbase Forums, this is probably a bug in the SDK.

When initializing its internal map of the Couchbase cluster, the SDK constructs a List of endpoints. If two or more threads (as is the case during a bulk upsert) trigger this code at the same time, one may observe the List while it is being populated by the other: the lock is entered only after a call to List.Any(), which may throw if the list is being modified at that moment.
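To make the race concrete, here is a minimal sketch of the check-outside-the-lock pattern described above, next to a variant that checks and populates under the same lock. This is not the SDK's actual code: EndpointMap, EnsureInitializedUnsafe, EnsureInitialized and the host names are hypothetical names made up for illustration, and the exact behaviour of Any() on a List can vary between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for an internal endpoint map; NOT Couchbase SDK code.
public class EndpointMap
{
    private readonly List<string> _endpoints = new List<string>();
    private readonly object _sync = new object();

    // Racy pattern: Any() runs outside the lock, so it can enumerate the list
    // while another thread is inside the lock adding to it. On runtimes where
    // Any() uses the list's enumerator, that can throw InvalidOperationException
    // ("Collection was modified; enumeration operation may not execute.").
    public void EnsureInitializedUnsafe(IEnumerable<string> hosts)
    {
        if (!_endpoints.Any())
        {
            lock (_sync)
            {
                _endpoints.AddRange(hosts);
            }
        }
    }

    // Safer variant: the emptiness check and the population share one lock,
    // so no thread ever reads the list while another is writing to it.
    public void EnsureInitialized(IEnumerable<string> hosts)
    {
        lock (_sync)
        {
            if (_endpoints.Count == 0)
            {
                _endpoints.AddRange(hosts);
            }
        }
    }

    // Read the size under the same lock for a consistent view.
    public int Count
    {
        get { lock (_sync) { return _endpoints.Count; } }
    }
}
```

With the safer variant, several threads can call EnsureInitialized concurrently (for example from Parallel.For) and the list is populated exactly once. Since the faulty check lives inside the SDK, application code cannot apply this fix itself; the sketch only illustrates why the exception shows up during a multi-threaded bulk upsert.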