CloudKit: Preventing Duplicate Records


I am working on an app that pulls data from an external web service into a private CloudKit database. It is a single-user app, but I am running into a race condition that I am not sure how to avoid.

Every record in my external data has a unique identifier that I map to my CKRecord instances. The general app startup flow is:

  1. Fetch current CKRecords for the relevant record type.
  2. Fetch external records.
  3. For every external record, if it doesn't exist in CloudKit, create it via batch create (modification operation).
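
Step 3 amounts to a set difference on the external identifiers. A minimal sketch, assuming each external record exposes a string `id` and that the external IDs stored on the already-fetched CKRecords have been collected into a set (`ExternalRecord` and `recordsToCreate` are hypothetical names, not part of CloudKit):

```swift
import Foundation

// Hypothetical shape of a record from the external web service.
struct ExternalRecord {
    let id: String
    let name: String
}

// Returns the external records whose IDs are not yet present in CloudKit.
// `existingIDs` is assumed to hold the external ID stored on each fetched CKRecord.
func recordsToCreate(external: [ExternalRecord], existingIDs: Set<String>) -> [ExternalRecord] {
    external.filter { !existingIDs.contains($0.id) }
}

let external = [
    ExternalRecord(id: "001", name: "Ada"),
    ExternalRecord(id: "002", name: "Grace"),
]
// Only "001" has been saved to CloudKit so far, so only "002" is left to create.
let missing = recordsToCreate(external: external, existingIDs: ["001"])
```

Two devices that run this at the same moment see the same `existingIDs` and therefore compute the same `missing` list.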

The issue is that if this process is kicked off on two of a user's devices simultaneously, both devices can finish their fetches before either has saved anything (the CloudKit and external fetches are both asynchronous), so there is a strong possibility that I'll get duplicate records.

I know I can use zones to atomically commit all of my CKRecord instances, but I don't think that solves my issue: if all of these fetches happen at essentially the same time, the save is not really the problem.

My questions are:

  1. Does anyone know of a way to "lock" the private database for writes across all of a user's devices?
  2. Alternatively, is there a way to enforce uniqueness on any CKRecord field?
  3. Or is there a way to use a custom value as the primary key? In that case I could use my external ID as the CK ID and let the system prevent duplicates itself.

Thanks for the help in advance!

2

There are 2 answers

0
harryhorn On

Answers:

  1. No, you cannot lock the private database.
  2. CloudKit already enforces, and assumes, uniqueness of the record ID.
  3. You can make the record name part of the record ID anything you like (the zone part is determined by the zone the record lives in).

Explanation:

Regarding duplication: if you are the one creating the record IDs (for example, from the external records you mentioned), then at worst one record will overwrite the other with the same data if you hit the race condition. I do not think that is a problem, even in the extreme case where two devices kick off this process at the same time. Basically, your logic of first fetching the existing records and then saving only the missing ones seems sound to me.

Code:

import CloudKit

// employeeID is a unique ID that identifies an employee
let employeeID = "001"

// Remember that the record ID needs to be unique within the same database.
// If you have several record types, prefix the record name with the record type so that it stays unique.
let recordName = "Employee-\(employeeID)"

// If you are using a custom zone
// (CKRecordZoneID and CKRecordID were renamed to CKRecordZone.ID and CKRecord.ID in Swift 4.2)
let customZoneID = CKRecordZone.ID(zoneName: "SomeCustomZone", ownerName: CKCurrentUserDefaultName)
let recordIDInCustomZone = CKRecord.ID(recordName: recordName, zoneID: customZoneID)

// If you are using the default zone
let recordIDInDefaultZone = CKRecord.ID(recordName: recordName)
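
A sketch of how a record name derived this way could be used when saving, so that two devices racing on the same external record target the same CKRecord instead of creating two. The helper `recordName(type:externalID:)`, the function `saveEmployee`, and the `name` field are illustrative assumptions; the `Employee-` prefix follows the example above. As far as I know, with the default save policy (`.ifServerRecordUnchanged`) the second device's save of an ID that already exists is rejected with a `serverRecordChanged` error rather than overwriting, so this sketch opts into `.allKeys` to get the overwrite behaviour described above. Error handling is omitted.

```swift
import Foundation
#if canImport(CloudKit)
import CloudKit
#endif

// Builds a record name that stays unique across record types by prefixing the type.
func recordName(type: String, externalID: String) -> String {
    "\(type)-\(externalID)"
}

#if canImport(CloudKit)
// Saves one employee under a record ID derived from the external identifier.
func saveEmployee(externalID: String, name: String, in database: CKDatabase) {
    let recordID = CKRecord.ID(recordName: recordName(type: "Employee", externalID: externalID))
    let record = CKRecord(recordType: "Employee", recordID: recordID)
    record["name"] = name

    let operation = CKModifyRecordsOperation(recordsToSave: [record], recordIDsToDelete: nil)
    // Assumption: it is acceptable to overwrite whatever another device saved under this ID.
    operation.savePolicy = .allKeys
    database.add(operation)
}
#endif
```

Because both devices derive the same record ID from the same external ID, the worst case is a redundant write of identical data, not a second record.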
4
user3069232 On

I had a similar issue with duplicates being downloaded when I tried to read in a database of more than 100 records. The solution is in Apple's Atlas example, which uses a Boolean to check whether the previous batch has finished before it launches the next one. You will find a block much like this:

@synchronized (self)
    {
        // Quickly returns if another loadNextBatch is running or we have the oldest post
        if(self.isLoadingBatch || self.haveOldestPost) return;
        else self.isLoadingBatch = YES;
    }
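
For a Swift codebase, the same guard can be sketched with an `NSLock` in place of `@synchronized`. This is an illustrative translation, not code from the Atlas sample; `BatchLoader`, `beginLoadingIfIdle`, and `finishLoading` are hypothetical names. Note that a flag like this only serialises loads within one running app, not across a user's devices.

```swift
import Foundation

final class BatchLoader {
    private let lock = NSLock()
    private var isLoadingBatch = false
    private var haveOldestPost = false

    // Returns true if the caller may start loading; false if a load is
    // already running or there is nothing older left to fetch.
    func beginLoadingIfIdle() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if isLoadingBatch || haveOldestPost { return false }
        isLoadingBatch = true
        return true
    }

    // Call when a batch finishes (successfully or not) so later loads can proceed.
    func finishLoading(reachedOldestPost: Bool) {
        lock.lock()
        defer { lock.unlock() }
        isLoadingBatch = false
        haveOldestPost = reachedOldestPost
    }
}

let loader = BatchLoader()
let first = loader.beginLoadingIfIdle()   // allowed: nothing is running yet
let second = loader.beginLoadingIfIdle()  // refused: the first load is still running
loader.finishLoading(reachedOldestPost: false)
let third = loader.beginLoadingIfIdle()   // allowed again after the first load finished
```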

Incidentally, here is the code to create your own record key:

CKRecordID *customID = [[CKRecordID alloc] initWithRecordName:[globalEOConfirmed returnEOKey:i]];
newrecord = [[CKRecord alloc] initWithRecordType:@"Blah" recordID:customID];