iOS - Removing near duplicates with Core Data and Ensembles (iCloud)


Summary

My problem: I want to get rid of near duplicates in a Core Data based iOS project that uses Ensembles to sync with iCloud.

  • Syncing with iCloud works in the app.
  • The problem arises when the user creates similar objects on multiple devices before the persistent store is leeched by Ensembles (connected to iCloud).
  • This generates near duplicates, which is factually correct.
  • My approach to removing these duplicates doesn't seem to work.

Detailed problem

A user can create NSManagedObjects on different devices before being connected to iCloud. Let's say he has an NSManagedObject named Car that has a "to one" relationship to an NSManagedObject named Person, which in return has a "to many" relationship to Car. This is a simplified model.

OK, let's imagine the user has 2 devices and creates 2 NSManagedObjects on each device. On one device he creates a Car named "Audi" and a Person named "Raphael", both connected through the relationship. On the other device he creates a Car named "BMW" and a Person named "Raphael", also connected to each other. The user now has 2 similar objects across his devices: 2 Person objects, both named "Raphael".

My problem is that the user will end up having 2 Person objects with the name "Raphael" on each device after the devices have synced.

This is correct, since the objects get their uniqueIdentifiers (used to identify objects in Ensembles) when the user leeches the persistent store, so the objects are factually different. This is what I want to fix.

My approach

I implemented the delegate method and removed the duplicates in the reparationContext.

    - (BOOL)persistentStoreEnsemble:(CDEPersistentStoreEnsemble *)ensemble
        shouldSaveMergedChangesInManagedObjectContext:(NSManagedObjectContext *)savingContext
        reparationManagedObjectContext:(NSManagedObjectContext *)reparationContext
    {
        [reparationContext performBlockAndWait:^{
            // Find duplicates
            // Change relationships to use the inserted Person object (the one from iCloud)
            // Delete the local Person object
            NSError *error = nil;
            if (![reparationContext save:&error]) {
                NSLog(@"Saving the reparation context failed: %@", error);
            }
        }];
        return YES;
    }

Basically this seems to work on the second device, which merges the data from the first device. Unfortunately, it seems the local Person is still synced to iCloud even though it was deleted in the reparationContext.

This leads to a broken state, since the first device merges the changes from the second device and replaces its own Person with the one that was deleted on the second device. When it syncs later, the Person is missing in the Car relationship and the app throws syncing errors.

Steps to reproduce the problem

  • Step 1 (device 1)

    • Create objects
    • Data: Car "Audi" -> Person "Raphael (device 1)"
  • Step 2 (device 2)

    • Create objects
    • Data: Car "BMW" -> Person "Raphael (device 2)"
  • Step 3 (device 1)

    • Leech the data store
    • Connect to iCloud
    • Send data to iCloud
    • Data: Car "Audi" -> Person "Raphael (device 1)"
  • Step 4 (device 2)

    • Leech the data store
    • Connect to iCloud
    • Merge data from iCloud
    • Replace the local Person of device 2 with the inserted Person of device 1
    • Delete the local Person of device 2
    • Send data to iCloud
    • Data:
      Car "Audi" -> Person "Raphael (device 1)"
      Car "BMW" -> Person "Raphael (device 1)"
  • Step 5 (device 1)

    • Merge data from iCloud
    • Replace the local Person of device 1 with the inserted Person of device 2 (this shouldn't happen)
    • Delete the local Person of device 1 (this shouldn't happen)
    • Send data to iCloud
    • Expected data:
      Car "Audi" -> Person "Raphael (device 1)"
      Car "BMW" -> Person "Raphael (device 1)"
    • Actual data:
      Car "Audi" -> Person "Raphael (device 2)"
      Car "BMW" -> Person "Raphael (device 2)"

Actually the local Person object "Raphael (device 2)" was deleted in step 4, but it seems it is still sent to iCloud, because in step 5 it pops up as an insert in savingContext.insertedObjects in the shouldSaveMergedChangesInManagedObjectContext delegate method.

As far as I understood, Ensembles first pulls the changes from iCloud, asks the user whether they are as expected via the delegate methods, merges them into the persistent store, and sends the deltas to iCloud after the merge.

Am I doing something wrong? Or is this an Ensembles bug?

There is the issue that Lars mentioned. You have to be careful to do things deterministically. Sorting on the unique ID is one way to do that.

Personally, I would handle this in one of two other ways:

  1. Do the dedupe after the merge completes (again, making sure it is deterministic).
  2. Use carefully chosen global identifiers so that Ensembles controls the dedupe for you.

For example, you could use the unique ID "raphael". The only thing you need to be careful of is when you go to create another Raphael on the same machine; that one would have to be called raphael_1 (or whatever).

If the unique ID really is unique (e.g. first + last name is unlikely to clash), Ensembles will automatically merge the Person on different devices.
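To make "deterministically" a bit more concrete, here is a minimal sketch of picking the survivor by sorting on the unique ID, so that every device keeps the same Person no matter which one it created locally. The function name `DedupePlan` and the attribute names `name` and `uniqueIdentifier` are assumptions for illustration, not part of Ensembles. It only relies on key-value coding, so it should work on NSManagedObjects as well as on plain dictionaries.

```objc
#import <Foundation/Foundation.h>

// Returns an array of @[loser, survivor] pairs for all near duplicates.
// Objects are duplicates when they share the same value for `nameKey`.
// The survivor is always the object whose `identifierKey` sorts first,
// so every device makes the same choice regardless of input order.
// `nameKey` and `identifierKey` are hypothetical attribute names.
static NSArray<NSArray *> *DedupePlan(NSArray *objects,
                                      NSString *nameKey,
                                      NSString *identifierKey)
{
    // Group the objects by the value that makes them near duplicates.
    NSMutableDictionary *groups = [NSMutableDictionary dictionary];
    for (id object in objects) {
        id name = [object valueForKey:nameKey];
        if (name == nil) continue; // nothing to compare on
        NSMutableArray *group = groups[name];
        if (group == nil) {
            group = [NSMutableArray array];
            groups[name] = group;
        }
        [group addObject:object];
    }

    NSSortDescriptor *byIdentifier =
        [NSSortDescriptor sortDescriptorWithKey:identifierKey
                                      ascending:YES
                                       selector:@selector(compare:)];

    NSMutableArray<NSArray *> *plan = [NSMutableArray array];
    for (NSArray *group in groups.allValues) {
        if (group.count < 2) continue; // no duplicate for this name
        NSArray *sorted = [group sortedArrayUsingDescriptors:@[byIdentifier]];
        id survivor = sorted.firstObject;
        NSRange losers = NSMakeRange(1, sorted.count - 1);
        for (id loser in [sorted subarrayWithRange:losers]) {
            [plan addObject:@[loser, survivor]];
        }
    }
    return plan;
}
```

For each pair you would then move the loser's cars over to the survivor and delete the loser. The order of the pairs is not defined, but which object survives is, and that is the part that has to match on all devices.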
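A rough sketch of the global identifier idea, assuming the Person entity has a `name` attribute that you are willing to treat as its identity. The helper `GlobalIdentifiersForPeople` is hypothetical; the intent is that you return its result from the Ensembles delegate method `persistentStoreEnsemble:globalIdentifiersForManagedObjects:`, which expects one identifier per object, in the same order. Returning NSNull for an object without a usable name is, as far as I know, how you let Ensembles generate an identifier itself, but please check the header documentation for that.

```objc
#import <Foundation/Foundation.h>

// Builds one global identifier per person, in the same order as the input.
// Uses the (assumed) `name` attribute via key-value coding, so it works
// for NSManagedObjects and, for testing, for plain dictionaries.
static NSArray *GlobalIdentifiersForPeople(NSArray *people)
{
    NSMutableArray *identifiers =
        [NSMutableArray arrayWithCapacity:people.count];
    for (id person in people) {
        NSString *name = [person valueForKey:@"name"];
        // Assumption: NSNull means "no identifier from me" for this object.
        [identifiers addObject:(name.length > 0 ? name : [NSNull null])];
    }
    return identifiers;
}
```

With this, "Raphael" created on device 1 and "Raphael" created on device 2 get the same identifier, so there is nothing left to dedupe by hand. The price is the one mentioned above: two different people with the same name on one device need distinct identifiers.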

