Here is a reproducible example:
    library(data.table)
    mydt <- data.table(id=c('a','b','b'), val=c('check','check','a'))
    mydt[val == "check"]  # <= secondary index is created on this call
    mydt[, val:=ifelse(.N>1, '2', '1'), by=id]
    mydt
    #    id val
    # 1:  a   1
    # 2:  b   2
    # 3:  b   2
    key(mydt)
    # NULL
    key2(mydt)
    # [1] "val"
Now, calling a simple command gives a rather strange (for me) result:
    mydt[val=='2', res:='yes'][]
    #    id val res
    # 1:  a   1  NA
    # 2:  b   2 yes
    # 3:  b   2  NA
With the filter val=='2', I expected records 2 and 3 to be updated, but only one of them was (record 2 in the output above). This seems to be due to the secondary key, because removing it brings back the expected behavior:
    set2key(mydt, NULL)
    mydt[val=='2', res:='yes'][]
    #    id val res
    # 1:  a   1  NA
    # 2:  b   2 yes
    # 3:  b   2 yes
I am wondering whether this is a bug or expected behavior. In any case, it is not desired: I did not know there was such a thing as a secondary key (before asking this question), and I spent a lot of time trying to figure out why I was missing records. For me, adding the set2key(mydt, NULL) instruction solved the problem, but I am worried that a similar thing can happen in other parts of my code, and I don't know how to detect or prevent it. I wouldn't want to add set2key(., NULL) calls after every other line...
This is indeed a bug (I reported it, but it turned out it had been reported already), and it is fixed in package version 1.9.7, where it works!
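For anyone who cannot upgrade right away, or who wants to rule out stale indices elsewhere in their code, here is a minimal sketch of one way to inspect and switch off automatic secondary indices. It assumes a data.table release in which key2() and set2key() have been superseded by indices() and setindex(), and in which the datatable.auto.index option is available; on the older version used in the question, these names may not exist.

```r
library(data.table)

# Assumption: a data.table version that provides indices(), setindex()
# and the datatable.auto.index option (newer than the one in the question).
packageVersion("data.table")

# Do not build secondary indices automatically during subsetting.
options(datatable.auto.index = FALSE)

mydt <- data.table(id = c('a', 'b', 'b'), val = c('check', 'check', 'a'))
mydt[val == "check"]   # with the option off, this call builds no index
indices(mydt)          # lists secondary indices; NULL when there are none

mydt[, val := ifelse(.N > 1, '2', '1'), by = id]
mydt[val == '2', res := 'yes']

# An index that already exists can also be dropped explicitly.
setindex(mydt, NULL)
```

Setting the option once at the top of a script avoids sprinkling index-removal calls after every other line, at the cost of losing the speed-up that auto-indexing gives to repeated subsets.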