r - Operator == gives inconsistent results on logical columns in data.table
Please see the following reproducible example:
    library(data.table)
    set.seed(123)
    dt <- data.table(a = rep(0.3, 10000))
    dt[, b := runif(.N) < a]
    dt[b == T, .N]
    # [1] 3005
    dt[, summary(b)]
    #    Mode   FALSE    TRUE    NA's
    # logical    6995    3005       0
Everything looks fine: the count of TRUE values is the same with both methods. Now replace column b with a new one.
    dt[, b := runif(.N) < a]
    dt[b == T, .N]
    # [1] 3331
    dt[, summary(b)]
    #    Mode   FALSE    TRUE    NA's
    # logical    6981    3019       0
The count of TRUE in column b is now different! For the same column, one method gives 3331 TRUE values and the other gives 3019.
When == is bypassed:

    dt[b != F, .N]
    # [1] 3019
    dt[, summary(b)]
    #    Mode   FALSE    TRUE    NA's
    # logical    6981    3019       0

the result is correct again.
I can reproduce this with data.table v1.9.4 and 1.9.5 on Windows 8.1 x64.
Here's an easier reproducible example without runif():
    require(data.table) ## 1.9.4+
    dt = data.table(x = 1:5)
    dt[, y := x <= 2L]
    #    x     y
    # 1: 1  TRUE
    # 2: 2  TRUE
    # 3: 3 FALSE
    # 4: 4 FALSE
    # 5: 5 FALSE
    dt[y == TRUE, .N]
    # [1] 2 <~~~~~~ correct result.
    dt[, y := x <= 3L]
    #    x     y
    # 1: 1  TRUE
    # 2: 2  TRUE
    # 3: 3  TRUE
    # 4: 4 FALSE
    # 5: 5 FALSE
    dt[y == TRUE, .N]
    # [1] 2 <~~~~~~ incorrect result, should be 3!
This is now fixed in v1.9.5 on GitHub. From the NEWS entry:

:= and set* now drop secondary keys (new in v1.9.4), so that dt[x==y] works again after a := or set* without needing options(datatable.auto.index=FALSE). Only setkey() was dropping secondary keys correctly. 23 tests added. Thanks to user36312 for reporting, #885.
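To check whether an installed version is affected, one rough approach is to repeat the small example and compare the == subset against counts that do not go through that code path. This is only a sketch: the variable names n_eq, n_lgl and n_sum are made up for illustration, and it assumes that a plain logical subset such as dt[(y)] does not use the automatic index.

```r
library(data.table)

dt <- data.table(x = 1:5)
dt[, y := x <= 2L]
dt[y == TRUE, .N]    # first == subset; on 1.9.4 this may create a secondary key on y

dt[, y := x <= 3L]   # overwrite y; a stale secondary key would now be out of date

n_eq  <- dt[y == TRUE, .N]  # the subset reported as wrong in the question
n_lgl <- dt[(y), .N]        # plain logical subset (assumed not to use the index)
n_sum <- dt[, sum(y)]       # direct count of TRUE values

# On a fixed version all three counts should agree.
stopifnot(n_eq == n_lgl, n_eq == n_sum)
```

If the check fails on an older version, the workaround mentioned above is to run options(datatable.auto.index = FALSE) before subsetting, or to upgrade to a version that includes the fix.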