I have to subset my data to the people who completed the survey, using the variable "disposition".
> names(df)
[1] "caseid" "disposition" "regstate" "pid7" "ideo5" "birthyr" "gender" "race" "educ"
> summary.default(df)
            Length Class  Mode
caseid      708    -none- numeric
disposition 708    factor numeric
regstate    708    factor numeric
pid7        708    factor numeric
ideo5       708    factor numeric
birthyr     708    -none- numeric
gender      708    factor numeric
race        708    factor numeric
educ        708    factor numeric
Now I subset the data:
disposition <- df$disposition
I can see that the complete surveys are between 33 and 708:
completesurveys <- disposition[33:708]
I tried to select the data in the following way:
selectdata <- complete.cases(df$caseid, df$regstate, df$pid7, df$ideo, df$birthyr, df$gender, df$race, df$educ)
and to define the data for the completed surveys:
completesurveysdat <- (selectdata & (df$disposition > 32 & df$disposition < 709))
Unfortunately, I got:
Warning messages:
1: In Ops.factor(df$disposition, 32) : ‘>’ not meaningful for factors
2: In Ops.factor(df$disposition, 709) : ‘<’ not meaningful for factors
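I recommend using either the data.table or the dplyr package to manipulate your data; they make problems like this easier.
Also, it is true that, because you are filtering on a factor, it won't be possible to use numeric operators (or at least not all of them: only == and != work on factors). You should use the as.numeric() function to address this.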
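To make the warning concrete, here is a minimal sketch of what happens when an ordering operator is applied to a factor. The vector below is a made-up stand-in for df$disposition (its values are an assumption, not the real survey data); the point is only that R refuses the comparison and returns NA for every element.

```r
# Hypothetical stand-in for df$disposition; the values are invented.
disposition <- factor(c("31", "33", "500", "708"))

# '>' on a factor warns ("not meaningful for factors") and yields NA everywhere.
# suppressWarnings() is used here only to keep the output quiet.
res <- suppressWarnings(disposition > 32)
print(res)

# The column really is a factor, which is why the comparison is rejected.
print(is.factor(disposition))
```

Because every element of the comparison is NA, any filter built on it cannot select the intended rows.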
With data.table:
library(data.table)
df <- as.data.table(df)  # convert the data frame to a data.table
df[as.numeric(disposition) > 33 & as.numeric(disposition) < 709]
With dplyr:
library(dplyr)
df <- as_tibble(df)  # as.tbl() is deprecated in recent dplyr versions
df %>% filter(as.numeric(disposition) > 33, as.numeric(disposition) < 709)
You can type ?dplyr or ?data.table to obtain more information; these packages have proven extremely useful for manipulating data.
Your output means that after filtering you got 0 observations, hence as.numeric() is not solving the problem with factors. I recommend using as.numeric(as.character()) instead. First make sure you have numeric values, and then use the code shown above.
Hope this helps.
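The difference between the two conversions can be sketched in base R. The toy data frame below is hypothetical (its disposition values are invented for illustration); it assumes the factor labels are strings that look like numbers. as.numeric() on a factor returns the internal level codes, while as.numeric(as.character()) recovers the values written in the labels.

```r
# Hypothetical toy data; the disposition values are invented.
toy <- data.frame(
  caseid      = 1:4,
  disposition = factor(c("708", "33", "500", "31"))
)

# Level codes: positions of each label among the sorted levels, not the values.
codes <- as.numeric(toy$disposition)

# Actual numbers stored in the labels.
values <- as.numeric(as.character(toy$disposition))

print(codes)
print(values)

# Filtering on the level codes keeps nothing, because no code exceeds 32.
by_codes <- toy[codes > 32 & codes < 709, ]

# Filtering on the recovered values keeps the rows in the intended range.
completed <- toy[values > 32 & values < 709, ]
print(nrow(by_codes))
print(nrow(completed))
```

On this toy input the filter on level codes returns no rows, which is consistent with the 0 observations mentioned above, while the filter on the recovered values keeps the rows whose disposition lies in the range.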