Wednesday, 21 August 2013

Remove dots in dataset and split data into equal bands

My dataset has more than 200 variables, and most of them contain dots
that stand for missing values:
Age
19
20
..
56
23
R does not read these dots as missing values (NA); the affected columns
come in as factors rather than numbers. So when I use
> library(Hmisc) # cut2
> split(data, cut2(data$Age, g=3))
to divide the data into 3 bands, I get this error message:
Error in if (cj == upper) next : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In cut2(data2$Household_Count, g = 10) : NAs introduced by coercion
2: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
3: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
4: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
5: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
6: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
7: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
8: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
9: In Ops.factor(x, (lower - min.dif.factor * min.dif)) : not meaningful for factors
I have confirmed that this error is caused by the dots. However, because
so many variables have dots, and in different rows, I cannot simply
filter out the affected rows. How can I convert the dots to proper
missing values and then apply the same split to every variable?
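
One possible approach, sketched here under assumptions rather than as a definitive answer: convert each column to character, turn dot-only entries into NA, coerce to numeric, and then band each variable on quantile breaks. The sample data and the names dots_to_na, band, and bands_by_var are hypothetical, and the sketch assumes every column is numeric apart from the dots. It uses base R's cut() and quantile() as a stand-in for Hmisc's cut2(x, g = 3), so the band boundaries may not match cut2 exactly.

```r
# Hypothetical sample data: dots stand in for missing values, so the
# columns arrive as factors instead of numbers.
data <- data.frame(
  Age             = c("19", "20", "..", "56", "23", "41", "."),
  Household_Count = c("2", ".", "4", "1", "3", "..", "5"),
  stringsAsFactors = TRUE
)

# Go through character first (as.numeric on a factor would return the
# internal level codes), mark dot-only entries as NA, then coerce.
dots_to_na <- function(x) {
  x <- trimws(as.character(x))
  x[grepl("^\\.+$", x)] <- NA
  as.numeric(x)
}

# Apply the conversion to every column while keeping the data frame shape.
data[] <- lapply(data, dots_to_na)

# Cut one numeric vector into g bands of roughly equal size using
# quantile breaks. Rows with NA get an NA band.
band <- function(x, g = 3) {
  probs  <- seq(0, 1, length.out = g + 1)
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

# For each variable, split the whole data frame by that variable's bands.
# split() drops the rows whose band is NA.
bands_by_var <- lapply(data, function(col) split(data, band(col)))
```

Here bands_by_var$Age holds the sub-data-frames for the Age bands, and likewise for every other column. If the data is read from a text file, passing na.strings = c(".", "..") to read.csv may avoid the problem at import time, so that the columns are numeric from the start.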
