Sunday, 18 August 2013

Categorize data

If I arrange the Titanic data so that it looks like the output below:
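# Flatten the Titanic contingency table and keep only the crew
df <- data.frame(Titanic)
df_Crew <- df[df$Class == "Crew", ]
# Sum Freq at each nesting level: Class, Class+Sex, Class+Sex+Age, ...
L <- lapply(1:4, function(i) aggregate(df_Crew$Freq, by = df_Crew[1:i], sum))
# Paste the grouping columns into one label per group, e.g. "Crew_Male"
L2 <- lapply(L, function(d) data.frame(
  group = do.call(paste, c(as.list(d[names(d) != "x"]), sep = "_")),
  freq = d$x))
# Link each level to its parent level; duplicating the parent rows works
# because every factor added here has exactly two levels
L3 <- data.frame()
for (i in 1:3) {
  d <- cbind(from = rbind(L2[[i]], L2[[i]])$group, L2[[i + 1]])
  L3 <- rbind(L3, d)
}
L3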
                from                 group freq
1               Crew             Crew_Male  862
2               Crew           Crew_Female   23
3          Crew_Male       Crew_Male_Child    0
4        Crew_Female     Crew_Female_Child    0
5          Crew_Male       Crew_Male_Adult  862
6        Crew_Female     Crew_Female_Adult   23
7    Crew_Male_Child    Crew_Male_Child_No    0
8  Crew_Female_Child  Crew_Female_Child_No    0
9    Crew_Male_Adult    Crew_Male_Adult_No  670
10 Crew_Female_Adult  Crew_Female_Adult_No    3
11   Crew_Male_Child   Crew_Male_Child_Yes    0
12 Crew_Female_Child Crew_Female_Child_Yes    0
13   Crew_Male_Adult   Crew_Male_Adult_Yes  192
14 Crew_Female_Adult Crew_Female_Adult_Yes   20
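The loop above finds each parent by repeating the previous level's rows twice, which only lines up because every added factor (Sex, Age, Survived) has exactly two levels. As a rough sketch of a more general variant, the parent label can instead be pasted from the same aggregated rows as the child label. The helper name build_edges and its arguments are made up for this illustration and do not come from any package:

```r
# Hypothetical helper (not from the original post or any package):
# build a parent/child edge list for nested groupings of a frequency table.
build_edges <- function(d, vars, freq = "Freq") {
  # Paste the chosen columns of x into one label per row, e.g. "Crew_Male"
  key <- function(x, cols) {
    do.call(paste, c(unname(as.list(x[cols])), sep = "_"))
  }
  edges <- lapply(seq_along(vars)[-1], function(i) {
    # Sum the frequencies over the first i grouping variables
    a <- aggregate(d[[freq]], by = d[vars[1:i]], FUN = sum)
    data.frame(from  = key(a, vars[1:(i - 1)]),  # parent: first i - 1 variables
               group = key(a, vars[1:i]),        # child: first i variables
               freq  = a$x,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)
}

df <- data.frame(Titanic)
df_Crew <- df[df$Class == "Crew", ]
edges <- build_edges(df_Crew, c("Class", "Sex", "Age", "Survived"))
edges
```

For the crew this should give the same 14 rows as L3. Because parent and child labels are derived from the same row, the idea should carry over to grouping variables with more than two levels, or to a different order of variables.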
Then I could create a nice tree like this:
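library(igraph)
# graph.data.frame() and layout.reingold.tilford() are the older names
# of the two igraph functions used here
g <- graph_from_data_frame(L3, directed = TRUE)
plot(g, layout = layout_as_tree(g, root = 1), edge.arrow.size = 0.5)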
With a better layout, the tree would look very nice.
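One possible way to tidy the plot, as a sketch only: shorten each node label to the last part of its name, drop the vertex circles, and print the frequency on each edge. It assumes igraph 1.0 or later is installed, and the plot settings are illustrative choices rather than recommended values. The edge list is rebuilt inside the block so that it runs on its own:

```r
library(igraph)  # assumption: igraph >= 1.0 is installed

# Rebuild the parent/child edge list for the crew so this block is standalone
df_Crew <- subset(data.frame(Titanic), Class == "Crew")
vars <- c("Class", "Sex", "Age", "Survived")
key <- function(x, cols) do.call(paste, c(unname(as.list(x[cols])), sep = "_"))
edges <- do.call(rbind, lapply(2:4, function(i) {
  a <- aggregate(df_Crew$Freq, by = df_Crew[vars[1:i]], FUN = sum)
  data.frame(from  = key(a, vars[1:(i - 1)]),
             group = key(a, vars[1:i]),
             freq  = a$x)
}))

# Columns after the first two (here: freq) become edge attributes
g <- graph_from_data_frame(edges, directed = TRUE)

# Show only the last part of each node name, e.g. "Adult" for "Crew_Male_Adult"
V(g)$label <- sub(".*_", "", V(g)$name)

plot(g,
     layout = layout_as_tree(g, root = 1),  # vertex 1 is "Crew"
     vertex.shape = "none",                 # text-only nodes
     vertex.label.cex = 0.9,
     edge.label = E(g)$freq,                # counts on the edges
     edge.label.cex = 0.8,
     edge.arrow.size = 0.3)
```

The short labels rely on the tree position to give context, so "Adult" under "Male" reads as "Crew_Male_Adult" without repeating the full path.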
What I have done is cluster (group) the Titanic data and plot it as a graph.
I can well imagine that packages already exist that are designed for grouping
data in several ways, depending on the question the R user asks of the data.
Such a package would contain general functions for clustering and plotting
data, including in the way I did above. Could you please point out which
functions can do this, e.g. with the Titanic data?
