Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
470 views
in Technique[技术] by (71.8m points)

machine learning - How to prune a tree in R?

I'm doing a classification using rpart in R. The tree model is trained by:

> tree <- rpart(activity ~ . , data=trainData)
> pData1 <- predict(tree, testData, type="class")

The accuracy for this tree model is:

> sum(testData$activity==pData1)/length(pData1)
[1] 0.8094276

I read a tutorial to prune the tree by cross validation:

> ptree <- prune(tree,cp=tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"])
> pData2 <- predict(ptree, testData, type="class")

The accuracy rate for the pruned tree is still the same:

> sum(testData$activity==pData2)/length(pData2)
[1] 0.8094276

I want to know what's wrong with my pruned tree? And how can I prune the tree model using cross validation in R? Thanks.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You have used the minimum cross-validated error tree. An alternative is to use the smallest tree that is within 1 standard error of the best tree (the one you are selecting). The reason for this is that, given the CV estimates of the error, the smallest tree within 1 standard error is doing just as good a job at prediction as the best (lowest CV error) tree, yet it is doing it with fewer "terms".

Plot the cost-complexity vs tree size for the un-pruned tree via:

plotcp(tree)

Find the tree to the left of the one with minimum error whose cp value lies within the error bar of one with minimum error.

There could be many reasons why pruning is not affecting the fitted tree. For example the best tree could be the one where the algorithm stopped according to the stopping rules as specified in ?rpart.control.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...