Exercises
- Visualising prediction results is a useful way to find problems in a model. Use the 'Rtsne' package in R to visualise the predictions in both the left branch and the right branch of decision tree model2, then compare the two plots.
library(Rtsne)    # t-SNE dimensionality reduction
library(ggplot2)  # plotting

# Features used to build the 2-D embedding
features <- c("Sex", "Fare_pp", "Pclass", "Title", "Age_group", "Group_size", "Ticket_class", "Embarked")

# Left branch: passengers whose title is "Mr"
Tree.left <- train[train$Title == "Mr", ]
set.seed(984357)  # fix the seed so the embedding is reproducible
tsne.left <- Rtsne(Tree.left[, features], check_duplicates = FALSE)
ggplot(NULL, aes(x = tsne.left$Y[, 1], y = tsne.left$Y[, 2],
                 color = Tree.left$Survived)) +
  geom_point() +
  labs(color = "Survived") +
  ggtitle("Visualization of left branch of tree where title is 'Mr'")

# Right branch: passengers with any other title
Tree.right <- train[train$Title != "Mr", ]
set.seed(984357)
tsne.right <- Rtsne(Tree.right[, features], check_duplicates = FALSE)
ggplot(NULL, aes(x = tsne.right$Y[, 1], y = tsne.right$Y[, 2],
                 color = Tree.right$Survived)) +
  geom_point() +
  labs(color = "Survived") +
  ggtitle("Visualization of right branch of the tree")
- Consider re-engineering the features for passengers who share the same ticket.
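
As a possible starting point for the last exercise, the sketch below counts how many passengers share each ticket and splits the fare evenly among them. It runs on a small made-up data frame, not the real training set. The column names `Ticket` and `Fare` are assumed to match the raw Titanic data, and `Ticket_size` and `Fare_share` are hypothetical names chosen for illustration only.

```r
# Toy data standing in for the training set (all values are made up).
toy <- data.frame(
  Ticket = c("A1", "A1", "B2", "C3", "C3", "C3"),
  Fare   = c(20, 20, 7.5, 30, 30, 30),
  stringsAsFactors = FALSE
)

# Number of passengers travelling on each ticket (hypothetical feature name).
toy$Ticket_size <- as.integer(ave(seq_along(toy$Ticket), toy$Ticket, FUN = length))

# Fare divided evenly among passengers on the same ticket (hypothetical feature name).
toy$Fare_share <- toy$Fare / toy$Ticket_size

print(toy)
```

The same two lines could be applied to `train` in place of `toy`. Whether an even split of the fare is the right assumption for shared tickets is something to check against the data.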