Before blindly giving the data to the computer,
it is a good idea to look at it:
d <- read.csv("train.csv", stringsAsFactors = TRUE)  # R >= 4.0 reads text as character unless told otherwise
str(d)
# 'data.frame': 891 obs. of 12 variables:
# $ PassengerId: int 1 2 3 4 5 6 7 8 9 10 ...
# $ Survived : int 0 1 1 1 0 0 0 0 1 1 ...
# $ Pclass : int 3 1 3 1 3 3 1 3 3 2 ...
# $ Name : Factor w/ 891 levels "Abbing, Mr. Anthony",..: 109 191 358 277 16 559 520 629 417 581 ...
# $ Sex : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
# $ Age : num 22 38 26 35 35 NA 54 2 27 14 ...
# $ SibSp : int 1 1 0 1 0 0 0 3 0 1 ...
# $ Parch : int 0 0 0 0 0 0 0 1 2 0 ...
# $ Ticket : Factor w/ 681 levels "110152","110413",..: 524 597 670 50 473 276 86 396 345 133 ...
# $ Fare : num 7.25 71.28 7.92 53.1 8.05 ...
# $ Cabin : Factor w/ 148 levels "","A10","A14",..: 1 83 1 57 1 1 131 1 1 1 ...
# $ Embarked : Factor w/ 4 levels "","C","Q","S": 4 2 4 4 4 3 4 4 4 2 ...
summary(d)
Some of the variables have too many distinct values to be useful
(at least in your first model):
you can remove Name, Ticket, Cabin and PassengerId.
You may also want to convert some of the numeric variables (say, Pclass) to factors,
if that is more meaningful.
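As a minimal sketch of that clean-up step, the snippet below drops the identifier-like columns and turns the class into a factor. It runs on a small made-up data frame (d_toy, with illustrative values only) rather than on train.csv, so that it can be tried on its own; the names d_toy, drop_cols and d_small are assumptions introduced here.

```r
# Toy stand-in for a few rows of train.csv (values are illustrative only)
d_toy <- data.frame(
  PassengerId = 1:4,
  Survived    = c(0, 1, 1, 0),
  Pclass      = c(3, 1, 3, 2),
  Name        = c("A", "B", "C", "D"),
  Ticket      = c("t1", "t2", "t3", "t4"),
  Cabin       = c("", "C85", "", ""),
  stringsAsFactors = FALSE
)

# Drop the columns whose values are (almost) unique to each passenger
drop_cols <- c("PassengerId", "Name", "Ticket", "Cabin")
d_small <- d_toy[, !(names(d_toy) %in% drop_cols), drop = FALSE]

# Treat passenger class as a category rather than a number
d_small$Pclass <- factor(d_small$Pclass)

str(d_small)
```

Note that once Pclass is a factor, model.matrix expands it into dummy columns (Pclass2, Pclass3) instead of a single Pclass column, so any later formula that refers to Pclass would need to use those names.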
Since neuralnet only deals with quantitative variables,
you can convert all the qualitative variables (factors)
to binary ("dummy") variables, with the model.matrix function --
it is one of the very rare situations
in which R does not perform the transformation for you.
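# Expand the factors into 0/1 dummy columns (rows with missing values are dropped)
m <- model.matrix(
  ~ Survived + Pclass + Sex + Age + SibSp + Parch + Fare + Embarked,
  data = d
)
head(m)
library(neuralnet)
# The formula uses the dummy column names created by model.matrix
r <- neuralnet(
  Survived ~ Pclass + Sexmale + Age + SibSp + Parch + Fare +
    EmbarkedC + EmbarkedQ + EmbarkedS,
  data = m, hidden = 10, threshold = 0.01
)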
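To see where column names such as Sexmale and EmbarkedC come from, it can help to run model.matrix on a tiny example. The sketch below uses a made-up data frame (toy, with illustrative values) that mimics a few of the columns; it assumes R's default treatment contrasts and default na.action.

```r
# Toy data: two factors, one numeric column with a missing value
toy <- data.frame(
  Survived = c(0, 1, 1, 0),
  Sex      = factor(c("male", "female", "female", "male")),
  Age      = c(22, 38, NA, 35),
  Embarked = factor(c("S", "C", "", "Q"), levels = c("", "C", "Q", "S"))
)

m_toy <- model.matrix(~ Survived + Sex + Age + Embarked, data = toy)

# One dummy column per factor level, except the first level of each factor,
# which serves as the baseline ("female" for Sex, "" for Embarked)
colnames(m_toy)

# The row with a missing Age has been dropped
nrow(m_toy)
```

The first level of each factor gets no column of its own: a passenger with all Embarked dummies equal to 0 is one whose Embarked value is the empty string. The matrix also contains an "(Intercept)" column, which the neuralnet formula simply does not use.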
The error message "requires numeric/complex matrix/vector arguments" occurs when your data contains factor or character variables.
There are three ways to solve this problem:
Delete the variable.
If the variable is an ordered factor, convert it to an integer.
If the variable is a character vector, convert it to a factor and then to dummy variables.
You can use model.matrix(), mentioned above, or the class.ind() function from the nnet package to convert a factor into dummy variables.
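The last two options can be sketched as follows, on made-up vectors (size, sex_chr and embarked are hypothetical names with illustrative values). The example assumes the nnet package is available; it is a recommended package that is normally installed alongside R.

```r
library(nnet)  # provides class.ind()

# Ordered factor -> integer codes that follow the level order
size <- factor(c("low", "high", "medium"),
               levels = c("low", "medium", "high"), ordered = TRUE)
size_int <- as.integer(size)
size_int

# Character -> factor -> dummy variables
sex_chr <- c("male", "female", "female")
sex_ind <- class.ind(factor(sex_chr))
sex_ind

# Factor -> dummy variables, one column per level
embarked <- factor(c("S", "C", "S", "Q"))
emb_ind <- class.ind(embarked)
emb_ind
```

Unlike model.matrix, class.ind keeps a column for every level (no baseline is dropped), so each row of the result sums to 1.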