Performs n-fold cross-validation for fastknn to find the best value of the k parameter.

fastknnCV(x, y, k = 3:15, method = "dist", normalize = NULL, folds = 5,
  eval.metric = "overall_error", nthread = 1)

Arguments

x
input matrix of dimension nobs x nvars.
y
factor array with class labels for the x rows.
k
sequence of candidate k values to be evaluated (default is 3:15).
method
the probability estimator as in fastknn.
normalize
variable scaler as in fastknn.
folds
number of folds (default is 5), or an array of fold ids between 1 and n identifying which fold each observation belongs to. The smallest allowable number of folds is 3. When fastknnCV assigns the folds itself, it uses stratified sampling.
eval.metric
classification loss measure to use in cross-validation. See classLoss for more details.
nthread
the number of CPU threads to use (default is 1).
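
The folds argument accepts a vector of fold ids in place of a fold count. The following is a minimal sketch, in base R, of one way to build such a vector with stratified assignment. The helper make_stratified_folds is hypothetical and is not part of fastknn; it illustrates the idea of stratification and is not necessarily how fastknnCV assigns folds internally.

```r
# Hypothetical helper (not part of fastknn): stratified fold ids in base R.
make_stratified_folds <- function(y, nfolds = 5) {
  stopifnot(is.factor(y), nfolds >= 3)
  fold_id <- integer(length(y))
  for (cls in levels(y)) {
    idx <- which(y == cls)
    if (length(idx) == 0) next
    # Shuffle the members of this class, then deal them out to the folds in turn
    idx <- idx[sample.int(length(idx))]
    fold_id[idx] <- rep_len(seq_len(nfolds), length(idx))
  }
  fold_id
}

set.seed(42)
y <- factor(rep(c("a", "b"), times = c(60, 40)))
fold_id <- make_stratified_folds(y, nfolds = 5)

# Each fold holds 12 "a" and 8 "b" observations
table(fold_id, y)

# The ids could then be supplied through the folds argument, e.g.
# cv.out <- fastknnCV(x, y, k = 3:15, folds = fold_id)
```

Supplying the fold ids explicitly keeps the same partition across repeated runs, so results for different settings can be compared on identical folds.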

Value

list with cross-validation results:

  • best_eval: the best loss measure found in the cross-validation procedure.
  • best_k: the best k value found in the cross-validation procedure.
  • cv_table: data.frame with the test performances for each k on each data fold.

See also

classLoss

Examples

## Not run: ------------------------------------
# library("mlbench")
# library("caTools")
# library("fastknn")
#
# data("Ionosphere")
#
# x <- data.matrix(subset(Ionosphere, select = -Class))
# y <- Ionosphere$Class
#
# set.seed(1024)
# tr.idx <- which(sample.split(Y = y, SplitRatio = 0.7))
# x.tr <- x[tr.idx,]
# x.te <- x[-tr.idx,]
# y.tr <- y[tr.idx]
# y.te <- y[-tr.idx]
#
# set.seed(2048)
# cv.out <- fastknnCV(x = x.tr, y = y.tr, k = c(5,10,15,20), eval.metric="logloss")
#
# cv.out$cv_table
## ---------------------------------------------