knn {clustering} R Documentation

k-Nearest Neighbour Classification

Description


k-nearest neighbour classification of a test set from a training set. For each
row of the test set, the k nearest (in Euclidean distance) training set
vectors are found, and the classification is decided by majority vote, with
ties broken at random. If there are ties for the kth nearest vector, all
candidates are included in the vote.
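
The voting rule described above can be sketched in a few lines. This is only an illustration in Python, not the MLkit implementation: the function name `knn_predict`, its arguments, and the fixed random seed are assumptions made for the example.

```python
import math
import random
from collections import Counter

def knn_predict(train, labels, query, k=1, rng=None):
    """Classify one query row by majority vote (hypothetical helper, not MLkit code)."""
    rng = rng or random.Random(0)  # assumption: fixed seed so the sketch is repeatable
    # Euclidean distance from the query to every training row
    dists = [math.dist(row, query) for row in train]
    # distance of the kth nearest row; every row tied with it is kept
    kth = sorted(dists)[k - 1]
    votes = Counter(lab for d, lab in zip(dists, labels) if d <= kth)
    # majority vote, with ties between classes broken at random
    top = max(votes.values())
    winners = sorted(lab for lab, n in votes.items() if n == top)
    return rng.choice(winners)

train = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = ["a", "a", "b", "b"]
print(knn_predict(train, labels, (0.2, 0.1), k=3))
```

With k = 3 the query near the origin collects two votes for "a" and one for "b", so "a" wins without any tie-breaking.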

Usage

knn(train, test, cl,
    k = 1,
    l = 0,
    prob = FALSE,
    use.all = TRUE);

Arguments

train

matrix or data frame of training set cases.

test

matrix or data frame of test set cases. A vector will be interpreted as a row vector for a single case.

cl

factor of true classifications of training set.

k

number of neighbours considered. [as integer]

l

minimum vote for definite decision, otherwise doubt. (More precisely, less than k-l dissenting votes are allowed, even if k is increased by ties.) [as double]

prob

If this is true, the proportion of the votes for the winning class is returned as attribute prob. [as boolean]

use.all

controls handling of ties. If true, all training vectors whose distance equals the kth nearest distance are included. If false, a random selection of the vectors tied at the kth distance is chosen so that exactly k neighbours are used. [as boolean]
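
How l, prob and use.all interact can be illustrated with a small Python sketch. It is not the MLkit code and rests on stated assumptions: the function name `knn_vote` is hypothetical, doubt is represented as `None` rather than NA, the vote proportion is returned as a second tuple element rather than as an attribute, and l is applied as a plain minimum on the winning vote count (the adjustment for a k enlarged by ties is not modelled).

```python
import math
import random
from collections import Counter

def knn_vote(train, labels, query, k=1, l=0, prob=False, use_all=True, rng=None):
    """Vote for one query row (hypothetical helper illustrating l, prob, use.all)."""
    rng = rng or random.Random(0)  # assumption: fixed seed for repeatability
    # (distance, label) pairs ordered from nearest to farthest
    ranked = sorted(((math.dist(row, query), lab) for row, lab in zip(train, labels)),
                    key=lambda pair: pair[0])
    kth = ranked[k - 1][0]
    nearer = [p for p in ranked if p[0] < kth]
    tied = [p for p in ranked if p[0] == kth]
    if use_all:
        # keep every neighbour tied at the kth distance
        chosen = nearer + tied
    else:
        # sample among the tied neighbours so that exactly k are used
        chosen = nearer + rng.sample(tied, k - len(nearer))
    votes = Counter(lab for _, lab in chosen)
    top = max(votes.values())
    if top < l:
        label = None  # doubt: the winning class did not reach the minimum vote
    else:
        label = rng.choice(sorted(lab for lab, n in votes.items() if n == top))
    # assumption: the proportion is returned alongside the label, not as an attribute
    return (label, top / len(chosen)) if prob else label

train = [(0, 0), (1, 0), (0, 1), (5, 5)]
labels = ["a", "a", "b", "b"]
print(knn_vote(train, labels, (0, 0), k=2, prob=True))  # three voters because of the tie
print(knn_vote(train, labels, (0, 0), k=2, l=3))        # doubt
```

In this data set two training rows sit at distance 1 from the query, so with k = 2 and use_all left on, three neighbours vote; with use_all off, one of the two tied rows is picked at random and only two vote.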

Details

Authors

MLkit

Value

Factor of classifications of test set. doubt will be returned as NA.

clr value class

Examples

 imports "dataset" from "MLkit";
 
 data(bezdekIris);
 print(bezdekIris);
 
 # create training/test for demo
 set.seed(123);
 
 let [training, test] = split_training_test(bezdekIris, ratio = 0.7);
 let predictions = knn(train = training[, -"class"], test = test[, -"class"], cl = training$class, k = 3);
 
 print(predictions);

[Package clustering version 1.0.0.0 Index]