{dataset} R# Documentation

dataset


require(R);

#' the machine learning dataset toolkit
imports "dataset" from "MLkit";

The machine learning dataset toolkit.

Datasets are collections of raw data gathered during the research process, usually in the form of numerical data. Many organizations, e.g. government agencies, universities, or research institutions, make the data they have collected freely available on the web for other researchers to use.

.NET CLR type export
data_matrix: UnionMatrix

A matrix for sparse numeric (double) data.



.NET CLR function exports
SGT

Sequence Graph Transform (SGT): sequence embedding for clustering, classification, and search.

Sequence Graph Transform (SGT) is a sequence embedding function. SGT extracts the short- and long-term sequence features and embeds them in a finite-dimensional feature space. The long- and short-term patterns embedded in SGT can be tuned without any increase in computation.

https://github.com/cran2367/sgt/blob/25bf28097788fbbf9727abad91ec6e59873947cc/python/sgt-package/sgt/sgt.py
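
To make the idea of the embedding concrete, here is a minimal Python sketch of the transform, assuming the formulation used in the reference implementation linked above: for every ordered pair of symbols (u, v), the exponentially decayed gaps between each u and every later v are averaged, and the kappa-th root is taken. This is not the MLkit implementation; the function name `sgt_embedding` and its parameters are hypothetical, and length-sensitive normalization is left out.

```python
import math
from itertools import product


def sgt_embedding(sequence, alphabet, kappa=1.0):
    """Illustrative SGT sketch (hypothetical helper, not the MLkit API).

    Returns a dict mapping each ordered symbol pair (u, v) to its
    embedding value. Larger kappa makes the decay faster, so the
    embedding emphasizes short-term patterns.
    """
    # Record the positions at which every alphabet symbol occurs.
    positions = {symbol: [] for symbol in alphabet}
    for index, symbol in enumerate(sequence):
        if symbol in positions:
            positions[symbol].append(index)

    embedding = {}
    for u, v in product(alphabet, repeat=2):
        pair_count = 0       # number of (u, v) pairs with v after u
        weighted_sum = 0.0   # sum of exp(-kappa * gap) over those pairs
        for i in positions[u]:
            for j in positions[v]:
                if j > i:
                    pair_count += 1
                    weighted_sum += math.exp(-kappa * (j - i))
        if pair_count:
            embedding[(u, v)] = (weighted_sum / pair_count) ** (1.0 / kappa)
        else:
            # The pair never occurs in this order.
            embedding[(u, v)] = 0.0
    return embedding


if __name__ == "__main__":
    for pair, value in sgt_embedding("ABBA", ["A", "B"], kappa=1.0).items():
        print(pair, round(value, 4))
```

The result always has one entry per ordered symbol pair, so sequences of different lengths map into the same finite-dimensional feature space, which is what makes the embedding usable for clustering, classification, and search.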

split_training_test
estimate_alphabets
sample_id
get_feature

Get the feature vector at a given feature column index.

project_features

Performs the feature projection.

sort_samples

Sort the sample dataset.

add_sample

Add a data sample to the target sparse sample matrix object.

toFeatureSet

Helper function for casting the R# dataframe runtime object to a CLR dataframe object.

as.MLdataset

Convert a sciBASIC general dataframe to a general machine learning dataset.

description

Get a summary and description of the given dataset.

normalize_matrix

Get the normalization matrix from a given machine learning training dataset.

as.tabular

Convert a machine learning dataset to a dataframe table.

as.sampleSet
read.ML_model

Read the dataset for training the machine learning model.

write.ML_model

Write the data model to a file.

write.sample_set
read.sample_set
MNIST.dims
read.MNIST

Read an MNIST dataset file as an R# dataframe object.

gaussian

Create a demo matrix for running tests.

q_factors

Encode a given numeric sequence as factors by quantile level.
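
As a rough illustration of quantile-level factor encoding, the Python sketch below cuts a numeric sequence at its quantile boundaries and labels each value with the bin it falls into. It only shows the general technique; the function name `quantile_factors`, the `levels` parameter, and the `Q1`, `Q2`, ... labels are assumptions and do not describe the actual `q_factors` signature or output.

```python
import statistics


def quantile_factors(values, levels=4):
    """Label each value with its quantile bin (hypothetical helper).

    Requires at least two values. With levels=4 the labels are
    quartiles: "Q1" for the lowest quarter up to "Q4" for the highest.
    """
    # levels - 1 cut points that split the data into `levels` bins.
    cut_points = statistics.quantiles(values, n=levels, method="inclusive")

    labels = []
    for value in values:
        # Count how many cut points lie strictly below the value.
        level = sum(1 for cut in cut_points if value > cut)
        labels.append(f"Q{level + 1}")
    return labels


if __name__ == "__main__":
    print(quantile_factors([1, 2, 3, 4, 5, 6, 7, 8], levels=4))
```

Because the labels depend only on the rank of each value within the sequence, this kind of encoding is insensitive to the scale of the data and to extreme outliers.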

encoding

Perform feature encoding.

to_bins
to_factors
to_ints
