{stats} R# Documentation

stats


require(R);

#' The R Stats Package
imports "stats" from "Rlapack";

The R Stats Package

R statistical functions. This package contains functions for statistical calculations and random number generation. For a complete list of functions, use library(help = "stats").




.NET CLR function exports
combin

Calculates the binomial coefficient C(n, k), the number of ways to choose k items from n.
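
A minimal sketch, assuming combin takes the arguments (n, k) in the same order as base R's choose:

```r
combin(5, 2)   # assumed signature combin(n, k); C(5, 2) = 10
choose(5, 2)   # the base R equivalent, also 10
```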

pnorm

The Normal Distribution

Density, distribution function, quantile function and random generation for the normal distribution with mean equal to mean and standard deviation equal to sd.
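
For reference, typical calls in the base R style, which this package appears to mirror (d/p/q/r prefixes for density, distribution, quantile, and random generation):

```r
dnorm(0)                      # density at 0: ~0.3989
pnorm(1.96)                   # P(X <= 1.96): ~0.975
qnorm(0.975)                  # quantile function: ~1.96
rnorm(5, mean = 10, sd = 2)   # five random draws with mean 10, sd 2
```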

dnorm
p.adjust

Adjust P-values for Multiple Comparisons

Given a set of p-values, returns p-values adjusted using one of several methods.
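
Assuming the base R p.adjust semantics, a short example with the Benjamini-Hochberg and Bonferroni corrections:

```r
p <- c(0.001, 0.008, 0.039, 0.041, 0.042)
p.adjust(p, method = "BH")           # Benjamini-Hochberg (FDR control)
p.adjust(p, method = "bonferroni")   # family-wise error control
```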

ecdf

Empirical Cumulative Distribution Function

Compute an empirical cumulative distribution function, with several methods for plotting, printing and computing with such an “ecdf” object.
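
Assuming base R semantics, ecdf returns a step function that can be evaluated directly:

```r
x  <- rnorm(100)
Fn <- ecdf(x)   # Fn is itself a function
Fn(0)           # estimated P(X <= 0); near 0.5 for N(0, 1) data
```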

CDF

Empirical Cumulative Distribution Function

Compute an empirical cumulative distribution function

spline

Interpolating Splines

tabulate.mode

Computes the average after removing outliers.

prcomp

Principal Components Analysis

Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp. The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.

Unlike princomp, variances are computed with the usual divisor N - 1. Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
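
A short sketch, assuming the base R interface (a numeric matrix in, a prcomp object out):

```r
m   <- matrix(rnorm(100), nrow = 20)   # 20 observations, 5 variables
pca <- prcomp(m, scale = TRUE)         # center, scale, then SVD
summary(pca)    # proportion of variance per component
pca$x[1:3, ]    # scores of the first three observations
```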

as.dist
corr

Matrix correlation

corr_sign
corr.test

Find the correlations, sample sizes, and probability values between elements of a matrix or data.frame. Although the cor function finds the correlations for a matrix, it does not report probability values. cor.test does, but for only one pair of variables at a time. corr.test uses cor to find the correlations for either complete or pairwise data and reports the sample sizes and probability values as well. For symmetric matrices, raw probabilities are reported below the diagonal and correlations adjusted for multiple comparisons above the diagonal. In the case of different x and y inputs, the default is to adjust the probabilities for multiple tests. Both corr.test and corr.p return raw and adjusted confidence intervals for each correlation.

quantile

Sample Quantiles

The generic function quantile produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1.
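
Assuming the base R signature, quartiles of a small sample:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
quantile(x, probs = c(0.25, 0.5, 0.75))   # quartiles
median(x)                                 # same as the 0.5 quantile
```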

median
level

Gets the quantile levels.

dist

Distance Matrix Computation

This function computes and returns the distance matrix computed by using the specified distance measure to compute the distances between the rows of a data matrix.
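
Assuming base R semantics (rows are observations), a minimal example:

```r
m <- matrix(rnorm(20), nrow = 5)   # 5 rows, 4 columns
dist(m)                            # Euclidean distances between rows
dist(m, method = "manhattan")      # an alternative metric, if supported
```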

pt

The Student t Distribution

Density, distribution function, quantile function and random generation for the t distribution with df degrees of freedom (and optional non-centrality parameter ncp).
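
Assuming the base R d/p/q/r convention for the t family, pt gives the lower-tail probability:

```r
pt(2.0, df = 10)       # P(T <= 2.0) with 10 df: ~0.963
1 - pt(2.0, df = 10)   # upper-tail probability: ~0.037
```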

t.test

Student's t-Test

Performs one and two sample t-tests on vectors of data.
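
Assuming the base R vector interface, the one- and two-sample forms look like this:

```r
x <- rnorm(30, mean = 5.0)
y <- rnorm(30, mean = 5.5)
t.test(x, y)        # two-sample test of equal means
t.test(x, mu = 5)   # one-sample test against mu = 5
```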

fisher.test

Fisher's Exact Test for Count Data

Performs Fisher's exact test for testing the null of independence of rows and columns in a contingency table with fixed marginals.
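
Assuming base R semantics on a 2x2 contingency table:

```r
tbl <- matrix(c(8, 2, 1, 5), nrow = 2)   # counts with fixed margins
fisher.test(tbl)                         # exact test of independence
```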

chisq.test

Pearson's Chi-squared Test for Count Data

chisq.test performs chi-squared contingency table tests and goodness-of-fit tests.
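
Assuming base R semantics, both usages look like this:

```r
obs <- matrix(c(20, 30, 25, 25), nrow = 2)
chisq.test(obs)                              # contingency table test
chisq.test(c(18, 22, 20), p = rep(1/3, 3))   # goodness-of-fit test
```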

moran.test

Calculates Moran's I quickly for point data; tests for spatial clustering via the Moran index.

mantel.test

The Mantel test, named after Nathan Mantel, is a statistical test of the correlation between two matrices. The matrices must be of the same dimension; in most applications, they are matrices of interrelations between the same vectors of objects. The test was first published by Mantel, a biostatistician at the National Institutes of Health, in 1967. Accounts of it can be found in advanced statistics books (e.g., Sokal & Rohlf 1995).

lowess
var.test

F Test to Compare Two Variances

Performs an F test to compare the variances of two samples from normal populations.
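
A minimal sketch under base R semantics:

```r
x <- rnorm(25, sd = 1)
y <- rnorm(25, sd = 2)
var.test(x, y)   # F test of the ratio of the two variances
```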

aov

Fit an Analysis of Variance Model

Fit an analysis of variance model by a call to lm for each stratum.

filterMissing

Sets NA, NaN, and Inf values to a given default value.

opls
cmdscale

Classical (Metric) Multidimensional Scaling

Classical multidimensional scaling (MDS) of a data matrix. Also known as principal coordinates analysis (Gower, 1966).
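
Assuming the base R interface (a distance matrix in, coordinates out):

```r
d  <- dist(matrix(rnorm(50), nrow = 10))   # distances between 10 points
xy <- cmdscale(d, k = 2)                   # embed into 2 dimensions
xy                                         # a 10 x 2 coordinate matrix
```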

plsda

Partial Least Squares Discriminant Analysis

plsda is used to calibrate, validate, and apply a partial least squares discriminant analysis (PLS-DA) model.

z

z-score

chi_square

The chi_square function is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. It takes a double input x and an integer freedom for the degrees of freedom, and returns the chi-squared result.

gamma.cdf
gamma
lgamma
beta
lbeta
iqr_outliers

Checks for outliers via the IQR (interquartile range) method.
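
The IQR rule itself, written out in plain R; iqr_outliers is assumed to flag points outside [Q1 - 1.5 IQR, Q3 + 1.5 IQR]:

```r
x  <- c(rnorm(50), 10)             # one injected outlier
q  <- quantile(x, c(0.25, 0.75))   # Q1 and Q3
iq <- q[2] - q[1]                  # interquartile range
x[x < q[1] - 1.5 * iq | x > q[2] + 1.5 * iq]   # the flagged points
```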

poisson_disk

Fast Poisson Disk Sampling in Arbitrary Dimensions. Robert Bridson. ACM SIGGRAPH 2007

kurtosis

Kurtosis is a statistical measure that describes the "tailedness" of the probability distribution of a real-valued random variable. In simpler terms, it indicates the extent to which the tails of the distribution differ from those of a normal distribution.

### Key Points about Kurtosis

1. Definition:
   - Kurtosis is the fourth standardized moment of a distribution.
   - It is calculated as the average of the fourth powers of the deviations of the data from its mean, standardized by the standard deviation raised to the fourth power.

2. Types of Kurtosis:
   - Mesokurtic: Distributions with kurtosis similar to that of the normal distribution (kurtosis value of 3). The tails of a mesokurtic distribution are neither particularly fat nor particularly thin.
   - Leptokurtic: Distributions with kurtosis greater than 3. These distributions have "fat tails" and a sharp peak, indicating more frequent large deviations from the mean than a normal distribution.
   - Platykurtic: Distributions with kurtosis less than 3. These distributions have "thin tails" and a flatter peak, indicating fewer large deviations from the mean than a normal distribution.

3. Excess Kurtosis:
   - Kurtosis is often reported as "excess kurtosis", the kurtosis value minus 3. This adjustment makes the kurtosis of a normal distribution equal to 0.
   - Positive excess kurtosis indicates a leptokurtic distribution, while negative excess kurtosis indicates a platykurtic distribution.

4. Interpretation:
   - High kurtosis in a data set indicates heavy tails or outliers. This can affect the performance of statistical models and methods that assume normality.
   - Low kurtosis indicates that the data has light tails and lacks outliers.

5. Applications:
   - In finance, kurtosis is used to describe the distribution of returns of an investment. High kurtosis indicates a higher risk of extreme returns.
   - In data analysis, kurtosis helps in understanding the shape of the data distribution and identifying potential outliers.

6. Calculation in R:
   - The `kurtosis()` function in the `e1071` package can be used to calculate kurtosis in R.
   - Alternatively, excess kurtosis can be calculated manually:

     kurtosis <- sum((data - mean(data))^4) / ((length(data) - 1) * sd(data)^4) - 3

In short, kurtosis is a statistical measure for understanding the shape of a data distribution, particularly the behavior of its tails. It is widely used in fields including finance, data analysis, and statistics.
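
As a quick check, the manual formula above should give an excess kurtosis near 0 for normally distributed data (a sketch using base R generators, assuming they are available here):

```r
set.seed(42)
data <- rnorm(1000)   # normal data: excess kurtosis should be near 0
sum((data - mean(data))^4) / ((length(data) - 1) * sd(data)^4) - 3
```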

skewness

Skewness is a fundamental statistical measure used to describe the asymmetry of the probability distribution of a real-valued random variable. It provides insights into the direction and extent of the deviation from a symmetric distribution.

### Key Aspects of Skewness

1. Definition:
   - Skewness is the third standardized moment of a distribution.
   - It is calculated as the average of the cubed deviations of the data from its mean, standardized by the standard deviation raised to the third power.

2. Types of Skewness:
   - Zero Skewness: Indicates a symmetric distribution where the mean, median, and mode are all equal.
   - Positive Skewness (Right-Skewed): The tail on the right side of the distribution is longer or fatter. In this case, the mean is greater than the median.
   - Negative Skewness (Left-Skewed): The tail on the left side of the distribution is longer or fatter. Here, the mean is less than the median.

3. Interpretation:
   - Skewness values close to zero suggest a nearly symmetric distribution.
   - Positive values indicate right-skewed distributions, while negative values indicate left-skewed distributions.
   - The magnitude of the skewness value reflects the degree of asymmetry.

4. Applications:
   - Finance: Used to analyze the distribution of returns on investments, helping investors understand the potential for extreme outcomes.
   - Economics: Assists in examining income distributions, enabling economists to assess income inequality.
   - Natural Sciences: Describes the distribution of experimental data in scientific research.

5. Considerations:
   - Skewness is just one aspect of distribution shape and should be considered alongside other statistical measures, such as kurtosis, for a comprehensive understanding.
   - For small sample sizes, the estimation of skewness can be unreliable.

In essence, skewness is a statistical tool for understanding the asymmetry of data distributions, with wide-ranging applications in fields such as finance, economics, and the natural sciences.
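
Following the definition above, a manual sketch of the third standardized moment in base R; the sample is right-skewed, so the value should be clearly positive:

```r
set.seed(1)
data <- rexp(1000)   # exponential data: right-skewed
n <- length(data)
sum((data - mean(data))^3) / (n * sd(data)^3)   # roughly 2 for rexp data
```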

product_moments

In statistics, moments are a set of numerical characteristics that describe the shape and features of a probability distribution. Sample moments are the same concept applied to a sample of data, rather than an entire population. They are used to estimate the corresponding population moments and to understand the properties of the data distribution. Here is a basic introduction to the concept of sample moments.

### Definition

1. Sample Mean (First Moment): The sample mean is the average of the data points in a sample. It is a measure of the central tendency of the data.
   \[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
   where \( x_i \) are the data points and \( n \) is the number of data points in the sample.

2. Sample Variance (Second Central Moment): The sample variance measures the spread or dispersion of the data points around the sample mean.
   \[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
   The denominator \( n-1 \) is used instead of \( n \) to provide an unbiased estimate of the population variance.

3. Sample Standard Deviation: The sample standard deviation is the square root of the sample variance and is also a measure of dispersion.
   \[ s = \sqrt{s^2} \]

4. Higher-Order Sample Moments: Higher-order moments describe the shape of the distribution. For example:
   - Third Moment: Measures skewness, which indicates the asymmetry of the data distribution.
   - Fourth Moment: Measures kurtosis, which indicates the "tailedness" of the data distribution.

### Calculation

To calculate sample moments, you simply apply the formulas to your data set. For instance, to find the sample mean, you add up all the data points and divide by the number of points.

### Use

Sample moments are used to:

- Estimate population parameters.
- Assess the shape of the data distribution (e.g., normality, skewness, kurtosis).
- Form the basis for many statistical tests and procedures.

### Properties

- Unbiasedness: Some sample moments are designed to be unbiased estimators, meaning that the expected value of the sample moment equals the population moment.
- Efficiency: Different sample moments may have different levels of variability; some are more efficient than others.
- Robustness: Certain moments are more robust to outliers than others.

### Example

If you have a sample of data, \( \{2, 4, 4, 4, 5, 5, 7, 9\} \), you can calculate the sample mean, variance, and other moments to understand the central tendency, dispersion, and shape of the data distribution.

Sample moments are fundamental tools in statistics for summarizing and understanding the characteristics of a data set. They provide a way to quantify features such as location, spread, and shape, which are essential for further statistical analysis.

@author WillandSara
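
Working the example above in base R (the numbers follow directly from the formulas):

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
mean(x)   # first moment: 5
var(x)    # second central moment with n-1 divisor: 32/7 = ~4.571
sd(x)     # square root of the variance: ~2.138
```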

moment

Statistical Moments

This function computes the sample moment of specified order.
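
A hypothetical call; the exact signature (and whether the moments are raw or central) is not documented above, so treat this as an assumption:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
moment(x, 1)   # assumed signature moment(x, order); order 1 ~ mean(x)
```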

emd_dist

Earth Mover's Distance

Implementation of the Fast Earth Mover's Algorithm by Ofir Pele and Michael Werman.

