{stats} R# Documentation

stats


require(R);

#' The R Stats Package
imports "stats" from "Rlapack";

The R Stats Package

R statistical functions. This package contains functions for statistical calculations and random number generation. For a complete list of functions, use library(help = "stats").




.NET CLR function exports
combin

Calculates the binomial coefficient C(n, k), the number of ways to choose k items from n.
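
A minimal sketch, assuming combin takes the arguments (n, k) in the same order as base R's choose:

```r
combin(5, 2)   # assumed signature combin(n, k); C(5, 2) = 10
choose(5, 2)   # the base R equivalent, also 10
```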

pnorm

The Normal Distribution

Density, distribution function, quantile function and random generation for the normal distribution with mean equal to mean and standard deviation equal to sd.
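
For reference, typical calls in the base R style, which this package appears to mirror (d/p/q/r prefixes for density, distribution, quantile, and random generation):

```r
dnorm(0)                      # density at 0: ~0.3989
pnorm(1.96)                   # P(X <= 1.96): ~0.975
qnorm(0.975)                  # quantile function: ~1.96
rnorm(5, mean = 10, sd = 2)   # five random draws with mean 10, sd 2
```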

dnorm
p.adjust

Adjust P-values for Multiple Comparisons

Given a set of p-values, returns p-values adjusted using one of several methods.
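
Assuming the base R p.adjust semantics, a short example with the Benjamini-Hochberg and Bonferroni corrections:

```r
p <- c(0.001, 0.008, 0.039, 0.041, 0.042)
p.adjust(p, method = "BH")           # Benjamini-Hochberg (FDR control)
p.adjust(p, method = "bonferroni")   # family-wise error control
```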

ecdf

Empirical Cumulative Distribution Function

Compute an empirical cumulative distribution function, with several methods for plotting, printing and computing with such an “ecdf” object.
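
Assuming base R semantics, ecdf returns a step function that can be evaluated directly:

```r
x  <- rnorm(100)
Fn <- ecdf(x)   # Fn is itself a function
Fn(0)           # estimated P(X <= 0); near 0.5 for N(0, 1) data
```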

CDF

Empirical Cumulative Distribution Function

Compute an empirical cumulative distribution function

spline

Interpolating Splines

tabulate.mode

Computes the average after removing outliers.

prcomp

Principal Components Analysis

Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp. The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.

Unlike princomp, variances are computed with the usual divisor N - 1. Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
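
A short sketch, assuming the base R interface (a numeric matrix in, a prcomp object out):

```r
m   <- matrix(rnorm(100), nrow = 20)   # 20 observations, 5 variables
pca <- prcomp(m, scale = TRUE)         # center, scale, then SVD
summary(pca)    # proportion of variance per component
pca$x[1:3, ]    # scores of the first three observations
```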

as.dist
corr

Matrix correlation

corr_sign
corr.test

Find the correlations, sample sizes, and probability values between elements of a matrix or data.frame. Although the cor function finds the correlations for a matrix, it does not report probability values. cor.test does, but for only one pair of variables at a time. corr.test uses cor to find the correlations for either complete or pairwise data and reports the sample sizes and probability values as well. For symmetric matrices, raw probabilities are reported below the diagonal and correlations adjusted for multiple comparisons above the diagonal. In the case of different x and y inputs, the default is to adjust the probabilities for multiple tests. Both corr.test and corr.p return raw and adjusted confidence intervals for each correlation.

quantile

Sample Quantiles

The generic function quantile produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1.
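
Assuming the base R signature, quartiles of a small sample:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
quantile(x, probs = c(0.25, 0.5, 0.75))   # quartiles
median(x)                                 # same as the 0.5 quantile
```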

median
level

Gets the quantile levels.

dist

Distance Matrix Computation

This function computes and returns the distance matrix computed by using the specified distance measure to compute the distances between the rows of a data matrix.
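
Assuming base R semantics (rows are observations), a minimal example:

```r
m <- matrix(rnorm(20), nrow = 5)   # 5 rows, 4 columns
dist(m)                            # Euclidean distances between rows
dist(m, method = "manhattan")      # an alternative metric, if supported
```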

pt

The Student t Distribution

Density, distribution function, quantile function and random generation for the t distribution with df degrees of freedom (and optional non-centrality parameter ncp).
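
Assuming the base R d/p/q/r convention for the t family, pt gives the lower-tail probability:

```r
pt(2.0, df = 10)       # P(T <= 2.0) with 10 df: ~0.963
1 - pt(2.0, df = 10)   # upper-tail probability: ~0.037
```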

t.test

Student's t-Test

Performs one and two sample t-tests on vectors of data.
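
Assuming the base R vector interface, the one- and two-sample forms look like this:

```r
x <- rnorm(30, mean = 5.0)
y <- rnorm(30, mean = 5.5)
t.test(x, y)        # two-sample test of equal means
t.test(x, mu = 5)   # one-sample test against mu = 5
```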

fisher.test

Fisher's Exact Test for Count Data

Performs Fisher's exact test for testing the null of independence of rows and columns in a contingency table with fixed marginals.
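
Assuming base R semantics on a 2x2 contingency table:

```r
tbl <- matrix(c(8, 2, 1, 5), nrow = 2)   # counts with fixed margins
fisher.test(tbl)                         # exact test of independence
```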

chisq.test

Pearson's Chi-squared Test for Count Data

chisq.test performs chi-squared contingency table tests and goodness-of-fit tests.
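
Assuming base R semantics, both usages look like this:

```r
obs <- matrix(c(20, 30, 25, 25), nrow = 2)
chisq.test(obs)                              # contingency table test
chisq.test(c(18, 22, 20), p = rep(1/3, 3))   # goodness-of-fit test
```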

moran.test

Calculates Moran's I quickly for point data; tests for spatial clustering via the Moran index.

mantel.test

The Mantel test, named after Nathan Mantel, is a statistical test of the correlation between two matrices. The matrices must be of the same dimension; in most applications, they are matrices of interrelations between the same vectors of objects. The test was first published by Mantel, a biostatistician at the National Institutes of Health, in 1967. Accounts of it can be found in advanced statistics books (e.g., Sokal & Rohlf 1995).

lowess
var.test

F Test to Compare Two Variances

Performs an F test to compare the variances of two samples from normal populations.
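
A minimal sketch under base R semantics:

```r
x <- rnorm(25, sd = 1)
y <- rnorm(25, sd = 2)
var.test(x, y)   # F test of the ratio of the two variances
```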

aov

Fit an Analysis of Variance Model

Fit an analysis of variance model by a call to lm for each stratum.

filterMissing

Sets NA, NaN, and Inf values to a given default value.

opls
cmdscale

Classical (Metric) Multidimensional Scaling

Classical multidimensional scaling (MDS) of a data matrix. Also known as principal coordinates analysis (Gower, 1966).
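
Assuming the base R interface (a distance matrix in, coordinates out):

```r
d  <- dist(matrix(rnorm(50), nrow = 10))   # distances between 10 points
xy <- cmdscale(d, k = 2)                   # embed into 2 dimensions
xy                                         # a 10 x 2 coordinate matrix
```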

plsda

Partial Least Squares Discriminant Analysis

plsda is used to calibrate, validate, and apply a partial least squares discriminant analysis (PLS-DA) model.

z

z-score

chi_square

The chi_square function is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. It takes a double input x and an integer freedom for the degrees of freedom, and returns the chi-squared result.

gamma.cdf
gamma
lgamma
beta
lbeta
iqr_outliers

Checks for outliers via the IQR (interquartile range) method.
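
The IQR rule itself, written out in plain R; iqr_outliers is assumed to flag points outside [Q1 - 1.5 IQR, Q3 + 1.5 IQR]:

```r
x  <- c(rnorm(50), 10)             # one injected outlier
q  <- quantile(x, c(0.25, 0.75))   # Q1 and Q3
iq <- q[2] - q[1]                  # interquartile range
x[x < q[1] - 1.5 * iq | x > q[2] + 1.5 * iq]   # the flagged points
```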

poisson_disk

Fast Poisson Disk Sampling in Arbitrary Dimensions. Robert Bridson. ACM SIGGRAPH 2007

kurtosis

Kurtosis is a statistical measure that describes the "tailedness" of the probability distribution of a real-valued random variable. In simpler terms, it indicates the extent to which the tails of the distribution differ from those of a normal distribution.

### Key Points about Kurtosis

1. Definition:
   - Kurtosis is the fourth standardized moment of a distribution.
   - It is calculated as the average of the fourth powers of the deviations of the data from its mean, standardized by the standard deviation raised to the fourth power.

2. Types of Kurtosis:
   - Mesokurtic: Distributions with kurtosis similar to that of the normal distribution (kurtosis value of 3). The tails of a mesokurtic distribution are neither particularly fat nor particularly thin.
   - Leptokurtic: Distributions with kurtosis greater than 3. These distributions have "fat tails" and a sharp peak, indicating more frequent large deviations from the mean than a normal distribution.
   - Platykurtic: Distributions with kurtosis less than 3. These distributions have "thin tails" and a flatter peak, indicating fewer large deviations from the mean than a normal distribution.

3. Excess Kurtosis:
   - Kurtosis is often reported as "excess kurtosis", the kurtosis value minus 3. This adjustment makes the kurtosis of a normal distribution equal to 0.
   - Positive excess kurtosis indicates a leptokurtic distribution, while negative excess kurtosis indicates a platykurtic distribution.

4. Interpretation:
   - High kurtosis in a data set indicates heavy tails or outliers. This can affect the performance of statistical models and methods that assume normality.
   - Low kurtosis indicates that the data has light tails and lacks outliers.

5. Applications:
   - In finance, kurtosis is used to describe the distribution of returns of an investment. High kurtosis indicates a higher risk of extreme returns.
   - In data analysis, kurtosis helps in understanding the shape of the data distribution and identifying potential outliers.

6. Calculation in R:
   - The `kurtosis()` function in the `e1071` package can be used to calculate kurtosis in R.
   - Alternatively, excess kurtosis can be calculated manually:

     kurtosis <- sum((data - mean(data))^4) / ((length(data) - 1) * sd(data)^4) - 3

In short, kurtosis is a statistical measure for understanding the shape of a data distribution, particularly the behavior of its tails. It is widely used in fields including finance, data analysis, and statistics.
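
As a quick check, the manual formula above should give an excess kurtosis near 0 for normally distributed data (a sketch using base R generators, assuming they are available here):

```r
set.seed(42)
data <- rnorm(1000)   # normal data: excess kurtosis should be near 0
sum((data - mean(data))^4) / ((length(data) - 1) * sd(data)^4) - 3
```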

skewness

Skewness is a fundamental statistical measure used to describe the asymmetry of the probability distribution of a real-valued random variable. It provides insights into the direction and extent of the deviation from a symmetric distribution.

### Key Aspects of Skewness

1. Definition:
   - Skewness is the third standardized moment of a distribution.
   - It is calculated as the average of the cubed deviations of the data from its mean, standardized by the standard deviation raised to the third power.

2. Types of Skewness:
   - Zero Skewness: Indicates a symmetric distribution where the mean, median, and mode are all equal.
   - Positive Skewness (Right-Skewed): The tail on the right side of the distribution is longer or fatter. In this case, the mean is greater than the median.
   - Negative Skewness (Left-Skewed): The tail on the left side of the distribution is longer or fatter. Here, the mean is less than the median.

3. Interpretation:
   - Skewness values close to zero suggest a nearly symmetric distribution.
   - Positive values indicate right-skewed distributions, while negative values indicate left-skewed distributions.
   - The magnitude of the skewness value reflects the degree of asymmetry.

4. Applications:
   - Finance: Used to analyze the distribution of returns on investments, helping investors understand the potential for extreme outcomes.
   - Economics: Assists in examining income distributions, enabling economists to assess income inequality.
   - Natural Sciences: Describes the distribution of experimental data in scientific research.

5. Considerations:
   - Skewness is just one aspect of distribution shape and should be considered alongside other statistical measures, such as kurtosis, for a comprehensive understanding.
   - For small sample sizes, the estimation of skewness can be unreliable.

In essence, skewness is a statistical tool for understanding the asymmetry of data distributions, with wide-ranging applications in fields such as finance, economics, and the natural sciences.
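
Following the definition above, a manual sketch of the third standardized moment in base R; the sample is right-skewed, so the value should be clearly positive:

```r
set.seed(1)
data <- rexp(1000)   # exponential data: right-skewed
n <- length(data)
sum((data - mean(data))^3) / (n * sd(data)^3)   # roughly 2 for rexp data
```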

product_moments

In statistics, moments are a set of numerical characteristics that describe the shape and features of a probability distribution. Sample moments are the same concept applied to a sample of data, rather than an entire population. They are used to estimate the corresponding population moments and to understand the properties of the data distribution. Here is a basic introduction to the concept of sample moments.

### Definition

1. Sample Mean (First Moment): The sample mean is the average of the data points in a sample. It is a measure of the central tendency of the data.
   \[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
   where \( x_i \) are the data points and \( n \) is the number of data points in the sample.

2. Sample Variance (Second Central Moment): The sample variance measures the spread or dispersion of the data points around the sample mean.
   \[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
   The denominator \( n-1 \) is used instead of \( n \) to provide an unbiased estimate of the population variance.

3. Sample Standard Deviation: The sample standard deviation is the square root of the sample variance and is also a measure of dispersion.
   \[ s = \sqrt{s^2} \]

4. Higher-Order Sample Moments: Higher-order moments describe the shape of the distribution. For example:
   - Third Moment: Measures skewness, which indicates the asymmetry of the data distribution.
   - Fourth Moment: Measures kurtosis, which indicates the "tailedness" of the data distribution.

### Calculation

To calculate sample moments, you simply apply the formulas to your data set. For instance, to find the sample mean, you add up all the data points and divide by the number of points.

### Use

Sample moments are used to:

- Estimate population parameters.
- Assess the shape of the data distribution (e.g., normality, skewness, kurtosis).
- Form the basis for many statistical tests and procedures.

### Properties

- Unbiasedness: Some sample moments are designed to be unbiased estimators, meaning that the expected value of the sample moment equals the population moment.
- Efficiency: Different sample moments may have different levels of variability; some are more efficient than others.
- Robustness: Certain moments are more robust to outliers than others.

### Example

If you have a sample of data, \( \{2, 4, 4, 4, 5, 5, 7, 9\} \), you can calculate the sample mean, variance, and other moments to understand the central tendency, dispersion, and shape of the data distribution.

Sample moments are fundamental tools in statistics for summarizing and understanding the characteristics of a data set. They provide a way to quantify features such as location, spread, and shape, which are essential for further statistical analysis.

@author WillandSara
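
Working the example above in base R (the numbers follow directly from the formulas):

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
mean(x)   # first moment: 5
var(x)    # second central moment with n-1 divisor: 32/7 = ~4.571
sd(x)     # square root of the variance: ~2.138
```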

moment

Statistical Moments

This function computes the sample moment of specified order.
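
A hypothetical call; the exact signature (and whether the moments are raw or central) is not documented above, so treat this as an assumption:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
moment(x, 1)   # assumed signature moment(x, order); order 1 ~ mean(x)
```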

emd_dist

Earth Mover's Distance

Implementation of the Fast Earth Mover's Algorithm by Ofir Pele and Michael Werman.

