skewness {stats} | R Documentation |
Skewness is a fundamental statistical measure used to describe the asymmetry of the probability distribution of a
real-valued random variable. It provides insights into the direction and extent of the deviation from a symmetric
distribution.
### Key Aspects of Skewness:
1. Definition:
- Skewness is the third standardized moment of a distribution.
- It is calculated as the average of the cubed deviations of the data from its mean, standardized by the standard deviation raised to the third power.
2. Types of Skewness:
- Zero Skewness: Every symmetric distribution has zero skewness (provided the third moment exists). If the distribution is also unimodal, the mean, median, and mode coincide. Zero skewness on its own does not guarantee symmetry.
- Positive Skewness (Right-Skewed): The tail on the right side of the distribution is longer or fatter. In this case, the mean is typically greater than the median.
- Negative Skewness (Left-Skewed): The tail on the left side of the distribution is longer or fatter. Here, the mean is typically less than the median.
3. Interpretation:
- Skewness values close to zero suggest a nearly symmetric distribution.
- Positive values indicate right-skewed distributions, while negative values indicate left-skewed distributions.
- The magnitude of the skewness value reflects the degree of asymmetry.
4. Applications:
- Finance: Used to analyze the distribution of returns on investments, helping investors understand the potential for extreme outcomes.
- Economics: Assists in examining income distributions, enabling economists to assess income inequality.
- Natural Sciences: Describes the distribution of experimental data in scientific research.
5. Considerations:
- Skewness is just one aspect of distribution shape and should be considered alongside other statistical measures like kurtosis for a comprehensive understanding.
- For small sample sizes, skewness estimates have high sampling variability and are sensitive to individual extreme values, so they can be unreliable.
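The definition and the sign conventions listed above can be checked on small hand-made samples. The following is a minimal base-R sketch, not the package implementation: `skew_g1` is a hypothetical helper name, and the three vectors are made-up illustration data.

```r
# Third standardized moment, using moments that divide by n.
# `skew_g1` is a hypothetical helper, not a library function.
skew_g1 <- function(x) {
  d  <- x - mean(x)   # deviations from the mean
  m2 <- mean(d^2)     # second central moment
  m3 <- mean(d^3)     # third central moment
  m3 / m2^(3 / 2)
}

right_skewed <- c(1, 2, 2, 3, 3, 3, 4, 10)  # one large value: long right tail
left_skewed  <- -right_skewed               # mirror image: long left tail
symmetric    <- c(1, 2, 3, 4, 5)

skew_g1(right_skewed) > 0                   # TRUE: positive skewness
skew_g1(left_skewed) < 0                    # TRUE: negative skewness
skew_g1(symmetric)                          # 0 for a symmetric sample
mean(right_skewed) > median(right_skewed)   # TRUE: mean pulled toward the tail
```

Mirroring the data flips the sign of the skewness but leaves its magnitude unchanged, which is why the magnitude can be read as a degree of asymmetry.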
In essence, skewness is a statistical tool for understanding the asymmetry of data distributions,
with wide-ranging applications in various fields such as finance, economics, and the natural
sciences.
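The caveat about small samples can be illustrated with a quick simulation. This is a sketch under stated assumptions: the seed, the number of replicates (2000), and the sample sizes (10 and 1000) are arbitrary choices, and `skew_g1` is a hypothetical helper computing the third standardized moment.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical helper: third standardized moment with divide-by-n moments
skew_g1 <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

# Repeatedly sample from a standard normal, whose true skewness is 0
est_small <- replicate(2000, skew_g1(rnorm(10)))
est_large <- replicate(2000, skew_g1(rnorm(1000)))

sd(est_small)  # spread of the estimates at n = 10
sd(est_large)  # noticeably smaller spread at n = 1000
```

Even though the underlying distribution is symmetric, individual estimates from samples of size 10 can land far from zero, so a single small-sample skewness value should be read with caution.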
skewness(x, na.rm = FALSE, type = 3)
If x contains missing values and these are not removed, the skewness is NA. Otherwise, write $x_i$ for the non-missing elements of x, $n$ for their number, $\mu$ for their mean, $s$ for their standard deviation, and $m_r = \sum_i (x_i - \mu)^r / n$ for the sample moments of order $r$.
Joanes and Gill (1998) discuss three methods for estimating skewness:
- Type 1: $g_1 = m_3 / m_2^{3/2}$. This is the typical definition used in many older textbooks.
- Type 2: $G_1 = g_1 \sqrt{n(n-1)} / (n-2)$. Used in SAS and SPSS.
- Type 3: $b_1 = m_3 / s^3 = g_1 ((n-1)/n)^{3/2}$. Used in MINITAB and BMDP.
All three skewness measures are unbiased under normality.
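The three estimators can be written out directly from the formulas above. The following is a base-R sketch for illustration, not the package source: `skew_by_type` is a hypothetical name, and its argument handling is an assumption modeled on the description of missing values given earlier.

```r
# Hypothetical helper implementing the three Joanes and Gill (1998) estimators
skew_by_type <- function(x, type = 3, na.rm = FALSE) {
  if (anyNA(x)) {
    if (!na.rm) return(NA_real_)   # missing values not removed: result is NA
    x <- x[!is.na(x)]
  }
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- sum(d^2) / n               # sample moment of order 2
  m3 <- sum(d^3) / n               # sample moment of order 3
  g1 <- m3 / m2^(3 / 2)            # Type 1
  switch(as.character(type),
    "1" = g1,
    "2" = g1 * sqrt(n * (n - 1)) / (n - 2),   # Type 2 (G1)
    "3" = g1 * ((n - 1) / n)^(3 / 2),         # Type 3 (b1)
    stop("type must be 1, 2, or 3")
  )
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
sapply(1:3, function(t) skew_by_type(x, type = t))
# [1] 0.6562500 0.8184876 0.5371325
```

Types 2 and 3 are both rescalings of $g_1$ by a factor that depends only on $n$, so the three values always share the same sign and converge as $n$ grows.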
# Example data
data <- c(2, 4, 4, 4, 5, 5, 7, 9)
# Calculate skewness using the e1071 package
library(e1071)
skewness_value <- skewness(data)
print(skewness_value)
skewness(data, type = 1)
# [1] 0.65625
skewness(data, type = 2)
# [1] 0.8184876
skewness(data, type = 3)
# [1] 0.5371325
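# Manual calculation of skewness (Type 3: b1 = m3 / s^3)
n <- length(data)
mean_data <- mean(data)
sd_data <- sd(data)  # sample standard deviation (n - 1 denominator)
# m3 divides by n (not n - 1), so this matches skewness(data, type = 3)
skewness_manual <- sum((data - mean_data)^3) / (n * sd_data^3)
print(skewness_manual)
# [1] 0.5371325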