PHP Class MathPHP\Statistics\Descriptive

Project: markrogoyski/math-php

Public Methods

Method	Description
IQR ( array $numbers, string $method = 'exclusive' ) : number	IQR - Interquartile range (midspread, middle fifty). Convenience wrapper function for interquartileRange.
coefficientOfVariation ( array $numbers ) : number	Coefficient of variation (cᵥ), also known as relative standard deviation (RSD).
describe ( array $numbers, boolean $population = false ) : array	Get a report of all the descriptive statistics over a list of numbers. Includes mean, median, mode, range, midrange, variance, standard deviation, quartiles, etc.
fiveNumberSummary ( array $numbers ) : array	Five number summary. A descriptive statistic that provides information about a set of observations.
interquartileRange ( array $numbers, string $method = 'exclusive' ) : number	IQR - Interquartile range (midspread, middle fifty). A measure of statistical dispersion.
meanAbsoluteDeviation ( array $numbers ) : numeric	MAD - mean absolute deviation.
medianAbsoluteDeviation ( array $numbers ) : numeric	MAD - median absolute deviation.
midhinge ( array $numbers ) : number	Midhinge. The average of the first and third quartiles, and thus a measure of location.
midrange ( array $numbers ) : number	Midrange - the mean of the largest and smallest values. It is the midpoint of the range; as such, it is a measure of central tendency.
percentile ( array $numbers, integer $P ) : number	Compute the P-th percentile of a list of numbers.
populationVariance ( array $numbers ) : numeric	Population variance. Use when all possible observations of the system are present.
quartiles ( array $numbers, string $method = 'exclusive' ) : array	Quartiles. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
quartilesExclusive ( array $numbers ) : array	Quartiles - Exclusive method. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
quartilesInclusive ( array $numbers ) : array	Quartiles - Inclusive method (R method). Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
range ( array $numbers ) : number	Range - the difference between the largest and smallest values. It is the size of the smallest interval which contains all the data.
sampleVariance ( array $numbers ) : numeric	Unbiased sample variance. Use when only a subset of all possible observations of the system is present.
sd ( array $numbers, boolean $SD＋ = false ) : numeric	sd - Standard deviation - convenience method.
standardDeviation ( array $numbers, boolean $SD＋ = false ) : numeric	Standard deviation. A measure that is used to quantify the amount of variation or dispersion of a set of data values.
variance ( array $numbers, integer $ν ) : numeric	Variance.

Method Details

IQR() public static method

IQR - Interquartile range (midspread, middle fifty). Convenience wrapper function for interquartileRange.

public static IQR ( array $numbers, string $method = 'exclusive' ) : number
$numbers	array
$method	string	What quartile method to use (optional - default: exclusive)
Returns	number

coefficientOfVariation() public static method

A standardized measure of dispersion of a probability distribution or frequency distribution, defined as the ratio of the standard deviation to the mean. It is often expressed as a percentage.
https://en.wikipedia.org/wiki/Coefficient_of_variation

cᵥ = σ / μ

public static coefficientOfVariation ( array $numbers ) : number
$numbers	array
Returns	number
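
The ratio cᵥ = σ / μ can be sketched in plain PHP as below. This is an illustration only, not the library's code: the function name `cvSketch` is hypothetical, and the choice of the population standard deviation (dividing by N) is an assumption, since this page does not say which form the library uses.

```php
<?php
// Hypothetical helper (not part of MathPHP): cv = σ / μ.
// Assumption: σ is the population standard deviation (divide by N).
function cvSketch(array $numbers): float
{
    $n = count($numbers);
    if ($n === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    $mean = array_sum($numbers) / $n;
    if ($mean == 0) {
        throw new InvalidArgumentException('Mean is zero; cv is undefined');
    }
    $sumSquares = 0.0;
    foreach ($numbers as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    $sd = sqrt($sumSquares / $n);
    return $sd / $mean;
}

// Mean is 5 and population σ is 2, so the ratio is 0.4 (40%).
printf("%.1f%%\n", cvSketch([2, 4, 4, 4, 5, 5, 7, 9]) * 100);
```

Because the result is a ratio, it is unit-free, which is why it is often reported as a percentage.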

describe() public static method

Get a report of all the descriptive statistics over a list of numbers. Includes mean, median, mode, range, midrange, variance, standard deviation, quartiles, etc.

public static describe ( array $numbers, boolean $population = false ) : array
$numbers	array
$population	boolean
Returns	array	[ n, mean, median, mode, range, midrange, variance, sd, CV, mean_mad, median_mad, quartiles, skewness, kurtosis, sem, ci_95, ci_99 ]

fiveNumberSummary() public static method

A descriptive statistic that provides information about a set of observations. It consists of the five most important sample percentiles:
1) the sample minimum (smallest observation)
2) the lower quartile or first quartile
3) the median (middle value)
4) the upper quartile or third quartile
5) the sample maximum (largest observation)
https://en.wikipedia.org/wiki/Five-number_summary

public static fiveNumberSummary ( array $numbers ) : array
$numbers	array
Returns	array	[min, Q1, median, Q3, max]

interquartileRange() public static method

A measure of statistical dispersion: the difference between the upper and lower quartiles.
https://en.wikipedia.org/wiki/Interquartile_range

IQR = Q₃ - Q₁

public static interquartileRange ( array $numbers, string $method = 'exclusive' ) : number
$numbers	array
$method	string	What quartile method to use (optional - default: exclusive)
Returns	number

meanAbsoluteDeviation() public static method

The average of the absolute deviations from a central point, here the mean. It is a summary statistic of statistical dispersion or variability.
https://en.wikipedia.org/wiki/Average_absolute_deviation

MAD = ∑|xᵢ - x̄| / N

x̄ is the mean
N is the number of numbers in the population set

public static meanAbsoluteDeviation ( array $numbers ) : numeric
$numbers	array
Returns	numeric
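
A minimal sketch of the formula MAD = ∑|xᵢ - x̄| / N in plain PHP. The function name `meanAbsDevSketch` is hypothetical and is not the library's implementation.

```php
<?php
// Hypothetical helper (not part of MathPHP): mean absolute deviation around the mean.
function meanAbsDevSketch(array $numbers): float
{
    $n = count($numbers);
    if ($n === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    $mean = array_sum($numbers) / $n;
    $total = 0.0;
    foreach ($numbers as $x) {
        $total += abs($x - $mean); // |xᵢ - x̄|
    }
    return $total / $n;
}

// Mean is 5; absolute deviations are 3, 3, 2, 1, 9; their average is 18 / 5.
echo meanAbsDevSketch([2, 2, 3, 4, 14]), "\n";
```

For the list 2, 2, 3, 4, 14 the mean is 5 and the result is 3.6.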

medianAbsoluteDeviation() public static method

The median of the absolute deviations from the data's median. It is a summary statistic of statistical dispersion or variability, and a robust measure of the variability of a univariate sample of quantitative data.
https://en.wikipedia.org/wiki/Median_absolute_deviation

MAD = median(|xᵢ - x̄|)

x̄ is the median

public static medianAbsoluteDeviation ( array $numbers ) : numeric
$numbers	array
Returns	numeric
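
The two-step calculation (median of the data, then median of the absolute deviations from it) might look like this in plain PHP. Both function names are hypothetical helpers written for this illustration, not library calls.

```php
<?php
// Hypothetical helper: median of a non-empty list (sorts a copy).
function medianSketch(array $values): float
{
    sort($values);
    $n = count($values);
    $mid = intdiv($n, 2);
    return $n % 2 === 1
        ? (float) $values[$mid]
        : ($values[$mid - 1] + $values[$mid]) / 2;
}

// Hypothetical helper: MAD = median(|xᵢ - median(x)|).
function medianAbsDevSketch(array $numbers): float
{
    if (count($numbers) === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    $center = medianSketch($numbers);
    $deviations = array_map(function ($x) use ($center) {
        return abs($x - $center);
    }, $numbers);
    return medianSketch($deviations);
}

// Median is 2; sorted absolute deviations are 0, 0, 1, 1, 2, 4, 7; their median is 1.
echo medianAbsDevSketch([1, 1, 2, 2, 4, 6, 9]), "\n";
```

Note how the outlier 9 barely affects the result, which is what makes this measure robust.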

midhinge() public static method

The average of the first and third quartiles, and thus a measure of location. Equivalently, it is the 25% trimmed mid-range or 25% midsummary; it is an L-estimator.
https://en.wikipedia.org/wiki/Midhinge

Midhinge = (first quartile + third quartile) / 2

public static midhinge ( array $numbers ) : number
$numbers	array
Returns	number

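midrange() public static method

The mean of the largest and smallest values. It is the midpoint of the range; as such, it is a measure of central tendency.
https://en.wikipedia.org/wiki/Mid-range

M = (max x + min x) / 2

public static midrange ( array $numbers ) : number
$numbers	array
Returns	number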
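
As a quick illustration of M = (max x + min x) / 2, here is a plain PHP sketch. The name `midrangeSketch` is hypothetical; the built-in `max()` and `min()` do the work.

```php
<?php
// Hypothetical helper (not part of MathPHP): midpoint between the extremes.
function midrangeSketch(array $numbers): float
{
    if (count($numbers) === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    return (max($numbers) + min($numbers)) / 2;
}

// Largest value is 10, smallest is 1, so the midrange is 11 / 2.
echo midrangeSketch([3, 10, 1, 5]), "\n";
```

Only the two extreme values matter, so a single outlier shifts the midrange directly.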

percentile() public static method

Uses the nearest rank method. The P-th percentile (0 <= P <= 100) of a list of N ordered values (sorted from least to greatest) is the smallest value in the list such that P percent of the data is less than or equal to that value. It is obtained by first calculating the ordinal rank, and then taking the value from the ordered list that corresponds to that rank.
https://en.wikipedia.org/wiki/Percentile

n = ⌈(P / 100) × N⌉

n: ordinal rank
P: percentile
N: number of elements in list

public static percentile ( array $numbers, integer $P ) : number
$numbers	array
$P	integer	percentile to calculate
Returns	number	Value in the list corresponding to the P-th percentile
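
The nearest rank calculation can be sketched in plain PHP as follows. The function name `percentileNearestRank` is hypothetical, and the handling of P = 0 (rank clamped to the first element) is an assumption of this sketch; the page does not say how the library treats that case.

```php
<?php
// Hypothetical helper (not part of MathPHP): nearest rank percentile.
function percentileNearestRank(array $numbers, int $P)
{
    $N = count($numbers);
    if ($N === 0) {
        throw new InvalidArgumentException('List must not be empty');
    }
    if ($P < 0 || $P > 100) {
        throw new InvalidArgumentException('P must be between 0 and 100');
    }
    sort($numbers);
    // Ordinal rank n = ceil(P / 100 * N), done in integers to avoid float rounding.
    $n = intdiv($P * $N + 99, 100);
    // Assumption: P = 0 yields rank 0, which has no position, so use the first element.
    $n = max(1, $n);
    return $numbers[$n - 1]; // ranks are 1-based, arrays are 0-based
}

$data = [15, 20, 35, 40, 50];
echo percentileNearestRank($data, 30), "\n"; // rank ceil(1.5) = 2
echo percentileNearestRank($data, 50), "\n"; // rank ceil(2.5) = 3
```

With this list, the 30th percentile is 20 and the 50th percentile is 35. The result is always an element of the list; no interpolation takes place.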

populationVariance() public static method

Use when all possible observations of the system are present. If used with a subset of data (a sample), it will be a biased variance.

σ² = ∑(xᵢ - μ)² / N

μ is the population mean
N is the number of numbers in the population set

public static populationVariance ( array $numbers ) : numeric
$numbers	array
Returns	numeric

quartiles() public static method

Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
https://en.wikipedia.org/wiki/Quartile

There are multiple methods for computing quartiles:
- Inclusive
- Exclusive

public static quartiles ( array $numbers, string $method = 'exclusive' ) : array
$numbers	array
$method	string	What quartile method to use (optional - default: exclusive)
Returns	array	[ 0%, Q1, Q2, Q3, 100%, IQR ]

quartilesExclusive() public static method

Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
https://en.wikipedia.org/wiki/Quartile

- 0% is the smallest number
- Q1 (25%) is the first quartile (lower quartile, 25th percentile)
- Q2 (50%) is the second quartile (median, 50th percentile)
- Q3 (75%) is the third quartile (upper quartile, 75th percentile)
- 100% is the largest number
- interquartile_range is the difference between the upper and lower quartiles (IQR = Q₃ - Q₁)

Method used:
- Use the median to divide the ordered data set into two halves.
- If there is an odd number of data points in the original ordered data set, do not include the median (the central value in the ordered list) in either half.
- If there is an even number of data points in the original ordered data set, split this data set exactly in half.
- The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

This rule is employed by the TI-83 calculator boxplot and "1-Var Stats" functions. This is the most basic method that is commonly taught in math textbooks.

public static quartilesExclusive ( array $numbers ) : array
$numbers	array
Returns	array	[ 0%, Q1, Q2, Q3, 100%, IQR ]
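
The exclusive rule described above can be sketched in plain PHP. The names `medianOfSorted` and `quartilesExclusiveSketch` are hypothetical, the array keys simply mirror the result listing on this page, and the requirement of at least two values is an assumption made so that both halves are non-empty.

```php
<?php
// Hypothetical helper: median of an already-sorted, non-empty list.
function medianOfSorted(array $sorted): float
{
    $n = count($sorted);
    $mid = intdiv($n, 2);
    return $n % 2 === 1
        ? (float) $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

// Hypothetical helper (not part of MathPHP): quartiles by the exclusive method.
function quartilesExclusiveSketch(array $numbers): array
{
    $n = count($numbers);
    if ($n < 2) {
        throw new InvalidArgumentException('Need at least two numbers');
    }
    sort($numbers);
    $half = intdiv($n, 2);
    // For odd n the middle element falls between the two slices and is excluded.
    $lower = array_slice($numbers, 0, $half);
    $upper = array_slice($numbers, $n - $half);
    $q1 = medianOfSorted($lower);
    $q3 = medianOfSorted($upper);
    return [
        '0%'   => $numbers[0],
        'Q1'   => $q1,
        'Q2'   => medianOfSorted($numbers),
        'Q3'   => $q3,
        '100%' => $numbers[$n - 1],
        'IQR'  => $q3 - $q1,
    ];
}

print_r(quartilesExclusiveSketch([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]));
```

For this 11-element list the median 40 is left out of both halves, giving Q1 = 15, Q3 = 43 and IQR = 28.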

quartilesInclusive() public static method

Three points that divide the data set into four equal groups, each group comprising a quarter of the data.
https://en.wikipedia.org/wiki/Quartile

- 0% is the smallest number
- Q1 (25%) is the first quartile (lower quartile, 25th percentile)
- Q2 (50%) is the second quartile (median, 50th percentile)
- Q3 (75%) is the third quartile (upper quartile, 75th percentile)
- 100% is the largest number
- interquartile_range is the difference between the upper and lower quartiles (IQR = Q₃ - Q₁)

Method used:
- Use the median to divide the ordered data set into two halves.
- If there is an odd number of data points in the original ordered data set, include the median (the central value in the ordered list) in both halves.
- If there is an even number of data points in the original ordered data set, split this data set exactly in half.
- The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

The values found by this method are also known as "Tukey's hinges". This is the method that the programming language R uses by default.

public static quartilesInclusive ( array $numbers ) : array
$numbers	array
Returns	array	[ 0%, Q1, Q2, Q3, 100%, IQR ]
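
A plain PHP sketch of the inclusive rule follows. The names `hingeMedian` and `quartilesInclusiveSketch` are hypothetical and the array keys mirror the result listing on this page; this is an illustration of the steps above, not the library's source.

```php
<?php
// Hypothetical helper: median of an already-sorted, non-empty list.
function hingeMedian(array $sorted): float
{
    $n = count($sorted);
    $mid = intdiv($n, 2);
    return $n % 2 === 1
        ? (float) $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

// Hypothetical helper (not part of MathPHP): quartiles by the inclusive method.
function quartilesInclusiveSketch(array $numbers): array
{
    $n = count($numbers);
    if ($n === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    sort($numbers);
    // Rounding up means that for odd n the middle element lands in both halves.
    $half = intdiv($n + 1, 2);
    $lower = array_slice($numbers, 0, $half);
    $upper = array_slice($numbers, $n - $half);
    $q1 = hingeMedian($lower);
    $q3 = hingeMedian($upper);
    return [
        '0%'   => $numbers[0],
        'Q1'   => $q1,
        'Q2'   => hingeMedian($numbers),
        'Q3'   => $q3,
        '100%' => $numbers[$n - 1],
        'IQR'  => $q3 - $q1,
    ];
}

print_r(quartilesInclusiveSketch([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]));
```

For this 11-element list the median 40 belongs to both halves, giving Q1 = 25.5, Q3 = 42.5 and IQR = 17. For lists with an even number of elements the halves contain no shared element, so the quartiles match the exclusive method.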

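range() public static method

The difference between the largest and smallest values. It is the size of the smallest interval which contains all the data, and provides an indication of statistical dispersion.
https://en.wikipedia.org/wiki/Range_(statistics)

R = max x - min x

public static range ( array $numbers ) : number
$numbers	array
Returns	number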

sampleVariance() public static method

Unbiased sample variance. Use when only a subset of all possible observations of the system is present.

S² = ∑(xᵢ - x̄)² / (n - 1)

x̄ is the sample mean
n is the number of numbers in the sample set

public static sampleVariance ( array $numbers ) : numeric
$numbers	array
Returns	numeric

sd() public static method

sd - Standard deviation - convenience method

public static sd ( array $numbers, boolean $SD＋ = false ) : numeric
$numbers	array
$SD＋	boolean
Returns	numeric

standardDeviation() public static method

A measure that is used to quantify the amount of variation or dispersion of a set of data values. A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set. A high standard deviation indicates that the data points are spread out over a wider range of values.
https://en.wikipedia.org/wiki/Standard_deviation

σ = √(σ²) = √(variance)
SD+ = √(S²) = √(sample variance)

public static standardDeviation ( array $numbers, boolean $SD＋ = false ) : numeric
$numbers	array
$SD＋	boolean
Returns	numeric
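
The relationship between the two forms can be sketched in plain PHP. The function `sdSketch` and its `$sample` flag are hypothetical stand-ins for illustration (the flag plays the role that `$SD＋` appears to play in the signature above, which is an assumption).

```php
<?php
// Hypothetical helper (not part of MathPHP): square root of the variance.
// $sample = false divides by N (population); true divides by n - 1 (sample).
function sdSketch(array $numbers, bool $sample = false): float
{
    $n = count($numbers);
    $divisor = $sample ? $n - 1 : $n;
    if ($divisor < 1) {
        throw new InvalidArgumentException('Not enough numbers');
    }
    $mean = array_sum($numbers) / $n;
    $sumSquares = 0.0;
    foreach ($numbers as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    return sqrt($sumSquares / $divisor);
}

$data = [1, 2, 3, 4, 5];           // mean 3, sum of squared deviations 10
printf("%.4f\n", sdSketch($data));       // sqrt(10 / 5)
printf("%.4f\n", sdSketch($data, true)); // sqrt(10 / 4)
```

The sample form is always the larger of the two for the same data, because it divides by n - 1 instead of N.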

variance() public static method

Variance measures how far a set of numbers is spread out. A variance of zero indicates that all the values are identical. Variance is always non-negative: a small variance indicates that the data points tend to be very close to the mean (expected value) and hence to each other. A high variance indicates that the data points are very spread out around the mean and from each other.
https://en.wikipedia.org/wiki/Variance

σ² = ∑(xᵢ - μ)² / ν

Generalized method that allows setting the degrees of freedom.
- For population variance, set d.f. (ν) to n
- For sample variance, set d.f. (ν) to n - 1
Or use the populationVariance or sampleVariance convenience methods.

μ is the population mean
ν is the degrees of freedom, which is usually the number of numbers in the population set, or n - 1 for a sample set.

public static variance ( array $numbers, integer $ν ) : numeric
$numbers	array
$ν	integer	degrees of freedom
Returns	numeric
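
To show how the degrees of freedom select between the two variants, here is a plain PHP sketch of σ² = ∑(xᵢ - μ)² / ν. The name `varianceSketch` and the parameter `$df` are hypothetical; this is not the library's implementation.

```php
<?php
// Hypothetical helper (not part of MathPHP): variance with explicit degrees of freedom.
function varianceSketch(array $numbers, int $df): float
{
    $n = count($numbers);
    if ($n === 0) {
        throw new InvalidArgumentException('Need at least one number');
    }
    if ($df <= 0) {
        throw new InvalidArgumentException('Degrees of freedom must be positive');
    }
    $mean = array_sum($numbers) / $n;
    $sumSquares = 0.0;
    foreach ($numbers as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    return $sumSquares / $df;
}

$data = [3, 5, 7];                 // mean 5, sum of squared deviations 8
$n = count($data);
echo varianceSketch($data, $n), "\n";     // population variance: 8 / 3
echo varianceSketch($data, $n - 1), "\n"; // sample variance: 8 / 2
```

Passing ν = n reproduces the population variance formula and ν = n - 1 the sample variance formula given in the sections above.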