Descriptive statistics
A descriptive statistic is a summary statistic that quantifies a feature of a dataset. The Fortran Standard Library (stdlib) provides several descriptive statistics, which can be classified into two groups:
the measures of central tendency, such as the mean and the median;
the measures of variability, such as the moment and the variance.
In Fortran, an array of any rank can be treated as a dataset. This guide shows examples of computing different descriptive statistics from an array.
All the functions described in this tutorial can be accessed via the stdlib_stats module.
First, we provide some examples for computing the mean of all elements of an array, or of the elements of an array along a dimension. These examples can easily be adapted to compute other descriptive statistics, such as the median, the moment, or the variance of a dataset.
Mean of an array
The Fortran Standard Library (stdlib) provides the function mean for computing the mean of a whole array:
program mean_all
    use stdlib_stats, only: mean
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])
    print *, mean(y)  ! mean of all six elements: 3.5
end program mean_all
The same function can also be used to compute the mean of the elements of an array along a dimension:
program mean_dim
    use stdlib_stats, only: mean
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])
    print *, mean(y, 1)  ! one mean per column: 1.5, 3.5, 5.5
    print *, mean(y, 2)  ! one mean per row: 3.0, 4.0
end program mean_dim
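To make the role of the dimension argument concrete, the following sketch reproduces the same reductions with Fortran intrinsics only (sum and size), without stdlib. It illustrates what mean(y, dim) computes for this small real array and is not a description of how stdlib implements it; the program name mean_dim_by_hand is made up for this example.

```fortran
program mean_dim_by_hand
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

    ! Whole-array mean: total divided by the number of elements.
    print *, sum(y) / size(y)

    ! Reducing along dimension 1 collapses the rows:
    ! the result has one value per column (3 values).
    print *, sum(y, 1) / size(y, 1)

    ! Reducing along dimension 2 collapses the columns:
    ! the result has one value per row (2 values).
    print *, sum(y, 2) / size(y, 2)
end program mean_dim_by_hand
```

The rank of the result is one less than the rank of the input, and its shape is the input shape with the reduced dimension removed.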
Finally, a mask can also be provided to ignore some elements of the array, as follows:
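program mean_mask
    use stdlib_stats, only: mean
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])
    print *, mean(y, y > 2.)     ! mean of the elements greater than 2: 4.5
    print *, mean(y, 1, y > 2.)  ! per column; column 1 has no element > 2, so its entry is NaN
    print *, mean(y, 2, y > 2.)  ! per row, using only the elements greater than 2: 4.0, 5.0
end program mean_mask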
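The effect of a mask can be sketched with the intrinsics sum and count: only the elements where the mask is true are summed, and the sum is divided by the number of selected elements. This is an illustration under that assumption, not stdlib's implementation, and the program name mean_mask_by_hand is hypothetical.

```fortran
program mean_mask_by_hand
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

    ! Masked mean of the whole array:
    ! only the elements greater than 2 are summed and counted.
    print *, sum(y, mask=y > 2.) / count(y > 2.)

    ! Masked mean along dimension 2: one value per row,
    ! each using only the selected elements of that row.
    print *, sum(y, dim=2, mask=y > 2.) / count(y > 2., dim=2)
end program mean_mask_by_hand
```

A slice in which the mask selects no element would divide zero by zero in this sketch, which is why dimension 1 is left out here: the first column of y contains no element greater than 2.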
Note
The module stdlib_stats provides procedures for arrays of any rank and of any numerical type.
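As a sketch of what this note implies, the example below calls mean on an integer array, a complex array, and a rank-3 real array. The program name, array names, and values are made up for illustration. According to the stdlib_stats specification, integer input produces a real(dp) result, while real and complex input keep their type; treat this as an assumption to check against the version of stdlib you use.

```fortran
program mean_types
    use stdlib_stats, only: mean
    implicit none
    integer :: counts(4) = [1, 2, 3, 4]          ! hypothetical integer data
    complex :: z(2) = [(1., 2.), (3., 4.)]       ! hypothetical complex data
    real    :: cube(2, 2, 2)                     ! hypothetical rank-3 data

    cube = 1.

    print *, mean(counts)    ! integer input; the result is a real value
    print *, mean(z)         ! complex input; the result is complex
    print *, mean(cube, 3)   ! rank-3 input reduced along dimension 3: rank-2 result
end program mean_types
```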
Other descriptive statistics
Other descriptive statistics, such as the median or the variance of an array, can be computed using the stdlib_stats module.
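As a minimal sketch, the mean examples can be adapted as follows. It assumes the functions median, var, and moment of stdlib_stats with the argument lists given in the stdlib specification (moment takes the order as its second argument, and var accepts an optional logical argument corrected); verify these against the version of stdlib you use. The program name other_stats is made up for this example.

```fortran
program other_stats
    use stdlib_stats, only: median, var, moment
    implicit none
    real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

    ! Median of all elements, then along dimension 1 (one value per column).
    print *, median(y)
    print *, median(y, 1)

    ! Variance of all elements, then along dimension 2 (one value per row).
    ! By default the corrected (n - 1) estimator is assumed to be used.
    print *, var(y)
    print *, var(y, 2)

    ! Uncorrected variance (division by n) and the second-order central
    ! moment; mathematically these two quantities coincide.
    print *, var(y, corrected=.false.)
    print *, moment(y, 2)
end program other_stats
```

As with mean, these calls are assumed to accept an optional mask to restrict the computation to selected elements.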
Summary
Using the stdlib_stats module, you can compute:
the measures of central tendency, such as the mean and the median;
the measures of variability, such as the moment and the variance.