Descriptive statistics

A descriptive statistic is a summary statistic that quantifies a feature of a dataset. The Fortran Standard Library (stdlib) provides several descriptive statistics, which fall into two groups:

  • the measures of central tendency, such as the mean and the median;

  • the measures of variability, such as the moment and the variance.

In the context of Fortran, an array of any rank is treated as a dataset. This guide shows examples for computing different descriptive statistics from an array.

All the functions described in this tutorial can be accessed via the stdlib_stats module.

First, we provide some examples for computing the mean of all elements of an array, or of the elements of an array along a dimension. These examples can be easily adapted to compute other descriptive statistics, such as the median, the moment, or the variance of a dataset.

Mean of an array

The Fortran Standard library stdlib provides the function mean for computing the mean of a whole array:

mean_all.f90
program mean_all
  use stdlib_stats, only: mean
  implicit none
  real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

  print*, mean(y)  ! mean of all six elements
end program mean_all

The same function can also be used to compute the mean of the elements of an array along a dimension:

mean_dim.f90
program mean_dim
  use stdlib_stats, only: mean
  implicit none
  real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

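  print*, mean(y, 1)  ! mean of each column (reduction along dimension 1)
  print*, mean(y, 2)  ! mean of each row (reduction along dimension 2)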
end program mean_dim
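
To make the role of the dimension argument concrete, the following sketch reproduces the same averages using only the intrinsic functions sum and size, without stdlib. It illustrates what the results represent and is not the library's implementation; the program name mean_by_hand is hypothetical.

```fortran
program mean_by_hand
  implicit none
  real :: y(2, 3)

  ! Column-major fill: the columns of y are (1,2), (3,4) and (5,6).
  y = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

  ! Mean of all six elements: 21 / 6 = 3.5
  print *, sum(y) / size(y)

  ! Reducing along dimension 1 averages each column: 1.5, 3.5, 5.5
  print *, sum(y, dim=1) / size(y, dim=1)

  ! Reducing along dimension 2 averages each row: 3.0, 4.0
  print *, sum(y, dim=2) / size(y, dim=2)
end program mean_by_hand
```

The result of a reduction along one dimension has one rank less than the input: here a rank-2 array of shape (2, 3) yields a rank-1 array of size 3 for dimension 1 and of size 2 for dimension 2.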

Finally, a mask can also be provided to ignore some elements of the array, as follows:

mean_mask.f90
program mean_mask
  use stdlib_stats, only: mean
  implicit none
  real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

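  print*, mean(y, y > 2.)     ! mean of the elements greater than 2
  print*, mean(y, 1, y > 2.)  ! same selection, averaged within each column
  print*, mean(y, 2, y > 2.)  ! same selection, averaged within each row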
end program mean_mask
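
The effect of the mask can be checked by hand with the intrinsic functions sum and count. The sketch below is only an illustration under that assumption, not the stdlib implementation; the program name mean_mask_by_hand and the variable mask are hypothetical.

```fortran
program mean_mask_by_hand
  implicit none
  real :: y(2, 3)
  logical :: mask(2, 3)

  y = reshape([1., 2., 3., 4., 5., 6.], [2, 3])
  mask = y > 2.

  ! Only 3, 4, 5 and 6 are selected: 18 / 4 = 4.5
  print *, sum(y, mask=mask) / count(mask)

  ! Along dimension 1, the first column (1, 2) has no selected element,
  ! so its masked mean is undefined (0 / 0).
  print *, count(mask, dim=1)          ! 0, 2, 2
  print *, sum(y, dim=1, mask=mask)    ! 0.0, 7.0, 11.0

  ! Along dimension 2, each row keeps two elements: means 4.0 and 5.0
  print *, sum(y, dim=2, mask=mask) / count(mask, dim=2)
end program mean_mask_by_hand
```

A mask that rejects every element of a column or row leaves nothing to average. As we read the stdlib specification of mean, the result is NaN in that case, which would apply to the first column in mean(y, 1, y > 2.); it is worth checking the specification before relying on this.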

Note

The module stdlib_stats provides procedures for arrays of any rank and of any numerical type.
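
As a brief illustration of this note, the sketch below calls mean on an integer array of rank 1 and on a double precision array of rank 3. The program name mean_types and the variable names are hypothetical, and the example assumes that stdlib was built with support for these kinds and ranks.

```fortran
program mean_types
  use stdlib_stats, only: mean
  implicit none
  integer :: counts(4) = [1, 2, 3, 4]
  real(kind(1.0d0)) :: cube(2, 2, 2)

  cube = 1.0d0

  ! Mean of a rank-1 integer array
  print *, mean(counts)

  ! Rank-3 double precision array reduced along its third dimension;
  ! the result is a rank-2 array of shape (2, 2).
  print *, shape(mean(cube, 3))
end program mean_types
```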

Other descriptive statistics

Other descriptive statistics, such as the median or the variance of an array, can be computed using the stdlib_stats module.
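
A minimal sketch of how the mean examples might be adapted is given below. It assumes the functions median, var and moment of stdlib_stats, with the order of the moment passed as the second argument of moment; the program name other_stats is hypothetical, and the exact interfaces should be checked against the stdlib_stats specification.

```fortran
program other_stats
  use stdlib_stats, only: median, var, moment
  implicit none
  real :: y(1:2, 1:3) = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

  ! Median of all elements, then of each column
  print *, median(y)
  print *, median(y, 1)

  ! Variance of all elements, then of each row
  print *, var(y)
  print *, var(y, 2)

  ! Central moment of order 2 of all elements
  print *, moment(y, 2)
end program other_stats
```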

Summary

The stdlib_stats module allows you to compute:

  • the measures of central tendency, such as the mean and the median;

  • the measures of variability, such as the moment and the variance.