Chi-SquaredFit v0.8.5 Chi2fit.Utilities

Provides various utilities:

  • Bootstrapping
  • Derivatives
  • Creating Cumulative Distribution Functions / Histograms from sample data
  • Solving linear, quadratic, and cubic equations
  • Autocorrelation coefficients

Summary

Types

algorithm()
Algorithm used to assign errors to frequency data: Wald score and Wilson score

cdf()
Cumulative Distribution Function

method()
Supported numerical integration methods

Functions

auto(list, opts \\ [nproc: 1])
Calculates the autocorrelation coefficient of a list of observations

bootstrap(total, data, fun, options \\ [])
Implements bootstrapping procedure as resampling with replacement

convert_cdf(arg)
Converts a CDF function to a list of data points

cullen_frey(sample, n \\ 100)
Generates a Cullen & Frey plot for the sample data

cullen_frey_point(data)
Extracts data point with standard deviation from Cullen & Frey plot data

der(parameters, fun, options \\ [])
Calculates the partial derivative of a function and returns the value

empirical_cdf(data, bin \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)
Generates an empirical Cumulative Distribution Function from sample data

error(nauto, atom)
Calculates and returns the error associated with a list of observables

integrate(method, func, a, b, options \\ [])
Numerical integration providing Gauss and Romberg types

jacobian(x, fun, options \\ [])
Calculates the jacobian of the function at the point x

make_histogram(list, binsize \\ 1.0, offset \\ 0.5)
Converts a list of numbers to frequency data

moment(sample, n)
Calculates the nth moment of the sample

momentc(sample, n)
Calculates the nth centralized moment of the sample

momentc(sample, n, mu)
Calculates the nth centralized moment of the sample

momentn(sample, n)
Calculates the nth normalized moment of the sample

momentn(sample, n, mu)
Calculates the nth normalized moment of the sample

momentn(sample, n, mu, sigma)
Calculates the nth normalized moment of the sample

newton(a, b, func, maxiter \\ 10, options)
Newton-Fourier method for locating roots and returning the interval where the root is located

puiseaux(list, result \\ [], flag \\ false)
Converts the input so that the result is a Puiseaux diagram, that is a strict convex shape

read_data(filename)
Reads data from a file specified by filename and returns a stream with the data parsed as floats

unzip(list)
Unzips lists of 1-, 2-, 3-, 4-, and 5-tuples

Types

algorithm()
algorithm() :: :wilson | :wald

Algorithm used to assign errors to frequency data: Wald score and Wilson score.

cdf()

Cumulative Distribution Function

cullenfrey()
cullenfrey() :: [{squared_skewness :: float(), kurtosis :: float()} | nil]

method()
method() :: :gauss | :gauss2 | :gauss3 | :romberg | :romberg2 | :romberg3

Supported numerical integration methods

range()
range() :: {float(), float()} | [float(), ...]

Functions

auto(list, opts \\ [nproc: 1])
auto([number()], Keyword.t()) :: [number()]

Calculates the autocorrelation coefficient of a list of observations.

The implementation uses the discrete Fast Fourier Transform to calculate the autocorrelation. For available options see Chi2fit.FFT.fft/2. Returns a list of the autocorrelation coefficients.

Example

iex> auto [1,2,3]
[14.0, 7.999999999999999, 2.999999999999997]
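
For comparison with the FFT-based implementation, the same quantity can be written as a direct sum over lags. This is a minimal sketch and not the library's code: the module name `AutoSketch` is hypothetical, and it assumes the unnormalized definition that the example output above suggests (the coefficient at lag k is the sum of x[i] * x[i+k]).

```elixir
defmodule AutoSketch do
  # Direct O(n^2) autocorrelation, for illustration only.
  def auto([]), do: []

  def auto(list) do
    n = length(list)

    for k <- 0..(n - 1) do
      # Pair every element with the element k positions later.
      list
      |> Enum.zip(Enum.drop(list, k))
      |> Enum.map(fn {a, b} -> a * b end)
      |> Enum.sum()
    end
  end
end

AutoSketch.auto([1, 2, 3])
# => [14, 8, 3]
```

Up to floating-point rounding this agrees with the FFT result shown in the example.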
bootstrap(total, data, fun, options \\ [])
bootstrap(
  total :: integer(),
  data :: [number()],
  fun :: ([number()], integer() -> number()),
  options :: Keyword.t()
) :: [any()]

Implements bootstrapping procedure as resampling with replacement.

It supports saving intermediate results to a file using :dets. Use the options :safe and :filename (see below).

Arguments:

`total` - Total number of resamplings to perform
`data` - The sample data
`fun` - The function to evaluate
`options` - A keyword list of options, see below.

Options

`:safe` - Whether to save intermediate results to a file, so that an interrupted run can be continued.
      Valid values are `:safe` and `:cont`.
`:filename` - The filename to use for storing intermediate results
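
The core of resampling with replacement can be sketched in a few lines. This is an illustration under stated assumptions and not the library's implementation: the module name `BootstrapSketch` is hypothetical, the callback receives the resample and the iteration number (matching the documented `fun` type), and the :dets persistence behind `:safe` and `:filename` is left out.

```elixir
defmodule BootstrapSketch do
  # Draws `total` resamples of the same size as `data`, with replacement,
  # and evaluates `fun` on each one.
  def bootstrap(total, data, fun) when total > 0 and data != [] do
    n = length(data)

    for k <- 1..total do
      resample = for _ <- 1..n, do: Enum.random(data)
      fun.(resample, k)
    end
  end
end

# Bootstrap distribution of the sample mean.
mean = fn resample, _k -> Enum.sum(resample) / length(resample) end
BootstrapSketch.bootstrap(100, [1, 2, 3, 4], mean)
```

The spread of the returned values estimates the uncertainty of the statistic computed by `fun`.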
convert_cdf(arg)
convert_cdf({cdf(), range()}) :: [{float(), float(), float(), float()}]

Converts a CDF function to a list of data points.

Example

iex> convert_cdf {fn x->{:math.exp(-x),:math.exp(-x)/16,:math.exp(-x)/4} end, {1,4}}
[{1, 0.36787944117144233, 0.022992465073215146, 0.09196986029286058},
 {2, 0.1353352832366127, 0.008458455202288294, 0.033833820809153176},
 {3, 0.049787068367863944, 0.0031116917729914965, 0.012446767091965986},
 {4, 0.01831563888873418, 0.0011447274305458862, 0.004578909722183545}]
cullen_frey(sample, n \\ 100)
cullen_frey(sample :: [number()], n :: integer()) :: cullenfrey()

Generates a Cullen & Frey plot for the sample data.

cullen_frey_point(data)
cullen_frey_point(data :: cullenfrey()) ::
  {{x :: float(), dx :: float()}, {y :: float(), dy :: float()}}

Extracts data point with standard deviation from Cullen & Frey plot data.

der(parameters, fun, options \\ [])
der([float() | {float(), integer()}], ([float()] -> float()), Keyword.t()) ::
  float()

Calculates the partial derivative of a function and returns the value.

Examples

The function value at a point:
iex> der([3.0], fn [x]-> x*x end) |> Float.round(3)
9.0

The first derivative of a function at a point:
iex> der([{3.0,1}], fn [x]-> x*x end) |> Float.round(3)
6.0

The second derivative of a function at a point:
iex> der([{3.0,2}], fn [x]-> x*x end) |> Float.round(3)
2.0

Partial derivatives with respect to two variables:
iex> der([{2.0,1},{3.0,1}], fn [x,y] -> 3*x*x*y end) |> Float.round(3)
12.0
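
The `{value, order}` notation in the examples above can be illustrated with repeated central differences. This is a minimal single-variable sketch and not the library's algorithm: the module name `DerSketch` and the fixed step size are assumptions, and the library may well use a more accurate scheme.

```elixir
defmodule DerSketch do
  # Step size is an assumption; smaller steps amplify rounding error at higher orders.
  @h 1.0e-3

  # Order 0 is the function value itself.
  def der(fun, x, 0), do: fun.(x)

  # Order n is the central difference of the order n-1 derivative.
  def der(fun, x, order) when is_integer(order) and order > 0 do
    (der(fun, x + @h, order - 1) - der(fun, x - @h, order - 1)) / (2 * @h)
  end
end

square = fn x -> x * x end
DerSketch.der(square, 3.0, 0)  # function value, 9.0
DerSketch.der(square, 3.0, 1)  # first derivative, close to 6.0
DerSketch.der(square, 3.0, 2)  # second derivative, close to 2.0
```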
empirical_cdf(data, bin \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)
empirical_cdf(
  [{float(), number()}],
  {number(), number()},
  algorithm(),
  integer()
) :: {cdf(), bins :: [float()], numbins :: pos_integer(), sum :: float()}

Generates an empirical Cumulative Distribution Function from sample data.

Three parameters determine the resulting empirical distribution:

1) algorithm for assigning errors,

2) the size of the bins,

3) a correction for limiting the bounds on the ‘y’ values

When modeling, for example, task effort or duration, some measured tasks have 0 time. In practice this means that the task effort is between 0 and 1 hour. This is where binning of the data comes in. Specify the bin size to control how this is done. A bin size of 1 means that 0 effort is mapped to 1/2 effort (the middle of the bin). This also prevents problems when the fitted distribution cannot cope with an effort of zero.

Supports two ways of assigning errors: Wald score or Wilson score. See [1]. Valid values for the algorithm argument are :wald or :wilson.

In the handbook [1] a cumulative distribution is constructed. For the largest ‘x’ value in the sample, the ‘y’ value is exactly one (1). In combination with the Wald score this gives zero errors on the value ‘1’. If the resulting distribution is used to fit a curve, this may give an infinite contribution to the maximum likelihood function. Use the correction number to get a ‘y’ value slightly less than 1 and prevent this from happening. In particular, the combination of 0 correction, algorithm :wald, and the ‘linear’ model for handling asymmetric errors gives problems.

The algorithm parameter determines how the errors on the ‘y’ values are computed.
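
The difference between the two scores at ‘y’ = 1 can be made concrete with the textbook interval formulas. This is a sketch and not the library's code: the module name `ScoreSketch`, the arguments `k` (count at or below a value) and `n` (sample size), and the default `z = 1.0` are assumptions chosen for illustration.

```elixir
defmodule ScoreSketch do
  # Wald interval: p +/- z * sqrt(p * (1 - p) / n). Collapses to zero width at p = 0 or p = 1.
  def wald(k, n, z \\ 1.0) do
    p = k / n
    half = z * :math.sqrt(p * (1 - p) / n)
    {p - half, p + half}
  end

  # Wilson score interval: keeps a non-zero width even at p = 0 or p = 1.
  def wilson(k, n, z \\ 1.0) do
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z / denom * :math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    {center - half, center + half}
  end
end

# All 10 observations lie at or below the largest value, so p = 1.
ScoreSketch.wald(10, 10)    # => {1.0, 1.0}, a zero-width interval
ScoreSketch.wilson(10, 10)  # lower bound is below 1.0
```

A zero-width interval is what produces the infinite contribution described above; the Wilson score avoids it without needing a correction.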

References

[1] "Handbook of Monte Carlo Methods" by Kroese, Taimre, and Botev, section 8.4
[2] https://en.wikipedia.org/wiki/Cumulative_frequency_analysis
[3] https://arxiv.org/pdf/1112.2593v3.pdf
[4] https://en.wikipedia.org/wiki/Student%27s_t-distribution:
    90% confidence ==> t = 1.645 for many data points (> 120)
    70% confidence ==> t = 1.000
error(nauto, atom)
error([{gamma :: number(), k :: pos_integer()}], :initial_sequence_method) ::
  {var :: number(), lag :: number()}

Calculates and returns the error associated with a list of observables.

Usually these are the result of a Markov Chain Monte Carlo simulation run.

The only supported method is the so-called Initial Sequence Method. See section 1.10.2 (Initial sequence method) of [1].

Input is a list of autocorrelation coefficients. This may be the output of auto/2.

References

[1] ‘Handbook of Markov Chain Monte Carlo’
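
The idea behind the initial sequence method can be sketched as follows: sum adjacent pairs of autocovariances and stop at the first pair whose sum is no longer positive. This is an illustration of the initial positive sequence estimator only, under stated assumptions: the module name `InitialSequenceSketch` is hypothetical, the input is a plain list of autocovariances starting at lag 0 (not the `{gamma, k}` tuples of the spec above), and only the variance is returned, not the lag.

```elixir
defmodule InitialSequenceSketch do
  # Estimate: -gamma_0 + 2 * sum of the leading positive pair sums,
  # where pair m is gamma_(2m) + gamma_(2m+1).
  def variance([gamma0 | _] = gammas) do
    pair_sums =
      gammas
      |> Enum.chunk_every(2, 2, :discard)
      |> Enum.map(&Enum.sum/1)
      |> Enum.take_while(&(&1 > 0))

    -gamma0 + 2 * Enum.sum(pair_sums)
  end
end

InitialSequenceSketch.variance([1.0, 0.5, 0.25, 0.125, -0.1, -0.2])
# => 2.75
```

The pair sums here are 1.5, 0.375, and -0.3; the last one is dropped, which truncates the noisy tail of the autocovariance sequence.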

get_cdf(data, binsize \\ {1.0, 0.5}, algorithm \\ :wilson, correction \\ 0)
get_cdf([number()], number() | {number(), number()}, algorithm(), integer()) ::
  {cdf(), bins :: [float()], numbins :: pos_integer(), sum :: float()}

Calculates the empirical CDF from a sample.

Convenience function that chains make_histogram/2 and empirical_cdf/3.

integrate(method, func, a, b, options \\ [])
integrate(
  method(),
  (float() -> float()),
  a :: float(),
  b :: float(),
  options :: Keyword.t()
) :: float()

Numerical integration providing Gauss and Romberg types.
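
As background for the Romberg family of methods, the following sketch combines trapezoid estimates on successively halved steps with Richardson extrapolation. It is not the library's implementation: the module name `RombergSketch` and the `levels` argument are assumptions, and the documented `options` are not modeled.

```elixir
defmodule RombergSketch do
  # Integrates f over [a, b] with `levels` rounds of interval halving.
  def integrate(f, a, b, levels \\ 6) when levels > 0 do
    # Row 0 holds the single-interval trapezoid estimate.
    first_row = [(b - a) / 2 * (f.(a) + f.(b))]

    1..levels
    |> Enum.reduce(first_row, fn n, prev_row ->
      h = (b - a) / :math.pow(2, n)
      count = round(:math.pow(2, n - 1))
      # Only the new midpoints need evaluating when the step is halved.
      midpoints = for k <- 1..count, do: f.(a + (2 * k - 1) * h)
      trapezoid = hd(prev_row) / 2 + h * Enum.sum(midpoints)

      # Richardson extrapolation: combine with the previous row, column by column.
      {row, _last} =
        prev_row
        |> Enum.with_index(1)
        |> Enum.reduce({[trapezoid], trapezoid}, fn {prev, m}, {acc, last} ->
          next = last + (last - prev) / (:math.pow(4, m) - 1)
          {[next | acc], next}
        end)

      Enum.reverse(row)
    end)
    |> List.last()
  end
end

# The integral of sin over [0, pi] is 2.
RombergSketch.integrate(&:math.sin/1, 0.0, :math.pi())
```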

jacobian(x, fun, options \\ [])

Calculates the jacobian of the function at the point x.

Examples

iex> jacobian([2.0,3.0], fn [x,y] -> x*y end) |> Enum.map(&Float.round(&1))
[3.0, 2.0]
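
For a scalar function of several variables, as in the example above, the jacobian is the list of first partial derivatives. A minimal sketch using one central difference per coordinate follows; the module name `JacobianSketch` and the step size `h` are assumptions, and the library may compute the derivatives differently.

```elixir
defmodule JacobianSketch do
  # One central difference per coordinate of the point x.
  def jacobian(x, fun, h \\ 1.0e-5) when x != [] do
    for i <- 0..(length(x) - 1) do
      up = List.update_at(x, i, &(&1 + h))
      down = List.update_at(x, i, &(&1 - h))
      (fun.(up) - fun.(down)) / (2 * h)
    end
  end
end

JacobianSketch.jacobian([2.0, 3.0], fn [x, y] -> x * y end)
# close to [3.0, 2.0]
```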
make_histogram(list, binsize \\ 1.0, offset \\ 0.5)
make_histogram([number()], number(), number()) :: [
  {non_neg_integer(), pos_integer()}
]

Converts a list of numbers to frequency data.

The data is divided into bins of size binsize and the number of data points inside each bin is counted. A list of tuples is returned, each holding the bin’s index and the number of data points in that bin.

Examples

iex> make_histogram [1,2,3]
[{1, 1}, {2, 1}, {3, 1}]

iex> make_histogram [1,2,3,4,5,6,5,4,3,4,5,6,7,8,9]
[{1, 1}, {2, 1}, {3, 2}, {4, 3}, {5, 3}, {6, 2}, {7, 1}, {8, 1}, {9, 1}]

iex> make_histogram [1,2,3,4,5,6,5,4,3,4,5,6,7,8,9], 3, 1.5
[{0, 1}, {1, 6}, {2, 6}, {3, 2}]
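
One indexing rule that reproduces the three examples above is `floor((x + offset) / binsize)`. The sketch below uses that rule as an assumption; the module name `HistogramSketch` is hypothetical, and the library's actual rule may differ, in particular for values that fall exactly on a bin edge.

```elixir
defmodule HistogramSketch do
  # Maps each value to a bin index, then counts the values per bin.
  def make_histogram(list, binsize \\ 1.0, offset \\ 0.5) do
    list
    |> Enum.map(fn x -> floor((x + offset) / binsize) end)
    |> Enum.frequencies()
    |> Enum.sort()
  end
end

HistogramSketch.make_histogram([1, 2, 3])
# => [{1, 1}, {2, 1}, {3, 1}]

HistogramSketch.make_histogram([1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 5, 6, 7, 8, 9], 3, 1.5)
# => [{0, 1}, {1, 6}, {2, 6}, {3, 2}]
```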
moment(sample, n)
moment(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth moment of the sample.

Example

iex> moment [1,2,3,4,5,6], 1
3.5
momentc(sample, n)
momentc(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth centralized moment of the sample.

Example

iex> momentc [1,2,3,4,5,6], 1
0.0

iex> momentc [1,2,3,4,5,6], 2
2.9166666666666665
momentc(sample, n, mu)
momentc(sample :: [number()], n :: pos_integer(), mu :: float()) :: float()

Calculates the nth centralized moment of the sample.

Example

iex> momentc [1,2,3,4,5,6], 2, 3.5
2.9166666666666665
momentn(sample, n)
momentn(sample :: [number()], n :: pos_integer()) :: float()

Calculates the nth normalized moment of the sample.

Example

iex> momentn [1,2,3,4,5,6], 1
0.0

iex> momentn [1,2,3,4,5,6], 2
1.0

iex> momentn [1,2,3,4,5,6], 4
1.7314285714285718
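
The relation between the raw, centralized, and normalized moments can be written out directly. The definitions below are inferred from the example values in this module (population moments, dividing by the sample size) and are a sketch, not the library's code; the module name `MomentSketch` is hypothetical.

```elixir
defmodule MomentSketch do
  # Raw moment: mean of x^n.
  def moment(sample, n) do
    Enum.sum(Enum.map(sample, &:math.pow(&1, n))) / length(sample)
  end

  # Centralized moment: mean of (x - mu)^n, with mu the sample mean.
  def momentc(sample, n) do
    mu = moment(sample, 1)
    Enum.sum(Enum.map(sample, &:math.pow(&1 - mu, n))) / length(sample)
  end

  # Normalized moment: centralized moment divided by sigma^n.
  def momentn(sample, n) do
    sigma = :math.sqrt(momentc(sample, 2))
    momentc(sample, n) / :math.pow(sigma, n)
  end
end

sample = [1, 2, 3, 4, 5, 6]
MomentSketch.moment(sample, 1)   # mean, 3.5
MomentSketch.momentc(sample, 2)  # variance, about 2.9167
MomentSketch.momentn(sample, 4)  # kurtosis, about 1.7314
```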
momentn(sample, n, mu)
momentn(sample :: [number()], n :: pos_integer(), mu :: float()) :: float()

Calculates the nth normalized moment of the sample.

Example

iex> momentn [1,2,3,4,5,6], 4, 3.5
1.7314285714285718
momentn(sample, n, mu, sigma)
momentn(
  sample :: [number()],
  n :: pos_integer(),
  mu :: float(),
  sigma :: float()
) :: float()

Calculates the nth normalized moment of the sample.

newton(a, b, func, maxiter \\ 10, options)
newton(
  a :: float(),
  b :: float(),
  func :: (x :: float() -> float()),
  maxiter :: non_neg_integer(),
  options :: Keyword.t()
) :: {float(), {float(), float()}, {float(), float()}}

Newton-Fourier method for locating roots and returning the interval where the root is located.

See [https://en.wikipedia.org/wiki/Newton%27s_method#Newton.E2.80.93Fourier_method]
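
The Newton-Fourier iteration keeps two sequences that close in on the root from both sides, which is why an interval can be returned. The sketch below illustrates the iteration as described in the linked article and is not the library's code: the module name `NewtonFourierSketch` is hypothetical, the derivative is passed explicitly (the documented function takes only `func`), and the function is assumed to be increasing and convex on [a, b] with a sign change.

```elixir
defmodule NewtonFourierSketch do
  # Returns {lower, upper} bounds on the root after the given number of iterations.
  # Assumes f(a) < 0 < f(b) with f' > 0 and f'' > 0 on [a, b].
  def bracket(f, df, a, b, iterations \\ 10) when iterations > 0 do
    Enum.reduce(1..iterations, {a, b}, fn _, {z, x} ->
      # Both updates use the slope at the upper point x.
      slope = df.(x)
      {z - f.(z) / slope, x - f.(x) / slope}
    end)
  end
end

# Root of x^2 - 2 on [1, 2], that is sqrt(2).
f = fn x -> x * x - 2 end
df = fn x -> 2 * x end
NewtonFourierSketch.bracket(f, df, 1.0, 2.0)
```

The upper bound follows the ordinary Newton step; the lower bound reuses the same slope, so it approaches the root from below without overshooting.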

puiseaux(list, result \\ [], flag \\ false)
puiseaux([number()], [number()], boolean()) :: [number()]

Converts the input so that the result is a Puiseaux diagram, that is a strict convex shape.

Examples

iex> puiseaux [1]
[1]

iex> puiseaux [5,3,3,2]
[5, 3, 2.5, 2]
read_data(filename)
read_data(filename :: String.t()) :: Stream.t()

Reads data from a file specified by filename and returns a stream with the data parsed as floats.

It expects one data point per line and removes entries that:

  • are not floats, and
  • are smaller than zero (0)
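
The filtering rules above can be sketched with a lazy stream. This is an illustration and not the library's code: the module name `ReadDataSketch` is hypothetical, and it assumes that a line must parse completely as a number to be kept (integers such as `3` are accepted and returned as `3.0`).

```elixir
defmodule ReadDataSketch do
  # Lazily reads one value per line, keeping only non-negative numbers.
  def read_data(filename) do
    filename
    |> File.stream!()
    |> Stream.map(&String.trim/1)
    |> Stream.map(&Float.parse/1)
    |> Stream.flat_map(fn
      # Keep fully parsed, non-negative values.
      {value, ""} when value >= 0.0 -> [value]
      # Drop unparsable lines, trailing garbage, and negative values.
      _ -> []
    end)
  end
end
```

Because the result is a stream, nothing is read until it is consumed, for example with `Enum.to_list/1`.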
richardson(func, init, factor, results \\ [], options)
richardson(
  func :: (term() -> {float(), term()}),
  init :: term(),
  factor :: float(),
  results :: [float()],
  options :: Keyword.t()
) :: float()

Richardson extrapolation.

unzip(list)
unzip(list :: [tuple()]) :: tuple()

Unzips lists of 1-, 2-, 3-, 4-, and 5-tuples.
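
Unzipping turns a list of n-tuples into an n-tuple of lists. A minimal sketch that works for any tuple size follows; the module name `UnzipSketch` is hypothetical and the library's own clauses for 1- to 5-tuples may be written differently.

```elixir
defmodule UnzipSketch do
  # Collects the i-th element of every tuple into the i-th list of the result.
  def unzip([first | _] = list) when tuple_size(first) > 0 do
    0..(tuple_size(first) - 1)
    |> Enum.map(fn i -> Enum.map(list, &elem(&1, i)) end)
    |> List.to_tuple()
  end
end

UnzipSketch.unzip([{1, :a}, {2, :b}])
# => {[1, 2], [:a, :b]}
```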