@stdlib/stats-strided-dcovmatmtk

v0.1.1

dcovmatmtk

Compute the covariance matrix for an M by N double-precision floating-point matrix A and assign the results to a matrix B when provided known means and using a one-pass textbook algorithm.

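The population covariance of two finite populations of size N is given by

$$\operatorname{cov}_N = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(y_i - \mu_y)$$

where the population means are given by

$$\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i$$

and

$$\mu_y = \frac{1}{N} \sum_{i=0}^{N-1} y_i$$

Often in the analysis of data, the true population covariance is not known a priori and must be estimated from samples drawn from population distributions. Applying the formula for the population covariance to sample data yields a biased sample covariance. To compute an unbiased sample covariance for samples of size n,

$$\operatorname{cov}_n = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x}_n)(y_i - \bar{y}_n)$$

where sample means are given by

$$\bar{x}_n = \frac{1}{n} \sum_{i=0}^{n-1} x_i$$

and

$$\bar{y}_n = \frac{1}{n} \sum_{i=0}^{n-1} y_i$$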

The use of the term n-1 is commonly referred to as Bessel's correction. Depending on the characteristics of the population distributions, other correction factors (e.g., n-1.5, n+1, etc.) can yield better estimators.

Given a matrix X containing K data vectors X_i, a covariance matrix (also known as the auto-covariance matrix, dispersion matrix, variance matrix, or variance-covariance matrix) is a square matrix containing the covariance cov(X_i,X_j) between each pair of data vectors for 0 <= i < K and 0 <= j < K.
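
To make the definition concrete, the following plain JavaScript sketch computes a covariance matrix from K data vectors with known means. The helper name `covarianceMatrix` and its nested-array input are assumptions made for illustration only; they are not part of this package, which operates on strided Float64Arrays.

```javascript
// Hypothetical helper (not part of this package): covariance matrix for K data vectors with known means.
function covarianceMatrix( X, means, correction ) {
    var K = X.length;
    var n = ( K > 0 ) ? X[ 0 ].length : 0;
    var d = n - correction;
    var out = [];
    var s;
    var i;
    var j;
    var k;
    for ( i = 0; i < K; i++ ) {
        out.push( [] );
        for ( j = 0; j < K; j++ ) {
            // Accumulate the sum of products of deviations from the known means:
            s = 0.0;
            for ( k = 0; k < n; k++ ) {
                s += ( X[ i ][ k ] - means[ i ] ) * ( X[ j ][ k ] - means[ j ] );
            }
            // Divide by the adjusted number of observations:
            out[ i ].push( ( d > 0 ) ? s / d : NaN );
        }
    }
    return out;
}

var X = [
    [ 1.0, -2.0, 2.0 ],
    [ 2.0, -2.0, 1.0 ]
];
var S = covarianceMatrix( X, [ 1.0/3.0, 1.0/3.0 ], 1 );
// returns [ [ ~4.3333, ~3.8333 ], [ ~3.8333, ~4.3333 ] ]
```

The result is symmetric, since cov(X_i,X_j) equals cov(X_j,X_i), and the diagonal holds the variance of each data vector.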

Installation

npm install @stdlib/stats-strided-dcovmatmtk

Usage

var dcovmatmtk = require( '@stdlib/stats-strided-dcovmatmtk' );

dcovmatmtk( order, orient, uplo, M, N, correction, means, strideM, A, LDA, B, LDB )

Computes the covariance matrix for an M by N double-precision floating-point matrix A and assigns the results to a matrix B when provided known means and using a one-pass textbook algorithm.

var Float64Array = require( '@stdlib/array-float64' );

// Define a 2x3 matrix in which variables are stored along rows in row-major order:
var A = new Float64Array([
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

// Define a vector of known means:
var means = new Float64Array( [ 1.0/3.0, 1.0/3.0 ] );

// Allocate a 2x2 output matrix:
var B = new Float64Array( [ 0.0, 0.0, 0.0, 0.0 ] );

// Perform operation:
var out = dcovmatmtk( 'row-major', 'rows', 'full', 2, 3, 1, means, 1, A, 3, B, 2 );
// returns <Float64Array>[ ~4.3333, ~3.8333, ~3.8333, ~4.3333 ]

var bool = ( B === out );
// returns true

The function has the following parameters:

order: storage layout.
orient: specifies whether variables are stored along columns or along rows.
uplo: specifies whether to overwrite the upper or lower triangular part of B.
M: number of rows in A.
N: number of columns in A.
correction: degrees of freedom adjustment. Setting this parameter to a value other than 0 has the effect of adjusting the divisor during the calculation of the covariance according to N-c where c corresponds to the provided degrees of freedom adjustment. When computing the population covariance, setting this parameter to 0 is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample covariance, setting this parameter to 1 is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
means: Float64Array containing known means.
strideM: stride length for means.
A: input matrix stored in linear memory as a Float64Array.
LDA: stride of the first dimension of A (a.k.a., leading dimension of the matrix A).
B: output matrix stored in linear memory as a Float64Array.
LDB: stride of the first dimension of B (a.k.a., leading dimension of the matrix B).
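
As a rough illustration of how a storage layout and a leading dimension locate a matrix element in linear memory, consider the sketch below. The helper `linearIndex` is hypothetical, and the mapping assumes the conventional BLAS interpretation of a leading dimension (the distance between consecutive rows in row-major order and between consecutive columns in column-major order).

```javascript
// Hypothetical helper: linear index of element (i, j) for a given layout and leading dimension.
function linearIndex( order, i, j, LD ) {
    if ( order === 'row-major' ) {
        return ( i * LD ) + j;
    }
    // Assume column-major otherwise:
    return i + ( j * LD );
}

var A = new Float64Array( [ 1.0, -2.0, 2.0, 2.0, -2.0, 1.0 ] );

// Element in row 1, column 2 of a 2x3 row-major matrix (LDA = 3):
var v1 = A[ linearIndex( 'row-major', 1, 2, 3 ) ];
// returns 1.0

// Element in row 0, column 1 when the same buffer is read as a 2x3 column-major matrix (LDA = 2):
var v2 = A[ linearIndex( 'column-major', 0, 1, 2 ) ];
// returns 2.0
```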

The stride parameters determine which elements in the arrays are accessed at runtime. For example, to iterate over the elements of means in reverse order,

var Float64Array = require( '@stdlib/array-float64' );

var A = new Float64Array([
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

var means = new Float64Array( [ 1.0/3.0, 1.0/3.0 ] );
var B = new Float64Array( [ 0.0, 0.0, 0.0, 0.0 ] );

var out = dcovmatmtk( 'row-major', 'rows', 'full', 2, 3, 1, means, -1, A, 3, B, 2 );
// returns <Float64Array>[ ~4.3333, ~3.8333, ~3.8333, ~4.3333 ]
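
The sketch below illustrates what a negative stride means for element access. It assumes the usual strided array convention in which a negative stride starts iteration at the far end of the array; the helper `startIndex` is hypothetical and only mirrors that convention.

```javascript
// Hypothetical helper: index of the first accessed element for N elements and a given stride.
function startIndex( N, stride ) {
    return ( stride > 0 ) ? 0 : ( 1 - N ) * stride;
}

var values = new Float64Array( [ 10.0, 20.0, 30.0 ] );
var stride = -1;

var idx = startIndex( values.length, stride );
var visited = [];
var i;
for ( i = 0; i < values.length; i++ ) {
    visited.push( values[ idx ] );
    idx += stride;
}
// visited => [ 30.0, 20.0, 10.0 ]
```

In the example above, both known means are equal, so reversing the iteration order does not change the result.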

Note that indexing is relative to the first index. To introduce an offset, use typed array views.

var Float64Array = require( '@stdlib/array-float64' );

var A0 = new Float64Array([
    0.0, 0.0, 0.0,
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

var means0 = new Float64Array( [ 0.0, 1.0/3.0, 1.0/3.0 ] );
var B0 = new Float64Array( [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] );

var A1 = new Float64Array( A0.buffer, A0.BYTES_PER_ELEMENT*3 );
var means1 = new Float64Array( means0.buffer, means0.BYTES_PER_ELEMENT*1 );
var B1 = new Float64Array( B0.buffer, B0.BYTES_PER_ELEMENT*2 );

var out = dcovmatmtk( 'row-major', 'rows', 'full', 2, 3, 1, means1, 1, A1, 3, B1, 2 );
// B0 => <Float64Array>[ 0.0, 0.0, ~4.3333, ~3.8333, ~3.8333, ~4.3333 ]
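
The reason results written to B1 appear in B0 is that a typed array view shares memory with its underlying buffer. A minimal sketch using only built-in typed arrays (the values written are arbitrary):

```javascript
// Allocate a parent array and create a view beginning at the third element:
var parent = new Float64Array( 6 );
var view = new Float64Array( parent.buffer, parent.BYTES_PER_ELEMENT*2 );

var len = view.length;
// returns 4

// Writes through the view land at an offset of two elements in the parent:
view[ 0 ] = 4.0;
view[ 3 ] = 5.0;
// parent => <Float64Array>[ 0.0, 0.0, 4.0, 0.0, 0.0, 5.0 ]
```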

dcovmatmtk.ndarray( orient, uplo, M, N, c, means, sm, om, A, sa1, sa2, oa, B, sb1, sb2, ob )

Computes the covariance matrix for an M by N double-precision floating-point matrix A and assigns the results to a matrix B when provided known means and using a one-pass textbook algorithm and alternative indexing semantics.

var Float64Array = require( '@stdlib/array-float64' );

// Define a 2x3 matrix in which variables are stored along rows in row-major order:
var A = new Float64Array([
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

// Define a vector of known means:
var means = new Float64Array( [ 1.0/3.0, 1.0/3.0 ] );

// Allocate a 2x2 output matrix:
var B = new Float64Array( [ 0.0, 0.0, 0.0, 0.0 ] );

// Perform operation:
var out = dcovmatmtk.ndarray( 'rows', 'full', 2, 3, 1, means, 1, 0, A, 3, 1, 0, B, 2, 1, 0 );
// returns <Float64Array>[ ~4.3333, ~3.8333, ~3.8333, ~4.3333 ]

var bool = ( B === out );
// returns true

The function has the following parameters:

orient: specifies whether variables are stored along columns or along rows.
uplo: specifies whether to overwrite the upper or lower triangular part of B.
M: number of rows in A.
N: number of columns in A.
c: degrees of freedom adjustment. Setting this parameter to a value other than 0 has the effect of adjusting the divisor during the calculation of the covariance according to N-c where c corresponds to the provided degrees of freedom adjustment. When computing the population covariance, setting this parameter to 0 is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample covariance, setting this parameter to 1 is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
means: Float64Array containing known means.
sm: stride length for means.
om: starting index for means.
A: input matrix stored in linear memory as a Float64Array.
sa1: stride of the first dimension of A.
sa2: stride of the second dimension of A.
oa: starting index for A.
B: output matrix stored in linear memory as a Float64Array.
sb1: stride of the first dimension of B.
sb2: stride of the second dimension of B.
ob: starting index for B.
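
To illustrate the effect of the degrees of freedom adjustment on the divisor, the following plain JavaScript sketch computes the variance of a single data vector with a known mean for several values of c. It does not call this package; it only reproduces the arithmetic described above.

```javascript
var x = [ 1.0, -2.0, 2.0 ];
var mean = 1.0 / 3.0;

// Accumulate the sum of squared deviations from the known mean:
var s = 0.0;
var i;
for ( i = 0; i < x.length; i++ ) {
    s += ( x[ i ] - mean ) * ( x[ i ] - mean );
}

// Divide by N-c for different degrees of freedom adjustments:
var population = s / ( x.length - 0 );
// returns ~2.8889

var sample = s / ( x.length - 1 );
// returns ~4.3333

var other = s / ( x.length - 1.5 );
// returns ~5.7778
```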

While typed array views mandate a view offset based on the underlying buffer, the offset parameters support indexing semantics based on starting indices. For example,

var Float64Array = require( '@stdlib/array-float64' );

var A = new Float64Array([
    0.0, 0.0, 0.0,
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

var means = new Float64Array( [ 1.0/3.0, 1.0/3.0 ] );
var B = new Float64Array( [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] );

var out = dcovmatmtk.ndarray( 'rows', 'full', 2, 3, 1, means, -1, 1, A, 3, 1, 3, B, 2, 1, 2 );
// B => <Float64Array>[ 0.0, 0.0, ~4.3333, ~3.8333, ~3.8333, ~4.3333 ]
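
The way stride and offset parameters select elements can be sketched as follows. The helper `ndIndex` is hypothetical; it assumes the common strided convention that element (i, j) resides at index offset + i*stride1 + j*stride2.

```javascript
// Hypothetical helper: linear index of element (i, j) given two strides and a starting index.
function ndIndex( i, j, stride1, stride2, offset ) {
    return offset + ( i * stride1 ) + ( j * stride2 );
}

var A = new Float64Array([
    0.0, 0.0, 0.0,
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0
]);

// With sa1 = 3, sa2 = 1, and oa = 3, the first logical row skips the leading zeros:
var row0 = [
    A[ ndIndex( 0, 0, 3, 1, 3 ) ],
    A[ ndIndex( 0, 1, 3, 1, 3 ) ],
    A[ ndIndex( 0, 2, 3, 1, 3 ) ]
];
// returns [ 1.0, -2.0, 2.0 ]

// The last element of the logical 2x3 matrix:
var last = A[ ndIndex( 1, 2, 3, 1, 3 ) ];
// returns 1.0
```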

Notes

- When orient = 'columns',
    - each column in A represents a variable and each row in A represents an observation.
    - B should be an N by N matrix.
    - the list of known means should be an N element vector.
- When orient = 'rows',
    - each row in A represents a variable and each column in A represents an observation.
    - B should be an M by M matrix.
    - the list of known means should be an M element vector.
- If M or N is equal to 0, both functions return B unchanged.
- If N - c is less than or equal to 0 (where c corresponds to the provided degrees of freedom adjustment), both functions set the elements of B to NaN.
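
The sizing rules above can be summarized in a small sketch. The helper `expectedSizes` is hypothetical and exists only to show how the orientation determines the required sizes of B and the vector of known means.

```javascript
// Hypothetical helper: required sizes of B and the means vector for an M by N matrix A.
function expectedSizes( orient, M, N ) {
    // Variables are stored along columns (N variables) or along rows (M variables):
    var K = ( orient === 'columns' ) ? N : M;
    return {
        'B': [ K, K ],
        'means': K
    };
}

var s1 = expectedSizes( 'rows', 4, 3 );
// returns { 'B': [ 4, 4 ], 'means': 4 }

var s2 = expectedSizes( 'columns', 4, 3 );
// returns { 'B': [ 3, 3 ], 'means': 3 }
```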

Examples

var Float64Array = require( '@stdlib/array-float64' );
var strided2array2d = require( '@stdlib/array-base-strided2array2d' );
var dcovmatmtk = require( '@stdlib/stats-strided-dcovmatmtk' );

// Define a 4x3 matrix in which variables are stored along rows in row-major order:
var A = new Float64Array([
    1.0, -2.0, 2.0,
    2.0, -2.0, 1.0,
    2.0, -2.0, 1.0,
    1.0, -2.0, 2.0
]);

// Define a vector of known means:
var means = new Float64Array( [ 1.0/3.0, 1.0/3.0, 1.0/3.0, 1.0/3.0 ] );

// Allocate a 4x4 output matrix:
var B = new Float64Array( 4*4 );

var out = dcovmatmtk.ndarray( 'rows', 'full', 4, 3, 1, means, 1, 0, A, 3, 1, 0, B, 4, 1, 0 );
// returns <Float64Array>[ ~4.33, ~3.83, ~3.83, ~4.33, ~3.83, ~4.33, ~4.33, ~3.83, ~3.83, ~4.33, ~4.33, ~3.83, ~4.33, ~3.83, ~3.83, ~4.33 ]

console.log( strided2array2d( out, [ 4, 4 ], [ 4, 1 ], 0 ) );

C APIs

Usage

#include "stdlib/stats/strided/dcovmatmtk.h"

stdlib_strided_dcovmatmtk( order, orient, uplo, M, N, correction, *Means, strideM, *A, LDA, *B, LDB )

Computes the covariance matrix for an M by N double-precision floating-point matrix A and assigns the results to a matrix B when provided known means and using a one-pass textbook algorithm.

#include "stdlib/blas/base/shared.h"

// Define a 2x3 matrix in which variables are stored along rows in row-major order:
const double A[] = { 1.0, -2.0, 2.0, 2.0, -2.0, 1.0 };

// Define a vector of known means:
const double means[] = { 1.0/3.0, 1.0/3.0 };

// Allocate a 2x2 output matrix:
double B[] = { 0.0, 0.0, 0.0, 0.0 };

stdlib_strided_dcovmatmtk( CblasRowMajor, CblasRows, -1, 2, 3, 1.0, means, 1, A, 3, B, 2 );

The function accepts the following arguments:

order: [in] CBLAS_LAYOUT storage layout.
orient: [in] CBLAS_ORIENT specifies whether variables are stored along columns or along rows.
uplo: [in] int specifies whether to overwrite the upper or lower triangular part of B.
M: [in] CBLAS_INT number of rows in A.
N: [in] CBLAS_INT number of columns in A.
correction: [in] double degrees of freedom adjustment. Setting this parameter to a value other than 0 has the effect of adjusting the divisor during the calculation of the covariance according to N-c where c corresponds to the provided degrees of freedom adjustment. When computing the population covariance, setting this parameter to 0 is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample covariance, setting this parameter to 1 is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
Means: [in] double* vector containing known means.
strideM: [in] CBLAS_INT stride length for Means.
A: [in] double* input matrix.
LDA: [in] CBLAS_INT stride of the first dimension of A (a.k.a., leading dimension of the matrix A).
B: [out] double* output matrix.
LDB: [in] CBLAS_INT stride of the first dimension of B (a.k.a., leading dimension of the matrix B).

void stdlib_strided_dcovmatmtk( const CBLAS_LAYOUT layout, const CBLAS_ORIENT orient, const int uplo, const CBLAS_INT M, const CBLAS_INT N, const double correction, const double *Means, const CBLAS_INT strideM, const double *A, const CBLAS_INT LDA, double *B, const CBLAS_INT LDB );

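stdlib_strided_dcovmatmtk_ndarray( orient, uplo, M, N, c, *Means, sm, om, *A, sa1, sa2, oa, *B, sb1, sb2, ob )

Computes the covariance matrix for an M by N double-precision floating-point matrix A and assigns the results to a matrix B when provided known means and using a one-pass textbook algorithm and alternative indexing semantics.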

#include "stdlib/blas/base/shared.h"

// Define a 2x3 matrix in which variables are stored along rows in row-major order:
const double A[] = { 1.0, -2.0, 2.0, 2.0, -2.0, 1.0 };

// Define a vector of known means:
const double means[] = { 1.0/3.0, 1.0/3.0 };

// Allocate a 2x2 output matrix:
double B[] = { 0.0, 0.0, 0.0, 0.0 };

stdlib_strided_dcovmatmtk_ndarray( CblasRows, -1, 2, 3, 1.0, means, 1, 0, A, 3, 1, 0, B, 2, 1, 0 );

The function accepts the following arguments:

orient: [in] CBLAS_ORIENT specifies whether variables are stored along columns or along rows.
uplo: [in] int specifies whether to overwrite the upper or lower triangular part of B.
M: [in] CBLAS_INT number of rows in A.
N: [in] CBLAS_INT number of columns in A.
c: [in] double degrees of freedom adjustment. Setting this parameter to a value other than 0 has the effect of adjusting the divisor during the calculation of the covariance according to N-c where c corresponds to the provided degrees of freedom adjustment. When computing the population covariance, setting this parameter to 0 is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample covariance, setting this parameter to 1 is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
Means: [in] double* vector containing known means.
sm: [in] CBLAS_INT stride length for Means.
om: [in] CBLAS_INT starting index for Means.
A: [in] double* input matrix.
sa1: [in] CBLAS_INT stride of the first dimension of A.
sa2: [in] CBLAS_INT stride of the second dimension of A.
oa: [in] CBLAS_INT starting index for A.
B: [out] double* output matrix.
sb1: [in] CBLAS_INT stride of the first dimension of B.
sb2: [in] CBLAS_INT stride of the second dimension of B.
ob: [in] CBLAS_INT starting index for B.

void stdlib_strided_dcovmatmtk_ndarray( const CBLAS_ORIENT orient, const int uplo, const CBLAS_INT M, const CBLAS_INT N, const double correction, const double *Means, const CBLAS_INT strideM, const CBLAS_INT offsetM, const double *A, const CBLAS_INT strideA1, const CBLAS_INT strideA2, const CBLAS_INT offsetA, double *B, const CBLAS_INT strideB1, const CBLAS_INT strideB2, const CBLAS_INT offsetB );

Examples

#include "stdlib/stats/strided/dcovmatmtk.h"
#include "stdlib/blas/base/shared.h"
#include <stdio.h>

int main( void ) {
    // Define a 4x3 matrix in which variables are stored along rows in row-major order:
    const double A[] = {
        1.0, -2.0, 2.0,
        2.0, -2.0, 1.0,
        2.0, -2.0, 1.0,
        1.0, -2.0, 2.0
    };

    // Define a vector of known means:
    const double means[] = { 1.0/3.0, 1.0/3.0, 1.0/3.0, 1.0/3.0 };

    // Allocate a 4x4 output matrix:
    double B[] = {
        0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0
    };

    stdlib_strided_dcovmatmtk( CblasRowMajor, CblasRows, -1, 4, 3, 1.0, means, 1, A, 3, B, 4 );

    // Print the result:
    for ( int i = 0; i < 4; i++ ) {
        for ( int j = 0; j < 4; j++ ) {
            printf( "B[ %i, %i ] = %lf\n", i, j, B[ (i*4)+j ] );
        }
    }
}

Notice

This package is part of stdlib, a standard library for JavaScript and Node.js, with an emphasis on numerical and scientific computing. The library provides a collection of robust, high performance libraries for mathematics, statistics, streams, utilities, and more.

For more information on the project, filing bug reports and feature requests, and guidance on how to develop stdlib, see the main project repository.

License

See LICENSE.
