Minimum variance analysis (MVA)

Notes:

The following method implicitly assumes the data is evenly sampled, otherwise, data resampling is needed.

/home/runner/work/discontinuitypy/discontinuitypy/.pixi/envs/default/lib/python3.12/site-packages/fastcore/docscrape.py:230: UserWarning: potentially wrong underline length... 
Parameters 
----------- in 
see `pyspedas.cotrans.minvar`
...
  else: warn(msg)

minvar

 minvar (data:numpy.ndarray)

*see pyspedas.cotrans.minvar

This program computes the principal variance directions and variances of a vector quantity as well as the associated eigenvalues.*

Type Details
data ndarray
Returns vrot: an array of (npoints, ndim) containing the rotated data in the new coordinate system, ijk.
Vi(maximum direction)=vrot[0,:]
Vj(intermediate direction)=vrot[1,:]
Vk(minimum variance direction)=Vrot[2,:]
Exported source
def minvar(data: np.ndarray):
    """
    see `pyspedas.cotrans.minvar`

    This program computes the principal variance directions and variances of a
    vector quantity as well as the associated eigenvalues.

    Parameters
    -----------
    data:
        Vxyz, an (npoints, ndim) array of data(ie Nx3)

    Returns
    -------
    vrot:
        an array of (npoints, ndim) containing the rotated data in the new coordinate system, ijk.
        Vi(maximum direction)=vrot[0,:]
        Vj(intermediate direction)=vrot[1,:]
        Vk(minimum variance direction)=Vrot[2,:]
    v:
        an (ndim,ndim) array containing the principal axes vectors
        Maximum variance direction eigenvector, Vi=v[*,0]
        Intermediate variance direction, Vj=v[*,1] (descending order)
    w:
        the eigenvalues of the computation
    """

    #  Min var starts here
    # data must be Nx3
    vecavg = np.nanmean(np.nan_to_num(data, nan=0.0), axis=0)

    mvamat = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            mvamat[i, j] = (
                np.nanmean(np.nan_to_num(data[:, i] * data[:, j], nan=0.0))
                - vecavg[i] * vecavg[j]
            )

    # Calculate eigenvalues and eigenvectors
    w, v = np.linalg.eigh(mvamat, UPLO="U")

    # Sorting to ensure descending order
    w = np.abs(w)
    idx = np.flip(np.argsort(w))

    # IDL compatability
    if True:
        if np.sum(w) == 0.0:
            idx = [0, 2, 1]

    w = w[idx]
    v = v[:, idx]

    # Rotate intermediate var direction if system is not Right Handed
    YcrossZdotX = v[0, 0] * (v[1, 1] * v[2, 2] - v[2, 1] * v[1, 2])
    if YcrossZdotX < 0:
        v[:, 1] = -v[:, 1]
        # v[:, 2] = -v[:, 2] # Should not it is being flipped at Z-axis?

    # Ensure minvar direction is along +Z (for FAC system)
    if v[2, 2] < 0:
        v[:, 2] = -v[:, 2]
        v[:, 1] = -v[:, 1]

    vrot = np.array([np.dot(row, v) for row in data])

    return vrot, v, w

Fit maximum variance direction

\[ f(x; A, \mu, \sigma, {\mathrm{form={}'logistic{}'}}) = A \left[1 - \frac{1}{1 + e^{\alpha}} \right] \]

where \(\alpha = (x - \mu)/{\sigma}\). And the derivative is

\[ \frac{df}{dx} = \frac{A}{\sigma} \frac{e^{\alpha}}{(1 + e^{\alpha})^2} \]

at center \(x = \mu\), the derivative is

\[ \frac{df}{dx} = \frac{A}{4 \sigma} \]


fit_maxiumum_variance_direction

 fit_maxiumum_variance_direction (data:xarray.core.dataarray.DataArray,
                                  datetime_unit='s',
                                  return_best_fit:bool=False, **kwargs)

*Fit maximum variance direction data by model

Note: - see datetime_to_numeric in xarray.core.duck_array_ops for more details about converting datetime to numeric - Xarray uses the numpy dtypes datetime64[ns] and timedelta64[ns] to represent datetime data.*


calc_mva_features_all

 calc_mva_features_all (data:xarray.core.dataarray.DataArray,
                        method=typing.Literal['fit', 'derivative'],
                        **kwargs)

Fit Examples and Caveats

image.png

Test

test for ts_max_distance function
import pandas as pd

time = pd.date_range("2000-01-01", periods=10)
x = np.linspace(0, np.pi, 10)
# generate data circular in three dimensions, so the biggest distance is between the first and the last point
data = np.zeros((10, 3))
data[:, 0] = np.cos(x)
data[:, 1] = np.sin(x)
data = xr.DataArray(data, coords={"time": time}, dims=["time", "space"])
fit_maxiumum_variance_direction(data[:, 0], return_best_fit=True)["fit.best_fit"]
array([ 1.01473047,  0.91997617,  0.75951811,  0.51243958,  0.18263172,
       -0.18270987, -0.51251774, -0.75959627, -0.92005434, -1.01480865])
Code
mva_features, vrot = calc_mva_features(data)
vrot.shape
(10, 3)
Code
from fastcore.test import test_eq

# Generate synthetic data
np.random.seed(42)  # for reproducibility
data = np.random.rand(100, 3)  # 100 time points, 3-dimensional data
# Call the mva_features function
features, vrot = calc_mva_features(data)
_features = [
    0.3631060892452051,
    0.8978455426527485,
    -0.24905290500542857,
    0.09753158579102299,
    0.086943767300213,
    0.07393142040422575,
    1.1760056390752571,
    0.9609421690770317,
    0.6152039820297959,
    -0.5922397773398479,
    0.6402091632847049,
    0.61631157045453,
    1.2956351134759623,
    0.19091785005728523,
    0.5182488424049534,
    0.4957624347593598,
]
test_eq(features, _features)