Variance method

Large variance in the magnetic field compared with neighboring intervals

References: (Liu et al. 2022)

Introduction

For each sampling instant \(t\), we define three intervals: the pre-interval \([-1,-1/2]\cdot T+t\), the middle interval \([-1/2,1/2]\cdot T+t\), and the post-interval \([1/2,1]\cdot T+t\), where \(T\) is the time lag. Let the magnetic field time series in these three intervals be labeled \({\mathbf B}_-\), \({\mathbf B}_0\), and \({\mathbf B}_+\), respectively. We then compute the following indices:

\[ I_1 = \frac{\sigma(B_0)}{\max(\sigma(B_-),\sigma(B_+))} \]

\[ I_2 = \frac{\sigma(B_- + B_+)} {\sigma(B_-) + \sigma(B_+)} \]

\[ I_3 = \frac{| \Delta \vec{B} |}{|B_{bg}|} \]

By selecting large but reasonable thresholds for the first two indices (\(I_1>2\), \(I_2>1\)), we can guarantee that the field changes of the identified IDs are large enough to be distinguished from stochastic fluctuations of the magnetic field. The third index (the relative field jump) is a supplementary condition that reduces the uncertainty of the identification.
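As a concrete illustration, the three indices can be evaluated on a synthetic current-sheet-like signal in which a quiet field reverses direction inside the middle interval. This sketch is not part of the package: it assumes \(\sigma\) is the Euclidean norm of the component-wise standard deviations, \(\Delta \vec{B}\) is the difference of the post/pre mean vectors, and \(B_{bg}\) is the mean field over the whole window.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 64  # samples per interval (hypothetical)

# Quiet field before and after, with a reversal of BX; small noise on top
B_pre = rng.normal([5.0, 0.0, 2.0], 0.1, size=(n, 3))
B_post = rng.normal([-5.0, 0.0, 2.0], 0.1, size=(n, 3))
B_mid = np.vstack([B_pre[: n // 2], B_post[: n // 2]])  # jump inside the middle interval

def sigma(B):
    # Norm of the component-wise standard deviations
    return np.linalg.norm(B.std(axis=0))

I1 = sigma(B_mid) / max(sigma(B_pre), sigma(B_post))
I2 = sigma(np.vstack([B_pre, B_post])) / (sigma(B_pre) + sigma(B_post))
dB = B_post.mean(axis=0) - B_pre.mean(axis=0)
B_bg = np.vstack([B_pre, B_mid, B_post]).mean(axis=0)
I3 = np.linalg.norm(dB) / np.linalg.norm(B_bg)

print(I1 > 2, I2 > 1, I3 > 0.1)  # all three conditions hold for this jump
```

Note that \(\sigma(B_- + B_+)\) in \(I_2\) denotes the standard deviation over the *concatenated* pre and post intervals, not an elementwise sum.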

Index of the standard deviation

\[ I_1 = \frac{\sigma(B_0)}{\max(\sigma(B_-),\sigma(B_+))} \]


source

compute_std

 compute_std (df:polars.lazyframe.frame.LazyFrame,
              period:datetime.timedelta, index_column='time',
              cols:list[str]=['BX', 'BY', 'BZ'],
              every:datetime.timedelta=None, result_column='std')
|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| period | timedelta |  | period to group by |
| index_column | str | time |  |
| cols | list | ['BX', 'BY', 'BZ'] |  |
| every | timedelta | None | every to group by (default: period / 2) |
| result_column | str | std |  |

source

add_neighbor_std

 add_neighbor_std (df:polars.lazyframe.frame.LazyFrame,
                   tau:datetime.timedelta, join_strategy='inner',
                   std_column='std', time_column='time')

Get the neighbor standard deviations


source

compute_index_std

 compute_index_std (df:polars.lazyframe.frame.LazyFrame, std_column='std')

Compute the standard deviation index based on the given DataFrame

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| std_column | str | std |  |
| Returns | pl.LazyFrame |  | DataFrame with calculated 'index_std' column. |

Index of fluctuation

\[ I_2 = \frac{\sigma(B_- + B_+)} {\sigma(B_-) + \sigma(B_+)} \]


source

compute_index_fluctuation

 compute_index_fluctuation (df:polars.lazyframe.frame.LazyFrame,
                            std_column='std', clean=True)

source

compute_combinded_std

 compute_combinded_std (df:polars.lazyframe.frame.LazyFrame,
                        cols:list[str], every:datetime.timedelta,
                        period:datetime.timedelta=None,
                        index_column='time', result_column='std_combined')
|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| cols | list |  |  |
| every | timedelta |  | every to group by (default: period / 2) |
| period | timedelta | None | period to group by |
| index_column | str | time |  |
| result_column | str | std_combined |  |

Index of the relative field jump

\[ I_3 = \frac{| \Delta \vec{B} |}{|B_{bg}|} \]


source

pl_dvec

 pl_dvec (columns, *more_columns)

source

compute_index_diff

 compute_index_diff (df:polars.lazyframe.frame.LazyFrame,
                     every:datetime.timedelta, cols:list[str],
                     period:datetime.timedelta=None, clean=True)

source

compute_indices

 compute_indices (df:polars.lazyframe.frame.LazyFrame,
                  tau:datetime.timedelta, cols:list[str]=['BX', 'BY',
                  'BZ'], clean=True, join_strategy='inner')

Compute all indices based on the given DataFrame and tau value.

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  | Input DataFrame. |
| tau | timedelta |  | Time interval value. |
| cols | list | ['BX', 'BY', 'BZ'] |  |
| clean | bool | True |  |
| join_strategy | str | inner |  |
| Returns | LazyFrame |  | Tuple containing DataFrame results for fluctuation index, standard deviation index, and 'index_num'. |

Filtering


source

filter_indices

 filter_indices (df:polars.lazyframe.frame.LazyFrame,
                 index_std_threshold:float=2,
                 index_fluc_threshold:float=1,
                 index_diff_threshold:float=0.1, sparse_num:int=15)

Obsolete

Code

```python
import polars as pl

# Note: `pl_norm` (Euclidean norm across columns) is defined elsewhere in this package.

def _compute_combinded_std(df: pl.LazyFrame, tau, cols: list[str]):
    combined_std_cols = [col_name + "_combined_std" for col_name in cols]
    offsets = [0 * tau, tau / 2]
    combined_std_dfs = []

    for offset in offsets:
        # Bin samples into windows of length `tau`, shifted by `offset`
        truncated_df = df.select(
            (pl.col("time") - offset).dt.truncate(tau, offset=offset).alias("time"),
            pl.col(cols),
        )

        # Relabel each window forward by tau: it becomes the pre-interval of t + tau
        prev_df = truncated_df.select(
            (pl.col("time") + tau),
            pl.col(cols),
        )

        # Relabel each window backward by tau: it becomes the post-interval of t - tau
        next_df = truncated_df.select(
            (pl.col("time") - tau),
            pl.col(cols),
        )

        # Std over the concatenated neighboring windows: sigma(B_- + B_+)
        temp_combined_std_df = (
            pl.concat([prev_df, next_df])
            .group_by("time")
            .agg(pl.col(cols).std(ddof=0).name.suffix("_combined_std"))
            .with_columns(B_std_combined=pl_norm(combined_std_cols))
            .drop(combined_std_cols)
            .sort("time")
        )

        combined_std_dfs.append(temp_combined_std_df)

    combined_std_df = pl.concat(combined_std_dfs)
    return combined_std_df
```

References

Liu, Y. Y., H. S. Fu, J. B. Cao, Z. Wang, R. J. He, Z. Z. Guo, Y. Xu, and Y. Yu. 2022. “Magnetic Discontinuities in the Solar Wind and Magnetosheath: Magnetospheric Multiscale Mission (MMS) Observations.” Astrophysical Journal 930 (1): 63. https://doi.org/10.3847/1538-4357/ac62d2.