References: (Liu et al. 2022)
Introduction
For each sampling instant \(t\), we define three intervals: the pre-interval \([-1,-1/2]\cdot T+t\), the middle interval \([-1/2,1/2]\cdot T+t\), and the post-interval \([1/2,1]\cdot T+t\), where \(T\) is the time lag. The magnetic field time series in these three intervals are labeled \({\mathbf B}_-\), \({\mathbf B}_0\), and \({\mathbf B}_+\), respectively. We then compute the following indices:
\[
I_1 = \frac{\sigma(B_0)}{\max(\sigma(B_-),\sigma(B_+))}
\]
\[
I_2 = \frac{\sigma(B_- + B_+)} {\sigma(B_-) + \sigma(B_+)}
\]
\[
I_3 = \frac{| \Delta \vec{B} |}{|B_{bg}|}
\]
By selecting large and reasonable thresholds for the first two indices (\(I_1>2\), \(I_2>1\)), we can guarantee that the field changes of the identified IDs are large enough to be distinguished from stochastic fluctuations of the magnetic field. The third index (the relative field jump) is a supplementary condition that reduces the uncertainty of the identification.
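As a hypothetical single-component example (all values below are made up), the three indices can be evaluated directly with the standard library's `statistics` module; `+` on Python lists concatenates them, matching the pooled series \(B_- + B_+\) in \(I_2\):

```python
import statistics

# Hypothetical magnetic-field samples (one component) in the three intervals
B_minus = [1.0, 1.1, 0.9, 1.0]   # pre-interval: quiet field near 1
B_0     = [1.0, 2.5, 4.0, 5.0]   # middle interval: contains the jump
B_plus  = [5.0, 5.1, 4.9, 5.0]   # post-interval: quiet field near 5

sigma = statistics.pstdev  # population std (ddof=0), as in the library code

# I1: middle-interval std vs. the larger neighbor std
I1 = sigma(B_0) / max(sigma(B_minus), sigma(B_plus))

# I2: std of the concatenated neighbors vs. the sum of their stds
I2 = sigma(B_minus + B_plus) / (sigma(B_minus) + sigma(B_plus))

# I3: field jump relative to the background field magnitude
dB = abs(statistics.fmean(B_plus) - statistics.fmean(B_minus))
B_bg = statistics.fmean(B_minus + B_plus)
I3 = dB / abs(B_bg)

print(I1 > 2, I2 > 1, I3 > 0.1)  # True True True
```

For this synthetic jump all three indices clear the thresholds quoted above, so the interval would be flagged as an ID candidate.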
Index of the standard deviation
\[
I_1 = \frac{\sigma(B_0)}{\max(\sigma(B_-),\sigma(B_+))}
\]
source
compute_std
```python
compute_std(df: polars.lazyframe.frame.LazyFrame, period: datetime.timedelta,
            index_column='time', cols: list[str] = ['BX', 'BY', 'BZ'],
            every: datetime.timedelta = None, result_column='std')
```

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| period | timedelta |  | period to group by |
| index_column | str | time |  |
| cols | list | ['BX', 'BY', 'BZ'] |  |
| every | timedelta | None | every to group by (default: period / 2) |
| result_column | str | std |  |
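Independently of the library's polars implementation, the windowing that `compute_std` describes (windows of length `period`, a new window starting every `every`, where `every = period / 2` gives half-overlapping windows) can be sketched in pure Python; `windowed_std` and the sample values are illustrative assumptions, not the library function:

```python
import statistics
from datetime import datetime, timedelta

def windowed_std(times, values, period, every):
    """Population std of `values` in windows of length `period`,
    with a new window starting every `every`."""
    start, end = min(times), max(times)
    out = {}
    while start <= end:
        window = [v for t, v in zip(times, values) if start <= t < start + period]
        if window:
            out[start] = statistics.pstdev(window)
        start += every
    return out

# Hypothetical 1 Hz samples alternating between 0 and 1
t0 = datetime(2022, 1, 1)
times = [t0 + timedelta(seconds=i) for i in range(8)]
values = [0.0, 1.0] * 4

stds = windowed_std(times, values,
                    period=timedelta(seconds=4), every=timedelta(seconds=2))
# every full window holds alternating 0/1 samples -> pstdev = 0.5
```

Halving `every` relative to `period` is what later lets each middle window be compared with neighbors shifted by \(\pm\tau\).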
source
add_neighbor_std
```python
add_neighbor_std(df: polars.lazyframe.frame.LazyFrame, tau: datetime.timedelta,
                 join_strategy='inner', std_column='std', time_column='time')
```
Get the neighbor standard deviations
source
compute_index_std
```python
compute_index_std(df: polars.lazyframe.frame.LazyFrame, std_column='std')
```

Compute the standard deviation index based on the given DataFrame.

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| std_column | str | std |  |
| Returns | pl.LazyFrame |  | DataFrame with calculated 'index_std' column. |
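A minimal sketch of how the per-window stds at \(t-\tau\) and \(t+\tau\) combine into \(I_1\); the `index_std` helper and the values keyed by window start time are hypothetical, not the library API:

```python
from datetime import datetime, timedelta

tau = timedelta(seconds=60)

# Hypothetical per-window standard deviations keyed by window start time
stds = {
    datetime(2022, 1, 1, 0, 0): 0.1,   # pre-interval: quiet
    datetime(2022, 1, 1, 0, 1): 2.0,   # middle interval: the jump
    datetime(2022, 1, 1, 0, 2): 0.2,   # post-interval: quiet
}

def index_std(stds, t, tau):
    """I1 = sigma(B_0) / max(sigma(B_-), sigma(B_+))."""
    return stds[t] / max(stds[t - tau], stds[t + tau])

i1 = index_std(stds, datetime(2022, 1, 1, 0, 1), tau)  # 2.0 / 0.2 = 10.0
```

In the library this neighbor lookup is what `add_neighbor_std` provides by joining the frame against itself shifted by \(\pm\tau\).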
Index of fluctuation
\[
I_2 = \frac{\sigma(B_- + B_+)} {\sigma(B_-) + \sigma(B_+)}
\]
source
compute_index_fluctuation
```python
compute_index_fluctuation(df: polars.lazyframe.frame.LazyFrame,
                          std_column='std', clean=True)
```
source
compute_combinded_std
```python
compute_combinded_std(df: polars.lazyframe.frame.LazyFrame, cols: list[str],
                      every: datetime.timedelta,
                      period: datetime.timedelta = None,
                      index_column='time', result_column='std_combined')
```

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  |  |
| cols | list |  |  |
| every | timedelta |  | every to group by (default: period / 2) |
| period | timedelta | None | period to group by |
| index_column | str | time |  |
| result_column | str | std_combined |  |
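A small illustration of the pooled statistic in the numerator of \(I_2\): \(\sigma(B_- + B_+)\) is the standard deviation of the concatenated pre- and post-interval samples, not a sum of two stds (the sample values are made up):

```python
import statistics

# Hypothetical samples from the two neighbor windows of a middle window
B_minus = [1.0, 1.1, 0.9]   # pre-interval near 1
B_plus  = [5.0, 5.1, 4.9]   # post-interval near 5

# Pooling the samples captures the level shift between the neighbors,
# so the combined std is much larger than either individual std.
std_combined = statistics.pstdev(B_minus + B_plus)
```

When the field merely fluctuates without a net change, the pooled std stays comparable to \(\sigma(B_-)+\sigma(B_+)\) and \(I_2\) stays small; a genuine discontinuity inflates the numerator, as here.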
Index of the relative field jump
\[
I_3 = \frac{| \Delta \vec{B} |}{|B_{bg}|}
\]
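Worked with made-up three-component vectors, \(I_3\) compares the magnitude of the field jump \(|\Delta\vec{B}|\) against the background field magnitude \(|B_{bg}|\) (here taken as the mean of the pre- and post-interval fields):

```python
import math

# Hypothetical field vectors averaged over the pre- and post-intervals
B_pre  = (4.0, 0.0, 3.0)
B_post = (-4.0, 0.0, 3.0)

dB   = [a - b for a, b in zip(B_post, B_pre)]          # field jump ΔB
B_bg = [(a + b) / 2 for a, b in zip(B_post, B_pre)]    # background field

I3 = math.hypot(*dB) / math.hypot(*B_bg)               # |ΔB| / |B_bg| = 8/3
```

A large \(I_3\) means the field rotates or changes magnitude substantially relative to its background level, which is why it serves as the supplementary condition above.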
source
pl_dvec
```python
pl_dvec(columns, *more_columns)
```
source
compute_index_diff
```python
compute_index_diff(df: polars.lazyframe.frame.LazyFrame,
                   every: datetime.timedelta, cols: list[str],
                   period: datetime.timedelta = None, clean=True)
```
source
compute_indices
```python
compute_indices(df: polars.lazyframe.frame.LazyFrame, tau: datetime.timedelta,
                cols: list[str] = ['BX', 'BY', 'BZ'], clean=True,
                join_strategy='inner')
```

Compute all indices based on the given DataFrame and tau value.

|  | Type | Default | Details |
|----|----|----|----|
| df | LazyFrame |  | Input DataFrame. |
| tau | timedelta |  | Time interval value. |
| cols | list | ['BX', 'BY', 'BZ'] |  |
| clean | bool | True |  |
| join_strategy | str | inner |  |
| Returns | LazyFrame |  | DataFrame containing the fluctuation index, standard deviation index, and 'index_num'. |
Filtering
source
filter_indices
```python
filter_indices(df: polars.lazyframe.frame.LazyFrame,
               index_std_threshold: float = 2,
               index_fluc_threshold: float = 1,
               index_diff_threshold: float = 0.1, sparse_num: int = 15)
```
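A minimal pure-Python sketch of the thresholding logic, using the default thresholds from the signature above; the records are made up, and the `sparse_num` data-coverage check is omitted here:

```python
# Hypothetical index records for two candidate intervals
events = [
    {"index_std": 5.0, "index_fluctuation": 1.4, "index_diff": 0.3},   # a real ID
    {"index_std": 1.2, "index_fluctuation": 0.8, "index_diff": 0.05},  # noise
]

def passes(e, std_th=2.0, fluc_th=1.0, diff_th=0.1):
    """Keep only candidates that clear all three index thresholds."""
    return (
        e["index_std"] > std_th
        and e["index_fluctuation"] > fluc_th
        and e["index_diff"] > diff_th
    )

ids = [e for e in events if passes(e)]  # only the first record survives
```

In the library the same conjunction is expressed as polars filter predicates over a `LazyFrame`, so the whole pipeline stays lazy until collection.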
Obsolete
Code
```python
def _compute_combinded_std(df: pl.LazyFrame, tau, cols: list[str]):
    combined_std_cols = [col_name + "_combined_std" for col_name in cols]
    offsets = [0 * tau, tau / 2]
    combined_std_dfs = []
    for offset in offsets:
        truncated_df = df.select(
            (pl.col("time") - offset).dt.truncate(tau, offset=offset).alias("time"),
            pl.col(cols),
        )
        prev_df = truncated_df.select(
            (pl.col("time") + tau),
            pl.col(cols),
        )
        next_df = truncated_df.select(
            (pl.col("time") - tau),
            pl.col(cols),
        )
        temp_combined_std_df = (
            pl.concat([prev_df, next_df])
            .group_by("time")
            .agg(pl.col(cols).std(ddof=0).name.suffix("_combined_std"))
            .with_columns(B_std_combined=pl_norm(combined_std_cols))
            .drop(combined_std_cols)
            .sort("time")
        )
        combined_std_dfs.append(temp_combined_std_df)
    combined_std_df = pl.concat(combined_std_dfs)
    return combined_std_df
```
References
Liu, Y. Y., H. S. Fu, J. B. Cao, Z. Wang, R. J. He, Z. Z. Guo, Y. Xu, and Y. Yu. 2022. "Magnetic Discontinuities in the Solar Wind and Magnetosheath: Magnetospheric Multiscale Mission (MMS) Observations." *Astrophysical Journal* 930 (1): 63. https://doi.org/10.3847/1538-4357/ac62d2.