Time resolution effect Analysis

Abstract
To understand if the difference of Juno distribution at 1AU and beyond is due to time resolution, i.e. it’s possile that 1 s is sufficient to resolve distribution at r > 5AU, but insufficient at 1AU. To check this we may compare ARTEMIS data at 1AU for two resolutions - 1 s and 1/5 s
Code
import polars as pl
import polars.selectors as cs

::: {#cell-2 .cell 0=‘h’ 1=‘i’ 2=‘d’ 3=‘e’ execution_count=2}

Code
from beforerr.r import py2rpy_polars
import rpy2.robjects as robjects

%load_ext rpy2.ipython

r = robjects.r
r.source('utils.R')

conv_pl = py2rpy_polars()
R[write to console]: 
Attaching package: ‘dplyr’


R[write to console]: The following objects are masked from ‘package:stats’:

    filter, lag


R[write to console]: The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


R[write to console]: 
Attaching package: ‘scales’


R[write to console]: The following object is masked from ‘package:purrr’:

    discard

:::

Code
ts = 1.00 # unit: seconds
tau = 60 # unit: seconds
data_dir = "../data"
format = "arrow"
Code
parameters = ['j0_k', 'j0_k_norm', 'L_k', 'L_k_norm']

juno_fit = pl.read_ipc(f"{data_dir}/05_reporting/events.JNO.fit.ts_{ts:.2f}s_tau_{tau}s.arrow").with_columns(
    pl.col(parameters).abs(),
).with_columns(cs.numeric().cast(pl.Float64)).drop_nulls().filter(
    pl.col("d_star") < 100,  # exclude extreme values
    pl.col('j0_k') < 100
).with_columns(pl.col('count').cast(pl.UInt32)).lazy()
---------------------------------------------------------------------------
ColumnNotFoundError                       Traceback (most recent call last)
Cell In[11], line 8
      1 parameters = ['j0_k', 'j0_k_norm', 'L_k', 'L_k_norm']
      3 juno_fit = pl.read_ipc(f"{data_dir}/05_reporting/events.JNO.fit.ts_{ts:.2f}s_tau_{tau}s.arrow").with_columns(
      4     pl.col(parameters).abs(),
      5 ).with_columns(cs.numeric().cast(pl.Float64)).drop_nulls().filter(
      6     pl.col("d_star") < 100,  # exclude extreme values
      7     pl.col('j0_k') < 100
----> 8 ).with_columns(pl.col('count').cast(pl.UInt32)).lazy()

File ~/micromamba/envs/juno/lib/python3.10/site-packages/polars/dataframe/frame.py:8315, in DataFrame.with_columns(self, *exprs, **named_exprs)
   8169 def with_columns(
   8170     self,
   8171     *exprs: IntoExpr | Iterable[IntoExpr],
   8172     **named_exprs: IntoExpr,
   8173 ) -> DataFrame:
   8174     """
   8175     Add columns to this DataFrame.
   8176 
   (...)
   8313     └─────┴──────┴─────────────┘
   8314     """
-> 8315     return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)

File ~/micromamba/envs/juno/lib/python3.10/site-packages/polars/lazyframe/frame.py:1940, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, no_optimization, streaming, background, _eager)
   1937 if background:
   1938     return InProcessQuery(ldf.collect_concurrently())
-> 1940 return wrap_df(ldf.collect())

ColumnNotFoundError: count

Error originated just after this operation:
DF ["time", "tstart", "tstop", "t.d_end"]; PROJECT */92 COLUMNS; SELECTION: "None"
Code
parameters = ['j0', 'j0_norm', 'j0_k', 'j0_k_norm', 'L_mn', 'L_k', 'L_mn_norm', 'L_k_norm']

wind_fit = pl.read_ipc(f"{data_dir}/05_reporting/events.Wind.ts_{ts}s_tau_{tau}s.arrow").with_columns(
    pl.col(parameters).abs(),
).with_columns(cs.numeric().cast(pl.Float64)).drop_nulls().filter(
    pl.col("d_star") < 100,  # exclude extreme values
    pl.col('j0') < 100
).with_columns(pl.col('count').cast(pl.UInt32)).lazy()
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/var/folders/tg/rfd0nr_970s3mv1fspgvkkxm0000gn/T/ipykernel_31903/3242119788.py in ?()
      1 parameters = ['j0', 'j0_norm', 'j0_k', 'j0_k_norm', 'L_mn', 'L_k', 'L_mn_norm', 'L_k_norm']
      2 
----> 3 wind_fit = pl.read_ipc(f"{data_dir}/05_reporting/events.Wind.ts_{ts}s_tau_{tau}s.arrow").with_columns(
      4     pl.col(parameters).abs(),
      5 ).with_columns(cs.numeric().cast(pl.Float64)).drop_nulls().filter(
      6     pl.col("d_star") < 100,  # exclude extreme values

~/micromamba/envs/juno/lib/python3.10/site-packages/polars/utils/deprecation.py in ?(*args, **kwargs)
    132         def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
    133             _rename_keyword_argument(
    134                 old_name, new_name, kwargs, function.__name__, version
    135             )
--> 136             return function(*args, **kwargs)

~/micromamba/envs/juno/lib/python3.10/site-packages/polars/utils/deprecation.py in ?(*args, **kwargs)
    132         def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
    133             _rename_keyword_argument(
    134                 old_name, new_name, kwargs, function.__name__, version
    135             )
--> 136             return function(*args, **kwargs)

~/micromamba/envs/juno/lib/python3.10/site-packages/polars/io/ipc/functions.py in ?(source, columns, n_rows, use_pyarrow, memory_map, storage_options, row_index_name, row_index_offset, rechunk)
     99             if n_rows is not None:
    100                 df = df.slice(0, n_rows)
    101             return df
    102 
--> 103         return pl.DataFrame._read_ipc(
    104             data,
    105             columns=columns,
    106             n_rows=n_rows,

~/micromamba/envs/juno/lib/python3.10/site-packages/polars/dataframe/frame.py in ?(cls, source, columns, n_rows, row_index_name, row_index_offset, rechunk, memory_map)
    964             return cls._from_pydf(df._df)
    965 
    966         projection, columns = handle_projection_columns(columns)
    967         self = cls.__new__(cls)
--> 968         self._df = PyDataFrame.read_ipc(
    969             source,
    970             columns,
    971             projection,

FileNotFoundError: No such file or directory (os error 2): ../data/05_reporting/events.Wind.ts_1.0s_tau_60s.arrow
Code
def load_events(name: str, ts: float, tau: float, method ='derivative') -> pl.DataFrame:
    if method == 'derivative':
        format = 'parquet'
        filepath = f"{data_dir}/08_reporting/events/l1/{name}_ts_{ts}s_tau_{tau}s.{format}"
        df = pl.scan_parquet(filepath)
    elif method == 'fit':
        format = 'arrow'
        filepath = f"{data_dir}/05_reporting/events.{name}.ts_{ts:.2f}s_tau_{tau}s.{format}"
        df = pl.read_ipc(filepath).lazy()
    return df.with_columns(
        pl.col(parameters).abs(),
        sat = pl.lit(name),
        ts = pl.lit(f'{ts}s'),
        method = pl.lit(method),
        ts_method = pl.lit(f'{ts}s {method}'),
        label = pl.lit(f'{name} {ts}s {method}')
    ).with_columns(cs.numeric().cast(pl.Float64))
Code
wind_ts_009_all = load_events('Wind', 0.09, 60)
wind_ts_01_all = load_events('Wind', 0.1, 60)
wind_ts_02_all = load_events('Wind', 0.2, 60)
wind_ts_05_all = load_events('Wind', 0.5, 60)
wind_ts_1_all = load_events('Wind', 1, 60)

juno_ts_1_all = load_events('JNO', 1, 60)
juno_ts_1_fit = load_events('JNO', 1, 60, method='fit')


wind_ts_01_fit = load_events('Wind', 0.09, 60, method='fit')
wind_ts_02_fit = load_events('Wind', 0.2, 60, method='fit')
wind_ts_05_fit = load_events('Wind', 0.5, 60, method='fit')
wind_ts_1_fit = load_events('Wind', 1, 60, method='fit')
wind_ts_2_fit = load_events('Wind', 2, 60, method='fit')

wind_fit_dfs = [wind_ts_01_fit, wind_ts_02_fit, wind_ts_05_fit, wind_ts_1_fit, wind_ts_2_fit]
Code
wind_ts_01_fit
naive plan: (run LazyFrame.explain(optimized=True) to see the optimized plan)

WITH_COLUMNS:

[col("count").strict_cast(Float64), col("B_std").strict_cast(Float64), col("B_mean").strict_cast(Float64), col("dB_vec").strict_cast(Float64), col("index_diff").strict_cast(Float64), col("index_std").strict_cast(Float64), col("index_fluctuation").strict_cast(Float64), col("B.after").strict_cast(Float64), col("B.before").strict_cast(Float64), col("b_mag").strict_cast(Float64), col("b_n").strict_cast(Float64), col("bn_over_b").strict_cast(Float64), col("d_star").strict_cast(Float64), col("db_mag").strict_cast(Float64), col("db_over_b").strict_cast(Float64), col("db_over_b_max").strict_cast(Float64), col("fit.stat.chisqr").strict_cast(Float64), col("fit.stat.rsquared").strict_cast(Float64), col("fit.vars.amplitude").strict_cast(Float64), col("fit.vars.c").strict_cast(Float64), col("fit.vars.sigma").strict_cast(Float64), col("rotation_angle").strict_cast(Float64), col("dB_x").strict_cast(Float64), col("dB_y").strict_cast(Float64), col("dB_z").strict_cast(Float64), col("dB_lmn_x").strict_cast(Float64), col("dB_lmn_y").strict_cast(Float64), col("dB_lmn_z").strict_cast(Float64), col("k_x").strict_cast(Float64), col("k_y").strict_cast(Float64), col("k_z").strict_cast(Float64), col("Vl_x").strict_cast(Float64), col("Vl_y").strict_cast(Float64), col("Vl_z").strict_cast(Float64), col("Vn_x").strict_cast(Float64), col("Vn_y").strict_cast(Float64), col("Vn_z").strict_cast(Float64), col("plasma_density").strict_cast(Float64), col("VX (GSM)").strict_cast(Float64), col("VY (GSM)").strict_cast(Float64), col("VZ (GSM)").strict_cast(Float64), col("SW Vth").strict_cast(Float64), col("plasma_speed").strict_cast(Float64), col("plasma_temperature").strict_cast(Float64), col("n.before").strict_cast(Float64), col("T.before").strict_cast(Float64), col("v.ion.before").strict_cast(Float64), col("n.after").strict_cast(Float64), col("T.after").strict_cast(Float64), col("v.ion.after").strict_cast(Float64), col("v_l").strict_cast(Float64), col("v_n").strict_cast(Float64), col("v_k").strict_cast(Float64), col("v_mn").strict_cast(Float64), col("L_n").strict_cast(Float64), col("L_mn").strict_cast(Float64), col("L_k").strict_cast(Float64), col("j0").strict_cast(Float64), col("j0_k").strict_cast(Float64), col("ion_inertial_length").strict_cast(Float64), col("Alfven_speed").strict_cast(Float64), col("j_Alfven").strict_cast(Float64), col("L_n_norm").strict_cast(Float64), col("L_mn_norm").strict_cast(Float64), col("L_k_norm").strict_cast(Float64), col("j0_norm").strict_cast(Float64), col("j0_k_norm").strict_cast(Float64), col("v.Alfven.before").strict_cast(Float64), col("v.Alfven.after").strict_cast(Float64), col("n.change").strict_cast(Float64), col("v.ion.change").strict_cast(Float64), col("T.change").strict_cast(Float64), col("B.change").strict_cast(Float64), col("v.Alfven.change").strict_cast(Float64)]

WITH_COLUMNS:

[col("j0").abs(), col("j0_norm").abs(), col("j0_k").abs(), col("j0_k_norm").abs(), col("L_mn").abs(), col("L_k").abs(), col("L_mn_norm").abs(), col("L_k_norm").abs(), String(Wind).alias("sat"), String(0.09s).alias("ts"), String(fit).alias("method"), String(0.09s fit).alias("ts_method"), String(Wind 0.09s fit).alias("label")]

DF ["time", "tstart", "tstop", "d_tstart"]; PROJECT */83 COLUMNS; SELECTION: "None"
Code
time_filter = pl.col('time').dt.year()==2016
dfs = [juno_ts_1_all, wind_ts_1_all, juno_ts_1_fit, wind_ts_009_all, wind_ts_01_all, wind_ts_02_all, wind_ts_05_all] + wind_fit_dfs

df = pl.concat([_.filter(time_filter) for _ in dfs], how='diagonal').collect()
%R -i df -c conv_pl
Code
%%R
filename <- "ts_effect"
filename <- "ts_effect_new"
Code
%%R
# sort color with 'JUNO 1s' first
plot_ts_effect <- function(df, labels=NULL, color ="label", L_mn_norm_lim = c(0,60), j_norm_lim = c(0,0.8), legend_title = NULL) {

  # if labels is not null, select df with labels and reorder labels
  if (!is.null(labels)){
    df <- df %>%
      filter(label %in% labels) %>%
      mutate(label = factor(label, levels = labels))
  }

  
  add <- "mean" # this seems to be computed before the limits are set
  add <- NULL
  
  common_custom <- scale_color_okabeito(palette = "black_first")

  x <- "L_mn"
  x_lim <- c(0,7500)
  p1 <- ggdensity(df, x = x, color = color, add = add, alpha = 0) + xlim(x_lim) + common_custom

  x <- "L_mn_norm"
  x_lab <- expression(paste("Normalized Thickness (", d[i], ")"))
  p2 <- ggdensity(df, x = x, color = color, add = add, alpha = 0) + xlim(L_mn_norm_lim) + common_custom + labs(x=x_lab)

  x <- "j0"
  x_lim <- c(0,20)
  p3 <- ggdensity(df, x = x, color = color, add = add, alpha = 0) + xlim(x_lim) + common_custom

  x <- "j0_norm"
  x_lab <- expression(paste("Normalized Current Intensity (", J[A], ")"))
  p4 <- ggdensity(df, x = x, color = color, add = add, alpha = 0) + xlim(j_norm_lim) + common_custom + labs(x=x_lab)


  output <- list(p2, p4)
  # change the legend title for each plot
  if (!is.null(legend_title)){
    for (i in 1:length(output)){
      output[[i]] <- ggpar(output[[i]], legend.title = legend_title)
    }
  }

  purrr::reduce(output, `+`) + plot_layout(guides = 'collect', nrow=2) &
    theme(legend.position='top')
}
Code
%%R
filename <- "ts_method/all"
p <- plot_ts_effect(df)
save_plot(filename)
p
R[write to console]: In addition: 
R[write to console]: Warning messages:

R[write to console]: 1: Removed 12428 rows containing non-finite values (`stat_density()`). 

R[write to console]: 2: Removed 1 rows containing missing values (`geom_vline()`). 

R[write to console]: 3: Removed 719 rows containing non-finite values (`stat_density()`). 

Code
%%R
# filter df with sat = 'Wind'
p1 <- df %>%
  filter(sat == 'Wind') %>%
    plot_ts_effect(color = "ts_method")

filename <- "ts_method/wind"
save_plot(filename)

p2 <- df %>%
  filter(sat == 'JNO') %>%
    plot_ts_effect(color = "ts_method")
    
filename <- "ts_method/juno"
save_plot(filename)

p1 | p2
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
R[write to console]: In addition: 
R[write to console]: Warning messages:

R[write to console]: 1: Removed 12233 rows containing non-finite values (`stat_density()`). 

R[write to console]: 2: Removed 658 rows containing non-finite values (`stat_density()`). 

R[write to console]: 3: Removed 195 rows containing non-finite values (`stat_density()`). 

R[write to console]: 4: Removed 61 rows containing non-finite values (`stat_density()`). 
In addition: Warning messages:
1: Removed 12233 rows containing non-finite values (`stat_density()`). 
2: Removed 658 rows containing non-finite values (`stat_density()`). 
3: Removed 12233 rows containing non-finite values (`stat_density()`). 
4: Removed 658 rows containing non-finite values (`stat_density()`). 
5: Removed 195 rows containing non-finite values (`stat_density()`). 
6: Removed 61 rows containing non-finite values (`stat_density()`). 
7: Removed 195 rows containing non-finite values (`stat_density()`). 
8: Removed 61 rows containing non-finite values (`stat_density()`). 

Code
%%R
p <- df %>%
  filter(method == 'fit', sat == 'Wind') %>%
    plot_ts_effect(color = 'ts', L_mn_norm_lim = c(0,300), j_norm_lim = c(0,0.6), legend_title = "time resolution")

p <- p + plot_annotation(title = "Wind - Fit")

filename <- "method/time_res_effect"
save_plot(filename)
p
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
R[write to console]: In addition: 
R[write to console]: Warning messages:

R[write to console]: 1: Removed 4927 rows containing non-finite values (`stat_density()`). 

R[write to console]: 2: Removed 3013 rows containing non-finite values (`stat_density()`). 
In addition: Warning messages:
1: Removed 4927 rows containing non-finite values (`stat_density()`). 
2: Removed 3013 rows containing non-finite values (`stat_density()`). 
3: Removed 4927 rows containing non-finite values (`stat_density()`). 
4: Removed 3013 rows containing non-finite values (`stat_density()`). 

Code
%%R
p1 <- df %>%
  filter(method == 'derivative') %>%
    plot_ts_effect
 
filename <- "ts_method/derivative"
save_plot(filename)
p1

p2 <- df %>%
  filter(method == 'fit') %>%
    plot_ts_effect(L_mn_norm_lim = c(0,300), j_norm_lim = c(0,0.6))

filename <- "ts_method/fit"
save_plot(filename)
p2

p1 | p2
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
Saving 6.67 x 6.67 in image
R[write to console]: In addition: 
R[write to console]: Warning messages:

R[write to console]: 1: Removed 334 rows containing non-finite values (`stat_density()`). 

R[write to console]: 2: Removed 502 rows containing non-finite values (`stat_density()`). 

R[write to console]: 3: Removed 335 rows containing non-finite values (`stat_density()`). 

R[write to console]: 4: Removed 313 rows containing non-finite values (`stat_density()`). 
In addition: Warning messages:
1: Removed 334 rows containing non-finite values (`stat_density()`). 
2: Removed 502 rows containing non-finite values (`stat_density()`). 
3: Removed 334 rows containing non-finite values (`stat_density()`). 
4: Removed 502 rows containing non-finite values (`stat_density()`). 
5: Removed 335 rows containing non-finite values (`stat_density()`). 
6: Removed 313 rows containing non-finite values (`stat_density()`). 
7: Removed 335 rows containing non-finite values (`stat_density()`). 
8: Removed 313 rows containing non-finite values (`stat_density()`).