SpaceTools.jl

Space data analysis toolbox in Julia

package
Julia
Published

March 7, 2025

Think of SPEDAS in Julia.

Installation

using Pkg
Pkg.add("https://github.com/SciQLop/Speasy.jl")
Pkg.add("https://github.com/Beforerr/SpaceTools.jl")
Pkg.add("https://github.com/Beforerr/Beforerr.jl")

Getting data is easy

Code
using Speasy
using Dates
using DimensionalData
using CairoMakie
using SpaceTools
using Unitful
using LinearAlgebra

const spz = speasy
    CondaPkg Found dependencies: /Users/zijin/.julia/packages/DimensionalData/M9vEC/CondaPkg.toml
    CondaPkg Found dependencies: /Users/zijin/.julia/dev/Speasy/CondaPkg.toml
    CondaPkg Found dependencies: /Users/zijin/.julia/packages/PythonCall/WMWY0/CondaPkg.toml
    CondaPkg Found dependencies: /Users/zijin/.julia/dev/PySPEDAS/CondaPkg.toml
    CondaPkg Initialising pixi
             /Users/zijin/.julia/artifacts/d2fecc2a9fa3eac2108d3e4d9d155e6ff5dfd0b2/bin/pixi
             init
             --format pixi
             /Users/zijin/projects/beforerr/.CondaPkg
✔ Created /Users/zijin/projects/beforerr/.CondaPkg/pixi.toml
    CondaPkg Wrote /Users/zijin/projects/beforerr/.CondaPkg/pixi.toml
             [dependencies]
             netcdf4 = "*"
             openssl = ">=3, <3.1"
             uv = ">=0.4"
             xarray = "*"
             sqlite = "!=3.49.1"
             numpy = "*"
                              [dependencies.python]
                 channel = "conda-forge"
                 build = "*cpython*"
                 version = ">=3.8,<4"
                          [project]
             name = ".CondaPkg"
             platforms = ["osx-arm64"]
             channels = ["conda-forge"]
             channel-priority = "strict"
             description = "automatically generated by CondaPkg.jl"
                          [pypi-dependencies.speasy]
             git = "https://github.com/SciQLop/speasy"
                          [pypi-dependencies.pyspedas]
             git = "https://github.com/spedas/pyspedas"
    CondaPkg Installing packages
             /Users/zijin/.julia/artifacts/d2fecc2a9fa3eac2108d3e4d9d155e6ff5dfd0b2/bin/pixi
             install
             --manifest-path /Users/zijin/projects/beforerr/.CondaPkg/pixi.toml
 WARN Using local manifest /Users/zijin/projects/beforerr/.CondaPkg/pixi.toml rather than /Users/zijin/projects/beforerr/pyproject.toml from environment variable `PIXI_PROJECT_MANIFEST`
✔ The default environment has been installed.
Precompiling SpaceTools...
   7303.1 msSpaceTools
  1 dependency successfully precompiled in 9 seconds. 368 already precompiled.
Precompiling SpaceToolsSpeasyExt...
   7963.9 msSpaceTools → SpaceToolsSpeasyExt
  1 dependency successfully precompiled in 9 seconds. 383 already precompiled.
Python: <module 'speasy' from '/Users/zijin/projects/beforerr/.CondaPkg/.pixi/envs/default/lib/python3.11/site-packages/speasy/__init__.py'>
Code
t0 = DateTime("2008-09-05T10:00:00")
t1 = DateTime("2008-09-05T22:00:00")
da = get_data("cda/THA_L2_FGM/tha_fgs_gse", t0, t1) |> DimArray
[ Info: Cannot parse unit nT*GSE*(All*Qs)
14386×3 DimArray{Float32, 2} tha_fgs_gse
├──────────────────────────────────────────┴───────────────────────────── dims ┐
  Ti          Sampled{NanoDates.NanoDate} [2008-09-05T10:00:02.232367872, …, 2008-09-05T21:59:59.539785472] ForwardOrdered Irregular Points,
  tha_fgs_gse Categorical{Symbol} [Symbol("Bx FGS-D"), Symbol("By FGS-D"), Symbol("Bz FGS-D")] ForwardOrdered
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{Any, Any} with 27 entries:
  "FILLVAL"          => Any[-1.0e30]
  "DEPEND_EPOCH0"    => "tha_fgs_epoch0"
  "FIELDNAM"         => "BXYZ GSE Coordinate, nT units"
  "VALIDMAX"         => Any[25000.0, 25000.0, 25000.0]
  "SI_CONVERSION"    => "1e-9>T"
  "axes"             => VariableAxis[Speasy.VariableAxis(…
  "TENSOR_ORDER"     => "1"
  "SC_ID"            => "d"
  "DEPEND_1"         => "tha_fgs_compno"
  "SCALE_TYP"        => "linear"
  "CALIB_SOFTWARE"   => "fgm_calibrate, thm_cal_fgm, thm_cal_fit"
  "CATDESC"          => "---- FGS (spin-resolution/~3 sec) magnetic  field B in…
  "DEPEND_TIME"      => "tha_fgs_time"
  "AVG_TYPE"         => "standard"
  "DISPLAY_TYPE"     => "time_series"
  "VALIDMIN"         => Any[-25000.0, -25000.0, -25000.0]
  "UNITS"            => "nT GSE (All Qs)"
  "VAR_NOTES"        => "Units are in nanotesla"
  "REPRESENTATION_1" => "Rep_xyz"
  ⋮                  => ⋮
└──────────────────────────────────────────────────────────────────────────────┘
 ⋮      ⋱  

Array-like Time Series

No more need to get_data and store_data. All the functions and operations that work on abstract array would also work on the time series.

Code
@info da_max = maximum(da)
@info norm.(eachrow(da))[1:10]
[ Info: 65.44341
[ Info: Float32[75.64841, 75.72411, 75.5134, 75.73276, 75.492195, 75.506645, 75.522064, 75.78088, 75.69144, 75.647]
Code
using Test
da1 = get_data("cda/OMNI_HRO_1MIN/flow_speed", t0, t1) |> DimArray
da2 = get_data("cda/OMNI_HRO_1MIN/BX_GSE", t0, t1) |> DimArray

try
    da1 + da2
catch e
    println(e)
    @test e isa Unitful.DimensionError
end

# in different times so add would not work

tspan3 = DateTime("2008-09-05T10:00:00"), DateTime("2008-09-05T11:00:00")
tspan4 = DateTime("2008-09-05T11:00:00"), DateTime("2008-09-05T12:00:00")
da3 = get_data("cda/OMNI_HRO_1MIN/flow_speed", tspan3...) |> DimArray
da4 = get_data("cda/OMNI_HRO_1MIN/flow_speed", tspan4...) |> DimArray
try
    da3 + da4
catch e
    println(e)
end
Unitful.DimensionError(522.0f0 km s⁻¹, 4.01f0 nT)
DimensionMismatch("Lookup values for DimensionalData.Dimensions.Ti of NanoDates.NanoDate[2008-09-05T10:00:00, 2008-09-05T10:01:00, 2008-09-05T10:02:00, 2008-09-05T10:03:00, 2008-09-05T10:04:00, 2008-09-05T10:05:00, 2008-09-05T10:06:00, 2008-09-05T10:07:00, 2008-09-05T10:08:00, 2008-09-05T10:09:00, 2008-09-05T10:10:00, 2008-09-05T10:11:00, 2008-09-05T10:12:00, 2008-09-05T10:13:00, 2008-09-05T10:14:00, 2008-09-05T10:15:00, 2008-09-05T10:16:00, 2008-09-05T10:17:00, 2008-09-05T10:18:00, 2008-09-05T10:19:00, 2008-09-05T10:20:00, 2008-09-05T10:21:00, 2008-09-05T10:22:00, 2008-09-05T10:23:00, 2008-09-05T10:24:00, 2008-09-05T10:25:00, 2008-09-05T10:26:00, 2008-09-05T10:27:00, 2008-09-05T10:28:00, 2008-09-05T10:29:00, 2008-09-05T10:30:00, 2008-09-05T10:31:00, 2008-09-05T10:32:00, 2008-09-05T10:33:00, 2008-09-05T10:34:00, 2008-09-05T10:35:00, 2008-09-05T10:36:00, 2008-09-05T10:37:00, 2008-09-05T10:38:00, 2008-09-05T10:39:00, 2008-09-05T10:40:00, 2008-09-05T10:41:00, 2008-09-05T10:42:00, 2008-09-05T10:43:00, 2008-09-05T10:44:00, 2008-09-05T10:45:00, 2008-09-05T10:46:00, 2008-09-05T10:47:00, 2008-09-05T10:48:00, 2008-09-05T10:49:00, 2008-09-05T10:50:00, 2008-09-05T10:51:00, 2008-09-05T10:52:00, 2008-09-05T10:53:00, 2008-09-05T10:54:00, 2008-09-05T10:55:00, 2008-09-05T10:56:00, 2008-09-05T10:57:00, 2008-09-05T10:58:00, 2008-09-05T10:59:00] and NanoDates.NanoDate[2008-09-05T11:00:00, 2008-09-05T11:01:00, 2008-09-05T11:02:00, 2008-09-05T11:03:00, 2008-09-05T11:04:00, 2008-09-05T11:05:00, 2008-09-05T11:06:00, 2008-09-05T11:07:00, 2008-09-05T11:08:00, 2008-09-05T11:09:00, 2008-09-05T11:10:00, 2008-09-05T11:11:00, 2008-09-05T11:12:00, 2008-09-05T11:13:00, 2008-09-05T11:14:00, 2008-09-05T11:15:00, 2008-09-05T11:16:00, 2008-09-05T11:17:00, 2008-09-05T11:18:00, 2008-09-05T11:19:00, 2008-09-05T11:20:00, 2008-09-05T11:21:00, 2008-09-05T11:22:00, 2008-09-05T11:23:00, 2008-09-05T11:24:00, 2008-09-05T11:25:00, 2008-09-05T11:26:00, 2008-09-05T11:27:00, 2008-09-05T11:28:00, 2008-09-05T11:29:00, 2008-09-05T11:30:00, 2008-09-05T11:31:00, 2008-09-05T11:32:00, 2008-09-05T11:33:00, 2008-09-05T11:34:00, 2008-09-05T11:35:00, 2008-09-05T11:36:00, 2008-09-05T11:37:00, 2008-09-05T11:38:00, 2008-09-05T11:39:00, 2008-09-05T11:40:00, 2008-09-05T11:41:00, 2008-09-05T11:42:00, 2008-09-05T11:43:00, 2008-09-05T11:44:00, 2008-09-05T11:45:00, 2008-09-05T11:46:00, 2008-09-05T11:47:00, 2008-09-05T11:48:00, 2008-09-05T11:49:00, 2008-09-05T11:50:00, 2008-09-05T11:51:00, 2008-09-05T11:52:00, 2008-09-05T11:53:00, 2008-09-05T11:54:00, 2008-09-05T11:55:00, 2008-09-05T11:56:00, 2008-09-05T11:57:00, 2008-09-05T11:58:00, 2008-09-05T11:59:00] do not match.")

Unit-aware Time Series

Unit is not just a plot label, but also has dimensions meaning. (Unitful.jl integration)

Code
@info da_dim = dimension(da.data[1])
[ Info: NoDims

Tplot

Code
f = tplot(da)
[ Info: Resampling array of size (14386, 3) along dimension 1 from 14386 to 10000 points

Code
figure = (; size=(1200, 800))

tvars = [
    "cda/THA_L2_FGM/tha_fgs_gsmQ",
    "cda/OMNI_HRO_1MIN/flow_speed",
    "cda/OMNI_HRO_1MIN/E",
    "cda/OMNI_HRO_1MIN/Pressure",
]

tplot(tvars, t0, t1; figure)
Can't get THA_L2_FGM/tha_fgs_gsmQ without web service, switching to web service
[ Info: Cannot parse unit nT*GSM
Can't get THA_L2_FGM/tha_fgs_gsmQ without web service, switching to web service
[ Info: Cannot parse unit nT*GSM
[ Info: Resampling array of size (14386, 3) along dimension 1 from 14386 to 10000 points

Spectrogram

https://cdaweb.gsfc.nasa.gov/misc/NotesM.html#MMS1_FPI_FAST_L2_DES-DIST

Code
ts0 = "2019-01-02T15" |> DateTime
ts1 = "2019-01-02T22" |> DateTime

tvars = [
    "cda/MMS1_FGM_SRVY_L2/mms1_fgm_b_gse_srvy_l2_clean",
    "cda/MMS1_FPI_FAST_L2_DES-MOMS/mms1_des_energyspectr_omni_fast",
]
tplot(tvars, ts0, ts1; add_title=true)
Can't get MMS1_FGM_SRVY_L2/mms1_fgm_b_gse_srvy_l2_clean without web service, switching to web service
Can't get MMS1_FGM_SRVY_L2/mms1_fgm_b_gse_srvy_l2_clean without web service, switching to web service
[ Info: Resampling array of size (403194, 4) along dimension 1 from 403194 to 10000 points

Overlay plots

Code
SDS = SpaceTools.DataSet
ts0 = DateTime("2021-01-17")
ts1 = DateTime("2021-01-18")

density_tvar = SDS(
    "Density",
    [
        "cda/PSP_SWP_SPI_SF00_L3_MOM/DENS",
        "cda/PSP_FLD_L3_RFS_LFR_QTN/N_elec",
        "cda/PSP_FLD_L3_SQTN_RFS_V1V2/electron_density"
    ]
)

tvars = [density_tvar]

tplot(tvars, ts0, ts1; add_title=true)
[ Info: Resampling array of size (24704, 1) along dimension 1 from 24704 to 10000 points
[ Info: Resampling array of size (24272, 1) along dimension 1 from 24272 to 10000 points
[ Info: Resampling array of size (24282, 1) along dimension 1 from 24282 to 10000 points

Complex requests and flexible layout

Code
data = let intervals = ["2019-01-02T15", "2019-01-02T16"]
    products = [
        spz.inventories.tree.cda.MMS.MMS1.FGM.MMS1_FGM_SRVY_L2.mms1_fgm_b_gse_srvy_l2_clean,
        spz.inventories.tree.cda.MMS.MMS1.SCM.MMS1_SCM_SRVY_L2_SCSRVY.mms1_scm_acb_gse_scsrvy_srvy_l2,
        spz.inventories.tree.cda.MMS.MMS1.DES.MMS1_FPI_FAST_L2_DES_MOMS.mms1_des_bulkv_gse_fast,
        spz.inventories.tree.cda.MMS.MMS1.DES.MMS1_FPI_FAST_L2_DES_MOMS.mms1_des_temppara_fast,
        spz.inventories.tree.cda.MMS.MMS1.DES.MMS1_FPI_FAST_L2_DES_MOMS.mms1_des_tempperp_fast,
        spz.inventories.tree.cda.MMS.MMS1.DES.MMS1_FPI_FAST_L2_DES_MOMS.mms1_des_energyspectr_omni_fast,
        spz.inventories.tree.cda.MMS.MMS1.DIS.MMS1_FPI_FAST_L2_DIS_MOMS.mms1_dis_energyspectr_omni_fast
    ]
    get_data(products, intervals)
end

let figure = (; size=(1200, 1200)), add_title = true
    f = Figure(; figure...)
    tplot(f[1, 1], data[1:3]; add_title)
    tplot(f[1, 2], [data[4:5], data[6:7]...]; add_title)
    f
end
Can't get MMS1_FGM_SRVY_L2/mms1_fgm_b_gse_srvy_l2_clean without web service, switching to web service
[ Info: Resampling array of size (57599, 4) along dimension 1 from 57599 to 10000 points
[ Info: Resampling array of size (115198, 3) along dimension 1 from 115198 to 10000 points

Disk-cached Time Series

WebServer (CDAWeb) -> Files -> Data structure (tplot variable / numpy array)

However, loading cdf files is moderately slow, especially for large data. We can improve the performance by disk-caching the data structure (numpy array).

WebServer (CDAWeb) -> Files -> Disk-cached data structure -> Data structure

Interactive Time Series

Use GLMakie to zoom in and Zoom out to explore data.

Code
tplot(tvars, t0, t1; figure)

Multiple dispatch tplot

We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane person should. C is our desert island programming language. We love all of these languages; they are wonderful and powerful. For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others. Each one is a trade-off. We are greedy: we want more. We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

We want tplot to understand a time series represented as a vector-like / matrix-like (series/heatmap/spectrogram) / Function / String; We want tplot to plot multiple time series on the same figure; We want tplot to be interactive as well as publication-ready; and we want tplot to be composable.

Code
figure = (; size=(1600, 1200))
f = Figure(; figure...)

tvars1 = ["cda/OMNI_HRO_1MIN/flow_speed", "cda/OMNI_HRO_1MIN/E", "cda/OMNI_HRO_1MIN/Pressure"]
tvars2 = ["cda/THA_L2_FGM/tha_fgs_gse"]
tvars3 = ["cda/OMNI_HRO_1MIN/BX_GSE", "cda/OMNI_HRO_1MIN/BY_GSE"]

t0 = DateTime("2008-09-05T10:00:00")
t1 = DateTime("2008-09-05T22:00:00")

tplot(f[1, 1], tvars1, t0, t1)
tplot(f[1, 2], tvars2, t0, t1)
tplot(f[2, 1:2], tvars3, t0, t1)
f
[ Info: Cannot parse unit nT*GSE*(All*Qs)
[ Info: Cannot parse unit nT*GSE*(All*Qs)
[ Info: Resampling array of size (14386, 3) along dimension 1 from 14386 to 10000 points