Remove spikes from signal
The example data is a sine wave with random spikes.
using Random
"""
    create_sample_data(; length=1000)

Create a sine wave and add random positive and negative spikes.

Returns a tuple `(y_spikey, n_spikes)`: the spiky signal and the number of spikes added.
"""
function create_sample_data(; length=1000)
    # Create x values and compute sine wave y values
    x = range(0, stop=2π, length=length)
    y = 2 .* sin.(x)
    rands = rand(Xoshiro(1), length)
    # random values above this trigger a positive spike:
    RAND_HIGH = 0.98
    # random values below this trigger a negative spike:
    RAND_LOW = 0.02
    # amplitude of the spikes:
    spike_amplitudes = 0.1 .+ 10rand(Xoshiro(2), length)
    # Create random spikes based on threshold conditions
    spike_high = ifelse.(rands .> RAND_HIGH, 1, 0) .* spike_amplitudes
    spike_low = ifelse.(rands .< RAND_LOW, -1, 0) .* spike_amplitudes
    n_spikes = sum(spike_high .!= 0) + sum(spike_low .!= 0)
    return y .+ spike_high .+ spike_low, n_spikes
end
y_spikey, n_spikes = create_sample_data()
([0.0, 0.012578866632135501, 0.025157235677482116, 0.03773460956893418, 0.050310490778751694, 0.0628843818382412, 0.07545578535743436, 0.08802420404476333, 0.10058914072673236, 0.1131500983675847 … -0.11315009836758545, -0.1005891407267323, -0.08802420404476421, -0.07545578535743443, -0.06288438183824223, -0.0503104907787519, -0.03773460956893535, -0.025157235677482466, -0.01257886663213681, -4.898587196589413e-16], 36)
By default, replace_outliers uses a threshold-detection approach based on the median absolute deviation (MAD) to detect spikes. It is also possible to use a filter-based approach (i.e. low-pass filtering).
using SPEDAS
using CairoMakie
using Test
y_remove_outliers = replace_outliers(y_spikey, detector=find_spikes)
n_removed = sum(isnan.(y_remove_outliers))
@test n_removed == n_spikes
begin
    f = Figure()
    # top panel: signal with spikes; bottom panel: spikes replaced by NaN
    lines(f[1, 1], y_spikey)
    lines(f[2, 1], y_remove_outliers)
    f
end
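
To make the MAD-based idea concrete, here is a minimal sketch of threshold detection over a whole vector. It is not the SPEDAS implementation: the helper name `mad_spike_indices` and the 1.4826 scaling factor are assumptions made for illustration, and `find_spikes` may compute or scale its statistics differently.

```julia
using Statistics

# Hypothetical helper, not part of SPEDAS: flag samples whose distance from the
# median exceeds `threshold` times the scaled median absolute deviation (MAD).
function mad_spike_indices(y::AbstractVector{<:Real}; threshold=3.0)
    med = median(y)
    # Assumed scaling: 1.4826 makes the MAD comparable to a standard deviation
    # for normally distributed data
    mad = 1.4826 * median(abs.(y .- med))
    # A zero MAD (more than half the samples identical) gives no usable scale
    mad == 0 && return Int[]
    return findall(v -> abs(v - med) > threshold * mad, y)
end

y = [0.0, 0.1, -0.1, 0.05, 8.0, -0.05, 0.1, -7.5, 0.0, 0.02]
idx = mad_spike_indices(y)   # flags the two large excursions at positions 5 and 8

# Mirror the default behaviour described above: replace flagged samples with NaN
y_clean = float.(y)
y_clean[idx] .= NaN
```

Because the median and the MAD are themselves barely affected by a few extreme values, the threshold stays meaningful even when the spikes are large.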

SPEDAS.find_spikes — Function
find_spikes(data; threshold=3.0, window=0)
Identifies indices in data that are considered spikes.
For multidimensional arrays, the function can be applied along a specific dimension using the dims parameter.
Arguments
- threshold: Threshold multiplier for MAD to identify spikes (default: 3.0)
- window: Size of the moving window for local statistics (default: 16)
- dims: Dimension along which to find spikes (for multidimensional arrays)
Returns
- For 1D arrays: Vector of indices where spikes were detected
- For multidimensional arrays: Dictionary mapping dimension indices to spike indices
See also: find_spikes_1d_mad
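
The role of a moving window for local statistics can be illustrated with a small sketch that compares each sample against the median and MAD of its neighbourhood. This is only an assumption about how such a window might be used, not the SPEDAS code; the helper name `local_mad_spike_indices`, the edge handling, and the default `window=16` are all chosen here for illustration.

```julia
using Statistics

# Hypothetical helper, not part of SPEDAS: compare each sample with the median
# and MAD of a window centred on it instead of the whole signal.
function local_mad_spike_indices(y::AbstractVector{<:Real}; threshold=3.0, window=16)
    half = window ÷ 2
    spikes = Int[]
    for i in eachindex(y)
        # Clamp the window at the edges of the signal
        lo = max(firstindex(y), i - half)
        hi = min(lastindex(y), i + half)
        segment = @view y[lo:hi]
        med = median(segment)
        mad = median(abs.(segment .- med))
        # Skip windows with zero spread, which provide no usable scale
        if mad > 0 && abs(y[i] - med) > threshold * mad
            push!(spikes, i)
        end
    end
    return spikes
end

# A slow ramp with a single spike added at index 26
y = collect(0.0:0.1:5.0)
y[26] += 3.0
spikes = local_mad_spike_indices(y; window=10)
```

On this ramp the local statistics follow the trend, so only the sample at index 26 is flagged, whereas statistics taken over the whole vector are dominated by the spread of the ramp itself.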
SPEDAS.replace_outliers — Function
replace_outliers(data; detector=find_spikes, replacement_fn=nothing, kwargs...)
Replaces outliers in data using replacement_fn.
A detector function (by default, find_spikes) is used to identify outlier indices.
A replacement_fn function can be supplied to define how to correct each spike:
- It should take (data, index) and return a replacement value;
- If not provided, the default is to replace with NaN.
For multidimensional arrays, the dims parameter specifies the dimension along which to detect and replace outliers.
See also: find_spikes
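
As an illustration of the (data, index) contract for replacement_fn, the sketch below defines a replacement function that returns the mean of the two neighbouring samples and applies it by hand to a known spike index. The name `neighbour_mean` and the hard-coded index list are hypothetical. The sketch does not call SPEDAS, so it shows only what such a function might compute, not how replace_outliers invokes it internally.

```julia
# Hypothetical replacement function with the (data, index) signature:
# return the mean of the two neighbours, or the single neighbour at either end.
# Assumes `data` has at least two elements.
function neighbour_mean(data::AbstractVector, index::Integer)
    left = index > firstindex(data) ? data[index - 1] : data[index + 1]
    right = index < lastindex(data) ? data[index + 1] : data[index - 1]
    return (left + right) / 2
end

y = [1.0, 1.0, 9.0, 2.0, 2.0]
spike_indices = [3]          # stand-in for the output of a detector

# Apply the replacement by hand to show the effect on the spike
y_fixed = copy(y)
for i in spike_indices
    y_fixed[i] = neighbour_mean(y, i)
end
```

Following the signature documented above, a function like this would presumably be passed as `replace_outliers(y; replacement_fn=neighbour_mean)`. Note that a neighbour-based replacement gives poor results when adjacent samples are also spikes.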