Remove spikes from signal
The example data is a sine wave with random spikes.
using Random
"""
    create_sample_data(; length=1000)

Create a sine wave with `length` samples and add random positive and negative spikes.
Returns a tuple `(y_spikey, n_spikes)`: the spiky signal and the number of spikes added.
"""
function create_sample_data(; length=1000)
# Create x values and compute sine wave y values
x = range(0, stop=2π, length=length)
y = 2 .* sin.(x)
rands = rand(Xoshiro(1), length)
# random values above this trigger a spike:
RAND_HIGH = 0.98
# random values below this trigger a negative spike:
RAND_LOW = 0.02
# amplitude of the spikes:
spike_amplitudes = 0.1 .+ 10rand(Xoshiro(2), length)
# Create random spikes based on threshold conditions
spike_high = ifelse.(rands .> RAND_HIGH, 1, 0) .* spike_amplitudes
spike_low = ifelse.(rands .< RAND_LOW, -1, 0) .* spike_amplitudes
n_spikes = sum(spike_high .!= 0) + sum(spike_low .!= 0)
y .+ spike_high .+ spike_low, n_spikes
end
y_spikey, n_spikes = create_sample_data()
([0.0, 0.012578866632135501, 0.025157235677482116, 0.03773460956893418, 0.050310490778751694, 0.0628843818382412, 0.07545578535743436, 0.08802420404476333, 0.10058914072673236, 0.1131500983675847 … -0.11315009836758545, -0.1005891407267323, -0.08802420404476421, -0.07545578535743443, -0.06288438183824223, -0.0503104907787519, -0.03773460956893535, -0.025157235677482466, -0.01257886663213681, -4.898587196589413e-16], 36)
By default, replace_outliers detects spikes with a threshold test based on the median absolute deviation (MAD) and replaces them with NaN. It is also possible to use a filter-based approach (i.e. low-pass filtering).
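To make the MAD-based idea concrete, here is a minimal sketch of threshold detection in plain Julia. It is not the TimeseriesUtilities implementation: the helper name `mad_outliers`, the keyword `k`, and the cutoff of three scaled MADs (borrowed from the default of MATLAB's isoutlier) are assumptions made for illustration.

```julia
using Statistics

# Hypothetical helper, not part of TimeseriesUtilities.
# Flags points that lie more than `k` scaled MADs from the median.
function mad_outliers(y::AbstractVector{<:Real}; k=3.0)
    med = median(y)
    # The factor 1.4826 scales the MAD so that it estimates the
    # standard deviation for normally distributed data.
    scaled_mad = 1.4826 * median(abs.(y .- med))
    return abs.(y .- med) .> k * scaled_mad
end

y = [1.0, 1.1, 0.9, 1.0, 12.0, 1.05, 0.95]
mask = mad_outliers(y)   # Boolean mask, true where a spike is detected
```

Because the median and the MAD are barely affected by a few extreme values, the spike itself does not inflate the threshold, which is the main advantage over a mean and standard deviation test.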
using TimeseriesUtilities
using CairoMakie
using Test
y_remove_outliers = replace_outliers(y_spikey)
n_removed = sum(isnan.(y_remove_outliers))
@test n_removed == n_spikes
begin
f = Figure()
lines(f[1,1],y_spikey)
lines(f[2,1],y_remove_outliers)
f
end
TimeseriesUtilities.find_outliers — Function
find_outliers(A, [method, window]; dim = 1, kw...)
Find outliers in data A along the specified dim dimension.
Returns a Boolean array whose elements are true when an outlier is detected in the corresponding element of A.
The default method is :median (other option is :mean), which uses the median absolute deviation (MAD) to detect outliers. When the length of A is greater than 256, it uses a moving window of size 16.
See also: find_outliers_median, find_outliers_mean, isoutlier - MATLAB
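The moving-window variant can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the helper `moving_mad_outliers`, the centered window, the truncation at the array edges, and the cutoff `k = 3` scaled MADs are all hypothetical choices.

```julia
using Statistics

# Hypothetical helper, not part of TimeseriesUtilities.
# Compares each sample with the median and scaled MAD of a centered window.
function moving_mad_outliers(y::AbstractVector{<:Real}, window::Integer; k=3.0)
    n = length(y)
    mask = falses(n)
    half = window ÷ 2
    for i in 1:n
        # Truncate the window at the ends of the signal
        lo = max(1, i - half)
        hi = min(n, i + half)
        seg = @view y[lo:hi]
        med = median(seg)
        scaled_mad = 1.4826 * median(abs.(seg .- med))
        mask[i] = abs(y[i] - med) > k * scaled_mad
    end
    return mask
end

y = sin.(range(0, 2π, length=500))
y[100] += 5                      # inject one artificial spike
mask = moving_mad_outliers(y, 16)
```

A local window lets the threshold follow the slowly varying sine wave, so a spike is judged against its neighbors rather than against the full amplitude range of the signal.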
TimeseriesUtilities.replace_outliers! — Function
replace_outliers!(A, s, [find_method, window]; kwargs...)
Finds outliers in A and replaces them with s (by default: NaN).
See also: find_outliers, filloutliers - MATLAB
replace_outliers!(A, method, [find_method, window]; kwargs...)
replace_outliers!(A, method, outliers; kwargs...)
Replaces outliers in A with values determined by the specified method.
Outliers can be detected using find_outliers with optional find_method and window parameters or specified directly as a Boolean array outliers.
method can be one of the following:
:linear: Linear interpolation of neighboring, nonoutlier values
:previous: Previous nonoutlier value
:next: Next nonoutlier value
:nearest: Nearest nonoutlier value
See also: filloutliers - MATLAB
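As an illustration of what the :linear option describes, the sketch below fills flagged samples by interpolating between the nearest nonoutlier neighbors. The helper `fill_linear` is hypothetical and not part of the package; its handling of outliers at the ends of the array (copying the nearest good value) is also an assumption.

```julia
# Hypothetical helper, not part of TimeseriesUtilities.
# Replaces flagged entries by linear interpolation between good neighbors.
function fill_linear(y::AbstractVector{<:Real}, outliers::AbstractVector{Bool})
    good = findall(!, outliers)          # sorted indices of nonoutlier values
    isempty(good) && throw(ArgumentError("all values are flagged as outliers"))
    out = float.(collect(y))
    for i in findall(outliers)
        p = searchsortedlast(good, i)    # position of the previous good index
        if p == 0
            out[i] = out[good[1]]        # no earlier good value: copy the next one
        elseif p == length(good)
            out[i] = out[good[end]]      # no later good value: copy the previous one
        else
            a, b = good[p], good[p + 1]
            w = (i - a) / (b - a)        # fractional position between neighbors
            out[i] = (1 - w) * out[a] + w * out[b]
        end
    end
    return out
end

y = [1.0, 2.0, 50.0, 4.0, 5.0]
mask = [false, false, true, false, false]
filled = fill_linear(y, mask)
```

Passing the mask explicitly mirrors the documented form in which outliers are specified directly as a Boolean array, so detection and replacement can be tuned separately.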
TimeseriesUtilities.replace_outliers — Function
replace_outliers(A, args...; kw...)
Non-mutating version of replace_outliers!.