finds.recipes.filters

Filtering functions

Copyright 2022, Terence Lim

MIT License

finds.recipes.filters.fft_align(X: ndarray) → Tuple

Find best alignment and cross-correlation of all pairs of columns

Parameters:

X – array with time series in columns

Returns:

Max cross-correlations, best lags, indices of all pairs of columns

Notes:

  • Apply convolution theorem to compute cross-correlations at all lags

  • For each pair of series, the lag with the largest correlation is taken to be the displacement that best aligns the two series

Examples:

>>> fft_align(np.hstack((X[:-1], X[1:])))
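Assuming the returned tuple unpacks into maximum correlations, best lags, and column-pair indices (per the Returns description above), a usage sketch:

>>> import numpy as np
>>> corrs, lags, pairs = fft_align(np.hstack((X[:-1], X[1:])))   # assumed 3-way unpacking
>>> lags[0]   # displacement that best aligns the first pair of columns
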
finds.recipes.filters.fft_correlation(X: ndarray, Y: ndarray | None) → Series

Compute cross-correlations of two series using the Convolution Theorem

Parameters:
  • X – series of observations

  • Y – series of observations

Returns:

Series of cross-correlation values at every displacement lag

Notes:

  • Cross-correlation[n] = sum_m^N f[m] g[n + m]

  • Convolution (f * g)[n] = sum_m^N f[m] g[n - m], so cross-correlation equals a convolution with one of the series time-reversed

Examples:

>>> statsmodels.tsa.stattools.acf(X)
>>> fft_correlation(X, X)
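The underlying convolution-theorem calculation can be sketched directly with numpy's FFT routines; this is a minimal illustration of the idea, not the library's implementation, and the names x, y, and n are assumptions for the sketch:

>>> import numpy as np
>>> x, y = np.random.randn(100), np.random.randn(100)
>>> n = len(x) + len(y) - 1                     # pad so every displacement lag is covered
>>> f, g = np.fft.fft(x, n), np.fft.fft(y, n)   # transform both series
>>> xcorr = np.fft.ifft(np.conj(f) * g).real    # cross-correlations at all lags
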
finds.recipes.filters.fft_neweywest(X: ndarray) → List

Compute Newey-West weighted cross-correlation of all pairs of columns

Parameters:

X – array with series in columns

Returns:

List of Newey-West weighted cross-correlations

Notes:

  • First apply the convolution theorem to compute all cross-autocorrelations

  • Then, for each pair of series, compute the Newey-West weighted correlation, as sketched below
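A minimal sketch of the second step, assuming rho holds the cross-correlations at lags 0 through L (both names hypothetical here) and using Bartlett-style Newey-West weights that decline linearly with lag:

>>> import numpy as np
>>> L = 12                                       # hypothetical maximum lag
>>> w = 1 - np.arange(1, L + 1) / (L + 1)        # Newey-West (Bartlett) weights
>>> nw = rho[0] + 2 * np.sum(w * rho[1:L + 1])   # weighted sum of correlations over lags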

finds.recipes.filters.fractile_split(values: Iterable, pct: Iterable, keys: Iterable | None = None, ascending: bool = False) → List[int]

Sort and assign values into fractiles

Parameters:
  • values – input array to assign to fractiles

  • pct – list of percentiles between 0 and 100

  • keys – key values used to determine breakpoints; uses values if None

  • ascending – if True, assign to fractiles in ascending order

Returns:

list of fractile assignments (starting at 1 with smallest values)
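
For example, splitting into terciles at hypothetical breakpoints of the 30th and 70th percentiles (label 1 holds the smallest values):

>>> import numpy as np
>>> values = np.random.randn(1000)
>>> labels = fractile_split(values, pct=[30, 70, 100])   # fractile labels starting at 1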

finds.recipes.filters.is_outlier(x: Any, method: str = 'iq10', fences: bool = False) → array

Test if elements of x are column-wise outliers

Parameters:
  • x – Input array to test element-wise

  • method – method to filter, in {‘iq{D}’, ‘tukey’, ‘farout’}

  • fences – If True, return (low, high) values of fence

Returns:

Boolean array indicating which elements are column-wise outliers, or the (low, high) fence values if fences is True

Notes:

  • ‘iq{D}’ - screen by interquartile range: [median +/- D times (Q3-Q1)]

  • ‘tukey’ - [Q1 - 1.5(Q3-Q1), Q3 + 1.5(Q3-Q1)]

  • ‘farout’ - Tukey fences with 3 times (Q3-Q1)
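
For instance, flagging outliers with Tukey's fences, or retrieving the fence values themselves:

>>> import numpy as np
>>> x = np.random.randn(100, 3)
>>> mask = is_outlier(x, method='tukey')                 # True where element is an outlier
>>> fence = is_outlier(x, method='tukey', fences=True)   # (low, high) fence values instead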

finds.recipes.filters.remove_outliers(X: DataFrame, method: str = 'iq10', verbose: bool = False) → DataFrame

Set column-wise outliers to np.nan

Parameters:
  • X – Input array to test element-wise

  • method – method to filter outliers, in {‘iq{D}’, ‘tukey’, ‘farout’}

Returns:

DataFrame with outliers set to NaN

Notes:

  • ‘iq{D}’ - within [median +/- D times (Q3-Q1)]

  • ‘tukey’ - within [Q1 - 1.5(Q3-Q1), Q3 + 1.5(Q3-Q1)]

  • ‘farout’ - within [Q1 - 3(Q3-Q1), Q3 + 3(Q3-Q1)]
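
For example, blanking out values more than ten interquartile ranges from each column's median (the default ‘iq10’ screen):

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(100, 3), columns=['a', 'b', 'c'])
>>> clean = remove_outliers(df, method='iq10')   # outliers replaced with np.nan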

finds.recipes.filters.weighted_average(df: DataFrame, weights: str = '') → Series

Weighted means of data frame

Parameters:
  • df – DataFrame containing values, and optional weights, in columns

  • weights – Column name to use as weights

Returns:

Series of weighted means

Notes:

  • Ignores NaNs using numpy.ma
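
For example, with a hypothetical weights column named 'wt':

>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1.0, 2.0, 3.0], 'y': [4.0, None, 6.0], 'wt': [1.0, 1.0, 2.0]})
>>> weighted_average(df, weights='wt')   # Series of weighted column means, ignoring NaNs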

finds.recipes.filters.winsorize(df, quantiles=[0.025, 0.975])

Winsorize DataFrame by column quantiles (default=[0.025, 0.975])

Parameters:
  • df – Input DataFrame

  • quantiles – low and high quantiles beyond which values are clipped
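
For example, clipping each column at its 2.5% and 97.5% quantiles (the defaults):

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(1000, 2), columns=['x', 'y'])
>>> clipped = winsorize(df, quantiles=[0.025, 0.975])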