finds.recipes.filters

Filtering functions

Copyright 2022, Terence Lim

MIT License

finds.recipes.filters.fft_align(X: ndarray) → Tuple

Find best alignment and cross-correlation of all pairs of columns

Parameters:

X – array with time series in columns

Returns:

Max cross-correlations, best lags, indices of all pairs of columns

Notes:

  • Apply convolution theorem to compute cross-correlations at all lags

  • For each pair of series, the lag with the largest correlation is taken to be the displacement that best aligns the two series

Examples:

>>> fft_align(np.hstack((X[:-1], X[1:])))
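Assuming the returned tuple unpacks into maximum correlations, best lags, and column-pair indices (per the Returns description above), a usage sketch:

>>> import numpy as np
>>> corrs, lags, pairs = fft_align(np.hstack((X[:-1], X[1:])))   # assumed 3-way unpacking
>>> lags[0]   # displacement that best aligns the first pair of columns
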
finds.recipes.filters.fft_correlation(X: ndarray, Y: ndarray | None) → Series

Compute cross-correlations of two series using the Convolution Theorem

Parameters:
  • X – series of observations

  • Y – series of observations

Returns:

Series of cross-correlation values at every displacement lag

Notes:

  • Cross-correlation[n] = sum_m^N f[m] g[n + m]

  • Convolution (f * g)[n] = sum_m^N f[m] g[n - m], so cross-correlation equals a convolution with one of the series time-reversed

Examples:

>>> statsmodels.tsa.stattools.acf(X)
>>> fft_correlation(X, X)
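The underlying convolution-theorem calculation can be sketched directly with numpy's FFT routines; this is a minimal illustration of the idea, not the library's implementation, and the names x, y, and n are assumptions for the sketch:

>>> import numpy as np
>>> x, y = np.random.randn(100), np.random.randn(100)
>>> n = len(x) + len(y) - 1                     # pad so every displacement lag is covered
>>> f, g = np.fft.fft(x, n), np.fft.fft(y, n)   # transform both series
>>> xcorr = np.fft.ifft(np.conj(f) * g).real    # cross-correlations at all lags
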
finds.recipes.filters.fft_neweywest(X: ndarray) → List

Compute Newey-West weighted cross-correlation of all pairs of columns

Parameters:

X – array with series in columns

Returns:

List of Newey-West weighted cross-correlations

Notes:

  • First apply the convolution theorem to compute all cross-autocorrelations

  • Then, for each pair of series, compute the Newey-West weighted correlation, as sketched below
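A minimal sketch of the second step, assuming rho holds the cross-correlations at lags 0 through L (both names hypothetical here) and using Bartlett-style Newey-West weights that decline linearly with lag:

>>> import numpy as np
>>> L = 12                                       # hypothetical maximum lag
>>> w = 1 - np.arange(1, L + 1) / (L + 1)        # Newey-West (Bartlett) weights
>>> nw = rho[0] + 2 * np.sum(w * rho[1:L + 1])   # weighted sum of correlations over lags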

finds.recipes.filters.fractile_split(values: Iterable, pct: Iterable, keys: Iterable | None = None, ascending: bool = False) → List[int]

Sort and assign values into fractiles

Parameters:
  • values – input array to assign to fractiles

  • pct – list of percentiles between 0 and 100

  • keys – key values used to determine breakpoints; uses values if None

  • ascending – if True, assign to fractiles in ascending order

Returns:

list of fractile assignments (starting at 1 with smallest values)
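
For example, splitting into terciles at hypothetical breakpoints of the 30th and 70th percentiles (label 1 holds the smallest values):

>>> import numpy as np
>>> values = np.random.randn(1000)
>>> labels = fractile_split(values, pct=[30, 70, 100])   # fractile labels starting at 1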

finds.recipes.filters.is_outlier(x: Any, method: str = 'iq10', fences: bool = False) → array

Test if elements of x are column-wise outliers

Parameters:
  • x – Input array to test element-wise

  • method – method to filter, in {‘iq{D}’, ‘tukey’, ‘farout’}

  • fences – If True, return (low, high) values of fence

Returns:

Boolean array indicating which elements are column-wise outliers, or the (low, high) fence values if fences is True

Notes:

  • ‘iq{D}’ - screen by interquartile range: [median +/- D times (Q3-Q1)]

  • ‘tukey’ - [Q1 - 1.5(Q3-Q1), Q3 + 1.5(Q3-Q1)]

  • ‘farout’ - Tukey fences with 3 times (Q3-Q1)
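
For instance, flagging outliers with Tukey's fences, or retrieving the fence values themselves:

>>> import numpy as np
>>> x = np.random.randn(100, 3)
>>> mask = is_outlier(x, method='tukey')                 # True where element is an outlier
>>> fence = is_outlier(x, method='tukey', fences=True)   # (low, high) fence values instead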

finds.recipes.filters.remove_outliers(X: DataFrame, method: str = 'iq10', verbose: bool = False) → DataFrame

Set column-wise outliers to np.nan

Parameters:
  • X – Input array to test element-wise

  • method – method to filter outliers, in {‘iq{D}’, ‘tukey’, ‘farout’}

Returns:

DataFrame with outliers set to NaN

Notes:

  • ‘iq{D}’ - within [median +/- D times (Q3-Q1)]

  • ‘tukey’ - within [Q1 - 1.5(Q3-Q1), Q3 + 1.5(Q3-Q1)]

  • ‘farout’ - within [Q1 - 3(Q3-Q1), Q3 + 3(Q3-Q1)]
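
For example, blanking out values more than ten interquartile ranges from each column's median (the default ‘iq10’ screen):

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(100, 3), columns=['a', 'b', 'c'])
>>> clean = remove_outliers(df, method='iq10')   # outliers replaced with np.nan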

finds.recipes.filters.weighted_average(df: DataFrame, weights: str = '') → Series

Weighted means of data frame

Parameters:
  • df – DataFrame containing values, and optional weights, in columns

  • weights – Column name to use as weights

Returns:

Series of weighted means

Notes:

  • Ignores NaNs using numpy.ma
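
For example, with a hypothetical weights column named 'wt':

>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1.0, 2.0, 3.0], 'y': [4.0, None, 6.0], 'wt': [1.0, 1.0, 2.0]})
>>> weighted_average(df, weights='wt')   # Series of weighted column means, ignoring NaNs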

finds.recipes.filters.winsorize(df, quantiles=[0.025, 0.975])

Winsorize DataFrame by column quantiles (default=[0.025, 0.975])

Parameters:
  • df – Input DataFrame

  • quantiles – low and high quantiles beyond which values are clipped
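
For example, clipping each column at its 2.5% and 97.5% quantiles (the defaults):

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(1000, 2), columns=['x', 'y'])
>>> clipped = winsorize(df, quantiles=[0.025, 0.975])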