DensityOutliersTransform
- class DensityOutliersTransform(in_column: str, window_size: int = 15, distance_coef: float = 3, n_neighbors: int = 3, distance_func: typing.Callable[[float, float], float] = <function absolute_difference_distance>)
Bases: etna.transforms.outliers.base.OutliersTransform
Transform that uses get_anomalies_density() to find anomalies in data.
Warning
This transform can suffer from look-ahead bias: to transform the data at a given timestamp, it uses information from the whole train part.
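The look-ahead effect can be illustrated with a small sketch. It assumes, based on the distance_coef description, that the distance threshold is distance_coef times the standard deviation of the series; the helper distance_threshold and the use of the population standard deviation are assumptions made for illustration, not part of the library API.

```python
import statistics


def distance_threshold(values, distance_coef=3.0):
    # Hypothetical helper: threshold = distance_coef * standard deviation.
    # Using the population standard deviation is an assumption of this sketch.
    return distance_coef * statistics.pstdev(values)


# Early part of the train data and a later part with a level shift.
history = [1.0, 1.2, 0.8, 1.1, 0.9]
future = [10.0, 10.2, 9.8, 10.1, 9.9]

# Threshold that would be available at the end of `history` only.
threshold_past_only = distance_threshold(history)

# Threshold computed on the whole train part, including later values.
threshold_full_train = distance_threshold(history + future)

print(threshold_past_only, threshold_full_train)
```

Under these assumptions the later values widen the threshold, so whether an early point is flagged depends on data that follows it in time.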
Create instance of DensityOutliersTransform.
- Parameters
in_column (str) – name of the column to process
window_size (int) – size of the windows to build
distance_coef (float) – multiplier of the standard deviation that forms the distance threshold used to decide whether two points are close to each other
n_neighbors (int) – minimum number of close neighbors a point needs in order not to be an outlier
distance_func (Callable[[float, float], float]) – distance function
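How the parameters interact can be sketched in plain Python. This is a simplified illustration and not the library's implementation: the function density_outlier_indices, the helper absolute_difference, and the choice of the population standard deviation are assumptions. A point is kept if at least one window of window_size points containing it holds n_neighbors or more other points closer than the threshold.

```python
import statistics
from typing import Callable, List, Sequence


def absolute_difference(x: float, y: float) -> float:
    # Stand-in for the default distance function.
    return abs(x - y)


def density_outlier_indices(
    series: Sequence[float],
    window_size: int = 15,
    distance_coef: float = 3.0,
    n_neighbors: int = 3,
    distance_func: Callable[[float, float], float] = absolute_difference,
) -> List[int]:
    # Assumption: threshold = distance_coef * population standard deviation.
    threshold = distance_coef * statistics.pstdev(series)
    n = len(series)
    outliers = []
    for idx, value in enumerate(series):
        is_outlier = True
        # Start positions of every window of length window_size containing idx.
        first_start = max(0, idx - window_size + 1)
        last_start = max(0, min(idx, n - window_size))
        for start in range(first_start, last_start + 1):
            stop = min(start + window_size, n)
            # Count the other points in this window that are close to the value.
            close = sum(
                1
                for j in range(start, stop)
                if j != idx and distance_func(value, series[j]) < threshold
            )
            if close >= n_neighbors:
                is_outlier = False
                break
        if is_outlier:
            outliers.append(idx)
    return outliers


series = [1.0, 1.1, 0.9, 1.0, 50.0, 1.0, 1.1, 0.9, 1.0, 1.05]
found = density_outlier_indices(series, window_size=5, distance_coef=1.0, n_neighbors=2)
print(found)  # [4]
```

In this sketch a larger distance_coef or a smaller n_neighbors makes the detector more permissive, while window_size controls how far from a point its neighbors may lie.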
Methods
detect_outliers(ts) – Call get_anomalies_density() function with self parameters.
fit(df) – Find outliers using detection method.
fit_transform(df) – May be reimplemented.
inverse_transform(df) – Inverse transformation.
transform(df) – Replace found outliers with NaNs.
- detect_outliers(ts: etna.datasets.tsdataset.TSDataset) → Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
Call the get_anomalies_density() function with the transform's own parameters.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – dataset to process
- Returns
dict of outliers in format {segment: [outliers_timestamps]}
- Return type
Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
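The returned format, and how such a dict can be used to replace outliers with NaNs, can be sketched without the library. Plain dictionaries and datetime objects stand in for TSDataset and pandas Timestamp here, and replace_outliers_with_nan is a hypothetical helper rather than a library function.

```python
import math
from datetime import datetime
from typing import Dict, List


def replace_outliers_with_nan(
    data: Dict[str, Dict[datetime, float]],
    outliers: Dict[str, List[datetime]],
) -> Dict[str, Dict[datetime, float]]:
    # Hypothetical helper: return a copy of the data with flagged values set to NaN.
    cleaned = {}
    for segment, series in data.items():
        flagged = set(outliers.get(segment, []))
        cleaned[segment] = {
            ts: (math.nan if ts in flagged else value)
            for ts, value in series.items()
        }
    return cleaned


data = {
    "segment_a": {
        datetime(2021, 1, 1): 1.0,
        datetime(2021, 1, 2): 50.0,
        datetime(2021, 1, 3): 1.1,
    },
    "segment_b": {
        datetime(2021, 1, 1): 2.0,
        datetime(2021, 1, 2): 2.1,
        datetime(2021, 1, 3): 1.9,
    },
}

# Same shape as the documented return value: {segment: [outliers_timestamps]}.
outliers = {"segment_a": [datetime(2021, 1, 2)], "segment_b": []}

cleaned = replace_outliers_with_nan(data, outliers)
print(cleaned["segment_a"])
```

Segments with an empty list are left unchanged, which matches the per-segment layout of the documented return value.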