
Does Darts provide methods for unsupervised anomaly detection models? #2355

Closed
ETTAN93 opened this issue Apr 26, 2024 · 5 comments
Labels
question Further information is requested

Comments

ETTAN93 commented Apr 26, 2024

Based on the Darts documentation on anomaly models, it seems that the two available ones, the filtering anomaly model and the forecasting anomaly model, both require the model to first be fitted to a series without anomalies, i.e. they behave as supervised anomaly detection models.

Is my understanding correct? Does Darts offer any unsupervised models for anomaly detection?

@madtoinou madtoinou added the question Further information is requested label Apr 29, 2024
madtoinou (Collaborator) commented:

Hi @ETTAN93,

At the moment, Darts does not offer any unsupervised models for anomaly detection, but this could be added to the roadmap, especially if contributors propose architectures and open PRs.
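(As an illustration outside of Darts: fully unsupervised anomaly detectors do exist in scikit-learn, e.g. `IsolationForest`. The sketch below is a minimal, illustrative workaround on raw array values; it is not part of the Darts API, and the variable names are made up for the example.)

```python
# Hedged sketch: unsupervised anomaly detection with scikit-learn's
# IsolationForest on raw series values (not a Darts API).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
values = rng.normal(0, 1, size=(200, 1))
values[50] = 8.0  # inject one obvious outlier

iso = IsolationForest(contamination=0.01, random_state=0)
labels = iso.fit_predict(values)  # -1 = anomaly, 1 = normal

anomaly_idx = np.where(labels == -1)[0]
print(anomaly_idx)
```

To apply this to a Darts `TimeSeries`, one would first extract the values (e.g. via `series.values()`) and then map the resulting labels back onto the time index.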


ETTAN93 commented May 10, 2024

@madtoinou thanks for that.

Another thing to clarify: I used `eval_accuracy` from the Darts `QuantileDetector` class and compared its result against sklearn's `recall_score`.

I passed the same y_test and y_pred series to both:

```python
from sklearn.metrics import recall_score

qd_recall = qd.eval_accuracy(y_test_series, qd_y_pred_series, metric='recall')
sklearn_recall = recall_score(y_test_series.pd_series(), qd_y_pred_series.pd_series())
```

For some reason, the two values are complementary: when I sum the two recall scores, I get 1.0. In this particular case, `qd_recall` from Darts returns 0.9946808510638298 whereas `recall_score` from sklearn returns 0.005319148936170213.

Am I passing the wrong parameters to the Darts function? As far as I understand, the `anomaly_score` parameter should be the `y_pred_series` from the model? What does the `window` parameter do?


The same also happens when I evaluate the accuracy score: the two returned scores sum to 1.

```python
from sklearn.metrics import accuracy_score

qd_accuracy = qd.eval_accuracy(y_test_series, qd_y_pred_series, metric='accuracy')
sklearn_accuracy = accuracy_score(y_test_series.pd_series(), qd_y_pred_series.pd_series())
```

dennisbader (Collaborator) commented May 13, 2024

Hi @ETTAN93, QuantileDetector.eval_accuracy() expects the predicted scores from the Scorer and not the output of QuantileDetector.detect().

The following should work:

```python
# darts
qd = QuantileDetector(high_quantile=0.5)
anom_pred = qd.fit_detect(scores_pred)
qd_recall = qd.eval_accuracy(anom_true, scores_pred, metric="recall")

# sklearn
sl_recall = recall_score(
    anom_true.slice_intersect(anom_pred).pd_series(),
    anom_pred.slice_intersect(anom_true).pd_series(),
)

print(qd_recall, sl_recall)
```

outputs: (0.6923, 0.6923)
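A side note on the symptom in the question: whenever a binary prediction is effectively inverted, the two recall scores are complementary, because recall = TP / (TP + FN) and each true positive is counted by exactly one of `y_pred` and its complement. A small illustrative sketch (made-up data, not the series from the question):

```python
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 0, 1, 0, 1, 1])

r = recall_score(y_true, y_pred)          # TP / (TP + FN) = 3/4
r_inv = recall_score(y_true, 1 - y_pred)  # the remaining positives = 1/4
print(r, r_inv, r + r_inv)  # the two recalls always sum to 1.0
```

This is consistent with the observed 0.9946... + 0.0053... = 1.0 when the binary `detect()` output was passed where scores were expected.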

You could also use `eval_accuracy_from_binary_prediction()` from `darts.ad.utils` to compute the recall on the output of the `QuantileDetector`.

Note also that in 1-2 weeks we'll release the new Darts version with the refactored anomaly detection module (including an example notebook). So the API will change slightly (see the changes and PR here).


ETTAN93 commented May 13, 2024

@dennisbader how is the scores_pred defined?

dennisbader (Collaborator) commented:

It can be any numeric, non-binary input series; the detector converts it to binary.
In the example above it was the output of `KMeansScorer.score()`, but you can also use it on other series as shown below:

```python
from sklearn.metrics import recall_score

from darts import TimeSeries
from darts.ad import QuantileDetector
from darts.datasets import AirPassengersDataset

series = AirPassengersDataset().load()

# flag values above 400 as anomalies
anom_true = TimeSeries.from_dataframe(series.pd_dataframe() > 400)

# darts
qd = QuantileDetector(high_quantile=0.95)
anom_pred = qd.fit_detect(series)
qd_recall = qd.eval_accuracy(anom_true, series, metric="recall")

# sklearn
sl_recall = recall_score(
    anom_true.slice_intersect(anom_pred).pd_series(),
    anom_pred.slice_intersect(anom_true).pd_series(),
)

print(qd_recall, sl_recall)
```

gives (0.2857, 0.2857)
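Conceptually, what the quantile detector does can be sketched in plain NumPy (a simplified illustration, not the actual Darts implementation): fitting computes a quantile threshold from the training values, and detection binarizes against it.

```python
import numpy as np

# Simplified sketch of quantile-based detection (not the Darts code):
# "fit" learns a threshold from the series, "detect" binarizes against it.
rng = np.random.default_rng(42)
values = rng.normal(100, 10, size=500)

high_quantile = 0.95
threshold = np.quantile(values, high_quantile)  # "fit"
flags = (values > threshold).astype(int)        # "detect": 1 = anomaly

print(threshold, flags.sum())  # roughly 5% of points get flagged
```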
