Index of changes in PAA and SAX #441

Open
yasirroni opened this issue Mar 2, 2023 · 1 comment

Comments

@yasirroni

Is your feature request related to a problem? Please describe.
I'm running the PAA and SAX tutorial from the docs:

import numpy
import matplotlib.pyplot as plt

from tslearn.generators import random_walks
from tslearn.preprocessing import TimeSeriesScalerMeanVariance
from tslearn.piecewise import PiecewiseAggregateApproximation
from tslearn.piecewise import SymbolicAggregateApproximation, \
    OneD_SymbolicAggregateApproximation

numpy.random.seed(0)
# Generate a random walk time series
n_ts, sz, d = 1, 100, 1
dataset = random_walks(n_ts=n_ts, sz=sz, d=d)
scaler = TimeSeriesScalerMeanVariance(mu=0., std=1.)  # Rescale time series
dataset = scaler.fit_transform(dataset)

n_paa_segments = 10
paa = PiecewiseAggregateApproximation(n_segments=n_paa_segments)
paa_data = paa.fit_transform(dataset)
paa_dataset_inv = paa.inverse_transform(paa_data)

After reading the docs, it seems there is no function that returns the indices at which the values in paa_data change.

Describe the solution you'd like
paa.get_index(paa_data) should return the indices at which the paa_data values change. That would be nice.
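
For illustration, the requested indices can probably be computed without touching the data at all. The sketch below assumes PAA cuts the series into equal-width segments of length sz // n_segments (worth checking against the installed tslearn version); paa_segment_starts is a hypothetical helper name, not an existing tslearn function.

```python
import numpy


def paa_segment_starts(sz, n_segments):
    """Hypothetical helper: start index of each PAA segment.

    Assumes equal-width segments of length sz // n_segments, so any
    trailing samples beyond n_segments * (sz // n_segments) are ignored.
    """
    if n_segments <= 0 or n_segments > sz:
        raise ValueError("n_segments must be between 1 and sz")
    seg_sz = sz // n_segments
    return numpy.arange(n_segments) * seg_sz


# Same sizes as in the tutorial snippet: 100 samples, 10 segments.
starts = paa_segment_starts(sz=100, n_segments=10)
print(starts)
```

With the tutorial's sizes this gives the multiples of 10 from 0 to 90, independent of the values in paa_data.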

@yasirroni
Author

My implementation:

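from numba import njit


@njit
def find_first(array, item):
    # Return the index of the first element equal to item, or -1 if absent.
    for idx, val in enumerate(array):
        if val == item:
            return idx
    return -1


def get_paa_index(paa_dataset_inv, paa_data):
    # Flatten both arrays; this assumes a single univariate time series.
    paa_dataset_inv = paa_dataset_inv.ravel()
    paa_data = paa_data.ravel()

    idxs = []
    idx = 0
    for val in paa_data:
        # Search only from the previous match onwards.
        offset = find_first(paa_dataset_inv[idx:], val)
        if offset < 0:
            raise ValueError("PAA value not found in the reconstructed series")
        idx += offset
        idxs.append(idx)

    return idxs


get_paa_index(paa_dataset_inv, paa_data)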
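
A possible vectorized variant of the same idea, sketched with plain NumPy and no numba: look for positions where the reconstructed, piecewise-constant series changes value. change_indices is a hypothetical name. One caveat: if two neighbouring segments happen to have exactly the same mean, the boundary between them is not detected.

```python
import numpy


def change_indices(series):
    """Hypothetical helper: indices where a piecewise-constant series
    starts a new level. Index 0 is always included for non-empty input."""
    flat = numpy.asarray(series).ravel()
    if flat.size == 0:
        return numpy.array([], dtype=int)
    # diff != 0 marks the last sample of a level; +1 moves to the first
    # sample of the next level.
    jumps = numpy.flatnonzero(numpy.diff(flat) != 0) + 1
    return numpy.concatenate(([0], jumps))


# Small synthetic check: three levels of four samples each.
demo = numpy.repeat([0.5, -1.0, 2.0], 4)
print(change_indices(demo))  # [0 4 8]
```

On the tutorial data this would be called as change_indices(paa_dataset_inv).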
