N-dimensional features issue in the method #496

Open
zandarina1 opened this issue Nov 20, 2023 · 2 comments


zandarina1 commented Nov 20, 2023

Hello all,

I want to cluster on two dimensions, i.e. two time series per participant. I reshaped the data into the layout the library expects:

(6431, 5, 2)

However, when I plot the result, both signals end up in a single plot, and I am not sure whether the two features are being treated separately, which is what I want. For example, participant 1, with series A increasing and series B decreasing, should fall in cluster 1. What I get instead does not make sense: the clustering looks the same as if the data were one-dimensional, plotting X_train[y_pred == yi,:,1] and X_train[y_pred == yi,:,0] separately does not help, and the cluster centers are the same for both series/dimensions. How can I plot two-dimensional data so that the clusters are differentiated by dimension? It would be great to have an example with multiple dimensions alongside the nice examples in the tutorial. Thanks.

for yi in range(N_CLUSTERS):
    plt.subplot(2, 3, 1 + yi)
    for xx in X_train[y_pred == yi,:,1]:
        plt.plot(xx.ravel(), "k-", alpha=.2)
    plt.plot(km.cluster_centers_[yi,:,1].ravel(), "r-")
    plt.xlim(0, sz)
    plt.ylim(-4, 4)
    plt.text(0.55, 0.85,'Cluster %d' % (yi + 1),
             transform=plt.gca().transAxes)
    if yi == 1:
        plt.title("DTW $k$-means")
@zandarina1 zandarina1 added the bug label Nov 20, 2023
@YannCabanes
Contributor

Hello @zandarina1,
I think that your problem comes from the misuse of numpy.ravel which flattens the NumPy arrays:
https://numpy.org/doc/stable/reference/generated/numpy.ravel.html#numpy.ravel
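
To illustrate what I mean, here is a minimal sketch on a made-up toy series (the array `xx` below is hypothetical, not taken from your data). Assuming a single time series has shape (sz, d), calling ravel on it interleaves the two dimensions into one 1-D signal, whereas indexing the last axis keeps each dimension separate:

```python
import numpy as np

# Hypothetical toy series: 3 time steps, 2 dimensions, shape (sz, d) = (3, 2).
xx = np.array([[1, 10],
               [2, 20],
               [3, 30]])

flattened = xx.ravel()  # interleaves both dimensions, row by row
dim0 = xx[:, 0]         # first dimension only
dim1 = xx[:, 1]         # second dimension only

print(flattened)  # [ 1 10  2 20  3 30]
print(dim0)       # [1 2 3]
print(dim1)       # [10 20 30]
```

Plotting `flattened` would therefore draw one zig-zag line mixing both signals, while plotting `dim0` and `dim1` on separate axes shows each signal on its own.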


YannCabanes commented Dec 5, 2023

Taking inspiration from:
https://tslearn.readthedocs.io/en/stable/auto_examples/clustering/plot_kmeans.html#sphx-glr-auto-examples-clustering-plot-kmeans-py
I have written the following code:

import matplotlib.pyplot as plt
import numpy as np

from tslearn.clustering import TimeSeriesKMeans
from tslearn.datasets import CachedDatasets
from tslearn.preprocessing import TimeSeriesScalerMeanVariance, \
    TimeSeriesResampler

seed = 0
np.random.seed(seed)
X_train, y_train, X_test, y_test = CachedDatasets().load_dataset("Trace")
print(X_train.shape)  # (100, 275, 1)
# Build a second dimension as the negation of the first
X_train = np.concatenate([X_train, -X_train], axis=2)
print(X_train.shape)  # (100, 275, 2)
X_train = X_train[y_train < 4]  # Keep first 3 classes
np.random.shuffle(X_train)
# Keep only 50 time series
X_train = TimeSeriesScalerMeanVariance().fit_transform(X_train[:50])
# Make time series shorter
X_train = TimeSeriesResampler(sz=40).fit_transform(X_train)
sz = X_train.shape[1]
print(sz)

# Soft-DTW-k-means
print("Soft-DTW k-means")
sdtw_km = TimeSeriesKMeans(n_clusters=3,
                           metric="softdtw",
                           metric_params={"gamma": .01},
                           verbose=True,
                           random_state=seed)
y_pred = sdtw_km.fit_predict(X_train)

for yi in range(3):
    for di in range(2):
        plt.subplot(2, 3, 1 + yi + 3 * di)
        for xx in X_train[y_pred == yi]:
            plt.plot(xx[:, di], "k-", alpha=.2)
        plt.plot(sdtw_km.cluster_centers_[yi, :, di], "r-")
        plt.xlim(0, sz)
        plt.ylim(-4, 4)
        plt.text(0.05, 0.85, f"Cluster {yi + 1}, dim {di + 1}",
                 transform=plt.gca().transAxes)
        if yi == 1 and di == 0:
            plt.title("Soft-DTW $k$-means")

plt.tight_layout()
plt.show()

Does it correspond to what you would like to do?

@YannCabanes YannCabanes removed the bug label Dec 5, 2023