Feature/mda #206
base: main
Conversation
Thanks @james-oldfield! On a high level, would it make sense to also have an additional solver based on SVD (e.g. like for the linear case in scikit-learn)? This would avoid building the covariance matrix (it seems we make the assumption that all classes have the same covariance anyway here?). However, it's not clear whether it would be computationally advantageous in the end. Looking at the update rule, an online version might be nice to add down the line. As a side note, it's good to make the docstring as easily readable as possible (e.g. not too much LaTeX) -- we can have a short explanation in the docstring itself and a longer, more math-heavy version in the user guide.
Definitely! Although off the top of my head I'm not quite sure how one would extend the SVD solver to the multilinear case (i.e. how we'd solve for the factor matrices with the SVD directly on the batch of mode-n unfoldings (a 3rd-order tensor), without computing the scatter matrices explicitly?). It would certainly make a nice optional feature if it were possible. One thing that would be nice in the future could be a flag to instead take the solution via the eigendecomposition of …

Good idea with the docstring, I've removed a lot of the bulk :)
```python
###############
# check correct # of ranks have been supplied
###############
assert len(ranks) == len(T.shape(X)[1:]), 'Expected number of ranks: {}. \
```
It's good for end users to make the error messages as informative as possible. Something like:
```diff
-assert len(ranks) == len(T.shape(X)[1:]), 'Expected number of ranks: {}. \
+if len(ranks) != (T.ndim(X) - 1):
+    msg = f'Got as input a tensor of shape {T.shape(X)} corresponding to {T.shape(X)[0]} samples of shape {T.shape(X)[1:]} of order {T.ndim(X) - 1}. '
+    msg += f'But got {len(ranks)} ranks != {T.ndim(X) - 1}.'
+    raise ValueError(msg)
```
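To illustrate the suggested error message, here is a minimal standalone sketch of the check, with NumPy standing in for the tensorly backend namespace `T` (`check_ranks` is an illustrative helper name, not part of the PR):

```python
import numpy as np

def check_ranks(X, ranks):
    # Number of non-sample modes: the first axis indexes samples.
    order = X.ndim - 1
    if len(ranks) != order:
        msg = (f'Got as input a tensor of shape {X.shape} corresponding to '
               f'{X.shape[0]} samples of shape {X.shape[1:]} of order {order}. ')
        msg += f'But got {len(ranks)} ranks != {order}.'
        raise ValueError(msg)

X = np.zeros((10, 4, 5))  # 10 samples, each an order-2 tensor
check_ranks(X, [2, 3])    # correct number of ranks: passes silently
```

Passing the wrong number of ranks (e.g. `check_ranks(X, [2])`) raises a `ValueError` that spells out both the inferred tensor order and the number of ranks received.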
```python
# store the mean of all tensor samples with label i
for i in range(len(set(y))):
    # tensorflow is the only backend that does not support indexing into a tensor with a list
    if backend == 'tensorflow':
```
I really don't like/want to have these in the code (it should be backend-agnostic). However, TensorFlow really does not make it easy... It would be ideal to find another way, if it doesn't slow down the other backends.
```python
if backend == 'tensorflow':
    class_means += [T.mean(T.tensor([X[j] for j in class_idx[i]]), axis=0)]
else:
    class_means += [T.mean(X[class_idx[i], ...], axis=0)]
```
We can directly get the means without creating `class_idx`:

```diff
-class_means += [T.mean(X[class_idx[i], ...], axis=0)]
+class_means += [T.mean(X[y == i, ...], axis=0)]
```
Of course TensorFlow doesn't support this directly, but it does through a function:

```python
tf.boolean_mask(X, y == i)
```

Equivalently, for TensorFlow, we can do:

```python
tl.stack([X[j, ...] for j, e in enumerate(tl.to_numpy(y)) if e == i], 0)
```
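For reference, the per-class means computed by the boolean-mask approach above can be checked against a plain NumPy version; a minimal sketch (`class_means_np` is an illustrative name, not part of the PR):

```python
import numpy as np

def class_means_np(X, y):
    # One mean tensor per class, via boolean-mask indexing (X[y == c]).
    return [X[y == c].mean(axis=0) for c in np.unique(y)]

X = np.arange(24, dtype=float).reshape(4, 2, 3)  # 4 samples of shape (2, 3)
y = np.array([0, 1, 0, 1])
means = class_means_np(X, y)  # one (2, 3) mean per class
```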
```python
# --
# [1] Q. Li et al. "Multilinear Discriminant Analysis for Higher-Order Tensor Data Classification"
###################################################
U, _, _ = T.partial_svd(T.dot(W_scat_inv, B_scat))
```
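As a sanity check on this step, here is a hedged NumPy sketch of extracting the leading directions of W⁻¹B with a full SVD (the PR uses `T.partial_svd`; the scatter matrices below are synthetic stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
A = rng.standard_normal((d, d))
W_scat = A @ A.T + d * np.eye(d)   # within-class scatter: symmetric positive definite
C = rng.standard_normal((d, 3))
B_scat = C @ C.T                   # between-class scatter: PSD, rank <= 3

# Leading left singular vectors of W^{-1} B, mirroring
# U, _, _ = T.partial_svd(T.dot(W_scat_inv, B_scat))
U, s, Vt = np.linalg.svd(np.linalg.inv(W_scat) @ B_scat)
U_k = U[:, :2]                     # keep the top-2 projection directions
```

Note that W⁻¹B is not symmetric in general, so its SVD and eigendecomposition differ; this sketch only mirrors the SVD-based update used in the hunk above.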
We should probably be using SVD_FUNS but this is being refactored in #217 and should be easier to use when that gets merged.
```python
return factors
```
```python
def compute_modek_wb_scatters(X, mode, factors, global_mean, class_means, class_idx):
```
What about naming it `mode_scatter_matrices`? That would be consistent with `mode_dot`. Or even just `scatter_matrices`, and we document well that this corresponds to the mode-n between- and within-class scatter matrices?
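For context, a self-contained NumPy sketch of plain mode-k between/within class scatter matrices, without the factor projections the PR's function also applies (`unfold` and `mode_scatter_matrices` are illustrative names). A useful property for testing: the two scatters decompose the total scatter exactly.

```python
import numpy as np

def unfold(t, mode):
    # Mode-`mode` unfolding of a single sample tensor.
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_scatter_matrices(X, y, mode):
    """Between/within class scatters along `mode` of the sample tensors.

    X has shape (n_samples, d1, d2, ...); no factor projections applied.
    """
    global_mean = X.mean(axis=0)
    d = X.shape[mode + 1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        D = unfold(class_mean - global_mean, mode)
        Sb += len(Xc) * D @ D.T        # between-class contribution
        for sample in Xc:
            E = unfold(sample - class_mean, mode)
            Sw += E @ E.T              # within-class contribution
    return Sb, Sw
```

Since unfolding is linear, `Sb + Sw` equals the total scatter computed from deviations around the global mean, which makes a natural unit test.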
```python
# recover X, using the inverse projection matrices
X_hat = multi_mode_dot(Z, [tl.tensor(inv(tl.to_numpy(f))) for f in factors], modes=[1, 2, 3], transpose=True)

assert_array_almost_equal(X, X_hat, decimal=tol)
```
Isn't this always true as long as the factors are not singular? Also, the factors are obtained through SVD, so they are orthogonal; we can just take their transpose instead of converting to NumPy and using `inv`. It would be good to have some specific tests.
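A quick standalone check of the orthogonality point (a sketch, not the PR's test):

```python
import numpy as np

# For an orthogonal factor Q, the inverse is just the transpose,
# so inv(Q) in the reconstruction can be replaced by Q.T.
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((4, 4)))
assert np.allclose(np.linalg.inv(Q), Q.T)
```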
d45beab to 2aa9834
Also opening a PR for Li's constrained multilinear discriminant analysis [1]. This could perhaps best suit `tensorly/decomposition`, given the current structure? Along with MPCA, there's also a notebook walkthrough that could potentially fit into the examples!