
Easy Sample Weights #1175

Open
Beerstabr opened this issue Aug 30, 2022 · 18 comments · May be fixed by #2362

Comments

@Beerstabr (Contributor)

Often in forecasting it makes sense to use sample weights that make your model focus more on recent history, and with most scikit-learn models you can introduce this through the fit method. It would be great if Darts made it easy to apply sensible weighting schemes for forecasting, such as an exponentially decaying weighting function.

Many thanks for the library!
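
For context, this is roughly what such recency weighting looks like with a plain scikit-learn estimator (a minimal sketch, not darts API; the toy data and the half-life value are made up):

import numpy as np
from sklearn.linear_model import LinearRegression

# toy lagged-features setup: X holds the features, y the target, oldest sample first
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.1, size=100)

# exponentially decaying weights: the most recent sample gets weight 1
half_life = 20
weights = 0.5 ** (np.arange(len(y))[::-1] / half_life)

model = LinearRegression()
model.fit(X, y, sample_weight=weights)  # most scikit-learn regressors accept sample_weight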

@Beerstabr added the triage label Aug 30, 2022
@dennisbader added the feature request label and removed the triage label Aug 30, 2022
@hrzn
Copy link
Contributor

hrzn commented Aug 31, 2022

Good idea, adding it to the backlog :) (and contributions are welcome!)

@hrzn added this to "To do" in darts via automation Aug 31, 2022
@Beerstabr (Contributor, Author)

I would definitely like to contribute!

@Beerstabr (Contributor, Author)

I was thinking of solving it much like the _create_lagged_data function of the RegressionModel class.

Starting out with three options:

  • equal weights
  • linearly decaying weights
  • exponentially decaying weights, like the formula attached as an image in the original comment (a rough sketch of all three schemes follows below)
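
For illustration, the three schemes could look roughly like this (a sketch only, one weight per training sample with the oldest sample first; the exact exponential formula from the attached image is not reproduced):

import numpy as np

n_samples = 50  # number of tabularized training samples, oldest first

# equal weights
equal_weights = np.ones(n_samples)

# linearly decaying: the oldest sample gets the smallest weight, the newest gets 1
linear_weights = np.linspace(1.0 / n_samples, 1.0, n_samples)

# exponentially decaying with some decay rate alpha
alpha = 0.05
exp_weights = np.exp(-alpha * np.arange(n_samples)[::-1])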

@hrzn (Contributor) commented Sep 7, 2022

Hi @Beerstabr, after checking, I think it should already be possible to do something like this:

my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(n_lags)])

because all the kwargs received by fit() are passed to the underlying estimator's fit() method.

@Beerstabr (Contributor, Author) commented Sep 8, 2022

Hi @hrzn, yes that's true. That's how I am currently doing it.

However, it gets slightly more complicated when you start using lags, an output_chunk_length > 1, or start training on multiple series (and there are probably other things to consider as well).

For example, when you use 8 lags your series gets cut short by 8 data points. In that case I think it should be:

my_model = RegressionModel(..., lags=n_lags, input_chunk_length=in_len)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(in_len - n_lags)])

And, if you want both lags and output_chunk_length > 1, then I believe it should be:

my_model = RegressionModel(..., lags=n_lags, output_chunk_length=out_len, input_chunk_length=in_len)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(in_len - np.max([n_lags, out_len]))])

And finally, when you're training on multiple series and these series differ in length, it gets a bit more complicated. In that case you'll need to take into account the order and the difference in length of the series. For example, in the case of exponentially decaying weights it could be like this:

import numpy as np

# function for calculating exponential weights
def exponential_sample_weights(ts, n_lags=8, multiple_series=False, max_series_length=np.nan):

    if not multiple_series:
        # one weight per training sample; the first n_lags points produce no samples
        T = len(ts) - n_lags
        sample_weights = [-np.log(1 - t / T) / (T - 1) for t in range(1, T + 1) if t < T] + [np.log(T) / (T - 1)]
    else:
        # scale against the longest series so equally old samples get equal weight
        T = max_series_length - n_lags
        T_self = len(ts) - n_lags
        sample_weights = [-np.log(1 - t / T) / (T - 1) for t in range(1 + (T - T_self), T + 1) if t < T] + [np.log(T) / (T - 1)]

    return sample_weights

# create a list with the weights in the same order as the series to which they belong
seq_sample_weights = []
max_len = np.max([len(series) for series in seq_series])
for series in seq_series:
    seq_sample_weights += exponential_sample_weights(ts=series,
                                                     n_lags=n_lags,
                                                     multiple_series=True,
                                                     max_series_length=max_len)

# fit the model (without a specific input_chunk_length)
my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight=seq_sample_weights)

In the latter case, be mindful that if the series differ in length and you train on multiple series, T should be the same for all series when calculating the exponentially decaying weights if you want to put equal weight on each series.

So, applying sample weights in Darts currently requires you to know what happens behind the scenes. Otherwise it's hard to get it working in the non-trivial cases and it's easy to make mistakes. Therefore I think it would be nice to have an easier way of doing it, like:

my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight_type='exponential')

Later on you could also add functionality to let the model focus more on specific series. But I would say that's of lesser importance.

@hrzn (Contributor) commented Sep 12, 2022

Hi @Beerstabr, first off, I'm sorry because I realised I made a mistake in my previous message: the sample_weight values are (obviously) per-sample weights and not per-dimension weights, as I was too quick to assume. Indeed, the actual number of samples is a relatively non-trivial function of the input chunk length (or the number of lags used on the target), the number of target series, and potentially the parameter max_samples_per_ts. Then, once all samples are built (this is done in RegressionModel._create_lagged_data()), the weights should be assigned to them as a function of how far in the past the corresponding target lag lies.
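
As a rough illustration of that sample count under simple assumptions (one target series, n_lags consecutive target lags, no covariates, no max_samples_per_ts cap; not the actual _create_lagged_data code):

import numpy as np

series_len, n_lags, out_len = 200, 8, 3

# each sample needs n_lags points of history and out_len points of labels
n_samples = series_len - n_lags - out_len + 1

# index (into the original series) of the last label of each sample, oldest sample first
label_end_idx = np.arange(n_lags + out_len - 1, series_len)

# e.g. weight each sample by how recent its last label is (exponential decay)
alpha = 0.05
sample_weight = np.exp(-alpha * (series_len - 1 - label_end_idx))
assert len(sample_weight) == n_samples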

I think it can be done and it could be a pretty nice feature indeed. However, it would also add a bit of complexity, because it would be strongly coupled to the tabularization logic. Nevertheless, if you feel like tackling it, we would be very happy to receive a PR in this direction. I would recommend waiting a little before you start, though, as we have a couple of other initiatives ongoing that touch the tabularization itself, so it'd be better to do it afterwards to avoid conflicts.

@Beerstabr (Contributor, Author)

Hi @hrzn,

Seems to me like a fun challenge to tackle. I'll wait for the right moment though. How will I know when the ongoing initiatives are done? Are there specific backlog items I can follow?

@madtoinou (Collaborator)

Hi @Beerstabr,

The PR refactoring the tabularization has been merged. If you're still interested in implementing this feature, it's more than welcome!

@Beerstabr (Contributor, Author) commented Mar 23, 2023 via email

@madtoinou assigned madtoinou and Beerstabr and unassigned madtoinou Mar 23, 2023
@madtoinou moved this from "To do" to "In progress" in darts Mar 23, 2023
@daniel-ressi commented Oct 3, 2023

Hi! I would highly appreciate this feature as well. I currently pass sample weights to the fit method the following way:

  1. create darts TimeSeries with the sample weights (in my case a list of TimeSeries)
  2. recompute the _get_feature_times and get_shared_times from the tabularization module (very redundant)
  3. slice the sample weights (darts TimeSeries) based on the shared times
  4. convert them into a numpy array
  5. pass them to sample_weight as an additional keyword argument to the fit method of the underlying model (a rough single-series sketch follows below)
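
A rough, simplified version of that workaround for a single target series with target lags only (no covariates, output_chunk_length=1, no max_samples_per_ts), avoiding the private tabularization helpers and relying on fit() forwarding extra kwargs to the underlying estimator as discussed above; API details may have changed since:

import numpy as np
import pandas as pd
from darts import TimeSeries
from darts.models import RegressionModel

n_lags = 8
times = pd.date_range("2020-01-01", periods=100, freq="D")
target = TimeSeries.from_times_and_values(times, np.random.randn(100).cumsum())

# per-timestamp weights, here exponentially decaying towards the past
weight_ts = TimeSeries.from_times_and_values(times, np.exp(-0.05 * np.arange(100)[::-1]))

# align the weight series with the target in time (relevant when they don't cover the same range)
weight_ts = weight_ts.slice_intersect(target)

# the first n_lags timestamps never appear as labels, so they produce no training samples
sample_weight = weight_ts.values().flatten()[n_lags:]

model = RegressionModel(lags=n_lags)
model.fit(target, sample_weight=sample_weight)  # forwarded to the sklearn estimator's fit()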

@gofford commented Oct 5, 2023

This would be a big addition. Weights would also allow alternative ways to handle missing values; e.g., https://cienciadedatos.net/documentos/py46-forecasting-time-series-missing-values.html
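
For instance (illustrative only, not taken from the linked article), missing points could simply get zero weight after a naive fill, so the estimator ignores them:

import numpy as np

y = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
sample_weight = np.where(np.isnan(y), 0.0, 1.0)  # zero weight where the value was missing
y_filled = np.nan_to_num(y, nan=0.0)             # the filled value is irrelevant with weight 0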

@BohdanBilonoh (Contributor) commented Apr 8, 2024

Hi! There is an idea to make the weights part of the TimeSeries class, as an attribute on the underlying xarray (like static covariates or a hierarchy). I could contribute if the idea is valid.

@madtoinou (Collaborator)

There is an upcoming PR that will offer the possibility to either generate weights during tabularization or provide them as a TimeSeries when training the model. The logic is implemented; the contributor is now working on the tests.

I am not sure that adding it as an attribute of TimeSeries is the approach we want to take, as TimeSeries are immutable and one might be interested in testing several weighting approaches.

@BohdanBilonoh (Contributor) commented Apr 10, 2024

Sounds interesting. My motivation was to make the sample weights part of the input and use them as weight_cols for TimeSeries.from_dataframe. This could allow all slicing logic to be hidden behind the TimeSeries class and would allow weighting not only per sample, but per timestamp and/or per component. Does the new logic you mentioned cover such abilities?

@madtoinou (Collaborator)

The slicing logic will be hidden, but in the tabularization.

The upcoming implementation allows associating a weight with each timestamp, which is then converted to sample weights. I don't see how weighting could be performed on the component dimension; would you mind describing how this could be leveraged?

@BohdanBilonoh (Contributor) commented Apr 10, 2024

It will be interesting to see the code of the new logic.

Very simple example:
E-commerce time series that contain revenue and margin as targets and have to be predicted simultaneously (using the TiDE model), but where revenue is more important than margin.

@madtoinou (Collaborator)

I will make sure that the PR implementing this new feature is linked to this issue.

I think that this kind of "bias" should come from the loss/objective function; it's not really possible to influence a model to favor the optimization of one target component over another using another mechanism (at least to my knowledge). The model is usually responsible for identifying the most informative features (lags/components).

@BohdanBilonoh (Contributor) commented Apr 10, 2024

My vision of the sample weights was similar to the weights passed to Likelihood.compute_loss, and in that scenario the sample and/or timestamp and/or component could be weighted.
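
For illustration, a weighting of that kind could look like a loss that accepts weights broadcastable over the (batch, time, component) dimensions (a hypothetical sketch, not the actual Likelihood.compute_loss signature):

import torch

def weighted_mse(pred, target, weights):
    # pred, target: (batch, time, components); weights must be broadcastable to that shape,
    # so it can weight per sample (batch), per timestamp (time) and/or per component
    se = (pred - target) ** 2
    return (weights * se).sum() / weights.expand_as(se).sum()

# e.g. revenue twice as important as margin (per-component weights)
pred = torch.randn(32, 12, 2)
target = torch.randn(32, 12, 2)
loss = weighted_mse(pred, target, torch.tensor([2.0, 1.0]))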

@AntonRagot linked a pull request (#2362) Apr 30, 2024 that will close this issue