
[FEAT] Observe, intercept and re-build models based on caches #189

Open
jcozar87 opened this issue Sep 3, 2019 · 0 comments

Right now we observe and intercept variables by using a boolean tf.Variable and a tf.cond. The advantage of this is that we only need to build the sequence of Random Variables and tensors (the model) once, and the same mechanism can then be used to observe and intercept, reusing the same Random Variables and tensors. This is especially useful in the case of a DNN, where the fit method learns its parameters and the same tensors are used afterwards for posterior and posterior_predictive queries.
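The current mechanism can be sketched as follows. This is an illustrative, plain-Python analogue (no TensorFlow required): the boolean flag plays the role of the boolean tf.Variable, and the branch in `value()` plays the role of tf.cond. The class and method names are hypothetical, not the library's actual API.

```python
import random


class RandomVariable:
    """Illustrative stand-in for a Random Variable node built once."""

    def __init__(self, sample_fn):
        self._sample_fn = sample_fn
        self.is_observed = False   # plays the role of the boolean tf.Variable
        self.observed_value = None

    def observe(self, value):
        """Switch this node to its observed branch without rebuilding it."""
        self.is_observed = True
        self.observed_value = value

    def value(self):
        # plays the role of tf.cond: one node, two possible branches
        if self.is_observed:
            return self.observed_value
        return self._sample_fn()


x = RandomVariable(lambda: random.gauss(0.0, 1.0))
x.observe(3.0)
assert x.value() == 3.0  # observed branch is taken; the graph was built once
```

The key design point is that observation flips a flag on an existing node rather than rebuilding the model, which is exactly what makes the mechanism cheap but also what makes it incompatible with samplers that cannot traverse the conditional.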

However, some inference methods, like Hamiltonian Monte Carlo (MCMC), cannot be used in combination with this mechanism (tf.cond). This makes it impossible to run posterior or posterior_predictive queries if the model includes a DNN whose weights have been learnt.

A different approach is to use a kind of cache for the inputs and outputs of Random Variables. For a specific model, codified by its parameters (at least the plate size), the cache stores the parameters of the Random Variables, their inputs and their outputs. Inputs and outputs are wrapped with a special cache object:

  • Inputs: the first time, the original tensor is used; the following times, the cached tensor is used instead.
  • Outputs: the first time, the original tensor is used; the following times, the cached tensor is used instead.
  • Parameters of Random Variables: the first time, the parameters are used; the following times, the cache is used.
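The "first time build, afterwards reuse" behaviour described above can be sketched with a small memoizing wrapper. This is a hypothetical sketch, not the proposed implementation; the `Cached` name and its `get` method are illustrative.

```python
class Cached:
    """Wraps a build function: builds on first access, reuses afterwards."""

    _MISSING = object()  # sentinel so that None is a valid cached value

    def __init__(self, build_fn):
        self._build_fn = build_fn
        self._value = Cached._MISSING

    def get(self):
        if self._value is Cached._MISSING:
            self._value = self._build_fn()  # first time: build and store
        return self._value                  # following times: cached value


calls = []
c = Cached(lambda: calls.append(1) or len(calls))
assert c.get() == 1
assert c.get() == 1
assert len(calls) == 1  # the builder ran only once
```

With this kind of wrapper around inputs, outputs and parameters, re-creating the model re-runs the build functions only where no cached value exists, so a trained DNN's tensors are reused rather than rebuilt.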

This approach allows us to re-create the model when required, composing its pieces as needed:

  • Intercept variables directly with ed.intercept (this should be carefully designed, thinking about the fit method and the posterior and posterior_predictive queries).
  • Observe variables, also by using ed.intercept.
  • MCMC can be used in combination with a DNN, because the trained DNN is served from the cache.

In order to implement this feature, the current functionality should be listed, describing how each part would be done with this new approach.
