Token Generation using ConditionalAutoregressive2D #267

aszala · 2023-01-13T20:24:01Z

Hi, I am trying to use Jukebox to generate my own tokens.
The paper mentions that you pass all previously generated tokens to the model as input for generating the next token.
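
For reference, the behaviour described in the paper can be sketched as a loop that passes the whole prefix back in at every step. This is a minimal illustration only: `toy_logits`, `sample_categorical`, and `naive_sample` are hypothetical names, and `toy_logits` is a stand-in for a model, not Jukebox code.

```python
import math
import random


def toy_logits(prefix, vocab_size):
    """Hypothetical stand-in for a model: logits depend on the whole prefix."""
    total = sum(prefix)
    return [math.cos(total + k) for k in range(vocab_size)]


def sample_categorical(logits, rng):
    """Draw one index with probability proportional to softmax(logits)."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def naive_sample(n_tokens, vocab_size, temp=1.0, seed=0):
    """Generate tokens one at a time, re-feeding every previous token."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(n_tokens):
        logits = toy_logits(tokens, vocab_size)  # the full prefix is the input
        logits = [l / temp for l in logits]      # temperature scaling
        tokens.append(sample_categorical(logits, rng))
    return tokens


tokens = naive_sample(n_tokens=8, vocab_size=16)
```

Here the input to the model grows by one token per step, which is what the loop quoted below does not appear to do.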

However, I was reading through the code to understand how it works, and I am confused about how it passes the previously generated tokens to the model.

Here is the snippet I am referring to, from the sample method of the ConditionalAutoregressive2D class:

jukebox/jukebox/prior/autoregressive.py

Lines 222 to 237 in 08efbbc

    
    for sample_t in get_range(range(0, sample_tokens)):
        x, cond = self.get_emb(sample_t, n_samples, x, x_cond, y_cond)
        self.transformer.check_cache(n_samples, sample_t, fp16)
        x = self.transformer(x, encoder_kv=encoder_kv, sample=True, fp16=fp16) # Transformer
        if self.add_cond_after_transformer:
            x = x + cond
        assert x.shape == (n_samples, 1, self.width)
        x = self.x_out(x) # Predictions
        if get_preds:
            preds.append(x.clone())
        # Adjust logits
        x = x / temp
        x = filter_logits(x, top_k=top_k, top_p=top_p)
        x = t.distributions.Categorical(logits=x).sample() # Sample and replace x
        assert x.shape == (n_samples, 1)
        xs.append(x.clone())

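Even the self.get_emb(sample_t, n_samples, x, x_cond, y_cond) call doesn't retain the previous tokens. It only embeds the most recently sampled token (or the start token when sample_t == 0) and adds the positional embedding and conditioning for the current position.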

jukebox/jukebox/prior/autoregressive.py

Lines 177 to 197 in 08efbbc

    
    def get_emb(self, sample_t, n_samples, x, x_cond, y_cond):
        N, D = n_samples, self.input_dims
        if sample_t == 0:
            # Fill in start token
            x = t.empty(n_samples, 1, self.width).cuda()
            if self.y_cond:
                x[:, 0] = y_cond.view(N, self.width)
            else:
                x[:, 0] = self.start_token
        else:
            assert isinstance(x, t.cuda.LongTensor)
            assert (0 <= x).all() and (x < self.bins).all()
            x = self.x_emb(x)
        assert x.shape == (n_samples, 1, self.width)
        if x_cond.shape == (N, D, self.width):
            cond = x_cond[:, sample_t:sample_t + 1, :]
        else:
            cond = x_cond
        x = x + self.pos_emb()[sample_t:sample_t + 1] + cond  # Pos emb, dropout is identity at eval time
        assert x.shape == (n_samples, 1, self.width)
        return x, cond
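
One possibility, which the excerpts above do not confirm, is that the transformer keeps keys and values from earlier steps in an internal cache; the `check_cache` call and the `sample=True` argument hint at something like this. The sketch below is a toy illustration of that general idea, not Jukebox's implementation: `CachedAttention`, `attend`, and `full_causal_attention` are hypothetical names, and the projections are the identity for simplicity. It shows that feeding one embedding per call can still give the same result as attending over the full prefix.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Softmax attention of a single query over the given keys and values."""
    scores = [dot(query, k) / math.sqrt(len(query)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / z for d in range(dim)]


class CachedAttention:
    """Toy single-head attention that remembers keys/values across calls."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, x):
        # Identity projections (q = k = v = x) keep the example short.
        self.keys.append(x)
        self.values.append(x)
        return attend(x, self.keys, self.values)


def full_causal_attention(xs):
    """Reference: recompute each position from its full prefix."""
    return [attend(xs[i], xs[:i + 1], xs[:i + 1]) for i in range(len(xs))]


embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
layer = CachedAttention()
incremental = [layer.step(x) for x in embeddings]  # one embedding per call
reference = full_causal_attention(embeddings)
matches = all(
    math.isclose(a, b)
    for row_a, row_b in zip(incremental, reference)
    for a, b in zip(row_a, row_b)
)
```

If something like this is happening, the previous tokens would enter through the cached state rather than through the `x` argument, but I could not verify that from the code shown here.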

Could someone explain how the model generates tokens autoregressively, and where the previously generated tokens are used as input?

Thanks!
