
differences in operation during training and inference #20

Open
KonstantinLihota opened this issue May 8, 2024 · 3 comments


KonstantinLihota commented May 8, 2024

Hello!

I'm studying your code on GitHub and have a question that I can't fully resolve: why might predictions on the same image differ between training and inference? It would be great if you could explain the differences in operation during training and inference, especially in the context of the functions decoder_forward_dynamic and decoder_forward in the decoder, as well as the training and inference variants of points_queris_embed in BasePETCount.

I would appreciate your clarification!

Best regards,
Konstantin

cxliu0 (Owner) commented May 9, 2024

During training, we generate the whole point-query quadtree, because we need to compute the loss to supervise it. During testing, we dynamically construct the point-query quadtree, i.e., we use sparse point queries in sparse regions and dense point queries in dense regions. This operation is meant to accelerate inference. Technically, one can use the same function as in training to do inference.
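To make the idea concrete, here is a minimal sketch of dynamic query construction at inference time. This is not the repo's actual implementation; the function names, the binary split map convention (1 = dense, 0 = sparse), and the stride values are illustrative assumptions.

```python
import numpy as np

def make_queries(h, w, stride):
    # Illustrative helper: place one point query every `stride` pixels on a regular grid.
    ys, xs = np.meshgrid(np.arange(0, h, stride),
                         np.arange(0, w, stride), indexing="ij")
    return np.stack([ys.ravel(), xs.ravel()], axis=1)

def inference_queries(split_map, sparse_stride=8, dense_stride=4):
    # split_map: (H, W) binary array, 1 = dense region, 0 = sparse region.
    h, w = split_map.shape
    sparse = make_queries(h, w, sparse_stride)
    dense = make_queries(h, w, dense_stride)
    # Keep sparse queries only in sparse regions, dense queries only in dense regions,
    # so each query set does inference only where it is responsible.
    sparse = sparse[split_map[sparse[:, 0], sparse[:, 1]] == 0]
    dense = dense[split_map[dense[:, 0], dense[:, 1]] == 1]
    return sparse, dense
```

During training, by contrast, both full grids would be kept everywhere so that every query can be supervised by the loss.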

KonstantinLihota (Author) commented

In that case, why can the predictions for the same patch differ between inference and training?

cxliu0 (Owner) commented May 16, 2024

To be more specific, we use the split map (Figure 4 in the paper) to categorize sparse and dense regions, where sparse/dense point queries are responsible for object prediction in sparse/dense regions, respectively.

Regarding "one can use the same function as in training to do inference": I mean one can run both the sparse and dense point queries over the whole image, and then use the split map to select the corresponding predictions in sparse and dense regions. This operation is relatively computationally expensive.
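The selection step described above can be sketched as follows. This is an illustrative assumption about the mechanism, not code from the repo: predictions are modeled as (y, x, score) points, and the split map convention (1 = dense, 0 = sparse) is hypothetical.

```python
import numpy as np

def select_predictions(sparse_preds, dense_preds, split_map):
    # sparse_preds / dense_preds: lists of (y, x, score) predicted over the whole image.
    # Keep each prediction only if it falls in the region its query type is responsible for.
    kept = []
    for y, x, score in sparse_preds:
        if split_map[int(y), int(x)] == 0:  # sparse query valid only in sparse regions
            kept.append((y, x, score))
    for y, x, score in dense_preds:
        if split_map[int(y), int(x)] == 1:  # dense query valid only in dense regions
            kept.append((y, x, score))
    return kept
```

The cost comes from running both query sets everywhere and discarding half the work afterwards, which is why the dynamic quadtree construction below is the more convenient route.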

A more convenient way, which is the one presented in this repo, is to dynamically construct the point-query quadtree at inference time. This ensures that sparse/dense point queries only do inference in sparse/dense regions, respectively.
