Habitat-Lab and Habitat-Sim versions
Habitat-Lab: master
Habitat-Sim: master
❓ Questions and Help
I notice that in both the original EQA paper (Embodied Question Answering, Das et al., CVPR 2018) and the paper for the dataset used in this task (Embodied Question Answering in Photorealistic Environments with Point Cloud Perception, Wijmans et al., CVPR 2019), the VQA evaluation seems to be based on observations from the navigation module, so results are reported according to how many actions away from the question target the agent is.
However, in the provided implementation the VQA part seems to be independent of the navigation part. My questions are: Where do the "last 5 frames" in the VQA part come from? Are they taken from the shortest path or from HumanNav? And how can I use the observations from navigation as the input to the VQA part?
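To make the last question concrete, here is a minimal sketch of what I have in mind: run navigation, keep only the final 5 observations, and pass those to the VQA model. The function `last_k_frames` and the string frames are hypothetical stand-ins, not part of Habitat-Lab, and padding a short rollout by repeating its first frame is my own assumption.

```python
from collections import deque
from typing import Any, Iterable, List


def last_k_frames(frames: Iterable[Any], k: int = 5) -> List[Any]:
    """Return the final k frames of a navigation rollout, oldest first.

    If the rollout is shorter than k, the first frame is repeated at the
    front so the result always has exactly k entries (an assumption; the
    papers may handle short rollouts differently).
    """
    buffer = deque(maxlen=k)  # older frames are dropped automatically
    for frame in frames:
        buffer.append(frame)
    if not buffer:
        raise ValueError("rollout produced no frames")
    padding = [buffer[0]] * (k - len(buffer))
    return padding + list(buffer)


# Stand-in for RGB observations collected while the navigation policy runs.
rollout = [f"frame_{t}" for t in range(8)]
vqa_input = last_k_frames(rollout, k=5)
```

In this sketch `vqa_input` would replace whatever frames the VQA part currently loads, which is the step I do not know how to do in the provided code.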
Thank you for the help.