Thank you very much for sharing your work. I have a question that I have never been able to figure out: for models that use mel spectrograms, I understand how the mel features are aligned with each video frame. But for models that use wav2vec2, how are the audio features aligned with each video frame?
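For context, the mel-to-frame alignment the question takes as given usually works by slicing a fixed window of mel steps per video frame. A minimal sketch, assuming a 10 ms mel hop and 25 fps video (the function name and parameters are illustrative, not from the repository):

```python
import numpy as np

def mel_window_for_frame(mel: np.ndarray, frame_idx: int, fps: float = 25.0,
                         mel_step_ms: float = 10.0, win_len: int = 16) -> np.ndarray:
    """Take the mel-spectrogram window aligned to one video frame.

    With a 10 ms hop, 1 s of audio yields 100 mel steps; at 25 fps each
    video frame therefore advances 100 / 25 = 4 mel steps, so frame i
    starts at mel index i * (1000 / fps) / mel_step_ms.
    """
    mel_per_sec = 1000.0 / mel_step_ms          # 100 mel steps per second
    start = int(frame_idx * mel_per_sec / fps)  # 4 * frame_idx at 25 fps
    return mel[start:start + win_len]           # (win_len, n_mels) slice
```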
#131 Thanks for your interest. You can refer to the code we published in that issue: an audio_encoder is used to convert the audio features into chunks that correspond to the video frames.
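The chunking the reply describes can be sketched as resampling the wav2vec2 feature timeline onto the video-frame timeline. This is a hedged illustration, not the repository's actual audio_encoder: it assumes wav2vec2's usual 20 ms stride (50 features per second for 16 kHz audio) and interpolates one feature vector per frame:

```python
import numpy as np

def align_wav2vec2_to_frames(audio_feats: np.ndarray, num_frames: int) -> np.ndarray:
    """Resample a wav2vec2 feature sequence (T_audio, D) onto num_frames
    video frames by linear interpolation along the time axis.

    wav2vec2 emits one feature every 20 ms (50 Hz for 16 kHz audio), so at
    25 fps each video frame covers roughly 2 audio features; this helper
    maps each frame index to a fractional audio index and blends the two
    nearest feature vectors.
    """
    t_audio, _ = audio_feats.shape
    # Fractional position of each video frame on the audio-feature axis.
    positions = np.linspace(0, t_audio - 1, num_frames)
    lo = np.floor(positions).astype(int)
    hi = np.minimum(lo + 1, t_audio - 1)
    w = (positions - lo)[:, None]
    return (1 - w) * audio_feats[lo] + w * audio_feats[hi]
```

In practice a learned audio_encoder would do something richer than interpolation (e.g. strided convolutions over the feature sequence), but the input/output shapes follow the same frame-per-chunk correspondence.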
Thanks! After reading the code snippet you provided in issue #131, I have one more question: I noticed that all of the data seems to share the same neutral_face. How was this neutral_face obtained?