Describe the bug
I'm training a VITS model (Persian and English). My dataset consists of audio clips from 1 to 25 s long. I'm training on an A100 GPU, but most of the time GPU memory usage is below half and GPU utilization is lower than I expect.
To Reproduce
I modified my code based on this script in the Coqui TTS library:
https://github.com/coqui-ai/TTS/blob/dev/recipes/multilingual/vits_tts/train_vits_tts_phonemes.py
These are the parameters that I set:
import os

from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.models.vits import VitsArgs, VitsAudioConfig

# output_path and dataset_config are defined elsewhere in the script (not shown).

# Audio front-end settings.
audio_config = VitsAudioConfig(
    sample_rate=16000,
    win_length=1024,
    hop_length=256,
    num_mels=80,
    mel_fmin=0,
    mel_fmax=None,
)

# Model arguments: language and speaker embeddings on, stochastic duration predictor off.
vitsArgs = VitsArgs(
    use_language_embedding=True,
    embedded_language_dim=2,
    use_speaker_embedding=True,
    use_sdp=False,
)

# Training configuration.
config = VitsConfig(
    model_args=vitsArgs,
    audio=audio_config,
    run_name="A6_vits_multi_language_10_spk_5_ordibehesht",
    use_speaker_embedding=True,
    batch_size=48,
    eval_batch_size=32,
    batch_group_size=128,
    num_loader_workers=12,
    num_eval_loader_workers=8,
    precompute_num_workers=12,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1000,
    text_cleaner="multilingual_cleaners",
    use_phonemes=True,
    phoneme_language=None,
    phonemizer="multi_phonemizer",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    compute_input_seq_cache=True,
    print_step=25,
    use_language_weighted_sampler=True,
    print_eval=False,
    mixed_precision=True,
    output_path=output_path,
    datasets=dataset_config,
    cudnn_enable=True,
    cudnn_benchmark=True,
    cudnn_deterministic=True,
)
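To make the clip-length spread concrete, here is a rough sketch of how many spectrogram frames a clip produces with the audio settings above (sample_rate=16000, hop_length=256) and how much of a padded batch could be padding. The helper names `num_frames` and `padding_fraction` are hypothetical and not part of Coqui TTS, and the frame count is only an approximation of what the library computes. It also assumes every clip in a batch is padded to the longest one, which may not match how the loader actually groups samples.

```python
import math
from typing import Sequence

SAMPLE_RATE = 16000  # from audio_config above
HOP_LENGTH = 256     # from audio_config above


def num_frames(duration_s: float) -> int:
    """Approximate number of spectrogram frames for a clip of this duration."""
    return math.ceil(duration_s * SAMPLE_RATE / HOP_LENGTH)


def padding_fraction(durations_s: Sequence[float]) -> float:
    """Fraction of a batch that is padding, assuming padding to the longest clip."""
    frames = [num_frames(d) for d in durations_s]
    padded_total = max(frames) * len(frames)
    return 1.0 - sum(frames) / padded_total


if __name__ == "__main__":
    print(num_frames(1.0), num_frames(25.0))
    print(round(padding_fraction([1.0, 25.0]), 3))
```

If clips of very different lengths end up in the same batch, a large share of each batch may be padding, which could be one reason (unconfirmed) for low memory use at batch_size=48.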
Expected behavior
Higher GPU utilization and faster training.
Logs
Environment
Additional context
No response