[Bug] VITS gpu utilization #3710

Open
maryawwm opened this issue Apr 28, 2024 · 0 comments
Labels
bug Something isn't working


Describe the bug

I'm training a VITS model (Persian and English). My dataset consists of audio clips from 1 to 25 s long. I'm training on an A100 GPU, but most of the time GPU memory usage is below half of capacity and GPU utilization is lower than I expected.
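
One possible contributor, offered here only as a hypothesis, is that batches are padded to their longest clip, so with clips from 1 to 25 s the amount of real audio per batch can vary a lot. The sketch below uses made-up durations (not taken from the actual dataset) and a hypothetical helper `padding_efficiency` to show how much of a padded batch is real audio:

```python
def padding_efficiency(durations_s):
    """Fraction of a padded batch that holds real audio rather than padding."""
    longest = max(durations_s)
    return sum(durations_s) / (longest * len(durations_s))


# Hypothetical batches; the real dataset's duration distribution is unknown.
mixed_batch = [1.0, 3.0, 5.0, 25.0]
similar_batch = [22.0, 24.0, 25.0, 25.0]

print(f"mixed lengths:   {padding_efficiency(mixed_batch):.2f}")
print(f"similar lengths: {padding_efficiency(similar_batch):.2f}")
```

If the real batches look more like the mixed case, peak memory would be set by the rare long batches while most steps use far less.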

Screenshot 2024-04-28 091235

To Reproduce

I modified my code based on this script from the Coqui TTS library:

https://github.com/coqui-ai/TTS/blob/dev/recipes/multilingual/vits_tts/train_vits_tts_phonemes.py

These are the parameters I set:
audio_config = VitsAudioConfig(
    sample_rate=16000,
    win_length=1024,
    hop_length=256,
    num_mels=80,
    mel_fmin=0,
    mel_fmax=None,
)

vitsArgs = VitsArgs(
    use_language_embedding=True,
    embedded_language_dim=2,
    use_speaker_embedding=True,
    use_sdp=False,
)

config = VitsConfig(
    model_args=vitsArgs,
    audio=audio_config,
    run_name="A6_vits_multi_language_10_spk_5_ordibehesht",
    use_speaker_embedding=True,
    batch_size=48,
    eval_batch_size=32,
    batch_group_size=128,
    num_loader_workers=12,
    num_eval_loader_workers=8,
    precompute_num_workers=12,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1000,
    text_cleaner="multilingual_cleaners",
    use_phonemes=True,
    phoneme_language=None,
    phonemizer="multi_phonemizer",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    compute_input_seq_cache=True,
    print_step=25,
    use_language_weighted_sampler=True,
    print_eval=False,
    mixed_precision=True,
    output_path=output_path,
    datasets=dataset_config,
    cudnn_enable=True,
    cudnn_benchmark=True,
    cudnn_deterministic=True,
)
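
To put the audio settings in perspective, here is a rough calculation of how many spectrogram frames a clip produces at `sample_rate=16000` and `hop_length=256`. The helper `spectrogram_frames` is hypothetical and ignores any edge padding the library may apply, so treat the numbers as approximate:

```python
SAMPLE_RATE = 16000  # sample_rate from the config above
HOP_LENGTH = 256     # hop_length from the config above
BATCH_SIZE = 48      # batch_size from the config above


def spectrogram_frames(duration_s, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Approximate frame count for a clip (edge padding is ignored)."""
    return int(duration_s * sample_rate) // hop_length


for duration in (1, 25):
    frames = spectrogram_frames(duration)
    print(f"{duration:>2} s clip: {frames} frames, "
          f"{frames * BATCH_SIZE} frames if all {BATCH_SIZE} items had this length")
```

A batch of 25 s clips is roughly 25 times larger than a batch of 1 s clips, which may help explain why memory use stays low on most steps.
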

Expected behavior

Higher GPU utilization and faster training.

Logs

Log output from one training step:

   --> TIME: 2024-04-27 09:15:52 -- STEP: 124/3006 -- GLOBAL_STEP: 1750125
     | > loss_disc: 2.7141058444976807  (2.7415779617524914)
     | > loss_disc_real_0: 0.2915174067020416  (0.22191733380238854)
     | > loss_disc_real_1: 0.2596714198589325  (0.2545961029827594)
     | > loss_disc_real_2: 0.25090914964675903  (0.2519173812601836)
     | > loss_disc_real_3: 0.2509034276008606  (0.2488831561659612)
     | > loss_disc_real_4: 0.2618330121040344  (0.24871416005396074)
     | > loss_disc_real_5: 0.23049794137477875  (0.2413994044726414)
     | > loss_0: 2.7141058444976807  (2.7415779617524914)
     | > grad_norm_0: tensor(2.3359, device='cuda:0')  (tensor(4.0910, device='cuda:0'))
     | > loss_gen: 1.8159717321395874  (1.9762149626208896)
     | > loss_kl: 5.008370399475098  (42.11719334894611)
     | > loss_feat: 1.7703579664230347  (2.0269679972721693)
     | > loss_mel: 30.50223731994629  (41.7430907526324)
     | > loss_duration: 9.647953033447266  (2.5745641668477357)
     | > amp_scaler: 256.0  (509.9354838709682)
     | > loss_1: 48.74489212036133  (90.438032304087)
     | > grad_norm_1: tensor(73.7072, device='cuda:0')  (tensor(215.5241, device='cuda:0'))
     | > current_lr_0: 0.0002 
     | > current_lr_1: 0.0002 
     | > step_time: 5.8922  (3.467874986510123)
     | > loader_time: 0.006  (0.005929248948251048)
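
The two timing lines can be compared directly. The sketch below parses them with a hypothetical helper `parse_metrics` and assumes that `loader_time` is the time spent waiting for a batch and that the value in parentheses is a running average; neither assumption is confirmed in this issue:

```python
import re

# Two lines copied from the step log above.
LOG = """
     | > step_time: 5.8922  (3.467874986510123)
     | > loader_time: 0.006  (0.005929248948251048)
"""

BATCH_SIZE = 48  # batch_size from the config in this issue


def parse_metrics(log_text):
    """Return {name: (current, value_in_parentheses)} for plain float metrics."""
    pattern = re.compile(r"\|\s*>\s*(\w+):\s*([\d.eE+-]+)\s+\(([\d.eE+-]+)\)")
    return {
        m.group(1): (float(m.group(2)), float(m.group(3)))
        for m in pattern.finditer(log_text)
    }


metrics = parse_metrics(LOG)
avg_step = metrics["step_time"][1]
avg_loader = metrics["loader_time"][1]

loader_share = avg_loader / avg_step
samples_per_second = BATCH_SIZE / avg_step
print(f"loader share of step time: {loader_share:.2%}")
print(f"throughput: {samples_per_second:.1f} samples/s")
```

Under those assumptions the loader accounts for well under 1% of each step, which would suggest the time is going into the training step itself rather than into data loading.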

Environment

- TTS version : 0.17.8
- python : 3.9.18
- pytorch : 2.1.1
- os : Linux
- gpu : A100

Additional context

No response

@maryawwm added the bug label on Apr 28, 2024