Does "use_shuffle" actually work? [spoiler] NOPE! [/spoiler] #799

Open
SA-j00u opened this issue May 10, 2024 · 5 comments

Comments

@SA-j00u

SA-j00u commented May 10, 2024

I can't finish 1 epoch in a single run without --auto_resume,
so "use_shuffle" is critical for me.
I need to make sure that it works
and that training doesn't just process the same first files every time.

But I can't even find where the "use_shuffle" argument is parsed (I searched ALL the Python libs),
so I can't check whether the RNG is initialized with a random seed at startup.
There is also no printing of the input files.

@SA-j00u SA-j00u changed the title does "use_shuffle" actualy works Does "use_shuffle" actualy works? May 10, 2024
@SA-j00u

SA-j00u commented May 10, 2024

I checked it on a dataset with missing files,
and it looks like it doesn't work.
The RNG itself works,
but the starting seed is always the same (usually it is initialized from the current time),
so after every --auto_resume it processes the same files!

I made the 1st file missing,
and over several runs it "crashed" on the 6th iteration.
Then I saved at iteration 2,
and over several resumes it "crashed" on the 8th iteration.
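
To illustrate what "same starting seed" means for the file order, here is a minimal sketch using only Python's `random` module. It is not basicsr code; `order_for_seed` is a hypothetical helper and the seed value is made up.

```python
import random

def order_for_seed(num_files, seed):
    """Return a shuffled list of file indices that depends only on `seed`."""
    rng = random.Random(seed)  # private generator, seeded explicitly
    order = list(range(num_files))
    rng.shuffle(order)
    return order

# Two "runs" that start from the same seed replay the same shuffled order.
first_run = order_for_seed(10, seed=0)
resumed_run = order_for_seed(10, seed=0)
print(first_run == resumed_run)  # True: same seed, same permutation
```

So shuffling can be active and still visit the same files first after every restart, as long as the seed does not change between restarts.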

@SA-j00u

SA-j00u commented May 10, 2024

I added print(filepath) to def get(self, filepath): and def get_text(self, filepath):
in basicsr\utils\file_client.py

Output of 3 runs, the third one after 1 resume:

HQ\0004.png     HQ\0004.png     HQ\0004.png
LQ\0004.png     LQ\0004.png     LQ\0004.png
HQ\0001.png     HQ\0001.png     HQ\0001.png
LQ\0001.png     LQ\0001.png     LQ\0001.png
HQ\0007.png     HQ\0007.png     HQ\0007.png
LQ\0007.png     LQ\0007.png     LQ\0007.png
iter:       3   iter:       3   iter:      11
HQ\0005.png     HQ\0005.png     HQ\0005.png
LQ\0005.png     LQ\0005.png     LQ\0005.png
iter:       4   iter:       4   iter:      12
HQ\0003.png     HQ\0003.png     HQ\0003.png
LQ\0003.png     LQ\0003.png     LQ\0003.png
iter:       5   iter:       5   iter:      13
                INFO: Saving models and training states.
HQ\0009.png     HQ\0009.png     HQ\0009.png
LQ\0009.png     LQ\0009.png     LQ\0009.png
iter:       6   iter:       6   iter:      14
HQ\0000.png     HQ\0000.png     HQ\0000.png
LQ\0000.png     LQ\0000.png     LQ\0000.png
iter:       7   iter:       7   iter:      15
                                INFO: Saving models and training states.
HQ\0008.png     HQ\0008.png     HQ\0008.png
LQ\0008.png     LQ\0008.png     LQ\0008.png
iter:       8   iter:       8   iter:      16
HQ\0006.png     HQ\0006.png     HQ\0006.png
LQ\0006.png     LQ\0006.png     LQ\0006.png
iter:       9   iter:       9   iter:      17
HQ\0002.png     HQ\0002.png     HQ\0002.png
LQ\0002.png     LQ\0002.png     LQ\0002.png
iter:      10   iter:      10
                INFO: Saving models and training states.
                iter:      11
                iter:      12
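
For anyone who wants to compare such logs without reading them by eye, a small sketch: it counts how many leading entries two load orders share. `common_prefix_length` is a hypothetical helper, and the list is the HQ order copied from the log above.

```python
def common_prefix_length(run_a, run_b):
    """Count how many leading entries two load orders have in common."""
    count = 0
    for a, b in zip(run_a, run_b):
        if a != b:
            break
        count += 1
    return count

# HQ load order copied from the log above; all three columns list this order.
logged_order = ["0004", "0001", "0007", "0005", "0003",
                "0009", "0000", "0008", "0006", "0002"]
run_1 = list(logged_order)
run_3 = list(logged_order)  # the resumed run
print(common_prefix_length(run_1, run_3))  # 10: every file is in the same position
```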

@SA-j00u

SA-j00u commented May 10, 2024

There are only 2 places in basicsr that call random.seed(seed),
and the seed itself comes from random 🤦‍♂️

    # random seed
    seed = opt.get('manual_seed')
    if seed is None:
        seed = random.randint(1, 10000)
        opt['manual_seed'] = seed
    set_random_seed(seed + opt['rank'])

def set_random_seed(seed):
    """Set random seeds."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

def worker_init_fn(worker_id, num_workers, rank, seed):
    # Set the worker seed to num_workers * rank + worker_id + seed
    worker_seed = num_workers * rank + worker_id + seed
    np.random.seed(worker_seed)
    random.seed(worker_seed)

@SA-j00u

SA-j00u commented May 11, 2024

I tried putting the random.seed init in different places,
but I can't change the file order at all...
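
A likely reason, judging by the sampler code in the next comment: the order comes from a separate generator object with its own seed, so reseeding the global `random` module never reaches it. A minimal sketch of that effect, with `random.Random` as a stand-in for `torch.Generator` (an assumption made so the example needs no dependencies; `sampler_order` is hypothetical):

```python
import random

def sampler_order(num_items, epoch):
    """Shuffle with a private generator seeded by `epoch` only."""
    g = random.Random(epoch)  # stand-in for a generator seeded with the epoch
    order = list(range(num_items))
    g.shuffle(order)
    return order

random.seed(1234)  # seed the global generator with a fixed value
order_a = sampler_order(10, epoch=0)

random.seed()      # reseed the global generator from OS entropy
order_b = sampler_order(10, epoch=0)

print(order_a == order_b)  # True: the private generator ignores the global seed
```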

@SA-j00u

SA-j00u commented May 11, 2024

fix?
basicsr\data\data_sampler.py

    def __iter__(self):
        # deterministically shuffle based on epoch
        g = torch.Generator()
        # EPIC FAIL
        # g.manual_seed(self.epoch)
        import random
        random.seed(a=None, version=2)
        g.manual_seed(random.randint(1, 2147483647))
        indices = torch.randperm(self.total_size, generator=g).tolist()

So random clip may not work correctly either.

So I spent 10 days
training on the same first files...
(and got strange results)
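
One caveat about the patch above: it draws a fresh seed in every process. If a sampler gives each rank a strided slice of one shared permutation (an assumption here, though distributed samplers commonly do this), then with multi-GPU training the ranks would shuffle differently and could load overlapping files. A possible alternative, sketched with `random.Random` in place of `torch.Generator`, is to derive the seed from values every rank already shares. `shared_shuffle_seed`, `base_seed` and `start_iter` are hypothetical names, not basicsr API.

```python
import random

def shared_shuffle_seed(base_seed, epoch, start_iter):
    """Mix values known to every rank into one shuffle seed (hypothetical helper)."""
    # The modulus keeps the result inside the positive 32-bit integer range.
    return (base_seed * 1_000_003 + epoch * 10_007 + start_iter) % (2**31 - 1)

def rank_indices(total_size, num_replicas, rank, seed):
    """Shuffle with the shared seed, then take this rank's strided slice."""
    rng = random.Random(seed)
    indices = list(range(total_size))
    rng.shuffle(indices)
    return indices[rank:total_size:num_replicas]

# A fresh run and a run resumed at iteration 5 get different seeds.
fresh_seed = shared_shuffle_seed(base_seed=42, epoch=0, start_iter=0)
resumed_seed = shared_shuffle_seed(base_seed=42, epoch=0, start_iter=5)

# Both ranks use the same seed, so together they cover every index exactly once.
rank0 = rank_indices(10, num_replicas=2, rank=0, seed=resumed_seed)
rank1 = rank_indices(10, num_replicas=2, rank=1, seed=resumed_seed)
```

With this scheme, resuming twice from the same checkpoint still replays the same order, but resuming from a later checkpoint changes it, and the run stays reproducible.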

@SA-j00u SA-j00u changed the title Does "use_shuffle" actualy works? Does "use_shuffle" actualy works? [spoiler] NOPE! [/spoiler] May 11, 2024