PHash doesn't work #214

Open · RMobile17 opened this issue Mar 13, 2024 · 4 comments

@RMobile17 commented Mar 13, 2024
My code:

for image_dir in image_dir_list:
    # Check whether image_dir exists and is a directory
    if not image_dir.exists() or not image_dir.is_dir():
        print(f"The directory {image_dir} does not exist or is not a directory.")
        continue

    # Create the "remove" subfolder if it does not exist yet
    remove_dir = image_dir / "remove"
    if not remove_dir.exists():
        remove_dir.mkdir()

    phasher = PHash()

    # Find duplicates using the generated encodings
    duplicates = phasher.find_duplicates(image_dir=image_dir)

Error output:

2024-03-13 11:04:10,376: INFO Start: Calculating hashes...

  0%|          | 0/2 [00:00<?, ?it/s]

2024-03-13 11:04:17,839: INFO Start: Calculating hashes...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\rr004\eclipse-workspace-2023\ParseHTML\duplicate_phash.py", line 59, in <module>
    move_duplicates_to_remove(image_dir_list)
  File "C:\Users\rr004\eclipse-workspace-2023\ParseHTML\duplicate_phash.py", line 22, in move_duplicates_to_remove
    duplicates = phasher.find_duplicates(image_dir=image_dir)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\site-packages\imagededup\methods\hashing.py", line 303, in find_duplicates
    result = self._find_duplicates_dir(
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\site-packages\imagededup\methods\hashing.py", line 363, in _find_duplicates_dir
    encoding_map = self.encode_images(image_dir, recursive=recursive, num_enc_workers=num_enc_workers)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\site-packages\imagededup\methods\hashing.py", line 161, in encode_images
    hashes = parallelise(function=self.encode_image, data=files, verbose=self.verbose, num_workers=num_enc_workers)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\site-packages\imagededup\utils\general_utils.py", line 65, in parallelise
    pool = Pool(processes=num_workers)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 215, in __init__
    self._repopulate_pool()
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\rr004\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

(The same "Calculating hashes..." INFO line and RuntimeError traceback repeat, interleaved, once for every worker process the pool tries to spawn.)
@SWHL commented Mar 14, 2024

I ran into the same problem.
Python: 3.10.13
imagededup: 0.3.1
OS: macOS

@RMobile17 (Author)

I'm on Windows 10.

@oohtmeel1 commented Mar 29, 2024

Not sure what happened, but this is one of the reported matches, followed by the two images from the directory:

'cat (10023).jpg': ['cat (413).jpg'],

[image: cat (10023)]
[image: cat (413)]
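
If you want to sanity-check a reported pair like this, you can compare the two hashes directly. A minimal sketch, assuming imagededup's encode_image(image_file=...) API; the file paths are placeholders, and the Hamming distance is computed by hand from the hex strings rather than through any library helper:

from imagededup.methods import PHash

phasher = PHash()

# Placeholder paths -- point these at the two matched files
hash1 = phasher.encode_image(image_file='cat (10023).jpg')
hash2 = phasher.encode_image(image_file='cat (413).jpg')

# Hamming distance between the two 64-bit perceptual hashes,
# decoded from their hex strings (0 = identical, 64 = opposite)
distance = bin(int(hash1, 16) ^ int(hash2, 16)).count('1')
print(hash1, hash2, distance)

find_duplicates reports a pair as duplicates when this distance is at or below max_distance_threshold (10 by default), so lowering that threshold should drop borderline matches like the one above.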

@ltskinner
Copy link

ltskinner commented May 23, 2024

The solution for me:

from imagededup.methods import PHash


#     vv this is the solution vv
if __name__ == '__main__':
    # ^^ this is the solution ^^

    phasher = PHash()
    encodings = phasher.encode_images(
        image_dir='path/to/image/directory',
        num_enc_workers=0  # https://github.com/idealo/imagededup/blob/master/imagededup/methods/hashing.py#L141C171-L142C1
    )
    ...

It has to do with the multiprocessing happening under the hood: on Windows (and on recent Python versions on macOS) child processes are started with the spawn method, which re-imports your main module. Without the if __name__ == '__main__': guard, that re-import re-executes your top-level code and tries to spawn workers again, which is exactly the RuntimeError above. Passing num_enc_workers=0 (as in the snippet) avoids the worker pool altogether.
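
Applied to the loop from the original post, the fix looks roughly like this. A minimal sketch, assuming image_dir_list holds pathlib.Path objects as in the issue; the __main__ guard and num_enc_workers=0 (which find_duplicates forwards to encode_images, per the traceback) are the two changes, and the example list at the bottom is a placeholder:

from pathlib import Path

from imagededup.methods import PHash


def move_duplicates_to_remove(image_dir_list):
    phasher = PHash()  # create once, reuse for every directory
    for image_dir in image_dir_list:
        # Skip anything that is not an existing directory
        if not image_dir.exists() or not image_dir.is_dir():
            print(f"The directory {image_dir} does not exist or is not a directory.")
            continue

        # Create the "remove" subfolder if it does not exist yet
        remove_dir = image_dir / "remove"
        remove_dir.mkdir(exist_ok=True)

        # num_enc_workers=0 disables the worker pool, so hashing runs in
        # this process; with workers > 0 the guard below is what matters.
        duplicates = phasher.find_duplicates(image_dir=image_dir,
                                             num_enc_workers=0)
        ...


if __name__ == '__main__':
    image_dir_list = [Path('path/to/image/directory')]  # placeholder
    move_duplicates_to_remove(image_dir_list)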
