When I modify the training classes, there are some errors while training (KeyError: 'cat' if self.cache and self.cache_type == "ram":) #1756

Open
STRIVESS opened this issue Feb 23, 2024 · 0 comments


A few months ago, I successfully trained on my custom data with 6 classes and wrote a blog post documenting the training process. Recently, I reused the same code, changing the training classes and updating some of the training photos. Unfortunately, I ran into the following errors.
(train1_errors.log)

I then used ChatGPT to look for answers and tried to resolve the issue, but I still got the same errors.

I added the following code:

# Reinitialize class_to_ind attribute with updated classes
self.target_transform.class_to_ind = {
    cls: idx for idx, cls in enumerate(VOC_CLASSES)
}

after this line:

self.num_imgs = len(self.ids)

The complete code is below:

class VOCDetection(CacheDataset):  
    def __init__(
        self,
        data_dir,
        image_sets=[("2007", "trainval")],
        img_size=(416, 416),
        preproc=None,
        target_transform=AnnotationTransform(),
        dataset_name="VOC0712",
        cache=False,
        cache_type="ram",
    ):
        self.root = data_dir
        self.image_set = image_sets
        self.img_size = img_size
        self.preproc = preproc
        self.target_transform = target_transform
        self.name = dataset_name
        self._annopath = os.path.join("%s", "Annotations", "%s.xml")
        self._imgpath = os.path.join("%s", "JPEGImages", "%s.jpg")
        self._classes = VOC_CLASSES
        self.cats = [
            {"id": idx, "name": val} for idx, val in enumerate(VOC_CLASSES)
        ]
        self.class_ids = list(range(len(VOC_CLASSES)))
        self.ids = list()
        for (year, name) in image_sets:
            self._year = year
            rootpath = os.path.join(self.root, "VOC" + year)
            for line in open(
                os.path.join(rootpath, "ImageSets", "Main", name + ".txt")
            ):
                self.ids.append((rootpath, line.strip()))
        self.num_imgs = len(self.ids)

        # Reinitialize class_to_ind attribute with updated classes
        self.target_transform.class_to_ind = {
            cls: idx for idx, cls in enumerate(VOC_CLASSES)
        }

        self.annotations = self._load_coco_annotations()

        path_filename = [
            (self._imgpath % self.ids[i]).split(self.root + "/")[1]
            for i in range(self.num_imgs)
        ]
        super().__init__(
            input_dimension=img_size,
            num_imgs=self.num_imgs,
            data_dir=self.root,
            cache_dir_name=f"cache_{self.name}",
            path_filename=path_filename,
            cache=cache,
            cache_type=cache_type
        )
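
One way to narrow down a `KeyError: 'cat'` is to check whether the annotation XML files still contain object names that are missing from the class list. The sketch below is a hypothetical helper, not part of YOLOX: `find_unknown_labels` and the `CLASSES` tuple are names introduced here for illustration. It assumes the usual VOC layout, where each `<object>` element has a `<name>` child, and it compares names exactly after stripping whitespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumption: stand-in for the class list used for training
# (VOC_CLASSES in the code above).
CLASSES = ("ball", "person", "dog", "animal faeces")


def find_unknown_labels(annotation_dir, classes):
    """Count object names in VOC-style XML files that are not in `classes`."""
    known = {c.strip() for c in classes}
    unknown = Counter()
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        # Every <object> element is expected to carry a <name> child.
        for obj in root.iter("object"):
            name = (obj.findtext("name") or "").strip()
            if name not in known:
                unknown[name] += 1
    return unknown
```

Calling `find_unknown_labels` on the `Annotations` folder of the dataset with the current class list returns a counter of leftover labels. If it reports `cat`, the annotations and the class list are out of sync, and either the class list or the XML files would need updating.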

After analyzing the problem, I have come up with the following guesses:

  1. I previously trained in a conda-created PyTorch-GPU virtual environment in which I had installed the YOLOX training environment, so I suspect that training now loads cached content left over from the previous Anaconda installation.
  2. If I reconfigure the training classes for the next run, the new configuration may have a different number of classes than the cached content, which prevents training from working properly.
  3. Previously, I trained with six classes: {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3, 'chair': 4, 'cat': 5}. If I change it to only four classes: {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3}, training runs normally. However, when it reaches the 10th epoch, there are evaluation-related errors (train2_errors.log).
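
To test the stale-cache guesses above, a small script can list leftover cache folders under the dataset directory before retraining. This is only a sketch: it lists candidates and deletes nothing. The `cache_` prefix follows the `cache_dir_name=f"cache_{self.name}"` argument in the code above. The `annotations_cache` folder name is an assumption about where the VOC evaluation keeps its cached annotations and should be checked against the installed YOLOX version. `find_cache_dirs` is a hypothetical helper name.

```python
import os


def find_cache_dirs(data_dir, prefixes=("cache_", "annotations_cache")):
    """Return directories under `data_dir` whose name starts with one of `prefixes`."""
    prefixes = tuple(prefixes)  # str.startswith needs a str or a tuple of str
    matches = []
    for current, dirnames, _ in os.walk(data_dir):
        for name in list(dirnames):
            if name.startswith(prefixes):
                matches.append(os.path.join(current, name))
                # Do not descend into a folder that already matched.
                dirnames.remove(name)
    return sorted(matches)
```

Reviewing the returned paths by hand, and moving them aside rather than deleting them, keeps the check reversible if the cache turns out not to be the cause.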

I would like to request assistance to determine the cause of the problem and find a solution. Thank you for your help!

Currently, my plan is to create a new PyTorch-GPU virtual environment using conda and configure all the necessary dependencies. Then, I will proceed to train the new set of photos.
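
Before rebuilding the environment, it may be worth checking which copy of a package Python actually imports. Edits to a source checkout have no effect if an installed copy in `site-packages` is picked up instead. The sketch below uses only the standard library. `module_origin` is a hypothetical helper name, and whether this applies to the setup described here is an assumption.

```python
import importlib.util


def module_origin(module_name):
    """Return the file path Python would import `module_name` from, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Compare this path with the directory where the class list was edited.
print(module_origin("yolox"))
```

The package list further down reports `yolox 0.3.0` at an egg path under `site-packages`. If the printed path points there rather than at the edited source tree, reinstalling YOLOX from the modified checkout would be a cheaper first step than creating a whole new environment.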

My virtual environment is below:

Package                       Version              Editable project location
----------------------------- -------------------- -------------------------------------------------------------------------------------------------
absl-py                       1.4.0
actionlib                     1.14.0
aiofiles                      23.2.1
altair                        5.2.0
angles                        1.9.13
annotated-types               0.6.0
antlr4-python3-runtime        4.9.3
anyio                         4.2.0
apex                          0.1
astunparse                    1.6.3
attrs                         23.1.0
autopep8                      2.0.2
beautifulsoup4                4.12.2
bidict                        0.22.1
bondpy                        1.8.6
cachetools                    5.3.1
camera-calibration            1.17.0
camera-calibration-parsers    1.12.0
catkin                        0.8.10
certifi                       2023.7.22
charset-normalizer            3.2.0
click                         8.1.7
cmake                         3.27.0
colorama                      0.4.6
coloredlogs                   15.0.1
contourpy                     1.1.0
controller-manager            0.19.6
controller-manager-msgs       0.19.6
controlnet-aux                0.0.3
cv-bridge                     1.16.2
cycler                        0.11.0
Cython                        3.0.2
diagnostic-analysis           1.11.0
diagnostic-common-diagnostics 1.11.0
diagnostic-updater            1.11.0
diffusers                     0.16.1
dynamic-reconfigure           1.7.3
einops                        0.7.0
exceptiongroup                1.1.2
expecttest                    0.1.4
fastapi                       0.109.0
ffmpy                         0.3.1
filelock                      3.12.2
Flask                         2.2.3
Flask-Cors                    4.0.0
Flask-SocketIO                5.3.6
flaskwebgui                   0.3.5
flatbuffers                   23.5.26
fonttools                     4.41.1
fsspec                        2023.6.0
gazebo_plugins                2.9.2
gazebo_ros                    2.9.2
gencpp                        0.7.0
geneus                        3.0.0
genlisp                       0.4.18
genmsg                        0.6.0
gennodejs                     2.0.2
genpy                         0.6.15
gitdb                         4.0.10
GitPython                     3.1.32
google-auth                   2.22.0
google-auth-oauthlib          1.0.0
gradio                        4.16.0
gradio_client                 0.8.1
grpcio                        1.56.2
h11                           0.14.0
httpcore                      1.0.2
httpx                         0.26.0
huggingface-hub               0.20.3
humanfriendly                 10.0
hypothesis                    6.82.0
idna                          3.4
image-geometry                1.16.2
imageio                       2.33.1
importlib-metadata            6.8.0
importlib-resources           6.0.0
interactive-markers           1.12.0
itsdangerous                  2.1.2
Jinja2                        3.1.2
joblib                        1.3.1
joint-state-publisher         1.15.1
joint-state-publisher-gui     1.15.1
jsonschema                    4.21.1
jsonschema-specifications     2023.12.1
kiwisolver                    1.4.4
lama-cleaner                  1.2.5
laser_geometry                1.6.7
lazy_loader                   0.3
lit                           16.0.6
loguru                        0.7.1
lxml                          4.9.3
Markdown                      3.4.4
markdown-it-py                3.0.0
MarkupSafe                    2.1.3
matplotlib                    3.7.2
mdurl                         0.1.2
message-filters               1.16.0
mpmath                        1.3.0
networkx                      3.1
ninja                         1.11.1
numpy                         1.23.1
nvidia-cublas-cu11            11.10.3.66
nvidia-cuda-cupti-cu11        11.7.101
nvidia-cuda-nvrtc-cu11        11.7.99
nvidia-cuda-runtime-cu11      11.7.99
nvidia-cudnn-cu11             8.5.0.96
nvidia-cufft-cu11             10.9.0.58
nvidia-curand-cu11            10.2.10.91
nvidia-cusolver-cu11          11.4.0.1
nvidia-cusparse-cu11          11.7.4.91
nvidia-nccl-cu11              2.14.3
nvidia-nvtx-cu11              11.7.91
oauthlib                      3.2.2
omegaconf                     2.3.0
onnx                          1.14.0
onnx-simplifier               0.4.10
onnxruntime                   1.15.1
opencv-contrib-python         4.8.0.74
opencv-python                 4.8.0.74
orjson                        3.9.12
packaging                     23.1
pandas                        2.0.3
piexif                        1.1.3
Pillow                        10.0.0
pip                           23.3.2
pkgutil_resolve_name          1.3.10
protobuf                      4.24.0
psutil                        5.9.5
py-cpuinfo                    9.0.0
pyasn1                        0.5.0
pyasn1-modules                0.3.0
pycocotools                   2.0
pycodestyle                   2.10.0
pydantic                      2.5.3
pydantic_core                 2.14.6
pydub                         0.25.1
Pygments                      2.16.1
pyparsing                     3.0.9
python-dateutil               2.8.2
python-engineio               4.8.2
python-multipart              0.0.6
python-qt-binding             0.4.4
python-resize-image           1.1.20
python-socketio               5.11.0
pytils                        0.4.1
pytz                          2023.3
PyWavelets                    1.4.1
PyYAML                        6.0.1
qt-dotgraph                   0.4.2
qt-gui                        0.4.2
qt-gui-cpp                    0.4.2
qt-gui-py-common              0.4.2
referencing                   0.33.0
regex                         2023.12.25
requests                      2.31.0
requests-oauthlib             1.3.1
resource_retriever            1.12.7
rich                          13.5.2
rosbag                        1.16.0
rosboost-cfg                  1.15.8
rosclean                      1.15.8
roscreate                     1.15.8
rosgraph                      1.16.0
roslaunch                     1.16.0
roslib                        1.15.8
roslint                       0.12.0
roslz4                        1.16.0
rosmake                       1.15.8
rosmaster                     1.16.0
rosmsg                        1.16.0
rosnode                       1.16.0
rosparam                      1.16.0
rospy                         1.16.0
rosservice                    1.16.0
rostest                       1.16.0
rostopic                      1.16.0
rosunit                       1.15.8
roswtf                        1.16.0
rpds-py                       0.17.1
rqt_action                    0.4.9
rqt_bag                       0.5.1
rqt_bag_plugins               0.5.1
rqt_console                   0.4.11
rqt_dep                       0.4.12
rqt_graph                     0.4.14
rqt_gui                       0.5.3
rqt_gui_py                    0.5.3
rqt-image-view                0.4.17
rqt_launch                    0.4.9
rqt_logger_level              0.4.11
rqt-moveit                    0.5.10
rqt_msg                       0.4.10
rqt_nav_view                  0.5.7
rqt_plot                      0.4.13
rqt_pose_view                 0.5.11
rqt_publisher                 0.4.10
rqt_py_common                 0.5.3
rqt_py_console                0.4.10
rqt-reconfigure               0.5.5
rqt-robot-dashboard           0.5.8
rqt-robot-monitor             0.5.14
rqt_robot_steering            0.5.12
rqt_runtime_monitor           0.5.9
rqt-rviz                      0.7.0
rqt_service_caller            0.4.10
rqt_shell                     0.4.11
rqt_srv                       0.4.9
rqt_tf_tree                   0.6.3
rqt_top                       0.4.10
rqt_topic                     0.4.13
rqt_web                       0.4.10
rsa                           4.9
ruff                          0.1.14
rviz                          1.14.20
safetensors                   0.4.2
scikit-image                  0.21.0
scikit-learn                  1.3.0
scipy                         1.10.1
seaborn                       0.12.2
semantic-version              2.10.0
sensor-msgs                   1.13.1
setuptools                    67.8.0
shellingham                   1.5.4
simple-websocket              1.0.0
six                           1.16.0
smach                         2.5.1
smach-ros                     2.5.1
smclib                        1.8.6
smmap                         5.0.0
sniffio                       1.3.0
sortedcontainers              2.4.0
soupsieve                     2.5
starlette                     0.35.1
sympy                         1.12
tabulate                      0.9.0
tensorboard                   2.14.0
tensorboard-data-server       0.7.1
tf                            1.13.2
tf-conversions                1.13.2
tf2-geometry-msgs             0.7.6
tf2-kdl                       0.7.6
tf2-py                        0.7.6
tf2-ros                       0.7.6
thop                          0.1.1.post2209072238
threadpoolctl                 3.2.0
tifffile                      2023.7.10
timm                          0.9.12
tokenizers                    0.13.3
tomli                         2.0.1
tomlkit                       0.12.0
tools                         0.1.9
toolz                         0.12.1
topic-tools                   1.16.0
torch                         2.0.1
torchaudio                    2.0.2
torchsummary                  1.5.1
torchvision                   0.15.2
tqdm                          4.65.0
transformers                  4.27.4
triton                        2.0.0
typer                         0.9.0
types-dataclasses             0.6.6
typing_extensions             4.9.0
tzdata                        2023.3
ultralytics                   8.0.143
urllib3                       1.26.16
utils                         1.0.1
uvicorn                       0.27.0.post1
websockets                    11.0.3
Werkzeug                      2.2.2
wheel                         0.38.4
whichcraft                    0.6.1
wsproto                       1.2.0
xacro                         1.14.15
xmltodict                     0.13.0
yacs                          0.1.8
yolox                         0.3.0                /home/kevin/anaconda3/envs/pytorch/lib/python3.8/site-packages/yolox-0.3.0-py3.8-linux-x86_64.egg
zipp                          3.16.2