New: Ultralytics YOLO-Human #12702

Open: wants to merge 93 commits into main


@Laughing-q
Member

Laughing-q commented May 15, 2024

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introducing new YOLOHuman model for human attribute detection! 🚀

📊 Key Changes

  • Added YOLOHuman class as part of the model imports.
  • Introduced a new YAML configuration for the YOLOv8 human detection model.
  • Implemented additional augmentations to include human attribute data handling.
  • Established a new dataset class, HumanDataset, for loading and processing human-related datasets.
  • Included Human object in results to encapsulate detected human attributes.
  • Enriched model __init__.py to include YOLOHuman.
  • Formulated HumanPredictor, HumanTrainer, and HumanValidator under the new YOLO human module for prediction, training, and validation.

🎯 Purpose & Impact

  • Enhances Model Catalog: Expands Ultralytics' model offerings to include human-specific attribute detection.
  • Improves Dataset Handling: Offers a streamlined process for datasets involving human features.
  • Facilitates Human-centric Applications: Paves the way for more sophisticated applications such as demographic analysis, security enhancements, and personalized customer experiences.

@glenn-jocher
Member

@Laughing-q @ambitious-octopus val test is also failing because our save_txt functionality is not adapted correctly (should output txt files in the same format as the labels).

yolo val human model=weights/yolov8n-human.pt data=human8.yaml imgsz=32 save_txt

Save JSON is working correctly:

yolo val human model=weights/yolov8n-human.pt data=human8.yaml imgsz=32 save_json
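
The requirement above (saved txt files should use the same format as the labels) can be spot-checked by comparing column counts. The following is a minimal sketch, not part of the PR: `field_count` and `same_layout` are hypothetical helper names, the sample lines assume the plain detect layout (`class x y w h`), and the human-task label layout itself is not specified in this thread.

```python
def field_count(line: str) -> int:
    """Count whitespace-separated numeric fields in one txt line."""
    fields = line.split()
    for field in fields:
        float(field)  # raises ValueError if a field is not numeric
    return len(fields)


def same_layout(label_line: str, pred_line: str, save_conf: bool = False) -> bool:
    """Return True if a prediction line has the same columns as a label line.

    Assumption: with save_conf=True the prediction carries exactly one
    extra trailing confidence column.
    """
    extra = 1 if save_conf else 0
    return field_count(pred_line) == field_count(label_line) + extra


if __name__ == "__main__":
    label = "0 0.5 0.5 0.2 0.3"
    print(same_layout(label, "0 0.48 0.52 0.21 0.29"))
    print(same_layout(label, "0 0.48 0.52 0.21 0.29 0.9", save_conf=True))
```

A check like this only compares the number of columns; it does not verify that each column carries the intended quantity.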

@ambitious-octopus
Member

@Laughing-q
Member Author

@glenn-jocher @ambitious-octopus Guys, I removed the YOLOHuman class since it's not needed now that we treat human as a new YOLO task.
I've also fixed the save_one_txt issue for the human task. I made one more update: I figured we can directly use the save_txt method in Results instead of recreating something similar/redundant for the val mode of each task.

def save_txt(self, txt_file, save_conf=False):
    """
    Save predictions into txt file.

    Args:
        txt_file (str): txt file path.
        save_conf (bool): save confidence score or not.
    """
    is_obb = self.obb is not None
    boxes = self.obb if is_obb else self.boxes
    masks = self.masks
    probs = self.probs
    kpts = self.keypoints
    texts = []
    if probs is not None:
        # Classify
        [texts.append(f"{probs.data[j]:.2f} {self.names[j]}") for j in probs.top5]
    elif boxes:
        # Detect/segment/pose
        for j, d in enumerate(boxes):
            c, conf, id = int(d.cls), float(d.conf), None if d.id is None else int(d.id.item())
            line = (c, *(d.xyxyxyxyn.view(-1) if is_obb else d.xywhn.view(-1)))
            if masks:
                seg = masks[j].xyn[0].copy().reshape(-1)  # reversed mask.xyn, (n,2) to (n*2)
                line = (c, *seg)
            if kpts is not None:
                kpt = torch.cat((kpts[j].xyn, kpts[j].conf[..., None]), 2) if kpts[j].has_visible else kpts[j].xyn
                line += (*kpt.reshape(-1).tolist(),)
            line += (conf,) * save_conf + (() if id is None else (id,))
            texts.append(("%g " * len(line)).rstrip() % line)
    if texts:
        Path(txt_file).parent.mkdir(parents=True, exist_ok=True)  # make directory
        with open(txt_file, "a") as f:
            f.writelines(text + "\n" for text in texts)

def save_one_txt(self, predn, save_conf, shape, file):
    """Save YOLO detections to a txt file in normalized coordinates in a specific format."""
    gn = torch.tensor(shape)[[1, 0, 1, 0]]  # normalization gain whwh
    for *xyxy, conf, cls in predn.tolist():
        xywh = (ops.xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
        with open(file, "a") as f:
            f.write(("%g " * len(line)).rstrip() % line + "\n")

So I updated our detect/obb/human tasks to use Results.save_txt. The other tasks, i.e. segment/pose, actually have the save_one_txt part commented out, so I left it as is for now; let's handle these two in another PR later.
# if self.args.save_txt:
# save_one_txt(predn, save_conf, shape, file=save_dir / 'labels' / f'{path.stem}.txt')
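
To make the shared line format concrete, here is a minimal pure-Python sketch of what the detect branch of both snippets writes: a pixel `xyxy` box becomes normalized `xywh` and the fields are joined with `%g`. It is only an illustration under stated assumptions: `xyxy_to_xywhn` and `format_line` are hypothetical names that do not exist in the codebase, and `shape` is taken to be `(height, width)`, as the `[[1, 0, 1, 0]]` indexing in `save_one_txt` suggests.

```python
def xyxy_to_xywhn(xyxy, shape):
    """Convert a pixel (x1, y1, x2, y2) box to normalized (xc, yc, w, h).

    shape is assumed to be (height, width).
    """
    x1, y1, x2, y2 = xyxy
    h, w = shape
    return ((x1 + x2) / 2 / w, (y1 + y2) / 2 / h, (x2 - x1) / w, (y2 - y1) / h)


def format_line(cls, xyxy, shape, conf=None):
    """Build one label-format text line; conf is appended only if given."""
    xywhn = xyxy_to_xywhn(xyxy, shape)
    line = (cls, *xywhn) if conf is None else (cls, *xywhn, conf)
    return ("%g " * len(line)).rstrip() % line


if __name__ == "__main__":
    # A 20x40 px box in a 200x100 (width x height) image
    print(format_line(0, (10, 20, 30, 60), (100, 200)))
    print(format_line(0, (10, 20, 30, 60), (100, 200), conf=0.9))
```

Since both methods produce this layout for plain detections, routing validation through a single writer mainly removes the duplicated formatting code.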

@ambitious-octopus
Member

ambitious-octopus commented May 31, 2024

@ambitious-octopus test_data_utils() is failing because we need to upload a HUB-format dataset to https://github.com/ultralytics/hub/tree/main/example_datasets to join the datasets for other tasks.

[image: Screenshot 2024-05-30 at 22 46 33]

@glenn-jocher Uploaded HUB-format dataset PR.

@Laughing-q
Member Author

@glenn-jocher Meanwhile I noticed that we have a lot of duplicated code in Validator.update_metrics across different tasks.

def update_metrics(self, preds, batch):

Looking into this PR: #12645, the author had to update each val.py to add the feature because we have multiple modified versions of this method. And that's why @ambitious-octopus encountered the missing target_img key issue after that PR was merged (because I had to create another modified version for the human task).
I actually tried to refactor this part a bit in the OBB PR so we don't have a modified version of update_metrics lying in obb/val.py. But there's still a lot of duplication with other tasks.
I think I'll revisit this part of the code and try to eliminate as much of the duplication as possible some day in another PR.
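
One possible shape for such a refactor is the template-method pattern: `update_metrics` is written once in a base class and each task overrides only small hooks. The sketch below is a toy illustration, not the actual Ultralytics validators; every class name, hook name, and the dict-based batch format are assumptions made for the example.

```python
class SketchBaseValidator:
    """Hypothetical base class: the update_metrics loop lives here once."""

    def __init__(self):
        self.seen = 0
        self.stats = []

    def update_metrics(self, preds, batch):
        """Shared loop; task-specific behavior comes from the hooks below."""
        for si, pred in enumerate(preds):
            self.seen += 1
            target = self.prepare_batch(si, batch)
            pred = self.prepare_pred(pred, target)
            self.stats.append(self.process(pred, target))

    def prepare_batch(self, si, batch):
        """Hook: select the targets for image index si."""
        return batch[si]

    def prepare_pred(self, pred, target):
        """Hook: rescale or filter predictions (identity in this toy)."""
        return pred

    def process(self, pred, target):
        """Hook: compare one prediction with its target."""
        raise NotImplementedError


class SketchDetectValidator(SketchBaseValidator):
    def process(self, pred, target):
        return pred["cls"] == target["cls"]


class SketchHumanValidator(SketchDetectValidator):
    def process(self, pred, target):
        # Reuse the detect check and add a task-specific attribute check
        return super().process(pred, target) and pred.get("attrs") == target.get("attrs")


if __name__ == "__main__":
    validator = SketchHumanValidator()
    preds = [{"cls": 0, "attrs": (30,)}, {"cls": 0, "attrs": (40,)}]
    batch = [{"cls": 0, "attrs": (30,)}, {"cls": 0, "attrs": (41,)}]
    validator.update_metrics(preds, batch)
    print(validator.seen, validator.stats)
```

With this layout, a feature added to the shared loop would only need to be written in one place instead of in each task's val.py.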

@Laughing-q
Member Author

Laughing-q commented May 31, 2024

@ambitious-octopus the CI tests are failing, which seems to be related to the newly updated yolov8n-human.pt

That's because the model was trained with task=detect, but since we're now updating the logic here to treat human as a new task, the tests fail when training from the new yolov8n-human.pt

@Laughing-q
Member Author

@ambitious-octopus let's revert the weights, and later today I'll launch new training runs on our server with task=human to get all model sizes. :)

@Laughing-q
Member Author

@glenn-jocher @ambitious-octopus ok I've re-uploaded the weights and now everything works properly in tests except the hub dataset, which I guess will be fine once the PR that @ambitious-octopus opened is merged. :)
[image: pic-240531-1705-48]

And now that several GPUs have been freed up on our server, I'll launch several training runs right now.

@ambitious-octopus
Member

docs image
val_batch0_labels

Member

@Burhan-Q Burhan-Q left a comment

I did not serve the docs locally, just quickly reviewed the raw markdown on GitHub. A few notes and suggestions, but overall looks excellent!

@@ -0,0 +1,16 @@
---
description: TODO ADD DESCRIPTION
Member

Fill in TODO sections for predict.md and train.md

Human detection and attributes estimation is a task that involves identifying humans in an image or video stream and estimating their attributes, such as age, gender, weight, height, and ethnicity.
The output of the detector is a set of bounding boxes that enclose the humans in the image, along with class labels, confidence scores, and estimated attributes for each person. This task is useful for applications in surveillance, retail analytics, and human-computer interaction.

## [Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/v8)
Member

Do we want the header to be a link like this? I haven't looked at how it renders, but I don't think this is normally something we do.


YOLOv8 pretrained Human models are shown here. Detect, Segment and Pose models are pretrained on the [COCO](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml) dataset, while Classify models are pretrained on the [ImageNet](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/ImageNet.yaml) dataset.
Member

Instead of referencing the YAML files, probably better to reference the docs pages for the relevant datasets.


[Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models) download automatically from the latest Ultralytics [release](https://github.com/ultralytics/assets/releases) on first use.
Member

Not sure if Models here needs to be a link.


!!! note

It is important to note that these models have been trained on a specially curated, artificially annotated version of the COCO dataset. This custom dataset was meticulously crafted to enhance the models' performance on specific tasks by incorporating additional annotations and adjustments beyond those available in the public COCO dataset. Due to proprietary reasons, this enhanced version of the dataset is not publicly available. The artificial annotations were designed to provide more comprehensive and nuanced data, enabling the models to achieve higher accuracy and robustness in their predictions. The proprietary nature of this dataset ensures that the models possess a competitive edge, offering advanced capabilities and superior performance in their respective applications.
Member

A few notes:

  • I think that "artificially annotated" should be bolded and underlined (**<u>artificially annotated</u>**) for emphasis
  • I don't think a reason ("Due to proprietary reasons") for why the dataset isn't available should be given, just that we've decided to not make it publicly available.
  • I think this section should be removed "ensures that the models possess a competitive edge, offering advanced capabilities and superior performance" as it's fairly baseless and seems (to me) like it's asking for trouble.


- Weight (Kg): The weight of the person is annotated in kilograms. This numeric value is essential for applications requiring precise biometric data.

- Height (Cm): The height of the person is annotated in centimeters. Accurate height measurements are crucial for many analytical and identification purposes.
Member

- Height (Cm)
+ Height (cm)

- **Real-time Human Attribute Estimation**: Leverages the computational speed of CNNs to provide fast and accurate human attribute estimation in real-time.
- **Efficiency and Performance**: Optimized for reduced computational and resource requirements without sacrificing performance, enabling deployment in real-time applications.
- **Comprehensive Attribute Estimation**: Capable of estimating multiple human attributes such as age, gender, ethnicity, weight, and height, providing detailed demographic analysis.
- **Detection of Flocks of Humans**: Enhanced to detect both individual humans and groups of humans, expanding its applicability in various scenarios.
Member

Flocks of Humans


!!! note

It is important to note that these models have been trained on a specially curated, artificially annotated version of the COCO dataset. This custom dataset was meticulously crafted to enhance the models' performance on specific tasks by incorporating additional annotations and adjustments beyond those available in the public COCO dataset. Due to proprietary reasons, this enhanced version of the dataset is not publicly available. The artificial annotations were designed to provide more comprehensive and nuanced data, enabling the models to achieve higher accuracy and robustness in their predictions. The proprietary nature of this dataset ensures that the models possess a competitive edge, offering advanced capabilities and superior performance in their respective applications.
Member

See comments in docs/en/tasks/human.md on this text.


### Val Usage

Validate trained YOLOv8n-human model accuracy on the COCO8-human dataset. No argument need to passed as the `model` retains it's training `data` and arguments as model attributes.
Member

I would include data="coco8-human" even tho it will work without. Trust me, it will avoid confusion later.


## Val

Validate trained YOLOv8n-human model accuracy on the COCO8-human dataset. No argument need to passed as the `model` retains it's training `data` and arguments as model attributes.
Member

Same comment from earlier

Labels: enhancement (New feature or request), TODO (Items that needs completing)
6 participants