
Copy-Paste augmentation #12599

Open

Arno1235 wants to merge 36 commits into master

Conversation

@Arno1235 commented Jan 8, 2024

Currently the Copy-Paste augmentation only flips the copied object and pastes it if it doesn't overlap too much with existing objects.
This code instead places the copied object at a random position on the image, and only keeps it if it doesn't overlap too much, as described in the cited paper (https://arxiv.org/abs/2012.07177).

Possible improvements:

  • The copied object could also be augmented (flip, scale, ...) before placing it on the image.
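The random-placement-with-overlap-check idea can be sketched roughly as follows. This is an illustrative standalone sketch, not the PR's actual code: the helper names (`bbox_ioa`, `random_paste_box`) and the 0.30 threshold are assumptions chosen for the example.

```python
import numpy as np


def bbox_ioa(box, boxes, eps=1e-9):
    """Fraction of each box in `boxes` covered by `box`.
    All boxes are (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=float)
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + eps)


def random_paste_box(obj_wh, existing_boxes, img_wh, max_ioa=0.30, attempts=30, rng=None):
    """Pick a random in-image position for an object of size obj_wh such that
    the pasted box covers no existing box by more than max_ioa.
    Returns the chosen (x1, y1, x2, y2), or None if no position was found."""
    rng = rng or np.random.default_rng()
    bw, bh = obj_wh
    w, h = img_wh
    for _ in range(attempts):
        x1 = int(rng.integers(0, w - bw + 1))
        y1 = int(rng.integers(0, h - bh + 1))
        cand = (x1, y1, x1 + bw, y1 + bh)
        if len(existing_boxes) == 0 or (bbox_ioa(cand, existing_boxes) < max_ioa).all():
            return cand
    return None
```

The rejection-sampling loop simply retries a bounded number of times and gives up if no acceptable position is found, which mirrors the "place it if it doesn't overlap too much" behaviour described above.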

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

📊 Key Changes

  • Added shift_array function to handle image translation.
  • Improved copy_paste augmentation method to include random translation with boundary checks and segment translation.

🎯 Purpose & Impact

The changes introduce a more diverse Copy-Paste augmentation which can enhance model robustness by training it on images with objects pasted in variable positions. It makes the training process closer to real-world scenarios where objects can appear anywhere in the frame, thus helping the model generalize better. This could potentially improve object detection accuracy in unseen data.

🌟 Summary

Implemented enhanced Copy-Paste augmentation for better object detection model training. 🎨✂️📌

@github-actions bot
Contributor
👋 Hello @Arno1235, thank you for submitting a YOLOv5 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify your PR is up-to-date with ultralytics/yolov5 master branch. If your PR is behind you can update your code by clicking the 'Update branch' button or by running git pull and git merge master locally.
  • ✅ Verify all YOLOv5 Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

@glenn-jocher
Member

@Arno1235 hello!

Thank you for your interest in YOLOv5 and for bringing up the Copy-Paste augmentation. Your suggestion to enhance the augmentation by including additional transformations like flipping and scaling is indeed in line with the cited paper and could potentially improve the robustness of the model.

We always welcome contributions from the community. If you're interested in implementing these improvements, feel free to fork the repo, make your changes, and submit a pull request. We'll be happy to review it. For guidelines on contributing, you can refer to our documentation.

Keep in mind that any changes should be thoroughly tested to ensure they benefit the model's performance without introducing unexpected behavior.

Thanks again for your input, and we look forward to any contributions you might make! 😊🚀

@glenn-jocher
Member

@Arno1235 this looks good, but one of the main issues may be speed. It looks like you have 2 cv2.warpAffine() calls in the innermost part of the for loops, which means they will run very many times and likely add a significant augmentation compute burden.

@Arno1235
Author

Arno1235 commented Jan 9, 2024

Hi @glenn-jocher, thanks for the quick response!

You're right.
After some testing I found that instead of using cv2.warpAffine() you can simply shift the arrays with NumPy slicing, which is just as fast (even a little faster than the original cv2.flip() call).

The code for shifting the array looks like this:

import numpy as np


def shift_array(im, move_x, move_y, fill_value=0):
    """Translate image `im` by (move_x, move_y) pixels via slicing,
    filling the exposed border with `fill_value`."""
    result = np.empty_like(im)

    if move_y > 0:  # shift down
        result[:move_y, :] = fill_value
        if move_x > 0:  # and right
            result[:, :move_x] = fill_value
            result[move_y:, move_x:] = im[:-move_y, :-move_x]
        elif move_x < 0:  # and left
            result[:, move_x:] = fill_value
            result[move_y:, :move_x] = im[:-move_y, -move_x:]
        else:
            result[move_y:, :] = im[:-move_y, :]
    elif move_y < 0:  # shift up
        result[move_y:, :] = fill_value
        if move_x > 0:  # and right
            result[:, :move_x] = fill_value
            result[:move_y, move_x:] = im[-move_y:, :-move_x]
        elif move_x < 0:  # and left
            result[:, move_x:] = fill_value
            result[:move_y, :move_x] = im[-move_y:, -move_x:]
        else:
            result[:move_y, :] = im[-move_y:, :]
    else:  # no vertical shift
        if move_x > 0:  # right
            result[:, :move_x] = fill_value
            result[:, move_x:] = im[:, :-move_x]
        elif move_x < 0:  # left
            result[:, move_x:] = fill_value
            result[:, :move_x] = im[:, -move_x:]
        else:
            result[:, :] = im[:, :]

    return result
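To see concretely what the slicing does, here is the move_x > 0, move_y == 0 branch applied by hand to a small array (a standalone illustration, not part of the PR code):

```python
import numpy as np

im = np.arange(9).reshape(3, 3)  # [[0 1 2] [3 4 5] [6 7 8]]

# Shift right by one column: zero-fill the exposed left column,
# then copy everything except the last source column.
result = np.empty_like(im)
result[:, :1] = 0
result[:, 1:] = im[:, :-1]

print(result)
# [[0 0 1]
#  [0 3 4]
#  [0 6 7]]
```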

I tested the functionality and speed with the following program:

import numpy as np
import cv2
import time
import random


def warp_affine(im, move_x, move_y, w, h):
    result = cv2.warpAffine(im, np.float32([[1, 0, move_x], [0, 1, move_y]]), (w, h))
    return result


def flip(im):
    result = cv2.flip(im, 1)
    return result


def shift_array(im, move_x, move_y, w, h, fill_value=0):  # w and h are unused; kept to mirror warp_affine's signature
    result = np.empty_like(im)

    if move_y > 0:
        result[:move_y, :] = fill_value
        if move_x > 0:
            result[:, :move_x] = fill_value
            result[move_y:, move_x:] = im[:-move_y, :-move_x]
        elif move_x < 0:
            result[:, move_x:] = fill_value
            result[move_y:, :move_x] = im[:-move_y, -move_x:]
        else:
            result[move_y:, :] = im[:-move_y, :]
    elif move_y < 0:
        result[move_y:, :] = fill_value
        if move_x > 0:
            result[:, :move_x] = fill_value
            result[:move_y, move_x:] = im[-move_y:, :-move_x]
        elif move_x < 0:
            result[:, move_x:] = fill_value
            result[:move_y, :move_x] = im[-move_y:, -move_x:]
        else:
            result[:move_y, :] = im[-move_y:, :]
    else:
        if move_x > 0:
            result[:, :move_x] = fill_value
            result[:, move_x:] = im[:, :-move_x]
        elif move_x < 0:
            result[:, move_x:] = fill_value
            result[:, :move_x] = im[:, -move_x:]
        else:
            result[:, :] = im[:, :]
    
    return result


if __name__ == "__main__":

    iterations = 100_000

    im = cv2.imread("input.png")
    print(f"Image shape: {im.shape}")

    h, w, c = im.shape
    

    # Compare functionality

    moves_to_test = [
        (0, 0),
        (0, 10),
        (0, -10),

        (10, 0),
        (10, 10),
        (10, -10),

        (-10, 0),
        (-10, 10),
        (-10, -10),
    ]

    for move_x, move_y in moves_to_test:
        np.testing.assert_array_equal(warp_affine(im, move_x, move_y, w, h), shift_array(im, move_x, move_y, w, h))


    # Compare timings

    t1 = time.time_ns()

    for _ in range(iterations):

        flip(im)

    print(f"flip: {(time.time_ns() - t1)/1e9} s")


    t1 = time.time_ns()

    for _ in range(iterations):
        move_x = random.randint(-w, w)
        move_y = random.randint(-h, h)

        warp_affine(im, move_x, move_y, w, h)

    print(f"warp: {(time.time_ns() - t1)/1e9} s")


    t1 = time.time_ns()

    for _ in range(iterations):
        move_x = random.randint(-w, w)
        move_y = random.randint(-h, h)

        shift_array(im, move_x, move_y, w, h)

    print(f"shift: {(time.time_ns() - t1)/1e9} s")

This gives output:

Image shape: (640, 640, 3)
flip: 4.852631847 s
warp: 47.472902211 s
shift: 3.944873514 s

Do you think this is good enough?

If the for loop concerns you, I could instead apply a single random translation, check which translated objects land inside the image without overlapping other objects too much, and copy only those (keeping the probability value p in mind).
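One possible shape of this single-translation idea, sketched as a standalone function (the name `visible_after_shift` and the 0.5 visibility threshold are illustrative assumptions, not the PR's actual implementation):

```python
import numpy as np


def visible_after_shift(boxes, move_x, move_y, img_wh, min_visible=0.5):
    """Translate all boxes (x1, y1, x2, y2) by (move_x, move_y) and return the
    indices of objects that keep at least `min_visible` of their area inside
    the image, so they remain candidates for copying."""
    boxes = np.asarray(boxes, dtype=float)
    shifted = boxes + [move_x, move_y, move_x, move_y]
    w, h = img_wh

    # Clip the translated boxes to the image and compare visible vs. full area.
    clipped = shifted.copy()
    clipped[:, [0, 2]] = clipped[:, [0, 2]].clip(0, w)
    clipped[:, [1, 3]] = clipped[:, [1, 3]].clip(0, h)
    area = (shifted[:, 2] - shifted[:, 0]) * (shifted[:, 3] - shifted[:, 1])
    visible = (clipped[:, 2] - clipped[:, 0]) * (clipped[:, 3] - clipped[:, 1])
    return np.where(visible >= min_visible * np.maximum(area, 1e-9))[0]
```

With one translation shared by all objects, the per-object work reduces to this vectorized visibility check plus a single overlap test against the existing labels, avoiding any per-object image warps.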

@glenn-jocher
Member

Hi @Arno1235,

Great work on optimizing the augmentation process! It's impressive to see that your shift_array function is not only functionally equivalent to warp_affine but also faster. This is a valuable improvement, as efficiency is key when training models.

Your benchmarking results are promising, and it seems like your approach could be a good fit for the YOLOv5 project. If you've ensured that the functionality is consistent and that there are no edge cases or bugs, this could indeed be good enough to consider integrating.

Regarding the for loop, your idea to perform a single random translation and then check for overlaps is a good one. It could further optimize the process by reducing the number of operations needed.

If you're ready, you might want to proceed by submitting a pull request with your changes. Make sure to include your test cases and performance benchmarks so that we can review the full impact of your contribution.

Thanks for your dedication to improving YOLOv5! 😊👍

@Arno1235
Author

Hi @glenn-jocher,

I implemented the array shifting and made it only do a single translation in the code.
How can I include my test cases and performance benchmarks in the code?

@glenn-jocher
Member

Hi @Arno1235,

Fantastic to hear that you've implemented the array shifting with a single translation! To include your test cases and performance benchmarks, you can follow these steps:

  1. Documenting in Code Comments: Include inline comments in your code explaining the purpose of each test case and the expected outcomes. For performance benchmarks, you can add comments on top of the functions or in a separate block to explain the performance gains observed.

  2. Unit Tests: If you've written unit tests, you can include them in the tests directory of the YOLOv5 repository. Make sure they follow the structure and style of existing tests.

  3. Performance Benchmarks: For performance benchmarks, you can create a markdown file or a section in the existing documentation that details your benchmarking methodology, the environment in which the tests were run (hardware, software versions, etc.), and the results you obtained.

  4. Pull Request Description: When you submit your pull request, use the description to provide a summary of the changes, the rationale behind them, and the impact on performance. You can include snippets of your benchmark results here as well.

  5. Commit Messages: Write clear and descriptive commit messages for each of your changes. This helps reviewers understand the context of each change and makes the revision history more informative.

Remember to ensure that your tests are reproducible and that your benchmarks accurately reflect the performance improvements. This will help the reviewers during the pull request process.

Looking forward to seeing your contribution! 😊🚀

@Arno1235
Author

Hi @glenn-jocher

  1. I added comments to my code.
  2. I did not write any unit tests and don't see a tests directory in the repository.
  3. I don't see any performance benchmarks for other augmentations.

I think this pull request is ready to be reviewed and merged if it is approved.
Is there anything else you need from me?

Thanks

@glenn-jocher
Member

Hi @Arno1235,

Thank you for adding comments to your code and for preparing your pull request. Here's what you can do next:

  1. Pull Request (PR): Go ahead and submit your PR if you haven't already. Make sure to provide a clear and detailed description of your changes, the reasoning behind them, and any performance improvements you've observed.

  2. Unit Tests: While there may not be a dedicated tests directory, it's good practice to include tests for new functionality. You can create a new test file that follows the naming convention of existing files and includes tests for your new augmentation method.

  3. Performance Benchmarks: If there are no existing performance benchmarks for augmentations, you can still include your benchmark results in the PR description. This will provide evidence of the efficiency gains from your changes.

  4. Documentation: If your changes are significant, consider updating the relevant documentation to reflect the new augmentation behavior. This helps users understand and utilize the new feature correctly.

Once you've submitted your PR, the maintainers will review your changes. They may request additional changes or clarifications, so be prepared to engage in the review process.

It sounds like you've done a thorough job, and if everything is in order, there shouldn't be anything else you need to do for now. Just be responsive to any feedback you might receive during the review process.

Thanks for your contribution, and we're looking forward to reviewing your work! 😊👍

Contributor

github-actions bot commented Apr 9, 2024


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I sign the CLA


2 out of 3 committers have signed the CLA.
✅ [UltralyticsAssistant](https://github.com/UltralyticsAssistant)
✅ [glenn-jocher](https://github.com/glenn-jocher)
@Arno1235
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.
