
Issues with stride multiple of 32 #12708

Open
1 task done
tjasmin111 opened this issue May 15, 2024 · 3 comments
Labels
question Further information is requested

Comments

@tjasmin111

Search before asking

Question

I trained a YOLOv8 detection model on 1080x1920 images, and I need to adhere to this size. But when converting to ONNX, YOLOv8 changes the input to 1088x1920, saying each dimension must be a multiple of the stride (32). My device only accepts 1080x1920, not 1088.

What should we do? Any way to force it and keep it as 1080?

Additional

No response

@tjasmin111 tjasmin111 added the question Further information is requested label May 15, 2024
@glenn-jocher
Member

@tjasmin111 hello! When working with YOLOv8, it's vital that the input size be a multiple of the stride, which is typically 32 for most models. The backbone downsamples the input by a factor of 2 at several stages, so each spatial dimension must be divisible by the total downsampling factor (32) for the feature maps to stay aligned through the network.
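
As a quick check, here's a minimal sketch (the helper name is my own, not an Ultralytics function) of how a dimension gets rounded up to the next multiple of the stride:

import math

def next_multiple(x: int, stride: int = 32) -> int:
    """Round x up to the nearest multiple of stride."""
    return math.ceil(x / stride) * stride

print(next_multiple(1080))  # 1088, because 1080 / 32 = 33.75
print(next_multiple(1920))  # 1920, already divisible by 32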

Unfortunately, forcing the network to accept an input size like 1080x1920, which isn't a multiple of 32, might not be straightforward and could lead to errors or suboptimal results. Here's a possible solution:

  1. Resize your input images to 1088x1920 before passing them to the model.
  2. Post-process the model outputs, if necessary, to map back to the original dimensions.

Here's a simple example of how you could resize an image before inference if using OpenCV:

import cv2

# cv2.imread returns an array of shape (height, width, channels)
image = cv2.imread("path_to_1080x1920_image.jpg")

# cv2.resize expects the target size as (width, height)
resized_image = cv2.resize(image, (1920, 1088))
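
For step 2, here's a minimal sketch of mapping a detection box from the 1088x1920 model output back to the original 1080x1920 frame (the box values and variable names are illustrative, not part of the Ultralytics API):

# Example detection in (x1, y1, x2, y2) pixel coordinates from the resized frame
box = (100.0, 200.0, 400.0, 544.0)

# Scale factors from the model input size back to the original frame
scale_x = 1920 / 1920  # width unchanged
scale_y = 1080 / 1088  # height shrinks slightly

x1, y1, x2, y2 = box
original_box = (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)
print(original_box)  # y-values scaled down by roughly 0.7%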

This approach should help you keep compatibility with the network requirements without significant performance loss. Let me know if this helps or if you have any more questions!

@tjasmin111
Author

Was this also a requirement in YOLOv7, or only in YOLOv8?
HD and Full-HD are very important resolutions. It's difficult to go from 1080 to 1088 due to computation requirements.

@glenn-jocher
Member

Hello @tjasmin111! 👋 YOLOv8 and many other deep learning models require the input resolution to be divisible by the stride (32 by default) due to architectural considerations in the network design. This requirement is not specific to YOLOv8; YOLOv7 and most YOLO variants impose it as well, with the exact stride depending on the underlying model architecture and implementation details.

For handling resolutions like Full-HD (1920x1080), a practical approach is to pad the image slightly to meet the stride requirement rather than resizing, preserving the original aspect ratio and detail. Here's a simple way to pad an image using NumPy:

import cv2
import numpy as np

image = cv2.imread("path_to_image.jpg")
h, w = image.shape[:2]  # e.g. 1080, 1920 for a Full-HD frame

# Black canvas at the stride-aligned size; the original image goes in the
# top-left corner, so only the bottom 8 rows are padding
padded_image = np.zeros((1088, 1920, 3), dtype=np.uint8)
padded_image[:h, :w] = image

This keeps your image detail intact and allows the model to process it in its required input format. Let me know if this workaround fits your needs!
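
Because the pad sits at the bottom of the frame, box coordinates predicted on the padded input already line up with the original image; you only need to clip y-values to 1080. If you're exporting through the Ultralytics Python API, you can also pass the padded size explicitly so the ONNX graph is built for 1088x1920. A sketch, assuming your trained weights are in a file named best.pt:

from ultralytics import YOLO

model = YOLO("best.pt")  # placeholder path to your trained weights
# imgsz is (height, width); both values must be multiples of the stride
model.export(format="onnx", imgsz=[1088, 1920])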
