Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
2.4
Custom code
Yes
OS platform and distribution
No response
Mobile device
No response
Python version
No response
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
Similar to issue #53169, I have observed that the "filter before batch" approach is significantly slower. Filtering the dataset alone takes 430 ms, whereas the "batch+map" method requires only 20 ms.
In theory, the computation performed by filter and map should be similar, yet "filter before batch" consumes far more time, presumably because the predicate is invoked once per element rather than once per batch.
I attempted to filter after batching instead, but the filter predicate must return a scalar boolean, so it cannot be applied to batched elements directly.
My question is:
Is there a potential optimization for this performance issue? I aim to develop a custom operation that can filter batched elements (accepting [M,]-shaped tensors as input and producing [N,]-shaped tensors as output). Is there a more efficient approach available?
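One common workaround (a sketch, not an official filter-on-batches API) is to batch first and then apply a vectorized mask inside `map` using `tf.boolean_mask`, which takes an [M,]-shaped batch in and produces an [N,]-shaped tensor out:

```python
import tensorflow as tf

# Toy dataset: integers 0..9999.
ds = tf.data.Dataset.range(10_000)

# Instead of ds.filter(pred).batch(100) -- which calls the predicate
# once per element -- batch first and filter whole batches at once.
ds = ds.batch(100)

def filter_batch(batch):
    # Vectorized predicate over the [M,]-shaped batch; boolean_mask
    # keeps only the matching elements, yielding an [N,]-shaped tensor.
    mask = tf.equal(batch % 2, 0)
    return tf.boolean_mask(batch, mask)

ds = ds.map(filter_batch, num_parallel_calls=tf.data.AUTOTUNE)

for b in ds.take(1):
    print(b.shape)  # (50,) -- half of the first 100-element batch survives
```

Note that the resulting batches have variable size N, since each batch keeps a different number of elements.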
Test map+batch Execution time(ms): 585.0009880959988
Test batch+map Execution time(ms): 23.16068299114704
Test map+batch+prefetch Execution time(ms): 503.9997957646847
Test batch+map+prefetch Execution time(ms): 19.63987946510315
Test prefetch+batch+map Execution time(ms): 54.23441715538502
Test batch+prefetch+map Execution time(ms): 16.469698399305344
Test filter+batch Execution time(ms): 282.77427703142166
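Since the built-in predicate must return a scalar boolean per element, filtering after `batch()` via a mask produces ragged batch sizes; if downstream code needs uniform batches, they can be restored with `unbatch().batch()`. A minimal sketch:

```python
import tensorflow as tf

# Batch, filter each batch with a vectorized mask ([M,] -> [N,]),
# then flatten and re-batch to restore a uniform batch size.
ds = tf.data.Dataset.range(10_000).batch(1_000)
ds = ds.map(lambda b: tf.boolean_mask(b, tf.equal(b % 2, 0)))
ds = ds.unbatch().batch(100)  # uniform batches of 100 again

for b in ds.take(1):
    print(b.shape)  # (100,)
```

The extra `unbatch()`/`batch()` pass adds some overhead, but it typically remains far cheaper than the per-element `filter` path measured above.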
Issue type
Performance