Why does tf.data.Dataset.choose_from_datasets() take only one element from a 5-element dataset? I want to combine it with another 5-element dataset and get a <ChooseDataset ...> that contains all 10 elements #67327

Open
Hell576 opened this issue May 10, 2024 · 2 comments
Labels: comp:data (tf.data related issues), stale, stat:awaiting response, TF 2.16, type:bug

Hell576 commented May 10, 2024

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf v2.16.0-rc0-18-g5bc9d26649c 2.16.1

Custom code

Yes

OS platform and distribution

Windows 10 Home

Mobile device

No response

Python version

3.11.8

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

Ryzen 5 5600U, 8 GB RAM

Current behavior?

I expect to get a single dataset containing all 21 elements of the sub-datasets trainSubDs*** in SUB_DATASETS (unpack the archive first):
SUB_DATASETS.7z; if the archive did not upload, use this link: https://drive.google.com/drive/folders/1yg2QL6uXUwNikSVNduBl0tMvOSqTa050?usp=sharing
The expected output looks like this (the console kept only the last elements, so only the final part of the output could be copied):
[[0.5359075],
[0.2795821],
[0.0720736],
...,
[0.0077176],
[0.1932825],
[0.0856282]],

   [[0.5283639],
    [0.2746144],
    [0.0710207],
    ...,
    [0.0052901],
    [0.1952176],
    [0.0862649]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)

train_ds el 12 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5237354],
[0.2701872],
[0.0703329],
...,
[0.008211 ],
[0.196203 ],
[0.088349 ]],

   [[0.5247858],
    [0.2711178],
    [0.070595 ],
    ...,
    [0.0099486],
    [0.1979399],
    [0.0891815]],

   [[0.5274265],
    [0.2729441],
    [0.0710507],
    ...,
    [0.0072448],
    [0.199901 ],
    [0.0902823]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)

train_ds el 13 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5219159],
[0.2691515],
[0.0700537],
...,
[0.0110053],
[0.2001242],
[0.0888058]],

   [[0.5231395],
    [0.2692853],
    [0.0701746],
    ...,
    [0.0099637],
    [0.2019817],
    [0.0895737]],

   [[0.5253162],
    [0.2704399],
    [0.0704052],
    ...,
    [0.0077836],
    [0.2037999],
    [0.0902737]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)

train_ds el 14 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5171451],
[0.2742897],
[0.0721269],
...,
[0.0055069],
[0.2053241],
[0.0913286]],

   [[0.5189649],
    [0.2752684],
    [0.0723212],
    ...,
    [0.011555 ],
    [0.2068635],
    [0.0918813]],

   [[0.5214246],
    [0.2764484],
    [0.0727693],
    ...,
    [0.0114146],
    [0.2091298],
    [0.0929936]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)

train_ds el 15 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5192521],
[0.277901 ],
[0.0729991],
...,
[0.0039281],
[0.2143918],
[0.0939896]],

   [[0.5229708],
    [0.2794364],
    [0.073528 ],
    ...,
    [0.0112221],
    [0.2173722],
    [0.0953516]],

   [[0.5269855],
    [0.2820122],
    [0.0741047],
    ...,
    [0.0105395],
    [0.2104741],
    [0.0923529]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)

train_ds el 16 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5331477],
[0.2739582],
[0.0717712],
...,
[0.0109573],
[0.1940901],
[0.0880007]],

   [[0.5335427],
    [0.2737107],
    [0.071813 ],
    ...,
    [0.0037366],
    [0.1964362],
    [0.0886652]],

   [[0.5356644],
    [0.2747716],
    [0.0720867],
    ...,
    [0.0090865],
    [0.1971512],
    [0.0893587]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)

train_ds el 17 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.526029 ],
[0.2695351],
[0.0711689],
...,
[0.0050149],
[0.2037496],
[0.0924048]],

   [[0.5300163],
    [0.2719229],
    [0.0718631],
    ...,
    [0.0069598],
    [0.1986737],
    [0.0902188]],

   [[0.5202652],
    [0.2676857],
    [0.0705827],
    ...,
    [0.0083441],
    [0.1983998],
    [0.0903398]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)

train_ds el 18 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5228637],
[0.2701937],
[0.0715879],
...,
[0.0051579],
[0.2034421],
[0.0902776]],

   [[0.5247263],
    [0.2706789],
    [0.0718333],
    ...,
    [0.0046164],
    [0.204667 ],
    [0.0909407]],

   [[0.5265769],
    [0.2717923],
    [0.072096 ],
    ...,
    [0.0095011],
    [0.2066145],
    [0.0916767]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)

train_ds el 19 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5154157],
[0.2765533],
[0.0723962],
...,
[0.0059454],
[0.2126359],
[0.0933117]],

   [[0.5172763],
    [0.277748 ],
    [0.0726707],
    ...,
    [0.0079134],
    [0.2137299],
    [0.0938761]],

   [[0.5202885],
    [0.2787489],
    [0.0730177],
    ...,
    [0.0077494],
    [0.2166882],
    [0.0949059]],

   ...,

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]],

   [[0.       ],
    [0.       ],
    [0.       ],
    ...,
    [0.       ],
    [0.       ],
    [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)

train_ds el 20 : (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5253529],
[0.2798946],
[0.0724061],
...,
[0.0083182],
[0.2105255],
[0.0941397]],

   [[0.5157818],
    [0.2750256],
    [0.0710735],
    ...,
    [0.0070147],
    [0.2102364],
    [0.0941667]],

   [[0.5171091],
    [0.2751543],
    [0.0711846],
    ...,
    [0.0105602],
    [0.2124142],
    [0.094877 ]],

   ...,

   [[0.53277  ],
    [0.2829515],
    [0.0742321],
    ...,
    [0.0052444],
    [0.2241798],
    [0.0944005]],

   [[0.5211917],
    [0.2779166],
    [0.0727785],
    ...,
    [0.0074783],
    [0.2260852],
    [0.0950583]],

   [[0.5240757],
    [0.2794829],
    [0.0732059],
    ...,
    [0.0126765],
    [0.2276907],
    [0.0955679]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)

Standalone code to reproduce the issue

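import tensorflow as tf

# Load the five saved sub-datasets.
train_subdses = []
for i in range(5):
    train_subdses.append(tf.data.Dataset.load('SUB_DATASETS/trainSubDSpt' + str(i)))

# The choice dataset holds the indices 0..4, so exactly one element
# is drawn from each sub-dataset.
train_ds = tf.data.Dataset.choose_from_datasets(
    train_subdses, tf.data.Dataset.range(len(train_subdses)))

# Print every element of the resulting dataset.
for i, elem in enumerate(train_ds):
    print('train_ds el', i, ': ', elem)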

Relevant log output

Output: 5 elements instead of the expected 21:
train_ds el 0 :  (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5369535],
        [0.2724565],
        [0.073154 ],
        ...,
        [0.0074817],
        [0.2035824],
        [0.0882927]],
       [[0.5376732],
        [0.2733304],
        [0.0730333],
        ...,
        [0.0017834],
        [0.1970369],
        [0.0859187]],
       [[0.5307747],
        [0.2692053],
        [0.0720603],
        ...,
        [0.0029357],
        [0.1989727],
        [0.0866213]],
       ...,
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Relaxation'>)
train_ds el 1 :  (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5342219],
        [0.2789463],
        [0.0737332],
        ...,
        [0.0081697],
        [0.2026219],
        [0.0868143]],
       [[0.5382575],
        [0.2805069],
        [0.0741783],
        ...,
        [0.0140429],
        [0.2039362],
        [0.0874103]],
       [[0.5390069],
        [0.2813562],
        [0.0740098],
        ...,
        [0.0100257],
        [0.1966492],
        [0.0847842]],
       ...,
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Physical_grip'>)
train_ds el 2 :  (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5101197],
        [0.2706629],
        [0.0697101],
        ...,
        [0.0135378],
        [0.1926464],
        [0.0866967]],
       [[0.514319 ],
        [0.2727665],
        [0.0703219],
        ...,
        [0.0082093],
        [0.1880849],
        [0.0848876]],
       [[0.5069909],
        [0.2686928],
        [0.0690993],
        ...,
        [0.0075324],
        [0.1876027],
        [0.0847249]],
       ...,
       [[0.5385068],
        [0.2723664],
        [0.0731111],
        ...,
        [0.0123314],
        [0.2012639],
        [0.0878491]],
       [[0.5350855],
        [0.2717791],
        [0.0727867],
        ...,
        [0.0062128],
        [0.197012 ],
        [0.0863911]],
       [[0.5311691],
        [0.2695098],
        [0.0722636],
        ...,
        [0.0034987],
        [0.1995552],
        [0.0873127]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Physical_ungrip'>)
train_ds el 3 :  (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5344185],
        [0.278568 ],
        [0.0721303],
        ...,
        [0.0091263],
        [0.2004447],
        [0.0882447]],
       [[0.5359075],
        [0.2795821],
        [0.0720736],
        ...,
        [0.0077176],
        [0.1932825],
        [0.0856282]],
       [[0.5283639],
        [0.2746144],
        [0.0710207],
        ...,
        [0.0052901],
        [0.1952176],
        [0.0862649]],
       ...,
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_grip'>)
train_ds el 4 :  (<tf.Tensor: shape=(150, 128, 1), dtype=float64, numpy=
array([[[0.5331477],
        [0.2739582],
        [0.0717712],
        ...,
        [0.0109573],
        [0.1940901],
        [0.0880007]],
       [[0.5335427],
        [0.2737107],
        [0.071813 ],
        ...,
        [0.0037366],
        [0.1964362],
        [0.0886652]],
       [[0.5356644],
        [0.2747716],
        [0.0720867],
        ...,
        [0.0090865],
        [0.1971512],
        [0.0893587]],
       ...,
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]],
       [[0.       ],
        [0.       ],
        [0.       ],
        ...,
        [0.       ],
        [0.       ],
        [0.       ]]])>, <tf.Tensor: shape=(), dtype=string, numpy=b'Mental_ungrip'>)
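@sushreebarsa (Contributor)

@Hell576 tf.data.Dataset.choose_from_datasets() isn't designed to concatenate datasets. It picks elements one at a time: for each index produced by the separate "choice" dataset, it deterministically takes the next element from the input dataset with that index. The result therefore has at most as many elements as the choice dataset. With a choice dataset of tf.data.Dataset.range(5), you get 5 elements, one from each of the 5 sub-datasets.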
Thank you!
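
To illustrate the behaviour described above, here is a minimal sketch on toy data. The two small integer datasets (ds_a, ds_b) are hypothetical stand-ins for the saved sub-datasets, and Dataset.concatenate is shown as one possible way to get every element, not necessarily what the maintainers would recommend for this case.

```python
import functools

import tensorflow as tf

# Hypothetical stand-ins for the saved sub-datasets (5 elements each).
ds_a = tf.data.Dataset.from_tensor_slices([0, 1, 2, 3, 4])
ds_b = tf.data.Dataset.from_tensor_slices([10, 11, 12, 13, 14])
datasets = [ds_a, ds_b]

# A choice dataset of range(2) yields the indices 0 and 1 once each,
# so only one element is taken from each input dataset.
choice = tf.data.Dataset.range(len(datasets))
picked = tf.data.Dataset.choose_from_datasets(datasets, choice)
picked_values = [int(x) for x in picked.as_numpy_iterator()]
print(picked_values)  # [0, 10]

# Repeating the indices makes choose_from_datasets alternate between
# the inputs until the choice dataset is exhausted.
choice_all = tf.data.Dataset.range(len(datasets)).repeat(5)
interleaved = tf.data.Dataset.choose_from_datasets(datasets, choice_all)
interleaved_values = [int(x) for x in interleaved.as_numpy_iterator()]
print(interleaved_values)  # [0, 10, 1, 11, 2, 12, 3, 13, 4, 14]

# Chaining concatenate appends the datasets one after another,
# which keeps every element of every input.
merged = functools.reduce(lambda a, b: a.concatenate(b), datasets)
merged_values = [int(x) for x in merged.as_numpy_iterator()]
print(merged_values)  # [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
```

Concatenation requires all inputs to share the same element structure and dtypes, and it also works when the inputs have different lengths, whereas the repeated choice dataset above assumes every input has exactly 5 elements.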

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label May 14, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label May 22, 2024