[Feature] Shared, heterogeneous storage #2029

Open · wants to merge 14 commits into base: main

Conversation

vmoens (Contributor) commented Mar 20, 2024

No description provided.

pytorch-bot bot commented Mar 20, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2029

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit b4db037 with merge base 6f1c387:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Mar 20, 2024

github-actions bot commented Mar 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}3$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1037s 0.1013s 9.8694 Ops/s 9.3659 Ops/s $\textbf{\color{#35bf28}+5.38\%}$
test_sync 89.2275ms 87.4175ms 11.4394 Ops/s 11.3890 Ops/s $\color{#35bf28}+0.44\%$
test_async 0.1637s 71.3038ms 14.0245 Ops/s 14.1359 Ops/s $\color{#d91a1a}-0.79\%$
test_single_pixels 0.1121s 0.1119s 8.9355 Ops/s 8.9767 Ops/s $\color{#d91a1a}-0.46\%$
test_sync_pixels 69.0746ms 67.0573ms 14.9126 Ops/s 14.7381 Ops/s $\color{#35bf28}+1.18\%$
test_async_pixels 66.5060ms 62.1283ms 16.0957 Ops/s 16.0968 Ops/s $-0.01\%$
test_simple 0.6740s 0.6723s 1.4874 Ops/s 1.4316 Ops/s $\color{#35bf28}+3.90\%$
test_transformed 0.9038s 0.9020s 1.1086 Ops/s 1.0969 Ops/s $\color{#35bf28}+1.06\%$
test_serial 2.1761s 2.1165s 0.4725 Ops/s 0.4748 Ops/s $\color{#d91a1a}-0.49\%$
test_parallel 1.8683s 1.7959s 0.5568 Ops/s 0.5492 Ops/s $\color{#35bf28}+1.38\%$
test_step_mdp_speed[True-True-True-True-True] 92.2720μs 32.7875μs 30.4994 KOps/s 29.6877 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[True-True-True-True-False] 42.7710μs 19.9289μs 50.1785 KOps/s 49.9228 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-True-True-False-True] 45.2500μs 18.5217μs 53.9906 KOps/s 53.7021 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-True-True-False-False] 36.0800μs 11.3654μs 87.9865 KOps/s 89.0468 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[True-True-False-True-True] 60.8510μs 34.7164μs 28.8048 KOps/s 28.3065 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[True-True-False-True-False] 43.8800μs 21.7371μs 46.0044 KOps/s 46.2733 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-False-False-True] 41.5100μs 20.5409μs 48.6833 KOps/s 48.1778 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-True-False-False-False] 73.3510μs 13.2801μs 75.3004 KOps/s 76.5265 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[True-False-True-True-True] 63.9310μs 37.0252μs 27.0086 KOps/s 26.7914 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-False-True-True-False] 44.4510μs 23.7888μs 42.0366 KOps/s 42.2620 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-False-True-False-True] 42.6710μs 20.4425μs 48.9177 KOps/s 47.9799 KOps/s $\color{#35bf28}+1.95\%$
test_step_mdp_speed[True-False-True-False-False] 33.7900μs 13.2360μs 75.5514 KOps/s 75.8368 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-False-False-True-True] 62.8110μs 38.6890μs 25.8471 KOps/s 26.0349 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-False-False-True-False] 95.8610μs 25.3227μs 39.4902 KOps/s 39.8033 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[True-False-False-False-True] 42.1010μs 22.0966μs 45.2558 KOps/s 44.5334 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[True-False-False-False-False] 35.1710μs 14.9251μs 67.0012 KOps/s 66.1642 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[False-True-True-True-True] 56.5210μs 36.8269μs 27.1541 KOps/s 26.8741 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[False-True-True-True-False] 48.2910μs 23.5258μs 42.5065 KOps/s 42.9552 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[False-True-True-False-True] 52.3410μs 24.9159μs 40.1351 KOps/s 40.4468 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[False-True-True-False-False] 45.3910μs 15.0070μs 66.6357 KOps/s 67.3447 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-True-False-True-True] 53.9910μs 38.8990μs 25.7076 KOps/s 25.0250 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-True-False-True-False] 48.0610μs 25.6555μs 38.9780 KOps/s 39.4579 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-True-False-False-True] 65.0810μs 26.6442μs 37.5316 KOps/s 37.5732 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-True-False-False-False] 37.9110μs 16.8369μs 59.3934 KOps/s 59.6749 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[False-False-True-True-True] 60.3910μs 40.3717μs 24.7698 KOps/s 25.0035 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[False-False-True-True-False] 58.1410μs 27.3786μs 36.5249 KOps/s 36.7616 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-False-True-False-True] 47.6810μs 26.6637μs 37.5041 KOps/s 36.4325 KOps/s $\color{#35bf28}+2.94\%$
test_step_mdp_speed[False-False-True-False-False] 37.9410μs 16.8903μs 59.2057 KOps/s 57.6610 KOps/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[False-False-False-True-True] 67.1110μs 41.7991μs 23.9239 KOps/s 23.4382 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[False-False-False-True-False] 57.7110μs 29.2795μs 34.1536 KOps/s 34.7789 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-False-False-False-True] 58.4010μs 28.6093μs 34.9537 KOps/s 35.9516 KOps/s $\color{#d91a1a}-2.78\%$
test_step_mdp_speed[False-False-False-False-False] 49.4010μs 18.5979μs 53.7695 KOps/s 52.9005 KOps/s $\color{#35bf28}+1.64\%$
test_values[generalized_advantage_estimate-True-True] 25.4071ms 24.5764ms 40.6894 Ops/s 40.0579 Ops/s $\color{#35bf28}+1.58\%$
test_values[vec_generalized_advantage_estimate-True-True] 80.9476ms 3.1873ms 313.7489 Ops/s 299.6711 Ops/s $\color{#35bf28}+4.70\%$
test_values[td0_return_estimate-False-False] 92.1720μs 64.0748μs 15.6068 KOps/s 15.4781 KOps/s $\color{#35bf28}+0.83\%$
test_values[td1_return_estimate-False-False] 54.1886ms 52.8376ms 18.9259 Ops/s 18.8346 Ops/s $\color{#35bf28}+0.48\%$
test_values[vec_td1_return_estimate-False-False] 2.1778ms 1.7739ms 563.7188 Ops/s 565.2704 Ops/s $\color{#d91a1a}-0.27\%$
test_values[td_lambda_return_estimate-True-False] 86.0154ms 84.9501ms 11.7716 Ops/s 11.8488 Ops/s $\color{#d91a1a}-0.65\%$
test_values[vec_td_lambda_return_estimate-True-False] 2.1429ms 1.7634ms 567.0848 Ops/s 564.8341 Ops/s $\color{#35bf28}+0.40\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.9723ms 23.5150ms 42.5260 Ops/s 40.1216 Ops/s $\textbf{\color{#35bf28}+5.99\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9379ms 0.7078ms 1.4129 KOps/s 1.4236 KOps/s $\color{#d91a1a}-0.75\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7362ms 0.6459ms 1.5481 KOps/s 1.5382 KOps/s $\color{#35bf28}+0.65\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.4842ms 1.4502ms 689.5660 Ops/s 687.2436 Ops/s $\color{#35bf28}+0.34\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9346ms 0.6723ms 1.4874 KOps/s 1.4775 KOps/s $\color{#35bf28}+0.67\%$
test_dqn_speed 8.9746ms 1.4627ms 683.6794 Ops/s 687.5711 Ops/s $\color{#d91a1a}-0.57\%$
test_ddpg_speed 3.1108ms 2.7449ms 364.3088 Ops/s 361.0030 Ops/s $\color{#35bf28}+0.92\%$
test_sac_speed 8.9076ms 8.1993ms 121.9612 Ops/s 117.6148 Ops/s $\color{#35bf28}+3.70\%$
test_redq_speed 10.7896ms 10.3983ms 96.1692 Ops/s 97.0595 Ops/s $\color{#d91a1a}-0.92\%$
test_redq_deprec_speed 11.9598ms 11.4610ms 87.2522 Ops/s 89.8334 Ops/s $\color{#d91a1a}-2.87\%$
test_td3_speed 8.3005ms 8.1508ms 122.6870 Ops/s 121.6096 Ops/s $\color{#35bf28}+0.89\%$
test_cql_speed 26.0368ms 25.0094ms 39.9849 Ops/s 39.3481 Ops/s $\color{#35bf28}+1.62\%$
test_a2c_speed 5.6106ms 5.3356ms 187.4194 Ops/s 180.8116 Ops/s $\color{#35bf28}+3.65\%$
test_ppo_speed 5.7386ms 5.5792ms 179.2363 Ops/s 171.1705 Ops/s $\color{#35bf28}+4.71\%$
test_reinforce_speed 4.3713ms 4.1973ms 238.2484 Ops/s 223.0196 Ops/s $\textbf{\color{#35bf28}+6.83\%}$
test_iql_speed 19.3713ms 18.8834ms 52.9566 Ops/s 51.8714 Ops/s $\color{#35bf28}+2.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.0380ms 2.8026ms 356.8145 Ops/s 354.0274 Ops/s $\color{#35bf28}+0.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.1041s 0.6215ms 1.6090 KOps/s 1.8230 KOps/s $\textbf{\color{#d91a1a}-11.74\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7438ms 0.5202ms 1.9222 KOps/s 1.9122 KOps/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.9538ms 2.8292ms 353.4536 Ops/s 351.1783 Ops/s $\color{#35bf28}+0.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6547ms 0.5361ms 1.8654 KOps/s 1.8550 KOps/s $\color{#35bf28}+0.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.2876ms 0.5167ms 1.9354 KOps/s 1.9379 KOps/s $\color{#d91a1a}-0.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6058ms 1.4721ms 679.2946 Ops/s 686.5745 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5370ms 1.4051ms 711.6885 Ops/s 723.5270 Ops/s $\color{#d91a1a}-1.64\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.9888ms 2.9325ms 341.0024 Ops/s 341.0989 Ops/s $\color{#d91a1a}-0.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2950ms 0.6719ms 1.4883 KOps/s 1.4850 KOps/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7796ms 0.6450ms 1.5504 KOps/s 1.5333 KOps/s $\color{#35bf28}+1.12\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.9176ms 2.8240ms 354.1105 Ops/s 353.6650 Ops/s $\color{#35bf28}+0.13\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6958ms 0.5479ms 1.8252 KOps/s 1.8261 KOps/s $\color{#d91a1a}-0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6529ms 0.5244ms 1.9070 KOps/s 1.8872 KOps/s $\color{#35bf28}+1.05\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1222ms 2.8223ms 354.3237 Ops/s 352.8701 Ops/s $\color{#35bf28}+0.41\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6700ms 0.5383ms 1.8577 KOps/s 1.8531 KOps/s $\color{#35bf28}+0.25\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6644ms 0.5151ms 1.9413 KOps/s 1.9271 KOps/s $\color{#35bf28}+0.73\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.0709ms 2.9619ms 337.6192 Ops/s 340.2557 Ops/s $\color{#d91a1a}-0.77\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3819ms 0.6751ms 1.4813 KOps/s 1.4830 KOps/s $\color{#d91a1a}-0.11\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8362ms 0.6502ms 1.5381 KOps/s 1.5284 KOps/s $\color{#35bf28}+0.63\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1293s 9.6550ms 103.5732 Ops/s 101.8842 Ops/s $\color{#35bf28}+1.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 17.2065ms 14.8334ms 67.4155 Ops/s 66.3553 Ops/s $\color{#35bf28}+1.60\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.1937ms 1.3348ms 749.1882 Ops/s 880.7758 Ops/s $\textbf{\color{#d91a1a}-14.94\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1210s 7.1742ms 139.3885 Ops/s 140.5239 Ops/s $\color{#d91a1a}-0.81\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 17.0265ms 14.8333ms 67.4160 Ops/s 66.5216 Ops/s $\color{#35bf28}+1.34\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.0539ms 1.3295ms 752.1817 Ops/s 887.8321 Ops/s $\textbf{\color{#d91a1a}-15.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1188s 7.4934ms 133.4510 Ops/s 132.1334 Ops/s $\color{#35bf28}+1.00\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.5787ms 15.2026ms 65.7784 Ops/s 65.4118 Ops/s $\color{#35bf28}+0.56\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.6625ms 1.5437ms 647.7874 Ops/s 658.4145 Ops/s $\color{#d91a1a}-1.61\%$
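The "Change" column above appears to be the relative difference between "Ops" (this PR) and "Ops on Repo HEAD". The sketch below reproduces it for a few rows copied from the table. Both the formula and the ±5% cutoff for counting a benchmark as improved or worsened are assumptions inferred from the reported numbers and the bolded rows; the report does not state them, and the helper names are hypothetical.

```python
# Sketch under stated assumptions, not the benchmark bot's actual code.
# Assumption 1: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD.
# Assumption 2: rows beyond +/-5% count as improved/worsened (inferred from bolded rows).

THRESHOLD_PCT = 5.0  # hypothetical cutoff, not stated in the report


def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a change as improved, worsened, or unchanged under the assumed cutoff."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the GPU table; both values in a
# row share the same unit (Ops/s or KOps/s), so the ratio is unit-free.
rows = {
    "test_single": (9.8694, 9.3659),
    "test_async_pixels": (16.0957, 16.0968),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]": (
        749.1882,
        880.7758,
    ),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions, the three bolded gains and three bolded losses in the GPU table match the "Improved: 3" and "Worsened: 3" totals in the summary line.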

github-actions bot commented Mar 21, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}3$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_single 53.4331ms 52.8233ms 18.9310 Ops/s 18.9091 Ops/s $\color{#35bf28}+0.12\%$
test_sync 40.6860ms 35.1795ms 28.4257 Ops/s 33.7421 Ops/s $\textbf{\color{#d91a1a}-15.76\%}$
test_async 50.3959ms 27.5782ms 36.2606 Ops/s 36.3189 Ops/s $\color{#d91a1a}-0.16\%$
test_simple 0.3310s 0.3293s 3.0363 Ops/s 3.0061 Ops/s $\color{#35bf28}+1.01\%$
test_transformed 0.4761s 0.4720s 2.1184 Ops/s 2.0464 Ops/s $\color{#35bf28}+3.52\%$
test_serial 1.2370s 1.1845s 0.8442 Ops/s 0.8346 Ops/s $\color{#35bf28}+1.15\%$
test_parallel 1.0425s 0.9890s 1.0112 Ops/s 1.0004 Ops/s $\color{#35bf28}+1.07\%$
test_step_mdp_speed[True-True-True-True-True] 0.1528ms 21.3164μs 46.9122 KOps/s 46.3774 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[True-True-True-True-False] 52.0070μs 13.0801μs 76.4520 KOps/s 77.3649 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[True-True-True-False-True] 42.3400μs 12.5219μs 79.8603 KOps/s 80.7872 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[True-True-True-False-False] 31.4600μs 7.6257μs 131.1349 KOps/s 134.4069 KOps/s $\color{#d91a1a}-2.43\%$
test_step_mdp_speed[True-True-False-True-True] 52.5180μs 22.5463μs 44.3532 KOps/s 44.1497 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-True-False-True-False] 41.2370μs 14.3007μs 69.9266 KOps/s 70.5445 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-True-False-False-True] 42.0590μs 14.3028μs 69.9162 KOps/s 71.8354 KOps/s $\color{#d91a1a}-2.67\%$
test_step_mdp_speed[True-True-False-False-False] 36.2780μs 9.2562μs 108.0357 KOps/s 113.7591 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_step_mdp_speed[True-False-True-True-True] 50.1330μs 24.0937μs 41.5047 KOps/s 41.3437 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[True-False-True-True-False] 66.6440μs 15.4682μs 64.6487 KOps/s 65.0209 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[True-False-True-False-True] 39.9650μs 13.6564μs 73.2258 KOps/s 71.8106 KOps/s $\color{#35bf28}+1.97\%$
test_step_mdp_speed[True-False-True-False-False] 27.7720μs 8.7548μs 114.2234 KOps/s 114.0217 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[True-False-False-True-True] 61.7450μs 25.0931μs 39.8516 KOps/s 40.0965 KOps/s $\color{#d91a1a}-0.61\%$
test_step_mdp_speed[True-False-False-True-False] 42.8500μs 16.8470μs 59.3577 KOps/s 60.8956 KOps/s $\color{#d91a1a}-2.53\%$
test_step_mdp_speed[True-False-False-False-True] 33.2120μs 14.8179μs 67.4861 KOps/s 67.7065 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-False-False-False-False] 33.8830μs 9.9544μs 100.4582 KOps/s 101.0090 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[False-True-True-True-True] 55.1830μs 24.0032μs 41.6611 KOps/s 41.7760 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-True-True-True-False] 57.5780μs 15.5787μs 64.1903 KOps/s 64.2246 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[False-True-True-False-True] 38.9020μs 15.9104μs 62.8518 KOps/s 62.6072 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-True-True-False-False] 36.0380μs 9.9883μs 100.1170 KOps/s 100.3995 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-True-False-True-True] 45.5350μs 25.4626μs 39.2733 KOps/s 39.5346 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-False-True-False] 77.5350μs 16.5150μs 60.5509 KOps/s 60.1252 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-True-False-False-True] 45.2240μs 16.9552μs 58.9791 KOps/s 58.8980 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[False-True-False-False-False] 33.6830μs 11.1535μs 89.6578 KOps/s 89.8749 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-False-True-True-True] 58.4990μs 26.2052μs 38.1603 KOps/s 38.4083 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-False-True-True-False] 45.2840μs 17.9272μs 55.7812 KOps/s 56.1082 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[False-False-True-False-True] 47.1680μs 17.0992μs 58.4824 KOps/s 59.1040 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-False-True-False-False] 30.4570μs 11.1597μs 89.6080 KOps/s 89.2200 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[False-False-False-True-True] 53.3290μs 27.3245μs 36.5972 KOps/s 36.3758 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[False-False-False-True-False] 43.3510μs 18.8065μs 53.1731 KOps/s 52.6470 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[False-False-False-False-True] 42.1480μs 18.0747μs 55.3260 KOps/s 55.7720 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-False-False-False-False] 50.9650μs 12.1608μs 82.2313 KOps/s 81.9653 KOps/s $\color{#35bf28}+0.32\%$
test_values[generalized_advantage_estimate-True-True] 9.3924ms 9.1709ms 109.0401 Ops/s 105.7779 Ops/s $\color{#35bf28}+3.08\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.3944ms 35.4526ms 28.2067 Ops/s 28.0925 Ops/s $\color{#35bf28}+0.41\%$
test_values[td0_return_estimate-False-False] 0.2305ms 0.1617ms 6.1850 KOps/s 6.0383 KOps/s $\color{#35bf28}+2.43\%$
test_values[td1_return_estimate-False-False] 26.1974ms 22.9155ms 43.6385 Ops/s 42.6103 Ops/s $\color{#35bf28}+2.41\%$
test_values[vec_td1_return_estimate-False-False] 37.7370ms 35.6014ms 28.0888 Ops/s 28.0979 Ops/s $\color{#d91a1a}-0.03\%$
test_values[td_lambda_return_estimate-True-False] 36.3051ms 33.3115ms 30.0197 Ops/s 29.5964 Ops/s $\color{#35bf28}+1.43\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.7894ms 35.5505ms 28.1290 Ops/s 28.1206 Ops/s $\color{#35bf28}+0.03\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.3660ms 8.1556ms 122.6155 Ops/s 120.4353 Ops/s $\color{#35bf28}+1.81\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.6728ms 2.0883ms 478.8657 Ops/s 479.1225 Ops/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5366ms 0.3574ms 2.7980 KOps/s 2.7227 KOps/s $\color{#35bf28}+2.77\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.1287ms 43.7914ms 22.8356 Ops/s 21.5703 Ops/s $\textbf{\color{#35bf28}+5.87\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.1138ms 3.0218ms 330.9258 Ops/s 325.8672 Ops/s $\color{#35bf28}+1.55\%$
test_dqn_speed 1.7887ms 1.3802ms 724.5551 Ops/s 731.3419 Ops/s $\color{#d91a1a}-0.93\%$
test_ddpg_speed 3.4407ms 2.7140ms 368.4567 Ops/s 368.5023 Ops/s $\color{#d91a1a}-0.01\%$
test_sac_speed 9.4773ms 8.3463ms 119.8141 Ops/s 117.6313 Ops/s $\color{#35bf28}+1.86\%$
test_redq_speed 14.2711ms 13.1859ms 75.8384 Ops/s 74.0721 Ops/s $\color{#35bf28}+2.38\%$
test_redq_deprec_speed 14.6938ms 13.1309ms 76.1562 Ops/s 74.9567 Ops/s $\color{#35bf28}+1.60\%$
test_td3_speed 8.4193ms 8.2128ms 121.7610 Ops/s 121.3060 Ops/s $\color{#35bf28}+0.38\%$
test_cql_speed 39.2140ms 36.7604ms 27.2032 Ops/s 27.5722 Ops/s $\color{#d91a1a}-1.34\%$
test_a2c_speed 8.3637ms 7.3550ms 135.9610 Ops/s 136.5446 Ops/s $\color{#d91a1a}-0.43\%$
test_ppo_speed 8.3383ms 7.5937ms 131.6877 Ops/s 132.1106 Ops/s $\color{#d91a1a}-0.32\%$
test_reinforce_speed 7.7009ms 6.5510ms 152.6482 Ops/s 153.6087 Ops/s $\color{#d91a1a}-0.63\%$
test_iql_speed 33.4182ms 32.5668ms 30.7061 Ops/s 30.7310 Ops/s $\color{#d91a1a}-0.08\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.2495ms 2.1010ms 475.9607 Ops/s 472.2477 Ops/s $\color{#35bf28}+0.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.6936ms 0.5161ms 1.9377 KOps/s 1.9943 KOps/s $\color{#d91a1a}-2.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3.5343ms 0.4760ms 2.1010 KOps/s 2.1106 KOps/s $\color{#d91a1a}-0.46\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.2874ms 2.1314ms 469.1658 Ops/s 470.7657 Ops/s $\color{#d91a1a}-0.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6331ms 0.4934ms 2.0267 KOps/s 1.9401 KOps/s $\color{#35bf28}+4.46\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6227ms 0.4663ms 2.1445 KOps/s 2.1408 KOps/s $\color{#35bf28}+0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4744ms 1.2226ms 817.8990 Ops/s 810.3485 Ops/s $\color{#35bf28}+0.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3426ms 1.1567ms 864.5003 Ops/s 861.6696 Ops/s $\color{#35bf28}+0.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.2377ms 2.2235ms 449.7368 Ops/s 446.2842 Ops/s $\color{#35bf28}+0.77\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9581ms 0.6139ms 1.6291 KOps/s 1.6379 KOps/s $\color{#d91a1a}-0.54\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7384ms 0.5869ms 1.7040 KOps/s 1.6875 KOps/s $\color{#35bf28}+0.97\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.8371ms 2.0851ms 479.5901 Ops/s 461.1476 Ops/s $\color{#35bf28}+4.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6056ms 0.4965ms 2.0140 KOps/s 2.0152 KOps/s $\color{#d91a1a}-0.06\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3.5762ms 0.4761ms 2.1002 KOps/s 2.0887 KOps/s $\color{#35bf28}+0.55\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.2179ms 2.1115ms 473.6075 Ops/s 467.5047 Ops/s $\color{#35bf28}+1.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6037ms 0.4904ms 2.0390 KOps/s 2.0428 KOps/s $\color{#d91a1a}-0.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6986ms 0.4710ms 2.1233 KOps/s 2.1400 KOps/s $\color{#d91a1a}-0.78\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.3478ms 2.1771ms 459.3199 Ops/s 452.0548 Ops/s $\color{#35bf28}+1.61\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0991ms 0.6134ms 1.6303 KOps/s 1.6259 KOps/s $\color{#35bf28}+0.27\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8618ms 0.6293ms 1.5890 KOps/s 1.7063 KOps/s $\textbf{\color{#d91a1a}-6.87\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1117s 7.7893ms 128.3807 Ops/s 125.3758 Ops/s $\color{#35bf28}+2.40\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 14.7619ms 11.9589ms 83.6197 Ops/s 82.7746 Ops/s $\color{#35bf28}+1.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.8567ms 1.0811ms 924.9578 Ops/s 924.4717 Ops/s $\color{#35bf28}+0.05\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1058s 5.6914ms 175.7023 Ops/s 181.6854 Ops/s $\color{#d91a1a}-3.29\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 14.0197ms 12.0142ms 83.2349 Ops/s 82.8973 Ops/s $\color{#35bf28}+0.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.8554ms 1.1062ms 903.9933 Ops/s 925.6748 Ops/s $\color{#d91a1a}-2.34\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1034s 7.9101ms 126.4212 Ops/s 123.5982 Ops/s $\color{#35bf28}+2.28\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 14.6219ms 12.3801ms 80.7747 Ops/s 80.2555 Ops/s $\color{#35bf28}+0.65\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8922ms 1.3850ms 721.9964 Ops/s 729.8941 Ops/s $\color{#d91a1a}-1.08\%$

vmoens added the enhancement label on Mar 21, 2024
# Conflicts:
#	torchrl/data/replay_buffers/samplers.py
#	torchrl/data/replay_buffers/storages.py

dtsaras commented Apr 29, 2024

I think this feature works pretty well; it would be nice to merge it into the main branch.

vmoens (Contributor, Author) commented Apr 29, 2024

Yep, I can give it a fresh look!

Labels: CLA Signed (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed), enhancement (new feature or request)
Projects: None yet
Linked issues: None yet
3 participants