Is your feature request related to a problem? Please describe.
When debugging an issue using the logs produced during the various system tests that involve scaling deployments, it can be confusing to keep track of the particular set of pods that are active during any given scaling operation. Having this information would make it easier to identify the relevant set of logs for any given period in a test.
Furthermore, the scaling functionality used by components such as the PravegaSegmentStoreK8sService depends on the waitUntilPodIsRunning call provided by K8sClient. One downside of the current implementation is that it only waits for a particular number of running pods to become active, but does not wait for terminated pods to be removed. This can leave us unaware of cases where resources fail to be cleaned up, which can have downstream effects on later system tests.
Another issue is that there is no bound on the number of attempts the system test will make while waiting for the resource; instead it relies on the testing framework's timeout. This can greatly increase the total time a single system test deployment takes.
Describe the solution you'd like
Add some logs to list the active set of pods for the particular resource after a scale event.
Make the waitUntilPodIsRunning call wait for both the expected number of running pods to be active, as well as ensuring all non-running pods have been removed.
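The completeness condition described above could look something like the following sketch. This is not the actual K8sClient implementation; the helper name isScaleComplete and the use of raw phase strings are assumptions for illustration — the real check would operate on pod objects returned by the Kubernetes API.

```java
import java.util.List;

public class PodScaleCheck {

    /**
     * Hypothetical helper: a scale operation is considered complete only when
     * the number of pods in the "Running" phase matches the expected count AND
     * no pods in any other phase (e.g. Pending or terminating) remain.
     */
    public static boolean isScaleComplete(List<String> podPhases, int expectedRunning) {
        long running = podPhases.stream().filter("Running"::equals).count();
        // Any leftover non-running pod means old resources have not yet been
        // cleaned up, so the wait should continue.
        return running == expectedRunning && running == podPhases.size();
    }
}
```

With this shape, `isScaleComplete(List.of("Running", "Terminating"), 1)` is false even though the expected running count is met, which is exactly the cleanup case the current implementation misses.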
Implement a bound on the number of retries.
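A bounded retry loop could be sketched along these lines. The method name waitWithBound and its parameters are hypothetical, not part of the existing K8sClient API; the point is simply to fail fast after a fixed retry budget instead of relying on the test framework's timeout.

```java
import java.util.function.BooleanSupplier;

public class BoundedRetry {

    /**
     * Hypothetical retry loop: poll the condition up to maxRetries times,
     * sleeping between attempts. Returns false once the budget is exhausted
     * so the caller can fail the test promptly with a clear error.
     */
    public static boolean waitWithBound(BooleanSupplier condition, int maxRetries, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            if (condition.getAsBoolean()) {
                return true; // condition met within the retry budget
            }
            Thread.sleep(delayMillis);
        }
        return false; // budget exhausted; caller decides how to fail
    }
}
```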