
Scaling System Test Logging Improvements. #6435

Open
co-jo opened this issue Nov 19, 2021 · 0 comments · May be fixed by #6445

Comments


co-jo commented Nov 19, 2021

Is your feature request related to a problem? Please describe.

When debugging an issue using the logs produced by the various system tests that scale deployments, it can be hard to keep track of which pods are active during any given scaling operation. Logging this information would make it easier to map each period of a test to its relevant set of logs.

Furthermore, the scaling operations requested by components such as the PravegaSegmentStoreK8sService depend on a waitUntilPodIsRunning call provided by K8sClient. One downside of the current implementation is that it only waits for a particular number of pods to reach the running state; it does not wait for terminated pods to be removed. This can leave us unaware of cases where resources fail to be cleaned up, which can have downstream effects on later system tests.
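
To make the stronger wait condition concrete, here is a minimal sketch. The podPhases input and the PodWaitCondition class are hypothetical stand-ins for whatever K8sClient actually exposes; the point is only that the predicate checks both that exactly the expected number of pods are running and that no other pods remain listed. Note that a real implementation would likely also need to exclude pods that carry a deletion timestamp, since a terminating pod can still report a Running phase.

```java
import java.util.List;

final class PodWaitCondition {
    // podPhases: the phase of every pod matching the deployment's label
    // selector (hypothetical input; K8sClient's real API will differ).
    // Returns true only when the deployment has settled: exactly the
    // expected number of pods are Running AND no non-running pods
    // (Pending, Failed, ...) are still listed.
    static boolean isSteadyState(List<String> podPhases, int expectedRunning) {
        long running = podPhases.stream()
                .filter("Running"::equals)
                .count();
        // The existing check effectively stops at the line above; the
        // second condition is what catches pods that were never removed.
        return running == expectedRunning && running == podPhases.size();
    }
}
```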

Another issue is that there is no bound on the number of attempts the system test makes while waiting for the resource; it instead relies on the testing framework's timeout. This can greatly increase the total time a single system test deployment takes.
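
A bounded retry loop, sketched below, would fail fast at the test level rather than hanging until the framework's outer timeout fires. The class name, constants, and the way the condition is supplied are illustrative, not Pravega's actual API.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class BoundedRetry {
    // Polls `condition` until it holds, up to `maxAttempts` times,
    // sleeping `interval` between attempts. Throws instead of waiting
    // indefinitely for the testing framework's own timeout.
    static void await(BooleanSupplier condition, int maxAttempts,
                      Duration interval) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return;
            }
            Thread.sleep(interval.toMillis());
        }
        throw new IllegalStateException(
                "Condition not met after " + maxAttempts + " attempts");
    }
}
```

A caller could then combine the two sketches, e.g. `BoundedRetry.await(() -> PodWaitCondition.isSteadyState(fetchPhases(), 3), 60, Duration.ofSeconds(5))`, where fetchPhases() is a hypothetical helper listing the deployment's pod phases.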

Describe the solution you'd like

  • Add logs listing the active set of pods for the particular resource after a scale event (see the logging sketch after this list).
  • Make the waitUntilPodIsRunning call wait both for the expected number of pods to be running and for all non-running pods to be removed.
  • Implement a bound on the number of retries.
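
For the first item, the log line itself can be as simple as the sketch below; the method and the source of `podNames` are hypothetical stand-ins for the relevant K8sClient call:

```java
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class ScaleLogging {
    private static final Logger log = LoggerFactory.getLogger(ScaleLogging.class);

    // Called after a scale event completes; `podNames` would come from
    // listing the pods matching the deployment's label selector.
    static void logActivePods(String deployment, int replicas, List<String> podNames) {
        log.info("Scaled {} to {} replicas; active pods: {}",
                deployment, replicas, podNames);
    }
}
```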
co-jo linked a pull request Nov 23, 2021 that will close this issue