
[CI] Add more unit tests to ensure the outputs are reasonable #704

Open
AsakusaRinne opened this issue Apr 28, 2024 · 3 comments

@AsakusaRinne (Collaborator)

Description

Our unit tests ensure that loading the model and running inference succeed, but they cannot tell whether the output is reasonable or just garbage. Currently we need to run the examples manually when making major features or fixes, which is a bit annoying.

To address this, I think we could send the output to the OpenAI ChatGPT API to check whether it's reasonable. I will cover the cost of the tokens, but will use github.triggering_actor so that only developers with write access can trigger the corresponding workflow.
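As a rough illustration only (not a finalized design), the check could be a small helper that sends the prompt and the generated text to the standard OpenAI chat completions endpoint and asks for a YES/NO verdict. The model name, prompt wording and pass convention below are placeholders, and the write-access gating would still live in the workflow via github.triggering_actor.

```csharp
// Sketch of an OpenAI-based sanity check; model name, prompt wording and the
// YES/NO convention are placeholders, not an agreed design.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class OutputReasonablenessChecker
{
    private static readonly HttpClient Http = new HttpClient();

    public static async Task<bool> LooksReasonableAsync(string prompt, string modelOutput)
    {
        var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");

        var body = new
        {
            model = "gpt-3.5-turbo",
            temperature = 0,
            messages = new[]
            {
                new { role = "system", content = "You judge whether a model output is a coherent answer to the prompt. Reply with YES or NO only." },
                new { role = "user", content = $"Prompt:\n{prompt}\n\nOutput:\n{modelOutput}" }
            }
        };

        using var request = new HttpRequestMessage(HttpMethod.Post, "https://api.openai.com/v1/chat/completions")
        {
            Content = new StringContent(JsonSerializer.Serialize(body), Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

        using var response = await Http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        // Extract the judge's reply from the chat completions response.
        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        var verdict = doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString() ?? "";

        return verdict.TrimStart().StartsWith("YES", StringComparison.OrdinalIgnoreCase);
    }
}
```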

AsakusaRinne self-assigned this Apr 28, 2024
@martindevans (Collaborator)

We could also hardcode the expected responses in the unit tests. For example, in this test it generates two completions of "Question. what is a cat?\nAnswer:" and asserts that they are the same. We could assert the exact response too.

Of course this would only work with temp=0 and a specific model (even a specific quantisation), but it might save a few OpenAI calls!
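A rough xUnit sketch of what that could look like; GenerateAsync is a stand-in for whatever helper the existing test already uses to run the executor, and the expected string is invented and would need to be captured once from the pinned model and quantisation at temp=0.

```csharp
// Sketch only: exact-match assertion on top of the existing determinism check.
using System.Threading.Tasks;
using Xunit;

public class StatelessExecutorExactOutputTests
{
    [Fact]
    public async Task ProducesExpectedAnswerForPinnedModel()
    {
        const string prompt = "Question. what is a cat?\nAnswer:";

        // Deterministic sampling: temperature 0, same pinned model file.
        var first = await GenerateAsync(prompt, temperature: 0f);
        var second = await GenerateAsync(prompt, temperature: 0f);

        // Existing check: both runs must agree.
        Assert.Equal(first, second);

        // New check: the output must match the response recorded for this exact
        // model file and quantisation (placeholder text, not a real capture).
        const string expected = " A cat is a small domesticated carnivorous mammal.";
        Assert.Equal(expected, first);
    }

    // Placeholder signature only; the real test would call the stateless executor.
    private Task<string> GenerateAsync(string prompt, float temperature) =>
        throw new System.NotImplementedException();
}
```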

@SignalRT (Collaborator)

In my opinion, the alternative that Martin proposes would be easier. We cannot run all the tests in CI, and we should verify all the tests locally.

@AsakusaRinne (Collaborator, Author)

We could also hardcode the expected responses in the unit tests.

Yes, I'd also like to save tokens wherever this approach works! I'll only consider using the OpenAI API when necessary.

We cannot run all the tests in CI, and we should verify all the tests locally.

I tend to view things a bit differently. The workflows and unit tests are responsible for reducing the risk when we merge PRs. As long as the workflows pass, that should be equivalent to saying that terrible behaviors won't appear if we merge the PR.

However, due to the GPU backends, it's indeed hard for us to cover all the cases in the workflows. I can provide a machine with an Nvidia GPU running Linux to run the workflows, but I have no solution for Windows yet. :)
