Changes to support pass@k evaluation on the HumanEval dataset #1180

Draft
wants to merge 1 commit into
base: main

@shubhra shubhra commented Aug 11, 2023

Example:

numactl -C0-15 python deepsparse/src/deepsparse/transformers/eval_downstream.py \
        <model_path> \
        --num-cores 16 \
        --dataset openai_humaneval \
        --humaneval-method pass_at_k \
        --engine deepsparse \
        --start 0 \
        --max-samples 2
  • This runs the evaluation on a subset of the HumanEval dataset: 2 samples (max-samples) starting at index 0 (start).
  • If the benchmark-humaneval argument is supplied, the evaluation runs on a pre-selected smaller subset of 11 samples, and start and max-samples are ignored.
  • Set humaneval-method to perplexity to evaluate perplexity instead of pass@k.
  • Add --n-solutions <n> to set the number of solutions per task. The default is 1.
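
For context on how --n-solutions relates to pass@k, here is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval. Whether this PR computes the metric exactly this way is an assumption; the function names pass_at_k and mean_pass_at_k are illustrative and are not taken from the PR.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single task.

    n: number of solutions generated for the task (e.g. --n-solutions)
    c: number of those solutions that passed the unit tests
    k: the k in pass@k; must satisfy 1 <= k <= n
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # With fewer than k failing solutions, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn solutions fail) = 1 - C(n - c, k) / C(n, k)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, given (n, c) pairs per task (hypothetical helper)."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("no results to average")
    return sum(scores) / len(scores)
```

With the default of one solution per task, only pass@1 can be estimated, and it reduces to the fraction of tasks whose single solution passed. Estimating pass@k for larger k requires at least k solutions per task.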

Note: Remove numactl -C0-15 if you don't need to specify which cores to run on.
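
The subset selection described in the bullets above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: select_tasks and its benchmark_indices parameter are hypothetical names, and the actual indices of the 11-sample benchmark subset are not given here, so the example passes placeholder values.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_tasks(
    tasks: Sequence[T],
    start: int = 0,
    max_samples: Optional[int] = None,
    benchmark_indices: Optional[Sequence[int]] = None,
) -> List[T]:
    """Pick the tasks to evaluate (hypothetical helper).

    If benchmark_indices is given, it takes precedence and
    start / max_samples are ignored, mirroring the described behavior.
    """
    if benchmark_indices is not None:
        return [tasks[i] for i in benchmark_indices]
    end = None if max_samples is None else start + max_samples
    return list(tasks[start:end])


# HumanEval contains 164 tasks with IDs "HumanEval/0" .. "HumanEval/163".
all_tasks = [f"HumanEval/{i}" for i in range(164)]

# Equivalent of --start 0 --max-samples 2
subset = select_tasks(all_tasks, start=0, max_samples=2)

# Placeholder indices only; the real benchmark subset is defined in the PR.
bench = select_tasks(all_tasks, start=5, max_samples=2, benchmark_indices=[1, 7])
```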

@shubhra shubhra marked this pull request as draft August 11, 2023 14:56
@shubhra shubhra changed the title from "Changes to support pass at k evaluation on the HumanEval dataset" to "Changes to support pass@k evaluation on the HumanEval dataset" Aug 11, 2023