Extractive QA with multiple answers #406

Open
SamPse opened this issue Jan 12, 2023 · 7 comments

Comments

@SamPse

SamPse commented Jan 12, 2023

Hello, I am new to txtai and I want to know if I can have a list of results with the QA approach.

My code is:

import pandas as pd
from txtai.pipeline import Questions

context = ["The doctor administered a 7g dose of Acetarsol and a 16mg dose of Ibuprofen",
           "The patient took one Alosetron tablet"]

queries = ["What is the drug taken?", "What is the dose?"]

# `tokenizer` is assumed to be defined earlier; it is not shown in this post
questions = Questions(path=("sultan/BioM-ELECTRA-Large-SQuAD2-BioASQ8B", tokenizer), gpu=True)

# One list of answers per query, each aligned with the context list
results = [questions([question] * len(context), context) for question in queries]
results.append(context)

pd.DataFrame(list(zip(*results)), columns=["Drug", "Dose", "Text"])

I would like to have this result.

| Drug | Dose | Text |
| --- | --- | --- |
| Acetarsol | 7g | The doctor administered a 7g dose of Acetarsol and a 16mg dose of Ibuprofen |
| Ibuprofen | 16mg | The doctor administered a 7g dose of Acetarsol and a 16mg dose of Ibuprofen |
| Alosetron | None | The patient took one Alosetron tablet |

Thank you for your help.

@davidmezzetti
Member

Thank you for taking the time to submit an issue.

Currently, the Questions pipeline only supports returning the top answer. But the underlying Hugging Face pipeline supports multiple answers using the topk argument.

See this link for more: huggingface/transformers#3207

It would be a fairly straightforward change to add this to txtai.
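
For readers following along, here is a minimal sketch of what the underlying call could look like. It assumes the Hugging Face `question-answering` pipeline, where recent transformers releases spell the argument `top_k` (`topk` is the older name). The helper names `load_qa` and `top_answers` are made up for illustration and are not part of txtai.

```python
from typing import Any, Callable, Dict, List


def load_qa(model: str = "sultan/BioM-ELECTRA-Large-SQuAD2-BioASQ8B"):
    """Build a Hugging Face question-answering pipeline (downloads the model)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("question-answering", model=model)


def top_answers(
    qa: Callable[..., Any], question: str, context: str, k: int = 3
) -> List[Dict[str, Any]]:
    """Return up to k candidate answers as a list, in the order the pipeline gives them."""
    result = qa(question=question, context=context, top_k=k)
    # The pipeline returns a bare dict when a single answer comes back
    # and a list of dicts otherwise, so normalize to a list.
    return [result] if isinstance(result, dict) else list(result)
```

With a loaded pipeline, `top_answers(load_qa(), "What is the drug taken?", context[0], k=2)` would return up to two candidate spans, each a dict with `answer`, `score`, `start` and `end` keys.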

@SamPse
Author

SamPse commented Jan 12, 2023

Thank you for your answer.
I don't know exactly how to do that. If you have a code example, I would be interested, especially for the extractive task.
https://github.com/neuml/txtai/blob/master/examples/20_Extractive_QA_to_build_structured_data.ipynb

Once again, very good work, and the documentation is very rich :)

@davidmezzetti
Member

Thank you.

The comment here: huggingface/transformers#3207 (comment) has an example of how to apply the topk parameter and return multiple results.
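
One possible way to get from multiple answers to the table requested at the top of the thread is sketched below. It is only an illustration under stated assumptions: `ask` is a hypothetical callable that returns a list of answer strings for a question, a context and a maximum count, and asking a follow-up dose question per drug is a design choice of this sketch, not something txtai does.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: ask(question, context, k) -> list of answer strings.
Ask = Callable[[str, str, int], List[str]]


def drug_dose_rows(
    ask: Ask, contexts: List[str], k: int = 2
) -> List[Tuple[str, Optional[str], str]]:
    """Build one (drug, dose, text) row per distinct drug found in each context."""
    rows = []
    for text in contexts:
        seen = []
        for drug in ask("What is the drug taken?", text, k):
            if drug in seen:
                continue  # skip repeated answers for the same context
            seen.append(drug)
            # Follow-up question ties a dose to this specific drug.
            doses = ask(f"What is the dose of {drug}?", text, 1)
            rows.append((drug, doses[0] if doses else None, text))
    return rows
```

The rows can then be passed to `pd.DataFrame(rows, columns=["Drug", "Dose", "Text"])`. Real models often return overlapping spans among their top answers, so extra filtering or a score threshold would likely be needed in practice.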

@casafurix

Can this work for SageMaker? I am trying to deploy this pipeline in SageMaker. How can we customize the topk parameter there, given that SageMaker works a bit differently? SageMaker currently returns only one answer, and I am unable to modify the parameters to increase the number of returned items. Thank you.

This is my model and pipeline setup in SageMaker:
hub = {
'HF_MODEL_ID':'valhalla/t5-base-qa-qg-hl',
'HF_TASK':'text2text-generation'
}

@davidmezzetti
Member

davidmezzetti commented Jan 24, 2023

Unfortunately, I'm not too familiar with the Hugging Face SageMaker interface. The Hugging Face team may be able to help on that one. I'd take a look at asking here - https://discuss.huggingface.co/

@rcali21

rcali21 commented May 25, 2023

@SamPse This is a bit dated, but I'd like to implement this in a future PR. Do you mind providing your full code? Thanks!

@davidmezzetti
Member

Keeping this issue open, still a good issue to consider.

4 participants