How do I check Taskflow's default maximum sequence length? How do I set the maximum sequence length in FastDeploy UIE? #8408

Open
goldwater668 opened this issue May 9, 2024 · 12 comments
Labels
question Further information is requested


@goldwater668

Please ask your question

Taskflow("information_extraction", schema=schema_uie,task_path=model_path,use_fast=True,batch_size=32,precision='fp16')
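I trained the model with the maximum sequence length set to 512. Is the default sequence length in the call above also 512? Inference takes 67 ms.
But when I deploy with FastDeploy UIE and set the maximum sequence length to 512, inference takes 112 ms.
Why is that?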
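
When comparing two latency figures like these, it can help to time both paths the same way. The sketch below is a generic timing helper and is not part of Taskflow or FastDeploy: `time_call`, `warmup`, and `repeats` are hypothetical names, and `predict` stands for whichever callable is being measured. It discards a few warm-up runs, since first calls often include one-off initialization, and reports the median.

```python
import statistics
import time


def time_call(predict, payload, warmup=3, repeats=20):
    """Return the median latency of predict(payload) in milliseconds."""
    # Warm-up runs are executed but not measured.
    for _ in range(warmup):
        predict(payload)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Passing the same text to both deployments through a helper like this would at least rule out differences in how the two numbers were measured.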

@goldwater668 goldwater668 added the question Further information is requested label May 9, 2024
@w5688414
Contributor

You can refer to the source code:

self._max_seq_len = kwargs.get("max_seq_len", 512)

For FastDeploy questions, please refer to the repo:

https://github.com/PaddlePaddle/FastDeploy/tree/develop/examples/text/uie

@goldwater668
Author

Sorry to bother you. I don't see a pred_threshold setting in PaddleNLP/paddlenlp/taskflow/information_extraction.py.
I also don't see a probability threshold setting in https://github.com/PaddlePaddle/FastDeploy/tree/develop/examples/text/uie, only position_prob.
But with position_prob set to 0.5, I still get 'probability': 0.16951142251491547.
@w5688414 What is going on here?

@w5688414
Contributor

How can I reproduce this?

@goldwater668
Author

With the following call:

my_ie = Taskflow("information_extraction",
        schema=args.schema,
        task_path=args.model_dir,
        max_seq_len=512,
        use_fast=True,
        batch_size=32,
        pred_threshold=0.9,
        precision='fp16')

I also get 'probability': 0.6520814895629883

@goldwater668
Author

The same happens with position_prob=0.9. Does this mean the probability value cannot be filtered with position_prob or pred_threshold?

@w5688414
Contributor

I tested it, and position_prob works fine:

from paddlenlp import Taskflow
schema = ['Destination','Price','Time']


message = "Peace be upon you, dear. I want to book a ticket today to cairo in the evening with a price of 1000 to 2000 SAR"
my_ie = Taskflow("information_extraction",
        schema=schema,
        model='uie-base-en',
        max_seq_len=512,
        use_fast=True,
        batch_size=32,
        position_prob=0.9,
        precision='fp16'
        )
info = my_ie(message)
print(info)

self._position_prob = kwargs.get("position_prob", 0.5)
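
The two source lines quoted in this thread read their options with `kwargs.get`, which also explains where the defaults come from. The sketch below uses a hypothetical stand-in class (`OptionReader` is not a PaddleNLP name) to show the pattern. Whether Taskflow really ignores `pred_threshold` is an assumption here; the thread only establishes that the name does not appear in information_extraction.py.

```python
class OptionReader:
    """Hypothetical stand-in for a task class that reads options from kwargs."""

    def __init__(self, **kwargs):
        # Same pattern as the two source lines quoted in this thread.
        self._max_seq_len = kwargs.get("max_seq_len", 512)
        self._position_prob = kwargs.get("position_prob", 0.5)


defaults = OptionReader()
print(defaults._max_seq_len, defaults._position_prob)  # 512 0.5

# A key that __init__ never reads changes nothing and raises no error.
ignored = OptionReader(pred_threshold=0.9)
print(ignored._position_prob)  # 0.5
```

If the real class behaves like this, passing `pred_threshold=0.9` would leave the threshold at its default of 0.5, so `position_prob` would be the option to set.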

@goldwater668
Author

With position_prob=0.9, should every 'probability' be greater than or equal to 0.9?

@w5688414
Contributor

See the source code: the comparison is strictly greater than.

if p > limit:
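
A minimal sketch of what a strict comparison means at the boundary. `keep_above` is a hypothetical helper written for illustration, not the library function that contains the quoted line.

```python
def keep_above(probs, limit=0.5):
    """Keep (index, prob) pairs whose probability is strictly above limit."""
    return [(i, p) for i, p in enumerate(probs) if p > limit]


# A value exactly equal to the limit is dropped, because 0.9 > 0.9 is False.
print(keep_above([0.95, 0.9, 0.3], limit=0.9))  # [(0, 0.95)]
```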

@goldwater668
Author

I set position_prob=0.9 and still got 'probability': 0.8505142460404365. I don't know why.
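
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if `position_prob` is applied to the start and end position probabilities separately, and the reported `probability` of a span is the product of the two, then a span can pass a 0.9 threshold and still report a value below 0.9. The helper name and the input numbers below are made up for illustration.

```python
def span_probability(start_prob, end_prob, limit):
    """Return start_prob * end_prob if both exceed limit, else None.

    Assumption: the threshold applies per position and the span score
    is the product of the two. This is a sketch, not PaddleNLP's code.
    """
    if start_prob > limit and end_prob > limit:
        return start_prob * end_prob
    return None


# Both positions clear 0.9, yet their product falls below 0.9.
score = span_probability(0.92, 0.93, limit=0.9)
print(score is not None and score < 0.9)  # True
```

Under this assumption, the lowest reportable value for a threshold `t` would be just above `t * t`, which is 0.81 for 0.9.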

@w5688414
Contributor

How can this case be reproduced?

@goldwater668
Author

This is exactly what I used:

my_ie = Taskflow("information_extraction",
        schema=args.schema,
        task_path=args.model_dir,
        max_seq_len=512,
        use_fast=True,
        batch_size=32,
        pred_threshold=0.9,
        precision='fp16')

The model was obtained by training UIE on my own dataset.
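
If the goal is simply to drop low-confidence results, one workaround is to filter the output after the call, whichever option the model honors. The sketch below assumes the result is a list of dicts mapping each schema label to a list of entries that carry a `'probability'` key, as in the values quoted above. `filter_by_probability` is a hypothetical helper, the sample data is invented, and nested relation results are not handled.

```python
def filter_by_probability(results, min_prob):
    """Drop extracted entries whose 'probability' is below min_prob."""
    filtered = []
    for record in results:
        kept = {}
        for label, entries in record.items():
            strong = [e for e in entries if e["probability"] >= min_prob]
            if strong:  # omit labels that end up with no entries
                kept[label] = strong
        filtered.append(kept)
    return filtered


# Hypothetical output, shaped like the results discussed above.
sample = [{"Destination": [{"text": "cairo", "probability": 0.97}],
           "Price": [{"text": "1000", "probability": 0.65}]}]
print(filter_by_probability(sample, 0.9))
# [{'Destination': [{'text': 'cairo', 'probability': 0.97}]}]
```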

@goldwater668
Author

@w5688414 Can't FastDeploy and Taskflow coexist?
PaddlePaddle/FastDeploy#2257
#8495


3 participants