Create Named Entity Recognition Skill #30

Open
tianchiguaixia opened this issue Nov 7, 2023 · 19 comments

Comments

@tianchiguaixia

This is very important. Can we do this first?

@niklub
Contributor

niklub commented Nov 7, 2023

Hi, @tianchiguaixia! Absolutely, do you have any use case in mind to test it out?

@tianchiguaixia
Author

I have many projects that structure the results of text and image OCR.

With BERT, data has to be annotated continuously for training, which costs a significant amount of time.

  1. The fields to extract also differ from project to project, so it is difficult to build one solution that works for all of them.

  2. With an LLM, prompts can achieve partial structuring, but the accuracy is not as high as with a fine-tuned BERT model.

The ideal method is to combine an LLM with system knowledge (existing knowledge) for information extraction and NER.

As the knowledge already filled into the system grows, the model's effectiveness improves.
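
A minimal sketch of the "LLM + existing knowledge" idea, assuming the knowledge is a store of samples that were already corrected by hand and that the most similar ones are placed into the prompt as few-shot examples. `KnowledgeStore`, `build_ner_prompt`, the field names, and the token-overlap similarity are all hypothetical stand-ins, not part of this project. The prompt would then be sent to whichever LLM is in use.

```python
import json
from dataclasses import dataclass, field


@dataclass
class KnowledgeStore:
    """Hypothetical store of samples that a human has already corrected."""
    examples: list = field(default_factory=list)

    def add(self, text: str, entities: dict) -> None:
        self.examples.append({"text": text, "entities": entities})

    def most_similar(self, text: str, k: int = 2) -> list:
        # Stand-in similarity: number of shared whitespace-separated tokens.
        query = set(text.lower().split())
        ranked = sorted(
            self.examples,
            key=lambda ex: len(query & set(ex["text"].lower().split())),
            reverse=True,
        )
        return ranked[:k]


def build_ner_prompt(store: KnowledgeStore, text: str, fields: list, k: int = 2) -> str:
    """Build a few-shot extraction prompt from the k most similar stored samples."""
    lines = [
        "Extract the following fields from the text and answer with JSON only.",
        "Fields: " + ", ".join(fields),
        "",
    ]
    for ex in store.most_similar(text, k):
        lines.append("Text: " + ex["text"])
        lines.append("JSON: " + json.dumps(ex["entities"], ensure_ascii=False))
        lines.append("")
    lines.append("Text: " + text)
    lines.append("JSON:")
    return "\n".join(lines)


# Illustrative data only.
store = KnowledgeStore()
store.add("Invoice No. 123 issued by Acme Ltd", {"invoice_no": "123", "seller": "Acme Ltd"})
store.add("Passport holder: Li Hua", {"name": "Li Hua"})

prompt = build_ner_prompt(
    store, "Invoice No. 456 issued by Beta Inc", ["invoice_no", "seller"], k=1
)
print(prompt)
```

Each corrected sample added to the store becomes a candidate example for later prompts, which is one way the results could improve as the stored knowledge grows.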

Uploading WeChat_20231108103427.mp4…

@tianchiguaixia
Author

Hello, can you do this NER task first? Approximately when will I be able to try it out?

@tianchiguaixia
Author

Supporting NER tasks is very important, and there are many application scenarios. Can we support this first?

@niklub
Contributor

niklub commented Nov 29, 2023

Hey, @tianchiguaixia, happy to implement that as soon as possible. Any ideas on how to most efficiently match entities from text using an LLM?

@tianchiguaixia
Author

The entities in a text cannot all be matched in one pass, and the evaluation criteria also differ from case to case. We need to rely on each incoming sample and let the model keep learning, so that it becomes smarter over time. The quality of learning can be evaluated using the manually corrected data together with the data that has been passed in.
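
One possible reading of this evaluation, as a rough sketch: treat the manually corrected data as the reference and score what the model returned for the same incoming samples. The function name `entity_prf`, the exact-match rule on (label, text) pairs, and the sample data are assumptions for illustration only.

```python
def entity_prf(predicted, corrected):
    """Micro-averaged precision, recall and F1 over (label, text) pairs.

    predicted / corrected: lists with one set of (label, text) tuples per sample,
    in the same order.
    """
    tp = fp = fn = 0
    for pred, gold in zip(predicted, corrected):
        tp += len(pred & gold)   # entities the model got exactly right
        fp += len(pred - gold)   # entities the model returned but a human removed or changed
        fn += len(gold - pred)   # entities a human added or changed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Illustrative data only: the second sample was fixed by a human ("Li Hu" -> "Li Hua").
predicted = [
    {("seller", "Acme Ltd"), ("invoice_no", "123")},
    {("name", "Li Hu")},
]
corrected = [
    {("seller", "Acme Ltd"), ("invoice_no", "123")},
    {("name", "Li Hua")},
]

p, r, f = entity_prf(predicted, corrected)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Tracking these numbers as more corrected samples come in would show whether the model is actually getting better.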

@tianchiguaixia
Author

These are example cases that I implemented using space_llm and the ChatGPT API.

Rec.0004.mp4

@tianchiguaixia
Author

Rec.0002.mp4

@tianchiguaixia
Author

Rec.0006.mp4

@tianbuwei

> These are example cases that I implemented using space_llm and the ChatGPT API.
>
> Rec.0004.mp4

Hello, may I ask how you input the image information into ChatGPT? Are you using a multimodal model for this?

@tianchiguaixia
Author

No. I just use OCR + LLM directly.

@tianbuwei

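> No. I just use OCR + LLM directly.

Hello, would it be all right if I added you on WeChat? I would like to learn in more detail how you implemented this. The features you showed on the page look really impressive.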

@tianchiguaixia
Author

tianchiguaixia2023

@tianbuwei

> tianchiguaixia2023

Hello, what you gave me is a WeChat ID, but I can't find you with it. Could you please add me instead?
tianhesuo

@tianbuwei

> tianchiguaixia2023

Have you perhaps set your account so that strangers cannot add you?

@Sean-Koval

I am interested in tackling the NER skill implementation if the task is still open. I have multiple use cases at work where something like this would be valuable.

@niklub
Contributor

niklub commented Feb 1, 2024

Hello @Sean-Koval, @tianchiguaixia, @tianbuwei, we are going to implement a simple version of the NER skill in #57. Happy to get your feedback and suggestions on it!

@amenhere

amenhere commented May 8, 2024

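Hello, I am also working on an LLM-based NER task. My data is unstructured contract text. After LLM extraction, the results for some of the Party A / Party B labels are not very good: for example, Party A and Party B get confused, or both are extracted as the same entity. If it is convenient, I would like to discuss this with you in detail.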

@tianchiguaixia
Author

Prompts.


5 participants