Issues: modelscope/data-juicer
Issues list
Why does stopwords_filter filter out samples below a certain threshold?
question
Further information is requested
#307
opened Apr 25, 2024 by
noforit
3 tasks done
hash calculate in ray deduplicator
question
Further information is requested
#286
opened Mar 29, 2024 by
simplew2011
3 tasks done
Do filters support batch processing, and how do I set batch_size?
enhancement
New feature or request
question
Further information is requested
#285
opened Mar 29, 2024 by
Yang-QW
3 tasks done
[Feature Request] Implement more streamlined interfaces for users seeking minimal functionality (data_juicer.op.functional)
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#261
opened Mar 15, 2024 by
yxdyc
2 tasks done
support panda's student captioner model in our captioning mapper
dj:multimodal
issues/PRs about multimodal data processing
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#251
opened Mar 14, 2024 by
yxdyc
Potential performance Issue: Slow read_csv() Function with pandas 2.0.0
#224
opened Mar 2, 2024 by
TendouArisu