Easier way to get a dictionary #80

Open
omae-muds opened this issue Aug 20, 2021 · 0 comments

I'm not familiar with this project, so there may be solutions I don't know about.

Motivation

A common way to use MeCab from Python today is to pip install mecab-python3 together with a dictionary package. I would like to install both mecab-python3 and mecab-ipadic-neologd from PyPI. So far, however, this dictionary is only available through a system package manager or the installer script.

Goal

I have two suggestions.

1. Make it possible to pip install mecab-ipadic-neologd

This would be an easy way to use MeCab with a good dictionary in Python. It is the ideal option, but harder to implement than the second one.

In Python, we often build virtual environments from package lists in files such as requirements.txt, Pipfile, or pyproject.toml. Currently, this dictionary cannot be listed in those files, and it can only be installed in a way that affects the system outside the virtual environment. Publishing it on PyPI would solve this problem.
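
To make suggestion 1 more concrete, here is a minimal sketch of how a pip-installable dictionary package could expose its bundled data so that MeCab can find it inside the virtual environment. Everything here is an assumption for illustration: no such package exists on PyPI at the moment, and the function names and the "dicdir" subdirectory are hypothetical. Only the "-r /dev/null -d <path>" argument form is taken from this issue.

```python
from pathlib import Path


def dictionary_dir(package_dir, subdir="dicdir"):
    """Return the directory where a (hypothetical) package bundles its dictionary.

    `package_dir` is the installed package directory inside the virtual
    environment; `subdir` is an assumed layout, not an existing convention
    of this project.
    """
    return Path(package_dir) / subdir


def mecab_args(dicdir):
    """Build the argument string to pass to MeCab.Tagger for a dictionary directory."""
    # "-r /dev/null" skips the system-wide mecabrc, "-d" selects the dictionary.
    return f"-r /dev/null -d {Path(dicdir).as_posix()}"
```

With a layout like this, a user would only need to list the package in requirements.txt and pass the generated argument string to MeCab.Tagger, without touching anything outside the virtual environment.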

2. Releasing the latest dictionary zip via GitHub Actions

This is a simple way to satisfy people who want the dictionary data.

In GitHub Actions, run the equivalent of the commands described in the README and publish the generated dictionary as a zip file.
After downloading and extracting the zip, the dictionary can be used with, for example, tagger = MeCab.Tagger("-r /dev/null -d ./dic/mecab-ipadic-neologd").
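
The user-side steps for suggestion 2 could look roughly like the sketch below. It assumes that the zip has already been downloaded and that it contains a top-level mecab-ipadic-neologd directory. The zip file name and both helper functions are hypothetical, since no such release exists yet. MeCab is imported lazily so that the extraction step works even where mecab-python3 is not installed.

```python
import zipfile
from pathlib import Path


def extract_dictionary(zip_path, dest="./dic"):
    """Unpack a released dictionary zip and return the extraction directory."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest


def make_tagger(dic_dir):
    """Create a MeCab tagger that uses the extracted dictionary directory."""
    import MeCab  # deferred import; requires mecab-python3 to be installed

    # Same arguments as in the issue text: ignore mecabrc, point -d at the dictionary.
    return MeCab.Tagger(f"-r /dev/null -d {Path(dic_dir).as_posix()}")


# Example usage (the zip name is an assumption):
# dic_root = extract_dictionary("mecab-ipadic-neologd.zip", "./dic")
# tagger = make_tagger(dic_root / "mecab-ipadic-neologd")
# print(tagger.parse("some Japanese text"))
```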

However, this repository already publishes a release every few years. If there were (for example) two zip releases every week, they could be confused with the existing releases. Also, an automated release may contain problems that go unnoticed because nobody checks it manually (e.g. corrupted data or a failed compression step).


At first, I tried to implement suggestion 2 in my own repository (partly because I wanted to learn GitHub Actions).
However, I am not experienced with this kind of thing, and I am unsure how to handle the license. Is it enough to put a copy of COPYING in the repository and in the zip, and to mention it in the README?

If this issue is outside the scope of this project, I would like to implement suggestion 2 myself. In that case, would the approach above cause any license problems?

Alternatively, if you choose suggestion 2 or something similar, I could help, since I have already written a workflow YAML file in a private repository.

@omae-muds omae-muds changed the title より簡単な辞書の入手方法 Easier way to get a dictionary Aug 21, 2021