
Container ? #33

Open
Jean-Baptiste-Lasselle opened this issue Feb 28, 2022 · 2 comments


@Jean-Baptiste-Lasselle

Description

I would like to run mugen in a container.

Rationale

Because it would be so much faster and simpler to run, and because it could prompt thinking about how to scale the service out (e.g. scale a Kubernetes deployment to 20 pods, have each of the 20 pods process 2 seconds of the video, then put everything back together at the end and return it to the request issuer).
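The fan-out idea above (each pod handling a short slice of the video) could start from a helper like this; the 2-second segment length is just the figure from the example above, not anything mugen currently does:

```python
def split_into_segments(duration: float, segment_length: float = 2.0):
    """Split [0, duration) into (start, end) slices, one per worker pod."""
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + segment_length, duration)
        segments.append((start, end))
        start = end
    return segments

# A 7-second video split into 2-second slices yields four work items:
# [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0), (6.0, 7.0)]
```

Each slice would be dispatched to a pod, and the rendered pieces concatenated in order before returning the result.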

Alternatives

Running conda in a virtual machine.

Additional context

@Jean-Baptiste-Lasselle
Author

I will propose a Dockerfile in a PR as soon as possible. @scherroman, do you have any plans to add an OCI container image definition?
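A minimal sketch of what such a Dockerfile might look like, assuming mugen installs via pip and depends on ffmpeg and tesseract as system packages (the base image, package names, and entrypoint are all assumptions, not confirmed project details):

```dockerfile
# Illustrative sketch only; base image, package names, and entrypoint are assumptions
FROM python:3.9-slim

# System dependencies mugen relies on: ffmpeg for video processing,
# tesseract for text detection via pytesseract
RUN apt-get update && apt-get install -y --no-install-recommends \
        ffmpeg tesseract-ocr \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .

# Assumes the package exposes a `mugen` command; adjust to the actual CLI
ENTRYPOINT ["mugen"]
```

Source and output directories would be bind-mounted at run time, e.g. `docker run -v "$PWD:/data" mugen ...`.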

@scherroman
Owner

scherroman commented Mar 5, 2022

Hey @Jean-Baptiste-Lasselle, I love the idea of a standard Dockerfile for the project if it makes running mugen simpler and more flexible; I hadn't thought of this! I'm no stranger to Dockerfiles, so I'll play around with this next week. I'd also be happy to review one if you propose it.

I've been thinking a lot about how to speed up the creation process. Right now it's a run-it-and-forget-it-for-an-hour-or-two type of thing, but it doesn't necessarily have to be that way. In particular, the analysis of each randomly selected clip for scene cuts and text is quite time-consuming and inefficient, and a fair number of perfectly good clips are thrown out as false positives. Switching over from tesserocr to pytesseract likely slowed mugen down slightly, as there are now disk writes/reads involved in the text detection; but whereas tesserocr was not working properly cross-platform, pytesseract is, and it's better maintained.

This past week I spent some time writing tests for the detection functions, tweaking them to get the best results, and researching/testing alternatives. For scene cut detection, TransNetV2 was quite impressive but slower, had some weaknesses with cuts at the very beginning or end of short clips, and wasn't the smoothest installation. A Dockerfile could potentially help with more complex setup steps like that, so it's something to think about (they actually provide a way to run the program from a Dockerfile). PySceneDetect was not great. I also found that a large number of false positives can be reduced by combining moviepy's scene detection (what we currently use) with ffprobe's libav scene detection at a low threshold, which is very fast but inaccurate on its own. Of course, that means running a second scene detection function whenever a clip with a cut is detected, which again slows down the process slightly.
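One way the ffprobe/libav pre-filter could be wired in is sketched below; the filter string and threshold are my assumptions about a typical invocation, not mugen's actual implementation:

```python
import subprocess


def libav_scene_cmd(path: str, threshold: float = 0.1):
    """Build an ffprobe command that prints the timestamp of each frame whose
    libav scene-change score exceeds `threshold` (a low threshold gives a
    fast, permissive pre-filter; surviving clips get a second, slower check).
    """
    return [
        "ffprobe", "-f", "lavfi",
        # Backslash escapes the comma inside the lavfi filter graph
        "-i", f"movie={path},select=gt(scene\\,{threshold})",
        # Field may be named pkt_pts_time on older ffmpeg builds
        "-show_entries", "frame=pts_time",
        "-of", "csv=p=0",
    ]


def parse_timestamps(output: str):
    """Parse ffprobe's one-timestamp-per-line CSV output into floats."""
    return [float(line) for line in output.splitlines() if line.strip()]


# Usage sketch (requires ffprobe on PATH):
# result = subprocess.run(libav_scene_cmd("clip.mp4", 0.1),
#                         capture_output=True, text=True)
# cuts = parse_timestamps(result.stdout)
```

Clips with no timestamps from this cheap pass could skip the slower moviepy check entirely.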

So there's a balance here. I want to ensure mugen is fast, but first and foremost I want to ensure that it's easy to install and use, maintainable, working as expected, and not throwing out good scenes. In the immediate term, to speed things up I'll be looking to:

  1. Enable performing any necessary analysis up front, once per video. This will take some time initially, but will make creating music videos from the same sources blazingly fast afterwards. There has also been the suggestion in How to speedup? #27 of allowing a GPU or multiple cores to be used for the selection and analysis, something I'll be keeping in mind in relation to this.

  2. Provide an easy way for users to manually specify exclusion zones for their videos and groups of videos, to exclude opening/ending sequences and credits. This would take some extra manual effort up front, but would improve results and speed up creation overall by letting us remove the need for a text detection filter by default. Tesseract's text detection works decently, but there are too many distorted or hand-drawn credit sequences it doesn't detect, and too many perfectly good scenes where it falsely detects text, causing us to throw them out. Nor does it help in excluding credit-less opening and ending sequences in series and movies. I've thought about training my own credits detection model to help with this, but that would be a little too far down the rabbit hole for my liking at this point in time.
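The exclusion-zone idea in point 2 could be as simple as user-supplied (start, end) second intervals per video, checked before any expensive analysis; a minimal sketch (the interval format is an assumption, not a committed design):

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def clip_allowed(clip, exclusion_zones):
    """Reject a candidate clip if it touches any user-specified exclusion zone
    (e.g. opening/ending sequences or credits), skipping text detection for it.
    """
    return not any(overlaps(clip, zone) for zone in exclusion_zones)


# e.g. exclude a 90-second opening and the closing credits of a 24-minute episode
zones = [(0.0, 90.0), (1320.0, 1440.0)]
```

With zones like these in place, the default text detection filter could be skipped, since the frames most likely to contain credits never become candidates.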
