
> **Note:** This repository is archived and read-only. See lazysys for the revamped version.


# LazyShorts

A command-line tool to convert long-form videos into multiple short-form videos, with burned-in text and subtitles. It also cuts out unwanted silence.
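As a rough illustration of the silence-cutting idea (not necessarily how LazyShorts implements it internally), here is a minimal sketch using pydub's `detect_nonsilent`; the file name and thresholds are placeholders:

```python
# Hypothetical illustration of silence detection with pydub.
# Everything outside the detected ranges is the "unwanted silence" to cut.
from pydub import AudioSegment
from pydub.silence import detect_nonsilent

audio = AudioSegment.from_file("input.mp4")  # placeholder file name

nonsilent_ranges_ms = detect_nonsilent(
    audio,
    min_silence_len=500,   # ignore pauses shorter than 0.5 s
    silence_thresh=-40,    # dBFS level below which audio counts as silence
)
print(nonsilent_ranges_ms)  # e.g. [[0, 4200], [5100, 9800], ...]
```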

## Preview

### Original video

original.mp4

### Result (with manual subtitle correction)

corrected.mp4

### Result (without manual subtitle correction)

As you can see, the medium model works quite well for Hungarian, considering the poor quality of my input. The large model could do even better, if you have the hardware. :)

whisper.mp4

## Notes

### Arguments

See `lazyshorts -h`.

### Subtitles

I use Whisper to transcribe audible speech to text (a minimal example of what the transcription call looks like is sketched after the list below).

With non-English languages the accuracy can obviously be lower; you can improve it by...

- ...using a different model (for now, only Whisper models are supported; be wary, the medium model is hard to run even with 8 GB of RAM);
- ...editing subtitles manually, segment by segment (`{lazyshorts-py} e1 2 45 78...`).
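For reference, a minimal sketch of a transcription call with the openai-whisper Python package; the model, file name, and language below are placeholders, not LazyShorts defaults:

```python
# Sketch of the transcription step with openai-whisper.
import whisper

model = whisper.load_model("medium")               # "small" is cheaper, "large" is more accurate
result = model.transcribe("input.mp4", language="hu")

# Each segment carries start/end timestamps and the recognized text.
for segment in result["segments"]:
    print(f'{segment["start"]:.2f}-{segment["end"]:.2f}: {segment["text"]}')
```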

### Not tested

- I don't know whether running Whisper on a GPU works; you could try CUDA. See `--whisper_device` and the PyTorch/Whisper documentation. Also, install the CUDA-enabled PyTorch build, since requirements.txt pins the CPU-only one. A rough sketch follows.
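If you want to experiment, this is roughly what GPU loading would look like; it is untested here and assumes you have replaced the CPU-only PyTorch from requirements.txt with a CUDA build:

```python
# Untested sketch: load Whisper on a GPU if CUDA is available, otherwise fall back to CPU.
import torch
import whisper

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("medium", device=device)
print(f"Whisper running on: {device}")
```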

### Known issues

- `subprocess.run` somehow blocks the UI process (a possible workaround is sketched below).
- We could use rich for nice progress bars; currently you have to poll the status of the renders manually.
- Cropping is just arbitrary. I wanted to use MediaPipe, and while it's not easy to even get running, my hardware was the real limit. Maybe a less demanding model, or running it in the cloud, is needed?
- Don't combine segments shorter than `end_time`; you'll get an exception.
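For the first two issues, a possible (unverified) workaround sketch: launch the render with `subprocess.Popen` and poll it instead of blocking on `subprocess.run`; the ffmpeg command below is only a placeholder:

```python
# Sketch of a non-blocking render: start the process and poll it periodically.
import subprocess
import time

proc = subprocess.Popen(
    ["ffmpeg", "-i", "input.mp4", "output.mp4"],   # placeholder render command
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)

while proc.poll() is None:        # None means the render is still running
    print("render still in progress...")
    time.sleep(1)

print(f"render finished with exit code {proc.returncode}")
```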