Replies: 1 comment
-
It's a very good question and honestly, I don't know. When training, I only used augmentation techniques like resizing and flipping, so there might be some gains from adding additional augmentation techniques powered by OpenCV or similar. Because I did not train the vision models with these techniques, I did not add any pre-processing step either. If you want to try some pre-processing, you could do something like this (I haven't tried it myself, though):

```python
import deepdoctection as dd


def my_pre_proc_func(np_image):
    # your implementation that pre-processes and returns the transformed image
    ...


class PreProcessing(dd.ImageTransformer):
    # a predictor that runs in a SimpleTransformService
    def __init__(self):
        self.name = "preproc"
        self.model_id = self.get_model_id()

    def transform(self, np_img, specification):
        return my_pre_proc_func(np_img)

    def predict(self, np_img):
        return dd.DetectionResult(document_type="my_pre_proc_func")

    def clone(self):
        return self.__class__()

    @staticmethod
    def possible_category():
        return dd.PageType.document_type


preproc_component = dd.SimpleTransformService(PreProcessing())
analyzer = dd.get_dd_analyzer()

# inject the pre-processing step at the beginning of the pipeline
# (insert rather than overwrite, so the original first component is kept)
analyzer.pipe_component_list.insert(0, preproc_component)

df = analyzer.analyze(path=...)  # the usual stuff
```

It would be interesting to hear whether you observe improvements!
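As a concrete illustration of what `my_pre_proc_func` could look like, here is a minimal sketch of a global-mean binarization using only NumPy. This is a hypothetical toy example, not part of deepdoctection; a real pipeline would more likely use OpenCV functions such as `cv2.threshold` or adaptive denoising. It assumes an 8-bit image of shape `(H, W)` or `(H, W, 3)` and returns the same shape and dtype, so downstream predictors are unaffected:

```python
import numpy as np


def my_pre_proc_func(np_image):
    """Toy pre-processing: binarize the image at its global mean intensity.

    Returns an array with the same shape and uint8 dtype as the input, so it
    can be dropped into the ImageTransformer.transform sketch above.
    """
    img = np_image.astype(np.float32)
    # collapse channels to a luminance estimate if the image is RGB
    gray = img.mean(axis=-1) if img.ndim == 3 else img
    threshold = gray.mean()
    binary = np.where(gray > threshold, 255, 0).astype(np.uint8)
    if np_image.ndim == 3:
        # broadcast back to 3 channels so the output shape matches the input
        binary = np.stack([binary] * 3, axis=-1)
    return binary
```

Global-mean thresholding is deliberately crude; for scanned documents with uneven illumination, a local (adaptive) threshold usually works much better, but the calling convention stays the same.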
-
I wonder whether PDF/image pre-processing is necessary to improve layout segmentation and OCR accuracy. I have tried Table Recognition + DocTR on two different PDFs: one works perfectly, the other fails to perform segmentation.