ValueError: Coordinate 'right' is less than 'left' #11

Open
atgreen opened this issue Apr 8, 2024 · 1 comment
Labels: bug, good first issue

Comments

@atgreen

atgreen commented Apr 8, 2024

Given this code:

import openparse

basic_doc_path = "mydoc.pdf"
parser = openparse.DocumentParser(
    table_args={
        "parsing_algorithm": "unitable",
        "min_table_confidence": 0.8,
    }
)

parsed_basic_doc = parser.parse(basic_doc_path)

for node in parsed_basic_doc.nodes:
    print(node.json())

I'm getting the following error:

  File "/home/green/git/cl-langtools/test.py", line 11, in <module>
    parsed_basic_doc = parser.parse(basic_doc_path)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/doc_parser.py", line 106, in parse
    table_elems = tables.ingest(doc, table_args_obj, verbose=self._verbose)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/parse.py", line 223, in ingest
    return _ingest_with_unitable(doc, parsing_args, verbose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/parse.py", line 189, in _ingest_with_unitable
    table_str = table_img_to_html(table_img)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/unitable/core.py", line 192, in table_img_to_html
    pred_cell_lst = predict_cells(image_tensor, pred_bbox, table_image)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/unitable/core.py", line 160, in predict_cells
    _image_to_tensor(image.crop(bbox), size=(112, 448)) for bbox in pred_bboxes
                     ^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/PIL/Image.py", line 1237, in crop
    raise ValueError(msg)
ValueError: Coordinate 'right' is less than 'left'

If it helps, my input document is this one: https://www.rbc.com/investor-relations/_assets-custom/pdf/ar_2023_e.pdf
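
The traceback shows Pillow's `Image.crop` rejecting a predicted cell box whose right edge is less than its left edge. One possible guard, sketched here under the assumption that each entry in `pred_bboxes` is a `(left, top, right, bottom)` tuple in pixel coordinates, is to reorder and clamp each box before cropping. `sanitize_bbox` is a hypothetical helper, not part of openparse or Pillow, and whether swapping or dropping such boxes is the right behaviour for unitable is an open question.

```python
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]


def sanitize_bbox(
    bbox: BBox, width: int, height: int
) -> Optional[Tuple[int, int, int, int]]:
    """Hypothetical helper: return a crop-safe box, or None if it has no area."""
    x0, y0, x1, y1 = bbox

    # Reorder edges that were predicted in the wrong order.
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))

    # Round to whole pixels and clamp to the image bounds.
    left = max(0, min(int(round(left)), width))
    right = max(0, min(int(round(right)), width))
    top = max(0, min(int(round(top)), height))
    bottom = max(0, min(int(round(bottom)), height))

    # A box with zero width or height cannot contain a cell.
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom
```

In `predict_cells`, the comprehension could pass each box through a guard like this and skip the ones that come back as `None` before calling `image.crop`. Note that skipping boxes changes how many cells are predicted, so downstream code that pairs cells with HTML tags may need to account for that.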

@giovannibonetti

Thanks for this great library.

I'm also getting ValueError: Coordinate 'right' is less than 'left' with this PDF and almost the same code:

import openparse

basic_doc_path = "sample.pdf"
parser = openparse.DocumentParser(
    table_args={
        "parsing_algorithm": "unitable",
        "min_table_confidence": 0.8
    },
)

parsed_doc = parser.parse(basic_doc_path)
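
Until the crop error is fixed, one possible workaround is to retry with a different table parser when unitable raises. The sketch below is a generic retry helper; `first_successful` is a hypothetical name, and the alternative algorithm named in the usage comment is an assumption about openparse's options, not something confirmed in this thread.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_successful(attempts: Iterable[Callable[[], T]]) -> T:
    """Hypothetical helper: call each attempt in order and return the first
    result that does not raise ValueError."""
    last_error: Optional[ValueError] = None
    for attempt in attempts:
        try:
            return attempt()
        except ValueError as exc:
            # Remember the failure and move on to the next attempt.
            last_error = exc
    if last_error is None:
        raise ValueError("no attempts were provided")
    raise last_error


# Possible usage (assumes "pymupdf" is an accepted parsing_algorithm):
# parsed_doc = first_successful([
#     lambda: openparse.DocumentParser(
#         table_args={"parsing_algorithm": "unitable", "min_table_confidence": 0.8}
#     ).parse(basic_doc_path),
#     lambda: openparse.DocumentParser(
#         table_args={"parsing_algorithm": "pymupdf"}
#     ).parse(basic_doc_path),
# ])
```

This only hides the failure for affected documents; the tables in those documents would be parsed by the fallback algorithm rather than by unitable.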

@Filimoa added the "bug" label on Apr 8, 2024
@Filimoa changed the title from "Error parsing with unitable" to "ValueError: Coordinate 'right' is less than 'left'" on Apr 11, 2024
@Filimoa added the "good first issue" label on May 2, 2024