BUG TextField objects are not editable when added to an existing PDF #138

Open
stimon opened this issue Oct 30, 2022 · 5 comments

@stimon

stimon commented Oct 30, 2022

Describe the bug
TextField objects that are added to existing PDF pages with absolute positioning are displayed, but not editable.

To Reproduce

  1. Open an existing PDF.
  2. Iterate over the document pages and add a TextField object at a specific position with the paint method.
  3. Save the document as a new PDF.

from decimal import Decimal

from borb.pdf import PDF
from borb.pdf import TextField
from borb.pdf.canvas.geometry.rectangle import Rectangle

# Load the existing PDF.
with open("questionnaire.pdf", "rb") as in_file_handle:
    doc = PDF.loads(in_file_handle)

info = doc.get_document_info()
N = int(info.get_number_of_pages())

# Rectangle(x, y, width, height): a 100 x 5 box near the bottom-left corner.
r: Rectangle = Rectangle(
    Decimal(50),
    Decimal(5),
    Decimal(100),
    Decimal(5),
)

# A single TextField instance is painted onto every page.
subfield = TextField(field_name="subject_id")
for i in range(0, N):
    page = doc.get_page(i)
    subfield.paint(page, r)

# Save the result as a new PDF.
with open("borbtest.pdf", "wb") as pdf_out_handle:
    PDF.dumps(pdf_out_handle, doc)

Expected behaviour
The TextField object should be editable in the output PDF.
The aim is that entering text in any of the pages updates all text fields. That is why I'm using one TextField object for all pages. I'm unsure if this is the right approach or possible at all.

Screenshots
screenshot

Desktop (please complete the following information):

  • OS: macOS 12.6
  • borb version: 2.1.5.2
  • The input PDF is a Clinical Research Form and should not be shared because it includes licensed clinical tests.

Additional context
The aim is to add an input text field to enter the subject ID and carry it over all pages before printing.

@jorisschellekens
Owner

I think I can spot the problem (although I may be wrong).

You are adding the same TextField to every Page. For other LayoutElement objects this wouldn't matter. But FormField objects are special.

If you want 5 TextFields, you have to create 5 separate TextFields.

Can you change your code to try out this idea? Just move the line where you construct the TextField inside the loop.

Kind regards,
Joris Schellekens

@stimon
Author

stimon commented Oct 31, 2022

Hi Joris,

It doesn't do the trick :(
I also gave each field a unique name just in case, but got the same result:

from decimal import Decimal

from borb.pdf import PDF
from borb.pdf import TextField
from borb.pdf.canvas.geometry.rectangle import Rectangle

with open("crf_textfield.pdf", "rb") as in_file_handle:
    doc = PDF.loads(in_file_handle)

info = doc.get_document_info()
N = int(info.get_number_of_pages())
r: Rectangle = Rectangle(
    Decimal(50),
    Decimal(5),
    Decimal(100),
    Decimal(5),
)

# A new TextField with a unique name is created for every page.
for i in range(0, N):
    page = doc.get_page(i)
    TextField(field_name=f"subject_id_{i}").paint(page, r)

with open("borbtest.pdf", "wb") as pdf_out_handle:
    PDF.dumps(pdf_out_handle, doc)

I extracted 5 pages for these tests so I can share the input file: crf_textfield.pdf

The output file I got is this one: borbtest.pdf

@jorisschellekens
Owner

jorisschellekens commented Nov 3, 2022

I made the test even simpler by taking just the first Page.
The problem also manifests there.

It seems like your PDF is applying some kind of strange coordinate transform.
You can see the result here: borbtest_1_page.pdf

The rectangle for the TextField is added at the bottom of the Page. But the actual text of the TextField (Lorem Ipsum) is added at the top of the Page.

I'm guessing the same problem manifests itself in the larger PDF. Perhaps, because you are drawing so close to the boundary of the Page, the transformed coordinates end up being off the Page entirely (at which point the PDF reader software might decide not to render the field anymore).
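
To illustrate the kind of effect described above, here is a minimal sketch in plain Python (it does not use the borb API). It assumes the existing page content changes the current transformation matrix (CTM) with a `cm` operator and never restores it, so anything appended afterwards inherits that transform. The two matrices and the page height of 842 points are hypothetical and were not extracted from crf_textfield.pdf. The tokenizer is deliberately naive: it only tracks `q`, `Q` and `cm`, and ignores strings and inline images.

```python
from typing import List, Tuple

# A PDF matrix [a b c d e f] maps (x, y) to (a*x + c*y + e, b*x + d*y + f).
Matrix = Tuple[float, float, float, float, float, float]
IDENTITY: Matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m: Matrix, n: Matrix) -> Matrix:
    """Return the matrix that applies m first, then n."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def ctm_after(content: str) -> Matrix:
    """Track only q, Q and cm; return the CTM left at the end of the stream."""
    ctm = IDENTITY
    saved: List[Matrix] = []
    operands: List[float] = []
    for token in content.split():
        try:
            operands.append(float(token))
            continue
        except ValueError:
            pass  # not a number, so it is treated as an operator
        if token == "q":
            saved.append(ctm)
        elif token == "Q" and saved:
            ctm = saved.pop()
        elif token == "cm" and len(operands) >= 6:
            a, b, c, d, e, f = operands[-6:]
            ctm = multiply((a, b, c, d, e, f), ctm)
        operands = []  # every operator consumes the operands before it
    return ctm


def apply(ctm: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Transform the point (x, y) with the given CTM."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


# Hypothetical existing page content that leaves a transform behind.
flipped = "1 0 0 -1 0 842 cm 0 0 m 100 100 l S"   # vertical flip
shifted = "1 0 0 1 0 -20 cm 0 0 m 100 100 l S"    # downward shift

print(apply(ctm_after(flipped), 50.0, 5.0))  # (50.0, 837.0): bottom becomes top
print(apply(ctm_after(shifted), 50.0, 5.0))  # (50.0, -15.0): below the page

# Wrapping the existing content in q ... Q restores the original CTM.
wrapped = "q " + flipped + " Q"
print(apply(ctm_after(wrapped), 50.0, 5.0))  # (50.0, 5.0)
```

If something like this is what happens in the input file, content drawn into the page content stream would be moved, while a widget annotation positioned by its own rectangle would not, which could explain a border and a text that end up in different places. Wrapping the existing content in `q` ... `Q` before appending is a common way to isolate new content from a leftover transform; whether it applies to this particular PDF is untested.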

@stimon
Author

stimon commented Nov 5, 2022

Hi Joris,

Thank you for investigating this. I'm not sure I have many options to address the issue then, because I don't have much control over how the PDF is built.
The source PDF is generated with SDAPS, a toolkit to create and process OMR questionnaires. Then, I make some modifications using Coherent PDF Command Line Tools: basically, just merging pages from the test PDFs and writing the result to the final file I uploaded.
I know SDAPS does quite a few things under the hood during setup, but I don't have enough knowledge about PDF to dive that deep.

The thing is that a colleague did this "manually" using Acrobat Pro, so I thought of doing it programmatically.
Maybe I can reach out to the SDAPS people and ask around.

Thanks again

@githobbes

githobbes commented Nov 16, 2022

Simon,

I was experiencing a similar issue: for some reason, the PDF that I was developing places the Text_Field object in the correct place, but applies a linear shift to the coordinates when drawing the border for the Text_Field object.

A particularly weird aspect of this is that this problem only appeared after I updated to the current version.

I'm not sure if this is useful information to you, but since the borders for the boxes are not necessary for my document, I just set the border_**** parameters to False for the Text_Field objects.

-- Michael
