
Questions & Requests? #20

Open
akshowhini opened this issue Jul 5, 2020 · 34 comments

Comments

@akshowhini
Contributor

  1. eval_relations(gt=[ground_truth_relations], res=[your_relations], cmp_blank=True)
  • Since the objective is to find the merged relations of neighboring cells, isn't comparing blank relations the wrong evaluation metric? (See the sketch below.)
  2. rel_gen.py
  • That assumes every cell has unique, case-sensitive text, even though it is only used to generate relations for comparison against the predictions. Couldn't this produce false negatives?
  3. Evaluation stats & trained models
  • Can we expect the evaluation stats & trained models to be published?
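
For context, here's roughly how I'm calling the evaluation (a minimal sketch; json2Relations and the (precision, recall) return value are my reading of the repo, so please correct me if that's off):

import json
from scitsr.eval import json2Relations, eval_relations

# Ground-truth structure file for one table, converted to relations
# (json2Relations / splitted_content are assumptions from my reading of the repo).
gt_json = json.load(open("some_table_structure.json"))
ground_truth_relations = json2Relations(gt_json, splitted_content=True)

# your_relations: the relation list your model produced for the same table
precision, recall = eval_relations(
    gt=[ground_truth_relations], res=[your_relations], cmp_blank=True)

# cmp_blank=False would, as I understand it, skip blank relations in the
# comparison, which is what the first question above is about.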
@abhyantrika

Hi,
Do you know how to recover the graph/table structure from the model predictions?

@akshowhini
Contributor Author

akshowhini commented Jul 7, 2020

Unfortunately not, I'm still trying to understand that piece of it. I'm hoping for help from @CZWin32768, the author of this repo.

@rmporsch
Contributor

rmporsch commented Jul 7, 2020

@akshowhini I am not one of the authors; I just fixed an issue when I was looking over this repo. I believe table reconstruction is not included, and in my experience it is a topic of its own. Table reconstruction, including accounting for potential errors, isn't that trivial.

@abhyantrika

@akshowhini @rmporsch
I had a doubt regarding the annotations.
x1, x2, y1, y2 are with respect to what? The original PDF image, or the cropped table image?
Neither seems to match. I am assuming x1, x2, y1, y2 are xmin, xmax, ymin, and ymax.

@rmporsch
Contributor

rmporsch commented Jul 8, 2020

@abhyantrika I will say that this also confused me a bit. I believe it's also not mentioned in the paper:
You can see in

def transform_coord(chunks):
    # Get table width and height
    coords_x, coords_y = [], []
    for chunk in chunks:
        coords_x.append(chunk.x1)
        coords_x.append(chunk.x2)
        coords_y.append(chunk.y1)
        coords_y.append(chunk.y2)
    # table_width = max(coords_x) - min(coords_x)
    # table_height = max(coords_y) - min(coords_y)
    # Coordinate transformation for chunks
    table_min_x, table_max_y = min(coords_x), max(coords_y)
    chunks_new = []
    for chunk in chunks:
        x1 = chunk.x1 - table_min_x
        x2 = chunk.x2 - table_min_x
        y1 = table_max_y - chunk.y2
        y2 = table_max_y - chunk.y1
        chunk_new = Chunk(
            text=chunk.text,
            pos=(x1, x2, y1, y2),
        )
        chunks_new.append(chunk_new)
    # return table_width, table_height
    return chunks_new

the chunks are actually transformed. You will need to re-convert them to visualize the results.

Please note that the evaluation stats are published in the paper, @akshowhini. I guess this should also answer a number of your other questions.

@abhyantrika

@rmporsch
I saw that.
So basically there are chunks in the chunk folder, and once loaded they are transformed.
But I visualized both the table and the PDF (as images), and no coordinates (original or transformed) matched the boxes in either image. I am seriously confused.
Do you have any intuition on that?

@rmporsch
Contributor

rmporsch commented Jul 8, 2020

@abhyantrika
The following works fine for me:

        # table_min_x / table_max_y are the same offsets computed in
        # transform_coord (here stored during preprocessing).
        for chunk in chunks:
            copy_chunk = copy.deepcopy(chunk)
            chunk.x1 = copy_chunk.x1 + self.table_min_x
            chunk.x2 = copy_chunk.x2 + self.table_min_x
            chunk.y1 = table_max_y - copy_chunk.y2
            chunk.y2 = table_max_y - copy_chunk.y1
            chunk.pos = [chunk.x1, chunk.x2, chunk.y1, chunk.y2]
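
To actually draw the recovered boxes I do something like the following (a minimal sketch; it assumes the coordinates are PDF points with a bottom-left origin and that the page image was rendered at 1 pt = 1 px):

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

def draw_chunks(image_path, chunks):
    """Draw recovered chunk boxes on a page image.

    Assumes chunk.pos = (x1, x2, y1, y2) in PDF points with a bottom-left
    origin, and that the image was rendered at 1 pt = 1 px.
    """
    img = Image.open(image_path)
    height = img.height
    fig, ax = plt.subplots()
    ax.imshow(img)
    for chunk in chunks:
        x1, x2, y1, y2 = chunk.pos
        # Flip y: the image origin is top-left, the PDF origin is bottom-left.
        top = height - y2
        ax.add_patch(patches.Rectangle((x1, top), x2 - x1, y2 - y1,
                                       fill=False, edgecolor="red"))
    plt.show()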

@abhyantrika

@rmporsch
Is this a preprocessing function you apply in utils.py?
And I assume they refer to xmin, xmax, ymin, ymax of the table image (not the PDF image).

@rmporsch
Contributor

rmporsch commented Jul 8, 2020

This is a post-processing step I did to visualize the prediction results when I took a look at the paper. table_min_x etc. refer to the same ones as in data/utils.py.

@abhyantrika

abhyantrika commented Jul 8, 2020

@rmporsch
Okay.
Here is my code for the relation predictions:

for d in dataset:
    outputs = model(d.nodes, d.edges, d.adj, d.incidence)
    output_rel = outputs.max(dim=1)[1]

Now, does your post-processing need to be applied to d.chunks?

@abhyantrika

@rmporsch @akshowhini
Sorry for the silly questions.
Also, what exactly do the coordinates in the .chunk files represent?
I tried visualizing them for both the PDF and the table image; the coordinates do not seem to match any boxes.

@rmporsch
Contributor

@abhyantrika Yes, since the chunks are changed during preprocessing (in the data loader), you will need to change them back.

@kbrajwani

@rmporsch Hey, can you provide steps or an example notebook for training and running inference with the model?

@rmporsch
Contributor

rmporsch commented Oct 30, 2020 via email

@kbrajwani

@rmporsch @akshowhini @abhyantrika Can you help me run inference with the trained model? I only have an image containing the table, so how can I create the other required files and get the table structure from the image?

@rmporsch
Contributor

For that you would need a text segmentation model and probably an OCR model first; this model requires at least the coordinates of the text.
Have a look at other projects if you need something more end-to-end. If you really want to use this model, please aim to understand the paper first. This is not a production-ready repo but a research project.

@kbrajwani

Thanks for the reply. I am using the Tesseract model, which gives me text coordinates along with the transcriptions.
Yes, I have experience with CascadeTabNet, which gives me the table coordinates; then I do line detection to get the structure of the table.
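
Roughly, I plan to build the chunk file from the Tesseract output like this (just a sketch; the chunk JSON layout and the bottom-left y-axis convention are my assumptions from looking at the dataset files, so please correct me if that's wrong):

import json
import pytesseract
from PIL import Image

def image_to_chunks(image_path, out_path):
    """Sketch: turn Tesseract word boxes into a SciTSR-style chunk file.

    Assumptions (verify against a real .chunk file from the dataset):
      * the JSON looks like {"chunks": [{"pos": [x1, x2, y1, y2], "text": ...}]}
      * pos uses a bottom-left origin (PDF-style), so the image y is flipped
    """
    img = Image.open(image_path)
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    chunks = []
    for text, left, top, w, h in zip(data["text"], data["left"], data["top"],
                                     data["width"], data["height"]):
        if not text.strip():
            continue
        x1, x2 = left, left + w
        y1, y2 = img.height - (top + h), img.height - top  # flip to bottom-left origin
        chunks.append({"pos": [x1, x2, y1, y2], "text": text})
    with open(out_path, "w") as f:
        json.dump({"chunks": chunks}, f)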

@kbrajwani

@rmporsch As per your guidance I was reading the paper and I found the following.
The model takes a chunk file, which holds the transcriptions and coordinate boxes that we can generate from an OCR model, and a structure file, which holds the cell positions; that is the part I need help with.
Then the model finds the structure of the table.


@kbrajwani

@CZWin32768 @rmporsch Hey, I have a doubt: the structure file is essentially the complete table, right? So if we give it the complete table, what is the model actually doing?
Or does the model not take the structure into account while training, and predict from the chunk and rel files instead?

The rel file is just the relations (horizontal and vertical), plus the cells between them, right?

@kbrajwani

@rmporsch Hey, can you give me a few hints on how to go ahead? I am stuck.

@rmporsch
Contributor

rmporsch commented Dec 3, 2020

@kbrajwani The model predicts the edges (a column relation, a row relation, or no relationship) between each pair of nodes (the text boxes). Hence, during inference the model predicts, from the features present in the chunk file, the relationship between each pair of text chunks.
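
Very roughly, collecting the predictions could look like this (a sketch only; I'm assuming outputs has one score row per edge, in the same order as the edge list, and the integer-to-label order is something you need to check against the data loader):

def collect_relations(edge_pairs, outputs, label_names):
    """Turn per-edge scores into (src_chunk_idx, dst_chunk_idx, label) triples.

    edge_pairs: list of (src, dst) chunk indices, one per edge (assumed order)
    outputs: [num_edges, num_classes] score tensor from the model
    label_names: e.g. {0: "no relation", 1: "horizontal", 2: "vertical"};
                 the exact index order is an assumption, check the loader.
    """
    pred = outputs.max(dim=1)[1]  # predicted class index per edge
    relations = []
    for (src, dst), label_id in zip(edge_pairs, pred.tolist()):
        label = label_names[label_id]
        if label != "no relation":
            relations.append((src, dst, label))
    return relations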

@kbrajwani

@rmporsch Thanks for the support. I am studying this type of work, so if you know of another project similar to this one, could you please share it here? It would be appreciated.

@levanpon98

Hi,
How do I recover the relations predicted by the model into a CSV?
For example: if node_1 has 2 relations by column, then when exporting to CSV, node_1 is a merged cell.

@Darenar

Darenar commented Feb 4, 2021

Hi, everyone.
For those who also struggle with identifying the bounding boxes in the PDF, I think I've got a solution.
First of all, to read the PDF correctly through Python, you have to be sure that the size of the loaded PDF is exactly the same as the original one.
For example, I've been loading pages from the PDF as images using pdf2image's convert_from_path function. By default it uses dpi=200, which turns out to be incorrect. What you need to do is identify the correct size of the PDF, using for example PyPDF2 as in https://stackoverflow.com/questions/6230752/extracting-page-sizes-from-pdf-in-python , and then call convert_from_path with the size parameter set to the correct shape of the PDF.

Another vital point is that the coordinates are different from those you would apply to your PIL.Image object, for example. What you have to do is adjust your y1 and y2 by y1 = PAGE_HEIGHT - y2 and y2 = PAGE_HEIGHT - y1. This comes from the different y-axis conventions in Python images and in the PDF itself.
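
Putting it together, roughly (a sketch; note that PyPDF2's API names differ between versions, and I'm using the newer PdfReader/mediabox naming here):

from pdf2image import convert_from_path
from PyPDF2 import PdfReader

def load_page_at_pdf_size(pdf_path, page_index=0):
    """Render a PDF page as an image whose pixel size matches the PDF points."""
    page = PdfReader(pdf_path).pages[page_index]
    width = float(page.mediabox.width)
    height = float(page.mediabox.height)
    img = convert_from_path(pdf_path, size=(int(width), int(height)))[page_index]
    return img, height

# The chunk coordinates use a bottom-left origin, so flip y before drawing
# on the image: y1_img = page_height - y2_pdf, y2_img = page_height - y1_pdf.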

I hope this will be handy to someone, because I've struggled with it for almost the whole day!

@brabbit61

brabbit61 commented Mar 1, 2021

@Darenar Hey, that tip was useful, thanks for that! Could you also please shed some light on the inference script? I can't seem to find it.

@JBBalling

Has anyone done inference with the provided model? I am stuck loading the TableInferDataset: the script wants to load the relation files from relation_path, but at inference time there shouldn't be a relation file, because that is what we want to predict, isn't it?
https://github.com/Academic-Hammer/SciTSR/blob/master/scitsr/data/loader.py#L121

I have a structure.json and a chunks.json file matching the input of the training routine. What am I missing?
Any help is appreciated, thank you!

@abhyantrika @rmporsch @kbrajwani

@JBBalling

JBBalling commented Aug 4, 2022

I have managed to get this running and added an inference script to my fork. I also had to make minor changes to the source code in the TableInferDataset class. Here is the link to the repo. Have fun, and I hope it helps.

Where:
[0, 0, 1] -> vertical relation
[0, 1, 0] -> horizontal relation
[1, 0, 0] -> no relation

https://github.com/JBBalling/SciTSR
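
In code, the mapping above boils down to something like this (assuming you take the argmax over the three outputs per edge):

# argmax index -> relation type, following the one-hot targets above
REL_LABELS = {0: "no relation", 1: "horizontal relation", 2: "vertical relation"}

# outputs: the [num_edges, 3] score tensor the model returns for one table
pred_indices = outputs.max(dim=1)[1]
pred_labels = [REL_LABELS[i] for i in pred_indices.tolist()]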

@MathamPollard


@Darenar Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@akshowhini Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@abhyantrika Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@rmporsch Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@kbrajwani Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@brabbit61 Do you know how to run inference? In other words, how do I make predictions?

@MathamPollard


@Darenar Do you know how to run inference? In other words, how do I make predictions?
