docxtemplater 4 roadmap #340

Open
edi9999 opened this issue Aug 31, 2017 · 21 comments

@edi9999
Member

edi9999 commented Aug 31, 2017

1. Remove setData(data) and resolveData()

It is now possible to do render(data) and renderAsync(data)

2. Multiple render calls

Make it possible to call render multiple times, each returning a different JSZip instance :

const zip1 = doc.render({first_name: 1});
const zip2 = doc.render({first_name: 2});

Currently, calling render multiple times is not allowed, and will result in an error since version 3.30.2

Ideally, it would be possible to call render several times with different data.

To do this, we need to cache all compiled parts (this should be done already).

We would also need to cache all xmlDocuments parts before the rendering.

We would also need to be able to revert all zip operations (for example, the image module calls this.zip.file(newImagePath, imageContent)).

As this is quite complex to do, I'm really not sure that it will be included in docxtemplater 4.
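
To make the snapshot idea above concrete, here is a minimal sketch. It is not docxtemplater code: `SnapshotTemplate`, the `Map` of path to XML string, and the simplistic `{tag}` replacement are all assumptions made for illustration. It only shows why keeping an untouched copy of the parts lets each render call start from a clean state.

```javascript
// Hypothetical stand-in for a document; this is NOT the docxtemplater API.
class SnapshotTemplate {
  // files: Map of path -> XML string (an assumed, simplified model of the zip)
  constructor(files) {
    this.pristine = new Map(files); // snapshot taken before any render
  }

  // Every call starts from the snapshot, so earlier renders (and any
  // files they added to their own output) cannot leak into later ones.
  render(data) {
    const working = new Map(this.pristine);
    for (const [path, xml] of this.pristine) {
      const rendered = xml.replace(/\{(\w+)\}/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
      );
      working.set(path, rendered);
    }
    return working;
  }
}

const template = new SnapshotTemplate(
  new Map([["word/document.xml", "<w:t>Hello {first_name}</w:t>"]])
);
const out1 = template.render({ first_name: 1 });
const out2 = template.render({ first_name: 2 });
console.log(out1.get("word/document.xml")); // <w:t>Hello 1</w:t>
console.log(out2.get("word/document.xml")); // <w:t>Hello 2</w:t>
```

A real implementation would also have to snapshot parsed XML documents and binary parts, which is where most of the complexity mentioned above comes from.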

3. Reorder zip files when creating it via render

const zip1 = new JSZip();
const files = doc.render().file(/./);
files.sort((function (a1, a2) {
    return a1.name > a2.name ? 1 : -1;
}))

files.forEach(function (file) {
    zip1.file(file.name, file._data, {createFolder: true})
})

const buffer = zip1.generate({type: "nodebuffer", compression: "DEFLATE"});
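
The sorting step in the snippet above can be tried in isolation. The sketch below uses plain objects with a `name` property as an assumed stand-in for zip entries, and `sortByName` is a hypothetical helper, not part of JSZip or docxtemplater. Unlike the inline comparator above, it returns 0 for equal names and leaves the input array untouched.

```javascript
// Plain objects standing in for zip entries (assumed shape: { name }).
function sortByName(entries) {
  // Copy first so the caller's array is not reordered in place.
  return [...entries].sort((a, b) => {
    if (a.name < b.name) return -1;
    if (a.name > b.name) return 1;
    return 0;
  });
}

const entries = [
  { name: "word/document.xml" },
  { name: "[Content_Types].xml" },
  { name: "_rels/.rels" },
];
console.log(sortByName(entries).map((entry) => entry.name));
// [ '[Content_Types].xml', '_rels/.rels', 'word/document.xml' ]
```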

4. Replace render by renderAsync

It returns a promise and allows the data to contain promises too. This has been done in 3.5.0 with resolveData.

5. Use another test runner

Jest / ava ? Finding a way to have tests run faster would be cool. First we would need to know for sure what takes most time, is it IO for reading the expected/actual docx, is it CPU for zipping/unzipping the docx ?

6. Make all modules optional

To make it possible to disable loops, rawxml. This makes it possible to have even smaller builds (for the browser). This is probably not very high priority, these modules are not that big and it would make it more complex for users.

7. Add an official inspect module

that makes it possible to debug the docx and provides some utility functions like getTags()

8. Make option : {linebreaks: true} the default

9. Make option {paragraphLoops: true} the default

10. Remove {tag: p} in the following call in postparse, or pass this same value in the scopeManager call.

```
try {
    this.parser(tag, { tag: p });
} catch (rootError) {
    errors.push(getScopeCompilationError({ tag, rootError }));
}
```

11. Remove .compile method

(since v4 constructor automatically compiles the doc).

12. Remove .attachModule method

and put it in the constructor of Docxtemplater (modules key). A question that needs to be solved with this approach is how to handle conditional modules that depend on the file type, which are currently handled like this :

	if (doc.fileType === "pptx") {
		doc.attachModule(new TableModule.GridPptx());
		doc.attachModule(new SlidesModule());
	}

=> This has been implemented in #501

13. Require the use of the pizzip module

(jszip fork intended to be sync-only)

14. Remove outdated methods

Remove the attachModule, loadZip, setOptions and compile methods, since all of this is now done within the v4 constructor.

15. Add proofstate module by default

https://docxtemplater.readthedocs.io/en/latest/faq.html#remove-proofstate-tag ? To think about.

16. Remove unused events for modules

For example, module.set({compiled: compiled}) is currently called before the compilation, so it is always equal to {}, which makes no sense.

17. Use <a:p> for rawTag instead of <p:sp>

see #622

18. Remove the internal property "resolveOffset"

of the scope manager, which is no longer used.

19. Remove the getTraits API

which is probably overkill because it seems to be used only for the "expandPair" feature.

20. Remove the getFullText

method which was just used as an internal utility function

21. Use the fixDocPrCorruption module by default :

Currently, one has to do :

const fixDocPrCorruption = require("docxtemplater/js/modules/fix-doc-pr-corruption.js");
const doc = new Docxtemplater(zip, { modules: [fixDocPrCorruption] });

@frederikbosch

I would skip renderAsync. Let render always return a promise. With the upcoming await syntax people can make it behave synchronously themselves.

const zip1 = await doc.render({first_name: 1});

@edi9999
Member Author

edi9999 commented Sep 8, 2017

Yes, the idea was to have two methods, render for synchronous render and renderAsync.

I'm not 100% convinced that it is good to have only async methods, because it hurts performance, especially on CPU intensive tasks (and docxtemplater is only CPU bound), because the javascript VM has to switch tasks very often and loses some optimizations.

See Stuk/jszip#281 for a big discussion about the advantages of keeping a sync function.

@bunnyvishal6

bunnyvishal6 commented Nov 4, 2017

Please consider getTags method in docxtemplater class.

@edi9999
Member Author

edi9999 commented Nov 4, 2017

I don't think I will be adding a method getTags to docxtemplater itself.

I would like to keep the core of docxtemplater as light as possible.

I think I could create an inspector / debugger module that would contain the logic to do inspectModule.getTags()

Same could be for all modules that are included in the core, like the loopmodule and rawxmlmodule

@bunnyvishal6

@edi9999 oh I got it.

@dashcraft

Interestingly enough, i was able to make a little plugin/service with angular 4 (updating to 5) that allowed me to generate multiple documents on the fly. I may create a ng-docxtemplater, if i have the time and it's alright with you.

@edi9999
Member Author

edi9999 commented Feb 10, 2018

It is now possible to get the tags with the builtin inspectModule :

http://docxtemplater.readthedocs.io/en/latest/faq.html#get-list-of-placeholders

@edi9999
Member Author

edi9999 commented Feb 10, 2018

cc @bunnyvishal6

@edi9999
Member Author

edi9999 commented Mar 11, 2018

It is now possible to resolve tags asynchronously : http://docxtemplater.readthedocs.io/en/latest/async.html

@edi9999 edi9999 mentioned this issue Jun 7, 2018
@alonrbar

alonrbar commented Jun 8, 2018

Hi,

First of all thanks for a very useful library!

I'm really looking forward to "8. Auto insert newlines when using \n in the input". Is there any chance it could happen sooner, in v3.* instead of v4 ?

I don't mind adding it myself if you can point me in the general direction. I have tried to add it myself but wasn't very successful in understanding where it should be done.

@edi9999
Member Author

edi9999 commented Jun 8, 2018

It is possible with the v3, but it is dirty :

See this comment :

#144 (comment)

**Edit:**

You now can do this :

const doc = new Docxtemplater(zip, {linebreaks: true});
doc.render({text: "My text,\nmultiline"});

https://docxtemplater.readthedocs.io/en/latest/configuration.html#linebreaks
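
For readers wondering what "inserting newlines" means at the XML level: in WordprocessingML a line break inside a run is a `<w:br/>` element between text elements. The sketch below illustrates that idea only. `escapeXml` and `toRunWithBreaks` are hypothetical helpers, and this is not a description of how docxtemplater implements the linebreaks option internally.

```javascript
// Escape the characters that are not allowed as-is in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn a string containing "\n" into one run whose text pieces are
// separated by <w:br/> elements (illustrative helper, assumed name).
function toRunWithBreaks(text) {
  const parts = text
    .split("\n")
    .map((line) => `<w:t xml:space="preserve">${escapeXml(line)}</w:t>`);
  return `<w:r>${parts.join("<w:br/>")}</w:r>`;
}

console.log(toRunWithBreaks("My text,\nmultiline"));
// <w:r><w:t xml:space="preserve">My text,</w:t><w:br/><w:t xml:space="preserve">multiline</w:t></w:r>
```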

@alonrbar

alonrbar commented Jun 8, 2018

Thanks.
I'll have to consider the pros and cons.
Any estimation on v4 release?

@edi9999
Member Author

edi9999 commented Jun 11, 2018

I would say probably during 2019, but it is not decided yet.

@edi9999 edi9999 mentioned this issue Oct 4, 2018
@manere

manere commented Nov 21, 2019

Please consider getTags method in docxtemplater class.

Just use something like var tags = String(docxInstance.getFullText()).match(/{[\w,.]{1,100}}/g)

Works like a charm

@edi9999
Member Author

edi9999 commented Nov 21, 2019

@manere , to get the list of tags it is recommended to use the following : https://docxtemplater.readthedocs.io/en/latest/faq.html#get-list-of-placeholders
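
To see what the regex approach from the previous comment does and does not catch, here is a small self-contained sketch. `extractSimpleTags` is a hypothetical helper working on a plain string (an assumed stand-in for the document's full text), not a docxtemplater API. Note that loop tags such as `{#users}` and `{/users}` are not matched by this pattern, which is one limitation of the regex shortcut.

```javascript
// Extract simple {tag} placeholders from a plain string.
// Same character class as the regex quoted above, with the braces escaped.
function extractSimpleTags(fullText) {
  const matches = fullText.match(/\{[\w,.]{1,100}\}/g) || [];
  // Strip the braces and drop duplicates, keeping first-seen order.
  return [...new Set(matches.map((match) => match.slice(1, -1)))];
}

const text =
  "Hello {first_name} {last_name}, {#users}{name}{/users} {first_name}";
console.log(extractSimpleTags(text));
// [ 'first_name', 'last_name', 'name' ]
```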

@henrihietala

Is it possible to remove complete slides from pptx using conditions? For example if I want to include certain slides for only specific group of people.

@edi9999
Member Author

edi9999 commented Nov 30, 2020

Yes, it is possible with the slides module, see https://docxtemplater.com/modules/slides/

The syntax {:users} means to duplicate a given slide for each element in an iterable.

It can also be used with boolean values to simply keep the slide or remove it.

@wcordelo

Are there updates on the docxtemplater 4 roadmap ?

@edi9999
Member Author

edi9999 commented Jan 23, 2023

Hello @wcordelo , there is currently no set date for this. Are you waiting for anything in particular in the next version ?

The major version bump is mostly meant to allow simplifying the API, and thus to make some breaking changes.

@wcordelo

wcordelo commented Feb 3, 2023

@edi9999 I'm wondering if there are limitations with using docxtemplater with cloud functions (AWS Lambda, GCP Cloud Functions, Azure Functions, etc.) regarding memory/CPU usage. The memory/CPU allocation can be increased for cloud functions, so I'd like to know if there are limitations we should be aware of (e.g. memory should be at least 256 MB). In addition, cloud functions usually run asynchronously, so I'd like to know if there are limitations that require adopting synchronous processes.

@edi9999
Member Author

edi9999 commented Feb 6, 2023

Hello @wcordelo , please create another issue next time, I forgot to respond here.

The memory usage depends on the size of the documents.

The rule of thumb would be : use twice the RAM of the size of the documents you process, plus a little extra.
So I would probably use a factor of 2.3.
So if your document is 40MB big, use 2.3 * 40 = 92MB of RAM at least, so 256MB should be plenty enough.
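
The rule of thumb above can be written down as a tiny helper. `estimateRamMb` is a hypothetical name, and the 2.3 factor is just the estimate given in this comment, not a measured or guaranteed bound.

```javascript
// Rough RAM estimate in MB for processing a document of the given size.
// The default factor (2.3) is the rule-of-thumb value from the comment above.
function estimateRamMb(documentSizeMb, factor = 2.3) {
  if (!(documentSizeMb >= 0)) {
    throw new RangeError("documentSizeMb must be a non-negative number");
  }
  return documentSizeMb * factor;
}

console.log(estimateRamMb(40).toFixed(0)); // "92"
```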

As for CPU usage, docxtemplater is mostly CPU bound so it should work well with a slow CPU but it will of course be slow, and the faster the CPU the faster the generation will be.

For asynchronous usage, docxtemplater is mostly CPU bound (first the unzipping process is mostly decoding, then strings are split, parsed, replaced, then concatenated), so it actually runs almost entirely synchronously.
The only part that can be made async is the resolving of the data : see here : https://docxtemplater.com/docs/async/
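
The split described above (asynchronous data resolution, then purely synchronous rendering) can be illustrated with a small sketch. `resolveDeep` and `renderSync` are hypothetical helpers written for this example and only handle plain objects, arrays and primitives. They are not the docxtemplater API, which is documented at the link above.

```javascript
// Recursively await every promise found in plain objects and arrays.
async function resolveDeep(value) {
  const resolved = await value;
  if (Array.isArray(resolved)) {
    return Promise.all(resolved.map(resolveDeep));
  }
  if (resolved !== null && typeof resolved === "object") {
    const entries = await Promise.all(
      Object.entries(resolved).map(async ([key, inner]) => [
        key,
        await resolveDeep(inner),
      ])
    );
    return Object.fromEntries(entries);
  }
  return resolved;
}

// CPU-bound, synchronous step: simplistic {tag} replacement (illustration only).
function renderSync(templateText, data) {
  return templateText.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}

// Only the data lookup is asynchronous; rendering happens in one sync pass.
resolveDeep({ user: Promise.resolve("John"), city: "Paris" }).then((data) => {
  console.log(renderSync("Hello {user} from {city}", data));
  // Hello John from Paris
});
```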

However, users of docxtemplater and of the paid versions are using AWS Lambda or Azure functions in production without any issue.
