bug/Execution speed is very slow in AWS LAMBDA environment #2916
Comments
@cds-code can you describe how you are running
I'm running a Docker image in AWS Lambda.
Have you accounted for the spin-up (cold-start) time of the Lambda instance? For example, do you only start timing after receiving the first response? Also, can you provide some specific timings?
And how much memory is allocated to the Lambda instance?
I have the same problem in AWS Batch running on Fargate. I allocated 2 vCPUs and 4 GB of memory.
The timing does not include the Lambda instance cold-start time; it covers only partition(filename="XXXXX.pdf"). When the call is executed three times in a loop, only the first run takes very long.
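To put numbers on the "only the first run is slow" observation, a minimal timing sketch might look like the following. The helper names `time_repeated_calls` and `first_call_overhead` are hypothetical, not part of unstructured; the real call would be passed in as a zero-argument callable.

```python
import time
from typing import Callable, List


def time_repeated_calls(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fn `repeats` times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def first_call_overhead(durations: List[float]) -> float:
    """Estimate one-time setup cost as the first call minus the fastest later call."""
    if len(durations) < 2:
        raise ValueError("need at least two timed calls to compare")
    return durations[0] - min(durations[1:])


# Hypothetical usage inside the Lambda handler (assumes unstructured is installed):
#   from unstructured.partition.auto import partition
#   durations = time_repeated_calls(lambda: partition(filename="XXXXX.pdf"))
#   print(durations, first_call_overhead(durations))
```

A large gap between the first duration and the later ones would point at one-time initialization (imports, model downloads, cache building) rather than at the partitioning work itself.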
Also, just out of curiosity, can you give me a sense of the cold-start times you've seen?
We are able to run Unstructured in a container on AWS Lambda without issue (or, well, there are issues, but we can work around them.) Things to consider (sorry that these points are a bit............unstructured):
If you're already accounting for the ECR download/caching time, one other thing you can try is to run a "fake" partition script during the build of your container image. This will help "warm up" any libraries/dependencies which may want to run some initial first-time setup tasks (like building/caching fonts, or downloading models). For example, in the same way you "warm up" the NLTK libraries, you could add a RUN step:
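The original snippet did not survive on this page, so what follows is only a sketch of what such a "fake" partition script could look like, not the commenter's actual code. The file name `warm_up.py`, the `warm_up` helper, and the sample documents are assumptions; `partition` from `unstructured.partition.auto` is the library's general-purpose entry point.

```python
import sys
from typing import Callable, Sequence


def warm_up(partition_fn: Callable[..., object], sample_paths: Sequence[str]) -> int:
    """Partition each sample file once so first-time setup happens now.

    Returns the number of files processed. The results are discarded; the
    point is only to trigger lazy imports, downloads, and cache building.
    """
    processed = 0
    for path in sample_paths:
        partition_fn(filename=path)
        processed += 1
    return processed


# Only runs when sample files are given on the command line, for example
# from a Dockerfile step such as (hypothetical file names):
#   RUN python warm_up.py sample.pdf sample.docx
if __name__ == "__main__" and len(sys.argv) > 1:
    from unstructured.partition.auto import partition

    count = warm_up(partition, sys.argv[1:])
    print(f"warmed up with {count} sample file(s)")
```

Using one small sample per document type you expect at runtime is probably enough, since different file types can pull in different dependencies.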
But this will potentially exacerbate the first point about the container image size.
This works for me, thanks.
Does unstructured itself provide an initial-load method that implements the above?
Describe the bug
Execution is very slow in the AWS Lambda environment when extracting text from txt, pdf, docx, etc. files, but very fast in a local Windows environment.