File import is too slow! #418

Open · gister9000 opened this issue Oct 20, 2020 · 24 comments
@gister9000 commented Oct 20, 2020

What's the problem this feature will solve?
When I import a 140MB Nessus scan into faraday, the import process lasts more than 5 days (I canceled it after 5 days, so the real time is probably much longer). The test was done on a machine with 12GB of RAM and an AMD FX(tm)-8350 eight-core processor (eight 4.0 GHz cores!), so not a slow machine. The test was performed using the CLI.

Describe the solution you'd like
I'd like the import process to be optimized - up to 5 hours would be usable, I guess. As it stands, this is not usable!

Additional context
Check the Python 3 profilers to find out where the bottleneck is: https://docs.python.org/3/library/profile.html
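
For example, a minimal profiling sketch along these lines would show where the time goes (process_report() here is just a placeholder for whatever function actually performs the import, not faraday's real entry point):

# Minimal cProfile sketch -- process_report() is a placeholder for
# the real import entry point.
import cProfile
import pstats

def process_report(path):
    ...  # placeholder: call the actual import code here

profiler = cProfile.Profile()
profiler.enable()
process_report("large_scan.nessus")
profiler.disable()

pstats.Stats(profiler).sort_stats("cumulative").print_stats(30)  # top 30 by cumulative time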

P.S. The test file is not attached because it is highly confidential. If you have trouble reproducing this issue, I'm willing to create a non-confidential Nessus scan report for you guys - just ask.

GL HF

@aenima-x (Contributor) commented Oct 20, 2020

What do you mean by CLI?
We have a cli (https://github.com/infobyte/faraday-cli) and a client (https://github.com/infobyte/faraday-client).

Here you have a test with a 120MB Nessus report.

With the cli, it took 10.83 seconds of CPU time (about 36 seconds of wall-clock time):

(faraday-cli) ➜  Nessus (master) ✗ faraday-cli workspace
name    active    public    readonly      hosts    services    vulns
------  --------  --------  ----------  -------  ----------  -------
nessus  True      False     False             0           0        0
(faraday-cli) ➜  Nessus (master) ✗ time faraday-cli report ./sample_nessus.nessus
Sending data from [./sample_nessus.nessus] to workspace: nessus
faraday-cli report ./sample_nessus.nessus  10.83s user 1.03s system 33% cpu 35.848 total
(faraday-cli) ➜  Nessus (master) ✗ faraday-cli workspace
name    active    public    readonly      hosts    services    vulns
------  --------  --------  ----------  -------  ----------  -------
nessus  True      False     False             7          86      295
(faraday-cli) ➜  Nessus (master) ✗ ls -lh ./sample_nessus.nessus
-rw-r--r--  1 aenima  admin   120M Jun 16 12:41 ./sample_nessus.nessus
(faraday-cli) ➜  Nessus (master) ✗ faraday-cli workspace
name    active    public    readonly      hosts    services    vulns
------  --------  --------  ----------  -------  ----------  -------
nessus  True      False     False             7          86      295

With the web UI it took 55 seconds (including the upload of a 120MB file).

2020-10-20T14:00:40-0300 - faraday.server.api.modules.upload_reports - INFO {PoolThread-twisted.internet.reactor-6} [pid:34282] [upload_reports.py:44 - file_upload()]  Importing new plugin report in server...
2020-10-20T14:00:49-0300 - faraday.server.api.modules.upload_reports - INFO {PoolThread-twisted.internet.reactor-6} [pid:34282] [upload_reports.py:80 - file_upload()]  Get plugin for file: /Volumes/HDD/aenima/.faraday/uploaded_reports/L2C9D7V5OH48_sample_nessus.nessus
2020-10-20T14:00:49-0300 - faraday.server.api.modules.upload_reports - INFO {PoolThread-twisted.internet.reactor-6} [pid:34282] [upload_reports.py:86 - file_upload()]  Plugin for file: /Volumes/HDD/aenima/.faraday/uploaded_reports/L2C9D7V5OH48_sample_nessus.nessus Plugin: Nessus
2020-10-20T14:00:49-0300 - faraday.server.threads.reports_processor - INFO {ReportsManager-Thread} [pid:34282] [reports_processor.py:87 - run()]  Processing raw report /Volumes/HDD/aenima/.faraday/uploaded_reports/L2C9D7V5OH48_sample_nessus.nessus
2020-10-20T14:00:49-0300 - faraday.server.threads.reports_processor - INFO {ReportsManager-Thread} [pid:34282] [reports_processor.py:58 - process_report()]  Processing report [/Volumes/HDD/aenima/.faraday/uploaded_reports/L2C9D7V5OH48_sample_nessus.nessus] with plugin [Nessus
2020-10-20T14:01:14-0300 - faraday.server.threads.reports_processor - INFO {ReportsManager-Thread} [pid:34282] [reports_processor.py:38 - send_report_request()]  Send Report data to workspace [web]
2020-10-20T14:01:35-0300 - faraday.server.threads.reports_processor - INFO {ReportsManager-Thread} [pid:34282] [reports_processor.py:71 - process_report()]  Report processing finished

With the Client it took 32 seconds.

2020-10-20T14:02:47-0300 - faraday_client.managers.reports_managers - INFO {MainThread} [reports_managers.py:99 - sendReport()]  The file is /Volumes/HDD/aenima/Documents/Faraday/report-collection/faraday_plugins_tests/Nessus/sample_nessus.nessus, nessus
2020-10-20T14:02:47-0300 - faraday_client.plugins.controller - INFO {MainThread} [controller.py:256 - processReport()]  Processing report with plugin nessus
2020-10-20T14:03:19-0300 - faraday_client.plugins.controller - INFO {MainThread} [controller.py:139 - processOutput()]  Sent command duration 200

I don't know how many vulns, hosts and services are in your 140MB report file.

Maybe you are using an old version of the client, which uses an old API that is slower, or an old version of faraday which had a problem like this.

Can you tell us which version you are using?

@gister9000 (Author) commented Oct 21, 2020

Faraday version 3.12.

How do you have 120MB and only 300 vulnerabilities (about 400KB per vuln)?

I had over 100,000 vulnerabilities and about 750 hosts in the 140MB scan. My colleague did the test - he will probably continue this discussion and tell you exactly how many hosts, vulns and services there are.

@gister9000 (Author) commented Oct 21, 2020

When the import was canceled, there were 77,000 vulnerabilities loaded into faraday.
5 days = 7200 minutes, which works out to a little over 10 vulnerabilities per minute.

We are using a workaround now; maybe someone will find it useful: https://github.com/patriknordlen/nessusedit
nessusedit lets you easily strip all informational vulnerabilities from a scan (more than 99% of our vulnerabilities were informational). Our large scan boiled down to about 20k vulnerabilities, or 15MB.
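
If you'd rather not pull in another tool, the same idea is only a few lines of Python - a rough sketch, assuming the usual .nessus (v2) layout where informational findings are ReportItem elements with severity="0" (file names are placeholders):

# Rough sketch: drop severity-0 (informational) findings from a .nessus v2
# export before importing it into faraday.
import xml.etree.ElementTree as ET

tree = ET.parse("big_scan.nessus")
for host in tree.getroot().iter("ReportHost"):
    for item in list(host.findall("ReportItem")):
        if item.get("severity") == "0":
            host.remove(item)

tree.write("big_scan_no_info.nessus", encoding="utf-8", xml_declaration=True)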

@aenima-x (Contributor)

We have never had a client with 100K vulns in one scan.
For example, in our commercial version, the average corporate license is for 10K vulns in total.

I suggest you stick with that workaround.

@aenima-x (Contributor) commented Oct 22, 2020

And if it's possible (after changing the sensitive data), could you share the report with us?

@gister9000 (Author)

This scenario is indeed very rare, but it does happen that a client wants to scan up to ten /24 subnets. We will keep the workaround then - the Nessus informational findings mainly aid the attacking team and are not very useful in the report.

Unfortunately the report is very sensitive - we won't risk leaking a single detail about our client, which could be a disaster. I'll do my best to send you some smaller, non-sensitive Nessus scans (but still large enough to take some time).

In the end, I believe this issue is not significant enough to push us away from faraday, but it is a big minus. Maybe multi-threading the import process would make this 1000 times faster? At the very least, why not check what the bottleneck is?

@gister9000 (Author)

I guess nothing will be done here. The workaround is fine and the problem is extremely rare - closed.

@gister9000 (Author) commented Mar 2, 2021

Fact: alternative solutions are about 200 times faster at importing files. I won't name any other product since I don't want to market anyone, but the same file that took faraday 2 days to import took only 3 minutes on other solutions. Tested with two different big files, on three computers and two different operating systems.

Maybe you want to revisit this issue?

gister9000 reopened this Mar 2, 2021
@aenima-x (Contributor) commented Mar 2, 2021

@gister9000 I will try to generate a file similar to yours to see where the bottleneck is.
It would be great if you could send us the report you mentioned in October ("I'll do my best to send you some smaller unsensitive nessus scans (but still large enough to take some time)").

@gister9000 (Author) commented Mar 3, 2021

I am scanning some bug bounty targets and will send you the report afterwards. It will take time because I've throttled the scan speed so as not to cause problems on their side.

@gister9000 (Author)

Here's a 37MB Nessus file which takes a while to import. I see that you also tested with a 120MB Nessus file, but with a low number of vulns, and it took 10 seconds to import. We can conclude that the number of vulns is the problem rather than the file size. I suspect that you recalculate various statistics after every ~10 vulns imported (pie charts are generated, etc.) - maybe you should do this after importing 10% of the vulns, or even only after everything is imported; see the sketch below.
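
I don't know your internals, so purely as an illustration of what I mean by batching (hypothetical names, not faraday code; assumes something SQLAlchemy-like underneath):

# Illustration only, not faraday code: insert vulns in batches and recompute
# the statistics/charts once at the end instead of after every vulnerability.
BATCH_SIZE = 1000

def import_vulns(session, vulns, recompute_stats):
    batch = []
    for vuln in vulns:
        batch.append(vuln)
        if len(batch) >= BATCH_SIZE:
            session.bulk_save_objects(batch)  # one bulk insert per batch
            session.commit()
            batch.clear()
    if batch:
        session.bulk_save_objects(batch)
        session.commit()
    recompute_stats()  # once, instead of per imported vuln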

Good luck and please inform me if you make an improvement :)

Here's a scan with thousands of vulns
nessus_example.nessus.tar.gz

@aenima-x (Contributor) commented Mar 4, 2021

@gister9000 thanks

@gister9000 (Author) commented Mar 30, 2021

@aenima-x When you restart the faraday service, the upload is much faster.

I had the faraday service running for 5 days and tried to upload a 10k-vuln Nessus report. It took 20 hours. Then I restarted the faraday service with systemctl and repeated the process - it took less than 1 hour!

@aenima-x (Contributor) commented Mar 30, 2021

@gister9000 That's very strange. Did you import the same file into a different workspace?
Could you send me your log file?

@gister9000 (Author) commented Mar 30, 2021

@aenima-x
Yes, the same file into a different workspace (both were empty initially). The same file failed to import into one workspace a few times on my colleague's laptop (gateway timeout after 10+ hours). Then I created a new workspace and it succeeded in about 3 hours (18,533 vulns total, 851 hosts).

The faraday-server.log file contains workspace names which reveal client names, so I can't share it.

I went through it and all I saw was "EOFError: Ran out of input" in the finalize_request function, which happens so often that I don't believe it's a bug at all. Did you want a different log file?

@aenima-x (Contributor) commented Mar 30, 2021

@gister9000 That "EOFError: Ran out of input" is a known issue with a library we use.
It's not related to this.

Yes, I was talking about the faraday-server.log.
Can you run a sed over it to remove the client names, or something like that? See the sketch below.
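
For example, something along these lines would do it (a rough sketch; the names in CLIENT_NAMES are placeholders for the real ones):

# Rough sketch: redact client/workspace names from faraday-server.log
# before sharing it. CLIENT_NAMES is a placeholder list.
import re

CLIENT_NAMES = ["acme", "some_client"]  # placeholders

with open("faraday-server.log", encoding="utf-8", errors="replace") as f:
    log = f.read()

for i, name in enumerate(CLIENT_NAMES, start=1):
    log = re.sub(re.escape(name), f"client{i}", log, flags=re.IGNORECASE)

with open("faraday-server-redacted.log", "w", encoding="utf-8") as f:
    f.write(log)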

@aenima-x (Contributor)

@gister9000 Are you using the web UI to upload it?
It can't be the source of a timeout, because that API only uploads the file and the processing is done in the background.
It returns a 200 after the upload.

@gister9000 (Author) commented Mar 30, 2021 via email

@gister9000 (Author) commented Mar 30, 2021 via email

@aenima-x (Contributor) commented Mar 30, 2021

Failed attempts were done via the cli. The successful attempt was done via the GUI. There are no other errors besides the EOF one.

On Tue, Mar 30, 2021, 4:37 PM Nicolas Rebagliati wrote: "@gister9000 are you using the web ui to upload it? because it can't generate a timeout because that api only uploads the file and the processing is done in background. It returns a 200 after the upload"

OK, that makes sense.
The cli processes the file and sends the results to the faraday API, and that can generate a timeout.
Uploading through the web UI won't.

We will release a new version of the cli with a feature that may work around this, but it's not a final solution.
With this feature you can disable the vulns with info severity; Nessus generates lots and lots of info vulns that are not very useful.
Without them the payloads are a lot smaller.

@aenima-x (Contributor)

@gister9000 Forget the log; if the import was done with the cli, we won't see anything in it.

@gister9000 (Author)

@aenima-x We removed the informational vulns from the scan before importing it.

I find it weird that no computer resource is used at 100% during the import process. Maybe turning off live result rendering would help, since you are using Flask and template engines are anything but fast - if you are rendering a template after each vuln is imported, that seems to be the issue.

@gister9000 (Author)

@aenima-x What I said about no resource being used is wrong - the CPU is the bottleneck during the import process, which fits my theory that you need to stop live rendering while importing files (or render less often).
(screenshot attached: postgres_cpu)

@aenima-x (Contributor)

I think the live rendering will not be in the new frontend.
But that resource usage is the insertion into the database, not the live rendering.
