Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronous flush on crash #289

Open
CloudBob opened this issue May 22, 2014 · 10 comments
Open

Synchronous flush on crash #289

CloudBob opened this issue May 22, 2014 · 10 comments

Comments

@CloudBob
Copy link

When remotely debugging a crash (especially on iOS), it is a requirement to trace right up to the point of the crash. The user sends the trace files and the crash report.

Performance is also important, so asynchronous mode must be selected.

The problem is that trace data is left in the queue and not flushed when a crash occurs.

I am currently using the PlCrashReporter framework. That framework reliably drives a callback on every crash after the crash report has been written. I need to be able to make a CocoaLumberjack call in that crash callback to synchronously flush all of the trace.

The callback is driven on one thread, and all of the other threads are suspended, so the normal GCD background thread processing no longer occurs on the trace queue. The crash callback thread should synchronously flush the queue to the loggers.

Thanks, Bob

@CloudBob
Copy link
Author

CloudBob commented Jun 4, 2014

We are currently using CocoaLumberjack, and our next release really needs to implement this capability to synchronously flush all traces to disk while handling a crash. Here are the requirements:

  1. Performance is important, so CocoaLumberjack should be normally running in asynchronous mode using background threads to write traces to the targets.
  2. Some crash reporter (we use PLCrashReporter) will call an event method when a crash has occurred. It should be assumed that all of the app's threads (including CocoaLumberjack's background threads) are suspended when that method is called.
  3. The crash event callback method will call CocoaLumberjack's flushLog method.
  4. CocoaLumberjack.flushLog should flush all traces in memory to the targets using a best-effort approach.
  5. The flushLog method should be protected by try/catch.

Since we need this capability soon, we will start an implementation of our own. Here are two additional requirements for performance of this modification:

  1. Traces should be written to memory using the shortest CPU path length possible.
  2. At least in the case of a disk target, all of the trace records remaining in memory should be written to the disk using a single stream write if possible.

I'll let you know when our modification is complete. Perhaps you might be interested in integrating our changes.

Please feel free to add comments on the requirements above, or to add comments with your design suggestions.

Thanks, Bob

@rivera-ernesto
Copy link
Member

I think that from CocoaLumberjack side it should ensure that flushLog works properly.

As for comments you could open a pull request marking it as "in progress" so we don't merge it just yet but can discuss about the changes and code.


We must:

  • Ensure that flushLog properly flushes all queue messages so it can be invoked on crash by NSUncaughtExceptionHandler's.

It would be nice to:

  • Implement our own NSUncaughtExceptionHandler. It is however a delicate task and as there can only be one handler in most projects another one will be used anyway (i.e. Crashlytics).

@bpoplauschi
Copy link
Member

@CloudBob what do you think?

@dalemyers
Copy link

Did this ever go anywhere?

@bpoplauschi
Copy link
Member

@dalemyers unfortunately, not. But I do want to pick up and clean the issues, as the project is been silent for the past period.
Are you interested in contributing?

@dalemyers
Copy link

It's definitely not my area of expertise, so I wouldn't have much ability to do so. What I was curious about though was the custom uncaught exception handler. It's pretty well established at this point that crash handling libraries don't play nicely with multiple handlers. Those based on PLCrashReporter though (most of them I think?) have the ability to set callbacks. It seems to me that the best option for everyone is to not have a custom handler, and instead just have a force flush method. Something like this maybe: https://stackoverflow.com/a/35075346 (untested).

@dalemyers
Copy link

dalemyers commented Aug 7, 2018

As an update, we tried that fix with the callbacks and it's working perfectly for us. It probably won't work perfectly 100% of the time due to the nature of it, but it's good enough for our needs.

It would still be good to have a method to force a flush of what's there rather than needing to wait on the queue. That way we'd avoid some later messages too, and the whole process would be faster.

@bpoplauschi
Copy link
Member

@dalemyers thanks for getting back with this update. I think it will help other people interested in this problem. I will leave the ticket open till we find a fix

@stale
Copy link

stale bot commented Oct 10, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is still an issue, please make sure it is up to date and if so, add a comment that this is still an issue to keep it open. Thank you for your contributions.

@stale stale bot added the Stale label Oct 10, 2018
@stale stale bot removed the Stale label Oct 16, 2018
@ffried ffried removed the Important label Jan 17, 2020
@dalemyers
Copy link

Over 5 years later, I've worked at multiple other companies and jobs. I'm once again seeing this issue unfortunately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants