
Allow passing next(error, job) to both cancel and update #67

Open
viridia opened this issue May 8, 2017 · 8 comments

Comments

@viridia
Contributor

viridia commented May 8, 2017

When processing a job, I sometimes want to both update the job and cancel it - such as setting a code indicating the reason for cancellation. The code for this tends to look like this:

job.cancelReason = Cancellation.USER;
job.update();
const error = new JobError('canceled');
error.cancelJob = true;
next(error);

However, presumably the 'next' method is going to mutate the job record. It would be convenient to be able to do both in a single database update.

My proposal would be that passing both parameters (error and job) to next() would accomplish this.
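In other words, the example above would collapse to something like this (a sketch of the proposed call, not the current API):

job.cancelReason = Cancellation.USER;
const error = new JobError('canceled');
error.cancelJob = true;
next(error, job); // one database write: persist the property change and cancel the job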

@grantcarthew
Owner

That's a simple one. I don't see why we couldn't do that. It would be a point release also.

@viridia
Contributor Author

viridia commented May 13, 2017

So it turns out this is getting more complex than I realized. I've got the following code pattern in about a dozen places in my code:

job.someProperty = foo;
job.update();
job.addLog({}, 'some log message');
next(Error('error'));

So this hits the database three times. And what I haven't shown is that we actually need to wait on the first two promises: both the property update and the log addition.

(An example of where this happens: if I get a job-failed event from the cluster, I want to record what actually happened.)
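With the waiting made explicit, the pattern is really this (a sketch, assuming the handler is an async function and that update() and addLog() both return promises):

job.someProperty = foo;
await job.update(); // first round trip
await job.addLog({}, 'some log message'); // second round trip
next(Error('error')); // third round trip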

A similar pattern also appears about a dozen times: the exact same thing, except outside the context of a Queue.process call. In that case, the code looks like this:

job.someProperty = foo;
job.update();
job.addLog({}, 'some log message');
Queue.cancelJob(job);

Again, hitting the database three times. (In fact, I'm hitting it four times, because if it's a sub-task change I also wake up the parent task).

I wonder if there's a way to make a bunch of changes to a job as a single transaction?

@viridia
Contributor Author

viridia commented May 13, 2017

Here's an alternative idea, which is a bit more radical:

Queue.process((job, control) => {
  control
    .update({ /* properties */ }) // set some properties
    .log.info('info message') // Log an info message
    .reschedule(1000); // Accepts either a date or a numeric delta
});

Queue.process((job, control) => {
  control
    .setStatus('failed')
    .log.error('error message')
    .cancel(); // You have to call reschedule, cancel, or fail as the last method.
});

Queue.getJobControl(id).then(control => {
  // Same interface but outside of process loop.
});

@grantcarthew
Owner

I'm not keen. This would mostly be to support your rather complex requirements.
I follow the KISS principle wherever possible.

You can drop one extra db connection though @viridia.

Rather than using the Job.addLog call, which connects to and updates the database, just push a new log onto the Job.log array.

If you look at this code, you will see the log objects are just simple objects. Make one in your code and then do a Job.log.push(newLogObject) before you call Job.update().
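Something like this (a sketch; the exact fields on the log object are an assumption here, so copy the shape from the code linked above):

const newLogObject = {
  date: new Date(),
  type: 'information',
  message: 'some log message'
};
job.log.push(newLogObject);
job.update(); // one round trip covers both the property change and the log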

I'm only keen to make the titled suggestion so far.

@viridia
Contributor Author

viridia commented May 15, 2017

I've decided to go in a different direction, so I'm happy to table this discussion.

@grantcarthew
Owner

Still with this queue? I wouldn't be surprised if you came up with a better solution.

@viridia
Contributor Author

viridia commented May 15, 2017

Well, as you pointed out, my needs are complex. And I didn't even mention some of the other issues.

For example: users are going to want to monitor the status of their rendering jobs in real time, which means publish/subscribe to the browser. I'm using deepstream.io for this, but one side effect is that an event published to a topic is visible to all subscribers, which means that you don't want two worker processes to publish the same event (because everyone would then get double updates).

That in turn means that worker processes have to be careful about what events they listen to and what messages they broadcast. So listening to Node events on the queue is out, since every worker is going to get a copy of the same event from rethinkdb. Instead, I have the individual workers push updates to the appropriate subscription topic at the point of mutation.
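The point-of-mutation publish looks roughly like this (a sketch using the deepstream.io client; the topic name and payload shape are made up for illustration):

const deepstream = require('deepstream.io-client-js');
const ds = deepstream('localhost:6020').login();

// Only the worker that performed the mutation publishes,
// so subscribers see exactly one update per change.
job.someProperty = foo;
job.update().then(() => {
  ds.event.emit('job-status/' + job.id, { id: job.id, status: job.status });
});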

Also, for my case it makes sense for logs to be in a separate table, since I want to have a separate pub/sub stream for the logs so that users can watch them in real time.

Plus I'm having fun playing around with async / await :)

@grantcarthew
Owner

So are you going to roll your own queue?
It took me six months to write rethinkdb-job-queue (not full time). Trying to duplicate that would be a big job; however, in your case it could be better.
