Merge lp://staging/~caio1982/capomastro/celery-fixes into lp://staging/capomastro

Proposed by Caio Begotti
Status: Merged
Approved by: Daniel Manrique
Approved revision: 190
Merged at revision: 188
Proposed branch: lp://staging/~caio1982/capomastro/celery-fixes
Merge into: lp://staging/capomastro
Diff against target: 30 lines (+8/-1)
2 files modified
capomastro/settings.py (+7/-0)
debian/capomastro.init (+1/-1)
To merge this branch: bzr merge lp://staging/~caio1982/capomastro/celery-fixes
Reviewer Review Type Date Requested Status
Daniel Manrique (community) Approve
Review via email: mp+257597@code.staging.launchpad.net

Commit message

Update configuration to suppress spurious/annoying messages in celery log.

The serializer one is a deprecation warning for the next version of Celery.

The other one was actually a bug. Reading the documentation of the version we use it seems the beat process cannot be started multiple times. Setting concurrency to 1 works around that problem.

Description of the change

These fix some annoying messages I was seeing in the Celery log of Capomastro.

The serializer one is a deprecation warning for the next version of Celery.

The other one was actually a bug. Reading the documentation of the version we use it seems the beat process cannot be started multiple times. Since by default Celery spawns as many process as the number of CPUs you have, we were having like four beats around. Some of them suddenly started to die and left Python defuncts all over.

You could see this nonsense in the log, multiple times:
[2015-04-27 18:03:40,669: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 111] Connection refused.

Though it didn't bother Capomastro, eventually those defuncts would be still there and Celery would never shut down cleanly.

It's possible to have both beat and the worker together, as long we use a max concurrency value, which this MR changes to 1 just like Hexr does (and if a beast like Hexr is happy with that I suppose we will be too!).

I have been running multiple immediate and scheduled builds on staging for the whole day with that change and it seems all beautiful and good smelling; totally the opposite of the previous command line, which would leave defuncts behind after minutes.

To post a comment you must log in.
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Hey. Three observations:

- Use a more readable commit message pattern with the summary separate from the description. This is standard practice and it helps for readability and changelog maintenance. Use proper capitalization for sentences. Use sentences instead of a long chain of commas. This might seem picky but looking at the commit messages just makes them look sloppy to me.
- Using database as as scheduler is a hack IMHO. It's a busy loop (with a sleep inside) that just keeps poking the database. I suspect using rabbit is a standard practice in Canonical and using it would definitely cut the load on the database as it's purely event based.
- Using --concurrency=1 to fix issues also feels very much wrong. Defunct processes are just zombies. I suspect that using concurrency=1 just runs everything in the master process so you don't get zombies but this just hides the real problem.

Thanks
ZK

Revision history for this message
Caio Begotti (caio1982) wrote :

Awesome, Zygmunt! About the scheduler, we already use Rabbit anyway but it seemed the simplest way to go for such simple scenario, but I'll take a look at that today. As for the concurrency change, the Celery documentation says you can't spawn more than one beat otherwise you'll have unexpected runs of it (like I was seeing and Hexr also avoids with the same "hack" apparently). The only possible alternative I see is really having two separate initscripts, one for beat one for the worker, but I kind of feel it's not worth the trouble here. Comments?

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Hey.

I agree on celerybeat (it makes no sense to have more than one) I
wasn't sure if that's what the --concurrency option is doing though. I
assumed that it is for the number of tasks celery gets to run as it's
smart enough not to run multiple beats by itself. One case where that
can happen (correct me if I'm wrong, I've been using celery over a
year ago) is where you run the beat with the scheduler and task
processor in one go. I don't know if this is the case here.

As for the init scripts you should be happy to know they are already
written and are working exactly like that. Look at the packaging for
'celeryd'. I think it makes sense to have them as separate services
(better monitoring and logging).

On Tue, Apr 28, 2015 at 12:22 PM, Caio Begotti
<email address hidden> wrote:
> Awesome, Zygmunt! About the scheduler, we already use Rabbit anyway but it seemed the simplest way to go for such simple scenario, but I'll take a look at that today. As for the concurrency change, the Celery documentation says you can't spawn more than one beat otherwise you'll have unexpected runs of it (like I was seeing and Hexr also avoids with the same "hack" apparently). The only possible alternative I see is really having two separate initscripts, one for beat one for the worker, but I kind of feel it's not worth the trouble here. Comments?
> --
> https://code.launchpad.net/~caio1982/capomastro/celery-fixes/+merge/257597
> You are subscribed to branch lp:capomastro.

Revision history for this message
Caio Begotti (caio1982) wrote :

Yeah, it's exactly the case here. Without that parameter Celery spawns multiple beats and goes crazy.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Crazy :)

Can you rely on the packaged version of the startup scripts to avoid
this? It can be configured with the /etc/default/celeryd file.

On Tue, Apr 28, 2015 at 12:31 PM, Caio Begotti
<email address hidden> wrote:
> Yeah, it's exactly the case here. Without that parameter Celery spawns multiple beats and goes crazy.
> --
> https://code.launchpad.net/~caio1982/capomastro/celery-fixes/+merge/257597
> You are subscribed to branch lp:capomastro.

Revision history for this message
Caio Begotti (caio1982) wrote :

I could, though when I tried using it -- and I believe me, I really didn't want to write our own initscript -- it had problems stopping all the Celery processes cleanly :-/ I have filed a bug for that at https://bugs.launchpad.net/capomastro/+bug/1449493 though, in case someone wanna work on this bit since I can't myself right now.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I would encourage you to file a bug on the celeryd package in *debian*
so that the issue with stopping them can be tracked and fixed.

On Tue, Apr 28, 2015 at 12:46 PM, Caio Begotti
<email address hidden> wrote:
> I could, though when I tried using it -- and I believe me, I really didn't want to write our own initscript -- it had problems stopping all the Celery processes cleanly :-/ I have filed a bug for that at https://bugs.launchpad.net/capomastro/+bug/1449493 though, in case someone wanna work on this bit since I can't myself right now.
> --
> https://code.launchpad.net/~caio1982/capomastro/celery-fixes/+merge/257597
> You are subscribed to branch lp:capomastro.

Revision history for this message
Daniel Manrique (roadmr) wrote :

Just a comment. the DatabaseScheduler simply implements celery beat's concept of a scheduler, so I don't think it's really a hack, the only thing that changes is how it stores persistent data. For comparison, the default scheduler is called PersistentScheduler and it uses local files for storage, I can't imagine DatabaseScheduler would be inefficient over this.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Oh, sorry. I must have confused the database-polling scheduler
implementation. If that's the case then no complaint here.

On Tue, Apr 28, 2015 at 7:02 PM, Daniel Manrique
<email address hidden> wrote:
> Just a comment. the DatabaseScheduler simply implements celery beat's concept of a scheduler, so I don't think it's really a hack, the only thing that changes is how it stores persistent data. For comparison, the default scheduler is called PersistentScheduler and it uses local files for storage, I can't imagine DatabaseScheduler would be inefficient over this.
> --
> https://code.launchpad.net/~caio1982/capomastro/celery-fixes/+merge/257597
> You are subscribed to branch lp:capomastro.

Revision history for this message
Daniel Manrique (roadmr) wrote :

I agree that we should move toward using the packages when possible. Thanks for the idea, we'll look into what's keeping them from being usable and try to work to get that fixed.

For now, I think these changes are OK to update our working config to de-spam the logs.

I added a commit message per Zygmunt's specs (I agree completely with that), but since this merge was sent using bzr it's not too easy to fix the commit messages. We'll definitely try to be more dilligent with that, I can attest that it has made the checkbox changelog much more readable and organized.

Revision history for this message
Daniel Manrique (roadmr) wrote :

Er, +1 I mean.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches