Merge lp://staging/~xfactor973/charms/trusty/ceph-mon/coordinated-upgrade into lp://staging/~openstack-charmers-archive/charms/trusty/ceph-mon/next

Proposed by Chris Holcombe
Status: Rejected
Rejected by: James Page
Proposed branch: lp://staging/~xfactor973/charms/trusty/ceph-mon/coordinated-upgrade
Merge into: lp://staging/~openstack-charmers-archive/charms/trusty/ceph-mon/next
Diff against target: 747 lines (+407/-43)
6 files modified
.bzrignore (+1/-0)
actions/__init__.py (+3/-0)
hooks/ceph.py (+150/-19)
hooks/ceph_hooks.py (+200/-12)
hooks/charmhelpers/contrib/network/ip.py (+15/-0)
hooks/charmhelpers/contrib/storage/linux/ceph.py (+38/-12)
To merge this branch: bzr merge lp://staging/~xfactor973/charms/trusty/ceph-mon/coordinated-upgrade
Reviewer: James Page - Needs Fixing
Reviewer: Chris MacNaughton - Pending
Review via email: mp+287375@code.staging.launchpad.net

Description of the change

This patch allows the ceph monitors to upgrade themselves one by one, using the ceph monitor cluster itself as the locking mechanism. There are most likely edge cases with this method that I haven't thought of, so consider this code lightly tested. It worked fine on EC2.
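
The rough shape of the locking approach is sketched below. This is a minimal illustration only, assuming hypothetical helper names (monitor_key_set, monitor_key_exists, roll_monitor_cluster, upgrade_fn) that wrap `ceph config-key`; they may not match the helpers in this branch.

    import socket
    import subprocess
    import time

    def monitor_key_set(key, value):
        # Illustrative wrapper around the monitors' key/value store.
        subprocess.check_call(['ceph', 'config-key', 'put', key, value])

    def monitor_key_exists(key):
        # 'ceph config-key get' exits non-zero when the key is absent.
        return subprocess.call(['ceph', 'config-key', 'get', key]) == 0

    def roll_monitor_cluster(new_version, upgrade_fn, previous_host=None):
        # Upgrade this monitor only after the one ahead of it has finished.
        # upgrade_fn performs the actual package upgrade and service restart.
        my_host = socket.gethostname()
        if previous_host is not None:
            # Wait for the previous monitor to publish its 'done' key.
            while not monitor_key_exists('%s_%s_done' % (previous_host, new_version)):
                time.sleep(30)
        monitor_key_set('%s_%s_start' % (my_host, new_version), 'True')
        upgrade_fn()
        monitor_key_set('%s_%s_done' % (my_host, new_version), 'True')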

Revision history for this message
Chris Holcombe (xfactor973) :
Revision history for this message
Chris Holcombe (xfactor973) wrote :

Note: I put up the helper functions for review in charmhelpers: https://code.launchpad.net/~xfactor973/charm-helpers/ceph-keystore/+merge/287205

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1558 ceph-mon-next for xfactor973 mp287375
    LINT FAIL: lint-test failed

LINT Results (max last 2 lines):
make: *** [lint] Error 1
ERROR:root:Make target returned non-zero.

Full lint test output: http://paste.ubuntu.com/15210769/
Build: http://10.245.162.36:8080/job/charm_lint_check/1558/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1307 ceph-mon-next for xfactor973 mp287375
    UNIT OK: passed

Build: http://10.245.162.36:8080/job/charm_unit_test/1307/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #553 ceph-mon-next for xfactor973 mp287375
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/15210927/
Build: http://10.245.162.36:8080/job/charm_amulet_test/553/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1561 ceph-mon-next for xfactor973 mp287375
    LINT FAIL: lint-test failed

LINT Results (max last 2 lines):
make: *** [lint] Error 1
ERROR:root:Make target returned non-zero.

Full lint test output: http://paste.ubuntu.com/15210963/
Build: http://10.245.162.36:8080/job/charm_lint_check/1561/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1310 ceph-mon-next for xfactor973 mp287375
    UNIT FAIL: unit-test failed

UNIT Results (max last 2 lines):
make: *** [test] Error 1
ERROR:root:Make target returned non-zero.

Full unit test output: http://paste.ubuntu.com/15210971/
Build: http://10.245.162.36:8080/job/charm_unit_test/1310/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #557 ceph-mon-next for xfactor973 mp287375
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/15211079/
Build: http://10.245.162.36:8080/job/charm_amulet_test/557/

Revision history for this message
James Page (james-page) wrote :

Ignoring the numerous test failures for now - they need resolving, but I wanted to have a first pass at the principles for upgrading.

Revision history for this message
James Page (james-page) wrote :

This proposal introduces the concept of using packages from ceph.com to drive the upgrade process; in principle I'm happy that we add support for use of packages from ceph.com, but let's do that under separate cover.

I think the initial target is to roll between Ceph versions in Ubuntu Cloud Archive pockets, in much the same way that we do for OpenStack.

This means initial support is required for trusty (as that has visibility of firefly, hammer, and infernalis/jewel), with support required for xenial at some point (when we get a new version of ceph in a future cloud archive pocket); here are the supported combinations:

trusty: firefly
trusty-juno: firefly (from 14.04)
trusty-kilo: hammer
trusty-liberty: hammer
trusty-mitaka: jewel

I think the trigger for an upgrade should be a change in the source configuration which requires that ceph be upgraded (a rough sketch of this trigger follows this comment).

review: Needs Fixing
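
As an illustration of that source-triggered approach, here is a rough sketch using the charm-helpers Config object's changed()/previous() helpers; roll_monitor_cluster and the pocket-to-release mapping are assumed from the earlier sketch, not taken from this branch.

    from charmhelpers.core.hookenv import config, log

    def check_for_upgrade():
        # config() returns a Config object that remembers the values seen
        # by the previous hook execution.
        c = config()
        if not c.changed('source'):
            return
        log('source changed from %s to %s; checking whether a ceph upgrade '
            'is required' % (c.previous('source'), c['source']))
        # Map both pockets to ceph releases and, if the new release is
        # newer, kick off roll_monitor_cluster() as sketched earlier.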
Revision history for this message
James Page (james-page) wrote :

Re your use of status_set to feed back messages to the end user - this is super useful, but please rework it to actually set the status in assess_status - right now the status_set calls you make in the upgrade functions will be overridden by the status_set calls from assess_status.

I'd suggest that you use the unitdata charm-helper to manage states for the upgrade, and teach assess_status how to interpret them.
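
A minimal sketch of that unitdata suggestion follows; the state key name and status messages below are made up for illustration and are not taken from the branch.

    from charmhelpers.core import unitdata
    from charmhelpers.core.hookenv import status_set

    def set_upgrade_state(state):
        # Record where the rolling upgrade currently is; flush() so the
        # value survives into later hook executions.
        db = unitdata.kv()
        db.set('ceph.upgrade-state', state)
        db.flush()

    def assess_status():
        db = unitdata.kv()
        state = db.get('ceph.upgrade-state')
        if state == 'waiting':
            status_set('maintenance', 'Waiting on previous monitor to upgrade')
        elif state == 'upgrading':
            status_set('maintenance', 'Upgrading ceph packages')
        else:
            status_set('active', 'Unit is ready and clustered')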

Revision history for this message
James Page (james-page) wrote :

Some inline comments; I'd also prefer that the general tidying and removal of unused functions that came with the move to ceph-mon be done in a different MP, please.

That work makes this one harder to review because of the general change noise it adds.

Revision history for this message
Chris Holcombe (xfactor973) wrote :

@jamespage - regarding the upstart-specific code, I did some digging and it looks like charmhelpers takes care of this for me: http://bazaar.launchpad.net/~charm-helpers/charm-helpers/devel/view/head:/charmhelpers/core/host.py#L120

Revision history for this message
Chris Holcombe (xfactor973) wrote :

Regarding the status_set bits: my upgrade function actually blocks until the upgrade is complete. That's why I didn't add the messages to assess_status.

Revision history for this message
James Page (james-page) wrote :

@xfactor973 - re the upstart-specific code - I was referring to the fact that ceph-mon-all is an upstart-specific job, i.e. no systemd equivalent exists.

Revision history for this message
James Page (james-page) wrote :

Moved to gerrit review.

Unmerged revisions

153. By Chris Holcombe

Hash the hostname instead of the IP address; that is more portable. Works now on both LXC and EC2.

152. By Chris Holcombe

Only deploy from cloud archives. Remove now unneeded key downloading function

151. By Chris Holcombe

Remove the unneeded charmhelpers bits

150. By Chris Holcombe

Add the deleted ceph.py functions back in. Will break them out into a separate MP

149. By Chris Holcombe

Merge upstream

148. By Chris Holcombe

Remove the coordinator.py. It is not needed

147. By Chris Holcombe

It rolls! This now upgrades and rolls the ceph monitor cluster one by one.

146. By Chris Holcombe

It upgrades and rolls the cluster now. Next I need to work out the edge cases

145. By Chris Holcombe

Adding charmhelpers coordinator and start work on upgrading the monitor cluster
