Merge lp://staging/~barry/ubuntu-system-image/lp1383539 into lp://staging/~registry/ubuntu-system-image/client

Proposed by Barry Warsaw
Status: Merged
Merge reported by: Barry Warsaw
Merged at revision: not available
Proposed branch: lp://staging/~barry/ubuntu-system-image/lp1383539
Merge into: lp://staging/~registry/ubuntu-system-image/client
Diff against target: 2144 lines (+861/-400)
24 files modified
.bzrignore (+1/-1)
NEWS.rst (+14/-0)
cli-manpage.rst (+7/-2)
systemimage/config.py (+15/-0)
systemimage/helpers.py (+11/-18)
systemimage/index.py (+1/-4)
systemimage/main.py (+12/-1)
systemimage/scores.py (+29/-13)
systemimage/state.py (+4/-1)
systemimage/testing/controller.py (+7/-5)
systemimage/testing/helpers.py (+10/-0)
systemimage/testing/service.py (+8/-0)
systemimage/tests/data/index_22.json (+2/-3)
systemimage/tests/data/index_26.json (+245/-0)
systemimage/tests/test_candidates.py (+16/-24)
systemimage/tests/test_config.py (+23/-0)
systemimage/tests/test_helpers.py (+51/-43)
systemimage/tests/test_index.py (+0/-42)
systemimage/tests/test_main.py (+180/-195)
systemimage/tests/test_scores.py (+83/-10)
systemimage/tests/test_state.py (+130/-36)
systemimage/tests/test_winner.py (+1/-1)
systemimage/version.txt (+1/-1)
tools/runme.sh (+10/-0)
To merge this branch: bzr merge lp://staging/~barry/ubuntu-system-image/lp1383539
Reviewer Review Type Date Requested Status
Steve Langasek Pending
Michael Vogt Pending
Registry Administrators Pending
Review via email: mp+239135@code.staging.launchpad.net
To post a comment you must log in.
291. By Barry Warsaw

100% coverage and remove an unused import.

292. By Barry Warsaw

Add -m/--machine-id to s-i-cli to allow for overriding the machine id.

Print the machine id in the --info output.

Print the phased update percentage in the --dry-run output.

293. By Barry Warsaw

Inconsequential tweaks.

294. By Barry Warsaw

NEWS.

295. By Barry Warsaw

Update the man page.

296. By Barry Warsaw

Update NEWS.

297. By Barry Warsaw

Clean up.

Revision history for this message
Loïc Minier (lool) wrote :

Code-wise, +1 on these changes.

First, 2 usability concerns, variations of the same issue:
system-image will compute which build should be the target of the update, and then fail to update if the phased updates don't allow that; this prevents:
1/ switching channels -- this might fail if the phased updates score doesn't permit upgrading

2/ forcing a specific revision in the current channel -- again, might fail if score doesn't allow

Three ways to cope with 1/ and 2/:
a) change the whole upgrade selection algorithm
b) compute two upgrade pathes, one with just the 100% builds and one with all builds; prefer the latter one if possible
c) add a --ignore-phased-updates flag to force processing if phased updates check fails

Second, 2 operational concerns:
3/ I'm getting a phone coming with a factory version 10 from a store; stable channel's latest build has version 20 with a phased update score of 100%, and candidate build with version 21 with a phased update scores of 1%, I am very unlikely to get the latest and greatest stable update... until version 21 reaches 100% -- bad IMO

4/ Using build id in phased update computations seems bad.
The build id is used to compute device's phased score, which means that if we publish multiple images in quick succession (typical if we are trying to address issues) with 10% phased score, a *different set of 10% of devices* is going to get each image, resulting in a larger part of our user base getting on board for the latest images; that is, the number of publications impacts the number of phased testers. Also, having multiple builds with different phased updates scores would have no meaning.
Instead, consider we dont use the same build id. Then the same users get new images first (function of machine id and channel only) every release, but we can publish as many of them as we want in quick successions or even have multiple images with different percentages in the pipe, and that still makes sense.

Revision history for this message
Loïc Minier (lool) wrote :
Download full text (3.7 KiB)

Sorry, forgot to run the testsuite; I installed python-tox, python3-nose and python3-psutil and get this when running tox:
py34 develop-inst-nodeps: /home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539
py34 runtests: PYTHONHASHSEED='612702447'
py34 runtests: commands[0] | python -m nose2 -v
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dbus/bus.py", line 175, in activate_name_owner
    return self.get_name_owner(bus_name)
  File "/usr/lib/python3/dist-packages/dbus/bus.py", line 361, in get_name_owner
    's', (bus_name,), **keywords)
  File "/usr/lib/python3/dist-packages/dbus/connection.py", line 651, in call_blocking
    message, timeout)
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NameHasNoOwner: Could not get owner of name 'com.canonical.applications.Downloader': no such name

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/nose2/__main__.py", line 12, in <module>
    discover()
  File "/usr/lib/python3/dist-packages/nose2/main.py", line 284, in discover
    return main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/nose2/main.py", line 98, in __init__
    super(PluggableTestProgram, self).__init__(**kw)
  File "/usr/lib/python3.4/unittest/main.py", line 93, in __init__
    self.runTests()
  File "/usr/lib/python3/dist-packages/nose2/main.py", line 260, in runTests
    self.result = runner.run(self.test)
  File "/usr/lib/python3/dist-packages/nose2/runner.py", line 45, in run
    self.session.hooks.startTestRun(event)
  File "/usr/lib/python3/dist-packages/nose2/events.py", line 224, in __call__
    result = getattr(plugin, self.method)(event)
  File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/helpers.py", line 240, in wrapper
    return function(*args, **kws)
  File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/nose.py", line 112, in startTestRun
    SystemImagePlugin.controller.start()
  File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/controller.py", line 242, in start
    self._start()
  File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/controller.py", line 234, in _start
    starter(self)
  File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/controller.py", line 82, in start_downloader
    service = bus.get_object('com.canonical.applications.Downloader', '/')
  File "/usr/lib/python3/dist-packages/dbus/bus.py", line 241, in get_object
    follow_name_owner_changes=follow_name_owner_changes)
  File "/usr/lib/python3/dist-packages/dbus/proxies.py", line 248, in __init__
    self._named_service = conn.activate_name_owner(bus_name)
  File "/usr/lib/python3/dist-packages/dbus/bus.py", line 180, in activate_name_owner
    self.start_service_by_name(bus_name)
  File "/usr/lib/python3/dist-packages/dbus/bus.py", line 278,...

Read more...

Revision history for this message
Barry Warsaw (barry) wrote :

On Oct 23, 2014, at 05:17 PM, Loïc Minier wrote:

>Sorry, forgot to run the testsuite; I installed python-tox, python3-nose and
>python3-psutil and get this when running tox:
[...]
> File "/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/systemimage/testing/controller.py", line 82, in start_downloader
> service = bus.get_object('com.canonical.applications.Downloader', '/')
> File "/usr/lib/python3/dist-packages/dbus/bus.py", line 241, in get_object
> follow_name_owner_changes=follow_name_owner_changes)
> File "/usr/lib/python3/dist-packages/dbus/proxies.py", line 248, in __init__
> self._named_service = conn.activate_name_owner(bus_name)
> File "/usr/lib/python3/dist-packages/dbus/bus.py", line 180, in activate_name_owner
> self.start_service_by_name(bus_name)
> File "/usr/lib/python3/dist-packages/dbus/bus.py", line 278, in start_service_by_name
> 'su', (bus_name, flags)))
> File "/usr/lib/python3/dist-packages/dbus/connection.py", line 651, in call_blocking
> message, timeout)
>dbus.exceptions.DBusException: org.freedesktop.DBus.Error.Spawn.ExecFailed: Failed to execute program com.canonical.applications.Downloader: Success
>ERROR: InvocationError: '/home/lool/bzr/launchpad/~barry/ubuntu-system-image/lp1383539/.tox/py34/bin/python -m nose2 -v'

Take a wild guess. ;)

Revision history for this message
Barry Warsaw (barry) wrote :
Download full text (6.2 KiB)

Thanks for the review!

On Oct 23, 2014, at 04:45 PM, Loïc Minier wrote:

>1/ switching channels -- this might fail if the phased updates score doesn't
>permit upgrading
>
>2/ forcing a specific revision in the current channel -- again, might fail if
>score doesn't allow
>
>Three ways to cope with 1/ and 2/:
>a) change the whole upgrade selection algorithm
>b) compute two upgrade pathes, one with just the 100% builds and one with all
>builds; prefer the latter one if possible
>c) add a --ignore-phased-updates flag to force processing if phased updates
>check fails

I thought about something like c) originally, but then opted to add
-m/--machine-id to set a specific machine-id override. -m is problematic
though because it will require trial-and-error to find a machine-id that would
cause your device to choose a specific phased percentage.

Adding a `force` flag would eliminate the need to override the machine-id.
The question is whether `force` should also allow you to install a "pulled"
image, i.e. one that has a 0% phased-percentage. That could be a problem if
an image is known to be bad.

A thought: if the `force` flag (however its spelled) is a counter rather than
a boolean, a single --force would ignore all but a 0% image. Give two
--forces (e.g. -ff) would then force the install of even a 0% image.

Definitely a) and b) are too invasive of a change for rtm. c) could be done
here, but I'd probably also then remove -m/--machine-id as it would really not
be necessary.

>3/ I'm getting a phone coming with a factory version 10 from a store; stable
>channel's latest build has version 20 with a phased update score of 100%, and
>candidate build with version 21 with a phased update scores of 1%, I am very
>unlikely to get the latest and greatest stable update... until version 21
>reaches 100% -- bad IMO

Well, presumably you'd get the update somewhere long before it reaches 100%,
since it's just as unlikely your device is 99% as it is 1%. :) Of course,
that might or might not be a good thing for you, if version 21 is bad, but
maybe marginally better if 21 is less bad than 20.

We did talk about this, although not specifically in the context of a
fresh-from-factory update. The question then is whether we should fall back
to installing a less-than-highest update that is at 100% if there is a highest
update that you wouldn't normally get because of your device's percentage. We
thought that no, you shouldn't. I think it wouldn't be long before, say 21's
percentage was raised high enough for you to get it.

The solution would be to include the phase percentage in the "score" that each
candidate path gets when calculating a winning upgrade path. That's not
something I want to change this close to rtm, but it's something we can look
at after rtm.

Tracked in LP: #1384854.

>4/ Using build id in phased update computations seems bad.
>The build id is used to compute device's phased score, which means that if we
>publish multiple images in quick succession (typical if we are trying to
>address issues) with 10% phased score, a *different set of 10% of devices* is
>going to get each image, resulting in a larger part of our user base getting
>on boar...

Read more...

298. By Barry Warsaw

Typo -found by lool during review

Revision history for this message
Barry Warsaw (barry) wrote :

So I think I'm going to add the -f/-ff (force option, spelling TBD) *and* an --override-phase-percentage (spelling TBD) to hard code the device's %. The latter will mostly be useful for testing, and will replace -m/--machine-id which really isn't very useful.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Thu, Oct 23, 2014 at 04:45:53PM -0000, Loïc Minier wrote:

> 4/ Using build id in phased update computations seems bad.

> The build id is used to compute device's phased score, which means that if
> we publish multiple images in quick succession (typical if we are trying
> to address issues) with 10% phased score, a *different set of 10% of
> devices* is going to get each image, resulting in a larger part of our
> user base getting on board for the latest images; that is, the number of
> publications impacts the number of phased testers. Also, having multiple
> builds with different phased updates scores would have no meaning.
> Instead, consider we dont use the same build id. Then the same users get
> new images first (function of machine id and channel only) every release,
> but we can publish as many of them as we want in quick successions or even
> have multiple images with different percentages in the pipe, and that
> still makes sense.

The purpose of having phased updates is to have a failsafe mechanism for
catching issues that were not identified in the pre-release QA process, so
that we can roll back. The expected model is:

 - image has been QAed
 - it's released to the stable channel, with a low phased percentage (say,
   10%)
 - if there are no new problems being reported, we ratchet up the phased
   percentage
 - if there *are* new problems being reported, we *immediately* set its
   phased percentage to zero, then as soon as possible afterwards push an
   update that rolls back to the previous version
 - we iterate on the QA process, improving the test plan to catch the
   regressions that were seen and re-releasing an image to the stable
   channel once the regressions are resolved.

So a user is expected to always either update to the image currently being
phased, or, in the case of a regression, to update to the rolled-back image
and from there to the next phased image. The time between the publication
of the rolled-back image and the next phased image should be great enough to
ensure that users always update to the rolled-back image first, rather than
being stranded on the earlier, withdrawn phased image.

Revision history for this message
Barry Warsaw (barry) wrote :

On Oct 23, 2014, at 07:21 PM, Steve Langasek wrote:

>So a user is expected to always either update to the image currently being
>phased, or, in the case of a regression, to update to the rolled-back image
>and from there to the next phased image. The time between the publication
>of the rolled-back image and the next phased image should be great enough to
>ensure that users always update to the rolled-back image first, rather than
>being stranded on the earlier, withdrawn phased image.

Thanks Steve. Then with that model in mind, we will stick with the current
use of target build-number in the phase calculation, at least for now.

Revision history for this message
Barry Warsaw (barry) wrote :

On further thought, we don't need both -p/--percentage and -f/--force since "-p 1" would be effectively equivalent to -f and "-p 0" would effectively be -ff.

Revision history for this message
Barry Warsaw (barry) wrote :

One other decision based on discussions: we'll implement the fallback plan for phased updates. This means that if you have image 11 at 100% and a new image 12 is published at 5%, and your device phases at 15% you will ignore image 12 but update to image 11.

299. By Barry Warsaw

Respond to comments:

* -m/--machine-id is removed
* -p/--percentage is added
* Check candidate paths until we find one that fits percentage.
* Add support for $SYSTEMIMAGE_DLSERVICE envar for testing convenience.
* Lots of test improvements.

300. By Barry Warsaw

Update NEWS.

301. By Barry Warsaw

Helpful tool.

302. By Barry Warsaw

$HOME.

303. By Barry Warsaw

A little bit of additional testing help.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches