Ubuntu system image

Code review comment for lp://staging/~barry/ubuntu-system-image/lp1383539

lp1383539
Merge into client

Revision history for this message

Barry Warsaw (barry) wrote on 2014-10-23:

Thanks for the review!

On Oct 23, 2014, at 04:45 PM, Loïc Minier wrote:

>1/ switching channels -- this might fail if the phased updates score doesn't
>permit upgrading
>
>2/ forcing a specific revision in the current channel -- again, might fail if
>score doesn't allow
>
>Three ways to cope with 1/ and 2/:
>a) change the whole upgrade selection algorithm
>b) compute two upgrade pathes, one with just the 100% builds and one with all
>builds; prefer the latter one if possible
>c) add a --ignore-phased-updates flag to force processing if phased updates
>check fails

I thought about something like c) originally, but then opted to add
-m/--machine-id to set a specific machine-id override. -m is problematic
though because it will require trial-and-error to find a machine-id that would
cause your device to choose a specific phased percentage.

Adding a `force` flag would eliminate the need to override the machine-id.
The question is whether `force` should also allow you to install a "pulled"
image, i.e. one that has a 0% phased-percentage. That could be a problem if
an image is known to be bad.

A thought: if the `force` flag (however its spelled) is a counter rather than
a boolean, a single --force would ignore all but a 0% image. Give two
--forces (e.g. -ff) would then force the install of even a 0% image.

Definitely a) and b) are too invasive of a change for rtm. c) could be done
here, but I'd probably also then remove -m/--machine-id as it would really not
be necessary.

>3/ I'm getting a phone coming with a factory version 10 from a store; stable
>channel's latest build has version 20 with a phased update score of 100%, and
>candidate build with version 21 with a phased update scores of 1%, I am very
>unlikely to get the latest and greatest stable update... until version 21
>reaches 100% -- bad IMO

Well, presumably you'd get the update somewhere long before it reaches 100%,
since it's just as unlikely your device is 99% as it is 1%. :) Of course,
that might or might not be a good thing for you, if version 21 is bad, but
maybe marginally better if 21 is less bad than 20.

We did talk about this, although not specifically in the context of a
fresh-from-factory update. The question then is whether we should fall back
to installing a less-than-highest update that is at 100% if there is a highest
update that you wouldn't normally get because of your device's percentage. We
thought that no, you shouldn't. I think it wouldn't be long before, say 21's
percentage was raised high enough for you to get it.

The solution would be to include the phase percentage in the "score" that each
candidate path gets when calculating a winning upgrade path. That's not
something I want to change this close to rtm, but it's something we can look
at after rtm.

Tracked in LP: #1384854.

>4/ Using build id in phased update computations seems bad.
>The build id is used to compute device's phased score, which means that if we
>publish multiple images in quick succession (typical if we are trying to
>address issues) with 10% phased score, a *different set of 10% of devices* is
>going to get each image, resulting in a larger part of our user base getting
>on board for the latest images; that is, the number of publications impacts
>the number of phased testers. Also, having multiple builds with different
>phased updates scores would have no meaning.
>Instead, consider we dont use the same build id. Then the same users get new
>images first (function of machine id and channel only) every release, but we
>can publish as many of them as we want in quick successions or even have
>multiple images with different percentages in the pipe, and that still makes
>sense.

I realize that there's an implication here that if you don't include some
varying bit of information, you won't have a good distribution of guinea pig
devices.

Meaning, if your device-id + channel gives your device a phased percentage of
13%, you'll always be a guinea pig for any image with a percentage of 13% or
greater, at least on the same channel. That's also probably not a good thing.

You could add a 'subchannel' key in the images, which if present would be used
as input to the random number seed. Thus if you wanted to publish a set of
images that were logically connected (i.e. 21 improves 20, which improves 19,
but none are known to be 100%), then at least a device that gets N% for images
21, 20, and 19 will get the same N% for all of them, but it would get a
different percentage for say image 22 which had a different (or no) subchannel
key.

Maybe you could also generate a varying bit of information every time you
actually fulfill an upgrade. Maybe even time.time(). Thus if you are 13% for
image 19 but don't upgrade to 19, then you'll also be 13% for 20, and if you
don't upgrade to 20, you'll also be 13% for 21.

Now let's say image 19 is 5%, image 20 is 10%, and image 21 is 15%. You won't
upgrade to 19 or 20, but you would upgrade to 21. At that point, we'd
generate a new local varying bit of information, and thus your "guinea pig
score" (i.e. the device's phased percentage) would be different for any image
after 21.

>> === modified file 'systemimage/config.py'
>> --- systemimage/config.py 2014-09-16 20:06:56 +0000
>> +++ systemimage/config.py 2014-10-23 14:38:04 +0000
>> @@ -33,6 +33,7 @@
>> as_loglevel, as_object, as_timedelta, makedirs, temporary_directory)
>>
>>
>> +UNIQUE_MACHINE_ID_FILE = '/var/lib/dbus/machine-id'
>
>Strikes me that we're using the dbus machine id instead of whoopsie's id --
>or rather some id derived of it -- which is more stable (derived from
>hardware, e.g. MAC address, serial number etc.). Might be a good idea to move
>to that.
>
>NB: confirmed that on current rtm images, we are using different dbus machine
>ids.

Do you have any feedback on why the whoopsie id would be better than the
machine id? Is it just consistency with other tools? Is it possible that the
whoopsie id was missing, in which case we'd probably need to fall back to the
machine id.

Tracked in LP: #1384859

>> + def test_phased_percentage_different_target(self):
>> + # All else being equal, a different target gies different %.
>
>FYI s/gies/gives/ -- no need to fix here, just in case you care to fix this
>in a future branch

Since this hasn't landed yet... fixed, thanks!

Thanks for the review!

On Oct 23, 2014, at 04:45 PM, Loïc Minier wrote:

I thought about something like c) originally, but then opted to add
-m/--machine-id to set a specific machine-id override.  -m is problematic
though because it will require trial-and-error to find a machine-id that would
cause your device to choose a specific phased percentage.

Adding a `force` flag would eliminate the need to override the machine-id.
The question is whether `force` should also allow you to install a "pulled"
image, i.e. one that has a 0% phased-percentage.  That could be a problem if
an image is known to be bad.

A thought: if the `force` flag (however its spelled) is a counter rather than
a boolean, a single --force would ignore all but a 0% image.  Give two
--forces (e.g. -ff) would then force the install of even a 0% image.

Definitely a) and b) are too invasive of a change for rtm.  c) could be done
here, but I'd probably also then remove -m/--machine-id as it would really not
be necessary.

Well, presumably you'd get the update somewhere long before it reaches 100%,
since it's just as unlikely your device is 99% as it is 1%. :)  Of course,
that might or might not be a good thing for you, if version 21 is bad, but
maybe marginally better if 21 is less bad than 20.

We did talk about this, although not specifically in the context of a
fresh-from-factory update.  The question then is whether we should fall back
to installing a less-than-highest update that is at 100% if there is a highest
update that you wouldn't normally get because of your device's percentage.  We
thought that no, you shouldn't.  I think it wouldn't be long before, say 21's
percentage was raised high enough for you to get it.

The solution would be to include the phase percentage in the "score" that each
candidate path gets when calculating a winning upgrade path.  That's not
something I want to change this close to rtm, but it's something we can look
at after rtm.

Tracked in LP: #1384854.

I realize that there's an implication here that if you don't include some
varying bit of information, you won't have a good distribution of guinea pig
devices.

You could add a 'subchannel' key in the images, which if present would be used
as input to the random number seed.  Thus if you wanted to publish a set of
images that were logically connected (i.e. 21 improves 20, which improves 19,
but none are known to be 100%), then at least a device that gets N% for images
21, 20, and 19 will get the same N% for all of them, but it would get a
different percentage for say image 22 which had a different (or no) subchannel
key.

Maybe you could also generate a varying bit of information every time you
actually fulfill an upgrade.  Maybe even time.time().  Thus if you are 13% for
image 19 but don't upgrade to 19, then you'll also be 13% for 20, and if you
don't upgrade to 20, you'll also be 13% for 21.

Now let's say image 19 is 5%, image 20 is 10%, and image 21 is 15%.  You won't
upgrade to 19 or 20, but you would upgrade to 21.  At that point, we'd
generate a new local varying bit of information, and thus your "guinea pig
score" (i.e. the device's phased percentage) would be different for any image
after 21.

>> === modified file 'systemimage/config.py'
>> --- systemimage/config.py	2014-09-16 20:06:56 +0000
>> +++ systemimage/config.py	2014-10-23 14:38:04 +0000
>> @@ -33,6 +33,7 @@
>>      as_loglevel, as_object, as_timedelta, makedirs, temporary_directory)
>>  
>>  
>> +UNIQUE_MACHINE_ID_FILE = '/var/lib/dbus/machine-id'
>
>Strikes me that we're using the dbus machine id instead of whoopsie's id --
>or rather some id derived of it -- which is more stable (derived from
>hardware, e.g. MAC address, serial number etc.). Might be a good idea to move
>to that.
>
>NB: confirmed that on current rtm images, we are using different dbus machine
>ids.

Do you have any feedback on why the whoopsie id would be better than the
machine id?  Is it just consistency with other tools?  Is it possible that the
whoopsie id was missing, in which case we'd probably need to fall back to the
machine id.

Tracked in LP: #1384859

>> +    def test_phased_percentage_different_target(self):
>> +        # All else being equal, a different target gies different %.
>
>FYI s/gies/gives/ -- no need to fix here, just in case you care to fix this
>in a future branch

Since this hasn't landed yet... fixed, thanks!

« Back to merge proposal