Merge lp://staging/~jtv/launchpad/db-bug-766807 into lp://staging/launchpad/db-devel

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Jeroen T. Vermeulen
Approved revision: no longer in the source branch.
Merged at revision: 10493
Proposed branch: lp://staging/~jtv/launchpad/db-bug-766807
Merge into: lp://staging/launchpad/db-devel
Diff against target: 842 lines (+613/-28) (has conflicts)
11 files modified
database/schema/security.cfg (+4/-0)
lib/canonical/config/schema-lazr.conf (+10/-0)
lib/lp/archivepublisher/interfaces/createdistroseriesindexesjob.py (+26/-0)
lib/lp/archivepublisher/model/createdistroseriesindexesjob.py (+155/-0)
lib/lp/archivepublisher/tests/test_createdistroseriesindexesjob.py (+309/-0)
lib/lp/archivepublisher/zcml/configure.zcml (+12/-2)
lib/lp/services/job/runner.py (+39/-1)
lib/lp/soyuz/interfaces/distributionjob.py (+7/-0)
lib/lp/soyuz/scripts/initialise_distroseries.py (+13/-1)
lib/lp/soyuz/scripts/publishdistro.py (+15/-23)
lib/lp/soyuz/scripts/tests/test_initialise_distroseries.py (+23/-1)
Text conflict in database/schema/security.cfg
To merge this branch: bzr merge lp://staging/~jtv/launchpad/db-bug-766807
Reviewer Review Type Date Requested Status
Gavin Panella (community) code Approve
William Grant code* Approve
Graham Binns (community) Abstain
Review via email: mp+59027@code.staging.launchpad.net

Commit message

[r=allenap,wgrant][bug=766807] Auto-create new distroseries indexes.

Description of the change

= Summary =

When a new distroseries is created (and we plan to publish it), several things need to happen. One of them is an initial run of the publish-distro script for the main and, if present, the partner archives. This is steps 12 & 14 in https://wiki.ubuntu.com/NewReleaseCycleProcess

As part of our derivation work, we need to automate more of this process and generalize it beyond Ubuntu.

== Proposed fix ==

Use the job system. Set up a new DistributionJob-derived job type that creates the indexes.

The job is created in the initialise-from-parent.py script. If you look at the Ubuntu release process you'll note that this is in a section where publishing cron jobs for the distribution are disabled. Creation of the indexes also falls within this section. Once the indexes have been created, the release manager is notified so that they know that this job is no longer a blocker for re-enabling the publishing scripts.

For the Ubuntu case, the documented contacts are Lamont and Steve L. Steve is documented as the Ubuntu release manager, along with a very few others, so that makes this notification set a good fit to the existing process design.

We don't publish Debian releases, nor do we want to send notifications to its owners just because their releases show up in Launchpad. We have no publisher configuration in the database for Debian, which would break publish-distro. So in this branch I don't produce index-creation jobs for distributions that have no publisher config; I hope that is a compromise that will make everyone happy.

We may need either 1 or 2 publish-distro runs for a series. For the 2-run case, we could either generate a single job that performs both runs, or two separate jobs for the two respective runs. I opted for the former: humans are waiting for the news that this job is completed. Tracking successful completion of both gets much harder with two independent jobs.

== Pre-implementation notes ==

Without Julian available, we have to make do as best we can. Some of the details and caveats were discussed with wgrant, SteveK, and lifeless. Points that spring to mind:

 * As a matter of design, the publish_distro script should remain unaware of distroseries.

 * There's no particular reason to run publish-distro after creating the symlinks in the Ubuntu release process.

 * All pockets should be generated, though only for the new distroseries. By default publish_distro processes the entire distribution.

In future we'll clearly want to automate a lot more of the Ubuntu release process. For now, creating individual jobs for each step should still be manageable. In future we may be able to automate steps 5, 6, 9, 10, and 18 relatively easily.

== Implementation details ==

The new job type was originally called InitializeDistroSeriesIndexesJob, but this clashed with the more generically-named InitialiseDistroSeriesJob. Also note the difference in spelling. In the end I opted for a shorter, more distinct name. It wasn't worth creating a dedicated interface for this new job type.

You'll notice that I removed try_and_commit from the publisher code. This was a holdover from obsolete memory micro-management practices, but with those gone it's been stripped down to the point of uselessness. Ditching it made the code a bit easier to read.

The new job runner will have to run on the server that hosts the Ubuntu archives.

== Tests ==

{{{
./bin/test -vvc lp.archivepublisher.tests.test_createdistroseriesindexesjob
./bin/test -vvc lp.soyuz.scripts.tests.test_initialise_distroseries
}}}

== Demo and Q/A ==

The creation of the new jobs is controlled by a feature flag: archivepublisher.auto_create_indexes.enabled.

To test this feature on dogfood or similar:
 * Enable the feature flag.
 * Create a new distro release based on an existing release.
 * You may need to go through some of the steps in the Ubuntu release process.
 * Run initialise-from-parent.py on the new release.
 * Check that a DistributionJob of type 4 has been created for the new release.
 * Run the job runner: cronscripts/create-distroseries-indexes.py -vvv
 * This should run publish_distro and generate indexes in the distribution's dists root.

= Launchpad lint =

I left a few bits of lint in place: some unused variables because I don't want to damage the artistic integrity of the tests(*), and a long line because I think another concurrent branch already fixes it in a way that would cause conflicts.

(*) Meaning that whoever wrote them may have introduced the variables for legibility or consistency and I don't want to mess with what I do not understand (unless it's actually fun).

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/soyuz/scripts/initialise_distroseries.py
  lib/lp/soyuz/scripts/tests/test_initialise_distroseries.py
  lib/lp/soyuz/scripts/publishdistro.py
  lib/lp/archivepublisher/zcml/configure.zcml
  lib/lp/soyuz/interfaces/distributionjob.py
  lib/lp/archivepublisher/tests/test_createdistroseriesindexesjob.py
  lib/lp/archivepublisher/interfaces/createdistroseriesindexesjob.py
  lib/lp/archivepublisher/model/createdistroseriesindexesjob.py

./lib/lp/soyuz/scripts/tests/test_initialise_distroseries.py
     261: local variable 'ps' is assigned to but never used
     271: local variable 'ps' is assigned to but never used
     285: local variable 'test2' is assigned to but never used
     324: local variable 'test2' is assigned to but never used
./lib/lp/soyuz/scripts/publishdistro.py
     208: E501 line too long (80 characters)
     208: Line exceeds 78 characters.

Jeroen

To post a comment you must log in.
Revision history for this message
Graham Binns (gmb) wrote :

Hi Jeroen,

I took another look at this as promised but I still don't trust myself to give it a proper review; I don't have the context to be able to understand it without a lot of hand-holding.

I think it would be a good idea for you to arrange for someone else on your squad to review this; they'll likely have the necessary context (and also won't have just returned from a two-week break).

Apologies for punting,

Graham

review: Abstain
Revision history for this message
William Grant (wgrant) wrote :

Looks mostly good, but a few comments:

43 +[create_distroseries_indexes]
44 +# The database role that these jobs will run in.
45 +# datatype: string
46 +dbuser: archivepublisher

Can you change JobCronScript to take the dbuser directly, rather than continuing the proliferation of pointless config sections?

79 + :param distroseries: A newly created `DistroSeries` that needs
80 + its archive indexes created.

The series may have been created some time ago; it was merely recently initialised.

135 +def get_addresses_for(person):
136 + """Get list of contact email address(es) for `person`."""

How does this differ from lp.registry.model.person.get_recipients?

145 +class CreateDistroSeriesIndexesJob(DistributionJobDerived):
146 + """Job to create a distroseries's archive indexes.
147 +
148 + To do this it runs publish-distro on the distribution, in careful mode.
149 + """

I'd prefer clarification that it operates only on the primary and partner archives, and only in careful *indices* mode.

190 + return "initializing archive indexes for %s" % self.distroseries.title

DistroSeries.title is pretty much useless... even __str__ is a much better idea.

197 + def runPublishDistro(self, extra_args=None):
198 + """Invoke the publish-distro script to create indexes.

As discussed last night, ew ew ew die. But OK.

216 + def getMailRecipients(self):
217 + """List email addresses to notify of success or failure."""
218 + recipient = self.distroseries.driver or self.distribution.owner

Both have a driver and an owner. Why driver of one and owner of the other?

223 + subject = "Launchpad has created archive indexes for %s." % (
224 + self.distroseries.name)

Email subjects don't tend to have full stops, and this doesn't identify the distribution. I'd go more along the lines of "Archive indices created for Ubuntu Oneiric". This whole thing also skips all our message rationale stuff, and it all seems like a fairly dubious idea, but I guess it will have to do for now.

Also using displayname and title here... distribution.name + series.name is probably better.

304 +class TestHelpers(TestCaseWithFactory):
305 + """Test module's helpers."""

As above, this is all duplicated elsewhere. Please use the existing functions if possible.

601 + CREATEDISTROSERIESINDEXES = DBItem(4, """

That's bordering on unreadable... maybe underscores are a good idea?

review: Approve (code*)
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :
Download full text (4.4 KiB)

Thanks. Good stuff.

> 43 +[create_distroseries_indexes]
> 44 +# The database role that these jobs will run in.
> 45 +# datatype: string
> 46 +dbuser: archivepublisher
>
> Can you change JobCronScript to take the dbuser directly, rather than
> continuing the proliferation of pointless config sections?

This is how the job infrastructure likes us to do it now. Especially with the poor documentation we have, I'd rather stick to the pattern it expects than fight it.

> 79 + :param distroseries: A newly created `DistroSeries` that
> needs
> 80 + its archive indexes created.
>
> The series may have been created some time ago; it was merely recently
> initialised.

True. I was thinking "new" in the sense of "not having passed through this code yet," but I changed it as you suggest.

> 135 +def get_addresses_for(person):
> 136 + """Get list of contact email address(es) for `person`."""
>
> How does this differ from lp.registry.model.person.get_recipients?

Apparently it doesn't! Knowing that we had to have something like this I searched that file (and others) for the relevant words I could think of, but did not find get_recipients. That's a nice little weight loss program.

> 145 +class CreateDistroSeriesIndexesJob(DistributionJobDerived):
> 146 + """Job to create a distroseries's archive indexes.
> 147 +
> 148 + To do this it runs publish-distro on the distribution, in careful
> mode.
> 149 + """
>
> I'd prefer clarification that it operates only on the primary and partner
> archives, and only in careful *indices* mode.

Then the main thing that needs clarifying is the existence of a careful indices mode. I updated the help string for the option to say that, and reduced apt-ftparchive to the status of an example.

Changing the long option spelling seemed a bridge too far though. Another day.

> 190 + return "initializing archive indexes for %s" %
> self.distroseries.title
>
> DistroSeries.title is pretty much useless... even __str__ is a much better
> idea.

Okay: interpolating self.distroseries now.

> 197 + def runPublishDistro(self, extra_args=None):
> 198 + """Invoke the publish-distro script to create indexes.
>
> As discussed last night, ew ew ew die. But OK.

It's elsewhere as well. Sad, but there you go. We have a similar thing with LpQueryDistro, which is a LaunchpadScript but doesn't break the ugly pattern.

> 216 + def getMailRecipients(self):
> 217 + """List email addresses to notify of success or failure."""
> 218 + recipient = self.distroseries.driver or
> self.distribution.owner
>
> Both have a driver and an owner. Why driver of one and owner of the other?

The UI shows the distribution driver as Driver, but the distroseries driver as Release Manager.

I don't think a distroseries owner has much status in practice; moreover, if no release manager has been assigned for the series, I don't think it makes much sense to attach much credence to the series owner. The choice of distribution owner is more likely to be meaningful.

> 223 + subject = "Launchpad has...

Read more...

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

StevenK also had a look at this and suggested I use the new, generic job runner. I made that change, which eliminated the need to execute the script in a separate process as an end-to-end test, and that in turn led to all sorts of other changes in the test.

Since the job runner wasn't documented, I added some descriptions that I hope will be helpful.

Revision history for this message
Gavin Panella (allenap) wrote :

Looks good, and wgrant's review was very thorough, so r=me with some
small comments.

[1]

+ arguments = [
+ "-A",
+ "-d", self.distribution.name,
+ ]
+ for suite in self.getSuites():
+ arguments += ["-s", suite]

If there are long forms of these arguments, would you be happy to use
them so that it's easier to see what's happening?

[2]

+ message = dedent("""\
+ You are receiving this email because you are registered in
+ Launchpad as a release manager for %s.
+
+ The archive indexes for %s have been successfully created.

It matters very little, but will MailController reflow the paragraphs
in case the replacements are very long?

[3]

+ controller = MailController(
+ from_addr, self.getMailRecipients(), subject, message)
+ if controller is not None:
+ controller.send()

Is there any situation where controller will be None?

[4]

+ def destroySelf(self):
+ """See `IDistributionJob`."""
+ Store.of(self.context).remove(self.context)

Is this needed? It doesn't seem to be used (and isn't tested).

[5]

+def silence_publisher_logger():
+ """Silence the logger that `run_publisher` creates."""
+ getLogger("publish-distro").setLevel(FATAL)

Does the test harness return the logger to its previous level after
the test has run? If not, perhaps this could be changed into a
Fixture.

[6]

+ def makeJob(self, distroseries=None):
+ """Create an `CreateDistroSeriesIndexesJob`."""
+ if distroseries is None:
+ distroseries = self.factory.makeDistroSeries()
+ job = removeSecurityProxy(self.getJobSource().makeFor(distroseries))
+ return job
+
+ def getDistsRoot(self, distribution):
+ """Get distsroot directory for `distribution`."""
+ archive = removeSecurityProxy(distribution.main_archive)
+ pub_config = getPubConfig(archive)
+ return pub_config.distsroot

Why is removeSecurityProxy() needed here? The need to unwrap these
objects makes me think that their security proxies are not configured
correctly.

[7]

+ def becomeArchivePublisher(self):
+ """Become the archive publisher database user (and clean up later)."""
+ self.becomeDbUser(config.archivepublisher.dbuser)
+ self.addCleanup(self.becomeDbUser, 'launchpad')

I can't see that any other tests clean up after using becomeDbUser(),
so perhaps you don't need to either.

[8]

+ def test_getSuites_ignores_suites_for_other_distroseries(self):
+ # getSuites does not list suites in the distributio that do not

s/distributio/distribution/

[9]

+ def test_schedule_index_creation(self):
...
+ DistributionJobType.CREATEDISTROSERIESINDEXES,

Old spelling.

review: Approve (code)
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :
Download full text (4.5 KiB)

> Looks good, and wgrant's review was very thorough, so r=me with some
> small comments.

And yet, very observant ones. Thank you for that; it's one of the best ways to learn.

> + arguments = [
> + "-A",
> + "-d", self.distribution.name,
> + ]
> + for suite in self.getSuites():
> + arguments += ["-s", suite]
>
> If there are long forms of these arguments, would you be happy to use
> them so that it's easier to see what's happening?

Unfortunately I wouldn't. :( The -A one's more "meaningful" spelling is actually --careful-apt, which turns out to be a misnomer. It ought to be changed eventually. I could spell out "-d" in full ("--distribution") but that may seem incongruous.

In the longer term however we do intend to get rid of this ugliness altogether and call a proper python function rather than simulating a command-line invocation.

> + message = dedent("""\
> + You are receiving this email because you are registered in
> + Launchpad as a release manager for %s.
> +
> + The archive indexes for %s have been successfully created.
>
> It matters very little, but will MailController reflow the paragraphs
> in case the replacements are very long?

Probably not. A mail client might. This notification is a transitory solution; the real fix is to automate the full process so that the email is no longer needed.

> + controller = MailController(
> + from_addr, self.getMailRecipients(), subject, message)
> + if controller is not None:
> + controller.send()
>
> Is there any situation where controller will be None?

No, you're right. That check was from some previous state of the code, I think. Fixed.

> + def destroySelf(self):
> + """See `IDistributionJob`."""
> + Store.of(self.context).remove(self.context)
>
> Is this needed? It doesn't seem to be used (and isn't tested).

I couldn't find any use for it, but the interface specifies it (so I can't have my baseline tests for interface compliance without it) so I decided to test it properly. Also made the implementation clean up the underlying Job record as well.

> +def silence_publisher_logger():
> + """Silence the logger that `run_publisher` creates."""
> + getLogger("publish-distro").setLevel(FATAL)
>
> Does the test harness return the logger to its previous level after
> the test has run? If not, perhaps this could be changed into a
> Fixture.

I suspect that it won't. But in this particular case that might not be a bad thing: we'd stand to lose some test noise that we'd probably not bother to address otherwise.

> + def makeJob(self, distroseries=None):
> + """Create an `CreateDistroSeriesIndexesJob`."""
> + if distroseries is None:
> + distroseries = self.factory.makeDistroSeries()
> + job = removeSecurityProxy(self.getJobSource().makeFor(distroseries))
> + return job
> +
> + def getDistsRoot(self, distribution):
> + """Get distsroot directory for `distribution`."""
> + archive = removeSecurityProxy(distribution.main_archive)
> + pub_config = getPubConfi...

Read more...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to status/vote changes: