Merge lp://staging/~yellow/launchpad/lxcsetup into lp://staging/launchpad

Proposed by Francesco Banconi
Status: Merged
Approved by: Francesco Banconi
Approved revision: no longer in the source branch.
Merged at revision: 14733
Proposed branch: lp://staging/~yellow/launchpad/lxcsetup
Merge into: lp://staging/launchpad
Diff against target: 823 lines (+819/-0)
1 file modified
utilities/setuplxc.py (+819/-0)
To merge this branch: bzr merge lp://staging/~yellow/launchpad/lxcsetup
Reviewer Review Type Date Requested Status
Benji York (community) code Approve
Graham Binns Pending
Review via email: mp+89660@code.staging.launchpad.net

Description of the change

= Summary =

This branch adds a script that can be used to set up a Launchpad environment
inside a LXC, useful for testing Launchpad.

== Proposed fix ==

https://dev.launchpad.net/ParallelTests describes how LXC containers offer
a cheap way to run tests in parallel using ephemeral instances, obtaining the
required isolation to workaround the existing globals (shared work dirs,
hardcoded tcp ports, etc.). The `setuplxc.py` script creates a Launchpad
environment from scratch, ready to be used by ephemerals (e.g. in a buildbot
slave context).

== Pre-implementation notes ==

During development we realized that the same approach (with few changes)
should work to set up a Launchpad development environment.
The developer can just:

- run the script (passing his current local username as user), e.g.::

    ./setuplxc.py -u username -e <email address hidden> -n 'Firstname Lastname'
    -c lp-devel -d /home/username/lp-deps /home/username/lp-branches/

- ssh into the container (in the example above: ssh lp-devel)
- cd ~/lp-branches/devel
- make schema
- make run
- start hacking

== Implementation details ==

utilities/setuplxc.py:
* the script is implemented in Python
* requires Python 2.7
* must be run as root
* help on required arguments: utilities/setuplxc.py -h

utilities/launchpad-database-setup
* refactored so that it can be run by root

== Tests ==

The script uses internal helper functions providing doctests. To run them::

     python -m doctest -v utilities/setuplxc.py

A more extensive and complete testing approach would require to setup
a precise KVM.

== Demo and Q/A ==

To demo and Q/A this change, do the following:

 * Install precise in a virtual machine (e.g. KVM)

 * Copy the script inside the virtual machine, e.g. for kvm::

    kvm -hda precise_HDA.img -boot c -m 2000 -redir tcp:2222::22 &
    scp -P 2222 /utilities/setuplxc.py root@localhost:/tmp/

 * Run the script, e.g.::

    ssh -p 2222 root@localhost
    cd /tmp/
    ./setuplxc [arguments]

== lint ==

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  utilities/setuplxc.py

To post a comment you must log in.
Revision history for this message
Benji York (benji) wrote :

Here's what I have this far:

lxc-create and lxc-start use the "-n" option to specify the instance
name, we might want to do the same.

It would be nice if the script used the current user info (login name)
if none are provided as arguments. Taking that a step further, bzr
whoami produces consistent, easily parsable results that could make
--email and --name optional as well.

...Later: oh! The user info is required because the script is expecting
to be run as root. I suppose the easiest thing would be to leave it
as-is, but it'd be really nice to not have to provide all of those
arguments. Perhaps the script can be re factored into two halves: the
first starts as the user in question, and the second part is run under
sudo (prompting the user for a password if nectary).

I'm a big fan of doctest, but it's LP policy to not introduce any new
(non-documentation) doctests, so the embedded doctests should be
translated into unittest-style tests.

Is it really necessary to skip the "sudo" (in launchpad-database-setup)
if already running as root? I would expect the sudo to be a noop in
that scenario. Similarly, "sudo -u postgres" should work equally well
for root as it does for non-root users.

We've mostly transitioned away from the %-style string construction and
switched to the new .format() string method so the script should avoid
constructing strings with %.

On line 210 of the diff: The summary line of a docstring should fit on a
single line.

The file_append function doesn't have to rewrite the entire file.
Instead the file can be opened in append mode and the new line appended.

Also, when appending a line to a file you should check to see if the
last line of the file doesn't have a trailing newline, otherwise you can
end up adding your string to the last line of the file instead of
appending an entirely new line.

Since this script requires Python 2.7, I would use the print function
instead of the print keyword (from __future__ import print_function).

On line 366 of the diff, the description of the --name argument leaves
off the "r" in "for" ("used fo bzr").

The parenthesis around the multi-line constructions of the "help"
arguments aren't necessary.

Requiring an SSH private key without passphrase gives me some security
shivers and may raise a few eyebrows.

Revision history for this message
Gary Poster (gary) wrote :

I'm a branch participant too, so I'll have a small reply. My thoughts, fwiw:

- very helpful review. I'm old-skool with %-based string interpolation, I guess. That came from me, I think. I'm fine with format.
- Given that this script is standalone, I thought that the doctests were fine when I saw them. Separating them from the script would be a shame IMO
- file_append wants to insert a line at the beginning IIRC, so rewriting makes sense. Maybe I don't RC.
- print function from future: eh. If Francesco wants IMO :-)
- requiring an SSH private key without passphrase is necessary for this to be doable mechanically, such as part of a juju install, AFAIK. If there's another approach, I'd be very interested in hearing of it.

Revision history for this message
Martin Pool (mbp) wrote :

I'm happy to see this, but it is a bit unfortunate how much code is
getting duplicated between ec2, rocketfuel-setup, and now this...
Perhaps it can later be split out to scripts that are used by all of
them.

Revision history for this message
Gary Poster (gary) wrote :

Martin, we do have a slack-time card to eliminate duplication between this and rocketfuel-setup.

That said, the problem with that task, and with similar work for ec2, is that IMO rocketfuel-setup and setuplxc should be standalone--someone should be able to get the file, without the Launchpad tree, and the script should be able to run. It doesn't have to be a file, but that's the easiest unit to work with for our use cases, particularly for setuplxc. I'd be tempted to have rocketfuel-setup depend on setuplxc rather than reverse, but that's painting a bikeshed.

An alternative approach would be to have a deb package that encompasses all three, so instead of "get this script and run it" you have "[install this PPA, ] install this package and run it." Our to-be-developed buildbot slave juju charm that wants to use setuplxc could have a configuration option to install PPAs and packages, in addition to or instead of accepting files to run. That sounds interesting to me, though it's out of scope for the current project. Could be a slack/20% time project though, maybe.

Revision history for this message
Benji York (benji) wrote :

Here's the rest of my review:

The places that have comments like "This requires Oneiric or later." or
"This script is run as root." could assert those preconditions and
generate errors if they are not met.

A question about file_insert and file_append: are the files being
modified long-lived? If we are using file_append on a file and the user
adds a new line to the end of the file then we will re-append another
copy of our line(s) to that file. Do those lines really have to be at
the very beginning and very end? Could we instead check to see if they
exist anywhere in the file and append/prepend only if they don't?

Writing the above made me realize that file_insert should probably be
named file_prepend since its behavior is the mirror of file_append and
doesn't insert the line at an arbitrary point.

Tiny suggestion: I think it would read better if you used

    with ssh_connection(lxcname) as ssh:
        ...

instead of

    with ssh(lxcname) as sshcall:
        ...

in initialize_lxc. The later "ssh(...)" calls would read better as
well.

We should have some error checking/reporting for all the commands we're
running (via SSH). Just something simple to stderr would help
tremendously for that inevitable time in the future when the script
fails for one reason or another.

What purpose does the time.sleep(5) in stop_lxc serve? Is it intended
as a deadline for the instance to stop cleanly? If so, it seems fine
(if a little short). We could increase it to account for the inevitable
slow shutdowns that will sometimes happen and add a probe in a loop to
see if the instance has actually stopped or not that can exit early if
the instance stops before the timeout value is reached (much as you do
in create_lxc).

On the other hand, since these are ephemeral instances, perhaps we don't
really care to let them shut down cleanly, in that case we could just
lxc-stop them without the poweroff.

Name of the Namespace class could be a bit more specific (line 584 of
the diff). Perhaps "Configuration".

Revision history for this message
Francesco Banconi (frankban) wrote :

> A question about file_insert and file_append: are the files being
> modified long-lived? If we are using file_append on a file and the user
> adds a new line to the end of the file then we will re-append another
> copy of our line(s) to that file. Do those lines really have to be at
> the very beginning and very end? Could we instead check to see if they
> exist anywhere in the file and append/prepend only if they don't?

The check is already done. However, I can improve the way it is done: file_insert checks only for the first line of the file, it could do it for the entire file, remove the line if present, put the line on top.
The order of the lines is important, for example in resolv.conf we need to make sure that the lxc bridge nameserver is the first.
s/file_insert/file_prepend: +1

> What purpose does the time.sleep(5) in stop_lxc serve? Is it intended
> as a deadline for the instance to stop cleanly? If so, it seems fine
> (if a little short). We could increase it to account for the inevitable
> slow shutdowns that will sometimes happen and add a probe in a loop to
> see if the instance has actually stopped or not that can exit early if
> the instance stops before the timeout value is reached (much as you do
> in create_lxc).
>
> On the other hand, since these are ephemeral instances, perhaps we don't
> really care to let them shut down cleanly, in that case we could just
> lxc-stop them without the poweroff.

I agree on creating a loop that checks if the container is stopped for, say, 60 seconds.
The lxc will be used as a template for ephemerals, or as a Launchpad environment for developers, so, in my opinion, lxc-stop is not sufficient.

Many thanks Benji for your detailed review, working on it.

Revision history for this message
Benji York (benji) wrote :

The script looks great, don't let the wall of text below fool you. :)

All of the points below can be ignored if you so desire, except for the
last three paragraphs. All these things seemed important enough to
note, but not important enough to block on if you don't agree. The last
three paragraphs probably do need to be addressed though. I'm confident
you'll do that so I've approved the MP.

I lost my first draft of this, so I'm afraid this is more terse than I
would like.

Since this is a stand-alone script, we don't need to populate __all__.

The leading underscores in function names aren't needed (since we use
__all__ to document a module's exported names we don't have to ugly up
names that we don't want people to use).

We can use email.Utils.parseaddr instead of _parse_whoami.

I assume the "parser" argument for bzr_whois was intended as a
dependency injection point, but since it's not used it can be removed.

A small thing, but it seems to me that

    if not content.endswith('\n'):
        f.write('\n')
    f.write(line)

would be a bit easier to read than

    f.write(line if content.endswith('\n') else '\n{}'.format(line))

Since "ssh" doesn't perform any cleanup actions (i.e., there is no code
after the yield), it could just be a regular function.

There are a few instances of code like this:

    'Error running command: {}'.format(' '.join(sshcmd))

A slight simplification would be to use string concatenation instead:

    'Error running command: ' + ' '.join(sshcmd)

I don't understand the meaning of "clean" in _clean_users,
_clean_userdata, and _clean_ssh_keys. Maybe it's an assertion that they
are clean, in that case I'd expect the functions to be named
"validate...", but they do more than validation, so that doesn't sounnd
right either. Maybe "set" would be the right word ("set_directories",
etc.).

You can use email.Utils.formataddr instead of string operations in
initialize_host to construct the user's whois string.

The os.system call in initialize_host will do the wrong thing if
checkout_dir contains a space (or quotation mark, etc.).
subprocess.call might be a better choice there.

If a function_args_map action raises an exception in main, the exception
is returned (not reraised) and the return value of main is passed to
sys.exit which expects an integer. That doesn't seem quite right.
Perhaps main should just return 1 if an exception occurs.

The dance to restart as root deserves a couple lines of commentary
explaining that if we weren't started as root then we gather up all the
info we need about the user and then restart as root. (I really like
this feature, by the way).

review: Approve (code)
Revision history for this message
Francesco Banconi (frankban) wrote :

> The script looks great, don't let the wall of text below fool you. :)

Hooray!
My thoughts on some of your observations:

> We can use email.Utils.parseaddr instead of _parse_whoami.

I agree, I didn't think about looking at the standard library :-(

> I assume the "parser" argument for bzr_whois was intended as a
> dependency injection point, but since it's not used it can be removed.

It is, see line 127 of the diff.

> Since "ssh" doesn't perform any cleanup actions (i.e., there is no code
> after the yield), it could just be a regular function.

Yes, my same doubt. I ended up using a context manager just to separate connection arguments from the real "call" argument, and to avoid repeating username and location on each ssh call.

> I don't understand the meaning of "clean" in _clean_users,
> _clean_userdata, and _clean_ssh_keys. Maybe it's an assertion that they
> are clean, in that case I'd expect the functions to be named
> "validate...", but they do more than validation, so that doesn't sounnd
> right either. Maybe "set" would be the right word ("set_directories",
> etc.).

Argh... naming things... I used "clean" as e verb, because after calling them we can assume the namespace to be "clean", usable. Maybe "handle_*" could be better?

> If a function_args_map action raises an exception in main, the exception
> is returned (not reraised) and the return value of main is passed to
> sys.exit which expects an integer. That doesn't seem quite right.

sys.exit expects other types too, and in that case it prints the string representation of the passed object to stderr and then exits with 1, that's what we want.

Thank you Benji for all your suggestions.

Revision history for this message
Benji York (benji) wrote :

> > I assume the "parser" argument for bzr_whois was intended as a
> > dependency injection point, but since it's not used it can be removed.
>
> It is, see line 127 of the diff.

I'm not seeing it, but I'll take your word for it. Oh! I think you
mean that the parameter is used, right I see that; I mean that the
*argument* is never used, i.e., no caller of bzr_whois ever passes an
argument for "parser", therefore it can be removed and its use in the
body of bzr_whois can be replaced with parseaddr.

> > Since "ssh" doesn't perform any cleanup actions (i.e., there is no code
> > after the yield), it could just be a regular function.
>
> Yes, my same doubt. I ended up using a context manager just to separate
> connection arguments from the real "call" argument, and to avoid repeating
> username and location on each ssh call.

The function can still return a callable so you get those same benefits.
The transformation would be to remove the decorator and change the yield
into a return. Then instead of doing a "with ssh(...) as foo" you'd do
a "foo = ssh(...)".

> > I don't understand the meaning of "clean" in _clean_users,
> > _clean_userdata, and _clean_ssh_keys. Maybe it's an assertion that they
> > are clean, in that case I'd expect the functions to be named
> > "validate...", but they do more than validation, so that doesn't sounnd
> > right either. Maybe "set" would be the right word ("set_directories",
> > etc.).
>
> Argh... naming things... I used "clean" as e verb, because after calling them
> we can assume the namespace to be "clean", usable. Maybe "handle_*" could be
> better?

"handle" is certainly reasonable.

> > If a function_args_map action raises an exception in main, the exception
> > is returned (not reraised) and the return value of main is passed to
> > sys.exit which expects an integer. That doesn't seem quite right.
>
> sys.exit expects other types too, and in that case it prints the string
> representation of the passed object to stderr and then exits with 1, that's
> what we want.

Well, I guess I learned something new today. :)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.