Merge lp://staging/~vila/ubuntu-ci-services-itself/secgroups into lp://staging/ubuntu-ci-services-itself

Proposed by Vincent Ladeuil
Status: Rejected
Rejected by: Vincent Ladeuil
Proposed branch: lp://staging/~vila/ubuntu-ci-services-itself/secgroups
Merge into: lp://staging/ubuntu-ci-services-itself
Diff against target: 118 lines (+96/-0)
2 files modified
juju-deployer/test_update.py (+7/-0)
leon.py (+89/-0)
To merge this branch: bzr merge lp://staging/~vila/ubuntu-ci-services-itself/secgroups
Reviewer Review Type Date Requested Status
Francis Ginther Needs Information
PS Jenkins bot (community) continuous-integration Approve
Review via email: mp+206771@code.staging.launchpad.net

Commit message

Leon will cleanup after test runs by killing leaked security groups.

Description of the change

I don't have time to fully debug who/why security groups are leaked during
the test runs.

So for now I resort to semi-manual cleanup: leon.py.

In addition to the usual nova credentials, this requires knowing which juju
environment is current.

Since I don't want/care to automate that for now, you have to manually
provide those in a unit_config file at the root of your branch.

From there, Leon will get rid of those annoying security groups.

Leon doesn't think nor does it try to handle errors, it just shoot any
security group it can find (but preserve a few precious ones, see code for
details).

Feel free to enhance to suit your tastes but let's try to resort to Leon
only for stuff we cannot handle better from the tests themselves ;)

To post a comment you must log in.
241. By Vincent Ladeuil

Leon is not verbal, but a little feedback is good.

242. By Vincent Ladeuil

Document what I know how the precious security groups.

Revision history for this message
Evan (ev) wrote :

Could we refactor this so that we use it in tests/run.py to clean up
*just* the security groups that were created in the test run (minus
precious)?

+ # FIXME: There is probably a better way, but for now, just require
+ # that unit_config defines 'juju_env' and be done. -- vila
+ # 2014-02-17
+ auth_conf.get('juju_env')

tests/run.py handles this well:

    juju_env = os.environ.get('JUJU_ENV')
    if juju_env is None:
        juju_env = envs.get('default')
    if not juju_env:
        log.error('No juju environment specified.')
        sys.exit(1)

Revision history for this message
Vincent Ladeuil (vila) wrote :

Note that nova will fail to delete a security group if it's still in use which offers a *little* protection against Leon's misuse.

Revision history for this message
Evan (ev) wrote :

Ah, nevermind that latter point. I misread :)

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:241
No commit message was specified in the merge proposal. Click on the following link and set the commit message (if you want a jenkins rebuild you need to trigger it yourself):
https://code.launchpad.net/~vila/ubuntu-ci-services-itself/secgroups/+merge/206771/+edit-commit-message

http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/159/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/159/rebuild

review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:242
No commit message was specified in the merge proposal. Click on the following link and set the commit message (if you want a jenkins rebuild you need to trigger it yourself):
https://code.launchpad.net/~vila/ubuntu-ci-services-itself/secgroups/+merge/206771/+edit-commit-message

http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/160/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/160/rebuild

review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:242
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/161/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/161/rebuild

review: Needs Fixing (continuous-integration)
243. By Vincent Ladeuil

Shooting a bit in the dark to make the test pass on s-jenkins, the theory is that bzr is not setup there but we don't need a full setup either.

Revision history for this message
Vincent Ladeuil (vila) wrote :

> Could we refactor this so that we use it in tests/run.py to clean up
> *just* the security groups that were created in the test run (minus
> precious)?

That's part of what I'm doing in lp:~vila/ubuntu-ci-services-itself/integration and I'd rathe avoid duplicating the work especially since we don't yet know who/why these groups fail and a fix can come magically from juju or juju-deployer.

In the mean time, I think cleaning up after a test run while collecting data is the minimum we should do.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:242
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/164/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/164/rebuild

review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:243
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/165/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/165/rebuild

review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

PASSED: Continuous integration, rev:242
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/166/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/166/rebuild

review: Approve (continuous-integration)
Revision history for this message
Vincent Ladeuil (vila) wrote :

So, investigating why the s-jenkins job was failing for *this* MP, it turns out that *trunk* was used to run the tests which means:
- trunk is broken but the last revision in this MP fixes it,
- our tarmac setup still let a broken branch land and we probably want to fix that for good.

Revision history for this message
Chris Johnston (cjohnston) wrote :

Trunk is not presently broken. It has to do with the env that the tests are
running in. Something may have changed to make it to where the current
setup for the s-jenkins env isn't able to properly handle running the
tests. Running the tests locally, the passed just fine. I have not yet seen
a single time where tarmac landed code with failing tests.

Revision history for this message
Andy Doan (doanac) wrote :

while "leon.py" is meant to be temporary. i think the fact it requires a 5 line explanation of its name indicates we could probably give it a clearer name for the time being. how about rename it to clean-secgroups.py or something.

Most importantly revno 243 is required before tarmac.sh is going to work on s-jenkins again. So we need this merged just to get other MPs flowing again.

Revision history for this message
Andy Doan (doanac) wrote :

19 + # FIXME: uci-engine-ci is a bit dry and needs some help. Postponing
20 + # full isolated setup for now -- vila 2014-02-07

this is a bit subjective from me. but do we really need this comment in code? ie what's the long term idea you have for ensuring the fact that calls to bzr won't need BZR_EMAIL setup? I just fear comments like this, wind up in code and then live forever causing noise. maybe its just a better explanation is needed in the FIXME so i understand what you are envisioning.

Revision history for this message
Chris Johnston (cjohnston) wrote :

Does firewall-mode: global not work as recommended [1] and by ev:

2014-02-14 11:46:49 ev cjohnston: I think we're safe to enable `firewall-mode: global` here for now
2014-02-14 11:47:03 ev it's okay if these things are all listening on 8080

This would remove the need for the script and for a 5 line description of what leon means.

[1] https://bugs.launchpad.net/juju/+bug/1027641/comments/7

Revision history for this message
Paul Larson (pwlars) wrote :

> Does firewall-mode: global not work as recommended [1] and by ev:
>
> 2014-02-14 11:46:49 ev cjohnston: I think we're safe to enable
> `firewall-mode: global` here for now
> 2014-02-14 11:47:03 ev it's okay if these things are all listening on
> 8080
>
> This would remove the need for the script and for a 5 line description of what
> leon means.
>
> [1] https://bugs.launchpad.net/juju/+bug/1027641/comments/7
+1 on the name change. I'm sad to say that my parser ignored the comments and I spent a few (pre-coffee) minutes staring at "Note that nova will fail to delete a security group if it's still in use which offers a *little* protection against Leon's misuse." this morning wonder who the hell Leon is and why he's misusing our stuff. :)

That being said, have you actually tried the `firewall-mode: global` option? Admittedly, it's been a while since I tried it, and I did so on hpcloud where there were (and still are) other problems, but it didn't work well for me at the time. If it works, then I like that approach better.

Revision history for this message
Chris Johnston (cjohnston) wrote :

On Tue, Feb 18, 2014 at 6:22 PM, Paul Larson <email address hidden>wrote:

>
> That being said, have you actually tried the `firewall-mode: global`
> option? Admittedly, it's been a while since I tried it, and I did so on
> hpcloud where there were (and still are) other problems, but it didn't work
> well for me at the time. If it works, then I like that approach better.
>

I haven't gotten around to trying it yet. If it doesn't work, then we need
to get a bug filed as well, but I would rather have that attempted first.

Revision history for this message
Francis Ginther (fginther) wrote :

I'm taking a look at leon with a use case that I have juju deployed nodes that are unrelated to the ucsi nodes and they shouldn't be touched. I'll admit I'm more than a bit nervous about automatically removing security groups, but realize that our options are limited when it comes to automating. Just need to do some deeper review here.

Also the change to make jenkins work again is necessary regardless of leon, let's get it merged: https://code.launchpad.net/~cjohnston/ubuntu-ci-services-itself/fix-bzr-error/+merge/207218

Revision history for this message
Francis Ginther (fginther) wrote :

> I'm taking a look at leon with a use case that I have juju deployed nodes that
> are unrelated to the ucsi nodes and they shouldn't be touched. I'll admit I'm
> more than a bit nervous about automatically removing security groups, but
> realize that our options are limited when it comes to automating. Just need to
> do some deeper review here.
>
> Also the change to make jenkins work again is necessary regardless of leon,
> let's get it merged: https://code.launchpad.net/~cjohnston/ubuntu-ci-
> services-itself/fix-bzr-error/+merge/207218

Wait, on closer inspection, leon isn't automatically executed. I may be jumping to conclusions a bit on this.

Revision history for this message
Chris Johnston (cjohnston) wrote :

I tried firewall-mode: global this evening. After two runs, I only have two secgroups.. Seems that it works just fine. I'd recommend we stick to that.

Revision history for this message
Francis Ginther (fginther) wrote :

This is a reasonable addition to me to quickly clean things up when I know what I'm doing, but I'm hitting an authorization issue:

Traceback (most recent call last):
  File "./leon.py", line 92, in <module>
    leon.clean_spurious_security_groups()
  File "./leon.py", line 73, in clean_spurious_security_groups
    groups = self.nova.security_groups.list()
  File "/tmp/venv/local/lib/python2.7/site-packages/python_novaclient-2.15.0-py2.7.egg/novaclient/v1_1/security_groups.py", line 96, in list
    'security_groups')
--snip a bunch of novaclient code--
  File "/tmp/venv/local/lib/python2.7/site-packages/python_novaclient-2.15.0-py2.7.egg/novaclient/client.py", line 189, in request
    raise exceptions.from_response(resp, body, url, method)
novaclient.exceptions.Unauthorized: The request you have made requires authentication. (HTTP 401)

I have a unit_config file with the auth* and juju_env parameters defined. I can also manually run "nova secgroup-list" and view my security groups. Do I need to have some authorization enabled on my account?

review: Needs Information
Revision history for this message
Vincent Ladeuil (vila) wrote :

> This is a reasonable addition to me to quickly clean things up when I know
> what I'm doing, but I'm hitting an authorization issue:
>
> Traceback (most recent call last):
> File "./leon.py", line 92, in <module>
> leon.clean_spurious_security_groups()
> File "./leon.py", line 73, in clean_spurious_security_groups
> groups = self.nova.security_groups.list()
> File "/tmp/venv/local/lib/python2.7/site-packages/python_novaclient-2.15.0-p
> y2.7.egg/novaclient/v1_1/security_groups.py", line 96, in list
> 'security_groups')
> --snip a bunch of novaclient code--
> File "/tmp/venv/local/lib/python2.7/site-
> packages/python_novaclient-2.15.0-py2.7.egg/novaclient/client.py", line 189,
> in request
> raise exceptions.from_response(resp, body, url, method)
> novaclient.exceptions.Unauthorized: The request you have made requires
> authentication. (HTTP 401)
>
>
> I have a unit_config file with the auth* and juju_env parameters defined. I
> can also manually run "nova secgroup-list" and view my security groups. Do I
> need to have some authorization enabled on my account?

Not that I know of, but your use case is killing Leon for sure, the slight protection excuse doesn't hold anymore for people that have multiple deployments in flight. We would need a far more precise behavior for that.

Additionally, since Chris validated firewall-mode, cleaning the secgroups is not an issue anymore.

If we decide to revert firewall-mode usage later, we may discover that juju[-deployer] has been fixed.

Finally, I realized that I got my movie reference wrong, I was thinking about Victor in Nikita (but Jean Reno plays the same kind of guy which confused me ;)

Unmerged revisions

243. By Vincent Ladeuil

Shooting a bit in the dark to make the test pass on s-jenkins, the theory is that bzr is not setup there but we don't need a full setup either.

242. By Vincent Ladeuil

Document what I know how the precious security groups.

241. By Vincent Ladeuil

Leon is not verbal, but a little feedback is good.

240. By Vincent Ladeuil

Ask Leon to kill spurious security groups.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches