Merge lp://staging/~stub/charms/precise/postgresql/integration into lp://staging/charms/postgresql

Proposed by Stuart Bishop
Status: Merged
Merged at revision: 105
Proposed branch: lp://staging/~stub/charms/precise/postgresql/integration
Merge into: lp://staging/charms/postgresql
Diff against target: 2291 lines (+1428/-149)
24 files modified
.bzrignore (+1/-0)
Makefile (+11/-2)
README.md (+39/-0)
charm-helpers.yaml (+1/-2)
config.yaml (+71/-11)
hooks/charmhelpers/core/fstab.py (+3/-1)
hooks/charmhelpers/core/hookenv.py (+35/-20)
hooks/charmhelpers/core/host.py (+44/-9)
hooks/charmhelpers/core/services/__init__.py (+2/-0)
hooks/charmhelpers/core/services/base.py (+313/-0)
hooks/charmhelpers/core/services/helpers.py (+125/-0)
hooks/charmhelpers/core/templating.py (+51/-0)
hooks/charmhelpers/fetch/__init__.py (+51/-12)
hooks/hooks.py (+273/-58)
lib/test-client-charm/config.yaml (+13/-0)
lib/test-client-charm/hooks/hooks.py (+165/-0)
lib/test-client-charm/hooks/start (+3/-0)
lib/test-client-charm/hooks/stop (+3/-0)
lib/test-client-charm/metadata.yaml (+13/-0)
templates/postgres.cron.tmpl (+14/-12)
templates/postgresql.conf.tmpl (+3/-0)
templates/swiftwal.conf.tmpl (+5/-5)
test.py (+162/-12)
testing/jujufixture.py (+27/-5)
To merge this branch: bzr merge lp://staging/~stub/charms/precise/postgresql/integration
Reviewer Review Type Date Requested Status
Charles Butler (community) deploy + code Approve
Review Queue (community) automated testing Needs Fixing
Cory Johns (community) Needs Fixing
Adam Israel (community) Needs Fixing
Review via email: mp+233666@code.staging.launchpad.net

Commit message

Various pending bug fixes and feature work.

Description of the change

Integration branch containing all of stub's PostgreSQL charm merge proposals currently in the review queue.

Separate, smaller merge proposals also exist, with each branch in the sequence dependent on the previous. This branch is here in case anyone wants to review everything at once.

To post a comment you must log in.
Revision history for this message
Review Queue (review-queue) wrote :

The results (PASS) are in and available here: http://reports.vapour.ws/charm-tests/charm-bundle-test-960-results

review: Approve (cbt)
Revision history for this message
Adam Israel (aisrael) wrote :

Hi Stuart,

Thanks for all your work on the Postgresql charm! I've spent some time reviewing this cumulative MP against postgresql.

I should note that the above comment from review-queue is incorrect. The results did not pass automated testing.

There were a handful of errors running make lint. I fixed those and pushed the change to my personal namespace[1].

Moving on to the integration tests[2], I was able to get 19 tests to pass, with 14 failures.

I notice that a couple of the failed tests is attempting to execute `juju run --unit postgresql/1 relation-list -r replication:168`, however, relation-list only seems to be valid within the context of hook execution.

I had a great deal more failures initially, with deploy failing. I set JUJU_REPOSITORY and those tests were able to fun.

If you think it would be useful in debugging the failed tests, I can also include any juju/machine logs.

[1] lp:~aisrael/charms/precise/postgresql/fix-lint
[2] http://pastebin.ubuntu.com/8371905/

review: Needs Fixing
Revision history for this message
Stuart Bishop (stub) wrote :

'make lint' is clean when run on my trusty box. I did merge in your branch, but it introduces lint failures:

stub@belch:~/charms/postgresql/integration$ make lint
Lint check (flake8)
directory hooks
checking hooks/helpers.py
checking hooks/hooks.py
hooks/hooks.py:991:80: E501 line too long (80 > 79 characters)
hooks/hooks.py:998:80: E501 line too long (80 > 79 characters)
checking hooks/test_hooks.py
directory testing
checking testing/__init__.py
checking testing/jujufixture.py
directory tests
checking test.py
make: *** [lint] Error 1

I suspect you are running from Precise, with older versions of the lint checkers? Or a more modern version? Possibly more modern since one of the comments changed predates my maintainership. Apart from the trivially fixed long lines, your other changes look benign so I've merged it in anyway.

relation-list does work with 'juju run', because 'juju run' runs the command in a hook context (and is why it is functionally different to juju ssh). I do notice you are running juju 1.18, and I'm running the test suite with juju 1.20 (latest stable). I am not aware of any 1.20 features or bug fixes that I'm relying on, so 1.18 could be fine.

JUJU_REPOSITORY needs to be set, yes. I think we are stuck with this until I can port the tests to Amulet. Or when juju lets me deploy the cwd as a charm rather than requiring this baroque structure.

The tests are flaky, and will alas remain flaky until Bug #1200267 is addressed. I can increase the various sleeps scattered around the code base to make things better with slower providers, which will of course make the test suite even slower. I have increased what I think is the relevant timeout, and am now ignoring all failures from 'juju run' in this section (rather than just errcode==2, which is what we usually see when juju is caught with its pants down) - I think there was a race here where I might attempt to list units in a relation before the relation has been joined and an error code returned (the juju run failures you see).

The test_failover failure is interesting. I think for you it was failing just due to not waiting long enough. For me, I think that a race condition has been introduced with juju 1.20. I need to investigate this further.

Revision history for this message
Adam Israel (aisrael) wrote :

Hey Stuart,

Interesting. I'm not sure why make lint is producing different results. My juju version is 1.20.7-trusty-amd64, on a recently updated Trusty host, but the machine's agent version is 1.18.4.1.

Perhaps it's an issue of running inside a Vagrant VM.

I'll re-run the tests natively and see if there are any differences. It's entirely possible that something in the vagrant cloud image is different.

Revision history for this message
Adam Israel (aisrael) wrote :

Confirmed -- my Vagrant VM has a slightly newer version of flake8, which I suspect was installed via pip.

Revision history for this message
Stuart Bishop (stub) wrote :

I think I've addressed everything from Adam's review now.

Revision history for this message
Cory Johns (johnsca) wrote :

Stuart,

I still had test failures, which are all errors in the replication-relation-joined hook, and which doesn't appear to be directly related to a change from this set.

* Test run output: http://pastebin.ubuntu.com/8473480/
* Hook error from log: http://pastebin.ubuntu.com/8473481/

Also, while definitely not related to this change set, the tests for this charm do have some significant hurdles to passing in the automated test environment. Specifically:

* The `make test` target doesn't force the `make testdep` target, causing the `make test` target to fail if run first (which is done by the automated test runner). Once the runner manually discovers and runs 00_setup.py, the rest of the tests will fail, but `make test` will already have failed by that point. (This might be best fixed by having the runner discover and run the testdep target, though I do think it would be more intuitive to have the test target depend on the testdep target.)

* As you noted in your previous comment, the tests rely on the JUJU_REPOSITORY and SERIES (if being tested against precise) environment variables. The former may be set by the test runner, but it is unlikely that the latter will be.

* The tests for this charm require a local checkout of the postgresql-psql charm, with no fallback to installing from the store (which is what's causing this failure: http://reports.vapour.ws/charm-tests/charm-bundle-test-960-results/charm/charm-testing-aws/6).

Even though these weren't introduced with this changeset, we really need to address them to get the postgresql charm passing in the automated test environment.

review: Needs Fixing
Revision history for this message
Stuart Bishop (stub) wrote :

The exception traceback you pastebinned is very interesting. The "user
= rel['user']" line does not exist in this branch, so I think an old
version of the charm is being deployed somehow. The actual exception
looks familiar, and I think it is one that is fixed in this branch.
Hopefully it is something simple like the trusty charm is being run
from JUJU_REPOSITORY, whereas you were hoping for the precise branch
(the test run appears to be using trusty as the series). I beat my
head against a similar wall the other day, never finding out how juju
was managing to deploy a charm that didn't exist, until I totally
rebuild my JUJU_REPOSITORY.

I can add testdep as a requirement to the makefile easily enough. Is
'charm test' no longer the proposed method of running the tests?
(done)

I don't know why SERIES is required. The makefile does "SERIES :=
$(juju get-environment default-series)" at the top, so unless it is
overridden it will be using the default series of the environment.
(unchanged - charm version issue?)

I will fix things to use the charmstore version of postgresql-psql if
the series is Precise. There is not a version in the charm store for
Trusty. Ideally, there will never be as I'd like to just include this
charm inside the PostgreSQL charm somehow (in much the same way that
Amulet launches its sentinal charms). I haven't looked at this yet, as
I've been putting it off until I have time to switch things to Amulet.
Perhaps there is some syntax that will always use the precise branch
of the postgresql-psql charm no matter if I'm deploying to trusty or
precise hosts? (done for precise)

I'd be interested in hearing thoughts on the code changes and style btw.

On 2 October 2014 00:05, Cory Johns <email address hidden> wrote:

> Even though these weren't introduced with this changeset, we really need to address them to get the postgresql charm passing in the automated test environment.

Does this block landing? Let me know (and how to go about checking),
as I've been landing other branches to this charm from other
contributers without such scrutiny.

The current branch will also fail on the automated test environment,
and it doesn't seem helpful to block on this (and it means nobody’s
changes to the charm can land, since I doubt anyone else will be
attempting to fix this). I can simply disable most of the tests if it
is a requirement (and still have more tests than most charms).

--
Stuart Bishop <email address hidden>

Revision history for this message
Cory Johns (johnsca) wrote :

> The "user = rel['user']" line does not exist in this branch,
> so I think an old version of the charm is being deployed somehow.

I was testing by merging this branch into trunk, and the line is there. It looks like it was added upstream in this commit: http://bazaar.launchpad.net/~charmers/charms/precise/postgresql/trunk/revision/102

> Is 'charm test' no longer the proposed method of running the tests?

The automated test runner uses bundletester, which does the equivalent of `charm test` but also does some additional discovery, which includes running `make test` before doing `charm test`. Also, it was just less intuitive for me to check for a testdep target before running `make test`; I was all ready to nack you for missing dependencies. :p

Tim Van Steenburgh is working on a blog post with detailed info about the automated test runner and recommended way for running tests locally to ensure they pass for the automated runner, which we hope to have out soon. In the meantime, the bundletester tool can be `pip install`ed from https://pypi.python.org/pypi/bundletester/0.3.7

> I don't know why SERIES is required.

You're absolutely right. I checked the charms out under my local precise namespace, but was running with the default series set to trusty and was confused by why it was trying to deploy the charms as trusty, so that's my bad.

> Perhaps there is some syntax that will always use the precise branch
> of the postgresql-psql charm no matter if I'm deploying to trusty or
> precise hosts?

Just removing the conditional and hard-coding PSQL_CHARM to "cs:precise/postgresql-psql" should work.

> Does this block landing?

No, not really. I certainly won't push to block this MP based on upstream test failures, but that doesn't mean I won't push for you to fix them if you're willing. :)

Although, failing tests from upstream *do* make it more difficult to review the new changes, especially on a complex charm like this one.

Also, our goal is to gate reviews on the automated tests, to improve quality and help us manage the queue of reviews more effectively. One of our milestones on that goal is to get all of the failing charms on http://reports.vapour.ws/charm-tests-by-charm passing, or to back them out of Recommended status (unpromulgate) if we can't fix them or get the maintainer to fix them.

Revision history for this message
Cory Johns (johnsca) wrote :

For reference, here is the blog post I mentioned: http://blog.juju.solutions/cloud/juju/2014/10/02/charm-testing.html

Stuart, I will take a look at your updates shortly to see if they resolve the issue I was seeing. Thanks.

Revision history for this message
Stuart Bishop (stub) wrote :

This is where I'm at at the moment:

PostgreSQL integration tests, all non-beta versions,
trial test.PG91Tests
test
  PG91Tests
    test_admin_addresses ... [OK]
    test_basic ... [OK]
    test_basic_admin ... [OK]
    test_explicit_database ... [OK]
    test_failover ... [ERROR]
    test_failover_election ... [OK]
    test_roles_granted ... [FAIL]
    test_roles_revoked ... [FAIL]
    test_streaming_replication ... [OK]
    test_swiftwal_logshipping_replication ... [OK]
    test_syslog ... [OK]
    test_upgrade_charm ... [OK]
    test_wal_e_swift_logshipping ... [OK]

Revision history for this message
Stuart Bishop (stub) wrote :

Looks sorted now. Bugs were on trunk, so trunk is busted until this branch lands.

114. By Stuart Bishop

Merge trunk

Revision history for this message
Review Queue (review-queue) wrote :

This items has failed automated testing! Results available here http://reports.vapour.ws/charm-tests/charm-bundle-test-1227-results

review: Needs Fixing (automated testing)
115. By Stuart Bishop

Make hooks.py executable

116. By Stuart Bishop

Embed test client charm and remove JUJU_REPOSITORY requirement

117. By Stuart Bishop

Allow charm store charms, fixing rsyslog test

118. By Stuart Bishop

Ignore magic synced charmhelpers

Revision history for this message
Whit Morriss (whitmo) wrote :

Hey Stuart,

I'm running into issues getting bundletester to run the tests. It gets through lint and proof, then hangs, and exits after printing "Alarm clock" to stdout. psql and postgres seem to have deployed ok.

This may be an issue with my environment. still investigating.

-w

Revision history for this message
Stuart Bishop (stub) wrote :

Yeah, I've got another MP up disabling the integration tests by default as having them is not being helpful. If you work out what assumptions bundletester is making (or vice versa), we can fix that in a separate branch. My prime concern with this branch is actually landing it, as trunk is broken and fixes have been stuck here for months.

Revision history for this message
Charles Butler (lazypower) wrote :

Stub,

I've taken some time to re-review this branch with a mindset of "Focus on the code and verify the charm works" - And indeed. This branch of postgres fixes the upstream brokeneness that I've seen.

I'm going to tentatively approve this branch on the good faith promise that there will be an update correcting the testing story - as the PG91 tests are whats failing consistently when I test - but with the severity and longevity of this response cycle - I see value in whats here with additional work scheduled elsewhere to correct our infra problems re: testing.

Thanks again for the continued effort and work to keep the postgres charm a high quality member of the charm store.

If you have any questions/comments/concerns about the review contact us in #juju on irc.freenode.net or email the mailing list <email address hidden>, or ask a question tagged with "juju" on http://askubuntu.com.

review: Approve (deploy + code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches