Ok, I've finished a comprehensive review of the integration tests. There were several tests failing that I've determined to be a failure in the test infrastructure. I've landed a patch to bundletester that was running tests more than once. I'll be sending patches to amulet, juju-deployer, and juju-wait (the change from max_workers=6 to max_workers=1 fixed the lock timeout error). With the above patches implemented locally, I've been able to run the 9.1 integration test with only one failure, that looks like it's ultimately erroneous: test_syslog initially failed on the deploy command. This line: self.deployment.add('rsyslog', 'cs:rsyslog', 2) should read: self.deployment.add('rsyslog', 'rsyslog', 2) "cs:rsyslog" returns a 404 - see https://manage.jujucharms.com/api/3/charm/trusty/cs:rsyslog With that fixed, the test runs but fails it's assert checks: _StringException: Traceback (most recent call last): File "/charms/trusty/postgresql/test.py", line 697, in test_syslog self.failUnless('master {}'.format(token) in out) File "/usr/lib/python2.7/unittest/case.py", line 614, in deprecated_func return original_func(*args, **kwargs) File "/usr/lib/python2.7/unittest/case.py", line 424, in assertTrue raise self.failureException(msg) AssertionError: False is not true I think the failure is wrong, though. I ssh'd into each rsyslog unit after the failure and `tail -n 100 /var/log/syslog | grep`ped for each string and found it. While running test_explicit_database, I ran into a case where the replication-relation-broken failed on tearDown, and reset_environment wasn't cycling through to resolve failures. The test passed when I flipped the force flag in amuletfixture.py, but that might not be what you intended. The stack trace of the failure: 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 Traceback (most recent call last): 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication-relation-broken", line 2701, in 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 hooks.execute(sys.argv) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/hookenv.py", line 490, in execute 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 self._hooks[hook_name]() 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication-relation-broken", line 2263, in replication_relation_broken 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 config_changed() 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication-relation-broken", line 1140, in config_changed 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 generate_postgresql_hba(postgresql_hba) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication-relation-broken", line 521, in generate_postgresql_hba 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 unit=hookenv.local_unit(), rid=relid) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/hookenv.py", line 44, in wrapper 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 res = func(*args, **kwargs) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/hookenv.py", line 289, in relation_get 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 return json.loads(subprocess.check_output(_args)) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 File "/usr/lib/python2.7/subprocess.py", line 573, in check_output 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 raise CalledProcessError(retcode, cmd, output=output) 2015-03-26 15:17:01 INFO unit.postgresql/0.replication-relation-broken logger.go:40 subprocess.CalledProcessError: Command '['relation-get', '--format=json', '-r', u'db:11', '-', 'postgresql/0']' returned non-zero exit status 1 Other changes: - increased the default timeout in amuletfixture.py from 1200 to 3600 (may not be necessary) - in test.py:test_failover, I changed three instances of self.deployment.wait() to self.deployment.wait(timeout=self.timeout) Each unit test took around an hour to run. I wonder if there's anything to be done to shorten that without sacrificing the completeness of the unit tests. Perhaps limit calls to `apt-get update` in `hooks/hooks.py`; it's called twice, on line 32 and again on line 1848. If one of those calls could be eliminated, that would probably shave a few minutes off the test time. With the above few fixes, plus the merges I need to land in the tools, I think we'll be in good shape to land these changes. Thanks again for all of your work and patience on this!