Merge lp://staging/~hazmat/txzookeeper/backoff-retry-managed-sanity into lp://staging/~hazmat/txzookeeper/trunk

Proposed by Kapil Thangavelu
Status: Merged
Merge reported by: Kapil Thangavelu
Merged at revision: not available
Proposed branch: lp://staging/~hazmat/txzookeeper/backoff-retry-managed-sanity
Merge into: lp://staging/~hazmat/txzookeeper/trunk
Diff against target: 3336 lines (+2225/-241)
27 files modified
.bzrignore (+4/-0)
Makefile (+6/-0)
debian/changelog (+18/-0)
setup.py (+6/-3)
txzookeeper/__init__.py (+5/-2)
txzookeeper/client.py (+159/-79)
txzookeeper/lock.py (+6/-3)
txzookeeper/managed.py (+413/-0)
txzookeeper/node.py (+7/-26)
txzookeeper/queue.py (+5/-2)
txzookeeper/retry.py (+359/-0)
txzookeeper/tests/__init__.py (+57/-7)
txzookeeper/tests/common.py (+27/-2)
txzookeeper/tests/proxy.py (+97/-0)
txzookeeper/tests/test_client.py (+63/-77)
txzookeeper/tests/test_conn_failure.py (+194/-0)
txzookeeper/tests/test_lock.py (+4/-1)
txzookeeper/tests/test_managed.py (+309/-0)
txzookeeper/tests/test_node.py (+6/-3)
txzookeeper/tests/test_queue.py (+5/-2)
txzookeeper/tests/test_retry.py (+332/-0)
txzookeeper/tests/test_security.py (+4/-1)
txzookeeper/tests/test_session.py (+111/-20)
txzookeeper/tests/test_utils.py (+4/-1)
txzookeeper/tests/utils.py (+4/-1)
txzookeeper/todo.txt (+1/-9)
txzookeeper/utils.py (+19/-2)
To merge this branch: bzr merge lp://staging/~hazmat/txzookeeper/backoff-retry-managed-sanity
Reviewer Review Type Date Requested Status
Kapil Thangavelu Pending
Review via email: mp+146938@code.staging.launchpad.net

Description of the change

Session management and retry improvements.

Lots of improvements and several bug fixes.

Previously managed clientwas returning the underlying session client instead
of retry, so retry wasn't enabled for uses trying to catch/store the client from
connect.

The retry code was using seconds for session timeout instead of milliseconds.

The retry code wasn't taking account of the start of retry properly when
determining max retry.

The retry on persistent error wasn't signaling to the managed client to
establish a new session.

The client code wasn't properly cleaning up txzk handles on conn timeouts.

The session establishment now uses the client session events so even without
activity or active watches the session is restablished.

Session restablishment will now backoff on reconnects for up to 6m.

https://codereview.appspot.com/7307051/

To post a comment you must log in.
Revision history for this message
Kapil Thangavelu (hazmat) wrote :

Reviewers: mp+146938_code.launchpad.net,

Message:
Please take a look.

Description:
Session management and retry improvements.

Lots of improvements and several bug fixes.

Previously managed clientwas returning the underlying session client
instead
of retry, so retry wasn't enabled for uses trying to catch/store the
client from
connect.

The retry code was using seconds for session timeout instead of
milliseconds.

The retry code wasn't taking account of the start of retry properly when

determining max retry.

The retry on persistent error wasn't signaling to the managed client to
establish a new session.

The client code wasn't properly cleaning up txzk handles on conn
timeouts.

The session establishment now uses the client session events so even
without
activity or active watches the session is restablished.

Session restablishment will now backoff on reconnects for up to 6m.

https://code.launchpad.net/~hazmat/txzookeeper/backoff-retry-managed-sanity/+merge/146938

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/7307051/

Affected files:
   A .bzrignore
   A [revision details]
   M debian/changelog
   M setup.py
   M txzookeeper/__init__.py
   M txzookeeper/client.py
   M txzookeeper/lock.py
   A txzookeeper/managed.py
   M txzookeeper/node.py
   M txzookeeper/queue.py
   A txzookeeper/retry.py
   M txzookeeper/tests/__init__.py
   M txzookeeper/tests/common.py
   A txzookeeper/tests/proxy.py
   M txzookeeper/tests/test_client.py
   A txzookeeper/tests/test_conn_failure.py
   M txzookeeper/tests/test_lock.py
   A txzookeeper/tests/test_managed.py
   M txzookeeper/tests/test_node.py
   M txzookeeper/tests/test_queue.py
   A txzookeeper/tests/test_retry.py
   M txzookeeper/tests/test_security.py
   M txzookeeper/tests/test_session.py
   M txzookeeper/tests/test_utils.py
   M txzookeeper/tests/utils.py
   M txzookeeper/todo.txt
   M txzookeeper/utils.py

Revision history for this message
Kapil Thangavelu (hazmat) wrote :
52. By Kapil Thangavelu

much better session recovery and retry handling

Revision history for this message
Kapil Thangavelu (hazmat) wrote :
Revision history for this message
Kapil Thangavelu (hazmat) wrote :

for reasons unknown the diff on this very wrong, all the other revs other than 52 are already on trunk.

Revision history for this message
Benjamin Saller (bcsaller) wrote :

Hi,

Thanks for the branch. Initially I had an older version of pyzookeeper in my path and the tests were segfaulting, but after tracking that down I was able to get things working smoothly.

Coverage shows that the session backoff code isn't tested, if I'm reading it properly self._backoff_seconds is never incremented in the tests and so some of the focus of this branch goes without regression testing.

It looks like retry_error_callback never triggers cb_retry_expired in the tests either.

Nice simplification in node.py.

This looks good but I would like to see a few more tests around the retry path.

53. By Kapil Thangavelu

app errors don't trigger reconnect

Revision history for this message
Kapil Thangavelu (hazmat) wrote :

the test for backoff are in test_session which operates a full zk cluster
and has a test case that verifies backoff has run. there's a new makefile
with one target .. make coverage that runs through the full test suite. in
testing with juju environments, i discovered another minor issue with
retry.

thanks for the review.

-k

On Wed, Feb 6, 2013 at 7:15 PM, Benjamin Saller <
<email address hidden>> wrote:

> Hi,
>
> Thanks for the branch. Initially I had an older version of pyzookeeper in
> my path and the tests were segfaulting, but after tracking that down I was
> able to get things working smoothly.
>
> Coverage shows that the session backoff code isn't tested, if I'm reading
> it properly self._backoff_seconds is never incremented in the tests and so
> some of the focus of this branch goes without regression testing.
>
> It looks like retry_error_callback never triggers cb_retry_expired in the
> tests either.
>
> Nice simplification in node.py.
>
> This looks good but I would like to see a few more tests around the retry
> path.
>
>
>
> --
>
> https://code.launchpad.net/~hazmat/txzookeeper/backoff-retry-managed-sanity/+merge/146938
> You are the owner of lp:~hazmat/txzookeeper/backoff-retry-managed-sanity.
>

54. By Kapil Thangavelu

argh.. conditional fail

55. By Kapil Thangavelu

prevent forced reconnect hurds

Revision history for this message
Mikael Uvebrandt (mikael-uvebrandt) wrote :

Would love to see this in a release soon - we've seen this reconnect hammering in production, and it's conceivable that it contributed to killing the zookeeper (3.3.3) cluster in one instance (open files limit hit).

56. By Kapil Thangavelu

incr rev

57. By Kapil Thangavelu

fix typo in reconnect test assert, cleanup hurd logic

58. By Kapil Thangavelu

work around libzk bug to make session expiration tests more reliable

Revision history for this message
Kapil Thangavelu (hazmat) wrote :

Merged to trunk.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to all changes: