Merge lp://staging/~rogpeppe/gozk/safe-close into lp://staging/gozk/zookeeper

Proposed by Roger Peppe
Status: Merged
Merged at revision: 31
Proposed branch: lp://staging/~rogpeppe/gozk/safe-close
Merge into: lp://staging/gozk/zookeeper
Diff against target: 523 lines (+301/-13)
5 files modified
close_test.go (+220/-0)
retry_test.go (+3/-3)
suite_test.go (+4/-3)
zk.go (+68/-1)
zk_test.go (+6/-6)
To merge this branch: bzr merge lp://staging/~rogpeppe/gozk/safe-close
Reviewer Review Type Date Requested Status
The Go Language Gophers Pending
Review via email: mp+94812@code.staging.launchpad.net

Description of the change

zookeeper: make Conn.Close safe to call concurrently with other operations.

I am concerned at how complex the test for this is,
relative to the simplicity of the actual code change.
(and it's also potentially fragile on a heavily loaded machine).
That said, I can't think of another way to test the
operations directly, so I'm leaving it in.
Better suggestions welcome.

https://codereview.appspot.com/5699093/

To post a comment you must log in.
Revision history for this message
Roger Peppe (rogpeppe) wrote :

Reviewers: mp+94812_code.launchpad.net,

Message:
Please take a look.

Description:
I am concerned at how complex the test for this is,
relative to the simplicity of the actual code change.
(and it's also potentially fragile on a heavily loaded machine).
That said, I can't think of another way to test the
operations directly, so I'm leaving it in.
Better suggestions welcome.

https://code.launchpad.net/~rogpeppe/gozk/safe-close/+merge/94812

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/5699093/

Affected files:
   A close_test.go
   M retry_test.go
   M suite_test.go
   M zk.go
   M zk_test.go

Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

https://codereview.appspot.com/5699093/diff/1001/close_test.go
File close_test.go (right):

https://codereview.appspot.com/5699093/diff/1001/close_test.go#newcode96
close_test.go:96: time.Sleep(0.05e9)
How can you tell this is really unblocking at the right time and
exercising what you intend it to at all?

https://codereview.appspot.com/5699093/

Revision history for this message
Roger Peppe (rogpeppe) wrote :
33. By Roger Peppe

improve close_test comment.
simplify io.Copy logic.

Revision history for this message
William Reade (fwereade) wrote :

This essentially LGTM; I don't think that it's a serious problem that
the tests can theoretically fail under "enough" load. So it comes down
to a question of practical reliability: for the sake of argument, what
if we shoot for an observed failure rate of <1% on your local machine
under "normal" load, and bump up the sleeps as needed if they turn out
not to be good enough for clean testing in practice (say, on ARM ;))?

https://codereview.appspot.com/5699093/

Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

LGTM, sorry for the delay.

https://codereview.appspot.com/5699093/

Revision history for this message
Roger Peppe (rogpeppe) wrote :

i forgot to publish this comment.

https://codereview.appspot.com/5699093/diff/1001/close_test.go
File close_test.go (right):

https://codereview.appspot.com/5699093/diff/1001/close_test.go#newcode96
close_test.go:96: time.Sleep(0.05e9)
On 2012/02/27 18:03:17, niemeyer wrote:
> How can you tell this is really unblocking at the right time and
exercising what
> you intend it to at all?

the idea is that any request that requests or changes a zookeeper node
must make at least one round trip to the server. so we interpose a proxy
between the client and the server which can stop all incoming traffic on
demand, thus hopefully blocking the request until we want it to unblock.

we assume that all requests take less than 0.1s to complete, thus when
we wait below, neither of the above goroutines should complete within
the allotted time (the request because it's waiting for a reply from the
server and the close because it's waiting for the request to complete).
if the locking doesn't work, the Close will return early. if the proxy
blocking doesn't work, the request will return early.

when we reenable incoming messages from the server, both goroutines
should complete (we can't tell which completes first - i don't think
that's observable) but i think that the fact that the close blocked
waiting for the request is sufficient.

i've changed the above comment to try to make this clearer.

as i said in the description of the CL, i think this a lot of mechanism
to test something that's quite simple to verify by eye, but i haven't
yet thought of a better way.
one way could be to test Close without testing all the individual
operations - simpler but less complete.

https://codereview.appspot.com/5699093/

Revision history for this message
Roger Peppe (rogpeppe) wrote :

*** Submitted:

zookeeper: make Conn.Close safe to call concurrently with other
operations.

I am concerned at how complex the test for this is,
relative to the simplicity of the actual code change.
(and it's also potentially fragile on a heavily loaded machine).
That said, I can't think of another way to test the
operations directly, so I'm leaving it in.
Better suggestions welcome.

R=niemeyer, fwereade
CC=
https://codereview.appspot.com/5699093

https://codereview.appspot.com/5699093/

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches