Code review comment for lp://staging/~rogpeppe/gozk/update-server-interface

Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

Good changes overall, thanks. Some follow ups:

[7]

You've renamed installDir to zkDir. Let's rename functions and filenames
to match that as well (zkdir.txt, writeZkDir, etc)

[8]

Ok, but this logic is wrong because it doesn't check the length and could
panic.

Alright, so instead of doing either of those things, please just don't append
a '\n' at the end, so we can just string(data) straight. This is already
being done for pid.txt, and sounds like a sane approach.

[10]

Indeed, but that feels like a slightly different case because the method
is already a question (does it exist?), so the fact it doesn't exist
is an answer to that question rather than an error.

If the method was named something like Stat, it should definitely be an
error rather than simply an empty stat.

[11]

+// ServerCommand returns the command used to start the
+// ZooKeeper server. It is provided for debugging and testing
+// purposes only.
+func (srv *Server) command() ([]string, os.Error) {

s/ServerCommand/command/
s/It is provided.*//

[12]

+// NetworkPort returns the TCP port number that
+// the server is configured for.

s/NetworkPort/networkPort/

[13]

+// This file defines methods on Server that deal with starting
+// and stopping the ZooKeeper service. They are independent of ZooKeeper
+// itself, and may be factored out at a later date.

Please merge this on server.go. As pointed out in the other review and obvious
through out the code, this is really specific to ZooKeeper. If/when we do
refactor this out to generalize, we'll have to reorganize anyway.

[14]

+var ErrNotRunning = os.NewError("process not running")

I'm not a big fan of the Err* convention. Makes for an ugly name with typing
information embedded on it. Some packages do use it, but thankfully it's
also not strictly followed in the Go standard library either.

I'd appreciate renaming this to NotRunning instead. zk.NotRunning seems
pretty good.

[15]

+// Process returns a Process referring to the running server from
+// where it's been stored in pid.txt. If the file does not
+// exist, or it cannot find the process, it returns the error
+// ErrNotRunning.

Where the pid is stored is an implementation detail that can
change. We don't have to document that for now.

[16]

+// getProcess gets a Process from a pid and check that the

s/check/checks/

[17]

+ if e, ok := err.(*os.SyscallError); ok && e.Errno == os.ECHILD || err == os.ECHILD {

Ugh. I hope we get that cleaned in Go itself with the suggested changes for 1.

[18]

+ // is to poll until it exits. If the process has taken longer than
+ // a second to exit, then it's probably not going to.

This is very optimistic. A process can easily take more than a second to exit. If
this is an edge case and there's a reason to wait at all (dir content, etc), then
let's wait a bit longer, like 10 seconds for instance.

[19]

+// cases to test:
+// child server, stopped normally; reattach, start
+// non-direct child server, killed abnormally; reattach, start (->error), remove pid.txt; start
+// non-direct child server, still running; reattach, start (->error), stop, start
+// child server, still running; reattach, start (-> error)
+// child server, still running; reattach, stop, start.
+// non-direct child server, still running; reattach, stop, start.

A bit tricky to follow, but looks like a nice set of tests, thanks.

review: Needs Fixing

« Back to merge proposal