Merge lp://staging/~rogpeppe/gozk/update-server-interface into lp://staging/~juju/gozk/trunk

Proposed by Roger Peppe
Status: Merged
Merge reported by: Gustavo Niemeyer
Merged at revision: not available
Proposed branch: lp://staging/~rogpeppe/gozk/update-server-interface
Merge into: lp://staging/~juju/gozk/trunk
Diff against target: 3053 lines (+1235/-782)
9 files modified
Makefile (+6/-2)
example/example.go (+22/-23)
reattach_test.go (+198/-0)
retry_test.go (+65/-74)
runserver.go (+168/-0)
server.go (+229/-0)
suite_test.go (+42/-121)
zk.go (+252/-311)
zk_test.go (+253/-251)
To merge this branch: bzr merge lp://staging/~rogpeppe/gozk/update-server-interface
Reviewer Review Type Date Requested Status
Gustavo Niemeyer Needs Fixing
Review via email: mp+77009@code.staging.launchpad.net

Description of the change

Updated server API.

To post a comment you must log in.
24. By Roger Peppe

Fix zookeeper spelling

Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

Great stuff here Rog, thanks!

Some comments:

[1]

-// The ZooKepeer installation directory is specified by installedDir.
+// The ZooKeeper installation directory is specified by installDir.

installDir feels like a strange name in this context. I agree installedDir
isn't great either, but the new one gives a wrong impression about what
it means.. we're not installing anything there.

Let's use something like zkDir instead.

[2]

+func CreateServer(port int, runDir, installDir string) (*Server, os.Error) {
(...)
+func AttachServer(runDir string) (*Server, os.Error) {

The organization inside these is looking very nice.

[3]

+func (srv *Server) getServerProcess() (*os.Process, os.Error) {
(...)
+ return os.FindProcess(pid)

FindProcess does nothing on Unix. We can run at least a Signal 0 against
the pid to tell if it exists or not.

[4]

+ if p != nil {
+ p.Release()
+ }
+ return fmt.Errorf("ZooKeeper server may already be running (remove %q to clear)", srv.path("pid.txt"))

The recommendation here is a bit strange. If the process is actually running,
removing the pid file will allow a second server to be started in parallel,
which is not nice.

I suggest just saying "ZooKeeper server seems to be running already", which
is a bit more true given that we'll be checking the pid as per the point above.
Then, the developer can run Stop() on it if he feels like stopping it, or
do whatever else out of band if he doesn't want to use the interface, but
then it's his responsibility to check for background processes, etc.

[5]

+// Stop kills the ZooKeeper server. It is a no-op if it is already running.

The second part is reversed, and reads a bit hard.

Suggestion: It does nothing if not running.

Also, please mention that the data is preserved, and that the server can
be restarted. "kill" may feel like otherwise.

[6]

+// Destroy stops the ZooKeeper server, and then removes its run
+// directory and all its contents.

s/and all its contents././

Removing the directory means removing its contents necessarily.

[7]

+func (srv *Server) writeInstallDir() os.Error {
+ return ioutil.WriteFile(srv.path("installdir.txt"), []byte(srv.installDir+"\n"), 0666)

These functions and filename should be renamed in sync to the first point.

[8]

+ if data[len(data)-1] == '\n' {
+ data = data[0 : len(data)-1]
+ }
+ srv.installDir = string(data)

srv.zkDir = string(bytes.TrimSpace(data))

[9]

+ s.zkTestRoot = c.MkDir() + "/zk"
+ port := 21812
+ s.zkAddr = fmt.Sprint("localhost:", port)
+
+ s.zkServer, err = zk.CreateServer(port, s.zkTestRoot, "")
  if err != nil {
   c.Fatal("Cannot set up server environment: ", err)
  }
- s.StartZK(c)
+ err = s.zkServer.Start()
+ if err != nil {
+ c.Fatal("Cannot start ZooKeeper server: ", err)
+ }

This is awesome! Thanks!

review: Approve
Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

[10]

+// exist, it assumes the server is not running and returns (nil, nil).

This interface is strange, and can easily lead to bugs since it's not an idiom in Go.
I'd never assume p is nil if err == nil. Please return an error if p == nil.

Revision history for this message
Roger Peppe (rogpeppe) wrote :

On 5 October 2011 15:34, Gustavo Niemeyer <email address hidden> wrote:
> Review: Approve
>
> Great stuff here Rog, thanks!
>
> Some comments:
>
>
> [1]
>
> -// The ZooKepeer installation directory is specified by installedDir.
> +// The ZooKeeper installation directory is specified by installDir.
>
> installDir feels like a strange name in this context. I agree installedDir
> isn't great either, but the new one gives a wrong impression about what
> it means.. we're not installing anything there.
>
> Let's use something like zkDir instead.

done.

> +func (srv *Server) getServerProcess() (*os.Process, os.Error) {
> (...)
> +       return os.FindProcess(pid)
>
> FindProcess does nothing on Unix.  We can run at least a Signal 0 against
> the pid to tell if it exists or not.

ok, as discussed i'll just not bother about the race.

> [5]
>
> +// Stop kills the ZooKeeper server. It is a no-op if it is already running.
>
> The second part is reversed, and reads a bit hard.
>
> Suggestion: It does nothing if not running.

better. my text was just wrong. done.
>
> Also, please mention that the data is preserved, and that the server can
> be restarted. "kill" may feel like otherwise.

done.

>
>
> [6]
>
> +// Destroy stops the ZooKeeper server, and then removes its run
> +// directory and all its contents.
>
> s/and all its contents././
>
> Removing the directory means removing its contents necessarily.

done.

> [7]
>
> +func (srv *Server) writeInstallDir() os.Error {
> +       return ioutil.WriteFile(srv.path("installdir.txt"), []byte(srv.installDir+"\n"), 0666)
>
> These functions and filename should be renamed in sync to the first point.

i'm not sure what you mean here.

> +       if data[len(data)-1] == '\n' {
> +               data = data[0 : len(data)-1]
> +       }
> +       srv.installDir = string(data)
>
> srv.zkDir = string(bytes.TrimSpace(data))

really? this code will have written the data inside the file, so there
should be no
extra space added. it's legal (ok it's pretty unusual) for a file name to
end in a space, and calling TrimSpace would break that.

25. By Roger Peppe

Added reattach_test.go to test server reattaching.
Service-related methods now get their own file.

Revision history for this message
Roger Peppe (rogpeppe) wrote :

On 5 October 2011 15:49, Gustavo Niemeyer <email address hidden> wrote:
> [10]
>
>
> +// exist, it assumes the server is not running and returns (nil, nil).
>
> This interface is strange, and can easily lead to bugs since it's not an idiom in Go.
> I'd never assume p is nil if err == nil. Please return an error if p == nil.

i've changed this, but for reference, i've just noticed a similar example
in your code, in Exists:

 // We diverge a bit from the usual here: a ZNONODE is not an error
 // for an exists call, otherwise every Exists call would have to check
 // for err != nil and err.Code() != ZNONODE.

it's for a very similar reason that i returned a nil process and nil error from
Process when there was no process to be found.

Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :
Download full text (3.3 KiB)

Good changes overall, thanks. Some follow ups:

[7]

You've renamed installDir to zkDir. Let's rename functions and filenames
to match that as well (zkdir.txt, writeZkDir, etc)

[8]

Ok, but this logic is wrong because it doesn't check the length and could
panic.

Alright, so instead of doing either of those things, please just don't append
a '\n' at the end, so we can just string(data) straight. This is already
being done for pid.txt, and sounds like a sane approach.

[10]

Indeed, but that feels like a slightly different case because the method
is already a question (does it exist?), so the fact it doesn't exist
is an answer to that question rather than an error.

If the method was named something like Stat, it should definitely be an
error rather than simply an empty stat.

[11]

+// ServerCommand returns the command used to start the
+// ZooKeeper server. It is provided for debugging and testing
+// purposes only.
+func (srv *Server) command() ([]string, os.Error) {

s/ServerCommand/command/
s/It is provided.*//

[12]

+// NetworkPort returns the TCP port number that
+// the server is configured for.

s/NetworkPort/networkPort/

[13]

+// This file defines methods on Server that deal with starting
+// and stopping the ZooKeeper service. They are independent of ZooKeeper
+// itself, and may be factored out at a later date.

Please merge this on server.go. As pointed out in the other review and obvious
through out the code, this is really specific to ZooKeeper. If/when we do
refactor this out to generalize, we'll have to reorganize anyway.

[14]

+var ErrNotRunning = os.NewError("process not running")

I'm not a big fan of the Err* convention. Makes for an ugly name with typing
information embedded on it. Some packages do use it, but thankfully it's
also not strictly followed in the Go standard library either.

I'd appreciate renaming this to NotRunning instead. zk.NotRunning seems
pretty good.

[15]

+// Process returns a Process referring to the running server from
+// where it's been stored in pid.txt. If the file does not
+// exist, or it cannot find the process, it returns the error
+// ErrNotRunning.

Where the pid is stored is an implementation detail that can
change. We don't have to document that for now.

[16]

+// getProcess gets a Process from a pid and check that the

s/check/checks/

[17]

+ if e, ok := err.(*os.SyscallError); ok && e.Errno == os.ECHILD || err == os.ECHILD {

Ugh. I hope we get that cleaned in Go itself with the suggested changes for 1.

[18]

+ // is to poll until it exits. If the process has taken longer than
+ // a second to exit, then it's probably not going to.

This is very optimistic. A process can easily take more than a second to exit. If
this is an edge case and there's a reason to wait at all (dir content, etc), then
let's wait a bit longer, like 10 seconds for instance.

[19]

+// cases to test:
+// child server, stopped normally; reattach, start
+// non-direct child server, killed abnormally; reattach, start (->error), remove pid.txt; start
+// non-direct child server, still running; reattach, start (->error), stop, start
+// child server, still running; reattach,...

Read more...

review: Needs Fixing
Revision history for this message
Roger Peppe (rogpeppe) wrote :
Download full text (3.3 KiB)

PTAL

On 7 October 2011 16:58, Gustavo Niemeyer <email address hidden> wrote:
> [7]
>
> You've renamed installDir to zkDir. Let's rename functions and filenames
> to match that as well (zkdir.txt, writeZkDir, etc)

done

>
> [8]
>
> Ok, but this logic is wrong because it doesn't check the length and could
> panic.
>
> Alright, so instead of doing either of those things, please just don't append
> a '\n' at the end, so we can just string(data) straight. This is already
> being done for pid.txt, and sounds like a sane approach.

done

> [11]
>
> +// ServerCommand returns the command used to start the
> +// ZooKeeper server. It is provided for debugging and testing
> +// purposes only.
> +func (srv *Server) command() ([]string, os.Error) {
>
> s/ServerCommand/command/
> s/It is provided.*//

given that it's not external any more, that "debugging and testing"
remark isn't necessary now.

> [12]
>
> +// NetworkPort returns the TCP port number that
> +// the server is configured for.
>
> s/NetworkPort/networkPort/
>
> [13]
>
> +// This file defines methods on Server that deal with starting
> +// and stopping the ZooKeeper service. They are independent of ZooKeeper
> +// itself, and may be factored out at a later date.
>
> Please merge this on server.go. As pointed out in the other review and obvious
> through out the code, this is really specific to ZooKeeper. If/when we do
> refactor this out to generalize, we'll have to reorganize anyway.

i can do this, but given that the dependencies of the two files
are mostly different (only fmt, ioutil and os in common), and there's
very little interlink between the two, i think it makes sense to keep
them as two different files - i think it makes the code easier to read.
perhaps "runserver.go" might be a better name than "service.go" though.
i've done that for the time being. that said, it's no big deal.

> [14]
>
> +var ErrNotRunning = os.NewError("process not running")
>
> I'm not a big fan of the Err* convention. Makes for an ugly name with typing
> information embedded on it. Some packages do use it, but thankfully it's
> also not strictly followed in the Go standard library either.
>
> I'd appreciate renaming this to NotRunning instead. zk.NotRunning seems
> pretty good.

ok.

>
> [15]
>
> +// Process returns a Process referring to the running server from
> +// where it's been stored in pid.txt. If the file does not
> +// exist, or it cannot find the process, it returns the error
> +// ErrNotRunning.
>
> Where the pid is stored is an implementation detail that can
> change. We don't have to document that for now.
>
> [16]
>
> +// getProcess gets a Process from a pid and check that the
>
> s/check/checks/

done

> [18]
>
> +               // is to poll until it exits. If the process has taken longer than
> +               // a second to exit, then it's probably not going to.
>
> This is very optimistic. A process can easily take more than a second to exit. If
> this is an edge case and there's a reason to wait at all (dir content, etc), then
> let's wait a bit longer, like 10 seconds for instance.

ok. given that we're killing the server with SIGKILL, it's not going to
have any chance to take ages to c...

Read more...

26. By Roger Peppe

Fixes as per review.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to all changes: