Merge lp://staging/~trond-norbye/libmemcached/bug_394442 into lp://staging/~tangent-org/libmemcached/trunk

Proposed by Trond Norbye
Status: Merged
Merged at revision: not available
Proposed branch: lp://staging/~trond-norbye/libmemcached/bug_394442
Merge into: lp://staging/~tangent-org/libmemcached/trunk
Diff against target: None lines
To merge this branch: bzr merge lp://staging/~trond-norbye/libmemcached/bug_394442
Reviewer Review Type Date Requested Status
Libmemcached-developers Pending
Review via email: mp+8098@code.staging.launchpad.net
To post a comment you must log in.
Revision history for this message
Trond Norbye (trond-norbye) wrote :

 Bug 394442: Replication mget test fail on linux due to race conditions

  All replicas are stored on the different servers by using the quiet commands,
  and when we start to receive them we will do that in another memcached_st
  instance (aka another connection to the server). The code goes directly
  from the SET command to trying to fetch the items from all of the servers.
  This means that the memcached server may still be processing the quiet set
  commands while another thread in the memcached server tries to fetch all of
  the items.

  To fix this we need to ensure that all of the set commands are executed on
  all of the memcached servers before starting to receive them. memcached_quit
  will send the QUIT command to all of the servers and wait for a response,
  so we now that when memcached_quit returns all commands are executed on
  the client.

  In addition the response counter should not be updated when we send out the
  replica storage / delete commands.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'libmemcached/memcached_delete.c'
--- libmemcached/memcached_delete.c 2009-06-21 15:51:17 +0000
+++ libmemcached/memcached_delete.c 2009-07-01 19:36:25 +0000
@@ -143,7 +143,9 @@
143 if ((memcached_do(server, (const char*)request.bytes, 143 if ((memcached_do(server, (const char*)request.bytes,
144 sizeof(request.bytes), 0) != MEMCACHED_SUCCESS) ||144 sizeof(request.bytes), 0) != MEMCACHED_SUCCESS) ||
145 (memcached_io_write(server, key, key_length, flush) == -1))145 (memcached_io_write(server, key, key_length, flush) == -1))
146 memcached_io_reset(server);146 memcached_io_reset(server);
147 else
148 memcached_server_response_decrement(server);
147 }149 }
148 }150 }
149151
150152
=== modified file 'libmemcached/memcached_storage.c'
--- libmemcached/memcached_storage.c 2009-06-21 15:51:17 +0000
+++ libmemcached/memcached_storage.c 2009-07-01 19:36:25 +0000
@@ -491,7 +491,9 @@
491 send_length, 0) != MEMCACHED_SUCCESS) ||491 send_length, 0) != MEMCACHED_SUCCESS) ||
492 (memcached_io_write(srv, key, key_length, 0) == -1) ||492 (memcached_io_write(srv, key, key_length, 0) == -1) ||
493 (memcached_io_write(srv, value, value_length, flush) == -1))493 (memcached_io_write(srv, value, value_length, flush) == -1))
494 memcached_io_reset(server);494 memcached_io_reset(srv);
495 else
496 memcached_server_response_decrement(srv);
495 }497 }
496 }498 }
497499
498500
=== modified file 'tests/function.c'
--- tests/function.c 2009-06-29 17:47:16 +0000
+++ tests/function.c 2009-07-01 19:36:25 +0000
@@ -3070,6 +3070,16 @@
3070 return rc;3070 return rc;
3071}3071}
30723072
3073static memcached_return pre_replication_noblock(memcached_st *memc)
3074{
3075 memcached_return rc= MEMCACHED_FAILURE;
3076 if (pre_replication(memc) == MEMCACHED_SUCCESS &&
3077 pre_nonblock(memc) == MEMCACHED_SUCCESS)
3078 rc= MEMCACHED_SUCCESS;
3079
3080 return rc;
3081}
3082
3073static void my_free(memcached_st *ptr __attribute__((unused)), void *mem)3083static void my_free(memcached_st *ptr __attribute__((unused)), void *mem)
3074{3084{
3075 free(mem);3085 free(mem);
@@ -3583,6 +3593,21 @@
3583 assert(rc == MEMCACHED_SUCCESS);3593 assert(rc == MEMCACHED_SUCCESS);
35843594
3585 /*3595 /*
3596 ** We are using the quiet commands to store the replicas, so we need
3597 ** to ensure that all of them are processed before we can continue.
3598 ** In the test we go directly from storing the object to trying to
3599 ** receive the object from all of the different servers, so we
3600 ** could end up in a race condition (the memcached server hasn't yet
3601 ** processed the quiet command from the replication set when it process
3602 ** the request from the other client (created by the clone)). As a
3603 ** workaround for that we call memcached_quit to send the quit command
3604 ** to the server and wait for the response ;-) If you use the test code
3605 ** as an example for your own code, please note that you shouldn't need
3606 ** to do this ;-)
3607 */
3608 memcached_quit(memc);
3609
3610 /*
3586 ** "bubba" should now be stored on all of our servers. We don't have an3611 ** "bubba" should now be stored on all of our servers. We don't have an
3587 ** easy to use API to address each individual server, so I'll just iterate3612 ** easy to use API to address each individual server, so I'll just iterate
3588 ** through a bunch of "master keys" and I should most likely hit all of the3613 ** through a bunch of "master keys" and I should most likely hit all of the
@@ -3653,6 +3678,21 @@
3653 }3678 }
36543679
3655 /*3680 /*
3681 ** We are using the quiet commands to store the replicas, so we need
3682 ** to ensure that all of them are processed before we can continue.
3683 ** In the test we go directly from storing the object to trying to
3684 ** receive the object from all of the different servers, so we
3685 ** could end up in a race condition (the memcached server hasn't yet
3686 ** processed the quiet command from the replication set when it process
3687 ** the request from the other client (created by the clone)). As a
3688 ** workaround for that we call memcached_quit to send the quit command
3689 ** to the server and wait for the response ;-) If you use the test code
3690 ** as an example for your own code, please note that you shouldn't need
3691 ** to do this ;-)
3692 */
3693 memcached_quit(memc);
3694
3695 /*
3656 * Don't do the following in your code. I am abusing the internal details3696 * Don't do the following in your code. I am abusing the internal details
3657 * within the library, and this is not a supported interface.3697 * within the library, and this is not a supported interface.
3658 * This is to verify correct behavior in the library3698 * This is to verify correct behavior in the library
@@ -4588,6 +4628,7 @@
4588 {"consistent_ketama_weighted", pre_behavior_ketama_weighted, 0, consistent_weighted_tests},4628 {"consistent_ketama_weighted", pre_behavior_ketama_weighted, 0, consistent_weighted_tests},
4589 {"test_hashes", 0, 0, hash_tests},4629 {"test_hashes", 0, 0, hash_tests},
4590 {"replication", pre_replication, 0, replication_tests},4630 {"replication", pre_replication, 0, replication_tests},
4631 {"replication_noblock", pre_replication_noblock, 0, replication_tests},
4591 {0, 0, 0, 0}4632 {0, 0, 0, 0}
4592};4633};
45934634

Subscribers

People subscribed via source and target branches

to all changes: