Percona Server moved to https://jira.percona.com/projects/PS

Merge lp://staging/~laurynas-biveinis/percona-server/bug803865-5.5 into lp://staging/percona-server/5.5

bug803865-5.5
Merge into 5.5

Proposed by Laurynas Biveinis on 2011-07-22

Status:

Merged

Approved by:

Stewart Smith on 2011-08-13

Approved revision:

no longer in the source branch.

Merged at revision:

149

Proposed branch:

lp://staging/~laurynas-biveinis/percona-server/bug803865-5.5

Merge into:

lp://staging/percona-server/5.5

Diff against target:

49 lines (+42/-0)

1 file modified

response_time_distribution.patch (+42/-0)

To merge this branch:

bzr merge lp://staging/~laurynas-biveinis/percona-server/bug803865-5.5

Medium

Fix Released

Link a bug report

Reviewer	Review Type	Date Requested	Status
Stewart Smith (community)		2011-07-22	Approve on 2011-08-13
Review via email: mp+68771@code.staging.launchpad.net

Description of the change

Fixes LP bug #803865: atomic_cas_64 crash. For 5.5, no actual test failures have been observed yet, the fix is preeemptive.

Here is the relevant code from include/atomic/x86_gcc.h:

  asm volatile ("push %%ebx;"
                "movl (%%ecx), %%ebx;"
                "movl 4(%%ecx), %%ecx;"
                LOCK_prefix "; cmpxchg8b %0;"
                "setz %2; pop %%ebx"
                : "=m" (*a), "+A" (*cmp), "=c" (ret)
                : "c" (&set), "m" (*a)
                : "memory", "esp")

There are following issues that make it easy for GCC to break this
code by optimization:
- The zero-th operand is marked as output-only, but it is an
input-output one: CMPXCHG8B reads the memory pointed to it, and also
sets EDX:EAX if the operand and EDX:EAX are non-equal. It is also an
output operand: the instruction sets it to ECX:EBX.
- The code probably tried to say that zero-th operand is also an input
one by specifying it as the 4th operand ("m" (*a)), but this fails,
because GCC does not guarantee putting the input and output operands
referenced by the same C expression at the same location. Also this
operand is not referenced in the assembly at all.
- The code clobbers EFLAGS but does not specify it.

While we are at it, there are following issues that make the resulting
code suboptimal:
- "volatile" is not needed as all the output operands are specified
fully (i.e. no undescribed side-effects in the assembly);
- No need to force using ECX for input &set and output ret,
- Clobber "memory" is redundant as the code does not access memory in
a way undescribed by constraints;
- Clobber "esp", I fail to understand its point at all.

The fix is a full rewrite of the fragment:

  asm ("movl %%edi, -4(%%esp);"
       "leal %0, %%edi;"
       "xchgl %%ebx, %%esi;"
       LOCK_prefix "; cmpxchg8b (%%edi);"
       "movl %%esi, %%ebx;"
       "movl -4(%%esp), %%edi;"
       "setz %1;"
       : "+m" (*a), "=q" (ret), "+A" (*cmp)
       : "S" ((int32)(set & 0xFFFFFFFF)), "c" ((int32)(set >> 32))
       : "flags")

All operands are loaded and EBX is set as the very last op before CMPXCHG8B, because some of the operands might be addressed through EBX even if we specify that the lower half of set should be assigned to it. ESI is fixed for the lower half of set to keep GCC away from reusing any other already taken register if it can prove an equal live value there. The EDI register is saved manually, as simply specifying it as a clobbered register sometimes chokes GCC with "impossible reloads", probably due to high register pressure. Finally, EDI is saved through direct stack manipulation instead of PUSH/POP, as other operands might be addressed through ESP.

The test results are at
http://jenkins.percona.com/job/percona-server-5.5-param/52/