Merge lp://staging/~percona-toolkit-dev/percona-toolkit/ptc-errors-on-slave-with-different-time-zone-1388870 into lp://staging/~percona-toolkit-dev/percona-toolkit/release-2.2.13

Proposed by Frank Cizmich
Status: Merged
Approved by: Frank Cizmich
Approved revision: 613
Merged at revision: 621
Proposed branch: lp://staging/~percona-toolkit-dev/percona-toolkit/ptc-errors-on-slave-with-different-time-zone-1388870
Merge into: lp://staging/~percona-toolkit-dev/percona-toolkit/release-2.2.13
Diff against target: 155 lines (+38/-13)
7 files modified
bin/pt-table-checksum (+3/-3)
lib/RowChecksum.pm (+3/-3)
t/lib/RowChecksum.t (+4/-4)
t/pt-table-checksum/bugs.t (+25/-0)
t/pt-table-checksum/samples/chunkidx004.txt (+1/-1)
t/pt-table-checksum/samples/chunkidx005.txt (+1/-1)
t/pt-table-checksum/samples/n-chunk-index-cols.txt (+1/-1)
To merge this branch: bzr merge lp://staging/~percona-toolkit-dev/percona-toolkit/ptc-errors-on-slave-with-different-time-zone-1388870
Reviewer Review Type Date Requested Status
Daniel Nichter Needs Fixing
Review via email: mp+245683@code.staging.launchpad.net

Description of the change

Problem:
  When a slave is in a server with a different system timezone, pt-table-checksum detects differences on timestamp type columns.

Solution:
 Normalize the output of those columns using UNIX_TIMESTAMP

Notes:
 - No difference in performance was detected. Docs say it is an extremely fast function.
 - Tested manually. Creating a test case for this is near impossible since system tz is read-only on the database, and is lifted from the machine at startup.

To post a comment you must log in.
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

I think this is ok. The +0 was just a clever way to coerce a timestamp to a big int. UNIX_TIMESTAMP() does the same but avoids the tz problem. So the only thing this MP lacks is a test case. This should be possible because you can SET GLOBAL time_zone='+7:00' on a slave and then reproduce the problem. We should not get in habit of fixes without solid test cases.

review: Needs Fixing
Revision history for this message
Frank Cizmich (frank-cizmich) wrote :

> I think this is ok. The +0 was just a clever way to coerce a timestamp to a
> big int. UNIX_TIMESTAMP() does the same but avoids the tz problem. So the only
> thing this MP lacks is a test case. This should be possible because you can
> SET GLOBAL time_zone='+7:00' on a slave and then reproduce the problem. We
> should not get in habit of fixes without solid test cases.

Believe me I tried! But this is some sort of subtle issue.
A change in global time_zone alone does not reproduce the problem.
It's only when you change the system time_zone that the issue arises, (and it is solved by using UNXIX_TIMESTAMP)
To change the system_time_zone , the only way is to shut down the slave, change the machine time_zone, and boot slave again.
That's the only way I could reproduce the bug.

I think it's something to do with the way INSERT...SELECT handles timezones. When reading data from slaves it prefers system_time_zone over (global) time_zone.

Again, timestamps are stored correctly, and UNIX_TIMESTAMP fixes the issue.

Just that an automated test case seems near impossible.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Can do TZ="UTC" <start slave> to set system_time_zone on startup, so this can be tested.

Revision history for this message
Frank Cizmich (frank-cizmich) wrote :

Just to add that, also according to my tests, indeed it's a replication thing.

When doing insert on master, slave uses system_tz to do convert to UTC locally when inserting into a timestamp column.
When doing insert..select on master, slave also uses its own system_tz to convert to UTC locally when reading.

613. By Frank Cizmich

pt-tcs added testcase for timezone bug

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to all changes: