I recently encountered a situation in which after running Percona Toolkit’s pt-table-checksum on a customer system, 95% of the table on the MySQL master was different on the MySQL slave. Although this table was not a critical part of the infrastructure, from time to time, writes to the table from the master would break replication. Additionally, this table has about 6 million rows, and running pt-table-sync would take sometime. Rebuilding the slave from backup of the master would not be an easy option as well since the slave acts as an archive where it has a lot more data than the master.

So how did we solve it? With pt-online-schema-change and a NOOP ALTER.

How is it possible? pt-online-schema-change works by creating a shadow copy of the original table and start copying the rows from the latter to the former. An additional set of TRIGGERs also ensures that any additional changes to existing rows after they have been copied to the shadow version will propagate.

There is little one caveat though, the  binlog_format  on the master would have to be ROW so the actual ROW images from the master would be copied to the slave. If your binlog_format is set to STATEMENT, you’d still end up with the same inconsistency. Since change statements will be logged as STATEMENT, the NOOP ALTER will operate on the slave table instead of copying from the master. You can configure the binlog_format for the alter process via the –set-vars option i.e.  --set-vars 'binlog_format=ROW'  to make it more easier!

4 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Alex

Could you please clarify why pt-online-schema-change would be faster and safer than pt-table-sync?

Karl

What binlog format do you guys generally recommend? Is MIXED stable enough to use?

Flavian

That’s a Good technique for VERY BIG DB’s