MySQL Add-on Locks Up MySQL or Backup Hangs During MySQL
Symptom
When taking a backup, the MySQL add-on other MySQL queries or applications fail or hang or MySQL becomes unresponsive. The backup process is busy and seems to hang forever or for several minutes or a long period of time.
Cause
This can be caused by one of three issues.
- Some MySQL versions have a deadlock bug related to MyISAM triggered by "FLUSH TABLES WITH READ LOCK"
http://bugs.mysql.com/bug.php?id=20139
Noted in 5.1.12 changelog.
FLUSH TABLES followed by a LOCK TABLES statement to lock a log table and a non-log table caused an infinite loop and high CPU use. Now FLUSH TABLES ignores log tables. To flush the log tables, use FLUSH LOGS instead.
http://bugs.mysql.com/bug.php?id=26380
Documented bugfix in the 5.1.20 as follows:
A FLUSH TABLES WITH READ LOCK statement followed by a FLUSH LOGS statement caused a deadlock if the general log or the slow query log was enabled.
- If you have many thousands of tables for example you have 10,000 or 20,000 or more tables in your MySQL database(s). If you don't have at least 10,000 tables this is almost certainly not the issue. If you do during the backup process the Linux Agent executes a "SHOW TABLE STATUS". See instructions for below in resolutions for debugging if this is your issue.
- Another application using MySQL (not the R1Soft CDP Agent) is locking MyISAM tables and not unlocking them or holding the lock a long period of time.
When the Agent calls "FLUSH TABLES WITH READ LOCK" MySQL will acquire a Read Lock on ALL Tables. MySQL has a known bug with table locks and MyISAM. MySQL has no lock timeout and with MyISAM it will wait FOREVER to acquire a read lock if another process has a lock on a table.
Usually this is caused by a malformed slow query holding a table lock for a long period of time. You need to not run that bad query if this is the issue or schedule CDP backups when this query is not running. For example some people have reporting or maintenance tasks that run only periodically and require a table lock to run. Schedule CDP backups outside of this table locking query.
Follow the advanced debugging steps below to see of a FLUSH TABLES WITH READ LOCK query triggers your MySQL hang issue. If so enable query logging and reproduce the hang. Then go back in the query log and search for a table LOCK query executed before the FLUSH TABLES WITH READ LOCK and identify what application is issuing the table lock. Alternatively "SHOW PROCESSLIST" can be used to identify the bad application using the table lock.
See the following resources for help:
http://dev.mysql.com/doc/refman/5.1/en/query-log.html
http://dev.mysql.com/doc/refman/5.1/en/show-processlist.html
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html
In order to perform a hot online backup all tables must be locked for less than one second. It is a limitation and known bug with MySQL and MyISAM that table locks will hang forever if they are waiting on another lock.
Resolution
If you are using an effected MySQL version (MySQL versions earlier than 5.1.20) you should upgrade MySQL to rule out the dead lock possibility.
Advanced MySQL Debugging
The Linux Agent executes uses this process and queries during the backup process.
1) Execute the following THREE TIMES IN A ROW TO in order to flush large memory caches through to disk to make snapshot and lock times very small
a. "FLUSH TABLES"
b. sync the disk (uses proprietary Linux IOCTL call in CDP device driver to sync only the disk where MySQL databases are located)
2) "FLUSH TABLES WITH READ LOCK"
3) "FLUSH LOGS"
4) Create a new volume snapshot using the CDP device driver
5) "SHOW TABLES STATUS FROM..." for each database
6) "UNLOCK TABLES"
You should try executing the following queries to see if they reproduce your hang or deadlock issue.
Execute these queries in order:
1) "FLUSH TABLES" - OK if this takes several minutes this does not interrupt other queries
2) "FLUSH TABLES" - OK if this takes several minutes this does not interrupt other queries
3) "FLUSH TABLES" - OK if this takes several minutes this does not interrupt other queries
4) "FLUSH TABLES WITH READ LOCK" - Does this now trigger your deadlock or hang issue? Check for bad table lock queries or MySQL deadlock bug.
5) "FLUSH LOGS" - Does this cause any issues?
6) "UNLOCK TABLES"
Checking the MySQL logs, if logging is enable, or running "SHOW PROCESSLIST" will reveal if any the causing conditions are applicable. If so, restarting MySQL or "killing" the process ID in MySQL (KILL #) should end the deadlock. Once done, attempt to re-run the backup task.
Execute this query by itself - does it take too long to return? Will be an issue if you have many thousands of databases. Substitute YOUR_DATABASE with the database in your MySQL instance the most tables.
1) "SHOW TABLES STATUS FROM SUBSTITUTE_YOUR_DATABASE_HERE"