Monthly Archives: May 2017

InnoDB Locks Analysis: Why is Blocking Query NULL and How To Find More Information About the Blocking Transaction?

This post was originally published on the MySQL Support Team Blog at https://blogs.oracle.com/mysqlsupport/entry/innodb_locks_analysis_why_is on 14 April 2017.

Consider the scenario that you execute a query. You expect it to be fast – typically subsecond – but now it take not return until after 50 seconds (innodb_lock_wait_timeout seconds) and then it returns with an error:

You continue to investigate the issue using the sys.innodb_lock_waits view or the underlying Information Schema tables (INNODB_TRX, INNODB_LOCKS and INNODB_LOCK_WAITS).

Note: The above Information Schema tables with lock and lock waits information have been moved to the Performance Schema in 8.0 as the data_locks and data_lock_waits tables. The sys schema view however works the same.

However, when you query the locks information, the blocking query is returned as NULL. What does that mean and how to proceed from that to get information about the blocking transaction?

Setting Up an Example

Before proceeding, lets set up an example which will be investigated later in the blog. The example can be set up as (do not disconnect Connection 1 when the queries have been executed):

  1. Connection 1:
  2. Connection 2 (blocks for innodb_lock_wait_timeout seconds):
  3. The following output while Connection 2 is still blocking from sys.innodb_lock_waits shows that the blocking query is NULL (slightly reformatted):

    The warnings will only occur in the 5.7.14 and later as the InnoDB lock tables being moved to the Performance Schema in MySQL 8.0. It is recommended to use the sys.innodb_lock_waits view as that is updated accordingly in MySQL 8.0.

Investigating Idle Transactions

To investigate idle transactions, you need to use the Performance Schema to get this information. First determine the Performance Schema thread id for the blocking transaction. For this you need the blocking_pid, in the above example:

and use this with the The threads Table table like:

For the following queries insert the thread id found above for the THREAD_ID = … where clauses.

To get the latest query executed, use the The events_statements_current Table table or the The session and x$session Views view:

or:

In this case this does not explain why the lock is held as the last query update a different row then where the lock issue occurs. However if the events_statements_history consumer is enabled (it is by default in MySQL 5.7 and later), the The events_statements_history Table table will include the last 10 statements (by default) executed for the connection:

So now the history of the blocking transaction can be seen and it is possible to determine why the locking issue occur.

Note: The history also includes some queries executed before the transaction started. These are not related to the locking issue.

If transaction monitoring is also enabled (only available in MySQL 5.7 and later), it is possible to get more information about the transaction and automatically limit the query of the history to the current transaction. Transaction monitoring is not enabled by default. To enable it, use:

Note: This must be done before either of the transactions is started. Only transaction started after the transaction monitoring is enabled will be instrumented.

If the above was enabled before the blocking transaction started, you can get more details about the blocking transaction as:

And to get the statement history of the transaction:

 

Working Around MySQL Cluster Push Down Limitations Using Subqueries

This post was originally published on the MySQL Support Team Blog at https://blogs.oracle.com/mysqlsupport/entry/working_around_mysql_cluster_push on 5 August 2016.

I worked on an issue recently where a query was too slow when executed in MySQL Cluster. The issue was that Cluster has some restrictions when it comes to push down conditions.

As an example of this, consider the following query using the employees sample database. The query takes a look at the average salary based on how many years the employee has been with the company. As the latest hire date in the database is in January 2000, the query uses 1 February 2000 as the reference date.

Initially the query performs like (performance is with two data nodes and all nodes in the same virtual machine on a laptop, so the timings are not necessarily representative of a production system, though the improvements should be repeatable):

The straight join is needed as the performance is better than leaving the join order up to the optimizer.

The schema for the two tables in use is:

Why this poor performance? Looking at the EXPLAIN plan is the first step:

The EXPLAIN plan itself does not look bad – the index usage is as expected. However note the 3 warnings – one is the usual rewritten query after the application of rewriting and optimizer rules, but the two other gives more information why the performance is not what would be expected:

Here it can be seen that the tables are not pushable, because they are involved in the GROUP BY.

This gave me an idea. A couple of years ago, I wrote a post on the MySQL Server Blog about MySQL Server Blog: Better Performance for JOINs Not Using Indexes. These materialized temporary tables can also include auto keys that can improve the join performance significantly. What if a similar tactics is used with the above query?

Changing the join on the employees table to join a subquery:

That is more than a factor 3 improvement.

The new EXPLAIN plan is:

Note how <auto_key0> is used for the join on the derived table. This is what helps make the rewrite work.

Actually we can do a bit better. The above subquery selects all columns from the employees table even though only the emp_no and hire_date columns are used. So the next rewrite is:

The improvement of only including the columns needed will vary. For example if the internal temporary table ends up being converted into an on-disk table because of the extra data, or a covering index no longer can be used can increase the importance of only choosing the columns needed.

A final slight improvement can be gained by also converting the salaries table into a subquery:

The total speed up is from a little under 24 seconds to a little over 5 second or more than a factor 4.

Note: the above rewrite will not work out of the box in MySQL Cluster 7.5. The reason is that it is based on MySQL Server 5.7 where the optimizer can perform more advanced rewrites than it was possible on MySQL 5.6. So by default the above subqueries will be rewritten to normal joins and you end up with the original query again. To use the tactics of subqueries in Cluster 7.5, it is necessary first to disable the optimizer switch allowing the optimizer to rewrite the subqueries to normal joins: