What Does I/O Latencies and Bytes Mean in the Performance and sys Schemas?

The Performance Schema and sys schema are great for investigating what is going on in MySQL including investigating performance issues. In my work in MySQL Support, I have a several times heard questions whether a peak in the InnoDB Data File I/O – Latency graph in MySQL Enterprise Monitor (MEM) or some values from the corresponding tables and view in the Performance Schema and sys schema are cause for concern. This blog will discuss what these observations means and how to use them.
The MEM InnoDB File I/O Graphs showing a peak in latency and bytes at 12:51.

The Tables and Views Involved

This blog will look into three sources in the Performance Schema for I/O latencies, so let’s first take a look at those tables. The three Performance Schema tables are:

  • events_waits_summary_global_by_event_name: with the event name set to wait/io/table/sql/handler or wait/io/file/%. This is the table used for the waits_global_by_latency and wait_classes_global_by_% views in the sys schema.
  • table_io_waits_summary_by_table: this is the table used for the schema_table_statistics% views in the sys schema.
  • file_summary_by_instance: this is the table used for the io_global_by_file_by% views in the sys schema.

These are also the sources used in MySQL Enterprise Monitor for the InnoDB Data File I/O graphs shown above. Let’s take a look at an example of each of the tables.

events_waits_summary_global_by_event_name

The events_waits_summary_global_by_event_name has aggregate data for wait events grouped by the event name. For the purpose of this discussion it is the table and file wait/io events that are of interested. An example of the data returned is:

These three queries show the same data, just obtained and displayed in different ways.

In the result there are two groups of events. The wait/io/table events (the wait/io/table/sql/handler is the only event of this group which is why it can be listed explicitly) and the wait/io/file group.

The table events are for accessing data in tables. It does not matter whether the data is cached in the buffer pool or is accessed on disk. In this table and view, there is no distinguishing between different types of access (read, write, etc.).

The file events are, as the name suggest, for actually accessing files. There is one file event per file type. For example, in he output there are the wait/io/file/innodb/innodb_log_file event for accessing the InnoDB redo log files, the wait/io/file/innodb/innodb_data_file event for accessing the InnoDB data files themselves, the wait/io/file/sql/binlog event for the binary log files, etc.

In the second query, all of the timings are wrapped in the sys.format_time() function. The timings returned by the Performance Schema are in picoseconds (10^-12 second) which are somewhat hard for us humans to read. The sys.format_time() function converts the picoseconds to human readable strings. When you sort, however, make sure you sort by the original non-converted values.

Tip: Use the sys.format_time() function to convert the picoseconds to a human readable value, but do only so for the returned row; sort and filter by the original values. The Performance Schema always returns timings in picoseconds irrespective of the timer used internally for the event.

The sys schema by default returns the timing as human readable values. If you need the values in picoseconds prefix the table name with x$, for example sys.x$waits_global_by_latency. The sys schema includes an ORDER BY clause in most views. For the waits_global_by_latency view, the default ordering is by the total latency, so there is no need to add an ORDER BY clause in this example.

table_io_waits_summary_by_table

The table_io_waits_summary_by_table Performance Schema table and schema_table_statistics% sys schema views are related to the wait/io/table/sql/handler event just discussed. These provide information about the amount of time spent per table. Unlike querying the wait/io/table/sql/handler in the wait_events_% tables, it split the time spent into whether it is used for reads, writes, fetch, insert, update, or delete. The read and write columns are the aggregates for the corresponding read and write operations, respectively. Since fetch is the only read operation, the read and fetch columns will have the same values.

The table and view show the table I/O, i.e. the access to table data irrespective of whether the data is accessed in-memory or on disk. This is similar to the wait/io/table/sql/handler event. An example of the result of querying the table and view for the employees.salaries table is:

In this case it shows that there has been mostly writes – inserts – for the table. The sys schema view effectively joins on the performance_schema.file_summary_by_instance for the read columns, so for the schema_table_statistics view fetch and read are not synonyms.

So, what is it the file_summary_by_instance table shows that is different table the “table I/O” that has been the topic of the first two tables? Let’s see.

file_summary_by_instance

Unlike the two previous tables, the file_summary_by_instance shows how much time is spent on actual file I/O and how much data is accessed. This makes the file_summary_by_instance table and the corresponding sys schema views very useful for determining where time is spent doing disk I/O and which files have the most data accesses on disk.

An example of using the Performance Schema and two of the sys schema views is:

This example is from Microsoft Windows, and as always when backslashes are in play, it is fun to try to determine the appropriate number of backslashes to use. When specifying the file name with LIKE, you need four backslashes per backslash in the file name; when using = you need two backslashes.

Again, the values are split into reads and writes (though not as detailed as before with fetch, insert, update, and delete – that is not known at the level where the file I/O happens). The miscellaneous values include everything that is not considered reads or write; this includes opening and closing the file.

The sys schema queries not only have formatted the timings, but also the path and the bytes. This has been done using the sys.format_path() and sys.format_bytes() functions, respectively.

From the output, it can be seen that despite no rows were ever fetched (read) from the employees.salaries table (that was found in the previous output), there has still been some read file I/O. This was what the sys.schema_table_statistics view reflected.

So, what does all of this mean? The graph in MySQL Enterprise Monitor showed that there was six seconds file I/O latency for the InnoDB data files. Is that bad? As often with these kinds of questions, the answer is: “it depends”.

What to Make of the Values

In this case we have a case where the graphs in MySQL Enterprise Monitor show a peak for the latency and bytes used doing I/O on the InnoDB data files. This is actually disk I/O. But what exactly does that means and should the alarm sound?

Let’s first refresh our memory on how the graphs looked:
The MEM InnoDB File I/O Graphs showing a peak in latency and bytes at 12:31.If you are not using MySQL Enterprise Monitor, you may have similar graphs from your monitoring solution, or you have obtained latency and bytes values from the tables and views discussed in this blog.

The latency graph shows that we have done six seconds if I/O. What does that mean? It is the aggregate I/O during the period the data was collected. In this case, the data is plotted for every minute, so in the one minute around 12:51, of the 60 seconds a total of six seconds was spent doing I/O. Now, the six seconds suddenly do no sound so bad. Similar for the bytes, around 4.6MiB of data was read or written.

In general, the values obtained either from the monitoring graphs or from the underlying tables cannot be used to conclude whether the is a problem or not. They just show how much I/O was done at different times.

Similar for the values from the Performance Schema. On their own they do not tell much. You can almost say they are too neutral – they just state how much work was done, not whether it was too much or too little.

A more useful way to use this data is in case a problem is reported. This can be that system administrator reports the disks are 100% utilized or that end users report the system is slow. Then, you can go and look at what happened. If the disk I/O was unusually high at that point in time, then that is likely related, and you can continue your investigation from there.

There are more reports in MySQL Enterprise Monitor both as time series graphs and as point-in-time snapshots. The point-in-time snapshots are often using the sys schema views but allows sorting. An example is the Database File I/O reports:
MEM's Database File I/O ReportMySQL Workbench also provides performance reports based on the sys scheme. The equivalent to the previous report is the Top File I/O Activity Report:
MySQL Workbench's Top File I/O Activity ReportThe MySQL Workbench performance reports also allows you to export the report or copy the query used to generate the report, so you can execute it manually.

With respect to the wait/io/table/sql/handler events, then remember that I/O here does not mean disk I/O, but table I/O. So, all it means that time is accumulating for these events – including when looking at the per table is that the data in the table is used. There are also per index values in table_io_waits_summary_by_index_usage Performance Schema table and the schema_index_statistics sys view which have the same meaning. Whether a given usage is too high depends on many things. However, it can again be useful to investigate what is causing problems.

For the example data from this blog, it was triggered by loading the employees sample database. So, there was no problem. If you want to put data into your database, it must necessarily be written to the data file, otherwise it will not be persisted. However, if you think the data import took too long, you can use the data as part of your investigation on how to make the import faster.

The conclusion is that you should not panic if you see the I/O latencies peak. On their own they just mean that more work was done during that period that usual. Without context a peak is no worse than a dip (which could indicate something is preventing work from being done or that there is a natural slow period). Instead use the I/O latencies and bytes together with other observations to understand where MySQL is spending time and for which files data is being read and written.

References

I will recommend you look at the following references, if you want to understand more about the Performance Schema tables and sys schema views discussed in this blog:

The GitHub repository for the sys schema includes all the definitions of the views (and other objects). Since these are written in plain SQL, they are very useful to see in more depth where the data is coming from. The GitHub website allows you to browse through each of the files. Each sys schema object is defined in its own file.

Overview of the MySQL Server Architecture

Sometimes it can be useful to take a step back and look at the world from a bit larger distance than usual. So in this blog, I will take a look at the high level architecture of MySQL Server.

Info
This is meant as a simplified overview and does not include all details.
Overview of the MySQL Server Architecture
Overview of the MySQL Server Architecture

For the discussion I will be referring to the the following figure that shows some of the features and plugins of MySQL. Orange boxes are available both for the community version and the commercial (enterprise) version, whereas red means the plugin is exclusive for the commercial version. Ellipsoid elements are plugins, whereas square (yes with round corners) elements indicate other features. The figure is based on MySQL 5.7.20 where the audit log can be used to block access to tables.
At the top there are the various connectors and APIs that can be used in the application to connect to MySQL. As the mysql command-line client uses the C API, this is included in this part of the figure as well. There are more connectors and APIs than I could fit into the figure; for the complete list see the documentation index.

The large darkish box represents the server. At the top there is the connection handling where there are two choices: use one thread per connection or use the commercial thread pool plugin. One thread per connection is the original way of handling new connections and as the name implies, it is simply creating a new thread for each new connection. The MySQL Enterprise Thread Pool will limit the overall number of threads by re-using the threads. This leads to better scalability particularly in the case of many connections executing short queries. After the initial thread handling, it is possible to send the connection through the optional Connection-Control plugin which will throttle the connections for an account, when there are multiple attempts to login with the wrong password.

Once the connection has been cleared to move ahead, the query is checked against the Query Cache if it is a SELECT statement, and the Query Cache is enabled. For most workloads the Query Cache will be an overhead that does not justify the potential gain. So, in practice it is recommended to completely disable it.

Info
Note: The Query Cache has been deprecated and is removed in MySQL 8.0.3 and later.

Next is the optional Query Rewrite plugin which allows the DBA to define which queries should be rewritten based on normalized queries like those in the Performance Schema digest summary table. For example, to change the query SELECT 130 to SELECT * FROM world.City WHERE ID = 130 – and the same for all other IDs, the following rule can be used:

This assumes the Query Rewrite plugin has already been installed.

It is not until now that the query actual reaches the parser (this will not happen if the query has a match in the Query Cache). After the parser has processed the query, it is known which schemas, tables, columns, etc. that will be required to answer the query, which means it is also possible to verify whether the user is allowed to use these objects.

After the parser, the MySQL Enterprise Firewall will be invoked if it is installed and enabled. The firewall compares the query’s digest to a whitelist of digests and if the query is in the whitelist. If the query is not in the whitelist it can be recorded or blocked (depending on configuration).

The last plugin is the MySQL Enterprise Audit plugin. In MySQL 5.7.20 and later, the audit log can be used to block access to a schema or a table in addition to log the access. This is the time where the plugin checks whether the access is allowed.

Finally the optimizer will determine the query plan and the query will be executed. The query execution will in almost all cases involve requesting data from or sending data to one of the storage engines. MySQL supports several storage engines through the pluggable storage engine architecture. Nowadays, the main storage engine is InnoDB, but several others exists including the NDBCluster storage engine that is used in MySQL NDB Cluster to execute queries through the SQL nodes (MySQL Server instances).

If You Want to Learn More

One nice way to investigate which order plugins and features are executed is to use the Performance Schema. For example to investigate the above features, enable the following instruments and consumers in the MySQL configuration file (or online executing UPDATE statements):

Then use two connections – one for monitoring and one for executing the query. Disable instrumentation for the monitoring connection:

Determine the Performance Schema thread ID for the connection that will be executing queries:

This assumes the two connections use different usernames. If that is not the case, then you will need to adjust the above query.

Optionally, to make it easier to find the events just for the test query, truncate all tables in the Performance Schema (this will lose all historical data, so is most appropriate for a test system):

Now execute the query – for example one that gets rewritten by the Query Rewriter plugin using the rule above:

Finally, the Performance Schema events can be retrieved:

Have fun.

InnoDB Locks Analysis: Why is Blocking Query NULL and How To Find More Information About the Blocking Transaction?

This post was originally published on the MySQL Support Team Blog at https://blogs.oracle.com/mysqlsupport/entry/innodb_locks_analysis_why_is on 14 April 2017.

Consider the scenario that you execute a query. You expect it to be fast – typically subsecond – but now it take not return until after 50 seconds (innodb_lock_wait_timeout seconds) and then it returns with an error:

You continue to investigate the issue using the sys.innodb_lock_waits view or the underlying Information Schema tables (INNODB_TRX, INNODB_LOCKS and INNODB_LOCK_WAITS).

Note: The above Information Schema tables with lock and lock waits information have been moved to the Performance Schema in 8.0 as the data_locks and data_lock_waits tables. The sys schema view however works the same.

However, when you query the locks information, the blocking query is returned as NULL. What does that mean and how to proceed from that to get information about the blocking transaction?

Setting Up an Example

Before proceeding, lets set up an example which will be investigated later in the blog. The example can be set up as (do not disconnect Connection 1 when the queries have been executed):

  1. Connection 1:
  2. Connection 2 (blocks for innodb_lock_wait_timeout seconds):
  3. The following output while Connection 2 is still blocking from sys.innodb_lock_waits shows that the blocking query is NULL (slightly reformatted):

    The warnings will only occur in the 5.7.14 and later as the InnoDB lock tables being moved to the Performance Schema in MySQL 8.0. It is recommended to use the sys.innodb_lock_waits view as that is updated accordingly in MySQL 8.0.

Investigating Idle Transactions

To investigate idle transactions, you need to use the Performance Schema to get this information. First determine the Performance Schema thread id for the blocking transaction. For this you need the blocking_pid, in the above example:

and use this with the The threads Table table like:

For the following queries insert the thread id found above for the THREAD_ID = … where clauses.

To get the latest query executed, use the The events_statements_current Table table or the The session and x$session Views view:

or:

In this case this does not explain why the lock is held as the last query update a different row then where the lock issue occurs. However if the events_statements_history consumer is enabled (it is by default in MySQL 5.7 and later), the The events_statements_history Table table will include the last 10 statements (by default) executed for the connection:

So now the history of the blocking transaction can be seen and it is possible to determine why the locking issue occur.

Note: The history also includes some queries executed before the transaction started. These are not related to the locking issue.

If transaction monitoring is also enabled (only available in MySQL 5.7 and later), it is possible to get more information about the transaction and automatically limit the query of the history to the current transaction. Transaction monitoring is not enabled by default. To enable it, use:

Note: This must be done before either of the transactions is started. Only transaction started after the transaction monitoring is enabled will be instrumented.

If the above was enabled before the blocking transaction started, you can get more details about the blocking transaction as:

And to get the statement history of the transaction:

 

Who Holds the Metadata Lock? MySQL 5.7.3 Brings Help

In MySQL 5.5 metadata locks were introduced to ensure consistency for some operations that previously could cause problems. For example if a transaction SELECTed from a table and another connection changed the structure of the table before the transaction was committed; in that case the two operations would be written in reverse order to the binary log which could prevent replaying the binary log.

However there was no way to see who held metadata locks. SHOW PROCESSLIST would show who were waiting for metadata locks, and often you could guess who held it, but it wasn’t so always. Consider for example the following:

In this case it happens to be the process with Id = 3 that holds the metadata lock, but that is not obvious from the above output.

In MySQL 5.7.3 this has changed. There is a new Performance Schema table metadata_locks which will tell which metadata locks are held and which are pending. This can be used to answer the question for a case like the one above.

First instrumentation of metadata locks should be enabled. To do that you must enable the wait/lock/metadata/sql/mdl instrument in setup_instruments. Additionally the global_instrumentation consumer must be enabled in setup_consumers.

Currently the wait/lock/metadata/sql/mdl instrument is not enabled by default. I have created a feature request to consider enabling it by default, but obviously whether that will happen also depends on the performance impact.

To enable metadata lock instrumentation:

Note: The global_instrumentation consumer is enabled by default.

With the metadata lock instrumentation enabled, it is now easy to answer who holds the metadata lock (I’ve excluded the connections own metalocks here as I’m only interested in the metadata lock contention going on between other queries):

So in this case there is a metadata lock GRANTED to process list id 3 whereas the other connections have a PENDING lock status for the metadata lock for the City table.

MySQL Connect 2013: ps_tools

During my talk and hands on labs at MySQL Connect 2013 I used several views, functions, and procedures from ps_helper and ps_tools.

The slides from the two talks can be downloaded from Thanks For Attending MySQL Connect or from the MySQL Connect 2013 Content Catalogue.

You can read more about ps_helper on Mark Leith’s blog and you can download ps_helper from github.

To get ps_tools, download ps_tools.tgz from here. Once unpacked there is a ps_tools_5x.sql for each of the versions supported. The tools presented at MySQL Connect were all based on MySQL 5.6 and 5.7. Note that several of the included tools are not particularly useful on their own but are more meant as utility functions for some of the other tools. The actual implementations are organised so they are in the subdirectory corresponding to the earliest version it applies to. For a few tools, such as ps_reset_setup(), there are multiple versions depending on the MySQL version it applies to.

Several of the tools have a help text at the top of the source file.

The main tools are:

  • Function ps_thread_id(): returns the thread id for a given connection. Specify NULL to get the thread id for the current connection. For example:
  •  View ps_setup_consumers: similar to performance_schema.setup_consumers, but includes a column to display whether the consumer is effectively enabled. See also slide 10 from CON 5282.
  • Function substr_count(): See also A Couple of Substring Functions: substr_count() and substr_by_delim().
  • Function substr_by_delim(): See also A Couple of Substring Functions: substr_count() and substr_by_delim().
  • Function ps_account_enabled(): Check whether a given account is enabled according to performance_schema.setup_actors. For example:
  • View ps_accounts_enabled: Lists each account and whether it is enabled according to performance_schema.setup_actors.
  • Procedure ps_setup_tree_instruments(): Creates a tree displaying whether instruments are enabled. See Easier Overview of Current Performance Schema Setting for more details.
  • Procedure ps_setup_tree_actors_by_host(): Creates a tree displaying whether instruments are enabled. See Easier Overview of Current Performance Schema Setting for more details.
  • Procedure ps_setup_tree_actors_by_user(): Creates a tree displaying whether instruments are enabled. See Easier Overview of Current Performance Schema Setting for more details.
  • Procedure ps_setup_tree_consumers(): Creates a tree displaying whether instruments are enabled. See Easier Overview of Current Performance Schema Setting and slide 11 from CON 5282 for more details.
  • Procedure ps_dump_thread_stack(): This is the same as in ps_helper with one bug fix. See also slides 30-31 from CON 5282.
  • Procedure ps_enable_all(): Enable all consumers and instruments.
  • Procedure ps_reset_setup(): Reset consumers, instruments, actors, and threads to the defaults (without taking my.cnf into consideration).
  • View is_innodb_lock_waits: show information about waiting locks for InnoDB.
  • View is_engine_data: summary of the amount of data and indexes grouped by storage engine.
  • View ps_memory_by_thread_by_event_raw: the amount of memory used grouped by thread and event without any formatting and ordering.
  • View ps_memory_current_by_thread_by_event: The same as above but formatted output and ordered by current usage.
  • View ps_memory_high_by_thread_by_event: The same as above but formatted output and ordered by the high watermark.
  • Procedure schema_unused_routines(): Lists unused stored procedures and functions.

Thanks For Attending MySQL Connect

MySQL Connect 2013 was held this past Saturday through Monday, and I would like to extend a big thank you to everyone who attended my sessions, I talked to or otherwise took part in the conference.

I had two sessions as well as participated in a Birds of the Feather session with the Community and Support teams. The slides have been uploaded the the Content Catalog but they are not available for download from there yet. Until then you can download them from the the links below:

The ps_helper views, procedures, and functions used in the above presentations can be downloaded from https://github.com/MarkLeith/dbahelper:

git clone https://github.com/MarkLeith/dbahelper

For ps_tools, I will follow up on this site with more information although some of the tools can be found in Easier Overview of Current Performance Schema Setting. Note: the presentation uses the naming convention that Performance Schema tools are prefixed ps_ – that was not the case in the above blog, so e.g. ps_setup_tree_consumers is in the blog call setup_tree_consumers.

And again: thanks for attending MySQL Connect 2013 – hope to see you again next year.

Easier Overview of Current Performance Schema Setting

While I prepared for my Hands-On Lab about the Performance Schema at MySQL Connect last year, one of the things that occurred to me was how difficult it was quickly getting an overview of which consumers, instruments, actors, etc. are actually enabled. For the consumers things are made more complicated as the effective setting also depends on parents in the hierarchy. So my thought was: “How difficult can it be to write a stored procedure that outputs a tree of the hierarchies.” Well, simple enough in principle, but trying to be general ended up making it into a lengthy project and as it was a hobby project, it often ended up being put aside for more urgent tasks.

However here around eight months later, it is starting to shape up. While there definitely still is work to be done, e.g. creating the full tree and outputting it in text mode (more on modes later) takes around one minute on my test system – granted I am using a standard laptop and MySQL is running in a VM, so it is nothing sophisticated.

The current routines can be found in ps_tools.sql.gz – it may later be merged into Mark Leith’s ps_helper to try to keep the Performance Schema tools collected in one place.

Note: This far the routines have only been tested in Linux on MySQL 5.6.11. Particularly line endings may give problems on Windows and Mac.

Description of the ps_tools Routines

The current status are two views, four stored procedure, and four functions – not including several functions and procedures that does all the hard work:

  • Views:
    • setup_consumers – Displays whether each consumer is enabled and whether the consumer actually will be collected based on the hierarchy rules described in Pre-Filtering by Consumer in the Reference Manual.
    • accounts_enabled – Displays whether each account defined in the mysql.user table has instrumentation enabled based on the rows in performance_schema.setup_actors.
  • Procedures:
    • setup_tree_consumers(format, color) – Create a tree based on setup_consumers displaying whether each consumer is effectively enabled. The arguments are:
      • format is the output format and can be either (see also below).:
        • Text: Left-Right
        • Text: Top-Bottom
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escapes sequences around the consumer names when used a Text format (ignored for Dot outputs).
    • setup_tree_instruments(format, color, only_enabled, regex_filter) – Create a tree based on setup_instruments displaying whether each instrument is enabled. The tree is creating by splitting the instrument names at each /. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escapes sequences around the instrument names when used a Text format (ignored for Dot outputs).
      • type – whether to base the tree on the ENABLED or TIMED column of setup_instruments.
      • only_enabled – if TRUE only the enabled instruments are included.
      • regex_filter – if set to a non-empty string only instruments that match the regex will be included.
    • setup_tree_actors_by_host(format, color) – Create a tree of each account defined in mysql.user and whether they are enabled; grouped by host. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escape sequences around the names when used a Text format (ignored for Dot outputs).
    • setup_tree_actors_by_user – Create a tree of each account defined in mysql.user and whether they are enabled; grouped by username. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escape sequences around the names when used a Text format (ignored for Dot outputs).
  • Functions:
    • is_consumer_enabled(consumer_name) – Returns whether a given consumer is effectively enabled.
    • is_account_enabled(host, user) – Returns whether a given account (host, user) is enabled according to setup_actors.
    • substr_count(haystack, needle, offset, length) – The number of times a given substring occurs in a string. A port of the PHP function of the same name.
    • substr_by_delim(set, delimiter, pos) – Returns the Nth element from a delimiter string.

The two functions substr_count() and substr_by_delim() was also described in an earlier blog.

The formats for the four stored procedures consists of two parts: whether it is Text or Dot and the direction. Text is a tree that can be viewed directly either in the mysql command line client (coloured output not supported) or the shell (colours supported for bash). Dot will output a DOT graph file in the same way as dump_thread_stack() in ps_helper. The direction is as defined in the DOT language, so e.g. Left-Right will have the first level furthest to the left, then add each new level to the right of the parent level.

Examples

As the source code – including comments – is more than 1600 lines, I will not discuss it here, but rather go through some examples.

setup_tree_consumers

Using the coloured output:

setup_tree_consumers_tbor the same using a non-coloured output:
setup_tree_consumers_lr

setup_tree_instruments

setup_tree_instruments_lrHere a small part of the tree is selected using a regex.

setup_tree_actors_%

With only root@localhost and root@127.0.0.1 enabled, the outputs of setup_tree_actors_by_host and setup_tree_actors_by_user gives respectively:setup_tree_actors_by_host_lrsetup_tree_actors_by_user_lr

DOT Graph of setup_instruments

The full tree of setup_instruments can be created using the following sequence of steps (I am using graphviz to get support for dot files):

setup_tree_instruments_dot_lr_snipThe full output is rather large (6.7M). If you want to see if you can get to it at http://mysql.wisborg.dk/wp-content/uploads/2013/05/setup_tree_instruments_dot_lr.png.

Views

Conclusion

There is definitely more work to do on making the Performance Schema easier to access. ps_helper and ps_tools are a great start to what I am sure will be an extensive library of useful diagnostic queries and tools.

Slides and Other Files From My Hands-On Labs at MySQL Connect 2012

First of all a big thank you to all of you who attended my two Hands-On Labs (HOL) session at this year’s MySQL Connect. I ended up doing two sessions as there was a last minute cancellation, so in addition to the previously announced session about the Performance Schema, I also did an introduction to MySQL.

The slides and the workbook for the Performance Schema session will become available from the official Oracle OpenWorld/MySQL Connect catalog, but you can also get the files from my blog which for the Performance Schema session will also include the helper functions and procedures used and some sample queries used to create load on the server.

HOL10471 – Getting Started with MySQL

This session only had slides, however the slides includes the commands and queries executed during the session. Note that the part on using MySQL Workbench (by Alfredo Kojimais) is not included.

The slides are available at: HOL 10471 – Getting Started with MySQL

HOL10472 – Improving Performance with the MySQL Performance Schema

The Performance Schema lab used several files. The main one to use is the workbook which includes details on the queries and commands to run. The hol10472.tgz also includes some support files (for example stored routines) used.

The following files can be downloaded:

Have fun playing with MySQL!

Why I am So Excited About the MySQL Performance Schema

The improved Performance Schema in MySQL 5.6 provides a new way of investigating issues in the database. Many issues that previously required tools such as strace, dtrace, etc. can now be investigated directly from inside MySQL in a platform independent way using standard SQL statements.

The Performance Schema is enabled by default starting from the latest milestone release, 5.6.6. You have instruments which are the things you can measure, and consumers which are those that use the measurements. Not all instruments and consumers are enabled out of the box, however once the plugin is enabled, individual instruments and consumers can be switched on and off dynamically.

As an example take the case mentioned in What’s the innodb main thread really doing? where the main InnoDB thread appears to be stuck in “doing background drop tables” even though no tables are being dropped. Now the underlying issue has been resolved in 5.6, but other issues could show up. So how would the Performance Schema help in cases like that?

First you need to ensure that the necessary instrumentations and consumers are enabled:

Note: the above settings are just the specifics to the investigation below. You will also need to ensure all the involved threads are instrumented, and most likely you also want to enable other instruments and/or consumers to be able to get more details, for example if an SQL query is involved, you can get more information about what the query is doing.

The mutex contention can now easily be investigated using the two tables events_waits_current and mutex_instances. The events_waits_current table will have a row for each combination of thread and event. Threads includes both background threads and foreground threads (those clients are using), so for example an interaction between the purge thread(s) in InnoDB and a query submitted by the application can be checked.

A very simple query to get the investigation started can look like:

Once you have the basic information about the threads involved, you can look for more information. One example is to use the events_statements_current, events_statements_history, and events_statements_history_long tables to get more knowledge of any foreground threads involved.

If you attend MySQL Connect 2012, you will be able to learn more about the MySQL Performance Schema for example by attending the hands-on lab “Improving Performance with the MySQL Performance Schema (HOL10472)” where you will be able to try out the Performance Schema yourself.

If you cannot make it to MySQL Connect or cannot wait to try the Performance Schema yourself, you can download MySQL Server 5.6 from https://dev.mysql.com/downloads/mysql/5.6.html. I can also recommend downloading Mark Leith’s ps_helper views and stored procedures.

What is the MySQL Performance Schema and Why is It Needed?

When you have a non-trivial database installation, you will inevitably sooner or later encounter performance related issues ranging from a query not executing as fast as desirable to complete meltdowns where the database does not respond at all.

Until MySQL 5.5 the tools available to investigate what is going on inside MySQL were somewhat limited. Some of the tools were:

  • The slow and general query logs
  • The status counters available through SHOW [SESSION|GLOBAL] STATUS
  • Storage engine status, e.g. SHOW ENGINE INNODB STATUS
  • The EXPLAIN command to investigate the query plan of a SELECT statement
  • SHOW PROFILE to profile one or more queries
  • The MySQL error log

All of these tools are very useful, but also have their limitations, for example the SHOW STATUS mainly consists of counters that does not provide much insight into what is happening specifically.

In MySQL 5.5 a new tool was introduced, the Performance Schema (often abbreviated P_S). The Performance Schema consist of instrumentation points directly in the source code which allow inspection of the internals of the server at runtime. Some of the advantages for the Performance Schema implementation are:

  • The Performance Schema data is available through the PERFORMANCE SCHEMA storage engine in the performance_schema database, so it is possible to query the data using standard SQL statements.
  • The Performance Schema is available irrespectively of the platform, so while the exact data collected will differ between platforms, the way it works from a DBAs perspective it is the same. This for example means it is possible to create tools that can work across all the MySQL instances. A great example of this is the ps_helper collection of views and stored procedures written by Mark Leith.
  • It is possible to configure the Performance Schema dynamically as long as the plugin has been enabled (this is the defaults as of MySQL 5.6.6).
  • It is easy to add new instrumentation points including adding instrumentation to third party plugins.
  • Enabling the Performance Schema is transparent to normal operations (although obviously there will be a small performance impact – MySQL 5.6 is much better in this respect than MySQL 5.5 though).

For the full list of implementation details, see the documentation of the Performance Schema in the MySQL Reference Manual.

In MySQL 5.5 the amount of information in the Performance Schema is relatively limited. If you take a look into the performance_schema database, it becomes immediately obvious that much has happened between MySQL 5.5 and 5.6.6: the former has 17 views, the latter 52 views. The information available in MySQL 5.5 is primarily I/O oriented whereas MySQL 5.6 also – among other – includes:

  • Statement information (similar to the slow query log)
  • Information about index usage, e.g. to identify unused indexes
  • The possibility to create a stack trace for a given thread
  • etc.

The Performance Schema is the subject of both a talk and a hands-on lab at the upcoming MySQL Connect 2012 which takes place on September 29th and 30th in San Francisco. These two sessions will go into greater depth about the above features. You can also read more about the sessions at: