InnoDB Progress Information

MySQL has since version 5.7 had support for progress information for some queries. As promised in my previous post, I will here discuss how you can use that to get progress information for ALTER TABLE on InnoDB tables.

Background and Setup

Progress information is implemented through the Performance Schema using the stage events. In version 8.0.12 there are currently seven stages that can provide this information for  ALTER TABLE statements on InnoDB tables. In MySQL 8, it is easy to list the stages capable of reporting progress information by using the setup_instruments Performance Schema table:

mysql> SELECT NAME, ENABLED, TIMED
         FROM performance_schema.setup_instruments
        WHERE NAME LIKE 'stage/innodb/alter table%'
              AND PROPERTIES = 'progress';
+------------------------------------------------------+---------+-------+
| NAME                                                 | ENABLED | TIMED |
+------------------------------------------------------+---------+-------+
| stage/innodb/alter table (end)                       | YES     | YES   |
| stage/innodb/alter table (flush)                     | YES     | YES   |
| stage/innodb/alter table (insert)                    | YES     | YES   |
| stage/innodb/alter table (log apply index)           | YES     | YES   |
| stage/innodb/alter table (log apply table)           | YES     | YES   |
| stage/innodb/alter table (merge sort)                | YES     | YES   |
| stage/innodb/alter table (read PK and internal sort) | YES     | YES   |
+------------------------------------------------------+---------+-------+
7 rows in set (0.00 sec)

This also shows how the setup_instruments table in MySQL 8 has some additional information about the instruments such as properties and documentation (not included in the output). Adding this information is still work in progress.

MySQL 5.7 does not provide as easy a way to obtain the instruments providing progress information. Instead you need to consult the reference manual. However, the principle in using the feature is the same.

As you can see, all of the instruments are enabled and timed by default. What is not enabled by default, however, is the consumer that can make the information available:

mysql> SELECT NAME, ENABLED,
              sys.ps_is_consumer_enabled(NAME) AS EnabledWithHierarchy
         FROM performance_schema.setup_consumers
        WHERE NAME = 'events_stages_current';
+-----------------------+---------+----------------------+
| NAME                  | ENABLED | EnabledWithHierarchy |
+-----------------------+---------+----------------------+
| events_stages_current | NO      | NO                   |
+-----------------------+---------+----------------------+
1 row in set (0.01 sec)

Since the consumers form a hierarchical system, the sys schema function ps_is_consumer_enabled() is used to show whether the consumer is enabled taking the whole hierarchy into consideration.

In order to use the progress information, you need to enable the events_stages_current consumer. This is the consumer that is responsible for keeping the performance_schema.events_stages_current table up to date, i.e. record the current (or latest if there is no current stage) for each thread. With the default Performance Schema settings, the rest of the hierarchy is enabled. To enable event_stages_current and verify it will be consuming instruments, you can use the following queries:

mysql> UPDATE performance_schema.setup_consumers
          SET ENABLED = 'YES'
        WHERE NAME = 'events_stages_current';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT NAME, ENABLED,
              sys.ps_is_consumer_enabled(NAME) AS EnabledWithHierarchy
         FROM performance_schema.setup_consumers
        WHERE NAME = 'events_stages_current';
+-----------------------+---------+----------------------+
| NAME                  | ENABLED | EnabledWithHierarchy |
+-----------------------+---------+----------------------+
| events_stages_current | YES     | YES                  |
+-----------------------+---------+----------------------+
1 row in set (0.00 sec)

That is it. Now you can monitor the progress of the queries that uses the stages with progress information.

Warning

The more parts of the Performance Schema that is enabled and the more fine grained monitoring, the more overhead. Stages are not the worst with respect to overhead; nevertheless it is recommended you keep an eye on the affect of enabling the events_stages_current consumer.

Monitoring Progress

The base for monitoring the progress information is the performance_schema.events_stages_current table. There are two columns of interest for this discussion:

  • WORK_COMPLETED: The amount of work that is reported to have been completed.
  • WORK_ESTIMATED: The estimated amount of work that needs to be done.

For InnoDB ALTER TABLE the estimated amount of work is for the entire operation. That said, the estimate may be revised during the process, so it may happen that the if you calculate the percentage it decreases as time goes. However, in general the percentage (100% * WORK_COMPLETED/WORK_ESTIMATED) will increase steadily until the operation completes at 100%.

To learn more about how the progress information works, the following pages in the manual are recommended:

For now, let's look at an example.

Example

For the example, the salaries table in the employees sample database will be used. The table is sufficiently large that it will be possible to query the progress while adding a column using the INPLACE algorithm. As discussed in MySQL 8.0.12: Instant ALTER TABLE, it is possible to add a column instantly, but for the purpose of this example, the INPLACE algorithm illustrates the progress information feature better. The query that will be executed is:

ALTER TABLE salaries ADD COLUMN new_col int NOT NULL DEFAULT 0, ALGORITHM=INPLACE;

The performance_schema.events_stages_current table can be joined with the performance_schema.events_statements_current to show the query and progress. For example:

mysql> SELECT stmt.THREAD_ID, stmt.SQL_TEXT, stage.EVENT_NAME AS State,
              stage.WORK_COMPLETED, stage.WORK_ESTIMATED,
              ROUND(100*stage.WORK_COMPLETED/stage.WORK_ESTIMATED, 2) AS CompletedPct
         FROM performance_schema.events_statements_current stmt
              INNER JOIN performance_schema.events_stages_current stage
                      ON stage.THREAD_ID = stmt.THREAD_ID
                         AND stage.NESTING_EVENT_ID = stmt.EVENT_ID\G
*************************** 1. row ***************************
     THREAD_ID: 63857
      SQL_TEXT: ALTER TABLE salaries ADD COLUMN new_col int NOT NULL DEFAULT 0, ALGORITHM=INPLACE
         State: stage/innodb/alter table (read PK and internal sort)
WORK_COMPLETED: 8906
WORK_ESTIMATED: 27351
  CompletedPct: 32.56
1 row in set (0.00 sec)

There is another way though. Instead of using the performance_schema.events_stages_current table directly, an easier way is to use the sys.session view. This is an advanced process list that includes much more information than the usual SHOW PROCESSLIST statement including progress information. The performance of sys.session has been improved with more than an order of magnitude in MySQL 8 by the addition of indexes to the Performance Schema tables making it highly useful.

Querying the sys.session view for sessions showing progress information while the ALTER TABLE is in progress returns an output similar to the following example:

mysql> SET @sys.statement_truncate_len = 85;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT thd_id, conn_id, db, command, state, current_statement,
              statement_latency, progress, current_memory, program_name
         FROM sys.session
        WHERE progress IS NOT NULL\G
*************************** 1. row ***************************
           thd_id: 63857
          conn_id: 63818
               db: employees
          command: Query
            state: alter table (merge sort)
current_statement: ALTER TABLE salaries ADD COLUMN new_col int NOT NULL DEFAULT 0, ALGORITHM=INPLACE
statement_latency: 4.22 s
         progress: 49.39
   current_memory: 464.27 KiB
     program_name: MySQLWorkbench
1 row in set (0.06 sec)

In the example, the @sys.statement_truncate_len user variable is set to 85. By default the sys schema ensures the current statement is at most 64 characters long. In order to avoid truncation in this case, the truncate length is increased to 85.

The example output shows that the progress is at around 49%. It is important to note that is an estimate and not an exact number. The ALTER TABLE is performing a merge sort at the time, and the query has been running for 4.22 seconds.

A couple of other interesting columns are included. It can be seen the connection is using 464KiB at the time. In MySQL 8 memory instrumentation is enabled by default (in MySQL 5.7 you need to enable it yourself). Additionally, the name of the program executing the query is MySQLWorkbench, that is the query originates from MySQL Workbench.

So, next time you plan a large ALTER TABLE operation, consider enabling the events_stages_current consumer, so you can follow the progress.

I have worked with MySQL databases since 2006 both as an SQL developer, a database administrator, and for more than eight years as part of the Oracle MySQL Support team. I have spoken at MySQL Connect and Oracle OpenWorld on several occasions. I have contributed to the sys schema and four Oracle Certified Professional (OCP) exams for MySQL 5.6 to 8.0. I have written four books, all published at Apress.

2 Comments on “InnoDB Progress Information

Leave a Reply

Your email address will not be published.

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.