Happy Birthday MySQL

Today 23 May 2020, it is 25 years since the first release of MySQL. So, I would like to take the opportunity to wish MySQL – and Sakila – a happy birthday.

Happy birthday MySQL. Happy birthday Sakila.

My own MySQL journey started in 2006 when I at a job interview was told that if I got the job, I would need to learn MySQL before starting. Since the job involved PHP coding, I got the book Web Database Applications with PHP and MySQL as well as Managing and Using MySQL. Around a week later, I started in my first job involving MySQL at Noggin Pty Ltd.

The two first MySQL books, I studied: Web Database Applications with PHP and MySQL and Managing & Using MySQL.
The two first MySQL books, I studied.

This was in the days of MySQL 5.0 when stored functions, procedures, and triggers were new, and statement based replication was the only binary log format around. The job evolved into including database administration, and over the next four years and a bit, I got the chance to work around large parts of the corners of MySQL. One of the most interesting tasks was to develop the database backend for a messaging system that had to support active-active replication (to allow the customers to keep using the service even if they couldn't reach both data centers). To avoid the data diverging, conflict resolution was implemented through triggers (this was possible, because MySQL 5.0 exclusively used statement based replication).

Advice

I do not recommend you to go down the road of active-active replication unless you use Group Replication (and even then, you are probably better off staying with single primary mode) or MySQL NDB Cluster.

This was also the period, when I decided to pursue the MySQL 5.0 developer and DBA certifications. Back then, each certification consisted of two exams, and there was an official study guide.

MySQL 5.0 Certification Study Guide
The MySQL 5.0 Certification Study Guide

The next step in my MySQL journey took me to MySQL itself where I worked as a MySQL technical support engineer. I have many fond memories from my years with Oracle, and it was an invaluable experience. You learn a lot by helping customers as you encounter a lot of different cases you would never encounter by working on your own database. It was also great participating in the discussions with the developers and product management. A part of the support job also included writing items for the MySQL 5.6, 5.7, and 8 certification exams, so in that way I went from being a candidate to becoming the “examiner”. (Note, writing the exam items was a team effort including long calls to review each others items.)

The latest step in my journey is as a database reliability engineer at Okta, so yet another way of working with MySQL. All in all, it has been an interesting journey and I hope both my personal MySQL journey and that of MySQL itself will continue for many more years.

Happy birthday.

MySQL Compressed Binary Logs

On a busy server, the binary logs can end up being one of the largest contributors to amount of disk space used. That means higher I/O, larger backups (you are backing up your binary logs, right?), potentially more network traffic when replicas fetch the logs, and so on. In general, binary logs compress well, so it has been a long time wish for a feature that allowed you to compress the logs while MySQL are still using them. Starting from MySQL 8.0.20 that is now possible. I will take a look at the new feature in this post.

Configuration

The binary log compression feature is controlled by two variables, one for enabling the feature and another to specify the compression level. These are summarized in the below table.

Variable NameDefault
Value
Allowed
Values
binlog_transaction_compressionOFFOFF/ON
binlog_transaction_compression_level_zstd31-22

The two option names are fairly self explanatory. binlog_transaction_compression specifies whether compression is enabled or not, and binlog_transaction_compression_level_zstd specifies the compression level. The high level, the (in principle – see also the tests later in the blog) better compression at the cost of increased CPU.

Both options can be set dynamically both at the global and session scopes. However, it is not allowed to change the session values in the middle of a transaction. If you do that, you get an error like this:

mysql> SET SESSION binlog_transaction_compression = ON;
ERROR: 1766 (HY000): The system variable binlog_transaction_compression cannot be set when there is an ongoing transaction.

While it is easy to configure binary log compression, there are some limitations, you should be aware of.

Limitations

The short version of the limitations boils down to only transactional row events are compressed. Statement based events, GTID information, rotation events, row-based events for non-transactional tables, etc. are not compressed. Also if you have a transaction and add non-transactional changes, then neither will be compressed.

If you use all default values for your binary logs and use the InnoDB storage engine (the default), then compression will work. For the complete list of limitations, see Binary Log Transaction Compression in the manual.

Information

This also means that each transaction is compressed on its own. See the examples later in the blog for more on what that means.

As I usually preach, monitoring is key to understanding your system. What does the binary log compression feature support in terms of monitoring?

Monitoring

There are two ways to monitor the performance of the binary log compression feature. A Performance Schema table with compression statistics and new stage events.

The binary_log_transaction_compression_stats table in the Performance Schema includes statistics since the last restart of MySQL (or last time the table was truncated) for the compression. The table has two rows for the binary log, one for compressed events and one for events that are not compressed. Replicas will similarly have two rows for the relay log. An example of the content as well as the binary logs at the same time is:

mysql> SELECT * FROM binary_log_transaction_compression_stats\G
*************************** 1. row ***************************
                            LOG_TYPE: BINARY
                    COMPRESSION_TYPE: ZSTD
                 TRANSACTION_COUNTER: 15321
            COMPRESSED_BYTES_COUNTER: 102796461
          UNCOMPRESSED_BYTES_COUNTER: 252705572
              COMPRESSION_PERCENTAGE: 59
                FIRST_TRANSACTION_ID: 74470a0c-8ea4-11ea-966e-080027effed8:30730
  FIRST_TRANSACTION_COMPRESSED_BYTES: 313
FIRST_TRANSACTION_UNCOMPRESSED_BYTES: 363
         FIRST_TRANSACTION_TIMESTAMP: 2020-05-07 19:26:37.744437
                 LAST_TRANSACTION_ID: 74470a0c-8ea4-11ea-966e-080027effed8:46058
   LAST_TRANSACTION_COMPRESSED_BYTES: 712
 LAST_TRANSACTION_UNCOMPRESSED_BYTES: 1627
          LAST_TRANSACTION_TIMESTAMP: 2020-05-07 19:38:14.149782
*************************** 2. row ***************************
                            LOG_TYPE: BINARY
                    COMPRESSION_TYPE: NONE
                 TRANSACTION_COUNTER: 20
            COMPRESSED_BYTES_COUNTER: 5351
          UNCOMPRESSED_BYTES_COUNTER: 5351
              COMPRESSION_PERCENTAGE: 0
                FIRST_TRANSACTION_ID: 74470a0c-8ea4-11ea-966e-080027effed8:30718
  FIRST_TRANSACTION_COMPRESSED_BYTES: 116
FIRST_TRANSACTION_UNCOMPRESSED_BYTES: 116
         FIRST_TRANSACTION_TIMESTAMP: 2020-05-07 19:26:37.508155
                 LAST_TRANSACTION_ID: 74470a0c-8ea4-11ea-966e-080027effed8:31058
   LAST_TRANSACTION_COMPRESSED_BYTES: 116
 LAST_TRANSACTION_UNCOMPRESSED_BYTES: 116
          LAST_TRANSACTION_TIMESTAMP: 2020-05-07 19:30:30.840767
2 rows in set (0.0026 sec)

mysql> SHOW BINARY LOGS;
+---------------+-----------+-----------+
| Log_name      | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000142 |       240 | No        |
| binlog.000143 |      4933 | No        |
| binlog.000144 |  28238118 | No        |
| binlog.000145 |  24667167 | No        |
| binlog.000146 |  39221771 | No        |
| binlog.000147 |  11944631 | No        |
| binlog.000148 |       196 | No        |
+---------------+-----------+-----------+
7 rows in set (0.0005 sec)

The example is generated with GTIDs enabled, and all of the binary logs except binlog.000142 have been created after the last restart.

This shows that there has been 15341 transactions (74470a0c-8ea4-11ea-966e-080027effed8:30718-46058) of which 15321 have been compressed with an average of 59% compression (compressing 252,705,572 bytes to 102,796,461 bytes). There are also statistics for the first and most recent transaction that was compressed. Similarly there have been 20 “transactions” that could not be compressed.

One thing is to know the compression rate, you also need to know the price you pay in terms of the overhead of performing the compression and decompression. That can be monitored using two stage events in the Performance Schema:

  • stage/sql/Compressing transaction changes.
  • stage/sql/Decompressing transaction changes.

(Yes, the period is part of the name.)

Neither are enabled by default and for the instruments to be collection, you will also need to enable the events_stages_current consumer. For example to enable these in the configuration file:

[mysqld]
performance-schema-instrument = "stage/sql/%Compressing transaction changes.=ON"
performance-schema-consumer-events-stages-current = ON

Warning

There is some overhead by enabling instrumentation of stages. Make sure you test the impact first, if you consider enabling these on a production system.

You can compare that with the wait/io/file/sql/binlog event (enabled by default) which is the time spent doing I/O. For example:

mysql> SELECT EVENT_NAME, COUNT_STAR,
              FORMAT_PICO_TIME(SUM_TIMER_WAIT) AS total_latency,
              FORMAT_PICO_TIME(MIN_TIMER_WAIT) AS min_latency,
              FORMAT_PICO_TIME(AVG_TIMER_WAIT) AS avg_latency,
              FORMAT_PICO_TIME(MAX_TIMER_WAIT) AS max_latency
         FROM performance_schema.events_stages_summary_global_by_event_name
        WHERE EVENT_NAME LIKE 'stage/sql/%transaction changes.'\G
*************************** 1. row ***************************
   EVENT_NAME: stage/sql/Compressing transaction changes.
   COUNT_STAR: 15321
total_latency: 6.10 s
  min_latency: 22.00 ns
  avg_latency: 397.96 us
  max_latency: 982.12 ms
*************************** 2. row ***************************
   EVENT_NAME: stage/sql/Decompressing transaction changes.
   COUNT_STAR: 0
total_latency:   0 ps
  min_latency:   0 ps
  avg_latency:   0 ps
  max_latency:   0 ps
2 rows in set (0.0008 sec)

mysql> SELECT *
         FROM sys.io_global_by_wait_by_latency
        WHERE event_name = 'sql/binlog'\G
*************************** 1. row ***************************
   event_name: sql/binlog
        total: 27537
total_latency: 4.83 min
  avg_latency: 10.51 ms
  max_latency: 723.92 ms
 read_latency: 138.25 us
write_latency: 290.69 ms
 misc_latency: 4.82 min
   count_read: 37
   total_read: 1002 bytes
     avg_read:   27 bytes
  count_write: 16489
total_written: 99.26 MiB
  avg_written: 6.16 KiB
1 row in set (0.0015 sec)

In this case, I have used the sys.io_global_by_wait_by_latency view as a short cut as it automatically makes the latencies human readable. For the latencies of the compression/decompression stages, the FORMAT_PICO_TIME() function convert them.

In this example, MySQL has spent 6.21 seconds compressing binary logs averaging just under 400 microseconds per transaction. For comparison, 4.8 minutes have been spent doing I/O on the binary log files.

You should check the time spent on writing and reading the binary log files before enabling compression, so you can determine the change in performance. You should also check the change in CPU usage.

MySQL 8 Query Performance Tuning

Book

If you want to learn more about MySQL performance tuning, then I have written MySQL 8 Query Performance Tuning published by Apress.

The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

In the above outputs, it showed the compression was 59%, but how is it for different kinds of workloads?

Examples – Workloads

In order to determine how well the compression works, I have performed a series of tasks and compared the size of the binary logs with and without compression. For comparison, I have also tried to manually compress the binary logs from the series of tests without compression to see the optimal compression (as opposed to the per-transaction compression used by MySQL). The tests have been performed with the default configuration except the settings necessary for the given test.

The following workloads have been performed:

  • Bulk load: Loading the employees sample database.
  • Bulk update: Updating the salary column of all rows in the employees.salaries table: UPDATE employees.salaries SET salary = salary + 1.
  • Bulk load: Populating four tables with 100,000 rows using sysbench as preparation for the oltp_read_write benchmark.
  • OLTP workload: Running the sysbench oltp_read_write benchmark with --events=15000 using the four tables from the previous test.
  • Single row deletes: Deleting all 100,000 rows in one of the tables from the sysbench test. The rows are deleted one by one which represents a worst case scenario for the compression as the transactions are very small and there is only the before image in the binary log for each deleted row.

The single row deletes can be performed easily using MySQL Shell, for example in the Python language mode:

from datetime import datetime

for i in range(100000):
    if i % 5000 == 0:
        print('{0}: i = {1}'.format(datetime.now(), i))
    session.run_sql('DELETE FROM sbtest.sbtest1 WHERE id = {0}'.format(i+1))
print('{0}: i = 1000000'.format(datetime.now()))

The size of each task generates uncompressed binary logs in the range of 25 MiB to 82 MiB.

The tests were performed with the following settings:

  • No compression
  • Compression enabled
  • Encrypted but no compressed
  • Encryption and compression enabled
  • No compression in MySQL + compressing with zstd

The compression with zstd was chosen as MySQL uses the Zstandard compression algorithm. If you want to try yourself, you can download the Zstandard source code from Facebook's GitHub repository which also includes the compilation instructions.

The resulting size of the binary log in bytes for each combination can be found in the following table.

TestNormalCompressEncryptEncrypt +
Compress
zstd
Load employees6637863928237933663791512823863426892387
Bulk Update8569832024667051856988322466767724779953
sysbench prepare7636774039221052763682523922180639775067
sysbench run261902001193371326190712119365618468625
Single Row Deletes4730015639492400473006683963768416525219

That the encrypted binary log compresses as well as the unencrypted suggest that the compression is done before the encryption. (Encrypted data does not compress well.) Because there is no difference whether encryption is enabled, only the normal (uncompressed), compressed, and zstd results will be discussed further. The binary log sizes can also be seen in this figure:

Binary log sizes for different compressions schemes.
Binary log sizes for different compressions schemes.

Unsurprising the bulk loads and update perform the best with compressed binary log being 29% to 51% of the uncompressed size. The sysbench OLTP workload also compresses well ending up being 46% of the size. Also unsurprising, the single row deletes do not compress nearly as well with the compressed binary log being 83% of the size of the uncompressed binary log.

When comparing the binary logs compressed by MySQL with those compressed manually with zstd, the file size is around the same for the bulk loads reflecting that for large transactions, compressing on a per-transaction basis works as well as compressing the whole file. As the transaction size becomes smaller, the relative efficiency of the per-transaction compression reduces which is particularly visible for the single row deletes.

The other factor to consider is the compression level.

Examples – Compression Level

There are some oddities when it comes to the compression level, and the short story is that there is no need to change the setting.

The first oddity is that the allowed values are 1-22 but zstd only supports levels 1-19. There is nothing in the MySQL documentation explaining the difference.

The second oddity is that there is effectively no change in the size of the compressed binary logs by changing the value of binlog_transaction_compression_level_zstd as can be seen from the table:

Level/TestMySQLzstd
LoadOLTPLoadOLTP
12823814211935450344832078531545
32823793311933713268923878468625
112823812811902669247371946639524
192823824611937664188671875724300
222823812511910022

For comparison, the compression at levels 1, 3, 11, and 19 are included compressing the binary log manually with zstd. “Load” is for loading the employees database and OLTP is the sysbench oltp_read_write benchmark. The data can also be seen in this figure:

The binary log size as a function of the compression level.
The binary log size as a function of the compression level.

As it can be seen, there is essentially no difference in the file size irrespective of the compression level used in MySQL whereas for zstd the file size reduces as expected with increased compression level. For level 1 with the load test MySQL even compresses significantly better than zstd. It is like the compression level is never being set in MySQL.

Information

One possible explanation is that Zstandard supports training the algorithm for a given type of data (creating a dictionary). This particularly helps improving compression of small data. I do not know whether MySQL uses a dictionary and if so whether that makes all compression levels work roughly equal.

Conclusion

The new binary log transaction compression works great and can be a great way to reduce the amount of I/O, disk usage, and network usage. Should you enable it?

Most likely you will benefit from enabling binary log compression unless you are very strapped for CPU. From these tests, it is likely the disk space occupied by binary logs can be roughly cut in half, but as always it is workload dependent and you should test with your workload.

Advice

You can also enable compression at the protocol level for the transmission of the binary logs. There is no reason to both enable binary log transaction compression and the protocol compression.

On the other hand, there is not reason at present to change the compression level.

Apress Blog: MySQL Performance Tuning Best Practices

To celebrate the publishing of my new book MySQL 8 Query Performance Tuning, the Apress team invited me (thanks Jonathan and Liz) to write a post for the Apress blog. I decided to write about my top six best practices:

  • Be wary of best practices
  • Monitor
  • Work methodically
  • Consider the full stack
  • Make small, incremental changes
  • Understand the change

Yes, my first best practice is to be wary of best practices. Read why I added that and the other best practices at MySQL Performance Tuning Best Practices.

MySQL 8.0.20: Index-Level Optimizer Hints

MySQL introduced optimizer hints in version 5.7 and greatly extended the feature in MySQL 8. One thing that has been missing though is the ability to specify index hints using the syntax of optimizer hints. This has been improved of in MySQL 8.0.20 with the introduction of index-level optimizer hints for the FORCE and IGNORE versions of the index hints. This blog will look at the new index hint syntax.

Example of using index-level optimizer hints.

Warning

Do not add index hints – neither using the old or new style – unless you really need them. When you add index hints, you limit the options of the optimizer which can prevent the optimizer obtaining the optimal query plan as new optimizer improvements are implemented or the data changes.

On the other hand, if you really have a query where ANALYZE TABLE and increasing the number of pages analyzed in the random index dives do not help you, index hints can be very useful to ensure optimal performance.

The Short Story

To make a long story short, consider this query in 8.0.19 and earlier:

SELECT ci.CountryCode, co.Name AS Country, ci.Name AS City, ci.District
  FROM world.country co IGNORE INDEX (Primary)
       INNER JOIN world.city ci FORCE INDEX FOR ORDER BY (CountryCode)
                  ON ci.CountryCode = co.Code
 WHERE co.Continent = 'Asia'
 ORDER BY ci.CountryCode, ci.ID;

This query has two index hints, IGNORE INDEX in the second line and USE INDEX FOR ORDER BY in the third line.

In MySQL 8.0.20, you can write the query as:

SELECT /*+ NO_INDEX(co PRIMARY) ORDER_INDEX(ci CountryCode) */
       ci.CountryCode, co.Name AS Country, ci.Name AS City, ci.District
  FROM world.country co
       INNER JOIN world.city ci
                  ON ci.CountryCode = co.Code
 WHERE co.Continent = 'Asia'
 ORDER BY ci.CountryCode, ci.ID;

Information

However, note that there seems to be a bug, so the ORDER_INDEX() hint makes the optimizer choose a plan like NO_JOIN_INDEX() for the same index is also specified. The workaround is to also add the JOIN_INDEX().

Let's take a look which index-level optimizer hints that have been added and how they map to the old index hints.

The Details

There are four pairs of new index hints which all maps back to the old style hints as in the below table.

New HintOld Hint
JOIN_INDEX
NO_JOIN_INDEX
FORCE INDEX FOR JOIN
IGNORE INDEX FOR JOIN
GROUP_INDEX
NO_GROUP_INDEX
FORCE INDEX FOR GROUP BY
IGNORE INDEX FOR GROUP BY
ORDER_INDEX
NO_ORDER_INDEX
FORCE INDEX FOR ORDER BY
IGNORE INDEX FOR ORDER BY
INDEX
NO_INDEX
FORCE INDEX
IGNORE INDEX

The new hints support all of the usual features of optimizer hints such as specifying the query block for a hint, adding them inline in subqueries, etc. For the full details, see the manual and the release notes (which has an extensive description of the new hints).

If you need to specify multiple index hints, there are two ways to accomplish it depending on whether the indexes are on the same table or not. Consider a query on the world.city table where you will not allow neither the primary key nor the CountryCode index to be used. You can accomplish that as in this example:

SELECT /*+ NO_INDEX(ci PRIMARY, CountryCode) */
       ID, CountryCode, Name, District, Population
  FROM world.city ci
 WHERE Population > 1000000;

On the other hand, if the indexes are on different tables, then you will have to specify multiple hints. Let's say you want to force the optimizer to choose the hash join algorithm when joining the country and city tables in the world database by ignoring the primary key on the country table and the CountryCode index on the city table (effectively forcing the join not to use an index irrespective of the join order). In this case, you can use the NO_INDEX() hint twice, once on each table:

SELECT /*+ NO_INDEX(co PRIMARY) NO_INDEX(ci CountryCode) */
       ci.CountryCode, co.Name AS Country, ci.Name AS City, ci.District
  FROM world.country co
       INNER JOIN world.city ci ON ci.CountryCode = co.Code
 WHERE co.Continent = 'Asia';

How About USE INDEX?

The old index hint syntax also includes the USE INDEX variant, but none of the new index-level optimizer hints corresponds to it. So, what do you do if you want to specify the softer USE INDEX rather than FORCE INDEX? First of all, in that case there is a good chance, you do not need the index hint at all, so try to remove it and verify whether the optimizer uses the same query plan. If so, it is better to remove it, so the optimizer can use the optimal join strategy as new optimizer features become available or the data changes.

If you really need the USE INDEX hint, the simplest is to continue to use the old syntax. However, be aware that you cannot mix the old and new syntax hints (in that case, the old hints are ignored).

Alternative, you can simulate USE INDEX by specifying all other applicable indexes in a NO_INDEX() optimizer hint. For example, consider the world.countrylanguage table:

mysql> SHOW CREATE TABLE countrylanguage\G
*************************** 1. row ***************************
       Table: countrylanguage
Create Table: CREATE TABLE `countrylanguage` (
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `Language` char(30) NOT NULL DEFAULT '',
  `IsOfficial` enum('T','F') NOT NULL DEFAULT 'F',
  `Percentage` float(4,1) NOT NULL DEFAULT '0.0',
  PRIMARY KEY (`CountryCode`,`Language`),
  KEY `CountryCode` (`CountryCode`),
  CONSTRAINT `countryLanguage_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.0008 sec)

If you want to find all languages spoken in Australia and for some reason want to either use the CountryCode index or no index at all, then you can tell the optimizer to ignore the primary key:

SELECT /*+ NO_INDEX(cl PRIMARY) */
       CountryCode, Language, IsOfficial, Percentage
  FROM world.countrylanguage cl
 WHERE CountryCOde = 'AUS';

That is equivalent of the old syntax hints with USE INDEX (CountryCode):

SELECT CountryCode, Language, IsOfficial, Percentage
  FROM world.countrylanguage cl USE INDEX (CountryCode)
 WHERE CountryCOde = 'AUS';

While this works the same and is simple enough with just one alternative index, it is not so easy in the general case. To be sure your NO_INDEX() optimizer hint is the same as the USE INDEX index hint, you will need to fetch the list of indexes and add them to the NO_INDEX() hint each time you execute the query. That is the reason for the suggestion to keep using the old style index hint if you rely on USE INDEX.

New Book: MySQL 8 Query Performance Tuning

I have over the last few years been fortunate to have two books published through Apress, Pro MySQL NDB Cluster which I wrote together with Mikiya Okuno and MySQL Connector/Python Revealed. With the release of MySQL 8 around a year ago, I started to think of how many changes there has been in the last few MySQL versions. Since MySQL 5.6 was released as GA in early 2013, some of the major features related to performance tuning includes the Performance Schema which was greatly changed in 5.6, histograms, EXPLAIN ANALYZE, hash joins, and visual explain. Some of these are even unique in MySQL 8.

So, I was thinking that it could be interesting to write a book that focuses on performance tuning in MySQL 8. In order to try to limit the scope somewhat (which as you can see from the page count that I was not too successful with), I decided to mainly look at the topics related to query performance. I proposed this to my acquisition editor Jonathan Gennick, and he was very interested. Oracle whom I worked for at the time was also interested (thanks Adam Dixon, Victoria Reznichenko, Edwin DeSouza, and Rich Mason for supporting me and approving the project). Also thanks to the Apress editors and staff who has been involved including but not limited to Jonathan Gennick, Jill Balzano, Laura Berendson, and Arockia Rajan Dhurai.

Now around a year later, the final result is ready: MySQL 8 Query Performance Tuning. If you are interested, you can read more about the content and/or buy it at Apress, Amazon, and others book shops:

ShopPaper BookE-Book
ApressSoftcoverDRM free ePub/PDF
Amazon USSoftcoverMobi (Kindle)

Book Structure

The book is divided into six parts with a total of 27 chapters. I have attempted to keep each chapter relatively self-contained with the aim that you can use the book as a reference book. The drawback of this choice is that there is some duplication of information from time to time. An example is Chapter 18 which describes the more theoretical side of locks and how to monitor locks, and Chapter 22 which provides practical examples of investigating lock contention. Chapter 22 naturally draws on the information in Chapter 18, so some of the information is repeated. This was a deliberate choice, and I hope it helps you reduce the amount of page flipping to find the information you need.

The six parts progressively move you through the topics starting with some basic background and finishing with more solution-oriented tasks. The first part starts out discussing the methodology, benchmarks, and test data. The second part focuses on the sources of information such as the Performance Schema. The third part covers the tools such as MySQL Shell used in this book. The fourth part provides the theoretical background used in the last two parts. The fifth part focuses on analyzing queries, transactions, and locks. Finally, the sixth part discusses how to improve performance through the configuration, query optimization, replication, and caching. There are cases where some content is a little out of place, like all replication information is contained in a single chapter.

Chapters

Part I: Getting Started

Part I introduces you to the concepts of MySQL query performance tuning. This includes some high-level considerations, of which some are not unique to MySQL (but are of course discussed in the context of MySQL). The four chapters are

  • Chapter 1 – MySQL Performance Tuning
    This introductory chapter covers some high-level concepts of MySQL performance tuning such as the importance of considering the whole stack and the lifecycle of a query.
  • Chapter 2 – Query Tuning Methodology
    It is important to work in an effective way to solve performance problems. This chapter introduces a methodology to work effectively and emphasizes the importance of working proactively rather than doing firefighting.
  • Chapter 3 – Benchmarking with Sysbench
    It is often necessary to use benchmarks to determine the effect of a change. This chapter introduces benchmarking in general and specifically discusses the Sysbench tool including how to create your own custom benchmarks.
  • Chapter 4 – Test Data
    The book mostly uses a few standard test databases which are introduced in this chapter.

Part II: Sources of Information

MySQL exposes information about the performance through a few sources. The Performance Schema, the sys schema, the Information Schema, and the SHOW statement are introduced in each their chapter. There are only relatively few examples of using these sources in this part; however, these four sources of information are used extensively in the remainder of the book. If you are not already familiar with them, you are strongly encouraged to read this part. Additionally, the slow query log is covered. The five chapters are

  • Chapter 5 – The Performance Schema
    The main source of performance related information in MySQL is – as the name suggests – the Performance Schema. This chapter introduces the terminology, the main concepts, the organization, and the configuration.
  • Chapter 6 – The sys Schema
    The sys schema provides reports through predefined views and utilities in stored functions and programs. This chapter provides an overview of what features are available.
  • Chapter 7 – The Information Schema
    If you need metadata about the MySQL and the databases, the Information Schema is the place to look. It also includes important information for performance tuning such as information about indexes, index statistics, and histograms. This chapter provides an overview of the views available in the Information Schema.
  • Chapter 8 – SHOW Statements
    The SHOW statements are the oldest way to obtain information ranging from which queries are executing to schema information. This chapter relates the SHOW statements to the Information Schema and Performance Schema and covers in somewhat more detail the SHOW statements without counterparts in the two schemas.
  • Chapter 9 – The Slow Query Log
    The traditional way to find slow queries is to log them to the slow query log. This chapter covers how to configure the slow query log, how to read the log events, and how to aggregate the events with the mysqldump utility.

Part III: Tools

MySQL provides several tools that are useful when performing the daily work as well as specialized tasks. This part covers three tools ranging from monitoring to simple query execution. This book uses Oracle’s dedicated MySQL monitoring solution (requires commercial subscription but is also available as a trial) as an example of monitoring. Even if you are using other monitoring solutions, you are encouraged to study the examples as there will be a large overlap. These three tools are also used extensively in the remainder of the book. The three chapters in this part are

  • Chapter 10 – MySQL Enterprise Monitor
    Monitoring is one of the most important aspects of maintaining a stable and well-performing database. This chapter introduces MySQL Enterprise Monitor (MEM) and shows how you can install the trial and helps you navigate and use the graphical user interface (GUI).
  • Chapter 11 – MySQL Workbench
    MySQL provides a graphical user interface through the MySQL Workbench product. This chapter shows how you can install and use it. In this book, MySQL Workbench is particularly important for its ability to create diagrams – known as Visual Explain – representing the query execution plans.
  • Chapter 12 – MySQL Shell
    One of the newest tools around from Oracle for MySQL is MySQL Shell which is a second-generation commandline client with support for executing code in both SQL, Python, and JavaScript. This chapter gets you up to speed with MySQL Shell and teaches you about its support for using external code modules, its reporting infrastructure, and how to create custom modules, reports, and plugins.

Part IV: Schema Considerations and the Query Optimizer

In Part IV, there is a change of pace, and the focus moves to the topics more directly related to performance tuning starting with topics related to the schema, the query optimizer, and locks. The six chapters are

  • Chapter 13 – Data Types
    In relational databases, each column has a data type. This data type defines which values can be stored, which rules apply when comparing two values, how the data is stored, and more. This chapter covers the data types available in MySQL and gives guidance on how to decide which data types to use.
  • Chapter 14 – Indexes
    An index is used to locate data, and a good indexing strategy can greatly improve the performance of your queries. This chapter covers the index concepts, considerations about indexes, index types, index features, and more. It also includes a discussion on how InnoDB uses indexes and how to come up with an indexing strategy.
  • Chapter 15 – Index Statistics
    When the optimizer needs to determine how useful an index is and how many rows match a condition on an indexed value, it needs information on the data in the index. This information is index statistics. This chapter covers how index statistics work in MySQL, how to configure them, monitoring, and updating the index statistics.
  • Chapter 16 – Histograms
    If you want the optimizer to know how frequent a value occurs for a given column, you need to create a histogram. This is a new feature in MySQL 8, and this chapter covers how histograms can be used, their internals, and how to query the histogram metadata and statistics.
  • Chapter 17 – The Query Optimizer
    When you execute a query, it is the query optimizer that determines how to execute it. This chapter covers the tasks performed by the optimizer, join algorithms, join optimizations, configuration of the optimizer, and resource groups.
  • Chapter 18 – Locking Theory and Monitoring
    One of the problems that can cause the most frustration is lock contention. The first part of this chapter explains why locks are needed, lock access levels, and lock types (granularities). The second part of the chapter goes into what happens when a lock cannot be obtained, how to reduce lock contention, and where to find information about locks.

Part V: Query Analysis

With the information from Part IV, you are now ready to start analyzing queries. This includes finding the queries for further analysis and then analyzing the query using EXPLAIN or the Performance Schema. You also need to consider how transactions work and investigate lock contention when you have two or more queries fighting for the same locks. The four chapters are

  • Chapter 19 – Finding Candidate Queries for Optimization
    Whether part of the daily maintenance or during an emergency, you need to find the queries that you need to analyze and potentially optimize. This chapter shows how you can use the Performance Schema, the sys schema, MySQL Workbench, your monitoring solution, and the slow query log to find the queries that are worth looking into.
  • Chapter 20 – Analyzing Queries
    Once you have a candidate query, you need to analyze why it is slow or impacts the system too much. The main tool is the EXPLAIN statement which provides information about the query plan chosen by the optimizer. How to generate and read – including examples – the query plans using EXPLAIN is the main focus of the chapter. You can also use the optimizer trace to get more information on how the optimizer arrived at the selected query plan. An alternative way to analyze queries is to use the Performance Schema and sys schema to break queries down into smaller parts.
  • Chapter 21 – Transactions
    InnoDB executes everything as a transaction, and transactions is an important concept. Proper use of transactions ensures atomicity, consistency, and isolation. However, transactions can also be the cause of severe performance and lock problems. This chapter discusses how transactions can become a problem and how to analyze them.
  • Chapter 22 – Diagnosing Lock Contention
    This chapter goes through four scenarios with lock contention (flush locks, metadata locks, record-level locks, and deadlocks) and discusses the symptoms, the cause, how to set up the scenario, the investigation, the solution, and how to prevent problems.

Part VI: Improving Queries

You have found your problem queries and analyzed them and their transaction to understand why they are underperforming. But how do you improve the queries? This chapter goes through the most important configuration options not covered elsewhere, how to change the query plan, schema changes and bulk loading, replication, and caching as means to improve the performance. The five chapters are

  • Chapter 23 – Configuration
    MySQL requires resources when executing a query. This chapter covers the best practices for configuring these resources and the most important configuration options that are not covered in other discussions. There is also an overview of the data lifecycle in InnoDB as background for the discussion of configuring InnoDB.
  • Chapter 24 – Change the Query Plan
    While the optimizer usually does a good job at finding the optimal query execution plan, you will from time to time have to help it on its way. It may be that you end up with full table scans because no indexes exist or the existing indexes cannot be used. You may also wish to improve the index usage, or you may need to rewrite complex conditions or entire queries. This chapter covers these scenarios as well as shows how you can use the SKIP LOCKED clause to implement a queue system.
  • Chapter 25 – DDL and Bulk Data Load
    When you perform schema changes or load large data sets into the system, you ask MySQL to perform a large amount of work. This chapter discusses how you can improve the performance of such tasks including using the parallel data load feature of MySQL Shell. There is also a section on general data load considerations which also applies to data modifications in general and shows the difference between sequential and random order inserts. That discussion is followed by considerations on what this means for the choice of primary key.
  • Chapter 26 – Replication
    The ability to replicate between instances is a popular feature in MySQL. From a performance point of view, replication has two sides: you need to ensure replication performs well, and you can use replication to improve performance. This chapter discusses both sides of the coin including covering the Performance Schema tables that can be used to monitor replication.
  • Chapter 27 – Caching
    One way to improve the performance of queries is to not execute them at all, or at least avoid executing part of the query. This chapter discusses how you can use caching tables to reduce the complexity of queries and how you can use Memcached, the MySQL InnoDB Memcached plugin, and ProxySQL to avoid executing the queries altogether.

I hope you will enjoy the book.

Back Up MySQL View Definitions

If you want to back up your table and views, stored procedures, or stored function definitions, you can use mysqldump or mysqlpump to export the schema without the data. However, if you just want the views you need to look for another option. This blog shows how MySQL Shell comes to the rescue.

Backup the view definition using MySQL Shell
Backup the view definition using MySQL Shell

There are a couple of approaches to get the view definitions. One option is to consider the information_schema.VIEWS view which has the following columns:

mysql> SELECT COLUMN_NAME AS Field, COLUMN_TYPE AS Type
         FROM information_schema.COLUMNS
        WHERE TABLE_SCHEMA = 'information_schema'
              AND TABLE_NAME = 'VIEWS'
        ORDER BY ORDINAL_POSITION;
+----------------------+---------------------------------+
| Field                | Type                            |
+----------------------+---------------------------------+
| TABLE_CATALOG        | varchar(64)                     |
| TABLE_SCHEMA         | varchar(64)                     |
| TABLE_NAME           | varchar(64)                     |
| VIEW_DEFINITION      | longtext                        |
| CHECK_OPTION         | enum('NONE','LOCAL','CASCADED') |
| IS_UPDATABLE         | enum('NO','YES')                |
| DEFINER              | varchar(288)                    |
| SECURITY_TYPE        | varchar(7)                      |
| CHARACTER_SET_CLIENT | varchar(64)                     |
| COLLATION_CONNECTION | varchar(64)                     |
+----------------------+---------------------------------+
10 rows in set (0.0011 sec)

This looks good, but there are two flaws. First of all, the algorithm of the view is not included among the information. Granted, most view definitions do not explicitly define the algorithm, but from time to time it is important. The other limitation is not visible from the column list but becomes clear if you look at an example of a view:

mysql> SELECT *
         FROM information_schema.VIEWS
        ORDER BY LENGTH(VIEW_DEFINITION) LIMIT 1\G
*************************** 1. row ***************************
       TABLE_CATALOG: def
        TABLE_SCHEMA: sys
          TABLE_NAME: version
     VIEW_DEFINITION: select '2.1.1' AS `sys_version`,version() AS `mysql_version`
        CHECK_OPTION: NONE
        IS_UPDATABLE: NO
             DEFINER: mysql.sys@localhost
       SECURITY_TYPE: INVOKER
CHARACTER_SET_CLIENT: utf8mb4
COLLATION_CONNECTION: utf8mb4_0900_ai_ci
1 row in set (0.0017 sec)

This just selects the information for the view with the shortest view definition as the definition is not important, so there is no reason to include more data than necessary. You may have a different view returned.

The important point in the output is the value of DEFINER. You may have to quote the username or hostname, but that is not simple to do because the full account name is listed.

An alternative is to export the view definition using the SHOW CREATE VIEW statement. For example for the sakila.staff_list view:

mysql> SHOW CREATE VIEW sakila.staff_list\G
*************************** 1. row ***************************
                View: staff_list
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `sakila`.`staff_list` AS select `s`.`staff_id` AS `ID`,concat(`s`.`first_name`,_utf8mb3' ',`s`.`last_name`) AS `name`,`a`.`address` AS `address`,`a`.`postal_code` AS `zip code`,`a`.`phone` AS `phone`,`sakila`.`city`.`city` AS `city`,`sakila`.`country`.`country` AS `country`,`s`.`store_id` AS `SID` from (((`sakila`.`staff` `s` join `sakila`.`address` `a` on((`s`.`address_id` = `a`.`address_id`))) join `sakila`.`city` on((`a`.`city_id` = `sakila`.`city`.`city_id`))) join `sakila`.`country` on((`sakila`.`city`.`country_id` = `sakila`.`country`.`country_id`)))
character_set_client: utf8mb4
collation_connection: utf8mb4_0900_ai_ci
1 row in set, 1 warning (0.0013 sec)

This has all the required information with the username and hostname properly quoted. It is fine to use SHOW CREATE VIEW like this for a few views, but it is not practical to back up all view definitions and automatically pick up new definitions. This is where the scripting modes of MySQL Shell are useful.

This example uses Python, but you can also choose to implement a solution in JavaScript. In MySQL 8 you can use the X DevApi to easily query the information_schema.VIEWS view and add filters as required. An example of exporting all views except those in the system databases (mysql, information_schema, sys, and performance_schema) is:

\py

i_s = session.get_schema('information_schema')
views = i_s.get_table('VIEWS')
stmt = views.select('TABLE_SCHEMA', 'TABLE_NAME')
stmt = stmt.where("TABLE_SCHEMA NOT IN " +
       "('mysql', 'information_schema', 'sys', 'performance_schema')")

result = stmt.execute()
for view in result.fetch_all():
    sql = 'SHOW CREATE VIEW `{0}`.`{1}`'.format(*view)
    v_result = session.sql(sql).execute()
    v_def = v_result.fetch_one()
    print('DROP TABLE IF EXISTS `{0}`.`{1}`;'.format(*view))
    print('DROP VIEW IF EXISTS `{0}`.`{1}`;'.format(*view))
    print(v_def[1] + ';')
    print('')

You need an empty line after the stmt = stmt.where(...) statement and at the end to tell MySQL Shell that you have completed multi-line statements. The example assume that you already have a connection to MySQL.

First the schema object for information_schema schema and table object for the VIEWS view are fetched. Then a select statement is created with a WHERE clause specifying which schemas that we want the view definitions for. Change this as required. You can chain the two stmt and the result assignments to a single line (in the above example it was split out to improve the readability in the blog):

result = views.select('TABLE_SCHEMA', 'TABLE_NAME').where("TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'sys', 'performance_schema')").execute()

The result object can be used to loop over the views and execute SHOW CREATE VIEW for each view. In this example, a DROP TABLE and DROP VIEW are added, but that can optionally be removed. Then the second column of the SHOW CREATE VIEW output is printed.

Note that in the example, when the SHOW CREATE VIEW statement is put together, the schema and table names are quoted using backticks:

    sql = 'SHOW CREATE VIEW `{0}`.`{1}`'.format(*view)

For this to be valid, it assumes you have no view names with backticks in the name (if you have – please don't! – you need to escape the backticks by duplicating it). If you have the ANSI_QUOTES SQL mode enabled, you should change the backticks with double quotes.

You can also use the character and collation information from information_schema.VIEWS view to set the client character set and collation like mysqldump does. This is left as an exercise for the reader.

MySQL Server 8.0.18: Thanks for the Contributions

In my blog series about external contributions to MySQL 8 we have reached version 8.0.18 which was released Monday 14 October 2019. Again the community has contributed to make MySQL better. Thank you.

The contributions to MySQL 8.0.18 includes several patches from Facebook as well as patches from Gillian Gunson, Przemysław Skibiński (Percona), Daniel Black, and Satya Bodapati (also Percona). The contributions are:

  • MySQL now provides more control over the use of compression to minimize the number of bytes sent over connections to the server. Previously, a given connection was either uncompressed or used the zlib compression algorithm. Now, it is also possible to use the zstd algorithm (zstd 1.3), and to select a compression level for zstd connections. The permitted compression algorithms can be configured on the server side, as well as on the connection-origination side for connections by client programs and by servers participating in master/slave replication or Group Replication. For more information, see Connection Compression Control.

    Connection compression using the zstd algorithm requires that the server be built with zstd library support. The new WITH_ZSTD CMake option indicates whether to use the bundled or system zstd library.

    Legacy compression-control parameters, such as the --compress client option, are deprecated and will be removed in a future MySQL version.

    Thanks to Facebook for a contribution on which some of this work was based.
  • The sys.schema_unused_indexes view now filters out unique indexes. Thanks to Gillian Gunson for the contribution. (Bug #24798995, Bug #83257)
  • The new innodb_idle_flush_pct variable permits placing a limit on page flushing during idle periods, which can help extend the life of solid state storage devices. See Limiting Buffer Flushing During Idle Periods.

    Thanks to Facebook for the contribution. (Bug #27147088, Bug #88566)
  • Replication: The heartbeat interval for a replication slave, which is controlled by the MASTER_HEARTBEAT_PERIOD option of the CHANGE MASTER TO statement, can be specified with a resolution in milliseconds. Previously, the master's binary log dump thread used a granularity of seconds to calculate whether a heartbeat signal should be sent to the slave, causing excessive heartbeat activity in the case of multiple skipped events. To remove this issue, all heartbeat-related calculations by the master and slave are now carried out using a granularity of nanoseconds for precision. Thanks to Facebook for the contribution. (Bug #29363787, Bug #94356)
  • When generating C source from SQL scripts, Some utf8-encoded characters were split across lines. Thanks to Przemysław Skibiński from Percona for the patch. (Bug #30152555, Bug #96449)
  • With strict SQL mode enabled, the STR_TO_DATE() function did not properly handle values with time parts only. Thanks to Daniel Black for the contribution. (Bug #18090591, Bug #71386)
  • InnoDB: A long running ALTER TABLE … ADD INDEX operation with concurrent inserts caused semaphore waits. Thanks to Satya Bodapati from Percona for the patch. (Bug #29008298)

If you have patches you would like to contribute you can do so from MySQL's GitHub repository (requires signing the Oracle Contributor Agreement).

Thank you for the contributions.

Create MySQL Test Instance with Oracle Cloud Free Tier

Oracle announced this week at Oracle OpenWorld that it has introduced a new cloud offer called Oracle Cloud Free Tier. As the name suggest, it allows you to run a few limited instances in Oracle Cloud for free. I will in this blog show how you can use the free tier to setup a MySQL test instance.

Tip

If you want to read more about Oracle Cloud Free Tier see https://www.oracle.com/cloud/free/ and the FAQ at https://www.oracle.com/cloud/free/faq.html.

The first step is to sign up for the cloud service which you do by opening https://www.oracle.com/cloud/free/ and click on the Start for free button near the top of the page:

Click on Start for free to get started

This will take you through a series of pages where you create your Oracle Cloud account. The steps are straight forward. You will need to provide a valid mobile number and credit card (but no fees are charges provided you stick to the always free services). At the end you are directed to the Oracle Cloud login page:

Oracle Cloud Login Screen
Oracle Cloud Login Screen

Enter the email address and password from the registration process, and you are ready to use create Oracle Cloud services. You will need a compute instance which you create by choosing the Compute Create a VM Instance quick action:

The Oracle Cloud front page after logging in the first time

Notice how there is a label Always Free Eligible which tells you can create instances in the free tier.

On the next screen, you can fill in the details for the instance. You can choose all the default values which will create a VM.Standard.E2.1.Micro virtual machine which is one of the shapes that are included in the free tier. The shape includes 1 OCPU (1 CPU with hyperthreading, so two virtual CPUs) and 1GiB of memory. It will also set everything up for you including the network with ssh access.

To be able to ssh to the instance, you need to add the public ssh key for your ssh key pair. If you do not already have an ssh key, then https://docs.oracle.com/en/cloud/paas/event-hub-cloud/admin-guide/generate-ssh-key-pair-using-puttygen.html has an example of creating one on Microsoft Windows.

Once you click create, a workflow is created and launched. While the workflow is running, the icon in the top left corner is yellow/orange to indicate that the instance is being worked on. Once the workflow has completed, the instance is available and the icon turns green.

You will need the Public IP Address which you can find in the Primary VNIC Information section when viewing the instance details:

With that and your ssh key, you can connect to the instance using the opc user, for example (this assumes you have the private key in OpenSSH format):

shell$ ssh -i ~/.ssh/id_rsa opc@<ip address of vm>

The first step to install MySQL is to install the MySQL yum repository:

[opc@mysql ~]$ sudo yum install https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpmLoaded plugins: langpacks, ulninfo                                                                  
mysql80-community-release-el7-3.noarch.rpm                                    |  25 kB  00:00:00     
Examining /var/tmp/yum-root-4Yk8Ev/mysql80-community-release-el7-3.noarch.rpm: mysql80-community-release-el7-3.noarch                                                                                     
Marking /var/tmp/yum-root-4Yk8Ev/mysql80-community-release-el7-3.noarch.rpm to be installed          
Resolving Dependencies                                                                               
--> Running transaction check                                                                        
---> Package mysql80-community-release.noarch 0:el7-3 will be installed                              
--> Finished Dependency Resolution

Dependencies Resolved

=====================================================================================================
 Package                      Arch      Version     Repository                                  Size
=====================================================================================================
Installing:
 mysql80-community-release    noarch    el7-3       /mysql80-community-release-el7-3.noarch     31 k

Transaction Summary
=====================================================================================================
Install  1 Package

Total size: 31 k
Installed size: 31 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql80-community-release-el7-3.noarch                                            1/1
  Verifying  : mysql80-community-release-el7-3.noarch                                            1/1

Installed:
  mysql80-community-release.noarch 0:el7-3

Complete!

Now you can install any of the MySQL products using the yum command. For example, to install MySQL Server, the MySQL client programs, and MySQL Shell:

[opc@mysql ~]$ sudo yum install mysql-community-server mysql-community-client mysql-shell
Loaded plugins: langpacks, ulninfo                                                       
Resolving Dependencies                                                                   
--> Running transaction check                                                            
---> Package mysql-community-client.x86_64 0:8.0.17-1.el7 will be installed              
--> Processing Dependency: mysql-community-libs(x86-64) >= 8.0.11 for package: mysql-community-client-8.0.17-1.el7.x86_64                                                                                 
---> Package mysql-community-server.x86_64 0:8.0.17-1.el7 will be installed                          
--> Processing Dependency: mysql-community-common(x86-64) = 8.0.17-1.el7 for package: mysql-community-server-8.0.17-1.el7.x86_64                                                                          
---> Package mysql-shell.x86_64 0:8.0.17-1.el7 will be installed                                     
--> Running transaction check                                                                        
---> Package mariadb-libs.x86_64 1:5.5.64-1.el7 will be obsoleted                                    
--> Processing Dependency: libmysqlclient.so.18()(64bit) for package: 2:postfix-2.10.1-7.el7.x86_64  
--> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: 2:postfix-2.10.1-7.el7.x86_64                                                                                      
---> Package mysql-community-common.x86_64 0:8.0.17-1.el7 will be installed                          
---> Package mysql-community-libs.x86_64 0:8.0.17-1.el7 will be obsoleting                           
--> Running transaction check                                                                        
---> Package mysql-community-libs-compat.x86_64 0:8.0.17-1.el7 will be obsoleting                    
--> Finished Dependency Resolution                                                                   

Dependencies Resolved

=====================================================================================================
 Package                           Arch         Version            Repository                   Size 
=====================================================================================================
Installing:                                                                                          
 mysql-community-client            x86_64       8.0.17-1.el7       mysql80-community            32 M 
 mysql-community-libs              x86_64       8.0.17-1.el7       mysql80-community           3.0 M 
     replacing  mariadb-libs.x86_64 1:5.5.64-1.el7                                                   
 mysql-community-libs-compat       x86_64       8.0.17-1.el7       mysql80-community           2.1 M 
     replacing  mariadb-libs.x86_64 1:5.5.64-1.el7                                                   
 mysql-community-server            x86_64       8.0.17-1.el7       mysql80-community           415 M 
 mysql-shell                       x86_64       8.0.17-1.el7       mysql-tools-community        15 M 
Installing for dependencies:                                                                         
 mysql-community-common            x86_64       8.0.17-1.el7       mysql80-community           589 k 

Transaction Summary
=====================================================================================================
Install  5 Packages (+1 Dependent package)                                                           

Total download size: 468 M
Is this ok [y/d/N]: y     
Downloading packages:     
warning: /var/cache/yum/x86_64/7Server/mysql80-community/packages/mysql-community-common-8.0.17-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY                                    
Public key for mysql-community-common-8.0.17-1.el7.x86_64.rpm is not installed                       
(1/6): mysql-community-common-8.0.17-1.el7.x86_64.rpm                         | 589 kB  00:00:00     
(2/6): mysql-community-libs-8.0.17-1.el7.x86_64.rpm                           | 3.0 MB  00:00:01     
(3/6): mysql-community-libs-compat-8.0.17-1.el7.x86_64.rpm                    | 2.1 MB  00:00:00     
Public key for mysql-shell-8.0.17-1.el7.x86_64.rpm is not installed] 5.0 MB/s |  44 MB  00:01:25 ETA 
(4/6): mysql-shell-8.0.17-1.el7.x86_64.rpm                                    |  15 MB  00:00:06     
(5/6): mysql-community-client-8.0.17-1.el7.x86_64.rpm                         |  32 MB  00:00:13     
(6/6): mysql-community-server-8.0.17-1.el7.x86_64.rpm                         | 415 MB  00:01:29     
-----------------------------------------------------------------------------------------------------
Total                                                                5.1 MB/s | 468 MB  00:01:31     
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql                                        
Importing GPG key 0x5072E1F5:                                                                        
 Userid     : "MySQL Release Engineering <mysql-build@oss.oracle.com>"                               
 Fingerprint: a4a9 4068 76fc bd3c 4567 70c8 8c71 8d3b 5072 e1f5                                      
 Package    : mysql80-community-release-el7-3.noarch (@/mysql80-community-release-el7-3.noarch)      
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-mysql                                                     
Is this ok [y/N]: y                                                                                  
Running transaction check                                                                            
Running transaction test                                                                             
Transaction test succeeded
Running transaction
  Installing : mysql-community-common-8.0.17-1.el7.x86_64                                        1/7
  Installing : mysql-community-libs-8.0.17-1.el7.x86_64                                          2/7
  Installing : mysql-community-client-8.0.17-1.el7.x86_64                                        3/7
  Installing : mysql-community-server-8.0.17-1.el7.x86_64                                        4/7
  Installing : mysql-community-libs-compat-8.0.17-1.el7.x86_64                                   5/7
  Installing : mysql-shell-8.0.17-1.el7.x86_64                                                   6/7
  Erasing    : 1:mariadb-libs-5.5.64-1.el7.x86_64                                                7/7
  Verifying  : mysql-community-libs-8.0.17-1.el7.x86_64                                          1/7
  Verifying  : mysql-community-server-8.0.17-1.el7.x86_64                                        2/7
  Verifying  : mysql-community-common-8.0.17-1.el7.x86_64                                        3/7
  Verifying  : mysql-community-client-8.0.17-1.el7.x86_64                                        4/7
  Verifying  : mysql-shell-8.0.17-1.el7.x86_64                                                   5/7
  Verifying  : mysql-community-libs-compat-8.0.17-1.el7.x86_64                                   6/7
  Verifying  : 1:mariadb-libs-5.5.64-1.el7.x86_64                                                7/7

Installed:
  mysql-community-client.x86_64 0:8.0.17-1.el7        mysql-community-libs.x86_64 0:8.0.17-1.el7
  mysql-community-libs-compat.x86_64 0:8.0.17-1.el7   mysql-community-server.x86_64 0:8.0.17-1.el7
  mysql-shell.x86_64 0:8.0.17-1.el7

Dependency Installed:
  mysql-community-common.x86_64 0:8.0.17-1.el7

Replaced:
  mariadb-libs.x86_64 1:5.5.64-1.el7

Complete!

There are some dependencies that are pulled in and existing libraries are upgraded.

That is it. All that remains is to start MySQL and set the root password. You start MySQL through systemd like:

[opc@mysql ~]$ sudo systemctl start mysqld

Since it is the first time MySQL is started, the data directory (/var/lib/mysql) is initialized and the root account is created with a random password. You can find the random password in the error log:

[opc@mysql ~]$ sudo grep password /var/log/mysqld.log
2019-09-18T09:59:55.552745Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: i(Y5Nkko>S.t

The password you see will of course be different. Use the password the first time you authenticate, then use the ALTER USER statement to set the new password. For example, using MySQL Shell:

[opc@mysql ~]$ mysqlsh --user=root --sql
Please provide the password for 'root@localhost': ************
Save password for 'root@localhost'? [Y]es/[N]o/Ne[v]er (default No): No
MySQL Shell 8.0.17

Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'root@localhost'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 10 (X protocol)
Server version: 8.0.17 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  localhost:33060+ ssl  SQL > ALTER USER CURRENT_USER() IDENTIFIED BY 'New$secureP@ssw0rd';
Query OK, 0 rows affected (0.0061 sec)

Information

The password validation component is installed by default when installing MySQL using RPMs. This means that the password must be at least eight characters long and include at least one lower case, one upper case, one digit, and one special character.

You are now ready to use MySQL. Have fun.

Tip

Since the free compute instance only has one gigabyte of memory, you may want to reduce the memory footprint of MySQL. Lefred has a blog with some suggestions at https://lefred.be/content/mysql-8-0-memory-consumption-on-small-devices/.

MySQL Server 8.0.17: Thanks for the Contributions

MySQL 8.0.17 was released Monday and it includes great features such as the Clone feature and multi-valued indexes. There are also several nice contributions from the community. These are the changes that this blog is about.

The contributions to MySQL Server 8.0.17 include patches from Facebook, Daniël van Eeden, Mattias Jonsson, and Simon Mudd (all from Booking.com), Daniel Black, Yibo Cai (from Arm Technology), Josh Braden, and Zhou Mengkang. The larger contributions are:

  • The mysql client program now sends os_user and os_sudouser connection attributes, when available, to indicate the name of the operating system user running the program and the value of the SUDO_USER environment variable, respectively. For general information about connection attributes, see Performance Schema Connection Attribute Tables. Thanks to Daniël van Eeden for the contribution on which this feature was based. (Bug #29210935, Bug #93916)
  • The mysqldump option –set-gtid-purged controls whether or not a SET @@GLOBAL.gtid_purged statement is added to the mysqldump output. The statement updates the value of gtid_purged on a server where the dump file is reloaded, to add the GTID set from the source server's gtid_executed system variable. A new choice –set-gtid-purged=COMMENTED is now available. When this value is set, if GTIDs are enabled on the server you are backing up, SET @@GLOBAL.gtid_purged is added to the output (unless gtid_executed is empty), but it is commented out. This means that the value of gtid_executed is available in the output, but no action is taken automatically when the dump file is reloaded. With COMMENTED, you can control the use of the gtid_executed set manually or through automation. For example, you might prefer to do this if you are migrating data to another server that already has different active databases. Thanks to Facebook for this contribution. (Bug #94332, Bug #29357665)
  • MySQL now uses open(O_TMPFILE) whenever applicable when creating a temporary file that is immediately unlinked. This is more efficient than previously and avoids the small possibility of a race condition. Thanks to Daniel Black for the contribution. (Bug #29215177, Bug #93937)
  • When clients were terminated for inactivity exceeding the wait_timeout value, the message written to the error log was unclear. Now ER_NET_WAIT_ERROR is written, which is more specific about the cause of the problem. Thanks to Mattias Jonsson for the contribution. (Bug #28940167, Bug #93240)
  • InnoDB: Insufficient memory barriers in the rw-lock implementation caused deadlocks on ARM. Thanks to Yibo Cai from Arm Technology for the contribution. (Bug #29508001, Bug #94699)
  • Replication: When events generated by one MySQL server instance were written to the binary log of another instance, the second server implicitly assumed that the first server supported the same number of binary log event types as itself. Where this was not the case, the event header was handled incorrectly. The issue has now been fixed. Thanks to Facebook for the contribution. (Bug #29417234)
  • Replication: When binary logging is enabled on a replication slave, the combination of the –replicate-same-server-id and –log-slave-updates options on the slave can cause infinite loops in replication if the server is part of a circular replication topology. (In MySQL 8.0, binary logging is enabled by default, and slave update logging is the default when binary logging is enabled.) However, the use of global transaction identifiers (GTIDs) prevents this situation by skipping the execution of transactions that have already been applied. The restriction on this combination of options has therefore now been removed when gtid_mode=ON is set. With any other GTID mode, the server still does not start with this combination of options. As a safeguard against creating the problem situation after the server has started, you now cannot change the GTID mode to anything other than ON on a running server that has this combination of options set. Thanks to Facebook for the contribution. (Bug #28782370, Bug #92754)
  • Replication: When a MEMORY table is implicitly deleted on a master following a server restart, the master writes a DELETE statement to the binary log so that slaves also empty the table. This generated event now includes a comment in the binary log so that the reason for the DELETE statement is easy to identify. Thanks to Daniël van Eeden for the contribution. (Bug #29157796, Bug #93771)

There are also a number of smaller patches that has helped improve the comments and messages in the MySQL source code. These are:

  • Bug 29403708 – CONTRIBUTION: FIX TYPO IN AUTHENTICATION METHODS DOCUMENTATION
    Thanks to Daniël van Eeden.
  • Bug 29428435 – CONTRIBUTION: FIX TYPOS IN MYSQLDUMP.CC
    Thanks to Josh Braden.
  • Bug 29262200 – CONTRIBUTION: FIX TYPOS IN COMMENTS FOR COM_XXX COMMANDS
    Thanks to Simon Mudd.
  • Bug 29468128 – CONTRIBUTION: UPDATE HANDLER.CC
    Thanks to Zhou Mengkang.

Thank you for your contributions. Feel free to keep submitting ideas to the MySQL bugs database with ideas how to improve MySQL.

Connector/Python C Extension Prepared Statement Support

MySQL Connector/Python 8 made the C Extension the default for the platform/Python version combinations supporting it. One thing that was missing from the C Extension implementation (unless you used the _mysql_connector module) was support for prepared statements. That has been taken care of with the release of version 8.0.17.

The two main advantages of using prepared statements are security and performance. The security comes in as you can pass query parameters and have them applied server-side, so you are sure they are quoted and escaped correctly taking the data type into consideration. The performance benefit happens, when you execute the same query (except for the parameters) several times as MySQL will prepare it only for the first execution and then reuse the prepared statement – that is where the name comes from.

You use the prepared statements with the C Extension in the same way as for the pure Python implementation – by setting the prepared argument to True when creating a cursor. The simplest way to explain is to show an example.

import mysql.connector

connect_args = {
    "user": "root",
    "host": "localhost",
    "port": 3306,
    "password": "password",
    "use_pure": False,
}

db = mysql.connector.connect(**connect_args)
cursor = db.cursor(prepared=True)
print(cursor)
print("")

sql = "SELECT * FROM world.city WHERE ID = %s"
city_ids = [130, 456, 3805]

print("  ID  Name            Country  District         Popluation")
print("-" * 58)
fmt = "{0:4d}  {1:14s}  {2:^7s}  {3:15s}  {4:10d}"
for city_id in city_ids:
    cursor.execute(sql, (city_id,))
    city = cursor.fetchone()
    print(fmt.format(*city))

cursor.close()
db.close()

In the connection arguments, use_pure is set to False. Since that is the default, it is not needed, but it has been added here to make it explicit that the C Extension is used.

Avoid

Do not hardcode the connection arguments in your programs. It is done here to keep the example simple, but it is both insecure and inflexible to do in real programs.

When the cursor is created in line 12, prepared is set to True making it a prepared statement cursor. To verify that, the cursor is printed in the next line.

You create the statement by adding the string %s as a placeholder where you want to add the parameters to the query. You can then keep executing the query. In the example, the query is executed for three different IDs. (Yes, for this example, all three cities could have been fetched in one query, but imagine this query is used as part of a larger application where the three cities are not required at the same time. This could for example be for three independent user requests.) The parameter is provided as a tuple to the execute() method of the cursor. The output of the program is:

CMySQLCursorPrepared: (Nothing executed yet)

  ID  Name            Country  District         Popluation
----------------------------------------------------------
 130  Sydney            AUS    New South Wales     3276207
 456  London            GBR    England             7285000
3805  San Francisco     USA    California           776733

Notice that the cursor uses the class CMySQLCursorPrepared, which is the prepared statement cursor class for the C Extension.

MySQL Connector/Python Revealed

Book

If you want to learn more about MySQL Connector/Python, then I have written MySQL Connector/Python Revealed published by Apress. The book both covers the traditional Python Database API (PEP 249) and the X DevAPI which is new as of MySQL 8.

The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Have fun coding.