Performance Schema Functions

The sys schema was created to make it easier to use the Performance Schema. This included several functions, for example to convert the picoseconds used by the Performance Schema into human readable strings. In MySQL 8.0.16, three of these functions have been implemented as native functions in MySQL Server.

SELECT FORMAT_BYTES(1073741824)

Why do away with the sys schema functions? There are two reasons: performance and ease of use. The native functions are written in C++ like the rest of the server whereas the sys schema functions were written as stored functions. Function written in C++ are inherently faster than stored functions. Additionally, that the functions are native means you no longer need to prefix them with sys. to tell MySQL where to find the functions.

The table below shows the mapping from the sys schema functions to the new native functions.

sys SchemaNativeDescription
format_bytes()FORMAT_BYTES()Scale bytes, for example 1024 bytes to 1 kiB.
format_time()FORMAT_PICO_TIME()Convert picoseconds to human readable strings.
ps_thread_id()PS_THREAD_ID()
PS_CURRENT_THREAD_ID()
Find the Performance Schema thread ID belonging to a connection. The PS_CURRENT_THREAD_ID() function does not take any arguments and returns – as the name suggest, the thread ID for the connection itself.

A simple example using the new functions is:

mysql> SELECT PS_THREAD_ID(CONNECTION_ID()),
              PS_CURRENT_THREAD_ID(),
              TIMER_WAIT,
              FORMAT_PICO_TIME(TIMER_WAIT),
              SUM(CURRENT_NUMBER_OF_BYTES_USED),
              FORMAT_BYTES(SUM(CURRENT_NUMBER_OF_BYTES_USED))
         FROM events_statements_current
              INNER JOIN memory_summary_by_thread_by_event_name
                    USING (THREAD_ID)
        WHERE THREAD_ID = PS_CURRENT_THREAD_ID()
       GROUP BY THREAD_ID, EVENT_ID\G
*************************** 1. row ***********************
                  PS_THREAD_ID(CONNECTION_ID()): 59
                         PS_CURRENT_THREAD_ID(): 59
                                     TIMER_WAIT: 346074677
                   FORMAT_PICO_TIME(TIMER_WAIT): 346.07 us
              SUM(CURRENT_NUMBER_OF_BYTES_USED): 1423655
FORMAT_BYTES(SUM(CURRENT_NUMBER_OF_BYTES_USED)): 1.36 MiB
1 row in set (0.0011 sec)

The example requires the performance_schema to be the default schema.

As you can see, the PS_THREAD_ID() function with the result of the CONNECTION_ID() (returning the connection ID for the current connection) is the same as using PS_CURRENT_THREAD_ID(). You can of course use the function with any connection ID. If the connection ID does not exist, the function returns NULL.

The FORMAT_PICO_TIME() function returns 346.07 us (u is used instead of here instead of μ for microseconds) instead of 346074677 which is the raw value for TIMER_WAIT (the duration of the query at the time the data is retrieved). Similarly FORMAT_BYTES() converts the bytes to MiB.

If you use the sys.ps_thread_id() or sys.format_bytes() functions, then the first time for each connection, there will be a warning that the function name is the same as for a native function, for example:

mysql> SELECT sys.format_bytes(1423655);
+---------------------------+
| sys.format_bytes(1423655) |
+---------------------------+
| 1.36 MiB                  |
+---------------------------+
1 row in set, 1 warning (0.0008 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1585
Message: This function 'format_bytes' has the same name as a native function
1 row in set (0.0003 sec)

If you get these warnings from your own objects, you should update them to use the new native functions.

That’s it. Hope you like the new functions.

Connector/Python Connection Attributes

MySQL Server has since version 5.6 supported connection attributes for the clients. This has allowed a client to provide information such as which program or connector the client is, the client version, the license, etc. The database administrator can use this information for example to verify whether all clients have been upgraded, which client is executing a particular query, and so forth.

In MySQL 8.0.16 this feature has been included for the X DevAPI in the MySQL connectors as well, including MySQL Connector/Python which I will cover in this blog. First though, let’s take a look at how the attributes are exposed in MySQL Server.

Example of the built-in connection attributes provided by MySQL Connector/Python
The built-in MySQL Connector/Python connection attributes

Connection Attributes in MySQL Server

The connection attributes are made available in MySQL Server through two tables within the Performance Schema: session_account_connect_attrs and session_connect_attrs. The two tables have the same definition – the only difference is for which connections they show the connection attributes.

The session_account_connect_attrs table includes the attributes for connections using the same user account as for the one querying the table. This is useful if you want to grant permission for a user to check the attributes for their own connections but not for other connections.

On the other hand, session_connect_attrs shows the attributes for all connections. This is useful for the administrator to check the attributes for all users.

Information

It is up to the client what attributes they want to expose and the values they provide. In that sense, you can only trust the attributes to the same extend that you trust the clients to submit correct values.

The tables have four columns:

  • PROCESSLIST_ID: This is the same ID as in SHOW PROCESSLIST or the PROCESSLIST_ID column in performance_schema.threads.
  • ATTR_NAME: This is the name of the attribute, for example _client_name.
  • ATTR_VALUE: This is the value for the attribute, for example mysql-connector-python.
  • ORDINAL_POSITION: The attributes have an order. The ordinal position specifies the position of the attribute. The first attribute for the connection has ordinal position 0, the next 1, and so forth.

The PROCESSLIST_ID and ATTR_NAME together form the primary key of the tables.

Now that you know how to query the table, let’s take a look at how it works with MySQL Connector/Python.

Connector/Python Attributes

There are essentially three different ways to use the connection attributes from MySQL Connector/Python. You can have them disabled. This is the default and means that no attributes will be provided. You can enable them and use the default attributes. And finally, there is support for providing custom attributes. Let’s look at each of the two cases where attributes are enabled.

MySQL Connector/Python Revealed

Book

If you want to learn more about MySQL Connector/Python, then I have written MySQL Connector/Python Revealed published by Apress. The book both covers the traditional Python Database API (PEP 249) and the X DevAPI which is new as of MySQL 8.

The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Enabled with Default Attributes

If you want your Python program to connect to MySQL using the default attributes, all you need to do is to set the connection-attributes option to True. You can do that in one of two ways depending on how you set the connection arguments

If you use a dictionary of arguments, you add connection-attributes as a key with the value set to True:

import mysqlx

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": True,
};

db = mysqlx.get_session(**connect_args)
p_s = db.get_schema("performance_schema")
attr = p_s.get_table("session_account_connect_attrs")
stmt = attr.select()
stmt.where("PROCESSLIST_ID = CONNECTION_ID()")
stmt.order_by("ORDINAL_POSITION")
result = stmt.execute()

print("+------+-----------------+------------------------+---------+")
print("| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |")
print("+------+-----------------+------------------------+---------+")
fmt = "| {0:4d} | {1:<15s} | {2:<22s} | {3:7d} |"
for row in result.fetch_all():
    print(fmt.format(row[0], row[1], row[2], row[3]))
print("+------+-----------------+------------------------+---------+")

db.close()

The program creates the connection, then queries the performance_schema.session_account_connect_attrs table using the crud methods. Finally, the result is printed (note that the PROCESSLIST_ID and ORDINAL_POSITION columns have had their names shortened in the output to make the output less wide – the process list ID will differ in your output):

+------+-----------------+------------------------+---------+
| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |
+------+-----------------+------------------------+---------+
|   45 | _pid            | 19944                  |       0 |
|   45 | _platform       | x86_64                 |       1 |
|   45 | _os             | Windows-10.0.17134     |       2 |
|   45 | _source_host    | MY-LAPTOP              |       3 |
|   45 | _client_name    | mysql-connector-python |       4 |
|   45 | _client_version | 8.0.16                 |       5 |
|   45 | _client_license | GPL-2.0                |       6 |
+------+-----------------+------------------------+---------+

Notice that all of the attribute names start with an underscore. That means it is a built-in attribute. Attribute names starting with an underscore are reserved and can only be set by MySQL itself.

You can also connect using a URI, in that case the connection is made like the following example:

import mysqlx
import urllib

uri = "mysqlx://{0}:{1}@{2}:{3}/?connection-attributes=True".format(
    "pyuser",
    urllib.parse.quote('Py@pp4Demo', safe=''),
    "127.0.0.1",
    "33060"
)
db = mysqlx.get_session(uri)

The username, password, host, and port are added through the format() function to make the code less wide. The important thing here is the connection-attributes=True. You can also leave out =True as mentioning the connection-attributes option without any value is the same as enabling it.

What do you do, if you want to add your own customer attributes? That is supported as well.

Custom Attributes

This far the connection-attributes option has just been set to True. However, it also supports taking a list or dictionary as the argument. That can be used to set your own custom attributes.

The easiest way to understand this is to see an example:

import mysqlx

attributes = {
    "application": "my_test_app",
    "foo": "bar",
    "foobar": "",
}
connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": attributes,
};

db = mysqlx.get_session(**connect_args)
p_s = db.get_schema("performance_schema")
attr = p_s.get_table("session_account_connect_attrs")
stmt = attr.select()
stmt.where("PROCESSLIST_ID = CONNECTION_ID()")
stmt.order_by("ORDINAL_POSITION")
result = stmt.execute()

print("+------+-----------------+------------------------+---------+")
print("| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |")
print("+------+-----------------+------------------------+---------+")
fmt = "| {0:4d} | {1:<15s} | {2:<22s} | {3:7d} |"
for row in result.fetch_all():
    value = row[2] if row[2] is not None else ""
    print(fmt.format(row[0], row[1], value, row[3]))
print("+------+-----------------+------------------------+---------+")

db.close()

Notice in line 29 that it is checked whether the attribute value is None (NULL in SQL). When the attribute value is an empty string or no value like for the foobar attribute, it is returned as None in the result set.

Alternatively, you can specify the same three connection attributes using a list:

attributes = [
    "application=my_test_app",
    "foo=bar",
    "foobar",
]
connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": attributes,
};

You can also use an URI of course. You need to use the list syntax for that:

uri = "mysqlx://{0}:{1}@{2}:{3}/" \
    + "?connection-attributes=[application=my_test_app,foo=bar,foobar]".format(
    "pyuser",
    urllib.parse.quote('Py@pp4Demo', safe=''),
    "127.0.0.1",
    "33060",
)

These all give the same result (the process list ID will differ):

+------+-----------------+------------------------+---------+
| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |
+------+-----------------+------------------------+---------+
|   74 | _pid            | 20704                  |       0 |
|   74 | _platform       | x86_64                 |       1 |
|   74 | _os             | Windows-10.0.17134     |       2 |
|   74 | _source_host    | MY-LAPTOP              |       3 |
|   74 | _client_name    | mysql-connector-python |       4 |
|   74 | _client_version | 8.0.16                 |       5 |
|   74 | _client_license | GPL-2.0                |       6 |
|   74 | application     | my_test_app            |       7 |
|   74 | foo             | bar                    |       8 |
|   74 | foobar          |                        |       9 |
+------+-----------------+------------------------+---------+

Notice that the built-in attributes are still included and the custom attributes have been added at the end.

That concludes this introduction to connection attributes with the MySQL Connector/Python X DevAPI. I will recommend to enable them by default even if just for the built-in attributes. The attributes can be very handy when debugging issues on the server.

MySQL Server 8.0.16: Thanks for the Contributions

MySQL 8.0.16 was released last week and includes many great features including support for CHECK constraints and upgrades without the need of mysql_upgrade. As usual there are also several contributions from the community. These are the ones, I would like to highlight in this blog to say “thank you for the contributions”.

Facebook has contributed with another two patches and so has Daniel Black. Additionally, there are patches from Yuhui Wang, Wei Zhao, Yan Huang, and Dirkjan Bussink. The contributions are:

  • mysql-test-run.pl now supports the MTR_UNIQUE_IDS_DIR environment variable, which may be set to specify a unique-IDs directory to be used as the common location for all chroot environments by multiple simultaneous mysql-test-run.pl instances. This enables those instances to avoid conflicts when reserving port numbers. Thanks to Facebook for the contribution. (Bug #29221085, Bug #93950)
  • Changes to the all_persisted_variables test. Thanks to the Facebook team for the contribution. (Bug #29013375, Bug #93478)
  • InnoDB: Write-ahead did not work as expected due to an incorrectly initialized variable. Thanks to Yuhui Wang for the contribution. (Bug #29028838, Bug #93442)
  • InnoDB: A Linux AIO handler function failed to check if completed I/O events succeeded. Thanks to Wei Zhao for the contribution. (Bug #27850600, Bug #90402)
  • InnoDB: A function called by a CREATE TABLE thread attempted to access a table object after it was freed by a background thread. Thanks to Yan Huang for the patch. (Bug #27373959, Bug #89126)
  • Replication: If the WAIT_FOR_EXECUTED_GTID_SET() function was used with a timeout value including a fractional part (for example, 1.5), an error in the casting logic meant that the timeout was rounded down to the nearest whole second, and to zero for values less than 1 second (for example, 0.1). The casting logic has now been corrected so that the timeout value is applied as originally specified with no rounding. Thanks to Dirkjan Bussink for the contribution. (Bug #29324564, Bug #94247)
  • Determination of the number of online CPUs available to the mysqld process is now more accurate. Thanks to Daniel Black for the contribution. (Bug #28907677, Bug #93144)
  • Made a comparison in the internal method Item_result::item_cmp_type() more efficient. Our thanks to Daniel Black for the contribution. (Bug #92784, Bug #28796107)

It sometimes happen that we receive a contribution that we really like, but – for one reason or another – choose not to use the patch. One such case is the new feature to allow reconfiguration of the SSL settings at runtime. Thank you Facebook for submitting a contribution for a similar feature even though we ended up not using the patch.

Thank you for your contributions. Feel free to keep submitting ideas to the MySQL bugs database with ideas how to improve MySQL

MySQL Shell 8.0.16: User Defined Reports

In my blog yesterday, I wrote about the new reporting framework in MySQL Shell. It is part of the 8.0.16 release. I also noted that it includes the possibility to create your own custom reports and use those with the \show and \watch commands. This blog will explore how you can create a report and register it, so it automatically is available when you start MySQL Shell.

Update

This blog was updated on 30 April to include the use of the values argument for report options. This moved the validation of the optional arguments in to the reporting framework and automatically includes the list of valid options in the generated help text.

The help text for the example sessions report.
The help text for the example sessions report.

Background

You can write the code that generates the report in either JavaScript or Python. The reports can be used from either language mode – even SQL – irrespective of which language you choose, so go with what you are most comfortable with.

Once you have written your code, you save it in the init.d folder (does not exist by default) inside the user configuration folder. By default this is at the following location depending on whether you are on Microsoft Windows or Linux:

  • Microsoft Windows: %AppData%MySQL\mysqlsh
  • Linux: ~/.mysqlsh

You can overwrite this path with the MYSQLSH_USER_CONFIG_HOME environment variable.

You are free to choose any file name, but a good rule is to name the file the same as the report. However, it is required that you use .py as the file name extension if you wrote the report in Python and .js if you used JavaScript.

At that point, you need to register the report, so you can use it through the reporting framework. You do that using the shell.registerReport() method from inside the same file that has the report code. It takes four arguments: the name of the report, the report type, the function generating the report (as a function object), and optional a dictionary with the description. I will not go into the details of these argument here beyond providing an example of using it. The manual has a quite detailed section on registering your report including what the arguments are.

One thing that is worth discussing a bit as it influences how the report content should be formatted is the report type. This can have one of three values:

  • list: The content of the report is returned as a list of lists constituting the rows of a table. The \show and \watch commands can then show the data either using the familiar tabular format or in vertical. The decision of which display format to use can be made when running the report.
  • report: The report content is returned in YAML.
  • print: The report code print the output directly.

The report and print types are the more flexible, but the list type works well with query results.

This can all feel very abstract. The best way to actually understand how it works is to write an example report to go through the steps.

Example Custom Report

The custom report, I will create is based on the one in the reference manual, but modified to allow you to choose what to sort by. The example should help make it clearer how to create your own reports.

The example is quite simple and could be generated using the built-in query report, but it serves as a good starting point to understand the mechanics of custom reports, and even simple reports like this provides a way to have your report logic saved in one place and easily accessible from within MySQL Shell. The example is written in Python, but a report generating the same result written in JavaScript would look similar (although not identical).

Download the Source

You do not need to copy and paste all the code snippets if you want to try this example. You can download the entire sessions.zip file from below and extract the file with the report source code.

The Report Function

The first thing is to define the report itself. This report is called sessions, so the function with the code is also called sessions. This is not required, but it is best practice:

sort_allowed = {
    'thread': 'thd_id',
    'connection': 'conn_id',
    'user': 'user',
    'db': 'db',
    'latency': 'statement_latency',
    'memory': 'current_memory',
}

def sessions(session, args, options):

First a dictionary is defined with the keys specifying the allowed values for the --sort option and the values as what will actually be used for the ordering. Then there is the definition of the reporting function itself. The function takes three arguments:

  • session: A MySQL Shell session object. This gives you access to all of the session properties and methods when you create the report.
  • args: A list of any additional arguments passed to the the report. This is what the query report uses to get the query that will be executed. This report does not use any such arguments, so anything passed this way will be ignored.
  • options: This is a dictionary with named options. This report will support two such named options:
    • --limit or -l which sets the maximum number of rows to retrieve. The option will use the limit key in the dictionary. The default is not to impose any limit.
    • --sort or -s which chooses what to sort by. The option will use the sort key in the dictionary. The report will support ordering by thread, connection, user, db, latency, and memory. The default is to sort by latency.

You can choose different names for the arguments if you prefer.

The next thing is to define the query that will retrieve the result that will be used in the report. You can do this in several ways. If you want to execute an SQL query, you can use session.sql() (where session is the name of the session object in your argument list). However, it is simpler to code the query using the X DevAPI as that makes it trivial to customize the query, for example with the limit option and what to order by.

    sys = session.get_schema('sys')
    session_view = sys.get_table('x$session')
    query = session_view.select(
        'thd_id', 'conn_id', 'user', 'db',
        'sys.format_statement(current_statement) AS statement',
        'sys.format_time(statement_latency) AS latency',
        'format_bytes(current_memory) AS memory')

The statement will query the sys.x$session view. This is the non-formatted version of sys.session. The reason for using this is to allow custom sorting of the result set according to the --sort option. The view is obtained using the session.get_schema() method first to get a schema object for the sys schema, then the get_table() method of the schema object.

The query can then be defined from the table (view in this case) object by using the select() method. The arguments are the columns that should be included in the result. As you can see, it if possible to manipulate the columns and rename them.

Want to Learn More?

If you want to learn more about the MySQL X DevAPI and how to use the Python version of it, then I have written MySQL Connector/Python Revealed published by Apress. The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

The X DevAPI makes it trivial to modify the query with the options the report are invoked with. First handle the --sort option:

    # Set what to sort the rows by (--sort)
    try:
        order_by = options['sort']
    except SystemError:
        order_by = 'latency'
        
    if order_by in ('latency', 'memory'):
        direction = 'DESC'
    else:
        direction = 'ASC'
    query.order_by('{0} {1}'.format(
        sort_allowed[order_by], direction))

    # If ordering by latency, ignore those statements with a NULL latency
    # (they are not active)
    if order_by == 'latency':
        query.where('statement_latency IS NOT NULL')

If the --sort option is not provided, then a SystemError exception is raised. The first part of the snippet handles this, and ensures that the report default to ordering by the latency. Then, it is checked if the provided value is one of the supported values.

The next step is to decide whether to sort in descending or ascending order. You can of course add another option for this, but here the logic is contained within the report choosing descending when sorting by latency or memory usage; otherwise ascending.

The final step is to tell MySQL what to order by which is done in lines 29-30 by invoking the order_by() method. This is where the programmatic approach of the X DevAPI makes it easier to gradually put the query together compared to working directly with the SQL statement.

This report adds a little extra logic to the query. If the result is ordered by latency, only queries that are currently executing (the latency IS NOT NULL are included). This is one of the advantages of creating a custom report rather than writing the query ad-hoc as you can include logic like that.

The --limit option is handled in a similar way:

    # Set the maximum number of rows to retrieve is --limit is set.
    try:
        limit = options['limit']
    except SystemError:
        limit = 0
    if limit > 0:
        query.limit(limit)

There is not much to note about this code snippet. In line 43 the limit is applied (if the value is greater than 0) by invoking the limit() method. Finally, the query can be executed and the report generated:

    result = query.execute()
    report = [result.get_column_names()]
    for row in result.fetch_all():
        report.append(list(row))

    return {'report': report}

The execute() method is used to tell MySQL that the query can be executed. This returns a result object. The get_column_names() method of the result object can be used to get the column names. Then, the rows are added by iterating over them. As you can see, there is only one report list: the first element is a list with the column headers, the remaining are the row values.

Tip

The first element in the report list contains the column headers. The remaining elements contain the values.

Finally, the result is returned as a dictionary. That is it for generating the report, but it should also be registered.

Registering the Report

The registration of the report is done in the same file as where the report function was defined. You perform the registration by calling the shell.register_report() method:

shell.register_report(
    'sessions',
    'list',
    sessions,
    {
        'brief': 'Shows which sessions exist.',
        'details': ['You need the SELECT privilege on sys.session view and the '
                    + 'underlying tables and functions used by it.'],
        'options': [
            {
                'name': 'limit',
                'brief': 'The maximum number of rows to return.',
                'shortcut': 'l',
                'type': 'integer'
            },
            {
                'name': 'sort',
                'brief': 'The field to sort by.',
                'shortcut': 's',
                'type': 'string',
                'values': sort_allowed.keys()
            }
        ],
        'argc': '0'
    }
)

The first argument is the name of the report, ‘sessions’, then the report type. The third argument is the function itself. Then comes a dictionary describing the report.

There are two parts to the dictionary: the two first arguments with a description of the report – first a short (brief) description, then more details. Then a list of the options that the report supports. The final argument is the number of additional arguments. Notice that for the --sort option, the element values have been set with the allowed values for the option. This will make MySQL Shell validate the option.

Now, you are ready to test the report.

Testing the Report

First the report must be installed. If you do not already have the init.d directory, create it under %AppData%MySQL\mysqlsh if you are on Microsoft Windows or under ~/.mysqlsh if you are on Linux. Then copy sessions.py into the directory.

Now, start MySQL Shell and the report is ready to be used:

mysql-js> \show
Available reports: query, sessions.

mysql-js> \show sessions --help
sessions - Shows which sessions exist.

You need the SELECT privilege on sys.session view and the underlying tables and
functions used by it.

Usage:
       \show sessions [OPTIONS]
       \watch sessions [OPTIONS]

Options:
  --help                        Display this help and exit.
  --vertical, -E                Display records vertically.
  --limit=integer, -l           The maximum number of rows to return.
  --sort=string, -s             The field to sort by. Allowed values: latency,
                                thread, db, connection, user, memory.

mysql-js> \show sessions
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+
| thd_id | conn_id | user          | db   | statement                                                         | latency  | memory     |
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+
| 65     | 28      | mysqlx/worker | NULL | SELECT `thd_id`,`conn_id`,`use ... ER BY `statement_latency` DESC | 38.09 ms | 965.58 KiB |
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+

mysql-js> \show sessions -E
*************************** 1. row ***************************
   thd_id: 65
  conn_id: 28
     user: mysqlx/worker
       db: NULL
statement: SELECT `thd_id`,`conn_id`,`use ... ER BY `statement_latency` DESC
  latency: 35.49 ms
   memory: 968.88 KiB

Notice how the help text has been generated from the information that was provided when the report was registered, and how the -E option can be used to turn the tabular output format into the vertical format. Note also that the report is invoked from JavaScript mode and still works even though the report is written in Python – MySQL Shell will automatically handle that for you and ensure the report is executed using the correct interpreter.

It is left as an exercise for the reader to add the --sort and --limit options and to use the report with the \watch command.

Note

On Microsoft Windows, it sometimes happens that when an option is not explicitly passed to the report, then the options dictionary is still set with a value. You can avoid that by providing the options explicitly.

One related feature that is worth covering before finishing is the shell.reports object.

The shell.reports Object

This far the \show and \watch commands have been used to invoke the reports, but there is a lower level way to do it – using the shell.reports object. It is also a very useful way to explore which reports are available.

Let’s start with the latter – exploring reports – as that also shows you how the shell.reports object work. As usual in MySQL Shell, it has built-in help:

mysql-py> shell.reports.help()
NAME
      reports - Gives access to built-in and user-defined reports.

SYNTAX
      shell.reports

DESCRIPTION
      The 'reports' object provides access to built-in reports.

      All user-defined reports registered using the shell.register_report()
      method are also available here.

      The reports are provided as methods of this object, with names
      corresponding to the names of the available reports.

      All methods have the same signature: Dict report(Session session, List
      argv, Dict options), where:

      - session - Session object used by the report to obtain the data.
      - argv (optional) - Array of strings representing additional arguments.
      - options (optional) - Dictionary with values for various report-specific
        options.

      Each report returns a dictionary with the following keys:

      - report (required) - List of JSON objects containing the report. The
        number and types of items in this list depend on type of the report.

      For more information on a report use: shell.reports.help('report_name').

FUNCTIONS
      help([member])
            Provides help about this object and it's members

      query(session, argv)
            Executes the SQL statement given as arguments.

      sessions(session[, argv][, options])
            Shows which sessions exist.

This includes a list of the functions available – and notice that the two reports that exist, query and sessions, are among the functions. You can also use the help() function with the report name as a string argument to get the report specific help.

If you invoke one of the report functions, you execute the report. This is much similar to invoking the report using the \show command, but it will be the raw report result that is returned. Let’s try it both for the query and sessions reports:

mysql-py> shell.reports.query(shell.get_session(), ["SELECT NOW()"])
{
    "report": [
        [
            "NOW()"
        ],
        [
            "2019-04-27 15:53:21"
        ]
    ]
}

mysql-py> shell.reports.sessions(shell.get_session(), [], {'limit': 10, 'sort': 'latency'})
{
    "report": [
        [
            "thd_id",
            "conn_id",
            "user",
            "db",
            "statement",
            "latency",
            "memory"
        ],
        [
            66,
            29,
            "mysqlx/worker",
            null,
            "SELECT `thd_id`,`conn_id`,`use ... ment_latency` DESC LIMIT 0, 10",
            "39.76 ms",
            "886.99 KiB"
        ]
    ]
}

It is not often this is needed, but in case you want to manually manipulate the output, it can be useful.

Tip

If you use JavaScript mode, then use shell.getSession() instead of shell.get_session() to get a session object to pass to the report.

That is all. Now over to you to create your own reports.

MySQL Shell 8.0.16: Built-in Reports

Readers of my blog know that I like how MySQL Shell allows you to customize it and use it’s Python and JavaScript support to create custom libraries with tools that help with your daily tasks and even creating auto-refreshing reports. Lefred has even taken this a step further and started to port Innotop to MySQL Shell.

One disadvantage of my example of auto-refreshing reports and the Innotop port is they both rely on the curses Python module to refresh the screen. While avoiding to reinvent the wheel is usually a good thing, and the curses library is both powerful and easy to use, it is not well supported on Microsoft Windows. The good news is that in MySQL 8.0.16 and later, you can also get auto-refreshing reports with a new built-in reporting framework in MySQL Shell. This blog shows how this framework works.

Example of using the \watch command to generate a auto-refreshing report.

Built-In Features

The great thing with the built-in framework is that you can start using it even without coding as it comes with a pre-configured report. The framework consists of three parts:

  • \show: This is the most basic command which runs a report once and displays the result.
  • \watch: This is similar to the watch command on Linux, where a command (report in this case) is repeatedly executed with the screen refreshed to show the new result.
  • shell.registerReport(): This method can be used to register custom reports. The details of custom reports are discussed in the blog MySQL Shell 8.0.16: User Defined Reports.

The \show command is a good place to start.

The \show Command

You can get more information about how the \show command works and reports in general using the built-in help system:

mysql-js> \h \show
NAME
      \show - Executes the given report with provided options and arguments.

SYNTAX
      \show <report_name> [options] [arguments]

DESCRIPTION
      The report name accepted by the \show command is case-insensitive, '-'
      and '_' characters can be used interchangeably.

      Common options:

      - --help - Display help of the given report.
      - --vertical, -E - For 'list' type reports, display records vertically.

      The output format of \show command depends on the type of report:

      - 'list' - displays records in tabular form (or vertically, if --vertical
        is used),
      - 'report' - displays a YAML text report,
      - 'print' - does not display anything, report is responsible for text
        output.

      If executed without the report name, lists available reports.

      Note: user-defined reports can be registered with shell.registerReport()
      method.

EXAMPLES
      \show
            Lists available reports, both built-in and user-defined.

      \show query show session status like 'Uptime%'
            Executes 'query' report with the provided SQL statement.

      \show query --vertical show session status like 'Uptime%'
            As above, but results are displayed in vertical form.

      \show query --help
            Displays help for the 'query' report.

SEE ALSO

Additional entries were found matching \show

The following topics were found at the SQL Syntax category:

- SHOW

For help on a specific topic use: \? <topic>

e.g.: \? SHOW

This already gives a lot of information, not only about the \show command, but also about reports. Reports can be in one of three formats (more on that in a later blog), if they are using the list format (which the query report discussed below uses), you can get the output in tabular format (the default) or vertical using the --vertical or -E option. And finally, you can get more information about known reports by running the report with the --help option, and you can get a list of known reports running \show without arguments:

mysql-js> \show
Available reports: query.

Let’s take a closer look at the query report.

The Query Report

The query report is a very simple report that take a query and runs it. You can get the help text for it by executing \show query --help:

mysql-js> \show query --help
query - Executes the SQL statement given as arguments.

Usage:
       \show query [OPTIONS] [ARGUMENTS]
       \watch query [OPTIONS] [ARGUMENTS]

Options:
  --help                        Display this help and exit.
  --vertical, -E                Display records vertically.

Arguments:
  This report accepts 1-* arguments.

So, to run it, you simply provide the query as an argument – you can do this either just providing the query as is or as a quoted string. Let’s say you want to use the following query for the report:

SELECT conn_id,
       sys.format_statement(current_statement) AS statement,
       format_pico_time(statement_latency) AS latency
  FROM sys.x$session
 ORDER BY statement_latency DESC
 LIMIT 10

This will show the longest running queries limited to 10 queries. Note that is uses the new format_pico_time() function that replaces the sys.format_time() function in MySQL 8.0.16. Newlines are not allowed in the query when generating the report, so the command becomes:

mysql-js> \show query SELECT conn_id, sys.format_statement(current_statement) AS statement, format_pico_time(statement_latency) AS latency FROM sys.x$session ORDER BY statement_latency DESC LIMIT 10
+---------+-------------------------------------------------------------------+----------+
| conn_id | statement                                                         | latency  |
+---------+-------------------------------------------------------------------+----------+
| 8       | SELECT conn_id, sys.format_sta ... tatement_latency DESC LIMIT 10 | 33.34 ms |
| 4       | NULL                                                              |   0 ps   |
+---------+-------------------------------------------------------------------+----------+

Granted, this is not particular useful – you could just have executed the query on its own. However, if you consider the \watch command instead, it become more useful.

Tip

The \show command is more useful for more complex reports that does more than just execute a single query or executes a complex query where the report functions as a stored query.

The \watch Command

The \watch the command supports two additional arguments on its own:

  • --interval=float, -i float: The amount of time in seconds to wait between displaying the result of the report until the report is run again. Valid values are 0.1 second to 86400 seconds (one day).
  • --nocls: Do not clear the screen between iterations of the report. This will make the subsequent output be displayed below the previous output. This can for example be useful for reports returning a single line of output and you that way have the history of the report up the screen.

Report may also add options of their own. The query report for example accepts one argument, which is the query to execute. Other reports may accept other arguments.

Otherwise, you start the report the same way as when using \show. For example, to run the query every five seconds:

mysql-js> \watch query --interval=5 SELECT conn_id, sys.format_statement(current_statement) AS statement, format_pico_time(statement_latency) AS latency FROM sys.x$session ORDER BY statement_latency DESC LIMIT 10

That’s it. If you want to stop the report again, use CTRL+c and the report will stop at the next refresh.

Conclusion

The report framework in MySQL Shell 8.0.16 gives a nice starting point for generating reports. The built-in query function may not be the most fancy you can think of, but it is very easy way to quickly make a query run repeatedly at set intervals. However, the real power of the report framework is that you now have a framework to create cross-platform custom reports. That is the topic of a later blog.

Install MySQL 8 on Linux with lower_case_table_names = 1

MySQL stores several files on disk. Even in MySQL 8 where the data dictionary is stored in InnoDB tables, there are still all the tablespace files. Different file system behave differently, and one particular challenge is case sensitivity. On Microsoft Windows, the case does not matter, on Linux the case is important, and on macOS the case of the file names is preserved but the operating system by default makes it look like it is case insensitive.

Which convention that is the correct depends on your personal preference and use case. Between case sensitivity and case insensitivity, it basically boils down to whether mydb, MyDB, and MYDB should be the same identifier or three different ones. Since MySQL originally relied on the file system for its data dictionary, the default was to rely on the case sensitivity of the file system. The option lower_case_table_names was introduced to override the behaviour. The most common use is to set lower_case_table_names to 1 on Linux to introduce case insensitive schema and table names.

Dolphin with lower_case_table_names

This blog will first discuss how lower_case_table_names work in MySQL 8 – it is not the same as in earlier versions. Then it will be shown how MySQL 8 can be initialized on Linux to use case insensitive identifiers.

Advice

To use case insensitive identifiers in MySQL 8, the main thing is that you must set lower_case_table_names = 1 in your MySQL configuration file before you initialize the data directory (this happens on the first start when using systemd).

MySQL 8 and lower_case_table_names

In MySQL 8, it is no longer allowed to change the value of the lower_case_table_names option after the data directory has been initialized. This is a safety feature – as described in the reference manual:

It is prohibited to start the server with a lower_case_table_names setting that is different from the setting used when the server was initialized. The restriction is necessary because collations used by various data dictionary table fields are based on the setting defined when the server is initialized, and restarting the server with a different setting would introduce inconsistencies with respect to how identifiers are ordered and compared.

https://dev.mysql.com/doc/refman/en/server-system-variables.html#sysvar_lower_case_table_names

If you try to start MySQL 8 with a different value of lower_case_table_names than MySQL was initialized, you will get an error like (from the MySQL error log):

2019-04-14T03:57:19.095459Z 1 [ERROR] [MY-011087] [Server] Different lower_case_table_names settings for server ('1') and data dictionary ('0').
2019-04-14T03:57:19.097773Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2019-04-14T03:57:19.098425Z 0 [ERROR] [MY-010119] [Server] Aborting
2019-04-14T03:57:20.784893Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.15)  MySQL Community Server - GPL.

So what are the steps to initialize MySQL 8 with lower_case_table_names = 1? Let’s go through them.

Installing MySQL 8 with Case Insensitive Identifier Names

There are several ways to install MySQL 8 on Linux. The steps that will be shown here are:

  1. Install the MySQL repository.
  2. Remove previous installations of MySQL or one of its forks.
  3. Clean the data directory.
  4. Install MySQL 8.
  5. Initialize with lower_case_table_names = 1.

The example commands are from Oracle Linux 7 and also works on Red Hat Enterprise Linux (RHEL) 7 and CentOS 7. The MySQL YUM repository will be used. On other Linux distributions the steps will in general be different, but related.

1. Install the MySQL Repository

MySQL provides repositories for several Linux distributions for the Community Edition. You can download the repository definition files from MySQL Community Downloads. The files can also be accessed directly. For this example the YUM repository definition will be downloaded using wget and then installed using yum:

shell$ wget https://dev.mysql.com/get/mysql80-community-release-el7-2.noarch.rpm
--2019-04-14 12:28:31--  https://dev.mysql.com/get/mysql80-community-release-el7-2.noarch.rpm
...
HTTP request sent, awaiting response... 200 OK
Length: 25892 (25K) [application/x-redhat-package-manager]
Saving to: ‘mysql80-community-release-el7-2.noarch.rpm’

100%[===========================================================>] 25,892      --.-K/s   in 0.01s   

2019-04-14 12:28:33 (1.76 MB/s) - ‘mysql80-community-release-el7-2.noarch.rpm’ saved [25892/25892]

shell$ yum install mysql80-community-release-el7-2.noarch.rpm
...

Dependencies Resolved

=====================================================================================================
 Package                      Arch      Version     Repository                                  Size
=====================================================================================================
Installing:
 mysql80-community-release    noarch    el7-2       /mysql80-community-release-el7-2.noarch     31 k

Transaction Summary
=====================================================================================================
Install  1 Package

Total size: 31 k
Installed size: 31 k
Is this ok [y/d/N]: y
Downloading packages:
...
  Installing : mysql80-community-release-el7-2.noarch                                            1/1 
  Verifying  : mysql80-community-release-el7-2.noarch                                            1/1 

Installed:
  mysql80-community-release.noarch 0:el7-2                                                           

Complete!

You can now remove the previous installation (if present) and its files.

2. Remove Previous Installations

MySQL or one of its forks may have been installed beforehand. This may even happen as a dependency of another package. You should never have more than one MySQL or fork installed using the package system (yum or rpm on Oracle Linux, RHEL, and CentOS).

Tip

If you need to install different versions of MySQL side by side, use the tarball distributions.

You want to uninstall the existing packages in such a way that you do not remove the programs that depend on it – otherwise you will have to re-install those later. One option is to use the rpm command with the --nodeps option. On Oracle Linux 7, RHEL 7, and CentOS 7 this may look like:

shell$ rpm -e --nodeps mariadb-server-5.5.56-2.el7.x86_64 mariadb-5.5.56-2.el7.x86_64 mariadb-libs-5.5.56-2.el7.x86_64

You can find out which packages are installed using rpm -qa and pass the output through grep to search for the packages of interest.

The next step is to clean out any existing files left behind.

3. Clean the Data Directory

In order to be able to initialize MySQL in step 5., the data directory must be empty. You can choose to use a non-default location for the data directory, or you can re-use the default location which use /var/lib/mysql. If you want to preserve your old data directory, make sure you back it up first!

Warning

Important: If you want to keep your old data files, make sure you back them up before proceeding! All existing files will be permanently lost during this step.

The data directory may have been removed in step 2., but if it has not, you can remove it using the following command:

shell$ rm -rf /var/lib/mysql

Optionally, you can also remove the error log, and if you store files outside the data directory (for example the binary log files or InnoDB log files), you should also remove those. The error log is located in /var/log/; for other files, you will need to check your configuration file (usually /etc/my.cnf).

You are now ready to install the MySQL 8.

4. Install MySQL 8

You can choose between several packages and patch releases (maintenance releases). It is recommended to install the latest patch release. You can see from the release notes which release is the latest. By default, yum will also install the latest release. Which packages you want to install depends on your requirements. The MySQL reference manual includes a list of the available packages with a description of what they include.

In this example, the following packages will be installed:

  • mysql-community-client: Client applications such as the mysql command-line client.
  • mysql-community-common: Some common files for MySQL programs.
  • mysql-community-libs: Shared libraries using the latest version of the API.
  • mysql-community-libs-compat: Shared libraries using the version of the API corresponding to what RPM packages from the Oracle Linux/RHEL/CentOS repositories that depend on MySQL uses. For Oracle Linux 7, RHEL 7, and CentOS 7 this means version 18 (e.g. libmysqlclient.so.18).
  • mysql-community-server: The actual MySQL Server.
  • mysql-shell: MySQL Shell – the second generation command-line client with devops support. This RPM is not listed in the above reference as it is not part of the MySQL Server RPM bundle, however when using the MySQL YUM repository, it can be installed in the same way as the other RPMs.

The yum command thus becomes:

shell$ yum install mysql-community-{client,common,libs,libs-compat,server} mysql-shell
...

Dependencies Resolved

=====================================================================================================
 Package                           Arch         Version            Repository                   Size
=====================================================================================================
Installing:
 mysql-community-client            x86_64       8.0.15-1.el7       mysql80-community            25 M
 mysql-community-common            x86_64       8.0.15-1.el7       mysql80-community           566 k
 mysql-community-libs              x86_64       8.0.15-1.el7       mysql80-community           2.2 M
 mysql-community-libs-compat       x86_64       8.0.15-1.el7       mysql80-community           2.1 M
 mysql-community-server            x86_64       8.0.15-1.el7       mysql80-community           360 M
 mysql-shell                       x86_64       8.0.15-1.el7       mysql-tools-community       9.0 M

Transaction Summary
=====================================================================================================
Install  6 Packages

Total download size: 400 M
Installed size: 1.8 G
Is this ok [y/d/N]: y
Downloading packages:
...
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql
Importing GPG key 0x5072E1F5:
 Userid     : "MySQL Release Engineering <mysql-build@oss.oracle.com>"
 Fingerprint: a4a9 4068 76fc bd3c 4567 70c8 8c71 8d3b 5072 e1f5
 Package    : mysql80-community-release-el7-2.noarch (installed)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-mysql
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-community-common-8.0.15-1.el7.x86_64                                        1/6 
  Installing : mysql-community-libs-8.0.15-1.el7.x86_64                                          2/6 
  Installing : mysql-community-client-8.0.15-1.el7.x86_64                                        3/6 
  Installing : mysql-community-server-8.0.15-1.el7.x86_64                                        4/6 
  Installing : mysql-community-libs-compat-8.0.15-1.el7.x86_64                                   5/6 
  Installing : mysql-shell-8.0.15-1.el7.x86_64                                                   6/6 
  Verifying  : mysql-community-libs-compat-8.0.15-1.el7.x86_64                                   1/6 
  Verifying  : mysql-community-common-8.0.15-1.el7.x86_64                                        2/6 
  Verifying  : mysql-community-server-8.0.15-1.el7.x86_64                                        3/6 
  Verifying  : mysql-shell-8.0.15-1.el7.x86_64                                                   4/6 
  Verifying  : mysql-community-client-8.0.15-1.el7.x86_64                                        5/6 
  Verifying  : mysql-community-libs-8.0.15-1.el7.x86_64                                          6/6 

Installed:
  mysql-community-client.x86_64 0:8.0.15-1.el7   mysql-community-common.x86_64 0:8.0.15-1.el7       
  mysql-community-libs.x86_64 0:8.0.15-1.el7     mysql-community-libs-compat.x86_64 0:8.0.15-1.el7  
  mysql-community-server.x86_64 0:8.0.15-1.el7   mysql-shell.x86_64 0:8.0.15-1.el7                  

Complete!

Notice how the GPG key for the MySQL YUM repository is downloaded, and you are requested to verify it is the correct key. This happens, because it is the first time the repository is used. You can also manually add the GPG key using the instructions in Signature Checking Using GnuPG.

You are now ready to the final step: configuring and starting MySQL Server for the first time.

5. Initialize with lower_case_table_names = 1

As mentioned in the introduction to this blog, you need to ensure that lower_case_table_names is configured when MySQL initializes its data directory. When you use systemd to start MySQL, it will happen automatically when you start MySQL with an empty data directory. This means, you should update the MySQL configuration file with the desired value of lower_case_table_names before the first start.

The default location for the MySQL configuration file is /etc/my.cnf. Open this file with your favourite editor and ensure the line lower_case_table_names = 1 is listed in the [mysqld] group:

[mysqld]
lower_case_table_names = 1

Optionally, you can make other changes to the configuration as needed.

Tip

Other than a few capacity settings such as innodb_buffer_pool_size and the configuration of the InnoDB redo logs, the default configuration is a good starting point for most installations.

Now, you can start MySQL:

shell$ systemctl start mysqld

This will take a little time as it includes initializing the data directory. Once MySQL has started, you can retrieve the temporary password for the root account from the MySQL error log:

shell$ grep 'temporary password' /var/log/mysqld.log 
2019-04-14T03:29:00.122862Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: aLxwMUQr%7C,

The temporary password is randomly generated during the initialization to avoid MySQL being left with a known default password. Use this temporary password to log in and set your permanent root password:

shell$ mysql --user=root --host=localhost --password
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.15

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER root@localhost IDENTIFIED BY 'n3w_$tr0ng_P@s$word';
Query OK, 0 rows affected (0.12 sec)

By default for RPM installations, MySQL has the password validation component installed using the MEDIUM strength policy. This means you will need to use a relatively strong password.

You can now verify that MySQL uses case insensitive schema and table identifiers:

mysql> SELECT @@global.lower_case_table_names;
+---------------------------------+
| @@global.lower_case_table_names |
+---------------------------------+
|                               1 |
+---------------------------------+
1 row in set (0.00 sec)

mysql> SELECT VARIABLE_SOURCE, VARIABLE_PATH
         FROM performance_schema.variables_info
        WHERE VARIABLE_NAME = 'lower_case_table_names';
+-----------------+---------------+
| VARIABLE_SOURCE | VARIABLE_PATH |
+-----------------+---------------+
| GLOBAL          | /etc/my.cnf   |
+-----------------+---------------+
1 row in set (0.01 sec)

mysql> CREATE SCHEMA db1;
Query OK, 1 row affected (0.03 sec)

mysql> use DB1;
Database changed
mysql> CREATE TABLE t1 (id int unsigned NOT NULL PRIMARY KEY);
Query OK, 0 rows affected (0.47 sec)

mysql> INSERT INTO T1 VALUES (1);
Query OK, 1 row affected (0.05 sec)

mysql> SELECT * FROM t1;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.01 sec)

The query in lines 9-11 queries what the source of the value of the lower_case_table_names option is. This shows that the value of 1 (from the previous query) is picked up from the /etc/my.cnf file. The rest of the queries show how the db1 schema and the db1.t1 table can be accessed both using lower and upper case.

That is it. Now you can use MySQL Server without having to remember which case was used when a schema object was created.

MySQL Backup Best Practices

Today is World Backup Day, so I thought I would use the opportunity to discuss some best practices and general considerations regarding backing up MySQL instances. While I focus on MySQL, several of these tips apply to backups in general.

Doplhin with a USB Stick with the label BACKUP
Backup your data

Before heading into the gory details, let’s first take a look at the best practices at a high level:

  • Make sure you can restore your backups:
    • Document and script the restore procedures. Do you know the steps required to restore a full backup – or a single table?
    • Keep copies of the backups off-site. Do you have a copy of your backup if the data center becomes unavailable for example due to a fire?
    • Validate your backups. Does your backup method work with the features you use? Are you writing to a disk which is failing?
  • Monitor the backups. Do you know when a backup failed? How long time does the backups take?
  • Use a backup method appropriate for your system and your requirements.
  • Never stop considering your backup strategy. The World changes, so does your backup requirements.

Information

No two systems are identical. Make sure you consider your specific requirements when you design your backup procedures.

The rest of this blog will discuss all of these things in more detail.

Make Sure You Can Restore Your Backups

It may seem obvious, but one of the more common issues I see is that backups exist, but the steps to restore them are not known. Or even worse, the backups cannot be restored at all as they are broken.

Advice

Whatever you do regarding backups, the single most important thing is that you verify you know how to restore the backups and that the restored instance is valid.

There are several things to consider: how to restore a backup, do you have access to the backup, and is the backup valid?

The Restore Procedure

If you do not know how to restore your backups, the day you do need to restore one, a relatively standard operation can turn into a major crisis with the manager staring down your backup.

So, make sure you practice the steps to restore your backups for all of the scenarios you can think of, for example:

  • a plain full restore of the backup
  • a point-in-time recovery, that is: restore the backup itself and apply binary logs up to a given point in time
  • a partial restore, for example to restore a single table or schema from a full backup

There are more possible scenarios. Take some time to consider which are important for your databases and regular practice doing those kind of restores.

When you practice a restore, document all steps you make in detail and keep the steps in a place where they can easily be found again – for example in a knowledge base. Even better, script the restore; that works both to document how the restore should be done, but also automates the steps and ensure each restore is done in the same way. If you need to restore a backup in the middle of a crisis, then having all the steps scripted and documented not only helps you remember what to do, but also reduces the chance that something goes wrong.

Related to this discussion is that you should copy the backups to remote storage.

Copy the Backups Off Site

In the previous section, it was discussed how you need to consider all your restore scenarios. That should include the case where the whole server or even whole data center is gone. What do you do in that case?

From XKCD comic 1718.

Other than the need to provision a new MySQL instance somewhere else, you also need to still have access to your backups. This means that a backup that is only stored locally (either on the same host or in the same data center) is of no use for this case.

Advice

Always copy your backups to remote storage. How remote depends on your needs.

When you decide where to store your backups, you need to consider your requirements. How long time is acceptable to wait to download the backup during a recovery, and what kind of disasters (power outage, fire, earthquake, meteor strike, etc.) must the backup be able to survive? You can choose to have backups available on the local host and/or data center, so they are quickly available, for example in case a user deletes the wrong data. Then have another storage location either in the other end of the country or even on another continent to protect against a major disaster.

Advice

Replication is also a great way to export your data to another location. Note though that unless the replication is delayed, the replica will not help you recover from bad queries.

Of course even having the best written instructions in the World and copies of the backups on all continents do not help you if the backup is corrupted or broken.

Verify Your Backups

A backup is only as good as your ability to restore it and bring the restored instance online. This is the reason, it is so important to test your restore procedures as discussed above. Optimally, you should restore every single backup. In the real world that is not always realistic, but it is still important that you practice a restore from time to time.

Advice

The ultimate validation of your backups is to restore them, bring the restored instance online, and verify the data is correct.

In practice it may not be possible to restore every single backup in all the restore combinations. So, you will need to add some other checks. The exact checks you should do depend on your backups, but some possibilities are:

  • MySQL Enterprise Backup (MEB) has a validate command. This will verify the InnoDB checksums for each page. This checks whether the backup is truncated, corrupted, or damaged.
  • MySQL Enterprise Backup can store the result of the backup in the mysql.backup_history table (enabled by default). This includes the overall backup status.
  • Verify the backup is created and has a minimum size.
  • If you have a logical backup, grep for some strings you know should be in the backup, such as CREATE TABLE statements.

The validation of your backups is of course only useful if you realize when the validation fails, so you also need to monitor the backups.

Monitor the Backups

Monitoring is one of the most important tasks for a database administrator. That also includes monitoring the backups. Once you have verification of the backups in place, you need to ensure the validation status is monitored.

How you do this depends on the monitoring solution you use. In MySQL Enterprise Monitor (MEM) there is a built-in backup dashboard with information about your MySQL Enterprise Backup (MEB) backups; this information is based on the data logged by MySQL Enterprise Backup to the mysql.backup_history table and includes the type of backup, the backup status, how long time the backup took, how long time locks were held, etc. MySQL Enterprise Monitor also creates events when backups fail.

The MySQL Enterprise Monitor (MEM) dashboard for backups.
The MySQL Enterprise Monitor (MEM) dashboard for backups.

This far, all the advises have been focused on what you should do with the backup after it has been created. What about creating the backups?

Creating Backups

When you decide how you want to create the backup, there are many considerations to take. This section will consider some of those.

Information

Remember that all systems are unique. What is the best choice for one system may not be the best for another.

First of all you need to determine what you need for your backups and what interruption of your production system is allowed when creating the backups. Some of the things to consider are:

  • How much data can you afford to lose in case of a catastrophic disaster?
  • How long time is acceptable to restore the backup?
  • What data must be included in the backup?
  • Which other files (for example binary logs and configuration files) must be included?
  • Do you need to be able to do a point-in-time recovery?
  • Can the production system be taken offline during the backup or into read-only mode? If so, for how long?

Advice

Since MySQL 5.6, mysqlbinlog has been able to stream binary logs to disk giving you a near real-time backup of the binary logs that can be used for a point-in-time recovery. The binary log is enabled by default in MySQL 8.0.

Answering these questions helps you determine the backup method that is optimal for your system. Some of the backup methods available are:

  • Logical Backups:
    • mysqlpump: This is available in MySQL 5.7 and later and allows for parallel backups. In most cases other than for MySQL NDB Cluster, it is preferred over mysqldump.
    • mysqldump: This is the classical program to create logical backups in MySQL.
    • Native NDB Backups: This is a bit of hybrid between a logical backup and a raw backup specialized for the NDBCluster storage engine. It uses a native storage format but can be converted to CSV files.
  • Binary (Raw) Backups:

Whichever method you choose, make sure you understand its limitations. As an example, file system snapshots can work great in many cases, but if MySQL uses more than one file system for the database files, then it may not be possible to create a consistent snapshot (FLUSH TABLES WITH READ LOCK does not stop background writes for InnoDB except for tables that have been explicitly listed).

Advice

Always do your testing with as much write activity as you can have in a worst-case scenario. Backing up an idle instance will not reveal limitations related to writes happening during the backup.

You also need to take the overhead of the backup method into consideration. At the very least it will impact MySQL by reading the data. There will also be some locking involved even if it in some cases may be very limited. In all cases, creating the backup at the most quiet time of the day can help reduce the impact. An option is also to use a replica for the backups, but even in that case the overhead must be considered as the replica need to be able to keep up or catch up before the next backup.

Advice

If you create the backup from a dedicated replica, you create the backup without impacting the users. In this case, you should also monitor the replica and ensure it can keep up with the production source, so you always create up to date backups.

Now you have considered how to create the backups, validated them, copied them to secure off-site locations, and tested all possible the restore scenarios. So you are all set and can put backups on auto-pilot? Not so fast.

Backups Are a Never Ending Process

The World is not a static place. Neither are your MySQL instances. The configuration changes, the application add new features, requirements change, the amount of data grows, new MySQL instance are installed on new hardware or different cloud shapes or with a different cloud provider, there are updates to MySQL and the backup program, and so on.

This means that the process of working with the backup and restore processes never ends. You need to regularly evaluate whether your backup strategy still works and fulfills all requirements. Look at the bright side: you keep learning and the experience you have gathered since the last evaluation may help you implement an even better backup solution.

Happy World Backup Day.

Easy Execution of Dynamic Queries with the sys Schema

When you write stored procedures in MySQL, you sometimes need to generate queries on the fly, for example as you process the result of another query. This is supported using prepared statements. This blog explores how you can take advantage of the sys schema to simplify the use of dynamic queries.

Executing a query using the sys.execute_prepared_stmt() procedure.
Executing a query using the sys.execute_prepared_stmt() procedure.

The sys schema includes several stored procedures and functions as well as views to make the database administrator’s life easier. One of these is the execute_prepared_stmt() procedure which can be used to execute a query given as a string. An example is:

mysql> CALL sys.execute_prepared_stmt(
    ->     'SELECT * FROM world.city WHERE ID = 130');
+-----+--------+-------------+-----------------+------------+
| ID  | Name   | CountryCode | District        | Population |
+-----+--------+-------------+-----------------+------------+
| 130 | Sydney | AUS         | New South Wales |    3276207 |
+-----+--------+-------------+-----------------+------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

Let’s take a step back and look at how it works when you execute a dynamic query in MySQL before looking at a more realistic example.

Dynamic Queries and Prepared Statements

A prepared statement is created by providing the query in a string. Optionally, the query can include placeholders, so you can reuse the prepared statement with different parameters. However, here I will look at queries without placeholders.

You prepare the query with the PREPARE statement. The query can be specified either using a literal string or in a user variable. The above example uses a literal string, but when you work with dynamic queries in a stored procedure, you will assign the query to a user variable.

Once you have prepared the query, you execute it using the EXECUTE statement. If you need to execute the query several times, you can do this without preparing it again, however typically that is not the case with dynamic queries.

Finally, you should tell MySQL that you are done with the prepared statement, so the memory used to store the statement can be freed. If you do not do this, you will end up using more memory than necessary, and you can potentially run out of prepared statements (the maximum number allowed can be configured with max_prepared_stmt_count option). You tell MySQL to forget about the prepared statement with the DEALLOCATE PREPARE statement.

If you combine it all, you have the following sequence of steps to generate and execute the dynamic query:

mysql> SET @sql = CONCAT('SELECT COUNT(*) FROM ',
    ->                   sys.quote_identifier('world'),
    ->                   '.',
    ->                   sys.quote_identifier('city'));
Query OK, 0 rows affected (0.00 sec)

mysql> PREPARE stmt_count FROM @sql;
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> EXECUTE stmt_count;
+----------+
| COUNT(*) |
+----------+
|     4079 |
+----------+
1 row in set (0.01 sec)

mysql> DEALLOCATE PREPARE stmt_count;
Query OK, 0 rows affected (0.00 sec)

In the example, the query is created by concatenating the different parts of the query. The sys.quote_identifier() function is used to quote the schema and table names. Then the query is executed, and finally deallocated.

This is simple enough but can become tedious, if you execute many dynamic queries, and there is always the chance that you forget to include DEALLOCATE PREPARE causing higher memory usage than necessary. That is where sys.execute_prepared_stmt() comes into the picture.

sys.execute_prepared_stmt()

The sys.execute_prepared_stmt() procedure was born when Mark Leith and I were working on the sys schema in preparation to include it into MySQL 5.7. In the end we got tired of the repeated pattern with preparing, executing, and deallocating the statements. So, we decided to create a utility procedure for it.

Building on the above example, consider a stored procedure that takes a schema name as a parameter and goes through all the tables in the schema executing SELECT COUNT(*) to find the number of rows in each table. An example of a procedure that does this is:

DELIMITER $$
CREATE PROCEDURE row_count(IN in_schema varchar(64) CHARSET utf8mb4)
   SQL SECURITY INVOKER
   DETERMINISTIC
   READS SQL DATA
BEGIN
   DECLARE v_table varchar(64) CHARSET utf8mb4;
   DECLARE v_sql varchar(271);
   DECLARE v_done bool DEFAULT FALSE;
   DECLARE c_tables CURSOR FOR
      SELECT TABLE_NAME
        FROM information_schema.TABLES
       WHERE TABLE_SCHEMA = in_schema
             AND TABLE_TYPE = 'BASE TABLE';
   DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = TRUE;

   OPEN c_tables;
   c_tables: LOOP
      FETCH c_tables INTO v_table;
      IF v_done THEN
         LEAVE c_tables;
      END IF;

      SET v_sql = CONCAT(
         'SELECT ',
         QUOTE(v_table), ' AS TableName, '
         'COUNT(*) AS RowCount FROM ',
         sys.quote_identifier(in_schema),
         '.',
         sys.quote_identifier(v_table)
      );
      CALL sys.execute_prepared_stmt(v_sql);
   END LOOP;
   CLOSE c_tables;
END$$
DELIMITER ;

The most interesting part of the procedure with respect to this discussion is the loop where the information_schema.TABLES view is queried to get all tables in the provided schema. Once the table name has been fetched into the v_table variable, the query is generated in lines 24 to 31 in a very similar way to the previous example. Then the generated query is executed using sys.execute_prepared_stmt() in line 32.

If you are using dynamic queries, I hope you will find sys.execute_prepared_stmt() useful. Happy coding.

Testing MySQL NDB Cluster with dbdeployer

A great way to install MySQL when you need to do quick tests is to use a sandbox tool. This allows you to perform all the installation steps with a single command making the whole process very simple, and it allows for automation of the test. Giuseppe Maxia (also known as the Data Charmer, @datacharmer on Twitter) has for many years maintained sandbox tools for MySQL, first with MySQL Sandbox and now with dbdeployer.

One of the most recent features of dbdeployer is the support for MySQL NDB Cluster. In this blog, I will take this feature and test it. First, I will briefly discuss what MySQL NDB Cluster is, then install dbdeployer, and finally set up a test cluster.

Deploying a MySQL NDB Cluster cluster with dbdeployer.
Deploying a MySQL NDB Cluster cluster with dbdeployer.

What is MySQL NDB Cluster?

MySQL NDB Cluster is primarily an in-memory database (but also with support for on-disk data) that has been designed from day one to be highly available and providing consistent response times. A cluster consists of several nodes which can be one of three types:

  • Data Nodes: This is where the actual data is stored. Currently there is support for up to 48 data nodes in a cluster with up to 1TiB of data memory for each node.
  • API Nodes: These are the nodes where queries are executed on. An API node can be a normal mysqld process (also known as an SQL node), or it can be a NoSQL node using the C++ (this is the native NDB API), Java (ClusterJ), memcached, or Node.js API.
  • Management Nodes: These nodes hold the configuration of the cluster, and one of the management nodes is the most common choice as an arbitrator in case it is necessary decide between two halves of data nodes to avoid a split brain scenario.

Tip

If you are interested in learning more about the arbitration process, then I wrote a blog earlier about the importance of installing the management nodes on different hosts than where the data nodes are installed. This blog includes several examples of handling node failures and arbitration.

You will typically have at least two data nodes in a cluster with two copies (replicas) of the data. This allows one data node to be offline while the cluster stays online. More data nodes can be added to increase the capacity or to add more data partitions. The data partitioning (sharding) and the replicas is all handled automatically, including when querying the data.

Over view of the MySQL NDB Cluster architecture.
Over view of the MySQL NDB Cluster architecture.

All of this means that you will end up with quite a few nodes. In a production cluster, you need at least two of each node type to have high availability. Even though you may not need high availability for your testing, you will still need at least two data nodes, one management node, and one SQL node. Being able to automate the installation of the cluster is a great help when you need to do a quick test – which brings us to dbdeployer. The first step is to install it.

Want to Know More about MySQL NDB Cluster?

I am one of the authors of Pro MySQL NDB Cluster (Apress) which is an almost 700 pages long book dedicated to MySQL NDB Cluster. You can buy it from Apress (print or DRM free ePub and PDF), Amazon (print and Kindle/Mobi), Barnes & Nobles (print), and other book shops.

I have also written a brief introduction to MySQL NDB Cluster – but with a little more information than above – on Apress’ blog.

Installing dbdeployer

It is simple to install dbdeployer. From the dbdeployer’s GitHub page, there are releases that can be downloaded and easily installed. For this blog, I am using release 1.24.0 on Linux. I will recommend you to use the latest release. In addition to Linux, dbdeployer is also available for macOS. Unfortunately there is no Microsoft Windows support.

An example of downloading and installing dbdeployer is:

shell$ mkdir Downloads

shell$ cd Downloads/

shell$ wget https://github.com/datacharmer/dbdeployer/releases/download/v1.24.0/dbdeployer-1.24.0.linux.tar.gz
...
HTTP request sent, awaiting response... 200 OK
Length: 4888282 (4.7M) [application/octet-stream]
Saving to: ‘dbdeployer-1.24.0.linux.tar.gz’

100%[================================>] 4,888,282   1.70MB/s   in 2.8s   

2019-03-25 17:48:54 (1.70 MB/s) - ‘dbdeployer-1.24.0.linux.tar.gz’ saved [4888282/4888282]

shell$ tar -zxf dbdeployer-1.24.0.linux.tar.gz 

shell$ mkdir ~/bin

shell$ mv dbdeployer-1.24.0.linux ~/bin/dbdeployer

shell$ export PATH=${PATH}:~/bin

This downloads and unpacks the 1.24.0 release into the Downloads directory. Then the dbdeployer binary is moved to the ~/bin directory and renamed to dbdeployer. Finally, the ~/bin directory is added to the path searched when executing a command, so it is not necessary to specify the path each dbdeployer is executed. There are other ways to perform these steps and other options where to install it; see also the official documentation.

That it is. Now it is possible to install a test cluster.

Installing a Test Cluster

Since dbdeployer works on a single host, all of the nodes will be installed on the same host. While this is bad for a production cluster, it is perfectly fine for most test clusters.

Warning

While a single host cluster is great for most tests, for testing your application before a deployment to production, it is recommended to use a multi-host cluster that is as similar to your production cluster as possible.

The first step is to download MySQL NDB Cluster as a tar-ball. You can get the latest patch release of each version from https://dev.mysql.com/downloads/cluster/. If you need to test with an older release, you can get that from https://downloads.mysql.com/archives/cluster/. In this example, MySQL NDB Cluster 7.6.9 is downloaded from the latest releases and places in the ~/Downloads directory:

shell$ cd ~/Downloads/

shell$ wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-7.6/mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64.tar.gz
...
HTTP request sent, awaiting response... 200 OK
Length: 914236523 (872M) [application/x-tar-gz]
Saving to: ‘mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64.tar.gz’

100%[================================>] 914,236,523  699KB/s   in 23m 52s

2019-03-25 18:49:29 (624 KB/s) - ‘mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64.tar.gz’ saved [914236523/914236523]

Once the download has completed, use the unpack command of dbdeployer to unpack the downloaded file:

shell$ dbdeployer unpack --prefix=ndb ~/Downloads/mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64.tar.gz 
Unpacking tarball /home/dbdeployer/Downloads/mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64.tar.gz to $HOME/opt/mysql/ndb7.6.9
.........100.........200.........300.........400.........500........
...
.........20300.........20400.........20500.........20600.........2070020704
Renaming directory /home/dbdeployer/opt/mysql/mysql-cluster-gpl-7.6.9-linux-glibc2.12-x86_64 to /home/dbdeployer/opt/mysql/ndb7.6.9

You are now ready for the actual creation of the test cluster. This is done using the deploy command:

shell$ dbdeployer deploy replication ndb7.6.9 --topology=ndb --concurrent
$HOME/sandboxes/ndb_msb_ndb7_6_9/initialize_nodes
MySQL Cluster Management Server mysql-5.7.25 ndb-7.6.9
2019-03-27 17:22:16 [ndbd] INFO     -- Angel connected to 'localhost:20900'
2019-03-27 17:22:16 [ndbd] INFO     -- Angel allocated nodeid: 2
2019-03-27 17:22:17 [ndbd] INFO     -- Angel connected to 'localhost:20900'
2019-03-27 17:22:17 [ndbd] INFO     -- Angel allocated nodeid: 3
executing 'start' on node 1
............ sandbox server started
executing 'start' on node 2
.... sandbox server started
executing 'start' on node 3
.... sandbox server started
NDB cluster directory installed in $HOME/sandboxes/ndb_msb_ndb7_6_9
run 'dbdeployer usage multiple' for basic instructions'

This creates a cluster with two data nodes, one management nodes, and three SQL nodes. The nodes have been installed in the ${HOME}/sandboxes/ndb_msb_ndb7_6_9/ directory:

shell$ ls sandboxes/ndb_msb_ndb7_6_9/
check_nodes          ndb_conf  node3               test_replication
clear_all            ndb_mgm   restart_all         test_sb_all
cluster_initialized  ndbnode1  sbdescription.json  use_all
initialize_nodes     ndbnode2  send_kill_all       use_all_masters
n1                   ndbnode3  start_all           use_all_slaves
n2                   node1     status_all
n3                   node2     stop_all

Notice how there for example is an ndb_mgm script. This is a wrapper script around the ndb_mgm binary in the MySQL installation – the MySQL NDB Cluster management client. This makes it easy to connect to the management node, for example to check the status of the cluster:

shell$ ./sandboxes/ndb_msb_ndb7_6_9/ndb_mgm -e "SHOW"
Connected to Management Server at: localhost:20900
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9, Nodegroup: 0, *)
id=3    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9)

[mysqld(API)]   4 node(s)
id=4    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9)
id=5    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9)
id=6    @127.0.0.1  (mysql-5.7.25 ndb-7.6.9)
id=7 (not connected, accepting connect from localhost)

Before wrapping up, let’s see how you can connect to the different SQL nodes and see how they indeed query the same data.

Testing the Cluster

As a simple test, connect to the first SQL node and create a table. Then, connect to the second SQL node and insert a row. Finally, connect to the third SQL node and query the data.

The SQL nodes are in the node* directories in ${HOME}/sandboxes/ndb_msb_ndb7_6_9/. Each of those work in the same way as for a standalone MySQL Server sandbox, so you can use the use wrapper script to connect using the MySQL command-line client:

shell$ ./sandboxes/ndb_msb_ndb7_6_9/node1/use 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.25-ndb-7.6.9-cluster-gpl-log MySQL Cluster Community Server (GPL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

node1 [localhost:27510] {msandbox} ((none)) > 

Now, the table can be created (output has been reformatted):

node1 [localhost:27510] {msandbox} ((none)) > \R
Returning to default PROMPT of mysql> 

mysql> CREATE SCHEMA db1;
Query OK, 1 row affected (0.42 sec)

mysql> CREATE TABLE db1.t1 (
          id int unsigned NOT NULL auto_increment PRIMARY KEY,
          val varchar(36)
       ) ENGINE=NDBCluster;
Query OK, 0 rows affected (2.64 sec)

I changed the prompt back to the default mysql> prompt. This is not because I don’t like the prompt created by dbdeployer, but simply to make the formatting of the queries nicer. In general, I do prefer my prompt to tell me where I am connected, so the normal dbdeployer prompt will otherwise work well for me.

The table creation is just like normal except the engine is set to NDBCluster. This is the engine name that tells MySQL to create the table in the data nodes.

The second step is to connect to the second instance and insert a row:

node2 [localhost:27511] {msandbox} ((none)) > INSERT INTO db1.t1 (val) VALUES (UUID());
Query OK, 1 row affected (0.11 sec)

node2 [localhost:27511] {msandbox} ((none)) > SELECT * FROM db1.t1;
+----+--------------------------------------+
| id | val                                  |
+----+--------------------------------------+
|  1 | 84f59369-5051-11e9-9078-08002709eea3 |
+----+--------------------------------------+
1 row in set (0.05 sec)

Notice how this worked without creating the table. Since the table was created in the data nodes, all SQL nodes that connect to these data nodes will automatically know about the table.

Finally, confirm the data is also available in the third node:

node3 [localhost:27512] {msandbox} ((none)) > SELECT * FROM db1.t1;
+----+--------------------------------------+
| id | val                                  |
+----+--------------------------------------+
|  1 | 84f59369-5051-11e9-9078-08002709eea3 |
+----+--------------------------------------+
1 row in set (0.12 sec)

Verdict

It is fantastic that dbdeployer now support MySQL NDB Cluster as well. It will be a great help performing tests. I do have some comments based on my testing. It is very likely some of those are just do to the fact, that this is my initial use of dbdeployer and thus, I will not claim that I understand all details of how it works yet, so do not take the following comments as the final word – nor are the comments meant as negative criticism:

  • I find it a little confusing that a cluster is considered a replication topology. Yes, there is synchronous replication between the data nodes, but it is not related to the replication you have between two MySQL Server instances (which is also supported between two clusters). Personally, I would have called a single cluster for a single sandbox, and then allow for a (future) feature setting up two clusters with replication between them.
  • The restart_all sandbox command literally shuts down the whole cluster, then starts it again (but see also two items later). For MySQL NDB Cluster there are essentially two different types of restarts (which each can either be a normal or an initial restart):
    • System Restart: All data nodes at least are shut down together, then started together. This is what restart_all implements.
    • Rolling Restart: The cluster as a whole remains online throughout the restart phase. This is done by always leaving one data node in each node group online while restarting the data nodes. SQL nodes are restarted such that at least one SQL node is online at all times. This is the normal way to do most configuration changes as it avoids downtime. I miss this restart type.
  • There does not seem to be any way to choose between normal and initial restarts.
  • The start_all does not start the management and data nodes (only the SQL nodes are started). This may be on purpose, but seems inconsistent with stop_all that does shut down the management and data nodes. Actually, I have not been able to find a way to start the cluster cleanly. There is initialize_nodes that will start the management and data nodes, but the script will also try to start the SQL nodes and load grants into the SQL nodes.
  • The stop_all script, first shuts down the management and data nodes. Then the SQL nodes. It is better to do it in the opposite order as it avoids errors on the SQL nodes if queries are executed during the shutdown. In older versions of MySQL NDB Cluster, it could also take a long time to shut down an SQL node that had lost the connection to the data nodes.
  • The management node is given NodeId = 1 and the data nodes the subsequent ids. Data nodes can only have ids 1-48, so I always recommend reserving these ids for data nodes, and make the first management node have NodeId = 49 and SQL nodes later ids.
  • There does not seem to be any way to change the number of management nodes. The --ndb-nodes option appears to be taken as one management node, and the rest as data nodes. Maybe a better way would be to have two options like:
    • --ndb-nodegroups: The number of node groups in the cluster. The number of data nodes can then be calculated as <# Node Groups> * NoOfReplicas.
    • --ndb-mgmnodes: The number of management nodes.
  • There is no check whether the number of NDB nodes is valid. For example with --ndb-nodes=4, dbdeployer tries to create a cluster with three data nodes which is not valid with NoOfReplicas = 2.
  • I did not find any way to specify my preferred configuration of the cluster as part of the sandbox deployment.
  • Consider adding the --reload option when starting ndb_mgmd (the management node). This will make the management node check whether there are any changes to the cluster configuration (stored in <path to sandbox>/ndb_conf/config.ini) and if so apply those changes.

This may seem like a long list of comments, but I am also very well aware that support for MySQL NDB Cluster has only just been added, and that it takes time to implement all the details. Rome was not built in one day.

So, I would very much like to conclude with a big thank you to the Data Charmer. This is a great initial implementation.

Using SQLAlchemy with MySQL 8

I few months ago, I wrote about using the Django framework with MySQL 8. There are also other Python frameworks that are worth considering. In this blog, I will look at using SQLAlchemy with MySQL 8.

In order for you to be able to use MySQL 8 with SQLAlchemy, you need three pieces of software: MySQL Server, MySQL Connector/Python, and SQLAlchemy. I will go through the installations, then I will look at a code example.

Information

The examples in this blog uses MySQL Server 8.0.15, MySQL Connector/Python 8.0.15, and SQLAlchemy 1.2.18. It should be possible to reuse the instructions with other MySQL versions as well, except in older MySQL versions you need to explicitly use the utf8mb4 character set.

If you do not want to read through the whole blog, the main thing for using MySQL and MySQL Connector/Python with SQLAlchemy is to create the engine with dialect set to mysql and the driver to mysqlconnector:

engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://<user>:<password>@<host>:<port>/<default_db>...')

Installing MySQL Server

There are several ways to install MySQL Server and which one is the best depends on your circumstances and preferences. For the sake of this blog, I will show how MySQL Server can be installed on Oracle Linux/RHEL/CentOS 7 using RPMs and on Microsoft Windows using MySQL Installer. For more options, see the installation chapter in the reference manual. Let’s look at the Linux installation first.

RPM Install on Enterprise Linux

MySQL provides repositories for several Linux distributions including the Oracle Linux/RHEL/CentOS family. This makes it easy to install MySQL. The step to install the repository definition is:

shell$ sudo yum install https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpmLoaded plugins: langpacks, ulninfo
Repository ol7_developer_EPEL is listed more than once in the configuration
mysql80-community-release-el7-1.noarch.rpm                                     |  25 kB  00:00:00     
Examining /var/tmp/yum-root-Ts4OzC/mysql80-community-release-el7-1.noarch.rpm: mysql80-community-release-el7-1.noarch
Marking /var/tmp/yum-root-Ts4OzC/mysql80-community-release-el7-1.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql80-community-release.noarch 0:el7-1 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
======================================================================================================
 Package                       Arch       Version   Repository                                   Size
======================================================================================================
Installing:
 mysql80-community-release     noarch     el7-1     /mysql80-community-release-el7-1.noarch      31 k
 
Transaction Summary
======================================================================================================
Install  1 Package
 
Total size: 31 k
Installed size: 31 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql80-community-release-el7-1.noarch                                             1/1
  Verifying  : mysql80-community-release-el7-1.noarch                                             1/1
 
Installed:
  mysql80-community-release.noarch 0:el7-1
 
Complete!

Now, you can install MySQL Server. There are several RPMs to choose from and which you need depends on which other features you need to use. A common set of RPMs can be installed as:

shell$ sudo yum install mysql-community-server mysql-community-libs \
            mysql-community-libs-compat mysql-community-common mysql-community-client
...

Note

If you have another MySQL installation, it will be upgraded to the latest release (at the time of writing 8.0.15).

On the first start, the data directory will be initialized:

shell$ sudo systemctl start mysqld

To keep a fresh installation secure, a random password has been set for the root user. This can be found from the MySQL error log:

shell$ sudo grep password /var/log/mysqld.log 
2018-11-05T08:05:09.985857Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: KWNfuA!1r:PF

Use this password to connect to MySQL and update the password (please use a strong password):

shell$ mysql --user=root --password
Enter password:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 8.0.15 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER root@localhost IDENTIFIED BY 'Kij0@jDi~Faf4';
Query OK, 0 rows affected (0.13 sec)

MySQL is now ready for use. Before continuing, I will show an example of installing MySQL on Microsoft Windows.

Microsoft Windows

On Microsoft Windows an easy way to install MySQL is to use the MySQL Installer. The installer can be downloaded from the MySQL download site. The MySQL Installer can be used to install most MySQL products. If you start MySQL Installer for the first time, you will be taken directly to the screen to choose the products to install; if you already have installed products, you will need to choose to add new products.

On the Select Products and Features screen, choose to install MySQL Server 8.0 (MySQL Installer will list the latest release from the list of available products):

Installing MySQL Server 8.0.13 from MySQL Installer
Installing MySQL Server from MySQL Installer – The screen shot is for 8.0.13, but other than the version number, it is the same for 8.0.15.

Optionally, you can filter the list of products. Feel free to choose other products you want. MySQL Notifier can be useful for starting and stopping MySQL, if you do not plan to have MySQL running at all times. You can also install MySQL Connector/Python this way, however for this blog a different method will be used.

Follow the installation wizard. For this blog, the default choices will work, though during the configuration you may want to ensure Open Windows Firewall ports for network access is unchecked unless you need remote access.

Preparing MySQL Server

While MySQL is now ready to work with SQLAlchemy, you will likely want to do a few more preparation steps. Here creating the MySQL user and schema (database) used by your application will be covered.

An example of creating the user pyuser@localhost and give it all privileges to the sqlalchemy schema and to create the sqlalchemy schema is:

mysql> CREATE USER pyuser@localhost IDENTIFIED BY 'Py@pp4Demo';
Query OK, 0 rows affected (0.01 sec)

mysql> GRANT ALL ON sqlalchemy.* TO pyuser@localhost;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE SCHEMA sqlalchemy;
Query OK, 1 row affected (0.01 sec)

This will allow the pyuser user to connect from the same host as MySQL Server is installed by authenticating with the password Py@app4Demo.

Installing MySQL Connector/Python and SQLAlchemy

Both MySQL Connector/Python and SQLAlchemy can be installed in a platform independent way using the pip command. Since Python 2.7 is soon end of life, I will assume Python 3.6 in this blog. (MySQL Connector/Python 8.0.13 and later also supports Python 3.7.)

If you do not have Python 3.6 installed on Oracle Linux/RHEL/CentOS 7, you can easily install it for example from for EPEL repository. Assuming you have configured the EPEL repository, the following steps install Python 3.6, enable pip, and update pip to the latest version:

shell$ yum install python36
shell$ python3.6 -m ensurepip
shell$ python3.6 -m pip install --upgrade pip

You can now use python3.6 to invoke Python 3.6. In the following, replace python with python3.6 if you have installed Python 3.6 in this way.

To install the latest MySQL Connector/Python release (currently 8.0.15):

PS> python -m pip install mysql-connector-python
Collecting mysql-connector-python
  Using cached https://files.pythonhosted.org/packages/31/45/ef8cf013918108f508a1a1bb5539abaff5f78f3a569f7fa30232967713c9/mysql_connector_python-8.0.15-cp36-cp36m-win_amd64.whl

Collecting protobuf>=3.0.0 (from mysql-connector-python)
  Downloading https://files.pythonhosted.org/packages/5d/5c/476f473c2efc0a8d9fd7185e6c08dcbd21c469698e2a80487fa054b8c5ba/protobuf-3.7.0-cp36-cp36m-win_amd64.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 6.6MB/s
Requirement already satisfied: six>=1.9 in c:\users\myuser\appdata\local\programs\python\python36\lib\site-packages (from protobuf>=3.0.0->mysql-connector-python)
 (1.11.0)
Requirement already satisfied: setuptools in c:\users\myuser\appdata\local\programs\python\python36\lib\site-packages (from protobuf>=3.0.0->mysql-connector-pytho
n) (28.8.0)
Installing collected packages: protobuf, mysql-connector-python
Successfully installed mysql-connector-python-8.0.15 protobuf-3.7.0

Similar for installing SQLAlchemy:

PS> python -m pip install SQLAlchemy
Collecting SQLAlchemy
  Downloading https://files.pythonhosted.org/packages/21/ed/7eb53305b43ca51774a458d7c292f3bc7664d7a9bbb5bac4149fa34756b9/SQLAlchemy-1.2.18.tar.gz (5.7MB)
    100% |████████████████████████████████| 5.7MB 3.3MB/s
Installing collected packages: SQLAlchemy
  Running setup.py install for SQLAlchemy ... done

That’s it. Now you are ready to use SQLAlchemy with MySQL Connector/Python 8 and MySQL Server 8.

SQLAlchemy Example

With MySQL and SQLAlchemy installed, it is straight forward to use SQLAlchemy in you application. Here, I will just show a very simple example based on the tutorial in the official SQLAlchemy manual.

The first step is to import the necessary parts of SQLAlchemy. In this example, only sqlalchemy itself and declarative_base from sqlalchemy.ext.declarative are required:

import sqlalchemy
from sqlalchemy.ext.declarative import declarative_base

The second step is to define MySQL as the engine and that you want to connect using MySQL Connector/Python:

# Define the MySQL engine using MySQL Connector/Python
engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://pyuser:Py@pp4Demo@localhost:3306/sqlalchemy',
    echo=True)

The definition of MySQL and MySQL Connector Python happens in line 6. mysql defines that you are using MySQL as the database (the dialect), and mysqlconnector tells that you want to use MySQL Connector/Python as the driver. The rest of the line defines the connection options. In this case you have user:password@host:port/default_db. You can add more options if you need it. So, in this case the following options have been set:

  • User: pyuser
  • Password: Py@app4Demo
  • Host: localhost
  • Port: 3306
  • Default database: sqlalchemy

Avoid

Never hard code the connection parameters into your application. Particularly the password is an absolutely no go. It is done here to keep the example simple, but doing so in an actual application makes deployments hard and pose a severe security issue.

See also the documentation of the engine configuration in the SQLAlchemy manual for more information.

The argument echo=True makes SQLAlchemy print each SQL statement it executes. This can be useful when testing.

The third step is to define and create a table – in this example the users table:

# Define and create the table
Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name = sqlalchemy.Column(sqlalchemy.String(length=50))
    fullname = sqlalchemy.Column(sqlalchemy.String(length=50))
    nickname = sqlalchemy.Column(sqlalchemy.String(length=50))

    def __repr__(self):
        return "<User(name='{0}', fullname='{1}', nickname='{2}')>".format(
                            self.name, self.fullname, self.nickname)

Base.metadata.create_all(engine)

MySQL requires that you specify the maximum number of characters for varchar columns, which is the data type used when specifying sqlalchemy.String(). So, the length argument is passed. (The length argument is also the first argument to sqlalchemy.String(), so you do not need to specify explicitly that it is the length.)

The call to Base.metadata.create_all() tells SQLAlchemy to create the underlying database table, if it does not already exist. Otherwise the existing table will be used.

The fourth step is to add a user. This requires a session. Once the session is created, you can add the user to it:

# Create a session
Session = sqlalchemy.orm.sessionmaker()
Session.configure(bind=engine)
session = Session()

# Add a user
jwk_user = User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='&#x1f42c;')
session.add(jwk_user)
session.commit()

There are two things, I will like you to pay attention to here. When defining the jwk_user in line 31, the nickname is set to be the dolphin emoji. This emoji requires four bytes in UTF-8 (0xF09F90AC). In older versions of MySQL (5.7 and earlier), you would need to explicitly change the character set to handle UTF-8. In MySQL 8, however, the four byte variant of UTF-8 (utf8mb4) is the default, so the dolphin emoji will work out of the box.

Tip

If you are interested in more information about the recommendation of which character set to use in MySQL, I wrote a blog about that last year: Which Character Set Should You Use in MySQL?

The other thing is that in order to persist the new user, you need to call session.commit(). The session works as a transaction here, so the changes will not be persisted until it is explicitly committed.

The fifth – and final – step is to query the data just saved:

# Query the user
our_user = session.query(User).filter_by(name='jesper').first()
print('\nOur User:')
print(our_user)
print('Nick name in hex: {0}'.format(our_user.nickname.encode('utf-8')))

The query finds the first user with the name set to “jesper”. Because the dolphin emoji tends not to be displayed correctly in many shells, the byte sequence in hex is also printed, so it can be confirmed it is indeed the dolphin emoji that was retrieved.

The output of the entire program is:

2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode'
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'lower_case_table_names'
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT DATABASE()
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine DESCRIBE `users`
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine ROLLBACK
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine
CREATE TABLE users (
        id INTEGER NOT NULL AUTO_INCREMENT,
        name VARCHAR(50),
        fullname VARCHAR(50),
        nickname VARCHAR(50),
        PRIMARY KEY (id)
)


2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine COMMIT
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine INSERT INTO users (name, fullname, nickname) VALUES (%(name)s, %(fullname)s, %(nickname)s)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine {'name': 'jesper', 'fullname': 'Jesper Wisborg Krogh', 'nickname': '&#x1f42c;'}
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine COMMIT
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine SELECT users.id AS users_id, users.name AS users_name, users.fullname AS users_fullname, users.nickname AS users_nickname
FROM users
WHERE users.name = %(name_1)s
 LIMIT %(param_1)s
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine {'name_1': 'jesper', 'param_1': 1}

Our User:
<User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='&#x1f42c;')>
Nick name in hex: b'\xf0\x9f\x90\xac'

As it can be seen from the last two lines, the dolphin emoji was saved and retrieved correctly.

Complete Example Code

For completeness, here is the entire example program:

import sqlalchemy
from sqlalchemy.ext.declarative import declarative_base

# Define the MySQL engine using MySQL Connector/Python
engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://pyuser:Py@pp4Demo@localhost:3306/sqlalchemy',
    echo=True)

# Define and create the table
Base = declarative_base()
class User(Base):
    __tablename__ = 'users'

    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name = sqlalchemy.Column(sqlalchemy.String(length=50))
    fullname = sqlalchemy.Column(sqlalchemy.String(length=50))
    nickname = sqlalchemy.Column(sqlalchemy.String(length=50))

    def __repr__(self):
        return "<User(name='{0}', fullname='{1}', nickname='{2}')>".format(
                            self.name, self.fullname, self.nickname)

Base.metadata.create_all(engine)

# Create a session
Session = sqlalchemy.orm.sessionmaker()
Session.configure(bind=engine)
session = Session()

# Add a user
jwk_user = User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='&#x1f42c;')
session.add(jwk_user)
session.commit()

# Query the user
our_user = session.query(User).filter_by(name='jesper').first()
print('\nOur User:')
print(our_user)
print('Nick name in hex: {0}'.format(our_user.nickname.encode('utf-8')))

Enjoy using MySQL, MySQL Connector/Python, and SQLAlchemy.