Connector/Python C Extension Prepared Statement Support

MySQL Connector/Python 8 made the C Extension the default for the platform/Python version combinations supporting it. One thing that was missing from the C Extension implementation (unless you used the _mysql_connector module) was support for prepared statements. That has been taken care of with the release of version 8.0.17.

The two main advantages of using prepared statements are security and performance. The security benefit comes from passing the query parameters separately and having them applied server-side, so you can be sure they are quoted and escaped correctly, taking the data type into consideration. The performance benefit comes when you execute the same query (except for the parameters) several times, as MySQL will prepare it only for the first execution and then reuse the prepared statement – that is where the name comes from.

You use the prepared statements with the C Extension in the same way as for the pure Python implementation – by setting the prepared argument to True when creating a cursor. The simplest way to explain is to show an example.

import mysql.connector

connect_args = {
    "user": "root",
    "host": "localhost",
    "port": 3306,
    "password": "password",
    "use_pure": False,
}

db = mysql.connector.connect(**connect_args)
cursor = db.cursor(prepared=True)
print(cursor)
print("")

sql = "SELECT * FROM world.city WHERE ID = %s"
city_ids = [130, 456, 3805]

print("  ID  Name            Country  District         Popluation")
print("-" * 58)
fmt = "{0:4d}  {1:14s}  {2:^7s}  {3:15s}  {4:10d}"
for city_id in city_ids:
    cursor.execute(sql, (city_id,))
    city = cursor.fetchone()
    print(fmt.format(*city))

cursor.close()
db.close()

In the connection arguments, use_pure is set to False. Since that is the default, it is not needed, but it has been added here to make it explicit that the C Extension is used.

Avoid

Do not hardcode the connection arguments in your programs. It is done here to keep the example simple, but it is both insecure and inflexible to do in real programs.
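
One alternative is sketched below: keep the settings in a MySQL option file and point the connect() call at it with the option_files argument. The file name my.ini is just an example; any readable option file with, for example, a [client] group containing the connection settings would do.

import mysql.connector

# A minimal sketch: read the connection settings from an option file instead
# of hardcoding them in the source code. The file name is an example only.
db = mysql.connector.connect(option_files="my.ini")
cursor = db.cursor(prepared=True)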

When the cursor is created, prepared is set to True, making it a prepared statement cursor. To verify that, the cursor is printed right after it is created.

You create the statement by adding the string %s as a placeholder where you want to add the parameters to the query. You can then keep executing the query. In the example, the query is executed for three different IDs. (Yes, for this example, all three cities could have been fetched in one query, but imagine this query is used as part of a larger application where the three cities are not required at the same time. This could for example be for three independent user requests.) The parameter is provided as a tuple to the execute() method of the cursor. The output of the program is:

CMySQLCursorPrepared: (Nothing executed yet)

  ID  Name            Country  District         Population
----------------------------------------------------------
 130  Sydney            AUS    New South Wales     3276207
 456  London            GBR    England             7285000
3805  San Francisco     USA    California           776733

Notice that the cursor uses the class CMySQLCursorPrepared, which is the prepared statement cursor class for the C Extension.

MySQL Connector/Python Revealed

Book

If you want to learn more about MySQL Connector/Python, then I have written MySQL Connector/Python Revealed published by Apress. The book covers both the traditional Python Database API (PEP 249) and the X DevAPI, which is new as of MySQL 8.

The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Have fun coding.

Connector/Python Connection Attributes

Since version 5.6, MySQL Server has supported connection attributes for the clients. This allows a client to provide information such as which program or connector the client is, the client version, the license, etc. The database administrator can use this information, for example, to verify whether all clients have been upgraded, which client is executing a particular query, and so forth.

In MySQL 8.0.16 this feature has been included for the X DevAPI in the MySQL connectors as well, including MySQL Connector/Python which I will cover in this blog. First though, let’s take a look at how the attributes are exposed in MySQL Server.

The built-in MySQL Connector/Python connection attributes

Connection Attributes in MySQL Server

The connection attributes are made available in MySQL Server through two tables within the Performance Schema: session_account_connect_attrs and session_connect_attrs. The two tables have the same definition – the only difference is for which connections they show the connection attributes.

The session_account_connect_attrs table includes the attributes for connections using the same user account as for the one querying the table. This is useful if you want to grant permission for a user to check the attributes for their own connections but not for other connections.

On the other hand, session_connect_attrs shows the attributes for all connections. This is useful for the administrator to check the attributes for all users.

Information

It is up to the client what attributes they want to expose and the values they provide. In that sense, you can only trust the attributes to the same extent that you trust the clients to submit correct values.

The tables have four columns:

  • PROCESSLIST_ID: This is the same ID as in SHOW PROCESSLIST or the PROCESSLIST_ID column in performance_schema.threads.
  • ATTR_NAME: This is the name of the attribute, for example _client_name.
  • ATTR_VALUE: This is the value for the attribute, for example mysql-connector-python.
  • ORDINAL_POSITION: The attributes have an order. The ordinal position specifies the position of the attribute. The first attribute for the connection has ordinal position 0, the next 1, and so forth.

The PROCESSLIST_ID and ATTR_NAME together form the primary key of the tables.
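
As a quick sketch of how this can look from Python, the following queries the table for all connections; it assumes an existing X DevAPI session object named db and the privileges required to read the Performance Schema table:

result = db.sql(
    "SELECT PROCESSLIST_ID, ATTR_NAME, ATTR_VALUE, ORDINAL_POSITION"
    "  FROM performance_schema.session_connect_attrs"
    " ORDER BY PROCESSLIST_ID, ORDINAL_POSITION"
).execute()
for row in result.fetch_all():
    # Each row holds the process list ID, attribute name, value, and position
    print(row[0], row[1], row[2], row[3])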

Now that you know how to query the table, let’s take a look at how it works with MySQL Connector/Python.

Connector/Python Attributes

There are essentially three different ways to use the connection attributes from MySQL Connector/Python. You can have them disabled. This is the default and means that no attributes will be provided. You can enable them and use the default attributes. And finally, there is support for providing custom attributes. Let’s look at each of the two cases where attributes are enabled.

Enabled with Default Attributes

If you want your Python program to connect to MySQL using the default attributes, all you need to do is to set the connection-attributes option to True. You can do that in one of two ways depending on how you set the connection arguments.

If you use a dictionary of arguments, you add connection-attributes as a key with the value set to True:

import mysqlx

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": True,
}

db = mysqlx.get_session(**connect_args)
p_s = db.get_schema("performance_schema")
attr = p_s.get_table("session_account_connect_attrs")
stmt = attr.select()
stmt.where("PROCESSLIST_ID = CONNECTION_ID()")
stmt.order_by("ORDINAL_POSITION")
result = stmt.execute()

print("+------+-----------------+------------------------+---------+")
print("| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |")
print("+------+-----------------+------------------------+---------+")
fmt = "| {0:4d} | {1:<15s} | {2:<22s} | {3:7d} |"
for row in result.fetch_all():
    print(fmt.format(row[0], row[1], row[2], row[3]))
print("+------+-----------------+------------------------+---------+")

db.close()

The program creates the connection, then queries the performance_schema.session_account_connect_attrs table using the CRUD methods. Finally, the result is printed (note that the PROCESSLIST_ID and ORDINAL_POSITION columns have had their names shortened in the output to make the output less wide – the process list ID will differ in your output):

+------+-----------------+------------------------+---------+
| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |
+------+-----------------+------------------------+---------+
|   45 | _pid            | 19944                  |       0 |
|   45 | _platform       | x86_64                 |       1 |
|   45 | _os             | Windows-10.0.17134     |       2 |
|   45 | _source_host    | MY-LAPTOP              |       3 |
|   45 | _client_name    | mysql-connector-python |       4 |
|   45 | _client_version | 8.0.16                 |       5 |
|   45 | _client_license | GPL-2.0                |       6 |
+------+-----------------+------------------------+---------+

Notice that all of the attribute names start with an underscore. That marks them as built-in attributes. Attribute names starting with an underscore are reserved and can only be set by MySQL itself.

You can also connect using a URI, in that case the connection is made like the following example:

import mysqlx
import urllib.parse

uri = "mysqlx://{0}:{1}@{2}:{3}/?connection-attributes=True".format(
    "pyuser",
    urllib.parse.quote('Py@pp4Demo', safe=''),
    "127.0.0.1",
    "33060"
)
db = mysqlx.get_session(uri)

The username, password, host, and port are added through the format() function to make the code less wide. The important thing here is the connection-attributes=True. You can also leave out =True as mentioning the connection-attributes option without any value is the same as enabling it.

What do you do if you want to add your own custom attributes? That is supported as well.

Custom Attributes

So far the connection-attributes option has just been set to True. However, it also accepts a list or a dictionary as the argument. That can be used to set your own custom attributes.

The easiest way to understand this is to see an example:

import mysqlx

attributes = {
    "application": "my_test_app",
    "foo": "bar",
    "foobar": "",
}
connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": attributes,
}

db = mysqlx.get_session(**connect_args)
p_s = db.get_schema("performance_schema")
attr = p_s.get_table("session_account_connect_attrs")
stmt = attr.select()
stmt.where("PROCESSLIST_ID = CONNECTION_ID()")
stmt.order_by("ORDINAL_POSITION")
result = stmt.execute()

print("+------+-----------------+------------------------+---------+")
print("| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |")
print("+------+-----------------+------------------------+---------+")
fmt = "| {0:4d} | {1:<15s} | {2:<22s} | {3:7d} |"
for row in result.fetch_all():
    value = row[2] if row[2] is not None else ""
    print(fmt.format(row[0], row[1], value, row[3]))
print("+------+-----------------+------------------------+---------+")

db.close()

Notice that inside the loop it is checked whether the attribute value is None (NULL in SQL). When an attribute is set to an empty string or no value, like the foobar attribute, it is returned as None in the result set.

Alternatively, you can specify the same three connection attributes using a list:

attributes = [
    "application=my_test_app",
    "foo=bar",
    "foobar",
]
connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "connection-attributes": attributes,
}

You can of course also use a URI. You need to use the list syntax for that:

uri = "mysqlx://{0}:{1}@{2}:{3}/" \
    + "?connection-attributes=[application=my_test_app,foo=bar,foobar]".format(
    "pyuser",
    urllib.parse.quote('Py@pp4Demo', safe=''),
    "127.0.0.1",
    "33060",
)

These all give the same result (the process list ID will differ):

+------+-----------------+------------------------+---------+
| P_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL |
+------+-----------------+------------------------+---------+
|   74 | _pid            | 20704                  |       0 |
|   74 | _platform       | x86_64                 |       1 |
|   74 | _os             | Windows-10.0.17134     |       2 |
|   74 | _source_host    | MY-LAPTOP              |       3 |
|   74 | _client_name    | mysql-connector-python |       4 |
|   74 | _client_version | 8.0.16                 |       5 |
|   74 | _client_license | GPL-2.0                |       6 |
|   74 | application     | my_test_app            |       7 |
|   74 | foo             | bar                    |       8 |
|   74 | foobar          |                        |       9 |
+------+-----------------+------------------------+---------+

Notice that the built-in attributes are still included and the custom attributes have been added at the end.

That concludes this introduction to connection attributes with the MySQL Connector/Python X DevAPI. I recommend enabling them by default, even if just for the built-in attributes. The attributes can be very handy when debugging issues on the server.

MySQL Shell 8.0.16: User Defined Reports

In my blog yesterday, I wrote about the new reporting framework in MySQL Shell. It is part of the 8.0.16 release. I also noted that it makes it possible to create your own custom reports and use those with the \show and \watch commands. This blog will explore how you can create a report and register it, so it is automatically available when you start MySQL Shell.

Update

This blog was updated on 30 April to include the use of the values argument for report options. This moved the validation of the optional arguments into the reporting framework and automatically includes the list of valid options in the generated help text.

The help text for the example sessions report.

Background

You can write the code that generates the report in either JavaScript or Python. The reports can be used from either language mode – even SQL – irrespective of which language you choose, so go with what you are most comfortable with.

Once you have written your code, you save it in the init.d folder (does not exist by default) inside the user configuration folder. By default this is at the following location depending on whether you are on Microsoft Windows or Linux:

  • Microsoft Windows: %AppData%\MySQL\mysqlsh
  • Linux: ~/.mysqlsh

You can overwrite this path with the MYSQLSH_USER_CONFIG_HOME environment variable.

You are free to choose any file name, but a good rule is to name the file the same as the report. However, it is required that you use .py as the file name extension if you wrote the report in Python and .js if you used JavaScript.

At that point, you need to register the report, so you can use it through the reporting framework. You do that using the shell.register_report() method (shell.registerReport() in JavaScript) from inside the same file that has the report code. It takes four arguments: the name of the report, the report type, the function generating the report (as a function object), and optionally a dictionary with the description. I will not go into the details of these arguments here beyond providing an example of using them. The manual has a quite detailed section on registering your report including what the arguments are.

One thing that is worth discussing a bit as it influences how the report content should be formatted is the report type. This can have one of three values:

  • list: The content of the report is returned as a list of lists constituting the rows of a table. The \show and \watch commands can then show the data either using the familiar tabular format or in vertical. The decision of which display format to use can be made when running the report.
  • report: The report content is returned in YAML.
  • print: The report code prints the output directly.

The report and print types are more flexible, but the list type works well with query results.

This can all feel very abstract. The best way to actually understand how it works is to write an example report to go through the steps.

Example Custom Report

The custom report I will create is based on the one in the reference manual, but modified to allow you to choose what to sort by. The example should help make it clearer how to create your own reports.

The example is quite simple and could be generated using the built-in query report, but it serves as a good starting point to understand the mechanics of custom reports, and even a simple report like this provides a way to keep your report logic in one place and easily accessible from within MySQL Shell. The example is written in Python, but a report generating the same result written in JavaScript would look similar (although not identical).

Download the Source

You do not need to copy and paste all the code snippets if you want to try this example. You can download the entire sessions.zip file from below and extract the file with the report source code.

The Report Function

The first thing is to define the report itself. This report is called sessions, so the function with the code is also called sessions. This is not required, but it is best practice:

sort_allowed = {
    'thread': 'thd_id',
    'connection': 'conn_id',
    'user': 'user',
    'db': 'db',
    'latency': 'statement_latency',
    'memory': 'current_memory',
}

def sessions(session, args, options):

First a dictionary is defined with the keys specifying the allowed values for the --sort option and the values specifying what will actually be used for the ordering. Then comes the definition of the reporting function itself. The function takes three arguments:

  • session: A MySQL Shell session object. This gives you access to all of the session properties and methods when you create the report.
  • args: A list of any additional arguments passed to the report. This is what the query report uses to get the query that will be executed. This report does not use any such arguments, so anything passed this way will be ignored.
  • options: This is a dictionary with named options. This report will support two such named options:
    • --limit or -l which sets the maximum number of rows to retrieve. The option will use the limit key in the dictionary. The default is not to impose any limit.
    • --sort or -s which chooses what to sort by. The option will use the sort key in the dictionary. The report will support ordering by thread, connection, user, db, latency, and memory. The default is to sort by latency.

You can choose different names for the arguments if you prefer.

The next thing is to define the query that will retrieve the result that will be used in the report. You can do this in several ways. If you want to execute an SQL query, you can use session.sql() (where session is the name of the session object in your argument list). However, it is simpler to code the query using the X DevAPI as that makes it trivial to customize the query, for example with the limit option and what to order by.

    sys = session.get_schema('sys')
    session_view = sys.get_table('x$session')
    query = session_view.select(
        'thd_id', 'conn_id', 'user', 'db',
        'sys.format_statement(current_statement) AS statement',
        'sys.format_time(statement_latency) AS latency',
        'format_bytes(current_memory) AS memory')

The statement will query the sys.x$session view. This is the non-formatted version of sys.session. The reason for using this is to allow custom sorting of the result set according to the --sort option. The view is obtained using the session.get_schema() method first to get a schema object for the sys schema, then the get_table() method of the schema object.

The query can then be defined from the table (view in this case) object by using the select() method. The arguments are the columns that should be included in the result. As you can see, it is possible to manipulate the columns and rename them.
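
For comparison, a rough sketch of the session.sql() alternative mentioned above is shown here; it assumes the report is run over an X protocol session, since that is what provides the sql() method. With this approach the ordering and the limit would have to be built into the SQL string itself, which is why the report instead builds the query gradually with the CRUD methods:

    # Equivalent query written as a plain SQL statement instead of CRUD methods
    result = session.sql(
        "SELECT thd_id, conn_id, user, db,"
        "       sys.format_statement(current_statement) AS statement,"
        "       sys.format_time(statement_latency) AS latency,"
        "       format_bytes(current_memory) AS memory"
        "  FROM sys.x$session"
    ).execute()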

Want to Learn More?

If you want to learn more about the MySQL X DevAPI and how to use the Python version of it, then I have written MySQL Connector/Python Revealed published by Apress. The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

The X DevAPI makes it trivial to modify the query with the options the report is invoked with. First, handle the --sort option:

    # Set what to sort the rows by (--sort)
    try:
        order_by = options['sort']
    except SystemError:
        order_by = 'latency'
        
    if order_by in ('latency', 'memory'):
        direction = 'DESC'
    else:
        direction = 'ASC'
    query.order_by('{0} {1}'.format(
        sort_allowed[order_by], direction))

    # If ordering by latency, ignore those statements with a NULL latency
    # (they are not active)
    if order_by == 'latency':
        query.where('statement_latency IS NOT NULL')

If the --sort option is not provided, a SystemError exception is raised when the option is read. The first part of the snippet handles this and ensures that the report defaults to ordering by the latency. The check that a provided value is one of the supported values is left to MySQL Shell itself, which validates it based on the values list given when the report is registered.

The next step is to decide whether to sort in descending or ascending order. You can of course add another option for this, but here the logic is contained within the report choosing descending when sorting by latency or memory usage; otherwise ascending.

The final step is to tell MySQL what to order by, which is done by invoking the order_by() method. This is where the programmatic approach of the X DevAPI makes it easier to gradually put the query together compared to working directly with the SQL statement.

This report adds a little extra logic to the query. If the result is ordered by latency, only queries that are currently executing (those where the latency IS NOT NULL) are included. This is one of the advantages of creating a custom report rather than writing the query ad hoc: you can include logic like that.

The --limit option is handled in a similar way:

    # Set the maximum number of rows to retrieve if --limit is set.
    try:
        limit = options['limit']
    except SystemError:
        limit = 0
    if limit > 0:
        query.limit(limit)

There is not much to note about this code snippet. The limit is applied (if the value is greater than 0) by invoking the limit() method. Finally, the query can be executed and the report generated:

    result = query.execute()
    report = [result.get_column_names()]
    for row in result.fetch_all():
        report.append(list(row))

    return {'report': report}

The execute() method is used to tell MySQL that the query can be executed. This returns a result object. The get_column_names() method of the result object can be used to get the column names. Then, the rows are added by iterating over them. As you can see, there is only one report list: the first element is a list with the column headers, and the remaining elements are the rows.

Tip

The first element in the report list contains the column headers. The remaining elements contain the values.

Finally, the result is returned as a dictionary. That is it for generating the report, but it should also be registered.

Registering the Report

The registration of the report is done in the same file as where the report function was defined. You perform the registration by calling the shell.register_report() method:

shell.register_report(
    'sessions',
    'list',
    sessions,
    {
        'brief': 'Shows which sessions exist.',
        'details': ['You need the SELECT privilege on sys.session view and the '
                    + 'underlying tables and functions used by it.'],
        'options': [
            {
                'name': 'limit',
                'brief': 'The maximum number of rows to return.',
                'shortcut': 'l',
                'type': 'integer'
            },
            {
                'name': 'sort',
                'brief': 'The field to sort by.',
                'shortcut': 's',
                'type': 'string',
                'values': list(sort_allowed.keys())
            }
        ],
        'argc': '0'
    }
)

The first argument is the name of the report, ‘sessions’, then the report type. The third argument is the function itself. Then comes a dictionary describing the report.

The dictionary first gives a description of the report, a short (brief) description followed by more details, then the list of the options that the report supports. The final key, argc, specifies the number of additional arguments the report accepts. Notice that for the --sort option, the values element has been set with the allowed values for the option. This makes MySQL Shell validate the option.

Now, you are ready to test the report.

Testing the Report

First the report must be installed. If you do not already have the init.d directory, create it under %AppData%\MySQL\mysqlsh if you are on Microsoft Windows or under ~/.mysqlsh if you are on Linux. Then copy sessions.py into the directory.

Now, start MySQL Shell and the report is ready to be used:

mysql-js> \show
Available reports: query, sessions.

mysql-js> \show sessions --help
sessions - Shows which sessions exist.

You need the SELECT privilege on sys.session view and the underlying tables and
functions used by it.

Usage:
       \show sessions [OPTIONS]
       \watch sessions [OPTIONS]

Options:
  --help                        Display this help and exit.
  --vertical, -E                Display records vertically.
  --limit=integer, -l           The maximum number of rows to return.
  --sort=string, -s             The field to sort by. Allowed values: latency,
                                thread, db, connection, user, memory.

mysql-js> \show sessions
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+
| thd_id | conn_id | user          | db   | statement                                                         | latency  | memory     |
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+
| 65     | 28      | mysqlx/worker | NULL | SELECT `thd_id`,`conn_id`,`use ... ER BY `statement_latency` DESC | 38.09 ms | 965.58 KiB |
+--------+---------+---------------+------+-------------------------------------------------------------------+----------+------------+

mysql-js> \show sessions -E
*************************** 1. row ***************************
   thd_id: 65
  conn_id: 28
     user: mysqlx/worker
       db: NULL
statement: SELECT `thd_id`,`conn_id`,`use ... ER BY `statement_latency` DESC
  latency: 35.49 ms
   memory: 968.88 KiB

Notice how the help text has been generated from the information that was provided when the report was registered, and how the -E option can be used to turn the tabular output format into the vertical format. Note also that the report is invoked from JavaScript mode and still works even though the report is written in Python – MySQL Shell will automatically handle that for you and ensure the report is executed using the correct interpreter.

It is left as an exercise for the reader to add the --sort and --limit options and to use the report with the \watch command.

Note

On Microsoft Windows, it sometimes happens that an option that is not explicitly passed to the report still ends up with a value in the options dictionary. You can avoid that by providing the options explicitly.

One related feature that is worth covering before finishing is the shell.reports object.

The shell.reports Object

This far the \show and \watch commands have been used to invoke the reports, but there is a lower level way to do it – using the shell.reports object. It is also a very useful way to explore which reports are available.

Let’s start with the latter – exploring reports – as that also shows you how the shell.reports object works. As usual in MySQL Shell, it has built-in help:

mysql-py> shell.reports.help()
NAME
      reports - Gives access to built-in and user-defined reports.

SYNTAX
      shell.reports

DESCRIPTION
      The 'reports' object provides access to built-in reports.

      All user-defined reports registered using the shell.register_report()
      method are also available here.

      The reports are provided as methods of this object, with names
      corresponding to the names of the available reports.

      All methods have the same signature: Dict report(Session session, List
      argv, Dict options), where:

      - session - Session object used by the report to obtain the data.
      - argv (optional) - Array of strings representing additional arguments.
      - options (optional) - Dictionary with values for various report-specific
        options.

      Each report returns a dictionary with the following keys:

      - report (required) - List of JSON objects containing the report. The
        number and types of items in this list depend on type of the report.

      For more information on a report use: shell.reports.help('report_name').

FUNCTIONS
      help([member])
            Provides help about this object and it's members

      query(session, argv)
            Executes the SQL statement given as arguments.

      sessions(session[, argv][, options])
            Shows which sessions exist.

This includes a list of the functions available – and notice that the two reports that exist, query and sessions, are among the functions. You can also use the help() function with the report name as a string argument to get the report specific help.

If you invoke one of the report functions, you execute the report. This is similar to invoking the report using the \show command, but it is the raw report result that is returned. Let’s try it both for the query and sessions reports:

mysql-py> shell.reports.query(shell.get_session(), ["SELECT NOW()"])
{
    "report": [
        [
            "NOW()"
        ],
        [
            "2019-04-27 15:53:21"
        ]
    ]
}

mysql-py> shell.reports.sessions(shell.get_session(), [], {'limit': 10, 'sort': 'latency'})
{
    "report": [
        [
            "thd_id",
            "conn_id",
            "user",
            "db",
            "statement",
            "latency",
            "memory"
        ],
        [
            66,
            29,
            "mysqlx/worker",
            null,
            "SELECT `thd_id`,`conn_id`,`use ... ment_latency` DESC LIMIT 0, 10",
            "39.76 ms",
            "886.99 KiB"
        ]
    ]
}

It is not often that this is needed, but it can be useful if you want to manipulate the output manually.
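
As a small sketch of such manipulation, the raw result can be turned into a list of dictionaries. This assumes the 'list' report type layout described earlier, where the first element of the report list holds the column headers:

raw = shell.reports.sessions(shell.get_session(), [], {'sort': 'memory'})
headers = raw['report'][0]
# Combine the header row with each data row into one dictionary per session
rows = [dict(zip(headers, row)) for row in raw['report'][1:]]
for row in rows:
    print(row['conn_id'], row['user'], row['memory'])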

Tip

If you use JavaScript mode, then use shell.getSession() instead of shell.get_session() to get a session object to pass to the report.

That is all. Now over to you to create your own reports.

Using SQLAlchemy with MySQL 8

A few months ago, I wrote about using the Django framework with MySQL 8. There are also other Python frameworks that are worth considering. In this blog, I will look at using SQLAlchemy with MySQL 8.

In order for you to be able to use MySQL 8 with SQLAlchemy, you need three pieces of software: MySQL Server, MySQL Connector/Python, and SQLAlchemy. I will go through the installations, then I will look at a code example.

Information

The examples in this blog use MySQL Server 8.0.15, MySQL Connector/Python 8.0.15, and SQLAlchemy 1.2.18. It should be possible to reuse the instructions with other MySQL versions as well, except that in older MySQL versions you need to explicitly use the utf8mb4 character set.

If you do not want to read through the whole blog, the main thing for using MySQL and MySQL Connector/Python with SQLAlchemy is to create the engine with dialect set to mysql and the driver to mysqlconnector:

engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://<user>:<password>@<host>:<port>/<default_db>...')

Installing MySQL Server

There are several ways to install MySQL Server and which one is the best depends on your circumstances and preferences. For the sake of this blog, I will show how MySQL Server can be installed on Oracle Linux/RHEL/CentOS 7 using RPMs and on Microsoft Windows using MySQL Installer. For more options, see the installation chapter in the reference manual. Let’s look at the Linux installation first.

RPM Install on Enterprise Linux

MySQL provides repositories for several Linux distributions including the Oracle Linux/RHEL/CentOS family. This makes it easy to install MySQL. The step to install the repository definition is:

shell$ sudo yum install https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm
Loaded plugins: langpacks, ulninfo
Repository ol7_developer_EPEL is listed more than once in the configuration
mysql80-community-release-el7-1.noarch.rpm                                     |  25 kB  00:00:00     
Examining /var/tmp/yum-root-Ts4OzC/mysql80-community-release-el7-1.noarch.rpm: mysql80-community-release-el7-1.noarch
Marking /var/tmp/yum-root-Ts4OzC/mysql80-community-release-el7-1.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql80-community-release.noarch 0:el7-1 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
======================================================================================================
 Package                       Arch       Version   Repository                                   Size
======================================================================================================
Installing:
 mysql80-community-release     noarch     el7-1     /mysql80-community-release-el7-1.noarch      31 k
 
Transaction Summary
======================================================================================================
Install  1 Package
 
Total size: 31 k
Installed size: 31 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql80-community-release-el7-1.noarch                                             1/1
  Verifying  : mysql80-community-release-el7-1.noarch                                             1/1
 
Installed:
  mysql80-community-release.noarch 0:el7-1
 
Complete!

Now, you can install MySQL Server. There are several RPMs to choose from and which you need depends on which other features you need to use. A common set of RPMs can be installed as:

shell$ sudo yum install mysql-community-server mysql-community-libs \
            mysql-community-libs-compat mysql-community-common mysql-community-client
...

Note

If you have another MySQL installation, it will be upgraded to the latest release (at the time of writing 8.0.15).

On the first start, the data directory will be initialized:

shell$ sudo systemctl start mysqld

To keep a fresh installation secure, a random password has been set for the root user. This can be found from the MySQL error log:

shell$ sudo grep password /var/log/mysqld.log 
2018-11-05T08:05:09.985857Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: KWNfuA!1r:PF

Use this password to connect to MySQL and update the password (please use a strong password):

shell$ mysql --user=root --password
Enter password:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 8.0.15 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER root@localhost IDENTIFIED BY 'Kij0@jDi~Faf4';
Query OK, 0 rows affected (0.13 sec)

MySQL is now ready for use. Before continuing, I will show an example of installing MySQL on Microsoft Windows.

Microsoft Windows

On Microsoft Windows an easy way to install MySQL is to use the MySQL Installer. The installer can be downloaded from the MySQL download site. The MySQL Installer can be used to install most MySQL products. If you start MySQL Installer for the first time, you will be taken directly to the screen to choose the products to install; if you already have installed products, you will need to choose to add new products.

On the Select Products and Features screen, choose to install MySQL Server 8.0 (MySQL Installer will list the latest release from the list of available products):

Installing MySQL Server from MySQL Installer – the screen shot is for 8.0.13, but other than the version number, it is the same for 8.0.15.

Optionally, you can filter the list of products. Feel free to choose other products you want. MySQL Notifier can be useful for starting and stopping MySQL, if you do not plan to have MySQL running at all times. You can also install MySQL Connector/Python this way, however for this blog a different method will be used.

Follow the installation wizard. For this blog, the default choices will work, though during the configuration you may want to ensure Open Windows Firewall ports for network access is unchecked unless you need remote access.

Preparing MySQL Server

While MySQL is now ready to work with SQLAlchemy, you will likely want to do a few more preparation steps. Here, creating the MySQL user and the schema (database) used by your application will be covered.

An example of creating the user pyuser@localhost, granting it all privileges on the sqlalchemy schema, and creating the sqlalchemy schema is:

mysql> CREATE USER pyuser@localhost IDENTIFIED BY 'Py@pp4Demo';
Query OK, 0 rows affected (0.01 sec)

mysql> GRANT ALL ON sqlalchemy.* TO pyuser@localhost;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE SCHEMA sqlalchemy;
Query OK, 1 row affected (0.01 sec)

This will allow the pyuser user to connect from the same host as the one where MySQL Server is installed by authenticating with the password Py@pp4Demo.

Installing MySQL Connector/Python and SQLAlchemy

Both MySQL Connector/Python and SQLAlchemy can be installed in a platform independent way using the pip command. Since Python 2.7 is soon end of life, I will assume Python 3.6 in this blog. (MySQL Connector/Python 8.0.13 and later also supports Python 3.7.)

If you do not have Python 3.6 installed on Oracle Linux/RHEL/CentOS 7, you can easily install it, for example from the EPEL repository. Assuming you have configured the EPEL repository, the following steps install Python 3.6, enable pip, and update pip to the latest version:

shell$ yum install python36
shell$ python3.6 -m ensurepip
shell$ python3.6 -m pip install --upgrade pip

You can now use python3.6 to invoke Python 3.6. In the following, replace python with python3.6 if you have installed Python 3.6 in this way.

To install the latest MySQL Connector/Python release (currently 8.0.15):

PS> python -m pip install mysql-connector-python
Collecting mysql-connector-python
  Using cached https://files.pythonhosted.org/packages/31/45/ef8cf013918108f508a1a1bb5539abaff5f78f3a569f7fa30232967713c9/mysql_connector_python-8.0.15-cp36-cp36m-win_amd64.whl

Collecting protobuf>=3.0.0 (from mysql-connector-python)
  Downloading https://files.pythonhosted.org/packages/5d/5c/476f473c2efc0a8d9fd7185e6c08dcbd21c469698e2a80487fa054b8c5ba/protobuf-3.7.0-cp36-cp36m-win_amd64.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 6.6MB/s
Requirement already satisfied: six>=1.9 in c:\users\myuser\appdata\local\programs\python\python36\lib\site-packages (from protobuf>=3.0.0->mysql-connector-python) (1.11.0)
Requirement already satisfied: setuptools in c:\users\myuser\appdata\local\programs\python\python36\lib\site-packages (from protobuf>=3.0.0->mysql-connector-python) (28.8.0)
Installing collected packages: protobuf, mysql-connector-python
Successfully installed mysql-connector-python-8.0.15 protobuf-3.7.0

Similar for installing SQLAlchemy:

PS> python -m pip install SQLAlchemy
Collecting SQLAlchemy
  Downloading https://files.pythonhosted.org/packages/21/ed/7eb53305b43ca51774a458d7c292f3bc7664d7a9bbb5bac4149fa34756b9/SQLAlchemy-1.2.18.tar.gz (5.7MB)
    100% |████████████████████████████████| 5.7MB 3.3MB/s
Installing collected packages: SQLAlchemy
  Running setup.py install for SQLAlchemy ... done

That’s it. Now you are ready to use SQLAlchemy with MySQL Connector/Python 8 and MySQL Server 8.

SQLAlchemy Example

With MySQL and SQLAlchemy installed, it is straightforward to use SQLAlchemy in your application. Here, I will just show a very simple example based on the tutorial in the official SQLAlchemy manual.

The first step is to import the necessary parts of SQLAlchemy. In this example, only sqlalchemy itself and declarative_base from sqlalchemy.ext.declarative are required:

import sqlalchemy
from sqlalchemy.ext.declarative import declarative_base

The second step is to define MySQL as the engine and that you want to connect using MySQL Connector/Python:

# Define the MySQL engine using MySQL Connector/Python
engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://pyuser:Py@pp4Demo@localhost:3306/sqlalchemy',
    echo=True)

The dialect and driver are defined at the start of the connection URL passed to create_engine(): mysql specifies that you are using MySQL as the database (the dialect), and mysqlconnector specifies that you want to use MySQL Connector/Python as the driver. The rest of the URL defines the connection options, in this case user:password@host:port/default_db. You can add more options if you need them. So, in this case the following options have been set:

  • User: pyuser
  • Password: Py@pp4Demo
  • Host: localhost
  • Port: 3306
  • Default database: sqlalchemy

Avoid

Never hardcode the connection parameters in your application. The password in particular is an absolute no-go. It is done here to keep the example simple, but doing so in an actual application makes deployments hard and poses a severe security issue.
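
A sketch of one possible alternative is to read the settings from environment variables and URL encode the password, so characters such as @ in the password cannot break the connection URL. The variable names used here are examples only:

import os
import urllib.parse

import sqlalchemy

# Build the connection URL from environment variables instead of hardcoding it
password = urllib.parse.quote_plus(os.environ["MYSQL_PASSWORD"])
engine = sqlalchemy.create_engine(
    "mysql+mysqlconnector://{0}:{1}@{2}:{3}/{4}".format(
        os.environ["MYSQL_USER"],
        password,
        os.environ.get("MYSQL_HOST", "localhost"),
        os.environ.get("MYSQL_PORT", "3306"),
        os.environ["MYSQL_SCHEMA"],
    ),
    echo=True,
)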

See also the documentation of the engine configuration in the SQLAlchemy manual for more information.

The argument echo=True makes SQLAlchemy print each SQL statement it executes. This can be useful when testing.

The third step is to define and create a table – in this example the users table:

# Define and create the table
Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name = sqlalchemy.Column(sqlalchemy.String(length=50))
    fullname = sqlalchemy.Column(sqlalchemy.String(length=50))
    nickname = sqlalchemy.Column(sqlalchemy.String(length=50))

    def __repr__(self):
        return "<User(name='{0}', fullname='{1}', nickname='{2}')>".format(
                            self.name, self.fullname, self.nickname)

Base.metadata.create_all(engine)

MySQL requires that you specify the maximum number of characters for varchar columns, which is the data type used when specifying sqlalchemy.String(). So, the length argument is passed. (The length argument is also the first argument to sqlalchemy.String(), so you do not need to specify explicitly that it is the length.)

The call to Base.metadata.create_all() tells SQLAlchemy to create the underlying database table, if it does not already exist. Otherwise the existing table will be used.

The fourth step is to add a user. This requires a session. Once the session is created, you can add the user to it:

# Create a session
Session = sqlalchemy.orm.sessionmaker()
Session.configure(bind=engine)
session = Session()

# Add a user
jwk_user = User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='🐬')
session.add(jwk_user)
session.commit()

There are two things I would like you to pay attention to here. When defining the jwk_user, the nickname is set to the dolphin emoji. This emoji requires four bytes in UTF-8 (0xF09F90AC). In older versions of MySQL (5.7 and earlier), you would need to explicitly change the character set to handle UTF-8. In MySQL 8, however, the four byte variant of UTF-8 (utf8mb4) is the default, so the dolphin emoji will work out of the box.

Tip

If you are interested in more information about the recommendation of which character set to use in MySQL, I wrote a blog about that last year: Which Character Set Should You Use in MySQL?

The other thing is that in order to persist the new user, you need to call session.commit(). The session works as a transaction here, so the changes will not be persisted until it is explicitly committed.
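
As a small illustration of the transactional behaviour (not part of the original example), a change that is rolled back instead of committed never reaches the database:

# Add a user to the session, then discard the pending change again
temp_user = User(name='temp', fullname='Temporary User', nickname='tmp')
session.add(temp_user)
session.rollback()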

The fifth – and final – step is to query the data just saved:

# Query the user
our_user = session.query(User).filter_by(name='jesper').first()
print('\nOur User:')
print(our_user)
print('Nick name in hex: {0}'.format(our_user.nickname.encode('utf-8')))

The query finds the first user with the name set to “jesper”. Because the dolphin emoji tends not to be displayed correctly in many shells, the byte sequence in hex is also printed, so it can be confirmed it is indeed the dolphin emoji that was retrieved.

The output of the entire program is:

2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode'
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'lower_case_table_names'
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT DATABASE()
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,265 INFO sqlalchemy.engine.base.Engine DESCRIBE `users`
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine ROLLBACK
2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine
CREATE TABLE users (
        id INTEGER NOT NULL AUTO_INCREMENT,
        name VARCHAR(50),
        fullname VARCHAR(50),
        nickname VARCHAR(50),
        PRIMARY KEY (id)
)


2019-03-03 14:02:57,280 INFO sqlalchemy.engine.base.Engine {}
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine COMMIT
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine INSERT INTO users (name, fullname, nickname) VALUES (%(name)s, %(fullname)s, %(nickname)s)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine {'name': 'jesper', 'fullname': 'Jesper Wisborg Krogh', 'nickname': '🐬'}
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine COMMIT
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine SELECT users.id AS users_id, users.name AS users_name, users.fullname AS users_fullname, users.nickname AS users_nickname
FROM users
WHERE users.name = %(name_1)s
 LIMIT %(param_1)s
2019-03-03 14:02:57,312 INFO sqlalchemy.engine.base.Engine {'name_1': 'jesper', 'param_1': 1}

Our User:
<User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='🐬')>
Nick name in hex: b'\xf0\x9f\x90\xac'

As it can be seen from the last two lines, the dolphin emoji was saved and retrieved correctly.

Complete Example Code

For completeness, here is the entire example program:

import sqlalchemy
from sqlalchemy.ext.declarative import declarative_base

# Define the MySQL engine using MySQL Connector/Python
engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://pyuser:Py@pp4Demo@localhost:3306/sqlalchemy',
    echo=True)

# Define and create the table
Base = declarative_base()
class User(Base):
    __tablename__ = 'users'

    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name = sqlalchemy.Column(sqlalchemy.String(length=50))
    fullname = sqlalchemy.Column(sqlalchemy.String(length=50))
    nickname = sqlalchemy.Column(sqlalchemy.String(length=50))

    def __repr__(self):
        return "<User(name='{0}', fullname='{1}', nickname='{2}')>".format(
                            self.name, self.fullname, self.nickname)

Base.metadata.create_all(engine)

# Create a session
Session = sqlalchemy.orm.sessionmaker()
Session.configure(bind=engine)
session = Session()

# Add a user
jwk_user = User(name='jesper', fullname='Jesper Wisborg Krogh', nickname='🐬')
session.add(jwk_user)
session.commit()

# Query the user
our_user = session.query(User).filter_by(name='jesper').first()
print('\nOur User:')
print(our_user)
print('Nick name in hex: {0}'.format(our_user.nickname.encode('utf-8')))

Enjoy using MySQL, MySQL Connector/Python, and SQLAlchemy.

MySQL Connector/Python 8.0.14+: Changed Expression Handling in mysqlx

The X DevAPI allows you to work with JSON documents and SQL tables at the same time. Furthermore, the CRUD style API is more intuitive than SQL statements for some programmers. Either way, the X DevAPI allows you to mix JSON documents, SQL tables, CRUD methods, and SQL statements to give you the best of all worlds. In MySQL Connector/Python, the X DevAPI is implemented in the mysqlx module.

This blog will look at how MySQL Connector/Python handles expressions, and how, in version 8.0.14 and later, you need to use the mysqlx.expr() method to explicitly define expressions.

Information

The changed way to work with expressions does not apply when defining fields. In that case, you can still specify the expression inline.

Expression Handling

One original feature of the X DevAPI in MySQL Connector/Python was that expressions were automatically handled when you inlined them into statement definitions. For example, you could increase the population like:

result = countryinfo.modify("Name = :country") \
    .set("demographics.Population",
         "CAST(FLOOR(demographics.Population * 1.10) AS unsigned)") \
    .bind("country", country_name) \
    .execute()

Notice the expression in the call to the set() method.

While this was nice, it also caused some problems such as Bug 92416 where regular data could end up being interpreted as an expression by mistake. For example:

schema = db.create_schema("db1")
mycol = schema.create_collection("mycol")
mycol.add({"A": "(@)"}).execute()

In this example (@) would be interpreted as an expression even though it should be taken as a string.

The solution has been to require the mysqlx.expr() method to be used to define all expressions. This then allows MySQL to interpret all strings as literals. While it does remove the shortcut method, it also removes any ambiguities and thus makes the code safer.

Tip

The mysqlx.expr() method is also available in MySQL Connector/Python 8.0.13 and earlier as well as other connectors. If you have not yet upgraded to the latest version, it is recommended to start using mysqlx.expr() now to avoid problems when upgrading.

Let’s look at an example to better understand how the change works.

Example

As an example, consider an application that uses the world_x database and updates the population of a country by 10%. This can be done using the following expression:

CAST(FLOOR(demographics.Population * 1.10) AS unsigned)

At first, the application will use the inline method to specify the expression, then we will look at changing this to work in MySQL Connector/Python 8.0.14 and later.

Inline Expression

The source code for updating the population using an inline expression can be seen in the following sample program:

import mysqlx
import mysql.connector

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
}

print("Using inline expression.")
print("MySQL Connector/Python {0}".format(mysql.connector.__version__))

db = mysqlx.get_session(**connect_args)
schema = db.get_schema("world_x")
countryinfo = schema.get_collection("countryinfo")
country_name = "Australia"

db.start_transaction()

# Get the population of the country
population = countryinfo.find("Name = :country") \
    .fields("demographics.Population AS Population") \
    .bind("country", country_name)

before = population.execute()
print("Before ...: {0}".format(before.fetch_one()['Population']))

# Update the population
result = countryinfo.modify("Name = :country") \
    .set("demographics.Population",
         "CAST(FLOOR(demographics.Population * 1.10) AS unsigned)") \
    .bind("country", country_name) \
    .execute()

after = population.execute()
print("After ....: {0}".format(after.fetch_one()['Population']))

# Leave the data in the same state as before the changes
db.rollback()
db.close()

The population is updated in the modify() statement near the end of the program. The population is fetched both before and after the update and printed together with the MySQL Connector/Python version. At the end, the transaction is rolled back, so it is possible to execute the program several times and get the same output.

Information

The mysql.connector module is only included in order to print the MySQL Connector/Python version.

The output using MySQL Connector/Python 8.0.13 is:

Using inline expression.
MySQL Connector/Python 8.0.13
Before ...: 18886000
After ....: 20774600

This is as expected. However, in MySQL Connector/Python 8.0.14 and later, the result is quite different:

Using inline expression.
MySQL Connector/Python 8.0.15
Before ...: 18886000
After ....: CAST(FLOOR(demographics.Population * 1.10) AS unsigned)

Now the expression is taken as a literal string – oops.

Warning

This also highlights that you must be careful when working with a schemaless data model. The database will not stop you from putting wrong data types into your documents.

Let’s look at how this can be fixed.

Explicit Expressions

The solution to the issue we just saw is to use explicit expressions. You can do that using the mysqlx.expr() method. This returns an expression object that you can use in your statements. The same example as before – but using an explicit expression – is:

import mysqlx
import mysql.connector

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
}

print("Using explicit expression.")
print("MySQL Connector/Python {0}".format(mysql.connector.__version__))

db = mysqlx.get_session(**connect_args)
schema = db.get_schema("world_x")
countryinfo = schema.get_collection("countryinfo")
country_name = "Australia"

db.start_transaction()

# Get the population of the country
population = countryinfo.find("Name = :country") \
    .fields("demographics.Population AS Population") \
    .bind("country", country_name)

before = population.execute()
print("Before ...: {0}".format(before.fetch_one()['Population']))

# Update the population
expr = mysqlx.expr("CAST(FLOOR(demographics.Population * 1.10) AS unsigned)")
result = countryinfo.modify("Name = :country") \
    .set("demographics.Population", expr) \
    .bind("country", country_name) \
    .execute()

after = population.execute()
print("After ....: {0}".format(after.fetch_one()['Population']))

# Leave the data in the same state as before the changes
db.rollback()
db.close()

The only change is that the expression is now created explicitly with mysqlx.expr() and the resulting expression object is passed to the set() method.

Tip

The expression object can be re-used if you need the same expression in several statements.
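
As a sketch of that, the same expression object could be used to update two different countries, re-using the countryinfo collection from the example above:

increase = mysqlx.expr("CAST(FLOOR(demographics.Population * 1.10) AS unsigned)")
for name in ("Australia", "New Zealand"):
    # The same expression object is passed to set() in each statement
    countryinfo.modify("Name = :country") \
        .set("demographics.Population", increase) \
        .bind("country", name) \
        .execute()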

Now, MySQL Connector/Python 8.0.13 and 8.0.15 update the population to the same value. First 8.0.13:

Using explicit expression.
MySQL Connector/Python 8.0.13
Before ...: 18886000
After ....: 20774600

Then 8.0.15:

Using explicit expression.
MySQL Connector/Python 8.0.15
Before ...: 18886000
After ....: 20774600

Further Reading

If this blog has caught your interest in MySQL Connector/Python, whether you are looking at using the traditional Python Database API specification (PEP 249) or the new X DevAPI, then I have written MySQL Connector/Python Revealed published by Apress.
The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Enjoy.

MySQL Connector/Python 8.0.15: allow_local_infile Disabled by Default

MySQL Server has a feature where you can insert data into a table from a file with the data delimited by commas, tabs, or another delimiter. This is particularly useful when you need to bulk import data, for example when restoring a backup or migrating data from one system to another including from another database product than MySQL. The mysqldump backup utility is an example of a program that supports exporting the data to delimited text files.

The statement to load the data is LOAD DATA INFILE. By default the file must be located server-side, and MySQL Server will load it without involving the connection (other than for submitting the query and returning the result). However, there is also an optional keyword LOCAL that can be used to tell MySQL that the file is located client-side: LOAD DATA LOCAL INFILE. It is this local variant that is the topic of this blog.
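
As an illustration of the two variants, here is a sketch (the table and file names are made up for this example):

LOAD DATA INFILE '/var/lib/mysql-files/city.csv'
     INTO TABLE world.city
     FIELDS TERMINATED BY ',';

LOAD DATA LOCAL INFILE '/home/myuser/city.csv'
     INTO TABLE world.city
     FIELDS TERMINATED BY ',';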

By default, MySQL Server 8 does not allow loading client-side files using the LOAD DATA LOCAL INFILE statement. (Similarly, the mysql command-line client has the feature disabled by default.) Why? The reason is that there are security implications to enabling the feature. For that reason, it is recommended to disable the feature unless you know you need it, and then only to allow it for as short a time as possible.

Advice

The server-side variant is safer as long as the server-side file system is secured. Additionally, MySQL will only read server-side files from the directory (including subdirectories) specified with the secure_file_priv option. Make sure the option is not set to an empty value as that will allow MySQL to read any file.
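
If you want to check the current setting, you can for example query the variable directly (the value will depend on your installation):

mysql> SELECT @@global.secure_file_priv;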

In MySQL Connector/Python, the mysql.connector module includes the option allow_local_infile which determines whether MySQL Connector/Python will allow the server to request a client-side file. In version 8.0.15 the default value for allow_local_infile has been changed from True to False to align with the default value for MySQL Server and the mysql command-line client.

This means that if you attempt to execute LOAD DATA LOCAL INFILE in MySQL Connector/Python 8.0.15 without explicitly enabling it, a ProgrammingError exception is raised:

mysql.connector.errors.ProgrammingError: 1148 (42000): The used command is not allowed with this MySQL version

(Admittedly, the error message is not particularly good at telling you what you are doing wrong.) What do you do if you need to load client-side data files? These are the steps:

  1. Enable the local_infile option in MySQL Server. The option only exists in the global scope. You can temporarily enable it with SET GLOBAL local_infile = ON; and then disable it again when you are done by setting the variable back to OFF.
  2. Enable allow_local_infile in your MySQL Connector/Python program. I will show an example of how to do this below. Again, it is recommended to only have it enabled when needed.

An example of creating a connection with allow_local_infile enabled is:

import mysql.connector

# Create connection to MySQL
db = mysql.connector.connect(
  option_files="my.ini",
  allow_local_infile=True
)

This example uses an INI formatted configuration file to specify the connection options, and the allow_local_infile option is added explicitly.
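
With the connection created and local_infile enabled on the server, the statement can be executed through a regular cursor. The following is only a sketch; the file path and target table are assumptions for this example:

cursor = db.cursor()
# The CSV file is client-side relative to where the program runs
sql = """LOAD DATA LOCAL INFILE 'data/city.csv'
              INTO TABLE world.city
              FIELDS TERMINATED BY ','"""
cursor.execute(sql)
db.commit()
cursor.close()
db.close()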

If you are interested in learning more about using LOAD DATA [LOCAL] INFILE with MySQL Connector/Python, then my book MySQL Connector/Python Revealed (Apress) has a section describing how to load data from files including an example. The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Automatic Decryption of MySQL Binary Logs Using Python

One of the new features in MySQL 8.0.14 is support for encrypting the binary logs. While encryption makes the data more secure (provided the key is secret of course), it can make life a bit more difficult in terms of how easy it is to do tasks such as point-in-time recoveries. This blog shows how you can use the binlog_decrypt.py Python script to decrypt the binary logs as long as you have the keyring that was used to encrypt it.

Decrypting a binary log with binlog_decrypt.py

Introduction and Background

João Gramacho wrote a nice blog about how you can use standard Linux programs to decrypt the binary logs. This inspired me to implement the same, but using Python, which should make the script easier to use. Specifically, my aim was that the Python script should have the following features:

  • It should work cross platform. I have tested the script on Oracle Linux 7 and Microsoft Windows 7.
  • The key used to encrypt binary logs can be rotated, so different binary logs use different keys. The script should automatically determine which key a binary log uses and extract it from the keyring. For simplicity, I only implemented support for the keyring_file plugin.
  • The script should be able to handle multiple binary logs and gracefully handle unencrypted binary logs.

Introducing binlog_decrypt.py

As it turned out, once I understood how the keyring file works, the task was pretty straightforward using João’s blog to get the required steps. I have maintained the overall steps from that blog. The result can be downloaded from the following link:

Advice

I would like to say a big thank you to João Gramacho and Ivan Švaljek for tips on how to work with the keyring_file data format.

Some important comments about the script are:

  • The script only works with Python 3 (tested with Python 3.6).
  • All work is done in-memory. While this gives good performance (a 1.1GiB binary log on my laptop decrypts in around three seconds when the encrypted log is in the operating system I/O cache), it does mean that the memory usage is quite high. The 1.1GiB file resulted in a 3.2GiB peak memory usage.
  • Other than performing checks of the binary log content, I have added limited error checking. This is to keep focus on the actual work required to decrypt the binary log.
  • The cryptography module is used for the decryption work. The easiest way to install the module is to use pip (see below).
  • The keyring must be from the keyring_file plugin and using format version 2.0 (the format current as of MySQL 8.0.14). If you use a different keyring plugin, you can use the keyring migration feature to create a copy of the keyring using keyring_file. (But, please note that keyring_file is not a secure keyring format.)

Warning

The script uses the low-level methods of the cryptography module (the hazmat sub module – named so for a good reason). This is OK in this case as it is pure decryption. However, for encryption please do not use the hazmat methods unless you really know what you are doing.

Installing Prerequisites

If you are using Oracle Linux 7, Red Hat Enterprise Linux (RHEL) 7, or CentOS 7, the included Python version is 2.7. This will not work with the binlog_decrypt.py script. You can install Python 3.6 in addition to Python 2.7 from the EPEL repository using the following steps (assuming you have already added the EPEL repository):

shell$ yum install python36
shell$ python3.6 -m ensurepip
shell$ python3.6 -m pip install --upgrade pip

This also installs and upgrades the pip command which can be invoked using python3.6 -m pip.

On all platforms, you can install the cryptography module using pip, for example (from Microsoft Windows):

PS:> python -m pip install cryptography
Collecting cryptography
  Downloading https://files.pythonhosted.org/packages/65/d6/48e8194ab0d0d643acb89042a853d029c7cd2daaaba52cf4ff83ff0060a9/cryptography-2.5-cp36-cp36m-win_amd64.whl (1.5MB)
    100% |████████████████████████████████| 1.5MB 4.7MB/s
Collecting asn1crypto>=0.21.0 (from cryptography)
  Downloading https://files.pythonhosted.org/packages/ea/cd/35485615f45f30a510576f1a56d1e0a7ad7bd8ab5ed7cdc600ef7cd06222/asn1crypto-0.24.0-py2.py3-none-any.whl (101kB)
    100% |████████████████████████████████| 102kB 5.8MB/s
Requirement already satisfied: six>=1.4.1 in c:\users\jesper\appdata\local\programs\python\python36\lib\site-packages (from cryptography) (1.11.0)
Collecting cffi!=1.11.3,>=1.8 (from cryptography)
  Downloading https://files.pythonhosted.org/packages/2f/85/a9184548ad4261916d08a50d9e272bf6f93c54f3735878fbfc9335efd94b/cffi-1.11.5-cp36-cp36m-win_amd64.whl (166kB)
    100% |████████████████████████████████| 174kB 5.5MB/s
Collecting pycparser (from cffi!=1.11.3,>=1.8->cryptography)
  Downloading https://files.pythonhosted.org/packages/68/9e/49196946aee219aead1290e00d1e7fdeab8567783e83e1b9ab5585e6206a/pycparser-2.19.tar.gz (158kB)
    100% |████████████████████████████████| 163kB 5.2MB/s
Installing collected packages: asn1crypto, pycparser, cffi, cryptography
  Running setup.py install for pycparser ... done

If you use Oracle Linux 7, RHEL 7, or CentOS 7, invoke pip using python3.6 -m pip instead.

Using binlog_decrypt.py

You can now test the script. Assuming you have two binary logs of which the first is not encrypted and the second is encrypted:

mysql> SHOW BINARY LOGS;
+---------------+-----------+-----------+
| Log_name      | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 |    722755 | No        |
| binlog.000002 |    723022 | Yes       |
+---------------+-----------+-----------+
2 rows in set (0.01 sec)

You can now use the script as:

PS:> python binlog_decrypt.py --keyring_file_data="C:\ProgramData\MySQL\MySQL Server 8.0\keyring" "C:\ProgramData\MySQL\MySQL Server 8.0\data\binlog.000001" "C:\ProgramData\MySQL\MySQL Server 8.0\data\binlog.000002"
binlog.000001: Binary log is not encrypted. Skipping.
binlog.000002: Keyring key ID is 'MySQLReplicationKey_59e3f95b-e0d6-11e8-94e8-ace2d35785be_1'
binlog.000002: Successfully decrypted as 'C:\tmp\plain-binlog.000002'

Notice how binlog.000001 is skipped because it is detected that the binary log is not encrypted.

This is just an example. Invoke the script with the --help argument to get a description of all of the options.

The Full Source Code

For reference, here is the full source code for the script:

import sys
import os
import struct
import collections
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend

def key_and_iv_from_password(password):
    # Based on
    # https://stackoverflow.com/questions/13907841/implement-openssl-aes-encryption-in-python

    key_length = 32
    iv_length = 16
    required_length = key_length + iv_length
    password = password

    key_iv = hashlib.sha512(password).digest()
    tmp = [key_iv]
    while len(tmp) < required_length:
        tmp.append(hashlib.sha512(tmp[-1] + password).digest())
        key_iv += tmp[-1]

    key = key_iv[:key_length]
    iv = key_iv[key_length:required_length]

    return key, iv


class Key(
    collections.namedtuple(
        'Key', [
            'key_id',
            'key_type',
            'user_id',
            'key_data',
        ]
    )):
    __slots__ = ()


class Keyring(object):
    _keys = []
    _keyring_file_version = None
    _xor_str = '*305=Ljt0*!@$Hnm(*-9-w;:'.encode('utf-8')

    def __init__(self, keyring_filepath):
        self.read_keyring(keyring_filepath)

    def _read_key(self, data):
        overall_length = struct.unpack('<Q', data[0:8])[0]
        key_id_length = struct.unpack('<Q', data[8:16])[0]
        key_type_length = struct.unpack('<Q', data[16:24])[0]
        user_id_length = struct.unpack('<Q', data[24:32])[0]
        key_length = struct.unpack('<Q', data[32:40])[0]

        key_id_start = 40
        key_type_start = key_id_start + key_id_length
        user_id_start = key_type_start + key_type_length
        key_start = user_id_start + user_id_length
        key_end = key_start + key_length

        key_id = data[key_id_start:key_type_start].decode('utf-8')
        key_type = data[key_type_start:user_id_start].decode('utf-8')
        # The User ID may be blank in which case the length is zero
        user_id = data[user_id_start:key_start].decode('utf-8') if user_id_length > 0 else None
        key_raw = data[key_start:key_end]
        xor_str_len = len(self._xor_str)
        key_data = bytes([key_raw[i] ^ self._xor_str[i%xor_str_len]
                          for i in range(len(key_raw))])

        return Key(key_id, key_type, user_id, key_data)

    def read_keyring(self, filepath):
        keyring_data = bytearray()
        with open(filepath, 'rb') as keyring_fs:
            chunk = keyring_fs.read()
            while len(chunk) > 0:
                keyring_data.extend(chunk)
                chunk = keyring_fs.read()

            keyring_fs.close()

        # Verify the start of the file is "Keyring file version:"
        header = keyring_data[0:21]
        if header.decode('utf-8') != 'Keyring file version:':
            raise ValueError('Invalid header in the keyring file: {0}'
                             .format(header.hex()))

        # Get the keyring version - currently only 2.0 is supported
        version = keyring_data[21:24].decode('utf-8')
        if version != '2.0':
            raise ValueError('Unsupported keyring version: {0}'
                             .format(version))

        self._keyring_file_version = version
        keyring_length = len(keyring_data)
        offset = 24
        keys = []
        while offset < keyring_length and keyring_data[offset:offset+3] != b'EOF':
            key_length = struct.unpack('<Q', keyring_data[offset:offset+8])[0]
            key_data = keyring_data[offset:offset+key_length]
            key = self._read_key(key_data)
            keys.append(key)
            offset += key_length

        self._keys = keys

    def get_key(self, key_id, user_id):
        for key in self._keys:
            if key.key_id == key_id and key.user_id == user_id:
                return key

        return None


def decrypt_binlog(binlog, keyring, out_dir, prefix):
    '''Decrypts a binary log and outputs it to out_dir with the prefix
    prepended. The arguments are:

        * binlog - the path to the encrypted binary log
        * keyring - a Keyring object
        * out_dir - the output directory
        * prefix - prefix to add to the binary log basename.
    '''
    magic_encrypted = 'fd62696e'
    magic_decrypted = 'fe62696e'

    binlog_basename = os.path.basename(binlog)
    decrypt_binlog_path = os.path.join(
        out_dir, '{0}{1}'.format(prefix, binlog_basename))
    if os.path.exists(decrypt_binlog_path):
        print("{0}: Decrypted binary log path, '{1}' already exists. Skipping"
              .format(binlog_basename, decrypt_binlog_path), file=sys.stderr)
        return False

    with open(binlog, 'rb') as binlog_fs:
        # Verify the magic bytes are correct
        magic = binlog_fs.read(4)
        if magic.hex() == magic_decrypted:
            print('{0}: Binary log is not encrypted. Skipping.'
                  .format(binlog_basename), file=sys.stderr)
            return False
        elif magic.hex() != magic_encrypted:
            print("{0}: Found invalid magic '0x{1}' for encrypted binlog file."
                  .format(binlog_basename, magic.hex()), file=sys.stderr)
            return False

        # Get the encrypted version (must currently be 1)
        version = struct.unpack('<B', binlog_fs.read(1))[0]
        if version != 1:
            print("{0}: Unsupported binary log encrypted version '{1}'"
                  .format(binlog_basename, version), file=sys.stderr)
            return False

        # First header field is a TLV: the keyring key ID
        field_type = struct.unpack('<B', binlog_fs.read(1))[0]
        if field_type != 1:
            print('{0}: Invalid field type ({1}). Keyring key ID (1) was '
                  'expected.'.format(binlog_basename, field_type),
                  file=sys.stderr)
            return False

        keyring_id_len = struct.unpack('<B', binlog_fs.read(1))[0]
        keyring_id = binlog_fs.read(keyring_id_len).decode('utf-8')
        print("{0}: Keyring key ID for is '{1}'"
              .format(binlog_basename, keyring_id), file=sys.stderr)

        # Get the key from the keyring file
        key = keyring.get_key(keyring_id, None)

        # Second header field is a TV: the encrypted file password
        field_type = struct.unpack('<B', binlog_fs.read(1))[0]
        if field_type != 2:
            print('{0}: Invalid field type ({1}). Encrypted file password (2) '
                  'was expected.'.format(binlog_basename, field_type),
                  file=sys.stderr)
            return False
        encrypted_password = binlog_fs.read(32)

        # Third header field is a TV: the IV to decrypt the file password
        field_type = struct.unpack('<B', binlog_fs.read(1))[0]
        if field_type != 3:
            print('{0}: Invalid field type ({1}). IV to decrypt the file '
                  'password (3) was expected.'
                  .format(binlog_basename, field_type), file=sys.stderr)
            return False
        iv = binlog_fs.read(16)

        backend = default_backend()
        cipher = Cipher(algorithms.AES(key.key_data), modes.CBC(iv),
                        backend=backend)
        decryptor = cipher.decryptor()
        password = decryptor.update(encrypted_password) + decryptor.finalize()

        # Generate the file key and IV
        key, iv = key_and_iv_from_password(password)
        nonce = iv[0:8] + bytes(8)
     
        # Decrypt the file data (the binary log content)
        # The encrypted binary log header is 512 bytes, so skip it
        binlog_fs.seek(512, os.SEEK_SET)
        binlog_encrypted_data = binlog_fs.read()
        binlog_fs.close()

    cipher = Cipher(algorithms.AES(key), modes.CTR(nonce), backend=backend)
    decryptor = cipher.decryptor()
    binlog_decrypted_data = decryptor.update(binlog_encrypted_data)
    binlog_decrypted_data += decryptor.finalize()
    binlog_encrypted_data = None

    # Check decrypted binary log magic
    magic = binlog_decrypted_data[0:4]
    if magic.hex() != magic_decrypted:
        print("{0}: Found invalid magic '0x{1}' for decrypted binlog file."
              .format(binlog_basename, magic.hex()), file=sys.stderr)
        return False

    # Write the decrypted binary log to disk
    with open(decrypt_binlog_path, 'wb') as new_fs:
        new_fs.write(binlog_decrypted_data)
        new_fs.close()

    print("{0}: Successfully decrypted as '{1}'"
          .format(binlog_basename, decrypt_binlog_path))
    return True

def decrypt_binlogs(args):
    '''Outer routine for decrypting one or more binary logs. The
    argument args is a named tuple (typically from the argparse
    parser) with the following members:

       * args.binlogs - a list or tuple of the binary logs to decrypt
       * args.keyring_file_data - the path to the file with the
            keyring data for the keyring_file plugin.
       * args.dir - the output directory for the decrypted binary logs
       * args.prefix - the prefix to prepend to the basename of the
            encrypted binary log filenames. This allows you to output
            the decrypted to the same directory as the encrypted
            binary logs without overwriting the original files.
    '''
    keyring = Keyring(args.keyring_file_data)
    for binlog in args.binlogs:
        decrypt_binlog(binlog, keyring, args.dir, args.prefix)

def main(argv):
    import argparse

    parser = argparse.ArgumentParser(
        prog='binlog_decrypt.py',
        description='Decrypt one or more binary log files from MySQL Server '
                   +'8.0.14+ created with binlog_encryption = ON. The '
                   +'binary log files have the prefix given with --prefix '
                   +'prepended to their file names. '
                   +'If an output file already exists, the file will be '
                   +'skipped.',
        epilog='All work is performed in-memory. For this reason, the '
               +'expected peak memory usage is around three times the '
               +'size of the largest binary log. As max_binlog_size can '
               +'at most be 1G, for instances exclusively executing small '
               +'transactions, the memory usage can thus be up to around '
               +'3.5G. For instances executing large transactions, the '
               +'binary log files can be much larger than 1G and thus the '
               +'memory usage equally larger.')

    parser.add_argument('-d', '--dir', default=os.getcwd(),
        dest='dir',
        help='The destination directory for the decrypted binary log files. '
             +'The default is to use the current directory.')

    parser.add_argument('-p', '--prefix', default='plain-',
        dest='prefix',
        help='The prefix to prepend to the basename of the binary log file. '
             +'The default is plain-.')

    parser.add_argument('-k', '--keyring_file_data', default=None,
        dest='keyring_file_data',
        help='The path to the keyring file. The same as keyring_file_data in '
             +'the MySQL configuration. This option is mandatory.')

    parser.add_argument('binlogs', nargs=argparse.REMAINDER,
                        help='The binary log files to decrypt.')

    args = parser.parse_args()
    if not args.binlogs:
        print('ERROR: At least one binary log file must be specified.\n',
              file=sys.stderr)
        parser.print_help(file=sys.stderr)
        sys.exit(1)

    if not args.keyring_file_data:
        print('ERROR: The path to the keyring file must be specified.\n',
              file=sys.stderr)
        parser.print_help(file=sys.stderr)
        sys.exit(1)

    decrypt_binlogs(args)


if __name__ == '__main__':
    main(sys.argv[1:])

The start of the script is the handling of the keyring. Then follows the code for decrypting the binary logs, which consists of a total of three functions (from the bottom up):

  • main: The function for handling the command line arguments.
  • decrypt_binlogs: Initializes the keyring and loops over the binary logs.
  • decrypt_binlog: Decrypts a single binary log.

For a closer discussion of the individual steps to decrypt the binary log, I recommend reading João Gramacho’s blog How to manually decrypt an encrypted binary log file.

MySQL Connector/Python 8.0.14 X DevAPI: Default Schema

The MySQL X DevAPI is the new API that provides a uniform API across the supported programming languages. From the beginning it has supported specifying a default schema when you connect. Originally, however, the default schema was not used for SQL statements. Starting with MySQL 8.0.14 the feature has been extended, so SQL statements take the default schema into consideration as well. This blog will explore how this works using MySQL Connector/Python. If you use a different programming language, the change works in a similar way.

In order to explore the feature, a sample program is needed. A simple program that prints the MySQL Connector/Python version, queries the city table in the default schema, and either catches the exception or prints the resulting rows will work:

import mysql.connector
import mysqlx
from mysqlx.errors import OperationalError
import pprint

printer = pprint.PrettyPrinter(indent=3)

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
    "schema": "world",
}

print("MySQL Connector/Python {0}".format(mysql.connector.__version__))

db = mysqlx.get_session(connect_args)
try:
    result = db.sql("SELECT * FROM city WHERE ID = 130").execute()
except OperationalError as err:
    print("Error: {0}".format(err))
    result = None
else:
    columns = [column.column_name for column in result.get_columns()]
    for row in result.fetch_all():
        row_data = dict(zip(columns, row))
        printer.pprint(row_data)

db.close()

Notice how the default schema is set in line 13.

If this program is executed with MySQL Connector/Python 8.0.13 or earlier, the query in line 20 will cause an OperationalError exception:

MySQL Connector/Python 8.0.13
Error: No database selected

However, if you execute the code with MySQL Connector/Python 8.0.14, the query executes and queries the city table in the world schema:

MySQL Connector/Python 8.0.14
{  'CountryCode': 'AUS',
    'District': 'New South Wales',
    'ID': 130,
    'Name': 'Sydney',
    'Population': 3276207}

This new behaviour is the same as the one you are used to from the traditional API.

Tip

If you want to use the default schema with CRUD statements, you can retrieve the default schema with Session.get_default_schema(). That also works in earlier releases of MySQL Connector/Python.
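
As a minimal sketch of that, continuing with the session (db) and the world schema from the example above, the default schema can be retrieved and used for a CRUD query against the city table:

schema = db.get_default_schema()
city = schema.get_table("city")
row = city.select("Name", "Population") \
          .where("ID = 130") \
          .execute() \
          .fetch_one()
print(row["Name"], row["Population"])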

If you want to read more about using the X DevAPI and MySQL Connector/Python in general, then I have written the book MySQL Connector/Python Revealed (see links in the sidebar to the right).

MySQL 8: Drop Several Stored Events, Procedures, or Functions

Maybe the biggest new feature in MySQL 8 is the new transactional data dictionary that improves the consistency of schema objects among other things. To further protect the data in the data dictionary, the data dictionary tables are hidden and their content only exposed through the Information Schema. (One exception is when you use the debug binary; then it is possible to get direct access to the data dictionary tables. This is not recommended at all on production systems!)

A side effect of the data dictionary tables being hidden is that those that have had a habit of manipulating the tables directly in MySQL 5.7 and earlier (I will not recommend doing that) will no longer be able to do so. Examples of manipulating the tables include dropping several stored events, procedures, and/or functions by deleting the corresponding rows in the mysql.event and mysql.proc tables.

Tip

I will use the word object to cover either an event, procedure, or function.

There are some discussions on Stack Overflow on how to do this task. See for example How to delete multiple events in mysql 8.0 database ? it was working in 5.7 but not in 8.0? and MySQL 8 – Remove All Stored Procedures in a Specific Database.

The obvious thought may be to write a stored procedure that can delete the objects; however, that requires using a prepared statement, which is not supported for DROP EVENT and similar statements:

mysql> SET @sql = 'DROP EVENT db1.event1';
Query OK, 0 rows affected (0.00 sec)

mysql> PREPARE stmt_drop FROM @sql;
ERROR 1295 (HY000): This command is not supported in the prepared statement protocol yet

There is a feature request to implement this in Bug 42337.

So what to do instead? My suggestion is to use MySQL Shell. There you can generate the queries and use session.sql() to execute them without using a prepared statement. However, first some objects are needed.

Deleting events, functions, and procedures in MySQL Shell

Setup

Before we start looking at the options to drop several events, procedures, or functions, let’s create some objects to work with. For the examples there are three events, three procedures, and three functions that can be created as follows:

mysql> CREATE SCHEMA db1;
Query OK, 1 row affected (0.01 sec)

mysql> use db1;
Database changed
mysql> CREATE EVENT event1 ON SCHEDULE EVERY 1 DAY STARTS NOW() DO DO SLEEP(1);
Query OK, 0 rows affected (0.04 sec)

mysql> CREATE EVENT event2 ON SCHEDULE EVERY 1 DAY STARTS NOW() DO DO SLEEP(1);
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE EVENT `event``3` ON SCHEDULE EVERY 1 DAY STARTS NOW() DO DO SLEEP(1);
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE PROCEDURE proc1() DO SLEEP(1);
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE PROCEDURE proc2() DO SLEEP(1);
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE PROCEDURE `proc``3`() DO SLEEP(1);
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE FUNCTION func1() RETURNS int DETERMINISTIC RETURN 1;
Query OK, 0 rows affected (0.40 sec)

mysql> CREATE FUNCTION func2() RETURNS int DETERMINISTIC RETURN 1;
Query OK, 0 rows affected (0.08 sec)

mysql> CREATE FUNCTION `func``3`() RETURNS int DETERMINISTIC RETURN 1;
Query OK, 0 rows affected (0.30 sec)

mysql> SELECT EVENT_SCHEMA, EVENT_NAME
         FROM information_schema.EVENTS
        WHERE EVENT_SCHEMA = 'db1';
+--------------+------------+
| EVENT_SCHEMA | EVENT_NAME |
+--------------+------------+
| db1          | event1     |
| db1          | event2     |
| db1          | event`3    |
+--------------+------------+
3 rows in set (0.01 sec)

mysql> SELECT ROUTINE_SCHEMA, ROUTINE_NAME, ROUTINE_TYPE
         FROM information_schema.ROUTINES
        WHERE ROUTINE_SCHEMA = 'db1';
+----------------+--------------+--------------+
| ROUTINE_SCHEMA | ROUTINE_NAME | ROUTINE_TYPE |
+----------------+--------------+--------------+
| db1            | func1        | FUNCTION     |
| db1            | func2        | FUNCTION     |
| db1            | func`3       | FUNCTION     |
| db1            | proc1        | PROCEDURE    |
| db1            | proc2        | PROCEDURE    |
| db1            | proc`3       | PROCEDURE    |
+----------------+--------------+--------------+
6 rows in set (0.01 sec)

This also shows how the events and routines can be found from the Information Schema.

The third event, procedure, and function are named with a backtick before the digit. This is to ensure our delete method quotes the identifier names correctly. If you are deleting events and routines from a production system, you are encouraged to test carefully that you quote the identifiers correctly.

Now on to delete the objects.

MySQL Shell

MySQL Shell is one of the great recent additions to MySQL. It supports performing DBA and devops tasks using SQL, Python, and JavaScript. Having a full featured programming language available allows you to automate tasks that would otherwise have been cumbersome or impossible.

The Python and JavaScript modes support executing queries either using NoSQL CRUD methods or SQL statements. So, now the task of deleting events, procedures, or functions becomes as simple as executing a query against the Information Schema to find the names of the objects and then generate the queries in a loop. The following will use Python, but it can also be written in JavaScript if you prefer.

I will first go through dropping events, then functions and procedures.

Events – Interactive

It is possible to get the names using the select() CRUD method on the information_schema.EVENTS table:

mysql-py> i_s = session.get_schema("information_schema")

mysql-py> events = i_s.EVENTS \
       ->             .select("sys.quote_identifier(EVENT_SCHEMA) AS EventSchema", "sys.quote_identifier(EVENT_NAME) AS EventName") \
       ->             .where("EVENT_SCHEMA = 'db1'").execute().fetch_all();

mysql-py> events
[
    [
        "`db1`",
        "`event1`"
    ], 
    [
        "`db1`",
        "`event2`"
    ], 
    [
        "`db1`",
        "`event``3`"
    ]
]

First an object of the information_schema is stored in the i_s variable. This allows you to access the Information Schema EVENTS table using i_s.EVENTS. From here the select() CRUD method is used to find all events in the db1 schema.

Tip

If you have a different criteria for which events to delete, edit the argument to the where() method.

The sys.quote_identifier() function is used to ensure that the schema and event names are quoted correctly, so for example the backtick in event`3 is escaped correctly (by adding a second backtick).
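
To see the effect, you can call the function directly in SQL. This is just a quick illustration (the surrounding output has been trimmed):

mysql> SELECT sys.quote_identifier('event`3') AS QuotedName;
+------------+
| QuotedName |
+------------+
| `event``3` |
+------------+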

Important

The sys.quote_identifier() function is only guaranteed to work correctly when the ANSI_QUOTES SQL mode is not enabled. When ANSI quotes are used, double quotes (") are used to quote identifiers, though backticks are still supported.

Now events is a list of the three events that were created earlier. You can easily check which events were found by just “executing” events. With that, you can delete the events one by one in a loop. For this, a query template is first defined and then used to add the name of the schema and event. Usually it is not safe to do that using the Python format() function; however, here we have ensured the names are quoted appropriately by using sys.quote_identifier().

mysql-py> sql_fmt = "DROP EVENT {0}.{1}"
mysql-py> for event in events:
       ->     print(sql_fmt.format(*event))
       ->     session.sql(sql_fmt.format(*event)).execute()
       ->
DROP EVENT `db1`.`event1`
DROP EVENT `db1`.`event2`
DROP EVENT `db1`.`event``3`
Query OK, 0 rows affected (0.0038 sec)

That’s it. Not exactly a one liner, but not hard either.

This is fine if it is a one-off task to delete some events, but what if this is a more regular task? MySQL Shell can accommodate that as well.

Events – Function

I have previously discussed how you can use external libraries in MySQL Shell and even shown an example of using it to have auto-repeating reports. It can also be used to implement a function where you give the schema you want to delete events from as an argument.

Let’s look at how the code looks for that:

from mysqlsh import globals

def events_by_schema(schema, verbose=True):
    """Drops all events in the schema passed as an argument.

    Requires a connection to exist to the MySQL instance where the
    events are to be dropped.

    Optionally specify whether to be verbose (enabled by default) by
    printing each DROP EVENT statement executed.

    Returns the number of events dropped, or False if an issue is
    encountered, in which case no attempt is made to drop the events.
    """

    # Get the session and ensure it is connected
    if not globals.session.is_open():
        print("Please create a connection first.")
        return False

    # Define the query to get the events
    i_s = globals.session.get_schema("information_schema")
    stmt = i_s.EVENTS.select(
        "sys.quote_identifier(EVENT_SCHEMA) AS EventSchema",
        "sys.quote_identifier(EVENT_NAME) AS EventName")
    stmt = stmt.where("EVENT_SCHEMA = :schema")

    # Execute the query and check for warnings
    result = stmt.bind("schema", schema).execute()
    events = result.fetch_all()
    if (result.get_warnings_count() > 0):
        # Bail out and print the warnings
        print("Warnings occurred - bailing out:")
        print(result.get_warnings())
        return False

    # Drop the events and check for warnings after each event
    sql_fmt = "DROP EVENT {0}.{1}"
    for event in events:
        sql = sql_fmt.format(*event)
        if verbose:
            print(sql)

        drop_result = globals.session.sql(sql).execute()
        if (drop_result.get_warnings_count() > 0):
            print("Warnings occurred:")
            print(result.get_warnings())

    return len(events)

Yes, this code is a lot longer than the previous example, but in reality it is the same with some extra checks added and obtaining the session (the connection to the MySQL instance) from the mysqlsh.globals module. Since the schema is now provided as an argument to the function, a placeholder and bind() are used to ensure it is safe to use the value in a query.

If you save the code in a file in the Python search path, you can import it into MySQL Shell. Recall from my earlier blog that you can include extra directories in your search path by updating mysqlshrc.py in one of the places where MySQL Shell looks for it; for example in ${HOME}/.mysqlsh on Linux or %APPDATA%\MySQL\mysqlsh on Windows. An example of the mysqlshrc.py file is:

import sys
sys.path.append(r"D:\MySQL\Shell\Python")

In this example, the code has been saved in the file delete_objects.py, so you can delete all events in the db1 schema like the following example:

mysql-py> import delete_objects
mysql-py> num_delete_events = delete_objects.events_by_schema('db1')
DROP EVENT `db1`.`event1`
DROP EVENT `db1`.`event2`
DROP EVENT `db1`.`event``3`

That completes the example for events. Now let’s turn to functions and procedures.

Procedures and Functions – Interactive

The example of deleting functions and/or procedures is very similar to what has just been seen for events, other than using the information_schema.ROUTINES table instead of information_schema.EVENTS to get the routine names, and that you need to set a filter on ROUTINE_TYPE to specify whether you want procedures or functions (or both if the filter is absent). So, I will jump straight into the example by first dropping all functions, then all procedures in the db1 schema:

mysql-py> i_s = session.get_schema("information_schema")

mysql-py> functions = i_s.ROUTINES \
       ->                .select("sys.quote_identifier(ROUTINE_SCHEMA) AS RoutineSchema", "sys.quote_identifier(ROUTINE_NAME) AS RoutineName") \
       ->                .where("ROUTINE_SCHEMA = 'db1' AND ROUTINE_TYPE = 'FUNCTION'").execute().fetch_all()

mysql-py> functions
[
    [
        "`db1`",
        "`func1`"
    ], 
    [
        "`db1`",
        "`func2`"
    ], 
    [
        "`db1`",
        "`func``3`"
    ]
]

mysql-py> sql_fmt = "DROP FUNCTION {0}.{1}"

mysql-py> for function in functions:
       ->     print(sql_fmt.format(*function))
       ->     session.sql(sql_fmt.format(*function)).execute()
       ->
DROP FUNCTION `db1`.`func1`
DROP FUNCTION `db1`.`func2`
DROP FUNCTION `db1`.`func``3`
Query OK, 0 rows affected (0.0684 sec)

Similar for the procedures:

mysql-py> i_s = session.get_schema("information_schema")

mysql-py> procedures = i_s.ROUTINES \
       ->             .select("sys.quote_identifier(ROUTINE_SCHEMA) AS RoutineSchema", "sys.quote_identifier(ROUTINE_NAME) AS RoutineName") \
       ->             .where("ROUTINE_SCHEMA = 'db1' AND ROUTINE_TYPE = 'PROCEDURE'").execute().fetch_all()

mysql-py> procedures
[
    [
        "`db1`",
        "`proc1`"
    ], 
    [
        "`db1`",
        "`proc2`"
    ], 
    [
        "`db1`",
        "`proc``3`"
    ]
]

mysql-py> sql_fmt = "DROP PROCEDURE {0}.{1}"

mysql-py> for procedure in procedures:
       ->     print(sql_fmt.format(*procedure))
       ->     session.sql(sql_fmt.format(*procedure)).execute()
DROP PROCEDURE `db1`.`proc1`
DROP PROCEDURE `db1`.`proc2`
DROP PROCEDURE `db1`.`proc``3`
Query OK, 0 rows affected (0.0976 sec)

Again, it is possible to wrap this in a function, so the code can be re-used.

Functions and Procedures – Function

Since the code for deleting functions and procedures is so similar – it is just the filter and DROP FUNCTION versus DROP PROCEDURE that is the difference – it is simple to use the same function to delete either. In fact, it would not take much to combine it with events_by_schema() from above, but that will be left as an exercise for the reader (there is a hint in the next section using MySQL Connector/Python).

The code for the routines_by_schema() function is:

from mysqlsh import globals

def routines_by_schema(schema, routine_type=None, verbose=True):
    """Drops all routines of a given type in the schema passed as an
    argument. If no type is given, both functions and procedures are
    dropped. The default is to drop both.

    Requires a connection to exist to the MySQL instance where the
    routines are to be dropped.

    Optionally specify whether to be verbose (enabled by default) by
    printing each DROP FUNCTION|PROCEDURE statement executed.

    Returns the number of routines dropped, or False if an issue is
    encountered, in which case no attempt is made to drop the routines.
    """

    # Get the session and ensure it is connected
    if not globals.session.is_open():
        print("Please create a connection first.")
        return False

    # Define the query to get the routines
    i_s = globals.session.get_schema("information_schema")
    filters = ["ROUTINE_SCHEMA = :schema"]
    if routine_type is not None:
        filters.append("ROUTINE_TYPE = :type")
    stmt = i_s.ROUTINES.select(
        "sys.quote_identifier(ROUTINE_SCHEMA) AS RoutineSchema",
        "sys.quote_identifier(ROUTINE_NAME) AS RoutineName",
        "ROUTINE_TYPE")
    stmt = stmt.where(" AND ".join(filters))

    # Execute the query and check for warnings
    stmt = stmt.bind("schema", schema)
    if routine_type is not None:
        stmt = stmt.bind("type", routine_type)
    result = stmt.execute()
    routines = result.fetch_all()
    if (result.get_warnings_count() > 0):
        # Bail out and print the warnings
        print("Warnings occurred - bailing out:")
        print(result.get_warnings())
        return False

    # Drop the routines and check for warnings after each routine
    sql_fmt = "DROP {2} {0}.{1}"
    for routine in routines:
        sql = sql_fmt.format(*routine)
        if verbose:
            print(sql)

        drop_result = globals.session.sql(sql).execute()

        if (drop_result.get_warnings_count() > 0):
            print("Warnings occurred:")
            print(result.get_warnings())

    return len(routines)

The function takes an extra argument compared to events_by_schema(), so it is possible to either delete both functions and procedures (the default) or just one type. The use of the function is also similar to what has been seen before:

mysql-py> num_delete_functions = delete_objects.routines_by_schema('db1', 'FUNCTION')
DROP FUNCTION `db1`.`func1`
DROP FUNCTION `db1`.`func2`
DROP FUNCTION `db1`.`func``3`

mysql-py> num_delete_procedure = delete_objects.routines_by_schema('db1', 'PROCEDURE')
DROP PROCEDURE `db1`.`proc1`
DROP PROCEDURE `db1`.`proc2`
DROP PROCEDURE `db1`.`proc``3`

# After restoring the functions and procedures, drop both types at once
mysql-py> num_delete_routines = delete_objects.routines_by_schema('db1')
DROP FUNCTION `db1`.`func1`
DROP FUNCTION `db1`.`func2`
DROP FUNCTION `db1`.`func``3`
DROP PROCEDURE `db1`.`proc1`
DROP PROCEDURE `db1`.`proc2`
DROP PROCEDURE `db1`.`proc``3`

The last thing is to look at how the same actions can be done with MySQL Connector/Python.

MySQL Connector/Python

Given how similar using Python in MySQL Shell is to using MySQL Connector/Python, it is natural to make the functions available as a command-line tool or as a function for MySQL Connector/Python programs. The following shows an example of how that can be done, combining events_by_schema() and routines_by_schema() into one function. The source code is:

import mysqlx

def objects_by_schema(session, schema, object_type=None, verbose=True):
    """Drops all events, functions, and/or procedures in the schema
    passed as an argument.

    Requires a connection to exist to the MySQL instance where the
    events are to be dropped.

    The object_type can be None (drop all events, functions, and
    procedures - the default), "EVENT", "FUNCTION", or "PROCEDURE".

    Optionally specify whether to be verbose (enabled by default) by
    printing each DROP statement executed.

    Returns the number of events, functions, and procedures dropped
    as a dictionary or False if an issue is encountered so no attempts
    are made to drop the events.
    """

    # Get the session and ensure it is connected
    if not session.is_open():
        print("Please create a connection first.")
        return False

    # Get an object for the Information Schema
    # and whether ANSI quotes are used
    i_s = session.get_schema("information_schema")
    sql_mode = session.sql("SELECT @@session.sql_mode AS sql_mode").execute() \
                      .fetch_one()["sql_mode"]
    sql_modes = sql_mode.split(",")
    ansi_quotes = "ANSI_QUOTES" if "ANSI_QUOTES" in sql_modes else ""

    events = []
    # If dropping events, get a list of all events
    if object_type is None or object_type == "EVENT":
        events_table = i_s.get_table("EVENTS")
        stmt = events_table.select(
            "EVENT_SCHEMA AS ObjectSchema", "EVENT_NAME AS ObjectName",
            "'EVENT' AS ObjectType")
        stmt = stmt.where("EVENT_SCHEMA = :schema")

        # Execute the query and check for warnings
        result = stmt.bind("schema", schema).execute()
        events = result.fetch_all()
        if (result.get_warnings_count() > 0):
            # Bail out and print the warnings
            print("Warnings occurred - bailing out:")
            print(result.get_warnings())
            return False

    routines = []
    if object_type is None or object_type in ("FUNCTION", "PROCEDURE"):
        routines_table = i_s.get_table("ROUTINES")
        filters = ["ROUTINE_SCHEMA = :schema"]
        if object_type is not None:
            filters.append("ROUTINE_TYPE = :type")
        stmt = routines_table.select(
            "ROUTINE_SCHEMA AS ObjectSchema", "ROUTINE_NAME AS ObjectName",
            "ROUTINE_TYPE AS ObjectType")
        stmt = stmt.where(" AND ".join(filters))

        # Execute the query and check for warnings
        stmt = stmt.bind("schema", schema)
        if object_type is not None:
            stmt = stmt.bind("type", object_type)
        result = stmt.execute()
        routines = result.fetch_all()
        if (result.get_warnings_count() > 0):
            # Bail out and print the warnings
            print("Warnings occurred - bailing out:")
            print(result.get_warnings())
            return False

    # Drop the routines and check for warnings after each routine
    objects = events + routines
    sql_fmt = "DROP {0} {1}.{2}"
    objects_deleted = {
        "EVENT": 0,
        "FUNCTION": 0,
        "PROCEDURE": 0,
    }
    for obj in objects:
        objects_deleted[obj[2]] += 1
        sql = sql_fmt.format(obj[2],
                  mysqlx.helpers.quote_identifier(obj[0], ansi_quotes),
                  mysqlx.helpers.quote_identifier(obj[1], ansi_quotes))
        if verbose:
            print(sql)

        drop_result = session.sql(sql).execute()

        if (drop_result.get_warnings_count() > 0):
            print("Warnings occurred:")
            print(result.get_warnings())

    return objects_deleted

if __name__ == "__main__":
    import sys
    import argparse
    import getpass

    # Handle the command line arguments
    parser = argparse.ArgumentParser(
        prog="delete_objects.py",
        description="Delete all events, functions, and/or procedures from a "
                   + "single schema.")

    parser.add_argument("--type", default="ALL",
        choices=["ALL", "EVENT", "FUNCTION", "PROCEDURE"],
        help="The object type to drop.")
    parser.add_argument("--schema", default=None,
        help="The schema to drop objects from.")
    parser.add_argument("--verbose", type=int, default=1,
        help="Whether to produce verbose output.")
    parser.add_argument("--user", default=None,
        help="The user to connect to MySQL as.")
    parser.add_argument("--host", default="127.0.0.1",
        help="The host to connect to MySQL on.")
    parser.add_argument("--port", type=int, default=33060,
        help="The mysqlx port to connect to MySQL on.")

    # Parse the arguments and get the password used when connecting to MySQL
    args = parser.parse_args(sys.argv[1:])
    object_type = args.type
    if args.type == "ALL":
        object_type = None
    verbose = True if args.verbose > 0 else False
    password = getpass.getpass("MySQL Password: ")

    # Connect to MySQL
    session = mysqlx.get_session(user=args.user, password=password,
                                 host=args.host, port=args.port)

    # Drop the objects
    objects_deleted = objects_by_schema(session, args.schema, object_type,
                                        verbose)
    if verbose:
        print(objects_deleted)

While the example is pretty extensive, it is not quite complete. There is only support for TCP connections using the user, password, host, and port options, and the password is always provided interactively (which is the most secure). Support for a configuration file and other options can be added if needed.

Instead of using the sys.quote_identifier() function, the schema and object names are quoted using the mysqlx.helpers.quote_identifier() function. Otherwise the program should be self-explanatory given the previous examples. You can use python delete_objects.py --help to get help with the supported arguments.
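
For example, an invocation that drops all procedures in the db1 schema could look like the following sketch (the pyuser account is the one used in the earlier examples, and the password is prompted for by getpass):

shell$ python delete_objects.py --user=pyuser --schema=db1 --type=PROCEDURE
MySQL Password: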

MySQL X DevAPI Connection Pool with Connector/Python

If you have an application that needs to use multiple connections to the MySQL database for short periods of time, it can be a good idea to use a connection pool to avoid creating a new connection and going through the whole authentication process every time a connection is needed. For the Python Database API (PEP 249), MySQL Connector/Python has had support for connection pools for a long time. With the release of MySQL 8.0.13, the new X DevAPI also has support for connection pools.

MySQL Connector/Python X DevAPI connection pool code snippet.

This blog will first cover the background of the X DevAPI connection pool feature in MySQL Connector/Python and then provide an example.

Background

You create a connection pool using the mysqlx.get_client() function. You may wonder why you are creating a client and not a pool? As will be shown later, there is a little more to this feature than just a connection pool. So, it makes sense to use a more generic term.

The get_client() function takes two arguments: The connection options and the client options. The connection options are the usual arguments defining which MySQL instance to connect to, authentication related options, how to connect, etc. The client options are the interesting ones in the discussion of a connection pool.

The client options are given as a dictionary or a JSON document written as a string. Currently, the only supported client options are the ones defining the connection pool. These are specified under the pooling field (an example will be provided shortly). This leaves room for the possibility to expand get_client() later with other features than a connection pool.

There are currently four connection pool options:

  • enabled: Whether the connection pool is enabled. The default is True.
  • max_size: The maximum number of connections that can be in the pool. The default is 25.
  • max_idle_time: How long, in milliseconds, a connection can be idle before it is closed. The default is 0 which means “infinite” (in practice 2147483000 milliseconds).
  • queue_timeout: The maximum amount of time in milliseconds that an attempt to get a connection from the pool will block. If no connections have become available before the timeout, a mysqlx.errors.PoolError exception is raised. The default is 0 which means “infinite” (in practice 2147483000 milliseconds).
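
As mentioned, the client options can be given either as a Python dictionary (as in the example later in this blog) or as a JSON document in a string. A minimal sketch of the string form, reusing the connect_args dictionary from the example below, is:

# The pooling options passed as a JSON document in a string
pool = mysqlx.get_client(
    connect_args,
    '{"pooling": {"max_size": 2, "queue_timeout": 3000}}'
)
session = pool.get_session()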

What happens if you disable the connection pool? In that case, the client that is returned simply works as a template for connections, and you can keep creating connections until MySQL Server runs out of connections. In that case, the session you end up with is a regular connection, so when you close it, it will disconnect from MySQL.

Back to the case where the connection pool is enabled. Once you have the client object, you can start using the pool. You retrieve a connection from the pool with the get_session() method. No arguments are used. After this you can use the session just as a regular standalone connection. The only difference is that when you close the session, it is returned to the pool rather than disconnected.

Enough background. Let’s see an example.

Example

The following example creates a connection pool with at most two connections. Then two sessions are fetched from the pool and their connection IDs are printed. A third session will be requested before one of the original sessions is returned to the pool. Finally, a session is reused and its connection ID is printed.

import mysqlx
from datetime import datetime

cnxid_sql = "SELECT CONNECTION_ID() AS ConnectionID"
fmt_id = "Connection {0} ID ..........................: {1}"

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
}

client_options = {
    "pooling": {
        "enabled": True,
        "max_size": 2,
        "max_idle_time": 60000,
        "queue_timeout": 3000,
    }
}

# Create the connection pool
pool = mysqlx.get_client(connect_args, client_options)

# Fetch two connections (exhausting the pool)
# and get the connection ID for each
connection1 = pool.get_session()
id1_row = connection1.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(1, id1_row["ConnectionID"]))
connection2 = pool.get_session()
id2_row = connection2.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(2, id2_row["ConnectionID"]))

# Attempt to get a third connection
time = datetime.now().strftime('%H:%M:%S')
print("Starting to request connection 3 .........: {0}".format(time))
try:
    connection3 = pool.get_session()
except mysqlx.errors.PoolError as err:
    print("Unable to fetch connection 3 .............: {0}".format(err))
time = datetime.now().strftime('%H:%M:%S')
print("Request for connection 3 completed .......: {0}".format(time))

# Return connection 1 to the pool
connection1.close()

# Try to get connection 3 again
connection3 = pool.get_session()
id3_row = connection3.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(3, id3_row["ConnectionID"]))

# Close all connections
pool.close()

The first thing to notice is the client options defined in lines 14-21. In this case all four options are set, but you only need to set those where you do not want the default value. The settings allow for at most two connections in the pool, a request for a session is allowed to take at most 3 seconds, and idle sessions should be disconnected after 60 seconds.

In line 24 the connection pool (client) is created, and subsequently two sessions are fetched from the pool. When a third session is requested, it will trigger a PoolError exception as the pool is exhausted. Lines 38-41 show how to handle the exception.

Finally, the first connection is returned to the pool, making it possible for the third request to complete.

An example of the output is (the connection IDs and timestamps will differ from execution to execution):

Connection 1 ID ..........................: 239
Connection 2 ID ..........................: 240
Starting to request connection 3 .........: 18:23:14
Unable to fetch connection 3 .............: pool max size has been reached
Request for connection 3 completed .......: 18:23:17
Connection 3 ID ..........................: 241

From the output you can see that the first attempt to fetch connection 3 takes three seconds before it times out and raises the exception – just as specified by the queue_timeout setting.

What may surprise you (at least if you have studied Chapter 5 from MySQL Connector/Python Revealed) from this output is that once connection 1 has been returned to the pool and connection 3 fetches the session again, it has a new connection ID. Does that mean the pool is not working? No, the pool is working alright. However, the X Plugin (the plugin in MySQL Server handling connections using the X Protocol) works differently than the connection handling for the traditional MySQL protocol.

The X Plugin distinguishes between the connection to the application and the thread inside MySQL. So, when the session is returned to the pool and the session is reset (to set the session variables back to the defaults and remove user variables), the thread inside MySQL is removed. As MySQL uses threads, it is cheap to create a new thread as it is needed, so this is not a performance problem. However, the connection to the application is maintained. This means you save the expensive steps of creating the connection and authenticating, while the thread only exists inside MySQL while the session is checked out of the pool.

If you are interested in learning more about MySQL Connector/Python 8 including how to use the X DevAPI, then I am the author of MySQL Connector/Python Revealed (Apress). It is available from Apress, Amazon, and other book stores.