The MySQL X DevApi: Working with NULL Values

There was an excellent question from Francis on my MySQL X DevAPI tutorial on how to work with NULL values:

Is it possible to store an attribute with the null value, and how to find it (xxx is null ?

It may sound like a trivial question, but the answer is not all that simple, so instead of just replying directly to the comment, I think it is worth covering the subject of NULL values in the X DevAPI in a separate post.

The Short Story

In short there are two ways to work with “not a value” in the X DevAPI. One way is to not set a value for those fields in which case you can compare the retrieved values with IS NULL as you would for a relational table, for example in MySQL Connector/Python:

# Find documents where "Birthday IS NULL"
stmt_find = my_docs.find('Birthday IS NULL')
stmt_find.fields('First_name', 'Surname', 'Birthday')
result = stmt_find.execute()

If you want the field to be present, you need to set the value to a JSON null value like:

john = {
    'First_name': 'John',
    'Surname': 'Doe',
    'Birthday': None,
}

Or if you prefer to specify the JSON document as a string:

john2 = """{
    "First_name": "John",
    "Surname": "Doe",
    "Birthday": null
}"""

Then you can retrieve the values by comparing to a JSON null value:

# Find documents where Birthday is equal to a JSON document
# just containing a null value
stmt_find = my_docs.find("Birthday = CAST('null' AS JSON)")
stmt_find.fields('First_name', 'Surname', 'Birthday')
result = stmt_find.execute()

The remainder of this post goes through a full example program with tips how to determine the correct way to query the JSON null value.

MySQL Connector/Python Revealed

Book

If you want to learn more about MySQL Connector/Python, then I have written MySQL Connector/Python Revealed published by Apress. The book both covers the traditional Python Database API (PEP 249) and the X DevAPI which is new as of MySQL 8.

The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

The Example Program

The code used for this blog is a Python program using the mysqlx module of MySQL Connector/Python. It can be downloaded here:

The program requires that the pyuser@localhost user exists and has access to the my_collections schema:

CREATE USER pyuser@localhost IDENTIFIED BY 'Py@pp4Demo';
GRANT CREATE, INSERT, SELECT, DROP
      ON my_collections.* TO pyuser@localhost;

The my_collections schema is not allowed to exist when you execute the script. In order to allow you to explorer the documents yourself from MySQL Shell, the schema is not dropped at the end of the script, so if you want to execute the script multiple times, you will need to manually drop the schema:

DROP SCHEMA my_collections;

You can now create the connection and create the document collection as:

import mysqlx

connect_args = {
    'host': '127.0.0.1',
    'port': 33060,
    'user': 'pyuser',
    'password': 'Py@pp4Demo',
};

# Create the database connection
db = mysqlx.get_session(**connect_args)

# Create the my_collections schema
schema = db.create_schema('my_collections')

# Create the my_docs collection in the my_collections schema
my_docs = schema.create_collection('my_docs')

Avoid

For simplicity, the connection arguments are hard coded as a dictionary near the top of this script. Never ever hard code the password (and preferably also the other options) in your real scripts.

Inserting Data

The insertion of data is done in a similar way to the previous tutorial, so I will not go into details with that. In this case, there are also documents for three persons, but for two of them the birthday is unknown:

# Define three documents to insert
kate = {
    'First_name': 'Kate',
    'Surname': 'Lee',
    'Birthday': '1982-08-09',
}

john = {
    'First_name': 'John',
    'Surname': 'Doe',
    'Birthday': None,
}


jane = {
    'First_name': 'Jane',
    'Surname': 'Doe'
}

# Insert the document for Adam
db.start_transaction()
result = my_docs.add(kate).add(john).add(jane).execute()
db.commit()

Here John and Jane Joe has no birthday included. For John Doe the Birthday field is set to None (which is stored as a JSON null value). If you specify the JSON document as a string instead of a dictionary, you need to use null (unquoted within the string) instead. The three documents are inserted in one transaction by chaining the methods together.

Warning

The example does not contain any error checking in order to keep the example simple and focus on the code related to working with NULL values. In a real program, please take care and code defensively with error checks.

Now the question is how to retrieve the data again. Let's try some different methods.

Find Documents by Binding None

In the tutorial, there was an example to find people by querying their birthday:

# Find the person born on 9 August 1982
print("")
stmt_find = my_docs.find("Birthday = :birthday")
stmt_find.fields("First_name", "Surname")
stmt_find.bind("birthday","1982-08-09")
result = stmt_find.execute()
person = result.fetch_one()
print("Person born on 9 August 1982: {First_name} {Surname}".format(**person))

So the naïve approach is to use the same code and just bind None to birthday rather than an actual birthday and hope MySQL Connector/Python handles it from there:

# Find documents by binding None
stmt_find = my_docs.find('Birthday = :birthday')
stmt_find.fields('First_name', 'Surname', 'Birthday')
stmt_find.bind('birthday', None)
result = stmt_find.execute()
print('People with Birthday bound to None:')
print('-' * 35)
for person in result.fetch_all():
    print("{First_name} {Surname} ({Birthday})".format(**person))

However that returns nothing. Like for relational tables you cannot find NULL values using the equal sign. You may think you just need to replace the equal sign (=) with the NULL-safe equal operator (<=>), however that is not supported in the X DevAPI and will cause an exception:

ValueError: Unknown token type = 27  when expecting atomic expression at 3

So, instead let's see what happens when you use the IS NULL expression.

Find Documents Using IS NULL

In relational queries, you can use IS NULL to find NULL values, like WHERE mycol IS NULL. Does that also work for documents with the X DevAPI? Let's try:

# Find documents where "Birthday IS NULL"
stmt_find = my_docs.find('Birthday IS NULL')
stmt_find.fields('First_name', 'Surname', 'Birthday')
result = stmt_find.execute()
print('')
print('People with Birthday IS NULL:')
print('-' * 29)
for person in result.fetch_all():
    print("{First_name} {Surname} ({Birthday})".format(**person))

This returns the document for Jane Doe (i.e. without the Birthday field):

People with Birthday IS NULL:
-----------------------------
Jane Doe (None)

So why is the document for John Doe returned? Let's switch to SQL and explore the document in more detail.

The John Doe Document

To investigate the John Doe document in more detail, you can use MySQL Shell to query the document using SQL. To make it easier to understand what is going on, you can tell MySQL Shell to show column information by enabling the showColumnTypeInfo option:

sql> \option showColumnTypeInfo true

Now extract the birthday for John Doe:

sql> SELECT doc->>'$.First_name' AS FirstName,
            doc->>'$.Surname' AS Surname,
            doc->'$.Birthday' AS Birthday
       FROM my_collections.my_docs
      WHERE doc->>'$.First_name' = 'John';
Field 1
Name:      `FirstName`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      String
DbType:    VAR_STRING
Collation: utf8mb4_bin (46)
Length:    4294967292
Decimals:  0
Flags:

Field 2
Name:      `Surname`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      String
DbType:    VAR_STRING
Collation: utf8mb4_bin (46)
Length:    4294967292
Decimals:  0
Flags:

Field 3
Name:      `Birthday`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      Json
DbType:    JSON
Collation: utf8mb4_bin (46)
Length:    4294967295
Decimals:  0
Flags:     BLOB BINARY

+-----------+---------+----------+
| FirstName | Surname | Birthday |
+-----------+---------+----------+
| John      | Doe     | null     |
+-----------+---------+----------+
1 row in set (0.0014 sec)

The result is preceded with details of each column. The interesting here is that the type of the Birthday column is JSON. This means that the null in the result is not a NULL as you are used to in SQL but a JSON document containing a single value being null.

You can easily demonstrate that a JSON null is not the same as an SQL NULL as the JSON null is a JSON document:

sql> SELECT NULL,
            CAST('null' AS JSON),
            CAST('null' AS JSON) IS NULL;
Field 1
Name:      `NULL`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      Bytes
DbType:    VAR_STRING
Collation: binary (63)
Length:    0
Decimals:  0
Flags:

Field 2
Name:      `CAST('null' AS JSON)`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      Json
DbType:    JSON
Collation: utf8mb4_bin (46)
Length:    4294967295
Decimals:  0
Flags:     BLOB BINARY

Field 3
Name:      `CAST('null' AS JSON) IS NULL`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      Integer
DbType:    TINY
Collation: binary (63)
Length:    1
Decimals:  0
Flags:     NOT_NULL NUM

*************************** 1. row ***************************
                        NULL: NULL
        CAST('null' AS JSON): null
CAST('null' AS JSON) IS NULL: 0
1 row in set (0.0004 sec)

The CAST() function is used to create a JSON document containing the null value, and it can be seen from the comparison against IS NULL that it is not considered the same as the SQL NULL value. When you think about it, it makes sense. If they were the same, how would you distinguish from the whole JSON column not having any value?

For comparison, let's retrieve the birthday for Jane Doe:

sql> SELECT doc->>'$.First_name' AS FirstName,
            doc->>'$.Surname' AS Surname,
            doc->'$.Birthday' AS Birthday
       FROM my_collections.my_docs
      WHERE doc->>'$.First_name' = 'Jane';
Field 1
Name:      `FirstName`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      String
DbType:    VAR_STRING
Collation: utf8mb4_bin (46)
Length:    4294967292
Decimals:  0
Flags:

Field 2
Name:      `Surname`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      String
DbType:    VAR_STRING
Collation: utf8mb4_bin (46)
Length:    4294967292
Decimals:  0
Flags:

Field 3
Name:      `Birthday`
Org_name:  ``
Catalog:   `def`
Database:  ``
Table:     ``
Org_table: ``
Type:      Json
DbType:    JSON
Collation: utf8mb4_bin (46)
Length:    4294967295
Decimals:  0
Flags:     BLOB BINARY

+-----------+---------+----------+
| FirstName | Surname | Birthday |
+-----------+---------+----------+
| Jane      | Doe     | NULL     |
+-----------+---------+----------+
1 row in set (0.0006 sec)

While, it is still a JSON document being returned for the birthday, this time there genuinely is no value for it (the field is absent), so it is an SQL NULL value that is returned.

With that in mind, we are able to write the code in our program that retrieves John Doe.

Find Documents Using a JSON Document with null

The last piece of the puzzle is to create a filter condition for the find() method that compares the birthday with a JSON document only containing the JSON null value. This can be done similarly to the previous test:

# Find documents where Birthday is equal to a JSON document
# just containing a null value
stmt_find = my_docs.find("Birthday = CAST('null' AS JSON)")
stmt_find.fields('First_name', 'Surname', 'Birthday')
result = stmt_find.execute()
print('')
print("People with Birthday = CAST('null' AS JSON):")
print('-' * 44)
for person in result.fetch_all():
    print("{First_name} {Surname} ({Birthday})".format(**person))

Note that in this case, you cannot bind the value as you are using an expression.

Full Program Output

The full program output is:

shell> python x_dev_api_null_2021-01-23.py
People with Birthday bound to None:
-----------------------------------

People with Birthday IS NULL:
-----------------------------
Jane Doe (None)

People with Birthday = CAST('null' AS JSON):
--------------------------------------------
John Doe (None)

Notice that both for Jane and John Doe, the birthday is returned as None – the Python equivalent of NULL.

I have worked with MySQL databases since 2006 both as an SQL developer, a database administrator, and for more than eight years as part of the Oracle MySQL Support team. I have spoken at MySQL Connect and Oracle OpenWorld on several occasions. I have contributed to the sys schema and four Oracle Certified Professional (OCP) exams for MySQL 5.6 to 8.0. I have written four books, all published at Apress.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.