MySQL supports a native JSON data type defined by RFC 7159 that enables efficient access to data in JSON (JavaScript Object Notation) documents. The JSON data type provides these advantages over storing JSON-format strings in a string column: Show
MySQL 8.0 also supports the JSON Merge Patch format defined in RFC 7396, using the JSON_MERGE_PATCH() function. See the description of this function, as well as Normalization, Merging, and Autowrapping of JSON Values, for examples and further information.
Note This discussion uses JSON in monotype to indicate specifically the JSON data type and “JSON” in regular font to indicate JSON data in general. The space required to store a JSON document is roughly the same as for LONGBLOB or LONGTEXT; see Section 11.7, “Data Type Storage Requirements”, for more information. It is important to keep in mind that the size of any JSON document stored in a JSON column is limited to the value of the max_allowed_packet system variable. (When the server is manipulating a JSON value internally in memory, it can be larger than this; the limit applies when the server stores it.) You can obtain the amount of space required to store a JSON document using the JSON_STORAGE_SIZE() function; note that for a JSON column, the storage size—and thus the value returned by this function—is that used by the column prior to any partial updates that may have been performed on it (see the discussion of the JSON partial update optimization later in this section). Prior to MySQL 8.0.13, a JSON column cannot have a non-NULL default value. Along with the JSON data type, a set of SQL functions is available to enable operations on JSON values, such as creation, manipulation, and searching. The following discussion shows examples of these operations. For details about individual functions, see Section 12.18, “JSON Functions”. A set of spatial functions for operating on GeoJSON values is also available. See Section 12.17.11, “Spatial GeoJSON Functions”. JSON columns, like columns of other binary types, are not indexed directly; instead, you can create an index on a generated column that extracts a scalar value from the JSON column. See Indexing a Generated Column to Provide a JSON Column Index, for a detailed example. The MySQL optimizer also looks for compatible indexes on virtual columns that match JSON expressions. In MySQL 8.0.17 and later, the InnoDB storage engine supports multi-valued indexes on JSON arrays. See Multi-Valued Indexes. MySQL NDB Cluster 8.0 supports JSON columns and MySQL JSON functions, including creation of an index on a column generated from a JSON column as a workaround for being unable to index a JSON column. A maximum of 3 JSON columns per NDB table is supported. Partial Updates of JSON ValuesIn MySQL 8.0, the optimizer can perform a partial, in-place update of a JSON column instead of removing the old document and writing the new document in its entirety to the column. This optimization can be performed for an update that meets the following conditions:
Such partial updates can be written to the binary log using a compact format that saves space; this can be enabled by setting the binlog_row_value_options system variable to PARTIAL_JSON. It is important to distinguish the partial update of a JSON column value stored in a table from writing the partial update of a row to the binary log. It is possible for the complete update of a JSON column to be recorded in the binary log as a partial update. This can happen when either (or both) of the last two conditions from the previous list is not met but the other conditions are satisfied. See also the description of binlog_row_value_options. The next few sections provide basic information regarding the creation and manipulation of JSON values.
A JSON array contains a list of values separated by commas and enclosed within [ and ] characters: ["abc", 10, null, true, false]A JSON object contains a set of key-value pairs separated by commas and enclosed within { and } characters: {"k1": "value", "k2": 10}As the examples illustrate, JSON arrays and objects can contain scalar values that are strings or numbers, the JSON null literal, or the JSON boolean true or false literals. Keys in JSON objects must be strings. Temporal (date, time, or datetime) scalar values are also permitted: ["12:18:29.000000", "2015-07-29", "2015-07-29 12:18:29.000000"]Nesting is permitted within JSON array elements and JSON object key values: [99, {"id": "HK500", "cost": 75.99}, ["hot", "cold"]] {"k1": "value", "k2": [10, 20]}You can also obtain JSON values from a number of functions supplied by MySQL for this purpose (see Section 12.18.2, “Functions That Create JSON Values”) as well as by casting values of other types to the JSON type using CAST(value AS JSON) (see Converting between JSON and non-JSON values). The next several paragraphs describe how MySQL handles JSON values provided as input. In MySQL, JSON values are written as strings. MySQL parses any string used in a context that requires a JSON value, and produces an error if it is not valid as JSON. These contexts include inserting a value into a column that has the JSON data type and passing an argument to a function that expects a JSON value (usually shown as json_doc or json_val in the documentation for MySQL JSON functions), as the following examples demonstrate:
MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. Strings in other character sets are converted to utf8mb4 as necessary. (For strings in the ascii or utf8 character sets, no conversion is needed because ascii and utf8 are subsets of utf8mb4.) As an alternative to writing JSON values using literal strings, functions exist for composing JSON values from component elements. JSON_ARRAY() takes a (possibly empty) list of values and returns a JSON array containing those values: mysql> SELECT JSON_ARRAY('a', 1, NOW()); +----------------------------------------+ | JSON_ARRAY('a', 1, NOW()) | +----------------------------------------+ | ["a", 1, "2015-07-27 09:43:47.000000"] | +----------------------------------------+JSON_OBJECT() takes a (possibly empty) list of key-value pairs and returns a JSON object containing those pairs: mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc'); +---------------------------------------+ | JSON_OBJECT('key1', 1, 'key2', 'abc') | +---------------------------------------+ | {"key1": 1, "key2": "abc"} | +---------------------------------------+JSON_MERGE_PRESERVE() takes two or more JSON documents and returns the combined result: mysql> SELECT JSON_MERGE_PRESERVE('["a", 1]', '{"key": "value"}'); +-----------------------------------------------------+ | JSON_MERGE_PRESERVE('["a", 1]', '{"key": "value"}') | +-----------------------------------------------------+ | ["a", 1, {"key": "value"}] | +-----------------------------------------------------+ 1 row in set (0.00 sec)For information about the merging rules, see Normalization, Merging, and Autowrapping of JSON Values. (MySQL 8.0.3 and later also support JSON_MERGE_PATCH(), which has somewhat different behavior. See JSON_MERGE_PATCH() compared with JSON_MERGE_PRESERVE(), for information about the differences between these two functions.) JSON values can be assigned to user-defined variables: mysql> SET @j = JSON_OBJECT('key', 'value'); mysql> SELECT @j; +------------------+ | @j | +------------------+ | {"key": "value"} | +------------------+However, user-defined variables cannot be of JSON data type, so although @j in the preceding example looks like a JSON value and has the same character set and collation as a JSON value, it does not have the JSON data type. Instead, the result from JSON_OBJECT() is converted to a string when assigned to the variable. Strings produced by converting JSON values have a character set of utf8mb4 and a collation of utf8mb4_bin: mysql> SELECT CHARSET(@j), COLLATION(@j); +-------------+---------------+ | CHARSET(@j) | COLLATION(@j) | +-------------+---------------+ | utf8mb4 | utf8mb4_bin | +-------------+---------------+Because utf8mb4_bin is a binary collation, comparison of JSON values is case-sensitive. mysql> SELECT JSON_ARRAY('x') = JSON_ARRAY('X'); +-----------------------------------+ | JSON_ARRAY('x') = JSON_ARRAY('X') | +-----------------------------------+ | 0 | +-----------------------------------+Case sensitivity also applies to the JSON null, true, and false literals, which always must be written in lowercase: mysql> SELECT JSON_VALID('null'), JSON_VALID('Null'), JSON_VALID('NULL'); +--------------------+--------------------+--------------------+ | JSON_VALID('null') | JSON_VALID('Null') | JSON_VALID('NULL') | +--------------------+--------------------+--------------------+ | 1 | 0 | 0 | +--------------------+--------------------+--------------------+ mysql> SELECT CAST('null' AS JSON); +----------------------+ | CAST('null' AS JSON) | +----------------------+ | null | +----------------------+ 1 row in set (0.00 sec) mysql> SELECT CAST('NULL' AS JSON); ERROR 3141 (22032): Invalid JSON text in argument 1 to function cast_as_json: "Invalid value." at position 0 in 'NULL'.Case sensitivity of the JSON literals differs from that of the SQL NULL, TRUE, and FALSE literals, which can be written in any lettercase: mysql> SELECT ISNULL(null), ISNULL(Null), ISNULL(NULL); +--------------+--------------+--------------+ | ISNULL(null) | ISNULL(Null) | ISNULL(NULL) | +--------------+--------------+--------------+ | 1 | 1 | 1 | +--------------+--------------+--------------+Sometimes it may be necessary or desirable to insert quote characters (" or ') into a JSON document. Assume for this example that you want to insert some JSON objects containing strings representing sentences that state some facts about MySQL, each paired with an appropriate keyword, into a table created using the SQL statement shown here: mysql> CREATE TABLE facts (sentence JSON);Among these keyword-sentence pairs is this one: mascot: The MySQL mascot is a dolphin named "Sakila".One way to insert this as a JSON object into the facts table is to use the MySQL JSON_OBJECT() function. In this case, you must escape each quote character using a backslash, as shown here: mysql> INSERT INTO facts VALUES > (JSON_OBJECT("mascot", "Our mascot is a dolphin named \"Sakila\"."));This does not work in the same way if you insert the value as a JSON object literal, in which case, you must use the double backslash escape sequence, like this: mysql> INSERT INTO facts VALUES > ('{"mascot": "Our mascot is a dolphin named \\"Sakila\\"."}');Using the double backslash keeps MySQL from performing escape sequence processing, and instead causes it to pass the string literal to the storage engine for processing. After inserting the JSON object in either of the ways just shown, you can see that the backslashes are present in the JSON column value by doing a simple SELECT, like this: mysql> SELECT sentence FROM facts; +---------------------------------------------------------+ | sentence | +---------------------------------------------------------+ | {"mascot": "Our mascot is a dolphin named \"Sakila\"."} | +---------------------------------------------------------+To look up this particular sentence employing mascot as the key, you can use the column-path operator ->, as shown here: mysql> SELECT col->"$.mascot" FROM qtest; +---------------------------------------------+ | col->"$.mascot" | +---------------------------------------------+ | "Our mascot is a dolphin named \"Sakila\"." | +---------------------------------------------+ 1 row in set (0.00 sec)This leaves the backslashes intact, along with the surrounding quote marks. To display the desired value using mascot as the key, but without including the surrounding quote marks or any escapes, use the inline path operator ->>, like this: mysql> SELECT sentence->>"$.mascot" FROM facts; +-----------------------------------------+ | sentence->>"$.mascot" | +-----------------------------------------+ | Our mascot is a dolphin named "Sakila". | +-----------------------------------------+
Note The previous example does not work as shown if the NO_BACKSLASH_ESCAPES server SQL mode is enabled. If this mode is set, a single backslash instead of double backslashes can be used to insert the JSON object literal, and the backslashes are preserved. If you use the JSON_OBJECT() function when performing the insert and this mode is set, you must alternate single and double quotes, like this: mysql> INSERT INTO facts VALUES > (JSON_OBJECT('mascot', 'Our mascot is a dolphin named "Sakila".'));See the description of the JSON_UNQUOTE() function for more information about the effects of this mode on escaped characters in JSON values.
Normalization, Merging, and Autowrapping of JSON ValuesWhen a string is parsed and found to be a valid JSON document, it is also normalized. This means that members with keys that duplicate a key found later in the document, reading from left to right, are discarded. The object value produced by the following JSON_OBJECT() call includes only the second key1 element because that key name occurs earlier in the value, as shown here: mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def'); +------------------------------------------------------+ | JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def') | +------------------------------------------------------+ | {"key1": "def", "key2": "abc"} | +------------------------------------------------------+Normalization is also performed when values are inserted into JSON columns, as shown here: mysql> CREATE TABLE t1 (c1 JSON); mysql> INSERT INTO t1 VALUES > ('{"x": 17, "x": "red"}'), > ('{"x": 17, "x": "red", "x": [3, 5, 7]}'); mysql> SELECT c1 FROM t1; +------------------+ | c1 | +------------------+ | {"x": "red"} | | {"x": [3, 5, 7]} | +------------------+This “last duplicate key wins” behavior is suggested by RFC 7159 and is implemented by most JavaScript parsers. (Bug #86866, Bug #26369555) In versions of MySQL prior to 8.0.3, members with keys that duplicated a key found earlier in the document were discarded. The object value produced by the following JSON_OBJECT() call does not include the second key1 element because that key name occurs earlier in the value: mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def'); +------------------------------------------------------+ | JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def') | +------------------------------------------------------+ | {"key1": 1, "key2": "abc"} | +------------------------------------------------------+Prior to MySQL 8.0.3, this “first duplicate key wins” normalization was also performed when inserting values into JSON columns. mysql> CREATE TABLE t1 (c1 JSON); mysql> INSERT INTO t1 VALUES > ('{"x": 17, "x": "red"}'), > ('{"x": 17, "x": "red", "x": [3, 5, 7]}'); mysql> SELECT c1 FROM t1; +-----------+ | c1 | +-----------+ | {"x": 17} | | {"x": 17} | +-----------+MySQL also discards extra whitespace between keys, values, or elements in the original JSON document, and leaves (or inserts, when necessary) a single space following each comma (,) or colon (:) when displaying it. This is done to enhance readibility. MySQL functions that produce JSON values (see Section 12.18.2, “Functions That Create JSON Values”) always return normalized values. To make lookups more efficient, MySQL also sorts the keys of a JSON object. You should be aware that the result of this ordering is subject to change and not guaranteed to be consistent across releases. Merging JSON ValuesTwo merging algorithms are supported in MySQL 8.0.3 (and later), implemented by the functions JSON_MERGE_PRESERVE() and JSON_MERGE_PATCH(). These differ in how they handle duplicate keys: JSON_MERGE_PRESERVE() retains values for duplicate keys, while JSON_MERGE_PATCH() discards all but the last value. The next few paragraphs explain how each of these two functions handles the merging of different combinations of JSON documents (that is, of objects and arrays).
Note JSON_MERGE_PRESERVE() is the same as the JSON_MERGE() function found in previous versions of MySQL (renamed in MySQL 8.0.3). JSON_MERGE() is still supported as an alias for JSON_MERGE_PRESERVE() in MySQL 8.0, but is deprecated and subject to removal in a future release. Merging arrays. In contexts that combine multiple arrays, the arrays are merged into a single array. JSON_MERGE_PRESERVE() does this by concatenating arrays named later to the end of the first array. JSON_MERGE_PATCH() considers each argument as an array consisting of a single element (thus having 0 as its index) and then applies “last duplicate key wins” logic to select only the last argument. You can compare the results shown by this query: mysql> SELECT -> JSON_MERGE_PRESERVE('[1, 2]', '["a", "b", "c"]', '[true, false]') AS Preserve, -> JSON_MERGE_PATCH('[1, 2]', '["a", "b", "c"]', '[true, false]') AS Patch\G *************************** 1. row *************************** Preserve: [1, 2, "a", "b", "c", true, false] Patch: [true, false]Multiple objects when merged produce a single object. JSON_MERGE_PRESERVE() handles multiple objects having the same key by combining all unique values for that key in an array; this array is then used as the value for that key in the result. JSON_MERGE_PATCH() discards values for which duplicate keys are found, working from left to right, so that the result contains only the last value for that key. The following query illustrates the difference in the results for the duplicate key a: mysql> SELECT -> JSON_MERGE_PRESERVE('{"a": 1, "b": 2}', '{"c": 3, "a": 4}', '{"c": 5, "d": 3}') AS Preserve, -> JSON_MERGE_PATCH('{"a": 3, "b": 2}', '{"c": 3, "a": 4}', '{"c": 5, "d": 3}') AS Patch\G *************************** 1. row *************************** Preserve: {"a": [1, 4], "b": 2, "c": [3, 5], "d": 3} Patch: {"a": 4, "b": 2, "c": 5, "d": 3}Nonarray values used in a context that requires an array value are autowrapped: The value is surrounded by [ and ] characters to convert it to an array. In the following statement, each argument is autowrapped as an array ([1], [2]). These are then merged to produce a single result array; as in the previous two cases, JSON_MERGE_PRESERVE() combines values having the same key while JSON_MERGE_PATCH() discards values for all duplicate keys except the last, as shown here: mysql> SELECT -> JSON_MERGE_PRESERVE('1', '2') AS Preserve, -> JSON_MERGE_PATCH('1', '2') AS Patch\G *************************** 1. row *************************** Preserve: [1, 2] Patch: 2Array and object values are merged by autowrapping the object as an array and merging the arrays by combining values or by “last duplicate key wins” according to the choice of merging function (JSON_MERGE_PRESERVE() or JSON_MERGE_PATCH(), respectively), as can be seen in this example: mysql> SELECT -> JSON_MERGE_PRESERVE('[10, 20]', '{"a": "x", "b": "y"}') AS Preserve, -> JSON_MERGE_PATCH('[10, 20]', '{"a": "x", "b": "y"}') AS Patch\G *************************** 1. row *************************** Preserve: [10, 20, {"a": "x", "b": "y"}] Patch: {"a": "x", "b": "y"}
Searching and Modifying JSON ValuesA JSON path expression selects a value within a JSON document. Path expressions are useful with functions that extract parts of or modify a JSON document, to specify where within that document to operate. For example, the following query extracts from a JSON document the value of the member with the name key: mysql> SELECT JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name'); +---------------------------------------------------------+ | JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name') | +---------------------------------------------------------+ | "Aztalan" | +---------------------------------------------------------+Path syntax uses a leading $ character to represent the JSON document under consideration, optionally followed by selectors that indicate successively more specific parts of the document:
Let $ refer to this JSON array with three elements: [3, {"a": [5, 6], "b": 10}, [99, 100]]Then:
Because $[1] and $[2] evaluate to nonscalar values, they can be used as the basis for more-specific path expressions that select nested values. Examples:
As mentioned previously, path components that name keys must be quoted if the unquoted key name is not legal in path expressions. Let $ refer to this value: {"a fish": "shark", "a bird": "sparrow"}The keys both contain a space and must be quoted:
Paths that use wildcards evaluate to an array that can contain multiple values: mysql> SELECT JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.*'); +---------------------------------------------------------+ | JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.*') | +---------------------------------------------------------+ | [1, 2, [3, 4, 5]] | +---------------------------------------------------------+ mysql> SELECT JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.c[*]'); +------------------------------------------------------------+ | JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.c[*]') | +------------------------------------------------------------+ | [3, 4, 5] | +------------------------------------------------------------+In the following example, the path $**.b evaluates to multiple paths ($.a.b and $.c.b) and produces an array of the matching path values: mysql> SELECT JSON_EXTRACT('{"a": {"b": 1}, "c": {"b": 2}}', '$**.b'); +---------------------------------------------------------+ | JSON_EXTRACT('{"a": {"b": 1}, "c": {"b": 2}}', '$**.b') | +---------------------------------------------------------+ | [1, 2] | +---------------------------------------------------------+Ranges from JSON arrays. You can use ranges with the to keyword to specify subsets of JSON arrays. For example, $[1 to 3] includes the second, third, and fourth elements of an array, as shown here: mysql> SELECT JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[1 to 3]'); +----------------------------------------------+ | JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[1 to 3]') | +----------------------------------------------+ | [2, 3, 4] | +----------------------------------------------+ 1 row in set (0.00 sec)The syntax is M to N, where M and N are, respectively, the first and last indexes of a range of elements from a JSON array. N must be greater than M; M must be greater than or equal to 0. Array elements are indexed beginning with 0. You can use ranges in contexts where wildcards are supported. Rightmost array element. The last keyword is supported as a synonym for the index of the last element in an array. Expressions of the form last - N can be used for relative addressing, and within range definitions, like this: mysql> SELECT JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[last-3 to last-1]'); +--------------------------------------------------------+ | JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[last-3 to last-1]') | +--------------------------------------------------------+ | [2, 3, 4] | +--------------------------------------------------------+ 1 row in set (0.01 sec)If the path is evaluated against a value that is not an array, the result of the evaluation is the same as if the value had been wrapped in a single-element array: mysql> SELECT JSON_REPLACE('"Sakila"', '$[last]', 10); +-----------------------------------------+ | JSON_REPLACE('"Sakila"', '$[last]', 10) | +-----------------------------------------+ | 10 | +-----------------------------------------+ 1 row in set (0.00 sec)You can use column->path with a JSON column identifier and JSON path expression as a synonym for JSON_EXTRACT(column, path). See Section 12.18.3, “Functions That Search JSON Values”, for more information. See also Indexing a Generated Column to Provide a JSON Column Index. Some functions take an existing JSON document, modify it in some way, and return the resulting modified document. Path expressions indicate where in the document to make changes. For example, the JSON_SET(), JSON_INSERT(), and JSON_REPLACE() functions each take a JSON document, plus one or more path-value pairs that describe where to modify the document and the values to use. The functions differ in how they handle existing and nonexisting values within the document. Consider this document: mysql> SET @j = '["a", {"b": [true, false]}, [10, 20]]';JSON_SET() replaces values for paths that exist and adds values for paths that do not exist:. mysql> SELECT JSON_SET(@j, '$[1].b[0]', 1, '$[2][2]', 2); +--------------------------------------------+ | JSON_SET(@j, '$[1].b[0]', 1, '$[2][2]', 2) | +--------------------------------------------+ | ["a", {"b": [1, false]}, [10, 20, 2]] | +--------------------------------------------+In this case, the path $[1].b[0] selects an existing value (true), which is replaced with the value following the path argument (1). The path $[2][2] does not exist, so the corresponding value (2) is added to the value selected by $[2]. JSON_INSERT() adds new values but does not replace existing values: mysql> SELECT JSON_INSERT(@j, '$[1].b[0]', 1, '$[2][2]', 2); +-----------------------------------------------+ | JSON_INSERT(@j, '$[1].b[0]', 1, '$[2][2]', 2) | +-----------------------------------------------+ | ["a", {"b": [true, false]}, [10, 20, 2]] | +-----------------------------------------------+JSON_REPLACE() replaces existing values and ignores new values: mysql> SELECT JSON_REPLACE(@j, '$[1].b[0]', 1, '$[2][2]', 2); +------------------------------------------------+ | JSON_REPLACE(@j, '$[1].b[0]', 1, '$[2][2]', 2) | +------------------------------------------------+ | ["a", {"b": [1, false]}, [10, 20]] | +------------------------------------------------+The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated. JSON_REMOVE() takes a JSON document and one or more paths that specify values to be removed from the document. The return value is the original document minus the values selected by paths that exist within the document: mysql> SELECT JSON_REMOVE(@j, '$[2]', '$[1].b[1]', '$[1].b[1]'); +---------------------------------------------------+ | JSON_REMOVE(@j, '$[2]', '$[1].b[1]', '$[1].b[1]') | +---------------------------------------------------+ | ["a", {"b": [true]}] | +---------------------------------------------------+The paths have these effects:
Many of the JSON functions supported by MySQL and described elsewhere in this Manual (see Section 12.18, “JSON Functions”) require a path expression in order to identify a specific element in a JSON document. A path consists of the path's scope followed by one or more path legs. For paths used in MySQL JSON functions, the scope is always the document being searched or otherwise operated on, represented by a leading $ character. Path legs are separated by period characters (.). Cells in arrays are represented by [N], where N is a non-negative integer. Names of keys must be double-quoted strings or valid ECMAScript identifiers (see Identifier Names and Identifiers, in the ECMAScript Language Specification). Path expressions, like JSON text, should be encoded using the ascii, utf8, or utf8mb4 character set. Other character encodings are implicitly coerced to utf8mb4. The complete syntax is shown here: pathExpression: scope[(pathLeg)*] pathLeg: member | arrayLocation | doubleAsterisk member: period ( keyName | asterisk ) arrayLocation: leftBracket ( nonNegativeInteger | asterisk ) rightBracket keyName: ESIdentifier | doubleQuotedString doubleAsterisk: '**' period: '.' asterisk: '*' leftBracket: '[' rightBracket: ']'As noted previously, in MySQL, the scope of the path is always the document being operated on, represented as $. You can use '$' as a synonym for the document in JSON path expressions.
Note Some implementations support column references for scopes of JSON paths; MySQL 8.0 does not support these. The wildcard * and ** tokens are used as follows:
For path syntax examples, see the descriptions of the various JSON functions that take paths as arguments, such as JSON_CONTAINS_PATH(), JSON_SET(), and JSON_REPLACE(). For examples which include the use of the * and ** wildcards, see the description of the JSON_SEARCH() function. MySQL 8.0 also supports range notation for subsets of JSON arrays using the to keyword (such as $[2 to 10]), as well as the last keyword as a synonym for the rightmost element of an array. See Searching and Modifying JSON Values, for more information and examples.
Comparison and Ordering of JSON ValuesJSON values can be compared using the =, <, <=, >, >=, <>, !=, and <=> operators. The following comparison operators and functions are not yet supported with JSON values:
A workaround for the comparison operators and functions just listed is to cast JSON values to a native MySQL numeric or string data type so they have a consistent non-JSON scalar type. Comparison of JSON values takes place at two levels. The first level of comparison is based on the JSON types of the compared values. If the types differ, the comparison result is determined solely by which type has higher precedence. If the two values have the same JSON type, a second level of comparison occurs using type-specific rules. The following list shows the precedences of JSON types, from highest precedence to the lowest. (The type names are those returned by the JSON_TYPE() function.) Types shown together on a line have the same precedence. Any value having a JSON type listed earlier in the list compares greater than any value having a JSON type listed later in the list. BLOB BIT OPAQUE DATETIME TIME DATE BOOLEAN ARRAY OBJECT STRING INTEGER, DOUBLE NULLFor JSON values of the same precedence, the comparison rules are type specific:
For comparison of any JSON value to SQL NULL, the result is UNKNOWN. For comparison of JSON and non-JSON values, the non-JSON value is converted to JSON according to the rules in the following table, then the values compared as described previously.
Converting between JSON and non-JSON valuesThe following table provides a summary of the rules that MySQL follows when casting between JSON values and values of other types:
Table 11.3 JSON Conversion Rules
ORDER BY and GROUP BY for JSON values works according to these principles:
For sorting, it can be beneficial to cast a JSON scalar to some other native MySQL type. For example, if a column named jdoc contains JSON objects having a member consisting of an id key and a nonnegative value, use this expression to sort by id values: ORDER BY CAST(JSON_EXTRACT(jdoc, '$.id') AS UNSIGNED)If there happens to be a generated column defined to use the same expression as in the ORDER BY, the MySQL optimizer recognizes that and considers using the index for the query execution plan. See Section 8.3.11, “Optimizer Use of Generated Column Indexes”.
Aggregation of JSON ValuesFor aggregation of JSON values, SQL NULL values are ignored as for other data types. Non-NULL values are converted to a numeric type and aggregated, except for MIN(), MAX(), and GROUP_CONCAT(). The conversion to number should produce a meaningful result for JSON values that are numeric scalars, although (depending on the values) truncation and loss of precision may occur. Conversion to number of other JSON values may not produce a meaningful result. Page 2
11.6 Data Type Default ValuesData type specifications can have explicit or implicit default values. A DEFAULT value clause in a data type specification explicitly indicates a default value for a column. Examples: SERIAL DEFAULT VALUE is a special case. In the definition of an integer column, it is an alias for NOT NULL AUTO_INCREMENT UNIQUE. Some aspects of explicit DEFAULT clause handling are version dependent, as described following.
Explicit Default Handling as of MySQL 8.0.13The default value specified in a DEFAULT clause can be a literal constant or an expression. With one exception, enclose expression default values within parentheses to distinguish them from literal constant default values. Examples: CREATE TABLE t1 ( -- literal defaults i INT DEFAULT 0, c VARCHAR(10) DEFAULT '', -- expression defaults f FLOAT DEFAULT (RAND() * RAND()), b BINARY(16) DEFAULT (UUID_TO_BIN(UUID())), d DATE DEFAULT (CURRENT_DATE + INTERVAL 1 YEAR), p POINT DEFAULT (Point(0,0)), j JSON DEFAULT (JSON_ARRAY()) );The exception is that, for TIMESTAMP and DATETIME columns, you can specify the CURRENT_TIMESTAMP function as the default, without enclosing parentheses. See Section 11.2.5, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”. The BLOB, TEXT, GEOMETRY, and JSON data types can be assigned a default value only if the value is written as an expression, even if the expression value is a literal:
Expression default values must adhere to the following rules. An error occurs if an expression contains disallowed constructs.
Note If any component of an expression default value depends on the SQL mode, different results may occur for different uses of the table unless the SQL mode is the same during all uses. For CREATE TABLE ... LIKE and CREATE TABLE ... SELECT, the destination table preserves expression default values from the original table. If an expression default value refers to a nondeterministic function, any statement that causes the expression to be evaluated is unsafe for statement-based replication. This includes statements such as INSERT and UPDATE. In this situation, if binary logging is disabled, the statement is executed as normal. If binary logging is enabled and binlog_format is set to STATEMENT, the statement is logged and executed but a warning message is written to the error log, because replication slaves might diverge. When binlog_format is set to MIXED or ROW, the statement is executed as normal. When inserting a new row, the default value for a column with an expression default can be inserted either by omitting the column name or by specifying the column as DEFAULT (just as for columns with literal defaults): mysql> CREATE TABLE t4 (uid BINARY(16) DEFAULT (UUID_TO_BIN(UUID()))); mysql> INSERT INTO t4 () VALUES(); mysql> INSERT INTO t4 () VALUES(DEFAULT); mysql> SELECT BIN_TO_UUID(uid) AS uid FROM t4; +--------------------------------------+ | uid | +--------------------------------------+ | f1109174-94c9-11e8-971d-3bf1095aa633 | | f110cf9a-94c9-11e8-971d-3bf1095aa633 | +--------------------------------------+However, the use of DEFAULT(col_name) to specify the default value for a named column is permitted only for columns that have a literal default value, not for columns that have an expression default value. Not all storage engines permit expression default values. For those that do not, an ER_UNSUPPORTED_ACTION_ON_DEFAULT_VAL_GENERATED error occurs. If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules. See Section 12.3, “Type Conversion in Expression Evaluation”.
Explicit Default Handling Prior to MySQL 8.0.13With one exception, the default value specified in a DEFAULT clause must be a literal constant; it cannot be a function or an expression. This means, for example, that you cannot set the default for a date column to be the value of a function such as NOW() or CURRENT_DATE. The exception is that, for TIMESTAMP and DATETIME columns, you can specify CURRENT_TIMESTAMP as the default. See Section 11.2.5, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”. The BLOB, TEXT, GEOMETRY, and JSON data types cannot be assigned a default value. If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules. See Section 12.3, “Type Conversion in Expression Evaluation”.
Implicit Default HandlingIf a data type specification includes no explicit DEFAULT value, MySQL determines the default value as follows: If the column can take NULL as a value, the column is defined with an explicit DEFAULT NULL clause. If the column cannot take NULL as a value, MySQL defines the column with no explicit DEFAULT clause. For data entry into a NOT NULL column that has no explicit DEFAULT clause, if an INSERT or REPLACE statement includes no value for the column, or an UPDATE statement sets the column to NULL, MySQL handles the column according to the SQL mode in effect at the time:
Suppose that a table t is defined as follows: CREATE TABLE t (i INT NOT NULL);In this case, i has no explicit default, so in strict mode each of the following statements produce an error and no row is inserted. When not using strict mode, only the third statement produces an error; the implicit default is inserted for the first two statements, but the third fails because DEFAULT(i) cannot produce a value: INSERT INTO t VALUES(); INSERT INTO t VALUES(DEFAULT); INSERT INTO t VALUES(DEFAULT(i));See Section 5.1.11, “Server SQL Modes”. For a given table, the SHOW CREATE TABLE statement displays which columns have an explicit DEFAULT clause. Implicit defaults are defined as follows:
Page 3
11.7 Data Type Storage RequirementsThe storage requirements for table data on disk depend on several factors. Different storage engines represent data types and store raw data differently. Table data might be compressed, either for a column or an entire row, complicating the calculation of storage requirements for a table or column. Despite differences in storage layout on disk, the internal MySQL APIs that communicate and exchange information about table rows use a consistent data structure that applies across all storage engines. This section includes guidelines and information for the storage requirements for each data type supported by MySQL, including the internal format and size for storage engines that use a fixed-size representation for data types. Information is listed by category or storage engine. The internal representation of a table has a maximum row size of 65,535 bytes, even if the storage engine is capable of supporting larger rows. This figure excludes BLOB or TEXT columns, which contribute only 9 to 12 bytes toward this size. For BLOB and TEXT data, the information is stored internally in a different area of memory than the row buffer. Different storage engines handle the allocation and storage of this data in different ways, according to the method they use for handling the corresponding types. For more information, see Chapter 16, Alternative Storage Engines, and Section 8.4.7, “Limits on Table Column Count and Row Size”.
NDB Table Storage Requirements
Important NDB tables use 4-byte alignment; all NDB data storage is done in multiples of 4 bytes. Thus, a column value that would typically take 15 bytes requires 16 bytes in an NDB table. For example, in NDB tables, the TINYINT, SMALLINT, MEDIUMINT, and INTEGER (INT) column types each require 4 bytes storage per record due to the alignment factor. Each BIT(M) column takes M bits of storage space. Although an individual BIT column is not 4-byte aligned, NDB reserves 4 bytes (32 bits) per row for the first 1-32 bits needed for BIT columns, then another 4 bytes for bits 33-64, and so on. While a NULL itself does not require any storage space, NDB reserves 4 bytes per row if the table definition contains any columns allowing NULL, up to 32 NULL columns. (If an NDB Cluster table is defined with more than 32 NULL columns up to 64 NULL columns, then 8 bytes per row are reserved.) Every table using the NDB storage engine requires a primary key; if you do not define a primary key, a “hidden” primary key is created by NDB. This hidden primary key consumes 31-35 bytes per table record. You can use the ndb_size.pl Perl script to estimate NDB storage requirements. It connects to a current MySQL (not NDB Cluster) database and creates a report on how much space that database would require if it used the NDB storage engine. See Section 23.5.28, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator” for more information.
Numeric Type Storage RequirementsValues for DECIMAL (and NUMERIC) columns are represented using a binary format that packs nine decimal (base 10) digits into four bytes. Storage for the integer and fractional parts of each value are determined separately. Each multiple of nine digits requires four bytes, and the “leftover” digits require some fraction of four bytes. The storage required for excess digits is given by the following table.
Date and Time Type Storage RequirementsFor TIME, DATETIME, and TIMESTAMP columns, the storage required for tables created before MySQL 5.6.4 differs from tables created from 5.6.4 on. This is due to a change in 5.6.4 that permits these types to have a fractional part, which requires from 0 to 3 bytes. As of MySQL 5.6.4, storage for YEAR and DATE remains unchanged. However, TIME, DATETIME, and TIMESTAMP are represented differently. DATETIME is packed more efficiently, requiring 5 rather than 8 bytes for the nonfractional part, and all three parts have a fractional part that requires from 0 to 3 bytes, depending on the fractional seconds precision of stored values. For example, TIME(0), TIME(2), TIME(4), and TIME(6) use 3, 4, 5, and 6 bytes, respectively. TIME and TIME(0) are equivalent and require the same storage. For details about internal representation of temporal values, see MySQL Internals: Important Algorithms and Structures.
String Type Storage RequirementsIn the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value. Variable-length string types are stored using a length prefix plus data. The length prefix requires from one to four bytes depending on the data type, and the value of the prefix is L (the byte length of the string). For example, storage for a MEDIUMTEXT value requires L bytes to store the value plus three bytes to store the length of the value. To calculate the number of bytes used to store a particular CHAR, VARCHAR, or TEXT column value, you must take into account the character set used for that column and whether the value contains multibyte characters. In particular, when using a utf8 Unicode character set, you must keep in mind that not all characters use the same number of bytes. utf8mb3 and utf8mb4 character sets can require up to three and four bytes per character, respectively. For a breakdown of the storage used for different categories of utf8mb3 or utf8mb4 characters, see Section 10.9, “Unicode Support”. VARCHAR, VARBINARY, and the BLOB and TEXT types are variable-length types. For each, the storage requirements depend on these factors:
For example, a VARCHAR(255) column can hold a string with a maximum length of 255 characters. Assuming that the column uses the latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is five bytes. If the same column is instead declared to use the ucs2 double-byte character set, the storage requirement is 10 bytes: The length of 'abcd' is eight bytes and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes). The effective maximum number of bytes that can be stored in a VARCHAR or VARBINARY column is subject to the maximum row size of 65,535 bytes, which is shared among all columns. For a VARCHAR column that stores multibyte characters, the effective maximum number of characters is less. For example, utf8mb4 characters can require up to four bytes per character, so a VARCHAR column that uses the utf8mb4 character set can be declared to be a maximum of 16,383 characters. See Section 8.4.7, “Limits on Table Column Count and Row Size”. InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4. The NDB storage engine supports variable-width columns. This means that a VARCHAR column in an NDB Cluster table requires the same amount of storage as would any other storage engine, with the exception that such values are 4-byte aligned. Thus, the string 'abcd' stored in a VARCHAR(50) column using the latin1 character set requires 8 bytes (rather than 5 bytes for the same column value in a MyISAM table). TEXT, BLOB, and JSON columns are implemented differently in the NDB storage engine, wherein each row in the column is made up of two separate parts. One of these is of fixed size (256 bytes for TEXT and BLOB, 4000 bytes for JSON), and is actually stored in the original table. The other consists of any data in excess of 256 bytes, which is stored in a hidden blob parts table. The size of the rows in this second table are determined by the exact type of the column, as shown in the following table: This means that the size of a TEXT column is 256 if size <= 256 (where size represents the size of the row); otherwise, the size is 256 + size + (2000 × (size − 256) % 2000). No blob parts are stored separately by NDB for TINYBLOB or TINYTEXT column values. You can increase the size of an NDB blob column's blob part to the maximum of 13948 using NDB_COLUMN in a column comment when creating or altering the parent table. In NDB 8.0.30 and later, it is also possible to set the inline size for a TEXT, BLOB, or JSON column, using NDB_TABLE in a column comment. See NDB_COLUMN Options, for more information. The size of an ENUM object is determined by the number of different enumeration values. One byte is used for enumerations with up to 255 possible values. Two bytes are used for enumerations having between 256 and 65,535 possible values. See Section 11.3.5, “The ENUM Type”. The size of a SET object is determined by the number of different set members. If the set size is N, the object occupies (N+7)/8 bytes, rounded up to 1, 2, 3, 4, or 8 bytes. A SET can have a maximum of 64 members. See Section 11.3.6, “The SET Type”.
Spatial Type Storage RequirementsMySQL stores geometry values using 4 bytes to indicate the SRID followed by the WKB representation of the value. The LENGTH() function returns the space in bytes required for value storage. For descriptions of WKB and internal storage formats for spatial values, see Section 11.4.3, “Supported Spatial Data Formats”.
JSON Storage RequirementsIn general, the storage requirement for a JSON column is approximately the same as for a LONGBLOB or LONGTEXT column; that is, the space consumed by a JSON document is roughly the same as it would be for the document's string representation stored in a column of one of these types. However, there is an overhead imposed by the binary encoding, including metadata and dictionaries needed for lookup, of the individual values stored in the JSON document. For example, a string stored in a JSON document requires 4 to 10 bytes additional storage, depending on the length of the string and the size of the object or array in which it is stored. In addition, MySQL imposes a limit on the size of any JSON document stored in a JSON column such that it cannot be any larger than the value of max_allowed_packet. Page 4
11.8 Choosing the Right Type for a ColumnFor optimum storage, you should try to use the most precise type in all cases. For example, if an integer column is used for values in the range from 1 to 99999, MEDIUMINT UNSIGNED is the best type. Of the types that represent all the required values, this type uses the least amount of storage. All basic calculations (+, -, *, and /) with DECIMAL columns are done with precision of 65 decimal (base 10) digits. See Section 11.1.1, “Numeric Data Type Syntax”. If accuracy is not too important or if speed is the highest priority, the DOUBLE type may be good enough. For high precision, you can always convert to a fixed-point type stored in a BIGINT. This enables you to do all calculations with 64-bit integers and then convert results back to floating-point values as necessary. Page 5
11.9 Using Data Types from Other Database EnginesTo facilitate the use of code written for SQL implementations from other vendors, MySQL maps data types as shown in the following table. These mappings make it easier to import table definitions from other database systems into MySQL. Data type mapping occurs at table creation time, after which the original type specifications are discarded. If you create a table with types used by other vendors and then issue a DESCRIBE tbl_name statement, MySQL reports the table structure using the equivalent MySQL types. For example: mysql> CREATE TABLE t (a BOOL, b FLOAT8, c LONG VARCHAR, d NUMERIC); Query OK, 0 rows affected (0.00 sec) mysql> DESCRIBE t; +-------+---------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+---------------+------+-----+---------+-------+ | a | tinyint(1) | YES | | NULL | | | b | double | YES | | NULL | | | c | mediumtext | YES | | NULL | | | d | decimal(10,0) | YES | | NULL | | +-------+---------------+------+-----+---------+-------+ 4 rows in set (0.01 sec)Page 6
Expressions can be used at several points in SQL statements, such as in the ORDER BY or HAVING clauses of SELECT statements, in the WHERE clause of a SELECT, DELETE, or UPDATE statement, or in SET statements. Expressions can be written using values from several sources, such as literal values, column values, NULL, variables, built-in functions and operators, loadable functions, and stored functions (a type of stored object). This chapter describes the built-in functions and operators that are permitted for writing expressions in MySQL. For information about loadable functions and stored functions, see Section 5.7, “MySQL Server Loadable Functions”, and Section 25.2, “Using Stored Routines”. For the rules describing how the server interprets references to different kinds of functions, see Section 9.2.5, “Function Name Parsing and Resolution”. An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for a particular function or operator.
Note By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MySQL parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted. To tell the MySQL server to accept spaces after function names by starting it with the --sql-mode=IGNORE_SPACE option. (See Section 5.1.11, “Server SQL Modes”.) Individual client programs can request this behavior by using the CLIENT_IGNORE_SPACE option for mysql_real_connect(). In either case, all function names become reserved words. For the sake of brevity, some examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format: mysql> SELECT MOD(29,9); +-----------+ | mod(29,9) | +-----------+ | 2 | +-----------+ 1 rows in set (0.00 sec)This format is used instead: mysql> SELECT MOD(29,9); -> 2Page 7
12.1 Built-In Function and Operator ReferenceThe following table lists each built-in (native) function and operator and provides a short description of each one. For a table listing functions that are loadable at runtime, see Section 12.2, “Loadable Function Reference”.
Table 12.1 Built-In Functions and Operators
|