当前位置:网站首页>The pit we stepped on in those years, the null value trap in SQL!

The pit we stepped on in those years, the null value trap in SQL!

2021-11-25 18:02:16 Miss Tony without a haircut

SQL It's a declarative language , We just need to describe the desired outcome (WHAT), It doesn't care how the database is implemented (HOW); although SQL It's easier to learn , But there are still some confusing and fallible concepts .

Let's talk about SQL Null traps and avoidance methods in , The databases involved include MySQL、Oracle、SQL Server、PostgreSQL as well as SQLite. As usual , Conclusion first :

NULL characteristic MySQL Oracle SQL Server PostgreSQL SQLite
Three valued logic
Empty value comparison IS [NOT] NULL、expr <=> NULL IS [NOT] NULL'' IS NULL IS [NOT] NULL IS [NOT] NULLIS [NOT] DISTINCT FROM NULL IS [NOT] NULL
NOT IN (NULL) No results returned No results returned No results returned No results returned No results returned
function / expression NULL Parameters The result is NULL The result is NULL、CONCAT Functions and || exception The result is NULL、CONCAT Function exception The result is NULL、CONCAT Function exception The result is NULL
Aggregate functions Ignore NULL data 、COUNT(*) With the exception of Ignore NULL data 、COUNT(*) With the exception of Ignore NULL data 、COUNT(*) With the exception of Ignore NULL data 、COUNT(*) With the exception of Ignore NULL data 、COUNT(*) With the exception of
DISTINCTGROUP BYPARTITION BYUNION All null values are grouped together All null values are grouped together All null values are grouped together All null values are grouped together All null values are grouped together
ORDER BY The default null value is the smallest The default null value is maximum 、 Support NULLS FIRST | LAST The default null value is the smallest The default null value is maximum 、 Support NULLS FIRST | LAST The default null value is the smallest 、 Support NULLS FIRST | LAST
COALESCE function NULLIF function ️、IFNULL(expr1, expr2)、IF(expr1, expr2, expr3) ️、NVL(expr1, expr2)、NVL2(expr1, expr2, expr3) ️、ISNULL(expr1, expr2)
Unique constraints allow multiple null values
Check constraints allow the insertion of null values

The sample data used in this article can be linked to  《SQL Introductory tutorial 》 Sample database download .

NULL It's empty

In the database , Null value (NULL) It's a special value , Usually used to indicate missing values . such as , When filling in the questionnaire, not willing to disclose some information will lead to the lack of entries , There will always be one person in the organizational structure of the company ( Chairman of the board of directors / The general manager ) There is no superior leader .

First of all a bit , Null values and numbers 0 Is not the same . If I ask you : How much money do you have in your wallet ? If you know there's no money in it , It can be said to be zero ; If you're not sure , So the unknown , But it can't be said that there is no . When we need to create a table to store this information , Should be NULL; Unless we can make sure there's no money in the wallet or how much money there is .

in addition , Null values and empty strings ('') It's not the same , The reason is similar to the above . however Oracle It's an exception , We will discuss in detail below .

In most programming languages , visit null Values usually lead to errors ; however SQL No mistakes , It just affects the result of the operation .

Three valued logic

In most programming languages , There are only two cases of logical operation , It's not true (True) It's fake (False). But for SQL for , Logical operations can also be unknown (Unknown):

 The pits we stepped on in those years ,SQL  Null trap in !_oracle
The main purpose of introducing ternary logic is to support NULL, because NULL Represents unknown data . therefore ,SQL The logic operation in and (AND)、 or (OR) And non (NOT) The results are as follows :

AND really false Unknown
really really false Unknown
false false false false
Unknown Unknown false Unknown
OR really false Unknown
really really really really
false really false Unknown
Unknown really Unknown Unknown
NOT result
really false
false really
Unknown Unknown

about AND Operator , True and unknown sums can be true , It could be fake ; therefore , The end result is unknown .

SQL Medium WHERE、HAVING as well as CASE WHEN Clause returns only data whose logical result is true , Do not return data with false or unknown results .

Empty value comparison

When we use the comparison operator (=、<>、<、> etc. ) And NULL When comparing , The result is neither true nor false , It's the unknown ; because NULL It means unknown , That means it could be any value . The results of the following operations are unknown :

NULL = 0
NULL <> 0
NULL <= 0
NULL = NULL
NULL != NULL

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.

NULL Not equal to any value , Even two NULL I don't want to wait ; Because we can't say that two unknown values are the same , It can't be said that they are different .

️ For comparison operations ,NULL and NULL inequality ; But some SQL In Clause NULL Values are treated as the same values , for example GROUP BY. See below for details .

that , How to judge whether a value is NULL Well ? So ,SQL Two predicates are introduced (WHERE Clause ):IS NULL and IS NOT NULL. The following example is used to find manager Empty employees :

--  Use the comparison operator to determine null values 
SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE manager_id = NULL;
employee_id|first_name|last_name|manager_id|
-----------|----------|---------|----------|

--  Use  IS NULL  Judge null 
SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE manager_id IS NULL;
employee_id|first_name|last_name|manager_id|
-----------|----------|---------|----------|
        100|Steven    |King     |          |

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.

among , The first query uses the comparison operator to determine null values , No results will be returned ; The second query uses IS NULL Judge null , Returned the correct result .

Except standard IS [NOT] NULL outside , There are also some database extension operators that can be used for null comparison :

-- MySQL
SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE manager_id <=> NULL;
employee_id|first_name|last_name|manager_id|
-----------|----------|---------|----------|
        100|Steven    |King     |          |

-- PostgreSQL
SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE manager_id IS NOT DISTINCT FROM NULL;
employee_id|first_name|last_name|manager_id|
-----------|----------|---------|----------|
        100|Steven    |King     |          |

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.

MySQL Medium <=> Can be used for equivalence comparison , Support two NULL value ;PostgreSQL Medium IS [NOT] DISTINCT FROM Can be used for equivalence comparison , Support two NULL value .

The results of the following query will not return any results :

SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE (1 = NULL) OR (1 != NULL);

     
  • 1.
  • 2.
  • 3.

Because according to the above ternary logic , Two unknown results OR The operation is still unknown .

We said earlier , An empty string is not NULL; however Oracle An empty string in is treated as NULL. for example :

-- Oracle
SELECT 1
  FROM dual
 WHERE '' IS NULL;
VAL|
---|
  1|

--  Other databases 
SELECT 1 AS val
 WHERE '' IS NULL;
val|
---|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.

Of course , If we use equivalence (=) Operator to judge empty string and NULL, The result is still empty .

NOT IN And null value

about WHERE In the condition of IN and NOT IN Operator , We use equivalence comparison . So if NOT IN I met NULL value , Never return any results . for example :

SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE 1 NOT IN (NULL, 2);

     
  • 1.
  • 2.
  • 3.

Because the above condition is actually equivalent to :

SELECT employee_id, first_name, last_name, manager_id
  FROM employees
 WHERE 1 != NULL AND 1 != 2;

     
  • 1.
  • 2.
  • 3.

1 It's not equal to NULL The result is unknown ,1 It's not equal to 2 The result is true , Unknown and true AND The result is still unknown .

️ If you use NOT IN, Make sure that the values in parentheses don't appear NULL; Or try to use NOT EXISTS.

Functions and null values

Generally speaking , Function and expression parameters NULL, And the result is NULL. There are, of course, some exceptions , For example, aggregate functions .

The following queries return NULL:

SELECT ABS(NULL), 1 + NULL
  FROM employees
 WHERE employee_id = 100;
ABS(NULL)|1 + NULL|
---------|--------|
   [NULL]|  [NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.

The absolute value of an unknown value is still unknown ,1 Add an unknown value and the result is still unknown .

But a common exception is strings and NULL The connection of :

-- Oracle、SQL Server、PostgreSQL
SELECT CONCAT('Hello', NULL)
  FROM employees
 WHERE employee_id = 100;
CONCAT('HELLO',NULL)|
--------------------|
Hello               |

-- MySQL
SELECT CONCAT('Hello', NULL)
  FROM employees
 WHERE employee_id = 100;
CONCAT('Hello', NULL)|
---------------------|
               [NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.

Oracle take NULL Think of it as an empty string , So the query result is “Hello”;SQL Server and PostgreSQL Although there is a distinction between NULL And empty string , however CONCAT The two are equivalent in a function ;MySQL in NULL Parameter result CONCAT The result of the function is NULL;SQLite Not provided CONCAT function .

in addition ,Oracle Medium || Will also be NULL Think of it as an empty string ; Other databases || Medium NULL The parameter will produce NULL result ;SQL Server Use in + Connection string ,NULL Parameters will produce NULL result .

Aggregate functions (SUM、COUNT、AVG etc. ) It is usually deleted before the calculation NULL data :

SELECT SUM(salary + commission_pct) sum1,
       SUM(salary) + SUM(commission_pct) sum2,
       COUNT(salary),
       COUNT(commission_pct)
  FROM employees;
SUM1    |SUM2    |COUNT(SALARY)|COUNT(COMMISSION_PCT)|
--------|--------|-------------|---------------------|
311507.8|691423.8|          107|                   35|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.

first SUM Function returns salary and commission_pct The sum of data that is not empty ; The first one SUM Function returns salary Sum of non empty data plus commission_pct The sum of data that is not empty , So it's bigger than the first one ;COUNT Function results show that salary Yes 107 Records are not empty ,commission_pct Only 35 Records are not empty .

If all the input data is NULL value , except COUNT Aggregate functions other than functions return NULL:

SELECT COUNT(*), COUNT(commission_pct), AVG(commission_pct), SUM(commission_pct)
  FROM employees
 WHERE commission_pct IS NULL;
COUNT(*)|COUNT(COMMISSION_PCT)|AVG(COMMISSION_PCT)|SUM(COMMISSION_PCT)|
--------|---------------------|-------------------|-------------------|
      72|                    0|             [NULL]|             [NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.

COUNT(*) Always return the number of rows of data , Not affected by null values ;COUNT(commission_pct) Return zero ;AVG and SUM Back to NULL.

DISTINCT、GROUP BY、UNION And null value

SQL The grouping operation in will all NULL Values are assigned to the same group , Include DISTINCT、GROUP BY And window functions PARTITION BY. for example :

SELECT DISTINCT commission_pct
  FROM employees;
commission_pct|
--------------|
        [NULL]|
          0.40|
          0.30|
          0.20|
          0.25|
          0.15|
          0.35|
          0.10|

SELECT commission_pct
  FROM employees
 GROUP BY commission_pct;
commission_pct|
--------------|
        [NULL]|
          0.40|
          0.30|
          0.20|
          0.25|
          0.15|
          0.35|
          0.10|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.

As can be seen from the above example ,commission_pct Empty data includes 72 strip , But after grouping, there is only one NULL Group .

besides ,UNION The operator will also have all NULL Think of it as the same value :

SELECT manager_id
  FROM employees
 WHERE manager_id IS NULL
 UNION
SELECT manager_id
  FROM employees
 WHERE manager_id IS NULL;
manager_id|
----------|
    [NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.

If you will UNION Switch to UNION ALL, The query results will remain 2 individual NULL value .

ORDER BY And null value

SQL The standard is not defined NULL The sort order of values , But for ORDER BY Defined NULLS FIRST and NULLS LAST Options , Used to explicitly specify that null values precede or follow other data .

Different databases provide different implementations of this :

SELECT employee_id, manager_id
  FROM employees
 WHERE employee_id IN (100, 101, 102)
 ORDER BY manager_id;

-- Oracle、PostgreSQL
EMPLOYEE_ID|MANAGER_ID|
-----------|----------|
        101|       100|
        102|       100|
        100|    [NULL]|

-- MySQL、SQL Server、SQLite
employee_id|manager_id|
-----------|----------|
        100|    [NULL]|
        101|       100|
        102|       100|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.

among ,Oracle and PostgreSQL By default NULL As the maximum value , Last in ascending order ;MySQL、SQL Server and SQLite By default NULL As a minimum , First in ascending order .

in addition ,Oracle、PostgreSQL and SQLite Provides extended NULLS FIRST and NULLS LAST Options :

-- Oracle、PostgreSQL  and  SQLite
SELECT employee_id, manager_id
  FROM employees
 WHERE employee_id IN (100, 101, 102)
 ORDER BY manager_id NULLS FIRST;
employee_id|manager_id|
-----------|----------|
        100|    [NULL]|
        101|       100|
        102|       100|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.

We can also use CASE Expressions achieve a similar effect . The following example is related to NULLS LAST The same effect , And all databases can be used :

SELECT employee_id, manager_id
  FROM employees
 WHERE employee_id IN (100, 101, 102)
 ORDER BY CASE WHEN manager_id IS NULL THEN 1
               ELSE 0
          END,
          manager_id;
employee_id|manager_id|
-----------|----------|
        101|       100|
        102|       100|
        100|    [NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.

First ,CASE The expression will manager_id Empty data is converted to 1, Non empty data is converted to 0, So null values rank after the rest of the data ; The second sort field manager_id Ensures that non empty data is sorted from small to large .

Null processing function

Because of the particularity of null value , When we analyze data, we often need to convert null values to other values . So ,SQL Two standard null functions are provided :COALESCE and NULLIF.

COALESCE(exp1, exp2, …) The function is used to NULL Convert to other values . When exp1 Return when not empty exp1, Otherwise check exp2; If exp2 Return when not empty exp2, By analogy . for example :

SELECT COALESCE(NULL, NULL, 3)
  FROM employees
 WHERE employee_id = 100;
COALESCE(NULL, NULL, 3)|
-----------------------|
                      3|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.

Because the first two parameters are NULL,COALESCE Finally returned to 3.

COALESCE Functions can also use CASE Rewrite the expression as follows :

CASE WHEN exp1 IS NOT NULL THEN exp1
     WHEN exp2 IS NOT NULL THEN exp2
     ...
     ELSE expN
END

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.

NULLIF(exp1, exp2) Function is used to convert the specified value to NULL. When exp1 be equal to exp2 when , return NULL; otherwise , return exp1 .NULLIF The most common use is to prevent division by zero errors , for example :

SELECT 1 / NULLIF(0, 0) -- 1 / 0
  FROM employees
 WHERE employee_id = 100;

     
  • 1.
  • 2.
  • 3.

In the example NULLIF Convert the first zero to NULL, So the query results return NULL; If used directly 1 / 0, The query will return a divide by zero error .MySQL The division error in is caused by  sql_mode Variable control .

NULLIF Functions can also be used CASE Rewrite the expression as follows :

CASE WHEN exp1 = exp2 THEN NULL
     ELSE exp1
END

     
  • 1.
  • 2.
  • 3.

utilize CASE expression , We can also easily implement multiple values to NULL Transformation :

CASE WHEN expr IN (value1, value2,)
     THEN NULL
     ELSE expr
END

     
  • 1.
  • 2.
  • 3.
  • 4.

COALESCE and NULLIF It's actually CASE Two abbreviations for expressions .

In addition to standard SQL Outside of the function , The database also provides some special functions :

  • Oracle Medium NVL(expr1, expr2) amount to 2 Parameters COALESCE. in addition ,NVL2(expr1, expr2, expr3) If the first parameter is not empty , Returns the value of the second parameter ; otherwise , Returns the value of the third parameter ;
  • MySQL Medium IFNULL(expr1, expr2) amount to 2 Parameters COALESCE. in addition ,IF(expr1, expr2, expr3) If the first parameter is true (expr1 <> 0 also expr1 Not empty ), Returns the value of the second parameter ; otherwise , Returns the value of the third parameter ;
  • SQL Server Medium ISNULL(expr1, expr2) amount to 2 Parameters COALESCE.

Field constraints and null values

If unknown or missing data in the field is not allowed , You can use fields of NOT NULL constraint .

For unique constraints (UNIQUE), Multiple NULL It's seen as a different value ; therefore , Multiple null values can exist in the unique constraint field . however SQL Server The exception is :

CREATE TABLE t_unique(id INT UNIQUE);
INSERT INTO t_unique VALUES(1);
INSERT INTO t_unique VALUES(NULL);
INSERT INTO t_unique VALUES(NULL); -- SQL Server  A unique key conflict error has occurred 

-- SQL Server  With the exception of 
SELECT * FROM t_unique;
    id|
------|
[NULL]|
[NULL]|
     1|

-- SQL Server  With the exception of 
SELECT * FROM t_unique;
    id|
------|
[NULL]|
     1|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.

about SQL Server for , Only one of the unique constraints is allowed NULL data ; So the first 3 individual INSERT Statement execution error , In the end, there were only two records .

If it's a composite index , The situation is slightly different :

CREATE TABLE t_unique2(c1 INT, c2 INT, UNIQUE(c1,c2));

INSERT INTO t_unique2 VALUES(1, 1);
INSERT INTO t_unique2 VALUES(NULL, NULL);
INSERT INTO t_unique2 VALUES(NULL, NULL); -- SQL Server  A unique key conflict error has occurred 
INSERT INTO t_unique2 VALUES(1, NULL);
INSERT INTO t_unique2 VALUES(1, NULL); -- Oracle  and  SQL Server  A unique key conflict error has occurred 

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.

among ,SQL Server Only one record is allowed to have all index fields empty ; If a field is not empty ,Oracle and SQL Server Only one record is allowed to have other index fields empty .

in addition , Check constraint (CHECK) about NULL And WHERE On the contrary : As long as the check result of the data is not false, it can be inserted successfully . for example :

CREATE TABLE t_check (
  c1 INT CHECK (c1 >= 0),
  c2 INT CHECK (c2 >= 0),
  CHECK (c1 + c2 <= 100)
);

INSERT INTO t_check VALUES (5, 5);
INSERT INTO t_check VALUES (NULL, NULL);
INSERT INTO t_check VALUES (200, NULL);

SELECT * FROM t_check;
c1    |c2    |
------|------|
     5|     5|
[NULL]|[NULL]|
   200|[NULL]|

     
  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.

If c1 and c2 If it's worth it , Must be greater than or equal to zero and the sum value less than or equal to 100;c1 and c2 Can be empty ; If one of them is empty , The value of another field can be greater than 100.

Welcome to your attention ️、 give the thumbs-up 、 forward !

版权声明
本文为[Miss Tony without a haircut]所创,转载请带上原文链接,感谢
https://chowdera.com/2021/11/20211109095357319K.html

随机推荐