MySQL 合併查詢union 查詢出的行合併到一個表中

在合併查詢中，尤其是二分類的情況，在查詢結果是相同列名的時候可以考慮合併查詢。先查詢出行的結果，再使用union或者union all合併查詢結果。

另外如果 union 和 order by 一起使用的話要注意使用方法。

一、適用場景和方法

（1）適用場景

考慮查詢過程中是否存在以下情況：

查詢行時用的表不同；
查詢某些行時需要where條件，某些行不需要where條件；
分類查詢；
查詢的結果具有相同的列名。

存在上述情況時，大多數需要合併查詢。先分行查詢，再將查詢出的行合併到一個表中。

（2）方法

MySQL合併查詢，將查詢到的行（具有相同列）合併到一個表中使用union或者union all函數

具體包括：

函數	使用說明
union	出現相同行時，不保留重複行，進行去重處理
union all	出現相同行時，保留重複行，不進行去重

根據查詢需要使用不同合併函數。

二、案例分析

下面用2個具體的案例（由簡到難）來說明行合併的過程：

（1）簡單案例

案例來自：SQL26 計算25歲以上和以下的使用者數量

描述

現在運營想要將使用者劃分為25歲以下和25歲及以上兩個年齡段，分別檢視這兩個年齡段使用者數量

本題注意：age為null 也記為 25歲以下

範例：user_profile

id	device_id	gender	age	university	gpa	active_days_within_30	question_cnt	answer_cnt
1	2138	male	21	北京大學	3.4	7	2	12
2	3214	male		復旦大學	4	15	5	25
3	6543	female	20	北京大學	3.2	12	3	30
4	2315	female	23	浙江大學	3.6	5	1	2
5	5432	male	25	山東大學	3.8	20	15	70
6	2131	male	28	山東大學	3.3	15	7	13
7	4321	male	26	復旦大學	3.6	9	6	52

根據範例，你的查詢應返回以下結果：

age_cut	number
25歲以下	4
25歲及以上	3

【分類】：合併查詢、多表連線

分析思路

難點：

1.單個字元或者值可以作為一列：例如'activity2' as activity

2.用了一半時間就完成高難度試卷。兩個時間相減得到分鐘：timestampdiff(minute, date_expr1, date_expr2) 兩個時間的差

（1）統計25歲以下學生的人數

[條件]：where score >= 85 and year(start_time) = 2021

[使用]：distinct。一定要去重

（2）統計25歲以上學生的人數

[條件]：where difficulty = 'hard' and score > 80 and year(start_time) = 2021 and timestampdiff(minute, start_time, submit_time) < duration / 2

[使用]：多表連線使用 join using( )

（3）合併兩個表

[使用]：union all 和union 都可以，因為列activity不會有重複。

最終結果

(
select 查詢結果 [年齡段; 人數]
from 從哪張表中查詢資料[使用者表]
where 查詢條件 [年齡小於25或者為空]
)
union 
(
select 查詢結果 [年齡段; 人數]
from 從哪張表中查詢資料[使用者表]
where 查詢條件 [年齡大於25]
)

該題的多種解法詳見：SQL26 計算25歲以上和以下的使用者數量

求解程式碼

union

(
#統計25歲以下學生的人數
select
    '25歲以下' as age_cut,
    count(device_id) as number
from user_profile
where age < 25 or age is null
)
union
(
#統計25歲以上學生的人數
select
    '25歲及以上' as age_cut,
    COUNT(device_id) as number
from user_profile
where age >= 25
)

（2）較難案例

案例來自：SQL132 每個題目和每份試卷被作答的人數和次數

描述

現有試卷作答記錄表exam_record（uid使用者ID, exam_id試卷ID, start_time開始作答時間, submit_time交卷時間, score得分）：

id	uid	exam_id	start_time	submit_time	score
1	1001	9001	2021-09-01 09:01:01	2021-09-01 09:41:01	81
2	1002	9002	2021-09-01 12:01:01	2021-09-01 12:31:01	70
3	1002	9001	2021-09-01 19:01:01	2021-09-01 19:40:01	80
4	1002	9002	2021-09-01 12:01:01	2021-09-01 12:31:01	70
5	1004	9001	2021-09-01 19:01:01	2021-09-01 19:40:01	85
6	1002	9002	2021-09-01 12:01:01	(NULL)	(NULL)

題目練習表practice_record（uid使用者ID, question_id題目ID, submit_time提交時間, score得分）：

id	uid	question_id	submit_time	score
1	1001	8001	2021-08-02 11:41:01	60
2	1002	8001	2021-09-02 19:30:01	50
3	1002	8001	2021-09-02 19:20:01	70
4	1002	8002	2021-09-02 19:38:01	70
5	1003	8001	2021-08-02 19:38:01	70
6	1003	8001	2021-08-02 19:48:01	90
7	1003	8002	2021-08-01 19:38:01	80

請統計每個題目和每份試卷被作答的人數和次數，分別按照"試卷"和"題目"的uv & pv降序顯示，範例資料結果輸出如下：

tid	uv	pv
9001	3	3
9002	1	3
8001	3	5
8002	2	2

解釋：「試卷」有3人共練習3次試卷9001，1人作答3次9002；「刷題」有3人刷5次8001，有2人刷2次8002

【分類】：合併查詢

分析思路

難點：

union 和 order by 一起使用需要注意的問題

（1）統計每份試卷被作答的人數和次數

[條件]：where score >= 85 and year(start_time) = 2021

[使用]：distinct。一定要去重

（2）統計每個題目被作答的人數和次數

[條件]：where difficulty = 'hard' and score > 80 and year(start_time) = 2021 and timestampdiff(minute, start_time, submit_time) < duration / 2

[使用]：多表連線使用 join using( )

（3）合併兩個表,分別按照"試卷"和"題目"的uv & pv降序顯示

[使用]：union all 和union 都可以，因為列activity不會有重複。

最終結果

select * from 
(
select 查詢結果 [試卷ID; 作答次數]
from 從哪張表中查詢資料[試卷作答記錄表]
group by 分組條件 [試卷ID]
order by 對查詢結果排序 [按照"試卷"的uv & pv降序]
)
union
select * from 
(
select 查詢結果 [題目ID; 作答次數]
from 從哪張表中查詢資料[題目練習表]
group by 分組條件 [題目ID]
order by 對查詢結果排序 [按照"題目"的uv & pv降序]
)

求解程式碼

方法一：

#正確程式碼
select * from 
(
select 
    exam_id as tid,
    count(distinct uid) as uv,
    count(uid) as pv
from exam_record a
group by exam_id
order by uv desc, pv desc
) a
union
select * from 
(
select 
    question_id as tid,
    count(distinct uid) as uv,
    count(uid) as pv
from practice_record b
group by question_id
order by uv desc, pv desc
) attr

是不是可以union兩個子句之後再使用order by ? 但是這個排序要對2個表分別進行降序，就需要寫成下面這樣：

方法二：

使用函數

left(str,length) 函數： str左邊開始的長度為 length 的子字串，在本例中為‘9’和‘8’

解釋：試卷編號以‘9’開頭、題目編號以‘8’開頭，對編號進行降序就是對"試卷"和"題目"分別進行排序

(
#每份試卷被作答的人數和次數
select
    exam_id as tid,
    count(distinct uid) as uv,
    count(*) as pv
from exam_record
group by exam_id
)
union
(
#每個題目被作答的人數和次數
select
    question_id as tid,
    count(distinct uid) as uv,
    count(*) as pv
from practice_record
group by question_id
)
#分別按照"試卷"和"題目"的uv & pv降序顯示
order by left(tid,1) desc,uv desc,pv desc

推薦使用方法一，更具有普適性。

擴充套件：

前往檢視MySQL union 和 order by 一起使用需要注意的問題