场景介绍
以学生成绩为例:
老师会按照学科录入成绩,每科老师都会单独录入每个学生对应学科的成绩,而每位学生只关注自己各科的成绩。如果把老师录入数据作为原始表,那么学生查看自己的成绩就要用到行转列;如果让学生自己填写各科的成绩并汇总,然后老师去查自己学科所有学生的成绩,那就是列转行。
行转列与列转行的示意图如下:
图1 示意图
- 行转列
将多行数据转换成一行显示,或将一列数据转换成多列显示。
- 列转行
将一行数据转换成多行显示,或将多列数据转换成一列显示。
示例表
创建行存表students_info并插入数据。
CREATE TABLE students_info(name varchar(20),subject varchar(100),score bigint) distribute by hash(name);
INSERT INTO students_info VALUES('lily','math',95);
INSERT INTO students_info VALUES('lily','physics',80);
INSERT INTO students_info VALUES('lily','literature',92);
INSERT INTO students_info VALUES('matu','math',75);
INSERT INTO students_info VALUES('matu','physics',90);
INSERT INTO students_info VALUES('matu','literature',85);
INSERT INTO students_info VALUES('jack','math',90);
INSERT INTO students_info VALUES('jack','physics',95);
INSERT INTO students_info VALUES('jack','literature',95);
查看表students_info信息。
SELECT * FROM students_info;
name | subject | score
------+------------+-------
matu | math | 75
matu | physics | 90
matu | literature | 85
lily | math | 95
lily | physics | 80
lily | literature | 92
jack | math | 90
jack | physics | 95
jack | literature | 95
创建列存表students_info1并插入数据。
CREATE TABLE students_info1(name varchar(20), math bigint, physics bigint, literature bigint) with (orientation = column) distribute by hash(name);
INSERT INTO students_info1 VALUES('lily',95,80,92);
INSERT INTO students_info1 VALUES('matu',75,90,85);
INSERT INTO students_info1 VALUES('jack',90,95,95);
查看表students_info1信息。
SELECT * FROM students_info1;
name | math | physics | literature
------+------+---------+------------
matu | 75 | 90 | 85
lily | 95 | 80 | 92
jack | 90 | 95 | 95
(3 rows)
静态行转列
静态行转列需要手动指定每一列的列名,如果存在则取其对应值,否则将赋其默认值0。
SELECT name,
sum(case when subject='math' then score else 0 end) as math,
sum(case when subject='physics' then score else 0 end) as physics,
sum(case when subject='literature' then score else 0 end) as literature FROM students_info GROUP BY name;
name | math | physics | literature
------+------+---------+------------
matu | 75 | 90 | 85
lily | 95 | 80 | 92
jack | 90 | 95 | 95
(3 rows)
动态行转列
8.1.2及以上集群版本可使用GROUP_CONCAT生成列存语句。
SELECT group_concat(concat('sum(IF(subject = ''', subject, ''', score, 0)) AS "', name, '"'))FROM students_info; group_concat------------------------------------------------------------
sum(IF(subject = 'literature', score, 0)) AS "jack",sum(IF(subject = 'literature', score, 0)) AS "lily",sum(IF(subject = 'literature', score, 0)) AS "matu",sum(IF(subject = 'math', score, 0)) AS "jack",sum(IF(subject = 'math', score, 0)) AS "lily",sum(IF(subject = 'math', score, 0)) AS "matu",sum(IF(subject = 'physics', score, 0)) AS "jack",sum(IF(subject = 'physics', score, 0)) AS "lily",sum(IF(subject = 'physics', score, 0)) AS "matu"
(1 row)
8.1.1及更低版本中可用LISTAGG生成列存语句。
SELECT listagg(concat('sum(case when subject = ''', subject, ''' then score else 0 end) AS "', subject, '"'),',') within GROUP(ORDER BY 1)FROM (select distinct subject from students_info); listagg------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- sum(case when subject = 'literature' then score else 0 end) AS "literature",sum(case when subject = 'physics' then score else 0 end) AS "physics",sum(case when subject = 'math' then score else 0 end) AS "math"
(1 row)
再通过视图动态重建:
CREATE OR REPLACE FUNCTION build_view()
RETURNS VOID
LANGUAGE plpgsql
AS $$ DECLARE
sql text;
rec record;
BEGIN
sql := 'select LISTAGG(
CONCAT( ''sum(case when subject = '''''', subject, '''''' then score else 0 end) AS "'', subject, ''"'' ) ,'','' ) within group(order by 1) from (select distinct subject from students_info);';
EXECUTE sql INTO rec;
sql := 'drop view if exists get_score'; EXECUTE sql; sql := 'create view get_score as select name, ' || rec.LISTAGG || ' from students_info group by name';
EXECUTE sql;
END$$;
执行重建:
CALL build_view();
查询视图:
SELECT * FROM get_score;
name | literature | physics | math
------+------------+---------+------
matu | 85 | 90 | 75
lily | 92 | 80 | 95
jack | 95 | 95 | 90
(3 rows)
列转行
使用union all,将各科目(math、physics和literature)整合为一列,示例如下:
SELECT * FROM
(
SELECT name, 'math' AS subject, math AS score FROM students_info1
union all
SELECT name, 'physics' AS subject, physics AS score FROM students_info1
union all
SELECT name, 'literature' AS subject, literature AS score FROM students_info1)
order by name;
name | subject | score
------+------------+-------
jack | math | 90
jack | physics | 95
jack | literature | 95
lily | math | 95
lily | physics | 80
lily | literature | 92
matu | math | 75
matu | physics | 90
matu | literature | 85
(9 rows)