博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Indexing GROUP BY
阅读量:5324 次
发布时间:2019-06-14

本文共 4142 字,大约阅读时间需要 13 分钟。

 

SQL databases use two entirely different group by algorithms. The first one, the hash algorithm, aggregates the input records in a temporary hash table. Once all input records are processed, the hash table is returned as the result. The second algorithm, the sort/group algorithm, first sorts the input data by the grouping key so that the rows of each group follow each other in immediate succession. Afterwards, the database just needs to aggregate them. In general, both algorithms need to materialize an intermediate state, so they are not executed in a pipelined manner. Nevertheless the sort/group algorithm can use an index to avoid the sort operation, thus enabling a pipelined group by.

 
 
Consider the following query. It delivers yesterday's revenue grouped by 
PRODUCT_ID:
SELECT product_id, sum(eur_value)  FROM sales WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY GROUP BY product_id

Knowing the index on SALE_DATE and PRODUCT_ID from the , the sort/group algorithm is more appropriate because an INDEX RANGE SCAN automatically delivers the rows in the required order. That means the database avoids materialization because it does not need an explicit sort operation—the group by is executed in a pipelined manner.

oracle:---------------------------------------------------------------|Id |Operation                    | Name        | Rows | Cost |---------------------------------------------------------------| 0 |SELECT STATEMENT             |             |   17 |  192 || 1 | SORT GROUP BY NOSORT        |             |   17 |  192 || 2 |  TABLE ACCESS BY INDEX ROWID| SALES       |  321 |  192 ||*3 |   INDEX RANGE SCAN          | SALES_DT_PR |  321 |    3 |---------------------------------------------------------------

 

The Oracle database's execution plan marks a pipelined 
SORT GROUP BY operation with the 
NOSORT addendum. The execution plan of other databases does not mention any sort operation at all. 
 
The pipelined 
group by has the same prerequisites as the pipelined 
order by, except there are no 
ASC and 
DESC modifiers. That means that defining an index with 
ASC/
DESC modifiers should not affect pipelined 
group by execution. The same is true for 
NULLS FIRST/
LAST. Nevertheless there are databases that cannot properly use an 
ASC/
DESC index for a pipelined 
group by.
 
For PostgreSQL, you must add an order by clause to make an index with NULLS LAST sorting usable for a pipelined group by. The Oracle database cannot read an index backwards in order to execute a pipelined group by that is followed by an order by. More details are available in the respective appendices: , .

 

If we extend the query to consider all sales since yesterday, as we did in the example for the pipelined order by, it prevents the pipelined group by for the same reason as before: the INDEX RANGE SCAN does not deliver the rows ordered by the grouping key.

SELECT product_id, sum(eur_value)  FROM sales WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY GROUP BY product_id

 

Oracle:---------------------------------------------------------------|Id |Operation                    | Name        | Rows | Cost |---------------------------------------------------------------| 0 |SELECT STATEMENT             |             |   24 |  356 || 1 | HASH GROUP BY               |             |   24 |  356 || 2 |  TABLE ACCESS BY INDEX ROWID| SALES       |  596 |  355 ||*3 |   INDEX RANGE SCAN          | SALES_DT_PR |  596 |    4 |---------------------------------------------------------------

 

Instead, the Oracle database uses the hash algorithm. The advantage of the hash algorithm is that it only needs to buffer the aggregated result, whereas the sort/group algorithm materializes the complete input set. In other words: the hash algorithm needs less memory.

As with pipelined order by, a fast execution is not the most important aspect of the pipelined group by execution. It is more important that the database executes it in a pipelined manner and delivers the first result before reading the entire input. 

 

参考:

http://use-the-index-luke.com/sql/sorting-grouping/indexed-group-by

转载于:https://www.cnblogs.com/xiaotengyi/p/7229262.html

你可能感兴趣的文章
SecureCRT的使用方法和技巧(详细使用教程)
查看>>
自建数据源(RSO2)、及数据源增强
查看>>
2018icpc徐州OnlineA Hard to prepare
查看>>
使用命令创建数据库和表
查看>>
【转】redo与undo
查看>>
安卓当中的线程和每秒刷一次
查看>>
wpf样式绑定 行为绑定 事件关联 路由事件实例
查看>>
TCL:表格(xls)中写入数据
查看>>
Oracle事务
查看>>
String类中的equals方法总结(转载)
查看>>
标识符
查看>>
内存地址对齐
查看>>
创新课程管理系统数据库设计心得
查看>>
Could not resolve view with name '***' in servlet with name 'dispatcher'
查看>>
[转载] redis 的两种持久化方式及原理
查看>>
MyBaits学习
查看>>
管道,数据共享,进程池
查看>>
[Cypress] Stub a Post Request for Successful Form Submission with Cypress
查看>>
SDUTOJ3754_黑白棋(纯模拟)
查看>>
php中的isset和empty的用法区别
查看>>