sql-server – Checking for existence with EXISTS outperforms COUNT! ... Or does it?
I have often read that when you need to check whether a row exists, you should always use EXISTS rather than COUNT. Yet in several recent scenarios I have measured a performance improvement when using COUNT. The pattern looks like this:

    LEFT JOIN (
        SELECT someID, COUNT(*)
        FROM someTable
        GROUP BY someID
    ) AS Alias ON (
        Alias.someID = mainTable.ID
    )

I am not familiar with the ways of finding out what happens "inside" SQL Server, so I was wondering whether EXISTS has some unadvertised flaw that would make sense of the measurements I have taken (could EXISTS be RBAR?!). Do you have an explanation for this phenomenon?

Edit: here is a complete script you can run:

    SET NOCOUNT ON
    SET STATISTICS IO OFF

    DECLARE @tmp1 TABLE (
        ID INT UNIQUE
    )

    DECLARE @tmp2 TABLE (
        ID INT,
        X INT IDENTITY,
        UNIQUE (ID, X)
    )

    ; WITH T(n) AS (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM master.dbo.spt_values AS S
    ), tally(n) AS (
        SELECT T2.n * 100 + T1.n
        FROM T AS T1
        CROSS JOIN T AS T2
        WHERE T1.n <= 100 AND T2.n <= 100
    )
    INSERT @tmp1
    SELECT n
    FROM tally AS T1
    WHERE n < 10000

    ; WITH T(n) AS (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM master.dbo.spt_values AS S
    ), tally(n) AS (
        SELECT T2.n * 100 + T1.n
        FROM T AS T1
        CROSS JOIN T AS T2
        WHERE T1.n <= 100 AND T2.n <= 100
    )
    INSERT @tmp2
    SELECT T1.n
    FROM tally AS T1
    CROSS JOIN T AS T2
    WHERE T1.n < 10000
        AND T1.n % 3 <> 0
        AND T2.n < 1 + T1.n % 15

    PRINT '
    COUNT Version:
    '

    WAITFOR DELAY '00:00:01'

    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    SELECT T1.ID, CASE WHEN n > 0 THEN 1 ELSE 0 END AS DoesExist
    FROM @tmp1 AS T1
    LEFT JOIN (
        SELECT T2.ID, COUNT(*) AS n
        FROM @tmp2 AS T2
        GROUP BY T2.ID
    ) AS T2 ON (
        T2.ID = T1.ID
    )
    WHERE T1.ID BETWEEN 5000 AND 7000
    OPTION (RECOMPILE) -- Required since the tables are filled within the same scope

    SET STATISTICS TIME OFF

    PRINT '

    EXISTS Version:'

    WAITFOR DELAY '00:00:01'

    SET STATISTICS TIME ON

    SELECT T1.ID, CASE WHEN EXISTS (
        SELECT 1
        FROM @tmp2 AS T2
        WHERE T2.ID = T1.ID
    ) THEN 1 ELSE 0 END AS DoesExist
    FROM @tmp1 AS T1
    WHERE T1.ID BETWEEN 5000 AND 7000
    OPTION (RECOMPILE) -- Required since the tables are filled within the same scope

    SET STATISTICS TIME OFF

On SQL Server 2008R2 (Seven 64-bit) I get this result:

COUNT version:
EXISTS version:
Solution
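It is very rare for anything to be always true, especially where databases are concerned. SQL offers many ways to express the same semantics. If there is a useful rule of thumb, it is probably to write queries using the most natural syntax (and, yes, that is subjective), and to consider a rewrite only if the query plan or performance you get is unacceptable.

For what it's worth, my own view is that existence queries are most naturally expressed using EXISTS. It has also been my experience that EXISTS tends to optimize better than the OUTER JOIN reject NULL alternative. Using COUNT(*) and filtering on = 0 is another alternative, one that happens to have some support in the SQL Server query optimizer, but I have personally found it unreliable in more complex queries. In any case, EXISTS seems much more natural (to me) than either of those alternatives.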
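To make the three alternatives concrete, here is a minimal T-SQL sketch of a "no matching row" test written each way. The temporary tables #Parent and #Child and their contents are hypothetical, chosen only for illustration; nothing here is taken from the question's script, and the sketch says nothing about which form will perform better on real data.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #Parent (ID integer NOT NULL PRIMARY KEY);
CREATE TABLE #Child  (ID integer NOT NULL, X integer IDENTITY NOT NULL, UNIQUE (ID, X));

INSERT #Parent (ID) VALUES (1), (2), (3);
INSERT #Child (ID) VALUES (1), (1), (3);

-- 1. NOT EXISTS: states the existence test directly.
SELECT P.ID
FROM #Parent AS P
WHERE NOT EXISTS (SELECT 1 FROM #Child AS C WHERE C.ID = P.ID);

-- 2. OUTER JOIN, then keep only the rows where the join found nothing.
--    This relies on C.ID being declared NOT NULL.
SELECT P.ID
FROM #Parent AS P
LEFT JOIN #Child AS C
    ON C.ID = P.ID
WHERE C.ID IS NULL;

-- 3. COUNT(*) filtered on = 0.
SELECT P.ID
FROM #Parent AS P
WHERE (SELECT COUNT(*) FROM #Child AS C WHERE C.ID = P.ID) = 0;

DROP TABLE #Child, #Parent;
```

With this sample data, each of the three queries returns the single row ID = 2. Comparing their execution plans on your own tables is the way to see whether the optimizer treats them alike.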
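Your particular example is interesting, because it highlights the way the optimizer handles subqueries in CASE expressions (and EXISTS tests in particular).

Subqueries in CASE expressions

Consider the following (perfectly legal) query:

    DECLARE @Base AS TABLE (a integer NULL);
    DECLARE @When AS TABLE (b integer NULL);
    DECLARE @Then AS TABLE (c integer NULL);
    DECLARE @Else AS TABLE (d integer NULL);

    SELECT
        CASE
            WHEN (SELECT W.b FROM @When AS W) = 1
                THEN (SELECT T.c FROM @Then AS T)
            ELSE (SELECT E.d FROM @Else AS E)
        END
    FROM @Base AS B;

The semantics of CASE [...] The inner side of a nested loops join is evaluated only when the pass-through predicate returns false. The overall effect is that the CASE expressions are tested in order, and a subquery is evaluated only if no earlier expression was satisfied.

CASE expressions with an EXISTS subquery

Where a CASE subquery uses EXISTS, the logical existence test is implemented as a semi join, but rows that the semi join would normally reject have to be retained in case a later clause needs them. Rows flowing through this special kind of semi join acquire a flag indicating whether or not the semi join found a match. This flag is known as the probe column.

The implementation detail is that the logical subquery is replaced by a correlated join ('apply') with a probe column. The work is done by a simplification rule in the query optimizer called RemoveSubqInPrj (remove subquery in projection). We can see the details using trace flag 8606:

    SELECT
        T1.ID,
        CASE
            WHEN EXISTS
            (
                SELECT 1
                FROM #T2 AS T2
                WHERE T2.ID = T1.ID
            ) THEN 1
            ELSE 0
        END AS DoesExist
    FROM #T1 AS T1
    WHERE T1.ID BETWEEN 5000 AND 7000
    OPTION (QUERYTRACEON 3604, QUERYTRACEON 8606);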
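As a rough, hand-written analogy for "an apply with a probe column", the existence test can be spelled out in T-SQL with OUTER APPLY. This is only a sketch of the idea and an assumption on my part: it is not the optimizer's internal representation, and the column name probe, the table definitions, and the sample rows are all hypothetical stand-ins.

```sql
-- Hypothetical stand-ins for the #T1 / #T2 tables used in the query above.
CREATE TABLE #T1 (ID integer NOT NULL UNIQUE);
CREATE TABLE #T2 (ID integer NOT NULL, X integer IDENTITY NOT NULL, UNIQUE (ID, X));

INSERT #T1 (ID) VALUES (5000), (5001), (5002);
INSERT #T2 (ID) VALUES (5000), (5000), (5002);

-- Every #T1 row is kept. P.probe is 1 when at least one matching #T2 row
-- exists and NULL otherwise, so it plays the part of a match flag.
SELECT
    T1.ID,
    CASE WHEN P.probe IS NOT NULL THEN 1 ELSE 0 END AS DoesExist
FROM #T1 AS T1
OUTER APPLY
(
    SELECT TOP (1) 1 AS probe
    FROM #T2 AS T2
    WHERE T2.ID = T1.ID
) AS P
WHERE T1.ID BETWEEN 5000 AND 7000;

DROP TABLE #T2, #T1;
```

With this sample data, DoesExist is 1 for IDs 5000 and 5002 and 0 for ID 5001. Unlike an ordinary semi join, no outer row is discarded; the flag simply records whether a match was found.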