I took a time to look at the ASK Tom's RSS, I found that two articles that should be said to IN / EXISTS: http://asktom.Oracle.com/pls/ask/f? P = 4950: 8 ::::: : F4950_P8_DISPLAYID: 953229842074, http://asktom.oracle.com/pls/ask/f? P = 4950: 8 ::::: F4950_P8_DISPLAYID: 442029737684
The article is very long, and there is no careful reading, but some things are still very interesting, and the handsome experiment. Everyone will take a look, Tom originally answered the problem, I have no way to test, do some tests under 10.1.0.4, CBO and RBO (using hints) are still very different.
SQL> SELECT * from scott.emp;
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7369 Smith Clerk 7902 17-December -80 800 20
7499 Allen Salesman 7698 20-February-81 1600 300 30
7521 Ward Salesman 7698 22-2 February-81 1250 500 30
7566 Jones Manager 7839 02- April-81 2975 20
7654 Martin Salesman 7698 28- September-81 1250 1400 30
7698 Blake Manager 7839 01-00-81 2850 30
7782 Clark Manager 7839-09-09- June -81 2450 10
7788 Scott Analyst 7566 19- April-87 3000 20
7839 KING PRESIDENT 17-November -81 5000 10
7844 Turner Salesman 7698-08- September-81 1500 0 30
7876 Adams Clerk 7788 23-50-87 1100 20
7900 James Clerk 7698 03-2 month -81 950 30
7902 Ford Analyst 7566 03-December -81 3000 20
7934 Miller Clerk 7782 23- Jan -82 1300 10 has selected 14 lines.
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 3 CARD = 14 BYtes = 1218)
1 0 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 1218)
SQL> CREATE TABLE TMP_EMP AS SELECT * from scott.emp where ename Like 's%'
The table has been created.
SQL> SELECT * from Tmp_EMP;
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7369 Smith Clerk 7902 17-December -80 800 20
7788 Scott Analyst 7566 19- April-87 3000 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 3 Card = 2 Bytes = 174)
1 0 Table Access (Full) of 'TMP_EMP' (Table) (COST = 3 Card = 2 Bytes = 174)
Test in:
SQL> SELECT * from Tmp_EMP where ename in (select ename from scott.emp);
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7369 Smith Clerk 7902 17-December -80 800 20
7788 Scott Analyst 7566 19- April-87 3000 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 7 Card = 2 Bytes = 188)
1 0 Hash Join (SEMI) (COST = 7 Card = 2 bytes = 188)
2 1 Table Access (Full) of 'TMP_EMP' (Table) (COST = 3 Card = 2 Bytes = 174) 3 1 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 98)
The table did not have analyzed, using hash Join (SEMI).
Tom mentioned when answering questions:
Select * from t1 where x in (Select Y from T2)
IS Typically Processed As:
SELECT *
From T1, (Select Distinct Y from T2) T2
WHERE T1.X = T2.Y;
The Subquery Is Evaluated, DistIncted, Indexed (or Hashed Or Sorted) and then
Joined to the Original Table - Typically.
Obviously the second query is very strange. It is also this that prompted me to open the database test. Is there any Distinct? Haven't seen IN to produce sorting operation.
SQL> ED
File Afiedt.buf
1 Select * from tmp_emp, (Select Distinct Ename from Scott.emp) T
2 * where tmp_emp.ename = t.ename
SQL> /
Empno ename Job Mgr Hiredate Sal Comm Deptno ENAME
---------- ------------------------- --- ---------- ----------------------------
7369 Smith Clerk 7902 17-December -80 800 20 Smith
7788 Scott Analyst 7566 19-April-87 3000 20 Scott
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 8 Card = 2 Bytes = 188)
1 0 View (cost = 8 card = 2 bytes = 188)
2 1 sort (unique) (COST = 8 card = 2 bytes = 202)
3 2 Hash Join (COST = 7 Card = 2 bytes = 202)
4 3 Table Access (Full) of 'TMP_EMP' (Table) (COST = 3 Card = 2 Bytes = 188)
5 3 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 98)
Obviously different. So modified Rule mode:
SQL> ED
File Afiedt.buf
1 SELECT / * Rule * / * from Tmp_EMP, (Select Distinct Ename from Scott.emp) T2 * Where TMP_EMP.ENAME = T.Name
SQL> /
Empno ename Job Mgr Hiredate Sal Comm Deptno ENAME
---------- ------------------------- --- ---------- ----------------------------
7788 Scott Analyst 7566 19-April-87 3000 20 Scott
7369 Smith Clerk 7902 17-December -80 800 20 Smith
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = HINT: RULE
1 0 merge Join
2 1 view
3 2 sort (unique)
4 3 Table Access (Full) of 'EMP' (TABLE)
5 1 Sort (Join)
6 5 Table Access (Full) of 'TMP_EMP' (TABLE)
Try then add an index:
SQL> ALTER TABLE TMP_EMP Add Constraints PK_TMPEMP Primary Key (Ename);
The table has been changed.
Analyze:
SQL> Analyze Table TMP_EMP Compute Statistics for Table for All Indexed Column;
The table has been analyzed.
SQL> Analyze Table Scott.emp Compute Statistics for Table for All Column;
The table has been analyzed.
SQL> SELECT *
2 from TMP_EMP E, (Select Distinct Ename from Scott.emp) T
3 where e.ename = t.ename;
Empno ename Job Mgr Hiredate Sal Comm Deptno ENAME
---------- ------------------------- --- ---------- ----------------------------
7369 Smith Clerk 7902 17-December -80 800 20 Smith
7788 Scott Analyst 7566 19-April-87 3000 20 Scott
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 6 Card = 2 Bytes = 188) 1 0 View (COST = 6 Card = 2 Bytes = 188)
2 1 sort (unique) (COST = 6 card = 2 bytes = 88)
3 2 nested loops (cost = 5 card = 2 bytes = 88)
4 3 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 70)
5 3 Table Access (by index rowid) of 'tmp_emp' (table) (COST = 1 Card = 1 bytes = 39)
6 5 INDEX (Unique Scan) of 'PK_TMPEMP' (INDEX (COST = 0 card = 1)
SQL> ED
File Afiedt.buf
1 SELECT / * RULE * / *
2 from TMP_EMP E, (Select Distinct Ename from Scott.emp) T
3 * where e.ename = t.ename
SQL> /
Empno ename Job Mgr Hiredate Sal Comm Deptno ENAME
---------- ------------------------- --- ---------- ----------------------------
7788 Scott Analyst 7566 19-April-87 3000 20 Scott
7369 Smith Clerk 7902 17-December -80 800 20 Smith
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = HINT: RULE
1 0 NESTED LOOPS
2 1 view
3 2 sort (unique)
4 3 Table Access (Full) of 'EMP' (TABLE)
5 1 Table Access (By Index Rowid) of 'TMP_EMP' (TABLE)
6 5 INDEX (Unique Scan) of 'PK_TMPEMP' (INDEX (Unique))
SQL> SELECT * from Tmp_EMP where ename in (select ename from scott.emp);
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ---------- 7788 Scott Analyst 7566 19- April -87 3000 20
7369 Smith Clerk 7902 17-December -80 800 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 5 Card = 1 Bytes = 44)
1 0 nested loops (cost = 5 card = 1 Bytes = 44)
2 1 sort (unique) (COST = 3 card = 14 bytes = 70)
3 2 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 70)
4 1 Table Access (By Index Rowid) of 'TMP_EMP' (Table) (COST = 1 Card = 1 BYtes = 39)
5 4 Index (Unique Scan) of 'PK_TMPEMP' (INDEX (COST = 0 Card = 1)
SQL> SELECT / * Rule * / * from TMP_EMP Where ENAME IN (SELECT Ename from Scott.em);
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7788 Scott Analyst 7566 19- April-87 3000 20
7369 Smith Clerk 7902 17-December -80 800 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = HINT: RULE
1 0 NESTED LOOPS
2 1 View of 'VW_NSO_1' (View)
3 2 sort (unique)
4 3 Table Access (Full) of 'EMP' (TABLE)
5 1 Table Access (By Index Rowid) of 'TMP_EMP' (TABLE)
6 5 INDEX (Unique Scan) of 'PK_TMPEMP' (INDEX (Unique))
This is really close, and it is similar to Tom. However, this pattern of Rbo should be very good when the selected column selectivity is low, and other situations may not be optimized. Let's take a look at the exists: sql> select * from tmp_emp t where exists (Select Null from Scott.emp e where t.ename = e.ename);
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7788 Scott Analyst 7566 19- April-87 3000 20
7369 Smith Clerk 7902 17-December -80 800 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 5 Card = 1 Bytes = 44)
1 0 nested loops (cost = 5 card = 1 Bytes = 44)
2 1 sort (unique) (COST = 3 card = 14 bytes = 70)
3 2 Table Access (Full) of 'EMP' (COST = 3 Card = 14 BYtes = 70)
4 1 Table Access (By Index Rowid) of 'TMP_EMP' (Table) (COST = 1 Card = 1 BYtes = 39)
5 4 Index (Unique Scan) of 'PK_TMPEMP' (INDEX (COST = 0 Card = 1)
Look with in the same.
SQL> SELECT / * Rule * / * from TMP_EMP T where exists (Select Null from Scott.emp e Where t.ename = e.ename);
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7369 Smith Clerk 7902 17-December -80 800 20
7788 Scott Analyst 7566 19- April-87 3000 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = HINT: RULE
1 0 Filter
2 1 Table Access (Full) of 'TMP_EMP' (Table) 3 1 Table Access (Full) Of 'EMP' (Table)
Use RBO to be different!
SQL> SELECT T. * From Tmp_EMP T, Scott.emp e where t.ename = E.ename;
Empno ename Job Mgr Hiredate Sal Comm Deptno
---------- ------------------------- --- ---------- ---------- ----------
7369 Smith Clerk 7902 17-December -80 800 20
7788 Scott Analyst 7566 19- April-87 3000 20
Implementation plan
-------------------------------------------------- ------------
0 Select Statement Optimizer = All_Rows (COST = 5 Card = 2 Bytes = 88)
1 0 NESTED LOOPS (COST = 5 Card = 2 Bytes = 88)
2 1 Table Access (Full) of 'EMP' (COST = 3 Card = 14 bytes = 70)
3 1 Table Access (by index rowid) of 'tmp_emp' (table) (COST = 1 card = 1 bytes = 39)
4 3 Index (Unique Scan) of 'PK_TMPEMP' (INDEX (COST = 0 Card = 1)
This seems to be in CBO, inin, in, exists is likely to get the same execution plan (more cases no test), the optimizer will find the same semantics of the three; in the earlier RBO Under Exists uses Filter and IN is equivalent to the relationship between subqueries first Distinct, the internal association is directly related to it.
Tom said in his latest reply:
Use the rbo and see what you see. Way Back When I wrote this, That Was the "more popular" of the two perhaps
Today In 2005, What I Said Years Ago Using The RBO Does Not Apply to the Cbo. The CBO IS SMART ENOUGH To Recognize There Two Things Are Effectively The Same.
IN is equivalent to the child's first Distinct, this is really not easy, why is RBO doing this? There is no general, my personal understanding is when I think in IN is a list of the list of values. Of course, I will first calculate the value first, then NL is OK, if the back is the value in a table later, It may be treated as it; and maybe mostly the use of the son query on the outer query, that is, the outer layer is large, and the return value of the subquery is less. Look at different implementation plans for different situations, Oracle does still spend some minds in CBO, Herce. Conclusion: Under RBO, use IN or exists needs to be determined, as long as you remember
Use IN presence and Distinct this step should not be difficult to judge; CBO's optimizer will choose for you, how to write is just habits.