Identify data based on data in other tables [message #674431] |
Sun, 27 January 2019 09:33 |
|
rrcr
Messages: 18 Registered: January 2019
|
Junior Member |
|
|
We have tables say A, B, C
I need to identify data from table A based on Data in C.
We have been provided relation among the tables
C. Col1=B. Col2
B. Col3=A. Col4
We have written query like
Select a. * from
A, B, C
Where A. Col4=B. Col3
And B. Col2=C. Col1
And c. Col >somedate.
It is giving dups as the join keys are not pk.
We tried to write using exists
Select * from A where exists
(select 1 from B where A. Col4=B. Col3 and exists(
Select 1 from C where B. Col2=C. Col1)
But this is running long time as there 200 Million records.
Please suggest a better approach.
Thanks in advance
|
|
|
|
|
|
|
|
Re: Identify data based on data in other tables [message #674448 is a reply to message #674447] |
Mon, 28 January 2019 01:51 |
John Watson
Messages: 8951 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
rrcr wrote on Mon, 28 January 2019 07:48John Watson wrote on Sun, 27 January 2019 12:35Surely all you need do is
select DISTINCT a.* ......
and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
|
|
|
|
Re: Identify data based on data in other tables [message #674452 is a reply to message #674450] |
Mon, 28 January 2019 02:09 |
John Watson
Messages: 8951 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
So you haven't tested properly, have you. I would guess that you haven't checked out the execution plans either. You do need to do a bit of investigation, you know. For example,orclx>
orclx> set autot trace exp
orclx> select distinct d.* from dept d join emp e on (e.deptno=d.deptno);
Execution Plan
----------------------------------------------------------
Plan hash value: 3764232723
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 69 | 7 (29)| 00:00:01 |
| 1 | HASH UNIQUE | | 3 | 69 | 7 (29)| 00:00:01 |
| 2 | MERGE JOIN SEMI | | 3 | 69 | 6 (17)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00:00:01 |
|* 5 | SORT UNIQUE | | 14 | 42 | 4 (25)| 00:00:01 |
| 6 | TABLE ACCESS FULL | EMP | 14 | 42 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("E"."DEPTNO"="D"."DEPTNO")
filter("E"."DEPTNO"="D"."DEPTNO")
orclx>
|
|
|
|
|
|
Re: Identify data based on data in other tables [message #674475 is a reply to message #674471] |
Tue, 29 January 2019 00:39 |
|
rrcr
Messages: 18 Registered: January 2019
|
Junior Member |
|
|
EdStevens wrote on Mon, 28 January 2019 14:50rrcr wrote on Mon, 28 January 2019 02:02John Watson wrote on Mon, 28 January 2019 01:51rrcr wrote on Mon, 28 January 2019 07:48John Watson wrote on Sun, 27 January 2019 12:35Surely all you need do is
select DISTINCT a.* ......
and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec
How much of your time is justified optimizing the query to save 0.23 seconds?
For test data it is 0.2 sec where as we have data around 180 millions.. it will have great impact right
[Updated on: Tue, 29 January 2019 00:39] Report message to a moderator
|
|
|
|
|