OraFAQ Forum: SQL & PL/SQL » Identify data based on data in other tables

Home » SQL & PL/SQL » SQL & PL/SQL » Identify data based on data in other tables

Show: Today's Messages :: Polls :: Message Navigator
E-mail to friend

Identify data based on data in other tables [message #674431]

Sun, 27 January 2019 09:33

rrcr
Messages: 18
Registered: January 2019

Junior Member

We have tables say A, B, C
I need to identify data from table A based on Data in C.
We have been provided relation among the tables
C. Col1=B. Col2
B. Col3=A. Col4
We have written query like
Select a. * from
A, B, C
Where A. Col4=B. Col3
And B. Col2=C. Col1
And c. Col >somedate.
It is giving dups as the join keys are not pk.
We tried to write using exists
Select * from A where exists
(select 1 from B where A. Col4=B. Col3 and exists(
Select 1 from C where B. Col2=C. Col1)
But this is running long time as there 200 Million records.
Please suggest a better approach.
Thanks in advance

Report message to a moderator

Re: Identify data based on data in other tables [message #674434 is a reply to message #674431]

Sun, 27 January 2019 10:16

Michel Cadot
Messages: 68694
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Welcome to the forum.
Please read OraFAQ Forum Guide and How to use [code] tags and make your code easier to read.

For any performances question, please read http://www.orafaq.com/forum/mv/msg/84315/433888/102589/#msg_433888 and post the required information.
Also always post your Oracle version, with 4 decimals, as solution depends on it.

[Updated on: Sun, 27 January 2019 10:16]

Report message to a moderator

Re: Identify data based on data in other tables [message #674439 is a reply to message #674434]

Sun, 27 January 2019 12:16

BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal

Senior Member

Please read and follow the forum guidelines, to enable us to help you:
OraFAQ Forum Guide
How to use {code} tags and make your code easier to read

untested since you declined to provide test tables and test data.

SELECT * 
FROM   a 
WHERE  col4 =  (SELECT col3 
                FROM   b 
                WHERE  col2 =  (SELECT col1 
                                FROM   c));

Report message to a moderator

Re: Identify data based on data in other tables [message #674441 is a reply to message #674431]

Sun, 27 January 2019 12:35

John Watson
Messages: 8951
Registered: January 2010
Location: Global Village

Senior Member

Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wih you would not say "record" when you mean "row"

Report message to a moderator

Re: Identify data based on data in other tables [message #674446 is a reply to message #674439]

Mon, 28 January 2019 01:47

rrcr
Messages: 18
Registered: January 2019

Junior Member

Thanks For your Suggestion..
As there will be multiple rows returned in subquery.. we might use IN condition. I believe EXISTS give better performance than IN when we have large data. Please clarify

Report message to a moderator

Re: Identify data based on data in other tables [message #674447 is a reply to message #674441]

Mon, 28 January 2019 01:48

rrcr
Messages: 18
Registered: January 2019

Junior Member

John Watson wrote on Sun, 27 January 2019 12:35

Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wih you would not say "record" when you mean "row"

Thanks For your suggestion.. Distinct generally takes more time isnt it.

Report message to a moderator

Re: Identify data based on data in other tables [message #674448 is a reply to message #674447]

Mon, 28 January 2019 01:51

John Watson
Messages: 8951
Registered: January 2010
Location: Global Village

Senior Member

rrcr wrote on Mon, 28 January 2019 07:48

John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.

How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.

Report message to a moderator

Re: Identify data based on data in other tables [message #674450 is a reply to message #674448]

Mon, 28 January 2019 02:02

rrcr
Messages: 18
Registered: January 2019

Junior Member

John Watson wrote on Mon, 28 January 2019 01:51

rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.

I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec

Report message to a moderator

Re: Identify data based on data in other tables [message #674452 is a reply to message #674450]

Mon, 28 January 2019 02:09

John Watson
Messages: 8951
Registered: January 2010
Location: Global Village

Senior Member

So you haven't tested properly, have you. I would guess that you haven't checked out the execution plans either. You do need to do a bit of investigation, you know. For example,

orclx>
orclx> set autot trace exp
orclx> select distinct d.* from dept d join emp e on (e.deptno=d.deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 3764232723

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     3 |    69 |     7  (29)| 00:00:01 |
|   1 |  HASH UNIQUE                  |         |     3 |    69 |     7  (29)| 00:00:01 |
|   2 |   MERGE JOIN SEMI             |         |     3 |    69 |     6  (17)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    80 |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE                |         |    14 |    42 |     4  (25)| 00:00:01 |
|   6 |     TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("E"."DEPTNO"="D"."DEPTNO")
       filter("E"."DEPTNO"="D"."DEPTNO")

orclx>

Report message to a moderator

Re: Identify data based on data in other tables [message #674455 is a reply to message #674452]

Mon, 28 January 2019 02:23

rrcr
Messages: 18
Registered: January 2019

Junior Member

John Watson wrote on Mon, 28 January 2019 02:09

So you haven't tested properly, have you. I would guess that you haven't checked out the execution plans either. You do need to do a bit of investigation, you know. For example,

orclx>
orclx> set autot trace exp
orclx> select distinct d.* from dept d join emp e on (e.deptno=d.deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 3764232723

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     3 |    69 |     7  (29)| 00:00:01 |
|   1 |  HASH UNIQUE                  |         |     3 |    69 |     7  (29)| 00:00:01 |
|   2 |   MERGE JOIN SEMI             |         |     3 |    69 |     6  (17)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    80 |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE                |         |    14 |    42 |     4  (25)| 00:00:01 |
|   6 |     TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("E"."DEPTNO"="D"."DEPTNO")
       filter("E"."DEPTNO"="D"."DEPTNO")

orclx>

Yeah i havent done this .. i just checked run time for those queries. I am new to this analysis. thanks for info

Report message to a moderator

Re: Identify data based on data in other tables [message #674462 is a reply to message #674455]

Mon, 28 January 2019 11:39

BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal

Senior Member

does table A have a primary key?

Report message to a moderator

Re: Identify data based on data in other tables [message #674471 is a reply to message #674450]

Mon, 28 January 2019 14:50

EdStevens
Messages: 1376
Registered: September 2013

Senior Member

rrcr wrote on Mon, 28 January 2019 02:02

John Watson wrote on Mon, 28 January 2019 01:51
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec

How much of your time is justified optimizing the query to save 0.23 seconds?

Report message to a moderator

Re: Identify data based on data in other tables [message #674475 is a reply to message #674471]

Tue, 29 January 2019 00:39

rrcr
Messages: 18
Registered: January 2019

Junior Member

EdStevens wrote on Mon, 28 January 2019 14:50

rrcr wrote on Mon, 28 January 2019 02:02
John Watson wrote on Mon, 28 January 2019 01:51
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec

How much of your time is justified optimizing the query to save 0.23 seconds?

For test data it is 0.2 sec where as we have data around 180 millions.. it will have great impact right

[Updated on: Tue, 29 January 2019 00:39]

Report message to a moderator

Re: Identify data based on data in other tables [message #674476 is a reply to message #674462]

Tue, 29 January 2019 00:40

rrcr
Messages: 18
Registered: January 2019

Junior Member

BlackSwan wrote on Mon, 28 January 2019 11:39

does table A have a primary key?

Yes we have .. but the joining keys are not primary keys hence getting dups

Report message to a moderator

Re: Identify data based on data in other tables [message #674484 is a reply to message #674476]

Tue, 29 January 2019 07:56

BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal

Senior Member

rrcr wrote on Mon, 28 January 2019 22:40

BlackSwan wrote on Mon, 28 January 2019 11:39
does table A have a primary key?
Yes we have .. but the joining keys are not primary keys hence getting dups

show us how you have duplicates when PK ensure UNIQUE values.

When "duplicate" does exist, what determines which row is eliminated to remove the "duplicate" value?

Please read and follow the forum guidelines, to enable us to help you:
OraFAQ Forum Guide
How to use {code} tags and make your code easier to read

Report message to a moderator

Previous Topic:	DataMapper - JSON -> XML - Add a colon in XML tag
Next Topic:	MULTISET EXCEPT - COLLECTION

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

]

Current Time: Sat Sep 28 16:11:43 CDT 2024