Google
Information Storage and Retrieval: The Magic of 'Start With Connect By' Clause

Pages

Tuesday, September 11, 2007

The Magic of 'Start With Connect By' Clause

In Oracle , 'Start With..Connect By' Clause is a powerful way to select data that has hierarchical relationship. (like Parent -> Child or Manger->Employee). We will explore it with the help of an example. First, create a table named "Entities".

CREATE TABLE ENTITIES
(
PARENT_ENTITY VARCHAR2(20 BYTE),
CHILD_ENTITY VARCHAR2(20 BYTE)
);

Now insert some sample values in it.

Insert into ENTITIES (CHILD_ENTITY) Values ('a');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('a', 'af');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('a', 'ab');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('a', 'ax');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('ab', 'abc');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('ab', 'abd');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('ab', 'abe');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('abe', 'abes');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('abe', 'abet');
Insert into ENTITIES (CHILD_ENTITY) Values ('b');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('b', 'bg');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('b', 'bh');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('b', 'bi');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('bi', 'biq');
Insert into ENTITIES (PARENT_ENTITY, CHILD_ENTITY) Values ('bi', 'biv');
COMMIT;

This data heirarchy looks like :

(Click on the image to enlarge)

Now we will try to understand the result of the following query:


Main Query:

select level,parent_entity,child_entity
from entities
start with parent_entity is null
connect by prior child_entity=parent_entity



The execution is something like this:
1. First look at your base result i.e
select * from parent_entity. (Keep this in mind)
2. Look at the start by condition...start with parent_entity is null. The startwith condition is used to identify the root of the tree. The 'prior' operator is used to specify the direction in which the query traverses the tree(down from root or up from branches)
The 'data to be scanned' is : (select * from entities where parent_entity is null)
Parent_Entity..............Child_Entity
........NULL............................a
........NULL............................b
So it searches those child_entities where parent_entity is null i.e top of the tree...

The main query result till this stage is:
Level........Parent_Entity............Child Entity
....1....................... NULL.........................a
3. It will now search for allthe childs of 'a'.
The childs of a are : af , ab and ax (this is done as a result of connect by prior child_entity=parent_entity clause )
It will now search for the childs of af. It will show the result in main query. (There are no childs of af)
The main query result till this stage is:
Level........ Parent_Entity....... Child Entity
....1...................... NULL......................a
....2......................... a........................af
It will now search the childs of ab. It will show the result in main query. (There are 3 childs of ab : abc,abd,abe)
Level........ Parent_Entity........ Child Entity
....1........................ NULL.................... a
....2........................... a........................af
....2........................... a........................ab
Now taking 1 child at a time of ab ; it will dig into abc,abd and abe. There are no childs for abc and abd but abe has two childs : abes and abet.

So the result of main query now is:
Level............... Parent_Entity................ Child Entity
....1 .................................NULL............................... a
....2...................................... a...................................af
....2...................................... a...................................ab
....3..................................... ab..................................abc
....3..................................... ab..................................abd
....3..................................... ab..................................abe
....4 .....................................abe................................abes
....4..................................... abe................................abet
It will now serch for childs of abes. There are no childs. It will now search for the childs of abet. There are no childs. So now the it will trace back the loop where it had left i.e at "ax" (3rd child of a) .
4. It will show the row corresponding to ax and serach for its childs (There are no childs for ax).
Result of main query now is:
Level.......... Parent_Entity.......... Child Entity
....1......................... NULL.......................... a
....2............................. a...............................af
....2............................. a...............................ab
....3............................ ab..............................abc
....3............................ ab..............................abd
....3............................ ab..............................abe
....4.............................abe............................abes
....4............................ abe............................abet
....2............................ a................................ax
5. Since there are no childs of ax, it will now trace back to the result set of our "start with" condition and start looking for data corresponding to 'b' ...(read point no. 2).
It will follow the same steps for this level also.

6. The final main query result set will be :
Level........... Parent_Entity.............. Child Entity
....1............................ NULL............................. a
....2................................ a............................af
....2................................ a ...........................ab
....3................................ ab..........................abc
....3................................ ab..........................abd
....3................................ ab..........................abe
....4................................ abe........................abes
....4................................ abe........................abet
....2................................ a............................ax
....1.............................. NULL........................... b
....2................................ b............................bg
....2................................ b............................bh
....2................................ b............................bi
....3................................ bi...........................biq
....3................................ bi...........................biv

Thats it!!!!!
You can alter your tree traversal by changing your "start with" condition.
For example if I write

select level,parent_entity,child_entity
from entities
start with parent_entity = 'a'
connect by prior child_entity=parent_entity


It will cut down the 'b' branch and will show only 'a' hierarchy. :-)) This process is called 'pruning'.

I hope u find it helpful.




1 comment:

max_junoon said...

hii gaurav goel
happy holi
hope you are doing well
i went through your blog and find very valueable i am thanking you from bottom of my heart and hope you will guide me some more in plsql with new idea