Explanation of sql query : Rank and partition oracle

2k views Asked by At

I am working with oracle database and is stuck at this query, I am not comprehending how this is working, i have removed all the unnecssesary information and the query has come down to this:

 select RANK () OVER (PARTITION BY A_pk ORDER BY B_pk DESC) as column from Table_C

Can some one explain how is this working?

3

There are 3 answers

0
Neil McGuigan On
  1. Partition the table into "windows" based on values in the column A_pk:

     A_pk     |  B_pk
    develop   |    11
    develop   |     7
    develop   |     9
    develop   |     8
    develop   |    10
    *** window ***
    personnel |     5
    personnel |     2
    *** window ***
    sales     |     3
    sales     |     1
    sales     |     4
    
  2. Order each "window" by B_pk desc:

     A_pk     |  B_pk
    develop   |    11
    develop   |    10
    develop   |     9
    develop   |     8
    develop   |     7
    *** window ***
    personnel |     5
    personnel |     2
    *** window ***
    sales     |     4
    sales     |     3
    sales     |     1
    
  3. return the rank of each row (the "place" it's in), by window:

    1
    2
    3
    4 
    5
    *** window ***
    1
    2
    *** window ***
    1
    2
    3
    
0
Lalit Kumar B On

Let's understand with this simple query on the sample EMP table in SCOTT schema:

SQL> SELECT empno,
  2    deptno,
  3    sal,
  4    rank() over(partition BY deptno order by sal) rn
  5  FROM emp;

     EMPNO     DEPTNO        SAL         RN
---------- ---------- ---------- ----------
      7934         10       1300          1
      7782         10       2450          2
      7839         10       5000          3
      7369         20        800          1
      7876         20       1100          2
      7566         20       2975          3
      7788         20       3000          4
      7902         20       3000          4
      7900         30        950          1
      7654         30       1250          2
      7521         30       1250          2
      7844         30       1500          4
      7499         30       1600          5
      7698         30       2850          6

14 rows selected.

SQL>

RANK function is a built in analytic function which is used to rank a record within a group of rows. The PARTITION BY clause in the window is what groups the rows, and the ORDER BY clause tells how to rank, i.e. which row in each group will hold the 1st rank and then assign next rank to the next rows in that order.

  • GROUP
  • SORT
  • Assign rank

So, in above example, the rows are grouped by department, and ordered by salary. In each group, the rank is assigned starting from the lowest salary(ascending order). When there is a tie, the rank is not incremented, however, the immediate next row with change in value will not have consecutive sequence. And that's what happened here:

7654         30       1250          2
7521         30       1250          2
7844         30       1500          4

The rank is not consecutive because there is a tie between two rows with salary 1250. To keep the sequence consecutive, you need to use DENSE_RANK.

0
Giorgi Nakeuri On

Suppose you have the following table:

SELECT *, 
       ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2) AS [ROW_NUMBER],
       RANK() OVER(PARTITION BY col1 ORDER BY col2) [RANK],
       DENSE_RANK() OVER(PARTITION BY col1 ORDER BY col2) [DENSE_RANK]
FROM SomeTable

col1    col2  |  ROW_NUMBER RANK    DENSE_RANK
1       1     |  1          1       1
1       1     |  2          1       1
1       5     |  3          3       2
1       5     |  4          3       2
1       9     |  5          5       3
1       9     |  6          5       3
2       1     |  1          1       1
2       1     |  2          1       1
2       1     |  3          1       1
2       3     |  4          4       2
2       3     |  5          4       2

ROW_NUMBER: These are window ranking functions. This means that when you partition by some column function works in the window of that partition. Imagine it like:

col1    col2
/*------*\
|1      1|
|1      1|
|1      5|  <--window 1
|1      5|
|1      9|
|1      9|
\*------*/

/*------*\
|2      1|
|2      1|
|2      1|  <--another window 2
|2      3|
|2      3|
\*------*/

When window changes everything resets! So for ROW_NUMBER you are ordering by col2 and assign incremental number by that order. When window changes function resets and begins from 1.

RANK:

This function will calculate count of rows where value of col2 is less then value of col2 in current row plus 1. For example in window 1 for row with col2 = 5 there are 2 rows with col2 = 1, so 2 + 1 = 3. For row with col2 = 9 there are 4 rows, so 4 + 1 = 5.

DENSE_RANK:

It is the same as RANK but it counts preceding DISTINCT values! For example for row with col2 = 9 there are 2 distinct values of col2 = 1 and 5 so 2 + 1 = 3