Percentage with union all request

713 views Asked by At

As I asked in this question : Oracle SQL Group By if I log file usage in my application. There is 3 files sources:

  • Pool
  • MDA
  • Other

If the file is opened twice from MDA and once from Pool, I'll get two entries:

TESTID        SITE       LATEST_READ READ_COUNT FILE_ORIGIN_ID
------------- ---------- ----------- ---------- --------------
File1        |Site1     |02/05/13   |         2|             1 
File1        |Site2     |22/01/14   |         3|             2 

What I want to achieve is to get the ratio of files that are not in the Pool OR the MDA grouped by sites. So I managed to do this resquest:

SELECT Count(TESTID) as OTHER_FILES, SITE, 'OTHERS'
FROM USER_STATS.FILE_USAGE_LOG
WHERE TESTID not in (
  -- Files that are on Pool OR MDA
  SELECT TESTID
  FROM USER_STATS.FILE_USAGE_LOG
  WHERE FILE_ORIGIN_ID < 2
)
AND LATEST_READ between '01/05/2015' and '01/06/2015'
GROUP BY Site

UNION ALL
SELECT Count(TESTID) as OTHER_FILES, site, 'Files that are at least in Pool or MDA'
FROM USER_STATS.FILE_USAGE_LOG
WHERE TESTID in (
  -- Files that are on Pool OR MDA
  SELECT TESTID
  FROM USER_STATS.FILE_USAGE_LOG
  WHERE FILE_ORIGIN_ID < 2
)
AND LATEST_READ between '01/05/2015' and '01/06/2015'
GROUP BY Site

Which gives me this:

18      BR-CTA      Files that are at least in Pool or MDA
324     BR-CTA      OTHERS
26      BR-CTA-VPN  OTHERS
5       CN-TSN-VPN  OTHERS
2040    FR-LYON     Files that are at least in Pool or MDA
248     FR-LYON     OTHERS
1       IN-BLR      Files that are at least in Pool or MDA
1       IN-PUNE     OTHERS
810     JP-SAIT     OTHERS
48      JP-SAIT     Files that are at least in Pool or MDA
...

And I would like to have this:

94%         BR-CTA      Ratio -- 94% in OTHER
100%        BR-CTA-VPN  Ratio -- 100% in OTHER
100%        CN-TSN-VPN  Ratio -- 100% in OTHER
10%         FR-LYON     Ratio -- 10% in OTHER
0%          IN-BLR      Ratio -- 0% in OTHER
100%        IN-PUNE     Ratio -- 100% in OTHER
94%         JP-SAIT     Ratio -- 94% in OTHER
...

But I can't acheive this whatever I try. How can I do this?

I use nbTotal / (nbOther) * 100 as ratio calculation.

4

There are 4 answers

2
Karl Kieninger On BEST ANSWER

There's a few ways to do that and what is possible or best depends in part on you RDBMS. However, here is one way. I am substituting you query above with an IntermediateResults table for simplicity. In practice you could use your query with a CTE, derived table, temp table or table variable.

CREATE TABLE IntermediateResults (OtherFiles INT, Site VARCHAR(20), Message VARCHAR(100));
GO
INSERT INTO IntermediateResults (OtherFiles,Site,Message) VALUES (18,'BR-CTA','Files that are at least in Pool or MDA');
INSERT INTO IntermediateResults (OtherFiles,Site,Message) VALUES (324,'BR-CTA' ,'OTHERS');
INSERT INTO IntermediateResults (OtherFiles,Site,Message) VALUES (26,'BR-CTA-VPN','OTHERS');
INSERT INTO IntermediateResults (OtherFiles,Site,Message) VALUES (1,'IN-BLR','Files that are at least in Pool or MDA');
GO


SELECT COALESCE(o.Site,p.Site) Site
      ,Ratio = CASE WHEN o.OtherFiles IS NULL THEN 0
                    WHEN p.OtherFiles IS NULL THEN 100
                    ELSE 100 * o.OtherFiles/(p.OtherFiles + o.OtherFiles) END
FROM
  (SELECT * FROM IntermediateResults WHERE Message = 'OTHERS') o
FULL JOIN
  (SELECT * FROM IntermediateResults WHERE Message <> 'OTHERS') p
ON o.Site = p.Site

Results:

BR-CTA      94
IN-BLR      0
BR-CTA-VPN  100

EDIT: An example of how to replace the table in my example with your query would be to use a subquery factoring which is what Oracle calls TSQL Common Table Expression or the WITH construct.

WITH IntermediateResults AS (
    /*your query here*/
)
SELECT COALESCE(o.Site,p.Site) Site
      ,Ratio = CASE WHEN o.OtherFiles IS NULL THEN 0
                    WHEN p.OtherFiles IS NULL THEN 100
                    ELSE 100 * o.OtherFiles/(p.OtherFiles + o.OtherFiles) END
FROM
  (SELECT * FROM IntermediateResults WHERE Message = 'OTHERS') o
FULL JOIN
  (SELECT * FROM IntermediateResults WHERE Message <> 'OTHERS') p
ON o.Site = p.Site
2
less On

I've put the data in a temp Table for easier handling:

DECLARE @tmp  TABLE (readcount INT, site VARCHAR(40), origin VARCHAR(40))

INSERT INTO @TMP (readcount, site, origin) VALUES (18,'BR-CTA','Files that are at least in Pool or MDA')
INSERT INTO @TMP (readcount, site, origin) VALUES (324,'BR-CTA','OTHERS')
INSERT INTO @TMP (readcount, site, origin) VALUES (26,'BR-CTA-VPN','OTHERS')
INSERT INTO @TMP (readcount, site, origin) VALUES (5,'CN-TSN-VPN','OTHERS')

I think what you want is:

SELECT 
  percentage = 
   ((SELECT readcount      FROM @tmp T2 WHERE T2.site = T.site AND origin = 'OTHERS') * 100.0 / 
    (SELECT SUM(readcount) FROM @tmp T3 WHERE t3.site = T.site GROUP BY SITE) ),
  site
FROM @tmp t
GROUP BY site

This results in

94.736842105263 BR-CTA
100.000000000000    BR-CTA-VPN
100.000000000000    CN-TSN-VPN
0
jorge polanco On

Starting by the top you don't need a UNION ALL Query, you could retrieve your data with this query.

I omitted the range period for easy read.

SELECT COUNT(TESTID) AS OTHER_FILES,SITE
,CASE WHEN FILE_ORIGIN_ID < 2 THEN 'Files that are at least in Pool or MDA' ELSE 'OTHERS' END AS validCondition
FROM FILE_USAGE_LOG as pivot
GROUP BY pivot.TESTID
,(CASE WHEN FILE_ORIGIN_ID < 2 THEN 'Files that are at least in Pool or MDA' ELSE 'OTHERS' END);

So with that you could make an derived table, with out grouping to obtain all the universe.

SELECT COUNT(TESTID) AS OTHER_FILES,SITE
,CASE WHEN FILE_ORIGIN_ID < 2 THEN 'Files that are at least in Pool or MDA' ELSE 'OTHERS' END AS validCondition,
(COUNT(TESTID)/MAX(total))*100 AS ratio
FROM FILE_USAGE_LOG as pivot
CROSS JOIN(SELECT COUNT(TESTID) AS total FROM FILE_USAGE_LOG) AS u
GROUP BY pivot.TESTID
,(CASE WHEN FILE_ORIGIN_ID < 2 THEN 'Files that are at least in Pool or MDA' ELSE 'OTHERS' END);

I hope this answer could help you Regards.

0
Ponder Stibbons On

I think that this query may be helpful here:

with t as (
    select site,
        count(case when dsc = 'MDA' then testid end) mda,
        count(case when dsc = 'OTH' then testid end) oth
      from (
        select testid, site,
          case when exists (
              select testid from file_usage_log 
                 where file_origin_id<2 and testid = ful.testid) 
            then 'MDA' else 'OTH' end dsc
        from file_usage_log ful
        where latest_read between date '2015-05-01' and date '2015-06-01')
      group by site)
  select site, round(100*oth/(oth+mda)) percent from t

SQLFiddle

Without your data access it's hard to verify corectness of calculations, but for my examples it worked.