SQL Find a minimum number for each value in analytical function

377 views Asked by At

I am using Microsoft SQL Server. I want to write a query with only analytical function ( like not using group by)

I want to write a query that returns rows of:

  • MIN(Customer_number), count(code), private
  • MIN(Customer_number), count(code), business

I have made two tables for example

enter image description here

by this example, it should be only for rows that Customer_number = 1,2 like this:

* 1,intermediate results counting,private
.
.
.
* 2, intermediate results counting, business
.
.
.

I wrote:

SELECT 
    MIN(subscribers.customer_number) OVER (PARTITION BY customers.customer_number, customer_type) AS cusNo,
    COUNT(subscribers.code) OVER (PARTITION BY customers.customer_number, customer_type) AS subscribes,
    customer_type
FROM
    customers 
JOIN
    subscribers ON subscribers.customer_number = customers.customer_number;

I tried for so much time to understand how to return the right output, why it's not working and could not find it

If someone could help me out with how to order it, and explain what was wrong, that'd be great.

1

There are 1 answers

4
Caius Jard On

You seem to want only customer 1 and 2, which I interpret as you wanting only the lowest numbered customer from private and also the lowest numbered customer from business.

You don't want to use group by.

SELECT * FROM
  (
    SELECT
      --number each row, order by customer number, "group" by business type 
      ROW_NUMBER() OVER(PARTITION BY c.customer_type ORDER BY c.customer_number) rown, 
      --it isn't totally clear if you want the count of codes per customer or per type
      count(s.code) OVER (partition by c.customer_number) AS count_codes_by_cust,
      count(s.code) OVER (partition by c.customer_type) AS count_codes_by_type,
      customer_type 
    FROM
      customers c 
      INNER JOIN
      subscribers s
      ON s.customer_number = c.customer_number
  ) c
WHERE rown = 1 --cust #1 and cust#2 both have a rown of 1

Note, I've left the join to subscribers and counting codes out of this, for clarity - the critical concept that your original attempt was missing was using a WHERE to restrict the output to only two rows

Your first attempt could also be modified to produce the requirement:

SELECT * FROM
(
 SELECT 
  min(c.customer_number) OVER (partition by c.customer_type) AS min_cust,
  c.customer_number,
  --it isn't totally clear if you want the count of codes per customer or per type
  count(s.code) OVER (Partition by c.customer_number) AS count_codes_by_cust,
  count(s.code) OVER (Partition by c.customer_type) AS count_codes_by_type,
  customer_type
 FROM 
  customers c 
  INNER JOIN
  subscribers s
  ON s.customer_number = c.customer_number
)d
WHERE
  min_cust = customer_number

But a deficiency of this approach is it gives multiple rows, because joining customers and subscribers causes the customer number to repeat, and you end up with more than one row where the where min_cust = customer number is true: the min(customer_number) over(...) has chosen "1" and "2" as your minimum customer numbers, but thanks to the join to subscribers, 1 appears 3 times and 2 appears twice

The row_number approach is different in this regard = only one row per customer_type can ever have a row number of 1, so you'll only get as many rows as you have distinct types of customer. In more general terms, if you have a table that stores, for example, multiple dated versions of a document, and you want only the latest version of each document, you can:

SELECT * FROM (
  SELECT 
    doc.*, 
    ROW_NUMBER() OVER(PARTITION BY filename ORDER BY filedate DESC) --latest is numbered 1
  FROM
    documents
)doc WHERE rown = 1

To see the action in more detail, highlight and run the inner queries only, to look at the raw data the outer query is operating on.