Find difference between max and second max, grouped by columns in MySql (8.0.13), without using the order by...
I have a table with the columns ID, ad_id, and amount_time, which hold the ID of each user, the ad they saw, and how long they watched it. Here is an example of the data:
table name: ads
ID | ad_id | amount_time
1  | 2     | 600
1  | 3     | 300
3  | 1     | 400
1  | 3     | 100
1  | 1     | 700
We want the result to show the difference between the max and second max of amount_time, grouped by ID and ad_id.
So the result is:
ID | ad_id | diff_amount_time
1  | 3     | 200
I can get the max and second_max of the total table by executing:
    SELECT
      (SELECT MAX(amount_time) FROM ads) AS maxtime,
      (SELECT MAX(amount_time) FROM ads
       WHERE amount_time NOT IN (SELECT MAX(amount_time) FROM ads)) AS second_max_time;
However, I am having trouble incorporating the GROUP BY clause to get my result. I know there is a way to use ORDER BY and LIMIT 2 to get the max and second max, but this is computationally expensive, and I want to know whether there is another solution that does not order the amount_time column.
mysql group-by max
What is your MySQL server version?
– Madhur Bhaiya
2 days ago
Just edited it in the title. Thanks for the reminder.
– Kevin Sun
2 days ago
How do you get 200?
– Eric
2 days ago
@KevinSun The MAX() function will also do a full table scan if there is no index. The same applies to ORDER BY. MySQL has significantly optimized the use of ORDER BY. I would recommend reading this article: dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html
– Madhur Bhaiya
2 days ago
In fact, the problem with your approach is the use of multiple subqueries and multiple MAX() calls, which will be much more inefficient. You can analyze the EXPLAIN statement results to confirm this.
– Madhur Bhaiya
2 days ago
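For reference, the MAX-based idea from the question can be extended to groups without any ORDER BY. The following is only a sketch under stated assumptions: it runs the SQL through Python's sqlite3 module as a stand-in for MySQL, and the names `NO_ORDER_QUERY` and `diff_without_order_by` are hypothetical. It joins each row to its group's maximum and then takes the largest value strictly below that maximum.

```python
import sqlite3

# Sample rows from the question: (ID, ad_id, amount_time).
SAMPLE_ROWS = [(1, 2, 600), (1, 3, 300), (3, 1, 400), (1, 3, 100), (1, 1, 700)]

# Assumption: plain standard SQL, executed here on SQLite instead of MySQL.
# The inner query finds each group's maximum; the outer query keeps only rows
# strictly below that maximum and takes their MAX as the second maximum.
NO_ORDER_QUERY = """
SELECT a.ID, a.ad_id,
       m.max_time - MAX(a.amount_time) AS diff_amount_time
FROM ads AS a
JOIN (SELECT ID, ad_id, MAX(amount_time) AS max_time
      FROM ads
      GROUP BY ID, ad_id) AS m
  ON m.ID = a.ID AND m.ad_id = a.ad_id
WHERE a.amount_time < m.max_time
GROUP BY a.ID, a.ad_id, m.max_time
"""


def diff_without_order_by(rows):
    """Load rows into an in-memory table and return (ID, ad_id, diff) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ads (ID INTEGER, ad_id INTEGER, amount_time INTEGER)"
        )
        conn.executemany("INSERT INTO ads VALUES (?, ?, ?)", rows)
        return conn.execute(NO_ORDER_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(diff_without_order_by(SAMPLE_ROWS))
```

Groups with a single row, or whose rows all share the same amount_time, drop out because no row lies strictly below the maximum. Like the NOT IN query in the question, this treats the second max as the second distinct value.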
edited 2 days ago
asked 2 days ago
Kevin Sun
1 Answer
accepted
In MySQL 8.0.2+, the simplest and possibly most performant way is to use window functions.
We will use the ROW_NUMBER() function to number the rows within each combination of ID and ad_id, in descending order of amount_time. The highest amount_time value gets row number 1, and the second highest gets row number 2.
We then use this result set as a derived table and aggregate (GROUP BY) on ID and ad_id. Conditional CASE .. WHEN expressions give the difference between the highest and second-highest value within each group.
    SELECT
      dt.ID,
      dt.ad_id,
      (MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END) -
       MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END)) AS diff_amount_time
    FROM
    (
      SELECT
        ID,
        ad_id,
        amount_time,
        -- number rows per (ID, ad_id) group, largest amount_time first
        ROW_NUMBER() OVER (PARTITION BY ID, ad_id
                           ORDER BY amount_time DESC) AS row_no
      FROM ads
    ) AS dt
    GROUP BY dt.ID, dt.ad_id
    -- to remove cases where there is no second highest
    -- when there is no second highest amount, then the difference will be null
    -- because 5 - null = null
    HAVING diff_amount_time IS NOT NULL;
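To check the window-function approach against the sample data from the question, here is a minimal sketch. It assumes SQLite (3.25 or later, via Python's sqlite3 module) as a stand-in for MySQL, partitions by ID, ad_id and orders by amount_time DESC. The names `WINDOW_QUERY` and `diff_with_row_number` are hypothetical. The HAVING clause repeats the expression instead of using the alias, purely to keep the sketch portable.

```python
import sqlite3

# Sample rows from the question: (ID, ad_id, amount_time).
SAMPLE_ROWS = [(1, 2, 600), (1, 3, 300), (3, 1, 400), (1, 3, 100), (1, 1, 700)]

# Assumption: run on SQLite >= 3.25, which supports ROW_NUMBER() OVER (...).
WINDOW_QUERY = """
SELECT dt.ID, dt.ad_id,
       MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END) -
       MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END) AS diff_amount_time
FROM (
    SELECT ID, ad_id, amount_time,
           ROW_NUMBER() OVER (PARTITION BY ID, ad_id
                              ORDER BY amount_time DESC) AS row_no
    FROM ads
) AS dt
GROUP BY dt.ID, dt.ad_id
HAVING MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END) IS NOT NULL
"""


def diff_with_row_number(rows):
    """Load rows into an in-memory table and return (ID, ad_id, diff) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ads (ID INTEGER, ad_id INTEGER, amount_time INTEGER)"
        )
        conn.executemany("INSERT INTO ads VALUES (?, ?, ?)", rows)
        return conn.execute(WINDOW_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(diff_with_row_number(SAMPLE_ROWS))
```

Only the group (ID 1, ad_id 3) has two rows, 300 and 100, which is where the 200 in the expected result comes from. Note that ROW_NUMBER() gives tied rows different numbers, so a group whose two largest values are equal yields a difference of 0 rather than being skipped.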
edited 2 days ago
answered 2 days ago
Madhur Bhaiya