Find difference between max and second max, grouped by columns in MySQL (8.0.13), without using the order by...











I have a table with the columns ID, ad_id, and amount_time, which hold the ID of the user, the ad they saw, and how long they viewed it. Here is an example of the data:



table name: ads

ID | ad_id | amount_time
1  | 2     | 600
1  | 3     | 300
3  | 1     | 400
1  | 3     | 100
1  | 1     | 700


We want the result to show the difference between the max and second max of amount_time, grouped by ID and ad_id.

In the sample data only the group ID 1, ad_id 3 has two rows (300 and 100), so the result is:



ID | ad_id | diff_amount_time
1  | 3     | 200


I can get the max and second_max of the total table by executing:



SELECT
  (SELECT MAX(amount_time) FROM ads) AS maxtime,
  (SELECT MAX(amount_time) FROM ads
   WHERE amount_time NOT IN (SELECT MAX(amount_time) FROM ads)) AS second_max_time


However, I am having trouble incorporating the GROUP BY clause to get my result. I know that ORDER BY with LIMIT 2 can return the max and second max, but I expect that to be computationally expensive, so I would like to know whether there is another solution that does not order the amount_time column.










  • What is your MySQL server version?
    – Madhur Bhaiya
    2 days ago










  • Just edited it in the title. Thanks for the reminder.
    – Kevin Sun
    2 days ago










  • How do you get 200?
    – Eric
    2 days ago






  • @KevinSun MAX() will also do a full table scan if there is no index, and the same applies to ORDER BY. MySQL has significantly optimized ORDER BY. I would recommend reading this article: dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html
    – Madhur Bhaiya
    2 days ago






  • In fact, the problem with your approach is the use of multiple subqueries and multiple MAX() calls, which will be much less efficient. You can check the EXPLAIN output to confirm this.
    – Madhur Bhaiya
    2 days ago















mysql group-by max






edited 2 days ago

























asked 2 days ago









Kevin Sun









1 Answer
accepted










In MySQL 8.0.2 and later, the simplest and possibly most performant approach is to use window functions.



We will use the ROW_NUMBER() function to number the rows within each combination of ID and ad_id, ordered by amount_time descending. The highest amount_time value gets row number 1, and the second highest gets row number 2.



We then use this result set as a derived table and aggregate (GROUP BY) on ID and ad_id. Conditional CASE .. WHEN expressions pick out the highest and second-highest values within each group, so that we can take their difference.



SELECT
  dt.ID,
  dt.ad_id,
  (MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END) -
   MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END)) AS diff_amount_time
FROM
(
  SELECT
    ID,
    ad_id,
    amount_time,
    -- number rows within each (ID, ad_id) group, largest amount_time first
    ROW_NUMBER() OVER (PARTITION BY ID, ad_id
                       ORDER BY amount_time DESC) AS row_no
  FROM ads
) AS dt
GROUP BY dt.ID, dt.ad_id
-- to remove groups where there is no second highest amount:
-- in that case the difference is NULL, because 5 - NULL = NULL
HAVING diff_amount_time IS NOT NULL
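
The window-function approach can be checked against the sample data from the question. The following is only a minimal sketch, not part of the original answer: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or later, which supports window functions) as a stand-in for MySQL, and the helper name max_gap_per_group is hypothetical. The HAVING clause repeats the aggregate expression instead of the alias to stay portable across engines.

```python
import sqlite3

# Sample rows from the question: (ID, ad_id, amount_time)
ROWS = [(1, 2, 600), (1, 3, 300), (3, 1, 400), (1, 3, 100), (1, 1, 700)]

# ROW_NUMBER() ranks rows inside each (ID, ad_id) group by amount_time,
# largest first; conditional aggregation then picks rows 1 and 2.
QUERY = """
SELECT dt.ID,
       dt.ad_id,
       MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END)
     - MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END) AS diff_amount_time
FROM (
    SELECT ID, ad_id, amount_time,
           ROW_NUMBER() OVER (PARTITION BY ID, ad_id
                              ORDER BY amount_time DESC) AS row_no
    FROM ads
) AS dt
GROUP BY dt.ID, dt.ad_id
-- keep only groups that actually have a second-highest row
HAVING MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END) IS NOT NULL
"""


def max_gap_per_group(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ads (ID INTEGER, ad_id INTEGER, amount_time INTEGER)"
        )
        conn.executemany("INSERT INTO ads VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(max_gap_per_group(ROWS))
```

Note that ROW_NUMBER() assigns distinct numbers to tied rows, so a group whose two largest amount_time values are equal yields a difference of 0 rather than being skipped.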





        edited 2 days ago

























        answered 2 days ago









        Madhur Bhaiya































             
