Sunday, May 1, 2011

MySQL : selecting the X smallest values

Hi,

Let be a table like this :

CREATE TABLE `amoreAgentTST01` (
  `moname` char(64) NOT NULL DEFAULT '',
  `updatetime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `data` longblob,
  PRIMARY KEY (`moname`,`updatetime`)

I have a query to find the oldest records for each distinct 'moname', but only if there are multiple records for this 'moname' :

SELECT moname, updatetime FROM amoreAgentTST01 a 
WHERE (SELECT count(*) FROM amoreAgentTST01 x WHERE x.moname = a.moname) > 1 
  AND a.updatetime = (SELECT min(updatetime) FROM amoreAgentTST01 y WHERE y.moname = a.moname) ;

My question is : how to do the same but selecting the X oldest values ? I now simply run this, delete the oldest values and rerun it... which is not so nice.

Seconds question is : what do you think of the above query ? can it be improved ? is there any obvious bad practice ?

Thank you in advance for your advices and help.

Barth

From stackoverflow
  • Would something like this work (untested):

    SELECT moname, MIN(updatetime) FROM amoreAgentTST01 
    GROUP BY moname HAVING COUNT(moname)>1
    

    Edit - the above is meant only as a replacement for your existing code, so it doesn't directly answer your question.

    I think something like this should work for your main question:

    SELECT moname, updatetime FROM amoreAgentTST01 
    GROUP BY moname, updatetime 
    HAVING COUNT(moname)>1 
    ORDER BY updatetime LIMIT 0, 10
    

    Edit - sorry, the above won't work because it's returning only 10 records for all the monames - rather than the 10 oldest for each. Let me have a think.

    One more go at this (admittedly, this one looks a bit convoluted):

    SELECT a.moname, a.updatetime FROM amoreAgentTST01 a
    WHERE EXISTS 
    (SELECT * FROM amoreAgentTST01 b 
    WHERE a.moname = b.moname AND a.updatetime = b.updatetime  
    ORDER BY b.updatetime LIMIT 0, 10)
    AND (SELECT COUNT(*) FROM amoreAgentTST01 x WHERE x.moname = a.moname) > 1
    

    I should add that if there is an ID column - generally the primary key- then that should be used for the sub-query joins for improved performance.

    Barth : Indeed this is a lot better than the query I had previously. Thank you !
    BrynJ : I've just updated my answer with a possible solution.

0 comments:

Post a Comment

Note: Only a member of this blog may post a comment.