|ID (VARCHAR)||NAME (VARCHAR)||AMOUNT (INT)||LAST_UPDATED (TIMESTAMP)|
|2||Lee Sin Ti||100.00||2006-03-14-00.10.31.999999|
|2||Lee Sin Ti||200.00||2006-03-14-00.09.31.999999|
Based on this, your DB2 query is to remove all the outdated records while maintaining only the latest entries.
An idiotic first attempt:
DELETE FROM TABLE1
ID || NAME || CAST (AMOUNT AS VARCHAR(32)) || CAST(LAST_UPDATED AS VARCHAR(64))
SELECT (ID || NAME || CAST(AMOUNT AS VARCHAR(32)) || CAST(MAX(LAST_UPDATED) AS VARCHAR(64)) ) AS KEY FROM TABLE1 GROUP BY ID, NAME
This attempt is definitely a NO-NO. Not only the string concatenations takes a huge amount of processing cycles, it is also UGLY in my point of view. Using a generated random data of 100k records, it takes an unacceptable amount of time to complete the delete query.
Then, my second attempt got to deal with DB2 support of row_number() function, which I greatly appreciated from IBM.
SELECT ROW_NUMBER() OVER (PARTITION BY ID, NAME ORDER BY LAST_UPDATED DESC)
) AS X (ROWNUM) WHERE ROWNUM > 1;
WOW, an optimized yet elegant query to achieve my goal. With the same random data set, it only took less than 1 minute to complete the query.