MySQL: removing duplicates and keeping one record with a single SQL statement (summary of the approach)

Posted on 3/13/2019 1:37:42 PM
A few days ago, while working on a requirement, I needed to clean up duplicate records in MySQL. My first idea was to traverse the records in code, but that felt too complicated, and I suspected a single SQL statement could solve the problem. After some searching and asking more experienced people, I ended up with a very convenient SQL statement. Here I share that statement and the thinking behind it.

Requirements analysis
If the database contains duplicate records, delete the extras and keep one of each (whether a record is a duplicate is judged on several fields together)
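To make the requirement concrete, here is a minimal sketch of a table with duplicates and a query that lists them. It borrows the table and column names that appear later in this post (consum_record, user_id, monetary, consume_time); the column types and the sample rows are assumptions made up for illustration.

```sql
-- Hypothetical schema: only the names come from the post, the types are assumed.
CREATE TABLE consum_record (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    user_id      INT           NOT NULL,
    monetary     DECIMAL(10,2) NOT NULL,
    consume_time DATETIME      NOT NULL
);

-- Made-up sample data: the first three rows are duplicates of each other.
INSERT INTO consum_record (user_id, monetary, consume_time) VALUES
    (100, 9.90, '2019-03-01 10:00:00'),
    (100, 9.90, '2019-03-01 10:00:00'),
    (100, 9.90, '2019-03-01 10:00:00'),
    (200, 5.00, '2019-03-02 12:30:00');

-- A record is a duplicate when all three fields match together.
SELECT user_id, monetary, consume_time, COUNT(*) AS copies
FROM consum_record
GROUP BY user_id, monetary, consume_time
HAVING COUNT(*) > 1;
```

With this sample data the goal is to end up with two rows: one for user 100 and the untouched row for user 200.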


Solution

When you run into a requirement like this, you probably already have an idea in mind. The first thing I thought of was solving it with one SQL statement, but I am not strong at complex SQL, so I wanted to ask someone more experienced for help.

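Asking for help

Because this requirement was a bit urgent, my first thought was to ask a peer who knows this area, so I shared the problem with @赵七七. He just did a quick Baidu search and tossed me a SQL statement he had never used himself, telling me to try it on my own. I was less than thrilled...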

Searching Baidu myself

I found a SQL statement:


The idea behind this SQL is straightforward and has three steps:

  • Query the duplicated (peopleId, seq) combinations in the table as the first condition: SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING count(*) > 1
  • Query the smallest rowid within each group of duplicates as the second condition: SELECT min(rowid) FROM vitae GROUP BY peopleId, seq HAVING count(*) > 1
  • Finally, using these two conditions, delete every duplicate record except the one with the smallest ID in its group
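The statement itself did not survive in this copy of the post. Putting the two subqueries above together, it was probably close to the following sketch. This is a reconstruction, not the original text, and it assumes vitae has a unique, non-null column named rowid (MySQL does not provide one implicitly under that name).

```sql
-- Reconstructed sketch; the exact original statement is not available.
DELETE FROM vitae
WHERE (peopleId, seq) IN (
          -- condition 1: the (peopleId, seq) pairs that occur more than once
          SELECT peopleId, seq
          FROM vitae
          GROUP BY peopleId, seq
          HAVING COUNT(*) > 1
      )
  AND rowid NOT IN (
          -- condition 2: the smallest rowid in each duplicated group (the row to keep)
          SELECT MIN(rowid)
          FROM vitae
          GROUP BY peopleId, seq
          HAVING COUNT(*) > 1
      );
```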


Unfortunately, running this statement fails with an error, which roughly says that a table cannot be updated while it is also being queried in the same statement.
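That description matches MySQL error 1093 ("You can't specify target table 'vitae' for update in FROM clause"). A commonly used workaround, which is not the route this post ends up taking, is to wrap the subquery in a derived table so that MySQL materializes it before the delete runs. A minimal sketch, again assuming a unique, non-null rowid column on vitae; the alias names keepers and keep_id are made up for illustration:

```sql
-- Workaround sketch for error 1093 (not the statement used later in this post).
DELETE FROM vitae
WHERE rowid NOT IN (
    SELECT keep_id
    FROM (
        -- One surviving rowid per (peopleId, seq) group, unique rows included.
        -- The GROUP BY keeps this derived table from being merged into the
        -- outer statement, so it is evaluated on its own first.
        SELECT MIN(rowid) AS keep_id
        FROM vitae
        GROUP BY peopleId, seq
    ) AS keepers
);
```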


Solving it in code

Building on the SQL statement above, I figured the same goal could be reached in code in two steps:

  • First, query the groups of duplicate records
  • Then loop over that result set and delete the extra duplicates in each group
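The author's code is not included in this copy, so the following is only a sketch of the two kinds of statement such a loop might issue. It assumes the consum_record table and columns named later in this post; the ? markers stand for values the application would bind from each row returned by step 1.

```sql
-- Step 1: run once. One row per duplicated group, with the id that should survive.
SELECT MIN(id) AS keep_id, user_id, monetary, consume_time
FROM consum_record
GROUP BY user_id, monetary, consume_time
HAVING COUNT(*) > 1;

-- Step 2: run once per row from step 1, binding that row's values
-- in the order user_id, monetary, consume_time, keep_id.
DELETE FROM consum_record
WHERE user_id = ?
  AND monetary = ?
  AND consume_time = ?
  AND id > ?;
```

Because step 2 is sent once per duplicated group, the total time grows with the number of groups, which is one plausible reason a loop like this can be slow on a large table.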


With the idea in place I wrote the code quickly, but I was shocked when I ran it: it took about 116 s. So I went back to looking for a SQL statement I could use. Here are the code and the run result:




The perfect "deduplicate and keep one" SQL

Finally, I got a perfect answer in a technical chat group. Look at this SQL statement:


If you read the statement above closely, the idea is not hard to follow. It can be understood in three steps:

  • Query the duplicate records into a derived table t2 that holds the smallest ID of each group of duplicates: (SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2
  • Join consum_record to t2 on the fields that define a duplicate: consum_record.user_id = t2.user_id and consum_record.monetary = t2.monetary and consum_record.consume_time = t2.consume_time
  • Delete the joined records whose ID is greater than the ID in t2, so that only the record with the smallest ID in each group remains
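The statement is missing from this copy of the post. Assembled from the three fragments above, it would look roughly like the following MySQL multi-table DELETE. Treat it as a reconstruction under the assumption that id is a unique key and that the three compared columns contain no NULLs (NULL values never compare equal, so such rows would not be matched).

```sql
-- Reconstructed sketch built from the fragments quoted in the steps above.
DELETE consum_record
FROM consum_record,
     (
         -- t2: one row per duplicated group, carrying the smallest id in it
         SELECT MIN(id) AS id, user_id, monetary, consume_time
         FROM consum_record
         GROUP BY user_id, monetary, consume_time
         HAVING COUNT(*) > 1
     ) t2
WHERE consum_record.user_id      = t2.user_id
  AND consum_record.monetary     = t2.monetary
  AND consum_record.consume_time = t2.consume_time
  AND consum_record.id           > t2.id;   -- keep the row with the smallest id

-- Optional check: under the assumptions above, this should now return no rows.
SELECT user_id, monetary, consume_time, COUNT(*) AS copies
FROM consum_record
GROUP BY user_id, monetary, consume_time
HAVING COUNT(*) > 1;
```

It is worth trying a statement like this on a copy of the table first, since the delete cannot be undone outside a transaction.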


When I saw this statement, I thought it was remarkably powerful: such a simple SQL statement solves what had seemed like a complicated problem. I learned something new~
It also runs very fast: the loop in code took about 116 s, while this statement needs only about 0.3 s. Amazing~






