

[Tips] SQL: how to query a table and remove duplicates

Posted on 10/9/2014 11:03:04 AM

Removing duplicate records in single-table and multi-table SQL queries

Single table: distinct

select distinct column from table

Multiple tables: group by

group by must be placed before order by and limit, otherwise an error will be reported
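
The clause order can be checked with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient test bed; the people table and its rows are made up for illustration, and other databases may word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (3, "Cid"), (3, "Cid"), (3, "Cid")],
)

# Single table: DISTINCT collapses repeated values.
distinct_ids = [
    row[0]
    for row in conn.execute("SELECT DISTINCT peopleId FROM people ORDER BY peopleId")
]

# Correct clause order: GROUP BY, then ORDER BY, then LIMIT.
top_two = conn.execute(
    "SELECT peopleId, COUNT(*) AS n FROM people "
    "GROUP BY peopleId ORDER BY n DESC LIMIT 2"
).fetchall()

# Putting ORDER BY before GROUP BY is rejected as a syntax error.
try:
    conn.execute("SELECT peopleId FROM people ORDER BY peopleId GROUP BY peopleId")
    misordered_ok = True
except sqlite3.OperationalError:
    misordered_ok = False
```

Here distinct_ids holds each peopleId once, top_two holds the two most frequent ids with their counts, and misordered_ok ends up False because the misordered statement never runs.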

************************************************************************************

1. Find the duplicate records in the table, where duplicates are identified by a single field (peopleId).

select * from people
where peopleId in (select  peopleId  from  people  group  by  peopleId  having  count(peopleId) > 1)

2. Delete the duplicate records in the table, where duplicates are identified by a single field (peopleId), keeping only the record with the smallest rowid in each group
delete from people
where peopleId  in (select  peopleId  from people  group  by  peopleId   having  count(peopleId) > 1)
and rowid not in (select min(rowid) from  people  group by peopleId  having count(peopleId )>1)

3. Find redundant duplicate records (multiple fields) in the table
select * from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq  having count(*) > 1)

4. Delete the redundant duplicate records (multiple fields) in the table, leaving only the records with the smallest rowid
delete from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)


5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid in each group
select * from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
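
As a rough check of items 1 to 5, the sketch below runs the same queries against SQLite through Python's sqlite3 module. The sample rows and the note column are invented for illustration. It relies on two SQLite features that other databases may lack: an implicit rowid on every table, and row-value comparisons such as (peopleId, seq) IN (...), which need SQLite 3.15 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (1, "Anne"), (3, "Cid"), (2, "Bobby")],
)
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# Item 1: every row whose peopleId occurs more than once.
people_dups = conn.execute(
    "SELECT peopleId, name FROM people WHERE peopleId IN "
    "(SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1) "
    "ORDER BY peopleId, rowid"
).fetchall()

# Item 5: only the surplus rows, i.e. duplicates minus the smallest rowid per group.
DUP_KEYS = "SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1"
KEEPERS = "SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1"
surplus = conn.execute(
    f"SELECT note FROM vitae a WHERE (a.peopleId, a.seq) IN ({DUP_KEYS}) "
    f"AND rowid NOT IN ({KEEPERS}) ORDER BY rowid"
).fetchall()

# Items 2 and 4: delete the surplus rows, keeping the smallest rowid per group.
conn.execute(
    "DELETE FROM people WHERE peopleId IN "
    "(SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1) "
    "AND rowid NOT IN "
    "(SELECT MIN(rowid) FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1)"
)
conn.execute(
    f"DELETE FROM vitae WHERE (peopleId, seq) IN ({DUP_KEYS}) "
    f"AND rowid NOT IN ({KEEPERS})"
)
people_left = conn.execute("SELECT peopleId, name FROM people ORDER BY rowid").fetchall()
vitae_left = conn.execute("SELECT note FROM vitae ORDER BY rowid").fetchall()
```

Running the select from item 5 before the delete is a cheap way to preview exactly which rows the delete will remove.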

(2)
For example, table A has a field "name", and different records may share the same "name" value.
To list the "name" values that occur in more than one record:
Select Name,Count(*) From A Group By Name Having Count(*) > 1

If the sex field must match as well, use:
Select Name,sex,Count(*) From A Group By Name,sex Having Count(*) > 1
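
A small sketch of the two queries, run against SQLite through Python's sqlite3 module. The rows in table A are invented for illustration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (Name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("Li", "M"), ("Li", "M"), ("Li", "F"), ("Wu", "F"), ("Wu", "F"), ("Xu", "M")],
)

# Names that appear in more than one record, with how often each appears.
by_name = conn.execute(
    "SELECT Name, COUNT(*) FROM A GROUP BY Name HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()

# Same idea, but a record only counts as a duplicate if sex matches too.
by_name_and_sex = conn.execute(
    "SELECT Name, sex, COUNT(*) FROM A GROUP BY Name, sex "
    "HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()
```

Adding sex to the grouping shrinks the count for "Li" from 3 to 2, because the one female "Li" no longer belongs to the same group.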

(3)
Method 1

-- key_field and table_name are placeholders for your own column and table
declare @max integer, @id integer

declare cur_rows cursor local for
    select key_field, count(*) from table_name group by key_field having count(*) > 1

open cur_rows
fetch cur_rows into @id, @max

while @@fetch_status = 0
begin
    -- limit the next delete to all but one of the rows sharing this key
    select @max = @max - 1
    set rowcount @max
    delete from table_name where key_field = @id
    fetch cur_rows into @id, @max
end

close cur_rows
deallocate cur_rows
set rowcount 0
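
The same idea (for each duplicated key, delete count minus one rows) can be sketched outside T-SQL. The version below is an approximation in Python with sqlite3, not a translation of the cursor code: SQLite has no set rowcount, so a LIMIT inside a subquery caps each delete instead. The items table, its columns and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (key_field INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (3, "e"), (3, "f")],
)

# Collect every duplicated key with its row count before deleting anything.
groups = conn.execute(
    "SELECT key_field, COUNT(*) FROM items GROUP BY key_field HAVING COUNT(*) > 1"
).fetchall()

for key, count in groups:
    # Delete count - 1 rows for this key, so exactly one row survives.
    conn.execute(
        "DELETE FROM items WHERE rowid IN "
        "(SELECT rowid FROM items WHERE key_field = ? LIMIT ?)",
        (key, count - 1),
    )

remaining = conn.execute(
    "SELECT key_field, COUNT(*) FROM items GROUP BY key_field ORDER BY key_field"
).fetchall()
```

As with the cursor version, which of the duplicate rows survives is not specified, so this approach only suits cases where the duplicates are interchangeable.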

Method 2

"Duplicate records" can mean two things. The first is a completely duplicated record, where every field is the same. The second is a record duplicated only on certain key fields, for example where the Name field repeats while the other fields may differ or can be ignored.

1. The first kind of duplication is relatively easy to handle. Use

select distinct * from tableName

You can get the result set with no duplicate records.

To delete the duplicates while keeping one copy of each record, proceed as follows

select distinct * into #Tmp from tableName

drop table tableName

select * into tableName from #Tmp
drop table #Tmp

This kind of duplication is caused by poor table design and can be prevented by adding a unique index.
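
The temp-table rebuild can be tried out in SQLite through Python's sqlite3 module. This is a sketch under assumptions: SQLite has no select ... into, so create table ... as select stands in for it, and the Name and Address columns and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [("Ann", "Oak St"), ("Ann", "Oak St"), ("Bob", "Elm St"), ("Ann", "Oak St")],
)

# Copy the distinct rows aside, drop the original, then rebuild it from the copy.
conn.executescript(
    """
    CREATE TEMP TABLE Tmp AS SELECT DISTINCT * FROM tableName;
    DROP TABLE tableName;
    CREATE TABLE tableName AS SELECT * FROM Tmp;
    DROP TABLE Tmp;
    """
)

rebuilt = conn.execute(
    "SELECT Name, Address FROM tableName ORDER BY Name"
).fetchall()
```

Note that a table rebuilt this way keeps only the data: in SQLite, create table ... as select does not carry over indexes or constraints, so those have to be recreated afterwards.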

2. The second kind of duplication usually requires keeping the first record of each duplicate group. Proceed as follows

Suppose there are duplicate fields of Name and Address, and you need to get a unique result set for these two fields

select identity(int,1,1) as autoID, * into #Tmp from tableName

select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address

select * from #Tmp where autoID in(select autoID from #Tmp2)

The last select returns a result set in which no Name and Address combination is repeated. It carries an extra autoID column, which you can leave out by listing the wanted columns explicitly in the select clause.
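
For comparison, here is a sketch of the same "keep the first record per Name and Address" idea in SQLite, via Python's sqlite3 module. It assumes SQLite's implicit rowid can play the role of autoID, which removes the need for the temp tables; the Phone column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?)",
    [
        ("Ann", "Oak St", "111"),
        ("Ann", "Oak St", "222"),
        ("Bob", "Elm St", "333"),
        ("Ann", "Pine St", "444"),
    ],
)

# Keep the earliest-inserted row of each (Name, Address) group.
# Listing the columns explicitly leaves the surrogate id out of the result.
first_rows = conn.execute(
    "SELECT Name, Address, Phone FROM tableName "
    "WHERE rowid IN (SELECT MIN(rowid) FROM tableName GROUP BY Name, Address) "
    "ORDER BY rowid"
).fetchall()
```

The second "Ann, Oak St" row is dropped, while "Ann, Pine St" stays because its Address differs.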

(4)
Query for duplicate records

select * from tablename where id in (select id from tablename group by id having count(id) > 1)

3. Find redundant duplicate records (multiple fields) in the table
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

Running this can fail: on databases without row-value support (SQL Server, for example), a predicate written as where (a.peopleId, a.seq) in (...) will not work!!
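
One common way around this is to join against the grouped subquery instead of using a row-value in. The sketch below checks that form with Python's sqlite3 module; it uses only an ordinary join and group by, so the same SQL should carry over to most databases, though that is an assumption rather than something tested here. The note column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d")],
)

# Join each row to the set of duplicated (peopleId, seq) pairs,
# comparing the fields one by one instead of as a row value.
dups = conn.execute(
    "SELECT a.peopleId, a.seq, a.note FROM vitae a "
    "JOIN (SELECT peopleId, seq FROM vitae "
    "      GROUP BY peopleId, seq HAVING COUNT(*) > 1) d "
    "ON a.peopleId = d.peopleId AND a.seq = d.seq "
    "ORDER BY a.rowid"
).fetchall()
```

An exists subquery with the same two equality conditions would be another option.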





