SQL: How to query a table and remove duplicates

admin · Posted on 10/9/2014 11:03:04 AM

Removing duplicate records with SQL single-table and multi-table queries

Single table distinct

select distinct column_name from table_name


Multiple tables: group by

The group by clause must be placed before order by and limit; otherwise an error is reported.
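As a rough illustration, the sketch below runs both forms from Python against an in-memory SQLite database. SQLite, the people table, and its rows are assumptions made for the example, since the post does not name a database or give sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (peopleId integer, name text)")
conn.executemany(
    "insert into people values (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (3, "Cy"), (3, "Cy"), (3, "Cy")],
)

# distinct collapses repeated values of the selected column
distinct_ids = [row[0] for row in conn.execute(
    "select distinct peopleId from people order by peopleId")]

# group by is written before order by and limit
top_two = conn.execute(
    "select peopleId, count(*) as n from people "
    "group by peopleId order by n desc limit 2").fetchall()

# writing group by after order by is rejected as a syntax error
try:
    conn.execute(
        "select peopleId from people order by peopleId group by peopleId")
    misplaced_ok = True
except sqlite3.OperationalError:
    misplaced_ok = False
```

Here distinct_ids holds each peopleId once, top_two holds the two most frequent ids with their counts, and misplaced_ok ends up False because the misordered statement never runs.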

************************************************************************************

1. Find the redundant duplicate records in the table, where duplicates are identified by a single field (peopleId).

select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)

2. Delete the redundant duplicate records in the table, where duplicates are identified by a single field (peopleId), keeping only the record with the smallest rowid in each group.

delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1)
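The find and delete statements for a single field can be tried out with a small sketch like the one below. It assumes SQLite (which also exposes a rowid) and made-up rows; on another database the row identifier may have a different name or not exist at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (peopleId integer, name text)")
conn.executemany(
    "insert into people values (?, ?)",
    [(1, "Ann"), (2, "Bob"), (2, "Bobby"), (3, "Cy"), (3, "Cy"), (3, "Cyrus")],
)

# Subquery listing every peopleId that occurs more than once
dup_ids = ("select peopleId from people "
           "group by peopleId having count(peopleId) > 1")

# Step 1: every row whose peopleId is duplicated
found = conn.execute(
    f"select rowid, peopleId, name from people "
    f"where peopleId in ({dup_ids}) order by rowid").fetchall()

# Step 2: delete the duplicates, keeping the smallest rowid of each group
deleted = conn.execute(
    f"delete from people where peopleId in ({dup_ids}) "
    "and rowid not in (select min(rowid) from people "
    "group by peopleId having count(peopleId) > 1)").rowcount

remaining = conn.execute(
    "select rowid, peopleId, name from people order by rowid").fetchall()
```

After the delete, each peopleId appears once and the surviving row in each group is the one that was inserted first.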

3. Find redundant duplicate records (multiple fields) in the table.

select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

4. Delete the redundant duplicate records (multiple fields) in the table, keeping only the record with the smallest rowid in each group.

delete from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*) > 1)

5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid in each group.

select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*) > 1)
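The multi-field statements can be sketched the same way. The example assumes SQLite 3.15 or later (which accepts the row-value form and has a rowid) and invented rows with a hypothetical note column; the table alias is dropped from the delete to keep the statement portable across SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table vitae (peopleId integer, seq integer, note text)")
conn.executemany(
    "insert into vitae values (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"),
     (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# Row belongs to a (peopleId, seq) group that has more than one member
in_dup_group = ("(peopleId, seq) in (select peopleId, seq from vitae "
                "group by peopleId, seq having count(*) > 1)")
# Row is not the first (smallest rowid) member of its duplicate group
not_first = ("rowid not in (select min(rowid) from vitae "
             "group by peopleId, seq having count(*) > 1)")

all_dups = [r[0] for r in conn.execute(
    f"select note from vitae where {in_dup_group} order by rowid")]
extra_dups = [r[0] for r in conn.execute(
    f"select note from vitae where {in_dup_group} and {not_first} "
    "order by rowid")]

# Delete only the extra copies
conn.execute(f"delete from vitae where {in_dup_group} and {not_first}")
kept = [r[0] for r in conn.execute("select note from vitae order by rowid")]
```

all_dups corresponds to statement 3, extra_dups to statement 5, and kept shows what is left after the delete from statement 4.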

(2)
For example, table A has a field "name", and different records may share the same "name" value. To list the "name" values that are duplicated across records in the table:

Select Name,Count(*) From A Group By Name Having Count(*) > 1

If the gender must also match, use the following:
Select Name,sex,Count(*) From A Group By Name,sex Having Count(*) > 1
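A small sketch of the two counting queries, run on SQLite with made-up rows (both are assumptions for illustration), shows how adding sex to the group by narrows the groups:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table A (Name text, sex text)")
conn.executemany(
    "insert into A values (?, ?)",
    [("Li", "M"), ("Li", "M"), ("Li", "F"),
     ("Wang", "F"), ("Zhao", "M"), ("Zhao", "M")],
)

# Names that occur more than once, regardless of sex
by_name = conn.execute(
    "Select Name, Count(*) From A Group By Name "
    "Having Count(*) > 1 Order By Name").fetchall()

# Name and sex combinations that occur more than once
by_name_sex = conn.execute(
    "Select Name, sex, Count(*) From A Group By Name, sex "
    "Having Count(*) > 1 Order By Name, sex").fetchall()
```

"Li" is counted three times by name alone, but only the two male "Li" rows form a duplicate group once sex is part of the key.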

(3)
Method 1

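declare @max integer, @id integer

-- keyColumn and tableName are placeholders for the duplicated column and its table
declare cur_rows cursor local for
    select keyColumn, count(*) from tableName group by keyColumn having count(*) > 1

open cur_rows
fetch cur_rows into @id, @max

while @@fetch_status = 0
begin
    -- limit the next delete to all but one row of this group
    select @max = @max - 1
    set rowcount @max
    delete from tableName where keyColumn = @id
    fetch cur_rows into @id, @max
end

close cur_rows
deallocate cur_rows

-- restore the default (no row limit)
set rowcount 0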
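The idea behind the cursor loop, deleting count minus one rows for each duplicated key, can be sketched outside T-SQL as well. The version below is an analog on SQLite, not a translation of the author's script: the items table and its columns are hypothetical, and a limit inside a rowid subquery stands in for set rowcount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table items (id integer, label text)")
conn.executemany(
    "insert into items values (?, ?)",
    [(1, "x"), (1, "x"), (1, "x"), (2, "y"), (3, "z"), (3, "z")],
)

# Plays the role of the cursor: each duplicated key with its row count
groups = conn.execute(
    "select id, count(*) from items "
    "group by id having count(*) > 1").fetchall()

for key, count in groups:
    # Stand-in for "set rowcount @max": remove only count - 1 rows of this key
    conn.execute(
        "delete from items where rowid in "
        "(select rowid from items where id = ? limit ?)",
        (key, count - 1),
    )

remaining = conn.execute(
    "select id, label from items order by id").fetchall()
```

Which copy survives in each group is not controlled here, which matches the original loop: it only guarantees that exactly one row per key is left.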

Method 2

"Duplicate records" can mean two things. One is a completely duplicate record, where every field is duplicated. The other is a record in which only some key fields are duplicated, for example the Name field, while the other fields either differ or can be ignored.

1. The first type is relatively easy to handle. Use

select distinct * from tableName

You can get the result set with no duplicate records.

If you need to delete the duplicate records (keeping one record from each group), you can proceed as follows

select distinct * into #Tmp from tableName

drop table tableName

select * into tableName from #Tmp
drop table #Tmp

This kind of duplication results from poor table design and can be prevented by adding a unique index.
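The rebuild-from-distinct steps and the unique index can be sketched as follows. SQLite is assumed, so create table ... as select takes the place of select ... into, a temp table named Tmp takes the place of #Tmp, and the index name ux_name_address and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tableName (Name text, Address text)")
conn.executemany(
    "insert into tableName values (?, ?)",
    [("Ann", "Oak St"), ("Ann", "Oak St"),
     ("Bob", "Elm St"), ("Ann", "Pine St")],
)

# Copy the distinct rows aside, drop the original, then rebuild it
conn.execute("create temp table Tmp as select distinct * from tableName")
conn.execute("drop table tableName")
conn.execute("create table tableName as select * from Tmp")
conn.execute("drop table Tmp")

rows = conn.execute(
    "select Name, Address from tableName order by Name, Address").fetchall()

# With a unique index in place, a fully duplicated row is rejected
conn.execute(
    "create unique index ux_name_address on tableName (Name, Address)")
try:
    conn.execute("insert into tableName values ('Ann', 'Oak St')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Note that rebuilding a table this way does not carry over indexes or constraints from the original, so they need to be recreated afterwards.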

2. The second type usually requires keeping the first record of each group of duplicates. The procedure is as follows

Suppose the Name and Address fields contain duplicates, and you need a result set that is unique on these two fields

select identity(int,1,1) as autoID, * into #Tmp from tableName

select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address

select * from #Tmp where autoID in(select autoID from #Tmp2)

The last select returns a result set in which each Name and Address combination appears only once. It also contains the extra autoID field, which you can leave out in practice by listing the required columns in the select clause instead of using *.
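A minimal sketch of this keep-the-first-record approach, assuming the grouping is on Name and Address as the text describes. It runs on SQLite, where an autoincrement primary key stands in for identity(int,1,1) and temp tables stand in for #Tmp and #Tmp2; the Phone column and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tableName (Name text, Address text, Phone text)")
conn.executemany(
    "insert into tableName values (?, ?, ?)",
    [("Ann", "Oak St", "111"), ("Ann", "Oak St", "222"),
     ("Bob", "Elm St", "333"), ("Ann", "Pine St", "444"),
     ("Bob", "Elm St", "555")],
)

# Number the rows: autoID plays the role of identity(int,1,1)
conn.execute(
    "create temp table Tmp (autoID integer primary key autoincrement, "
    "Name text, Address text, Phone text)")
conn.execute(
    "insert into Tmp (Name, Address, Phone) "
    "select Name, Address, Phone from tableName order by rowid")

# Smallest autoID in each (Name, Address) group marks the first record
conn.execute(
    "create temp table Tmp2 as "
    "select min(autoID) as autoID from Tmp group by Name, Address")

# Listing the columns explicitly leaves autoID out of the result
first_rows = conn.execute(
    "select Name, Address, Phone from Tmp "
    "where autoID in (select autoID from Tmp2) order by autoID").fetchall()
```

The fields outside the key (Phone here) come from whichever record was numbered first in its group.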

(4)
Query for duplicated records

select * from tablename where id in (select id from tablename group by id having count(id) > 1)

3. Find redundant duplicate records (multiple fields) in the table
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

Running this statement can fail: some databases (SQL Server, for example) do not accept a row-value condition such as where (a.peopleId, a.seq) in (...)!!
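Where the row-value form is rejected, one common workaround is to join the table against the grouped subquery instead. The sketch below shows that variant; it is run on SQLite with made-up rows and a hypothetical note column purely to check the logic, and the alias d is an assumption, but the join itself uses only standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table vitae (peopleId integer, seq integer, note text)")
conn.executemany(
    "insert into vitae values (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"),
     (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# Join each row to the list of duplicated (peopleId, seq) pairs,
# avoiding the (a.peopleId, a.seq) in (...) form entirely
dups = conn.execute(
    "select a.peopleId, a.seq, a.note from vitae a "
    "join (select peopleId, seq from vitae "
    "      group by peopleId, seq having count(*) > 1) d "
    "on a.peopleId = d.peopleId and a.seq = d.seq "
    "order by a.note").fetchall()
```

The result contains the same rows the row-value query would return: every record whose (peopleId, seq) pair occurs more than once.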
