Deleting Duplicate Records Leaving Just the First
If you have a table that contains duplicates
of records each with it's own unique key and you want to remove all but one of
them then this is how to do it.
NOTE OF WARNING: YOU SHOULD FIRST PERFORM
THIS TASK IN A COPY OF THE DATABASE BEFORE EVEN THINKING OF RUNNING IT IN YOUR
ORIGINAL AS IT IS NOT REVERSIBLE AND IF YOU HAVE GOT IT WRONG YOU COULD LOSE ALL
YOUR DATA.
Create a new query and bring in the table in
question. Drag in both the field or fields that contain the duplicate data and
the unique key for that table. Set on the totals for the query by clicking the
summation symbol on the toolbar or by choosing view¦totals from the menus. In
the group by row for the unique key choose first. This will return the first
record for each group of records. Save this query as qry_List_Of_Firsts
Create a second query. Bring in the
qry_List_Of_Firsts and the original table. Then link them by unique key from the
unique table to the qry_List_Of_First. Double click the join and choose the
second option which shows all records from original table and only those records
from qry_List_Of_Firsts that match. Drag in the unique keys from both tables.
Click off the show button for the qry_List_Of_Firsts and set it's criteria to IS
NULL. This will list all the records in the original table that are not in the
qry_List_Of_Firsts. Save this new query as qry_List_To_Delete. This new query
now contains every record that is a duplicate that is not the first record for
that set of duplicates.
Create a third query and bring in just the
Original table. Set the query type to DELETE, and drag in the star field * (all
fields) and the unique id. Click off the show box for the unique id and for it's
criteria use one of the following two
For text type fields use
| DLookUp("UniqueFieldName","qry_List_To_Delete","UniqueFieldName ='" &
[UniqueFieldName] & "'") |
For number fields use
| DLookUp("UniqueFieldName","qry_List_To_Delete","UniqueFieldName
=" & [UniqueFieldName]) |
Then save the query.
BACKUP your data before running the query the first time (IT IS
NOT REVERSIBLE). Then run the query and check it has worked properly. You can
keep reusing the query at any time because it will only ever delete duplicates
as you specify in the first query
|