Quick Search for:  in language:    
SQL,Hassle,free,locating,reporting,deletion,d
   Code/Articles » |  Newest/Best » |  Community » |  Jobs » |  Other » |  Goto » | 
CategoriesSearch Newest CodeCoding ContestCode of the DayAsk A ProJobsUpload
SQL Stats

 Code: 45,316. lines
 Jobs: 126. postings

 How to support the site

 
Sponsored by:

 
You are in:
 
Login





Latest Code Ticker for SQL.
sp_db_paging_pr imary_key
By Marcus LeBlanc on 1/11


sp_db_paging
By Marcus LeBlanc on 1/11


Beginners pack
By hardik shah on 1/11


Inline Views
By Thivya Prabakaran on 1/6


Click here to put this ticker on your site!


Add this ticker to your desktop!


Daily Code Email
To join the 'Code of the Day' Mailing List click here!

Affiliate Sites



 
 
   

SQL Detect and or Delete Dups

Print
Email
 
VB icon
Submitted on: 10/25/2002 5:15:49 PM
By: Sean P. O. MacCath-Moran  
Level: Advanced
User Rating: By 2 Users
Compatibility:SQL Server 2000

Users have accessed this code 3831 times.
 
(About the author)
 
     /* Hassle free locating / reporting or deletion of duplicate rows in a SQL table based on the uniqueness of a column or columns that the user provides with a few nifty hands off features. Keywords: delete unique column record records */
 
Can't Copy and Paste this?
Click here for a copy-and-paste friendly version of this code!

--**************************************
--     
-- for :SQL Detect and or Delete Dups
--**************************************
--     
/*None*/
code:
Can't Copy and Paste this?
Click here for a copy-and-paste friendly version of this code!
 
Terms of Agreement:   
By using this code, you agree to the following terms...   
1) You may use this code in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.   
2) You MAY NOT redistribute this code (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.   
3) You may link to this code from another website, but ONLY if it is not wrapped in a frame. 
4) You will abide by any additional copyright restrictions which the author may have placed in the code or code's description.

--**************************************
--     
-- Name: SQL Detect and or Delete Dups
-- Description:/*
Hassle free locating / reporting or deletion OF duplicate rows in a SQL TABLE based on the uniqueness of a COLUMN or columns that the USER provides WITH a few nifty hands OFF features.
Keywords: DELETE UNIQUE COLUMN record records
*/
-- By: Sean P. O. MacCath-Moran
--
-- Inputs:/*
sp_RemoveDups ''TableName'', ''DupQualifierFieldNameList'', [''DeleteDups''], [''UniqueColName''], [''CreateIdentColIfNeeded''], <''StoredProcedureResult''>'
TableName: Required - String - Name OF the table TO detect duplicate records in.'
DupQualifierFieldNameList: Required - String - Comma seperated list OF columns that make up the UNIQUE record within TableName.'
DeleteDups: Optional - Bit, SET TO 0 to delete duplicate records. A value OF NULL or 1 will RETURN the duplicate records to be deleted.'
UniqueColName: Optional - Bit - A TABLE must have a unique COLUMN value IN it TO perform the deletion logic. IF no UniqueColName IS provided THEN an attemp will be made to locate the RowGUID column. If that fails then an attempt will be made to locate the IDENTITY column. If that fails then ALL OF the columns of the table will be examined AND the first one WITH all unique VALUES will be selected.'
CreateIdentColIfNeeded: Optional - Bit - BY DEFAULT this SP will create an IDENTITY COLUMN if no UNIQUE column can be located. Pass IN a 1 here TO run this feature off.'
StoredProcedureResult: Optional - OUTPUT - Int - Returns a 3 IF an error occured, otherwise returns a 0.'
*/
--
-- Returns:/*
IF a the first two parameters are either missing or invalid AND the USER is making the stored PROCEDURE call FROM SQL Query Analyzer THEN help text is provided.
IF the DeleteDups argument IS NOT passed IN or it is SET TO 1 THEN a recordset OF the duplicate records to be deleted will be returned.
IF the DeleteDups argument IS SET TO 0 THEN the DEFAULT SQL success or failure behavior should occur.
*/
--
-- Assumes:/*
Ok - I got tired OF repeatedly modifying my favorite duplicate deletion algorithm FOR whatever TABLE I happen TO be working WITH at the time. This code is the result of that frustration.
The actual deletion method employed IS a correlated subquery, so I must locate or CREATE a COLUMN OF UNIQUE values. TO execute, this code ONLY requires that you specify the TABLE name AND which column or columns make up a unique row IN that table. It will examine the table to discover if it has a Row-GUID or IDENTITY column. If not, THEN each column is examined to determine if it CONTAINS ALL unique values. Failing THAT it will create a TEMPORARY Identity column. Alternatively, the unique column can be specified.
NOTE: BY default, this code does not DELETE record - it just reports them. TO delete records you must pass IN the appropriate parameters.
*/
--
--This code is copyrighted and has-- limited warranties.Please see http://
--     www.Planet-Source-Code.com/vb/scripts/Sh
--     owCode.asp?txtCodeId=576&lngWId;=5--for details.--**************************************
--     

/*	Description: UTILITY - Locate IN MASTER 
	 Syntax: EXEC sp_RemoveDups TableName, DupQualifierFieldNameList, DeleteDups, UniqueColName, CreateIdentColIfNeeded, StoredProcedureResult
			ONLY the first two arguments are required
			FOR HELP, enter the stored procedures name without ANY arguments or see the PRINT statements below
			NO DELETION WILL OCCUR BY DEFAULT - ONLY duplicate recordset returned. TO DELETE records, pass IN a 0 FOR the DeleteDups argument.
	 Example: EXEC sp_RemoveDups 'MyTable','LastName,FirstName,HomePhone'
 	 Purpose: Allow removal OF duplicate rows WHERE
			1. We define what fields qualify the duplicate
			2. We SELECT the UNIQUE rowid OR it IS detected automatically ELSE no action takes place
	Method:		DELETE BY RowID ALL duplicate rows EXCEPT the highest RowID (in alpha-sort sequence) 
			OF each GROUP of duplicates. 
 	DATE		BY				CHANGE
	09-23-2002	Frank				Original v1.0
	09-23-2002	Frank				Changed the name FROM sp_RemoveDupsByRowID TO sp_RemoveDups
	10-8-2002	Sean P. O. MacCath-Moran	Made @UniqueColName optional
							Added logic TO auto-detect RowGUID AND IDENTITY columns
							Added logic TO CHECK FOR UNIQUE value COLUMN IF no RowGUID or IDENTITY column EXISTS
							Added logic TO CREATE a TEMPORARY ID field AS a last resort IF no UNIQUE COLUMN could be located
							Added HELP
*/
CREATE PROCEDURE sp_RemoveDups
@TableName AS varchar(50) = NULL,
@DupQualifierFieldNameList AS varchar(200) = NULL,
@DeleteDups AS bit = NULL,
@UniqueColName AS varchar(50) = NULL,
@CreateIdentColIfNeeded AS bit = NULL,
@StoredProcedureResult int = NULL OUTPUT
AS
SET NOCOUNT ON
DECLARE @SQL nvarchar(2000)
DECLARE @SQL_DetermineUniqueTemplate nvarchar(2000)
DECLARE @TempIdentColName varchar(20)
DECLARE @HostName varchar(50)
DECLARE @ActionText varchar(10)
DECLARE @SUM int
DECLARE @COUNT int
DECLARE @NextColumn varchar(50)
/*==================================================================================*/
/*========================VARIABLE INITIALIZATION AND SETUP========================*/
/*=================================================================================*/
/*If no UNIQUE COLUMN IS located THEN a TEMPORARY IDENTITY column will be created useing the name specified in this TempIdentColName string*/
SET @TempIdentColName = 'TempIdentityColXY123'
SET @SQL_DetermineUniqueTemplate = 'SELECT @COUNT = COUNT(Count), @SUM = sum(Count) FROM '
SET @SQL_DetermineUniqueTemplate = @SQL_DetermineUniqueTemplate + CHAR(13) + '(SELECT TOP 100 PERCENT <COLUMN_NAME>, COUNT(*) AS Count FROM ' + @TableName
SET @SQL_DetermineUniqueTemplate = @SQL_DetermineUniqueTemplate + CHAR(13) + ' GROUP BY <COLUMN_NAME> ORDER BY <COLUMN_NAME>) a'
/*Retrieve the Host Name. This will be used later IN this SP AS a test TO determine IF the USER is making this call FROM within SQL Query Analyzer*/
SELECT @HostName = hostname FROM master..sysprocesses WHERE spid = @@SPID AND program_name = 'SQL Query Analyzer'
/*Set ActionText TO be used IN message output*/
IF (@DeleteDups IS NULL) OR (@DeleteDups = 1)
	SET @ActionText = 'Selection' 
ELSE
	SET @ActionText = 'Deletion' 
/*If a value IS specified FOR USE BY UniqueColName it cannot exist IN the columns FROM the DupQualifierFieldNameList*/
IF CHARINDEX(@UniqueColName, @DupQualifierFieldNameList) > 0
	BEGIN
		/*The value IN UniqueColName was detected in DupQualifierFieldNameList.*/
		IF NOT (@HostName IS NULL) PRINT 'The UniqueColName provided (' + @UniqueColName + ') must not exist in DupQualifierFieldNameList (' + @DupQualifierFieldNameList + '). Other solutions will be sought automatically.'
		SET @UniqueColName = NULL
	END
/*If UniqueColName IS provided THEN perform CHECK TO ensure that ALL the VALUES IN that COLUMN are, in fact, unique.*/
IF NOT (@UniqueColName IS NULL)
	BEGIN
		SET @SQL = REPLACE(@SQL_DetermineUniqueTemplate,'<COLUMN_NAME>', @UniqueColName)
		/*Perform a CHECK OF this COLUMN TO determine IF ALL of it's VALUES are unique*/
		EXEC sp_executesql @SQL, N'@SUM AS int OUTPUT,@COUNT as int OUTPUT',@SUM OUTPUT,@COUNT OUTPUT
		/*Test TO determine IF this COLUMN CONTAINS UNIQUE values*/
		IF @SUM <> @COUNT 
			BEGIN
				/*The COLUMN specified BY UniqueColName does NOT contain unique values.*/
				IF NOT (@HostName IS NULL) PRINT 'The UniqueColName provided (' + @UniqueColName + ') does not contain unique values. Other solutions will be sought automatically.'
				SET @UniqueColName = NULL
			END
	END
/*==============================================================*/
/*======================HELP OUTPUT TEXT======================*/
/*==============================================================*/
IF (@TableName IS NULL) OR (@TableName = '/?') OR (@TableName = '?') OR (@DupQualifierFieldNameList IS NULL) OR (@DupQualifierFieldNameList = '/?') OR (@DupQualifierFieldNameList = '?')
	BEGIN
		IF NOT (@HostName IS NULL)
			BEGIN
				PRINT 'sp_RemoveDups ''TableName'', ''DupQualifierFieldNameList'', [''DeleteDups''], [''UniqueColName''], [''CreateIdentColIfNeeded''], <''StoredProcedureResult''>'
				PRINT '====================================================================================================================================================================='
				PRINT 'TableName: Required - String - Name OF the table TO detect duplicate records in.'
				PRINT 'DupQualifierFieldNameList: Required - String - Comma seperated list OF columns that make up the UNIQUE record within TableName.'
				PRINT 'DeleteDups: Optional - Bit, SET TO 0 to delete duplicate records. A value OF NULL or 1 will RETURN the duplicate records to be deleted.'
				PRINT 'UniqueColName: Optional - Bit - A TABLE must have a unique COLUMN value in it TO perform the deletion logic. IF no UniqueColName IS provided THEN an attemp will be made to locate the RowGUID column. If that fails then an attempt will be made to locate the IDENTITY column. If that fails then ALL OF the columns of the table will be examined AND the first one WITH all unique VALUES will be selected.'
				PRINT 'CreateIdentColIfNeeded: Optional - Bit - BY DEFAULT this SP will create an IDENTITY COLUMN if no UNIQUE column can be located. Pass in a 1 here TO run this feature off.'
				PRINT 'StoredProcedureResult: Optional - OUTPUT - Int - Returns a 3 IF an error occured, otherwise returns a 0.'
			END
		SET @StoredProcedureResult = 3
		RETURN
	END
/*========================================================================*/
/*======================DETECT USABLE UniqueColName======================*/
/*========================================================================*/
IF @UniqueColName IS NULL
	BEGIN
		/*Check FOR a RowGUID or IDENTITY COLUMN IN this table. IF one exists, THEN utilze it AS the UNIQUE value for the purposes OF this deletion*/
		IF EXISTS(SELECT * FROM SysColumns WHERE ID = Object_ID(@TableName) AND ColumnProperty(ID,Name,'IsRowGUIDCol') = 1) SET @UniqueColName = 'RowGUIDCol'
		IF EXISTS(SELECT * FROM SysColumns WHERE ID = Object_ID(@TableName) AND ColumnProperty(ID,Name,'IsIdentity') = 1) SET @UniqueColName = 'IdentityCol'
		IF @UniqueColName IS NULL 
		/*If no RowGUID OR IDENTITY COLUMN was found THEN CHECK ALL OF the columns IN this TABLE TO see if one of them can be utilized as a UNIQUE value column*/
			BEGIN
				/*Select ALL OF the columns FROM the TABLE IN question...*/
				DECLARE MyCursor CURSOR LOCAL SCROLL STATIC FOR SELECT name FROM syscolumns WHERE OBJECT_ID(@TableName)=ID
				OPEN MyCursor
				FETCH NEXT FROM MyCursor INTO @NextColumn
				WHILE @@fetch_status = 0
					BEGIN
						/*Create SQL string WITH correct COLUMN name in place.*/
						SET @SQL = REPLACE(@SQL_DetermineUniqueTemplate,'<COLUMN_NAME>', @NextColumn)
						/*Perform a CHECK OF this COLUMN TO determine IF ALL of it's VALUES are unique*/
						EXEC sp_executesql @SQL, N'@SUM AS int OUTPUT,@COUNT as int OUTPUT',@SUM OUTPUT,@COUNT OUTPUT
						/*Test TO determine IF this COLUMN CONTAINS UNIQUE values*/
						IF @SUM = @COUNT 
							BEGIN
								/*A UNIQUE VALUES COLUMN IS detected. USE it AND BREAK out OF the loop UNLESS column is specified IN DupQualifierFieldNameList*/
								IF CHARINDEX(@NextColumn, @DupQualifierFieldNameList) = 0
									BEGIN
										/*NextColumn was NOT detected IN DupQualifierFieldNameList, so this is the column we will use.*/
										SET @UniqueColName = @NextColumn
										BREAK
									END
							END
						ELSE
						FETCH NEXT FROM MyCursor INTO @NextColumn
					END
				CLOSE MyCursor
				DEALLOCATE MyCursor
			END
	END
/*If no UniqueColName has been found THEN CREATE one UNLESS @CreateIdentColIfNeeded = 1*/
IF (@UniqueColName IS NULL) AND ( (@CreateIdentColIfNeeded IS NULL) OR (@CreateIdentColIfNeeded = 0) )
	BEGIN
		/*Add a sequence COLUMN TO the table...*/
		IF NOT (@HostName IS NULL) PRINT 'Creating TEMPORARY IDENTITY COLUMN in the ' + @TableName + ' table named ' + @TempIdentColName + ' FOR USE in this ' + LOWER(@ActionText) + ' process...'
		EXEC('ALTER TABLE ' + @TableName + ' ADD ' + @TempIdentColName + ' [int] IDENTITY (1, 1)')
		SET @UniqueColName = @TempIdentColName
	END
/*============================================================================*/
/*======================EXECUTE DELETION OR SELECTION======================*/
/*===========================================================================*/
IF @UniqueColName IS NULL
	BEGIN
		/*No UniqueColName was provided BY the USER AND none were detected by the script. This deletion algorythm cannot run.*/
		IF NOT (@HostName IS NULL) PRINT 'Could not perform ' + LOWER(@ActionText) + ' process. No UNIQUE columns were located AND the UniqueColName flag is SET TO 1 (False).'
		SET @StoredProcedureResult = 3
		RETURN
	END
ELSE
	BEGIN
		IF NOT (@HostName IS NULL) PRINT 'Performing ' + LOWER(@ActionText) + ' utilizing the UNIQUE VALUES in the ' + @UniqueColName + ' COLUMN AS a reference...'
		/*
		CREATE AND EXECUTE an SQL statement IN the form of:
		SELECT * (or DELETE)
			FROM TableName WHERE UniqueColName IN 
			(
			SELECT UniqueColName FROM TableName WHERE UniqueColName NOT IN 
				(
				SELECT MAX(Cast(UniqueColName AS varchar(36))) FROM TableName GROUP BY DupQualifierFieldNameList, DupQualifierFieldNameList, etc
				)
			)
		*/
		/*Delete ALL duplicate records useing @UniqueColName AS a unique ID COLUMN */
		IF (@DeleteDups IS NULL) OR (@DeleteDups = 1)
			SET @SQL = 'SELECT * ' 
		ELSE
			SET @SQL = 'DELETE ' 
		SET @SQL = @SQL + 'FROM ' + @TableName + ' WHERE ' + @UniqueColName + ' IN '
		SET @SQL = @SQL + CHAR(13) + CHAR(9) + '(' + CHAR(13) + CHAR(9) 
		SET @SQL = @SQL + 'SELECT ' + @UniqueColName + ' FROM ' + @TableName + ' WHERE ' + @UniqueColName + ' NOT IN '
		SET @SQL = @SQL + CHAR(13) + CHAR(9) + CHAR(9) + '(' + CHAR(13) + CHAR(9)+CHAR(9)
		SET @SQL = @SQL + 'SELECT MAX(Cast(' + @UniqueColName + ' AS varchar(36))) FROM '
		SET @SQL = @SQL + @TableName + ' GROUP BY ' + @DupQualifierFieldNameList
		SET @SQL = @SQL + CHAR(13) + CHAR(9) + CHAR(9) + ')' + CHAR(13) + CHAR(9) + ')'
		EXEC (@SQL)
		IF @@ERROR <> 0
			BEGIN
				IF NOT (@HostName IS NULL) PRINT @ActionText + ' process failed.'
				SET @StoredProcedureResult = 3
			END
		ELSE
			BEGIN
				IF NOT (@HostName IS NULL) PRINT @ActionText + ' completed successfully WITH this SQL: ' + CHAR(13) + @SQL
				SET @StoredProcedureResult = 0
			END
	END
IF (@UniqueColName = @TempIdentColName) AND ( (@CreateIdentColIfNeeded IS NULL) OR (@CreateIdentColIfNeeded = 0) )
	BEGIN
		/*Remove the sequence COLUMN FROM the table...*/
		IF NOT (@HostName IS NULL) PRINT 'Removing TEMPORARY IDENTITY COLUMN named ' + @TempIdentColName + ' FROM the ' + @TableName + ' table...'
		EXEC('ALTER TABLE ' + @TableName + ' DROP COLUMN ' + @TempIdentColName)
	END
GO


Other 3 submission(s) by this author

 

 
Report Bad Submission
Use this form to notify us if this entry should be deleted (i.e contains no code, is a virus, etc.).
Reason:
 
Your Vote!

What do you think of this code(in the Advanced category)?
(The code with your highest vote will win this month's coding contest!)
Excellent  Good  Average  Below Average  Poor See Voting Log
 
Other User Comments

 There are no comments on this submission.
 
Add Your Feedback!
Note:Not only will your feedback be posted, but an email will be sent to the code's author in your name.

NOTICE: The author of this code has been kind enough to share it with you.  If you have a criticism, please state it politely or it will be deleted.

For feedback not related to this particular code, please click here.
 
Name:
Comment:

 

Categories | Articles and Tutorials | Advanced Search | Recommended Reading | Upload | Newest Code | Code of the Month | Code of the Day | All Time Hall of Fame | Coding Contest | Search for a job | Post a Job | Ask a Pro Discussion Forum | Live Chat | Feedback | Customize | SQL Home | Site Home | Other Sites | About the Site | Feedback | Link to the Site | Awards | Advertising | Privacy

Copyright© 1997 by Exhedra Solutions, Inc. All Rights Reserved.  By using this site you agree to its Terms and Conditions.  Planet Source Code (tm) and the phrase "Dream It. Code It" (tm) are trademarks of Exhedra Solutions, Inc.