How to find duplicate values in a table column?

There was an error with one of our application, and I needed to find table rows with duplicate values. The table column was not defined to have a unique index, but we weren't anticipating duplicate values. I needed an easier way to identify duplicate values in a table with simple SQL statement.

Consider the following "employee" table:

id Name Alias Age
1 John Doe John 30
2 John Smith John 40
3 Joe Schmo Joe 38
4 Charlie Bohne Charlie 55

Assuming that we have an "employee" table with above values, and looking to find records with duplicate "Alias". How do we retrieve them? With the following SQL statement with HAVING clause, we can easily accomplish that.

SELECT alias, count(alias) as count
FROM employee
GROUP BY alias
HAVING (count(alias) > 1)
ORDER BY alias

The above SQL statement will retrieve:

Alias Count
John 2

*NOTE: The HAVING clause allows SQL to use with aggregate functions with a condition whereas WHERE clause does not offer that functionality. For example, the HAVING clause can be used to retrieve SUM(x) > 100 or COUNT(y) > 1.

Tags: 

Comments

"GROUP BY" keywords allow SQL to aggregate multiple column values into a single row much like "DISTINCT" keyword, but it also allows calculated values in the result set. You may use SQL statement like the below to sum up "age" of retrieved rows:

SELECT alias, sum(age) as ages
FROM employee
GROUP BY alias

Here is the result set from the above query:

Alias Ages
John 70
Joe 38
Charlie 55
By aladar

Add new comment

Filtered HTML

  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <blockquote> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.