The way to remove unique database entries without deleting them

Feb 26, 2013   #mysql 

Sometimes you need to remove the record because the name can come into conflict with future records with the same name.

Say you have a list of entries like this:

+----+--------------------+
| id | name               |
+----+--------------------+
| 1  | Doris M. Thornburg |
| 2  | Richard D. Moore   |
| 3  | Tonya M. Allen     |
+----+--------------------+

To be able to distinguish the names from each other, it is a good idea to make these unique. It’s a good idea, since names really are identifiers we use in a social content. If two people have the same name in a company, we usually name one of them something else, like: Frank from Sales and Frank from Support. I strongly recommend enforcing some kind of human readable unique identifier, since it is almost impossible to share a common perspective if we have to refer to entries in a list as “the third Frank from the top.”. Now, perhaps human names aren’t the best example for this, but you get the picture.

The problem with unique indexes

The problem we most often encounter when just setting records is removed, is that no new records can be made in the same name.

+----+--------------------+------------+
| id | name               | is_removed |
+----+--------------------+------------+
|  1 | Doris M. Thornburg |          0 |
|  2 | Richard D. Moore   |          1 |
|  3 | Tonya M. Allen     |          0 |
+----+--------------------+------------+

So while we easily can exclude is_removed = 1 from our SELECT query, we are not able to create a new record with the name Richard D. Moore because of the index.

We could instead extend the unique index to also contain the is_removed column, but that will only delay the conflict to one removed record. To be able to maintain a continuous insertion of unique names, while ensuring that removed records don’t clutter up the index, we could create introduce a new column that gets filled with a random number when the record is removed. For this approach we could also use the is_removed column. This way only is_removed = 0 will be included and the unique index will consist of name and is_removed. While this seems to be a viable solution, it is however a mess, since our unique index will include of a column, that seems to have nothing to do with the name.

The name is the unique index

Instead of making an index of from a compound, we could keep it simple and instead use the name column for storing something random along with the original value of the name. This means that Richard D. Moore becomes XXXX Richard D. Moore where we know that the first five chars are random and should be removed before restoring the record.

This should statistically ensure that we never run into a naming conflict, unless some user hits the jackpot and picks the name 3eHH Richard D. Moore as the name of the record. Something that most likely will not happen.

But why stop there. Why not remove the random part and use something better than random; sequential. I’m talking about the ID of the record. Guaranteed no two values alike, we can use the id column as the extra-text.

Special characters

When concatenating the id onto the name column it is very important that the ID is the first thing in the name. Something like this would be perfect: __1__Richard D. Moore. The reason for this is special characters.

Say you do it like this: Richard D. Moore__1__. Some string functions (like the ones in MySQL) aren’t UTF-friendly and have a hard time determining the position of __{{id}}__ if the name has special characters. This is much simpler than to do it the other way around, because your index ID most likely don’t contain special characters.

Conclusion

Don’t clutter up the columns marked as unique indexes containing data from the users. Make sure to introduce something in the records marked as removed. Otherwise your users will suffer, when they are not able to reuse a name used in the unique index.

Use this format: __{{id}}__{{value}}.

Like so:

+----+-------------------------+------------+
| id | name                    | is_removed |
+----+-------------------------+------------+
|  1 | Doris M. Thornburg      |          0 |
|  2 | __2__Richard D. Moore   |          1 |
|  3 | Tonya M. Allen          |          0 |
+----+-------------------------+------------+

Happy creating :-)