Ordering of Strings according to numbers contained in them


This is an error that is more common than you would think.

Take an instance – There is a column in the Database table such as a Profile ID, and the developer thinks of sorting the records according to the Profile ID in either an ascending, or a descending order. eg. Sorting records by descending Profile ID to show the latest added records first.

What happens is that the Profile Id is alphanumeric, and default sorting of such records is not exactly as one would expect.

Below is the description of a recent bug I raised. You will understand what I’ve been trying to say by reading it.

Bug Description – “The ‘Top Deals in your City’ section of Home page is required to show the latest entered bike listings.
However, Latest Listings were not being displayed as expected. Issue is as  explained below:

Listings were being Ordered by ProfileID, and not by EntryDate.
Hence, a listing with ProfileID ‘S27’ that was entered on ‘2012-08-27 20:24:26.067’ was considered newer than a listing with ProfileID ‘S237’ that was entered on ‘2013-02-05 11:13:27.637’

Strings are ordered as ascending/descending in a way different from the ordering of integers.
Hence, while ordering, ‘S27’ was considered to be greater than ‘S237’ ”

Have I succeeded in explaining this to you yet? Let me try again..

You have the following values:

A1

A2

A10

A108

A20

A3

You want to order them in the ascending order. What would you expect the result to be? I’m sure everyone would expect it to be as below;

A1

A2

A3

A10

A20

A108

But, the default sort functions in almost every programming language would return the below result. The records are sorted in ASCII order, and not in Alphabetical order.

A1

A10

A108

A2

A20

A3

What can we do to resolve this problem? Let’s see…

1. The first and most simple solution would be to not use the alpha-numeric column for sorting the records.

Instead, one could use a Date column to sort records (as mentioned in my bug description above), or one could use an ID column that contains only numbers and is meant for sorting the records.

Proper planning of Database architecture could help prevent this problem from arising.

2. If the only way to sort records is to use the alpha-numeric column, then one can use “Natural Sorting”. Following are some good links that would help you understand this subject more.

The Alphanum Algorithm

Sorting for Humans : Natural Sort Order

Natural Sorting in C#

Compact Python Human Sort

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s