In computer science
and mathematics, a
sorting algorithm is an algorithm that puts
elements of a list in a certain
order. The most-used
orders are numerical order and lexicographical
order. Efficient sorting is important to
optimizing the use of other algorithms (such as search and
merge algorithms)
that require sorted lists to work correctly; it is also often
useful for canonicalizing
data and for producing human-readable output. More formally, the
output must satisfy two conditions:
- The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order);
- The output is a permutation, or reordering, of the input.
Since the dawn of computing, the sorting problem has attracted a
great deal of research, perhaps due to the complexity of solving
it efficiently despite its simple, familiar statement. For
example, bubble sort was
analyzed as early as 1956.[1]
Although many consider it a solved problem, useful new sorting
algorithms are still being invented (for example, library sort was
first published in 2004). Sorting algorithms are prevalent in
introductory computer science classes, where the abundance of
algorithms for the problem provides a gentle introduction to a
variety of core algorithm concepts, such as big O notation,
divide
and conquer algorithms, data structures,
randomized
algorithms, best,
worst and average case analysis, time-space
tradeoffs, and lower bounds.
Comparison of
algorithms:
In this table, n is the number of records to be sorted. The columns "Average" and "Worst" give the time complexity in each case, under the assumption that the length of each key is constant, and that therefore all comparisons, swaps, and other needed operations can proceed in constant time. "Memory" denotes the amount of auxiliary storage needed beyond that used by the list itself, under the same assumption. These are all comparison sorts.
Name
|
Average
|
Worst
|
Memory
|
Stable
|
Method
|
Other notes
|
|---|---|---|---|---|---|---|
| Bubble sort |
02
|
03
|
01
|
Yes | Exchanging | |
| Cocktail sort | 03 - |
03
|
01
|
Yes | Exchanging | |
| Comb sort | 03 - | 04 - |
01
|
No | Exchanging | Small code size |
| Gnome sort | 03 - |
03
|
01
|
Yes | Exchanging | Tiny code size |
| Selection sort |
02
|
03
|
01
|
Yes | Selection | |
| Insertion sort |
02
|
03
|
01
|
Yes | Insertion |
Average case is also
, where d is the number of
inversions
|
| Shell sort | 03 - |
02
|
01
|
No | Insertion | |
| Binary tree sort |
01
|
01
|
03
|
Yes | Insertion | When using a self-balancing binary search tree |
| Library sort |
01
|
03
|
03
|
Yes | Insertion | |
| Merge sort |
01
|
01
|
03
|
Yes | Merging | |
| In-place merge sort |
01
|
01
|
01
|
No | Merging | Example implementation here: [1]; can be implemented as a stable sort based on stable in-place merging: [2] |
| Heapsort |
01
|
01
|
01
|
No | Selection | |
| Smoothsort | 03 - |
01
|
01
|
No | Selection |
An adaptive sort
-
comparisons when the data is already sorted, and
swaps.
|
| Quicksort |
01
|
03
|
02
|
No | Partitioning |
Naïve
variants use
space
|
| Introsort |
01
|
01
|
02
|
No | Hybrid | Used in SGI STL implementations |
| Patience sorting | 03 - |
01
|
03
|
No | Insertion & Selection | Finds all the longest increasing subsequences within O(n log n) |
| Strand sort |
01
|
03
|
03
|
Yes | Selection | |
| Tournament sort |
01
|
01
|
Selection |
The following table describes sorting algorithms that are not
comparison sorts.
As such, they are not limited by a
lower bound. Complexities below are in terms of n, the
number of items to be sorted, k, the size of each key,
and s, the chunk size used by the implementation. Many
of them are based on the assumption that the key size is large
enough that all entries have unique key values, and hence that
n << 2k, where << means
"much less than."
Name
|
Average
|
Worst
|
Memory
|
Stable
|
n << 2k
|
Notes
|
|---|---|---|---|---|---|---|
| Pigeonhole sort |
|
|
|
Yes | Yes | |
| Bucket sort |
|
|
|
Yes | No | Assumes uniform distribution of elements from the domain in the array. |
| Counting sort |
|
|
|
Yes | Yes | |
| LSD Radix sort |
|
|
|
Yes | No | |
| MSD Radix sort |
|
|
|
No | No | |
| Spreadsort |
|
|
|
No | No | Asymptotics are based on the assumption that n << 2k, but the algorithm does not require this. |
The following table describes some sorting algorithms that are impractical for real-life use due to extremely poor performance or a requirement for specialized hardware.
Name
|
Average
|
Worst
|
Memory
|
Stable
|
Comparison
|
Other notes
|
|---|---|---|---|---|---|---|
| Bogosort |
|
Unbounded |
|
No | Yes | Average time using Fisher-Yates shuffle |
| Bozo sort |
|
Unbounded |
|
No | Yes | Average time is asymptotically half that of bogosort |
| Stooge sort |
|
|
|
No | Yes | |
| Bead sort | N/A | N/A | - | N/A | No | Requires specialized hardware |
| Simple pancake sort |
|
|
|
No | Yes | Count is number of flips. |
| Sorting networks |
|
|
|
Yes | No |
Requires a custom circuit of size
|
| Tacosort |
|
Unbounded |
|
No | No | Running time also depends upon bit size of elements |
Additionally, theoretical computer scientists have detailed other
sorting algorithms that provide better than
time complexity with additional constraints, including:
- Han's algorithm, a deterministic algorithm for sorting keys
from a
domain of finite size, taking
time and
space.[2]
- Thorup's algorithm, a randomized algorithm for sorting keys
from a domain of finite size, taking
time and
space.[3]
- An integer sorting algorithm
taking
time and
space.[4]
While theoretically interesting, to date these algorithms have
seen little use in practice.
Bubble
sort:
Bubble sort is a straightforward and simplistic method of
sorting data that is used in computer science education. The
algorithm starts at the beginning of the data set. It compares
the first two elements, and if the first is greater than the
second, it swaps them. It continues doing this for each pair of
adjacent elements to the end of the data set. It then starts
again with the first two elements, repeating until no swaps have
occurred on the last pass. While simple, this algorithm is highly
inefficient and is rarely used except in education. For example,
if we have 100 elements then the total number of comparisons will
be 10000. A slightly better variant, cocktail sort,
works by inverting the ordering criteria and the pass direction
on alternating passes. Bubble sort average case and worst case
are both O(n²).
Insertion
sort:
Insertion sort is a simple sorting algorithm that is
relatively efficient for small lists and mostly-sorted lists, and
often is used as part of more sophisticated algorithms. It works
by taking elements from the list one by one and inserting them in
their correct position into a new sorted list. In arrays, the new
list and the remaining elements can share the array's space, but
insertion is expensive, requiring shifting all following elements
over by one. Shell sort (see below)
is a variant of insertion sort that is more efficient for larger
lists.
Shell
sort:
Shell sort was invented by Donald Shell in 1959.
It improves upon bubble sort and insertion sort by moving out of
order elements more than one position at a time. One
implementation can be described as arranging the data sequence in
a two-dimensional array and then sorting the columns of the array
using insertion sort. Although this method is inefficient for
large data sets, it is one of the fastest algorithms for sorting
small numbers of elements.
Merge
sort:
Merge sort takes advantage of the ease of merging
already sorted lists into a new sorted list. It starts by
comparing every two elements (i.e., 1 with 2, then 3 with 4...)
and swapping them if the first should come after the second. It
then merges each of the resulting lists of two into lists of
four, then merges those lists of four, and so on; until at last
two lists are merged into the final sorted list. Of the
algorithms described here, this is the first that scales well to
very large lists, because its worst-case running time is
O(n log n).
Heapsort:
Heapsort is a much more efficient version of selection sort. It
also works by determining the largest (or smallest) element of
the list, placing that at the end (or beginning) of the list,
then continuing with the rest of the list, but accomplishes this
task efficiently by using a data structure called a heap, a
special type of binary tree. Once the
data list has been made into a heap, the root node is guaranteed
to be the largest(or smallest) element. When it is removed and
placed at the end of the list, the heap is rearranged so the
largest element remaining moves to the root. Using the heap,
finding the next largest element takes O(log n) time,
instead of O(n) for a linear scan as in simple selection
sort. This allows Heapsort to run in O(n log n)
time.
Quicksort:
Quicksort is a divide
and conquer algorithm which relies
on a partition operation: to partition an array, we
choose an element, called a pivot, move all smaller
elements before the pivot, and move all greater elements after
it. This can be done efficiently in linear time and in-place. We
then recursively sort the lesser and greater sublists. Efficient
implementations of quicksort (with in-place partitioning) are
typically unstable sorts and somewhat complex, but are among the
fastest sorting algorithms in practice. Together with its modest
O(log n) space usage, this makes quicksort one of the
most popular sorting algorithms, available in many standard
libraries. The most complex issue in quicksort is choosing a good
pivot element; consistently poor choices of pivots can result in
drastically slower O(n²) performance, but if at each
step we choose the median as the pivot then it works in
O(n log n).
Radix
sort:
Radix sort is an algorithm that sorts a list of
fixed-size numbers of length k in O(n ·
k) time by treating them as bit strings. We first sort
the list by the least significant bit while preserving their
relative order using a stable sort. Then we sort them by the next
bit, and so on from right to left, and the list will end up
sorted. Most often, the counting sort
algorithm is used to accomplish the bitwise sorting, since the
number of values a bit can have is minimal - only '1' or '0'.
Bucket
sort:
Bucket sort is a sorting algorithm that works by partitioning an
array into a finite number of buckets. Each bucket is then sorted
individually, either using a different sorting algorithm, or by
recursively applying the bucket sorting algorithm. Thus this is
most effective on data whose values are limited (e.g. a sort of a
million integers ranging from 1 to 1000). A variation of this
method called the single buffered count sort is faster than
quicksort and takes about the same time to run on any set of
data.


, where d is the number of 

swaps.

