Quick Sort
Quicksort is a fast sorting algorithm that takes a divide-and-conquer approach to sorting lists. Although sorting is a simple concept, it is a basic building block of more complex programs such as file search, data compression, and pathfinding. Running time is an important consideration when selecting a sorting algorithm, since efficiency is often thought of in terms of speed. Quicksort has a slow worst-case running time of \(O(n^2)\), but fast average- and best-case running times of \(O(n \log n)\).
Sorting
An algorithm that maps the following input/output pair is called a sorting algorithm:
- Input: An array, \(A\), that contains \(n\) orderable elements (integers, strings, floating point numbers, etc.): \(A[0,1,...,n-1]\).
- Output: A sorted permutation of \(A\), called \(B\), such that \(B[0] \leq B[1] \leq \cdots \leq B[n-1].\)
Here is what it means for an array to be sorted: an array \(\langle a_0, a_1, \ldots, a_{n-1} \rangle\) is sorted if and only if \(a_i \leq a_j\) for all \(i < j.\)
In other words, a sorted array is an array that is in a particular order. For example, \([a,b,c,d]\) is sorted alphabetically, \([1,2,3,4,5]\) is a list of integers sorted in increasing order, and \([5,4,3,2,1]\) is a list of integers sorted in decreasing order.
By convention, empty arrays and singleton arrays (arrays consisting of only one element) are always sorted. This is a key point for the base case of many sorting algorithms.
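This definition translates directly into code. Here is a small sanity-check sketch (the function name is ours, chosen for illustration):

```python
def is_sorted(A):
    """Return True if A[i] <= A[j] for all i < j.

    Because <= is transitive, it is enough to check adjacent pairs.
    Empty and singleton arrays pass vacuously: the generator below
    produces nothing, and all() of nothing is True.
    """
    return all(A[i] <= A[i + 1] for i in range(len(A) - 1))
```

For example, `is_sorted([1, 2, 3, 4, 5])` returns `True`, while `is_sorted([5, 4, 3, 2, 1])` returns `False`.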
Quicksort
Quicksort uses divide and conquer to sort an array. Divide and conquer is a technique used for breaking algorithms down into subproblems, solving the subproblems, and then combining the results back together to solve the original problem. It can be helpful to think of this method as divide, conquer, and combine.
Here are the divide, conquer, and combine steps that quicksort uses (a code sketch follows the steps):
Divide:
- Pick a pivot element, \(A[q]\).
- Partition, or rearrange, the array into two subarrays: \(A[p, \ldots, q-1]\) such that all elements are less than \(A[q],\) and \(A[q+1, \ldots, r]\) such that all elements are greater than or equal to \(A[q]\).
Conquer: Sort the subarrays \(A[p, \ldots, q-1]\) and \(A[q+1, \ldots, r]\) recursively with quicksort.
Combine: No work is needed to combine the arrays because they are already sorted. [1]
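A minimal sketch of this scheme, assuming the convention of sorting \(A[p \ldots r]\) in place and using the last element as the pivot (the Lomuto partition, one common way to realize the divide step):

```python
def partition(A, p, r):
    """Rearrange A[p..r] around the pivot A[r]; return the pivot's final index."""
    pivot = A[r]
    i = p - 1                            # boundary of the "less than pivot" region
    for j in range(p, r):
        if A[j] < pivot:                 # elements less than the pivot go left
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]      # place the pivot between the two parts
    return i + 1

def quicksort_in_place(A, p, r):
    """Sort A[p..r] (indices inclusive) in place."""
    if p < r:
        q = partition(A, p, r)           # divide: pivot lands at its final index q
        quicksort_in_place(A, p, q - 1)  # conquer: left subarray
        quicksort_in_place(A, q + 1, r)  # conquer: right subarray
        # combine: nothing to do; both subarrays are sorted in place
```

Calling `quicksort_in_place(A, 0, len(A) - 1)` sorts the whole array.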
Here is a recursive algorithm for quicksort:
- If the list is empty, return the list and terminate. (Base case)
- Choose a pivot element in the list.
- Remove the pivot from the list, then take all of the remaining elements that are less than or equal to the pivot and use quicksort on them.
- Take all of the remaining elements that are greater than the pivot and use quicksort on them.
- Return the concatenation of the quicksorted list of elements that are less than or equal to the pivot, the pivot, and the quicksorted list of elements that are greater than the pivot.
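For example, here is how the procedure sorts \([3, 1, 4, 1, 5]\) when the first element of each list is chosen as the pivot:
- quicksort([3, 1, 4, 1, 5]): pivot 3; elements \(\leq 3\) are [1, 1], elements \(> 3\) are [4, 5].
- quicksort([1, 1]): pivot 1; elements \(\leq 1\) are [1], elements \(> 1\) are []. Result: [1] + [1] + [] = [1, 1].
- quicksort([4, 5]): pivot 4; elements \(\leq 4\) are [], elements \(> 4\) are [5]. Result: [] + [4] + [5] = [4, 5].
- Concatenating gives [1, 1] + [3] + [4, 5] = [1, 1, 3, 4, 5].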
Choosing a Pivot
Picking a good pivot is the key to a fast implementation of quicksort; however, it is difficult to determine what a good pivot might be. The partitioning step takes time proportional to the number of elements being partitioned, so the overall running time depends on how evenly the pivots split the array. The best-case pivot would divide the array into two equal parts, which would halve the problem size at each step. However, this means that the pivot is the median of the elements, and the median is not known in advance. Since the goal of quicksort is to sort an array, we can't rely on having a pivot equal to the median of the elements.
Here are some ways of choosing a pivot:
- Select a random pivot.
- Select the leftmost or rightmost element as the pivot.
- Take the first, middle, and last value of the array, and choose the median of those three numbers as the pivot (median-of-three method; a short sketch follows this list).[2]
- Use a median-finding algorithm such as the median-of-medians algorithm.
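For instance, here is a sketch of the median-of-three method (the function name and the inclusive lo/hi index convention are ours):

```python
def median_of_three(A, lo, hi):
    """Return the index of the median of A[lo], A[mid], and A[hi]."""
    mid = (lo + hi) // 2
    # Sort the three (value, index) pairs by value; the middle one is the median.
    candidates = sorted([(A[lo], lo), (A[mid], mid), (A[hi], hi)],
                        key=lambda pair: pair[0])
    return candidates[1][1]
```

Partitioning around `A[median_of_three(A, lo, hi)]` avoids the worst case on already-sorted input, where the leftmost or rightmost element would be an extreme pivot.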
Quicksort Implementation
Here is one way to implement quicksort in Python. There are many ways to implement quicksort, but the same ideas are behind all of the implementations. Quicksort can sort any orderable list (integers, strings, floating point numbers, etc.).
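One possible version follows the recursive description above: it removes a randomly chosen pivot (one of the strategies listed earlier), partitions the remaining elements into the two sublists, and concatenates the sorted results. The helper names are illustrative.

```python
import random

def quicksort(lst):
    """Sort lst using quicksort; works on any list of mutually orderable elements."""
    # Base case: empty and singleton lists are already sorted.
    if len(lst) <= 1:
        return lst
    # Choose a pivot uniformly at random and remove it from the list.
    pivot_index = random.randrange(len(lst))
    pivot = lst[pivot_index]
    rest = lst[:pivot_index] + lst[pivot_index + 1:]
    # Partition the remaining elements around the pivot.
    less_equal = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # Recursively sort each part and concatenate the results.
    return quicksort(less_equal) + [pivot] + quicksort(greater)

print(quicksort([5, 2, 8, 1, 9, 3]))        # [1, 2, 3, 5, 8, 9]
print(quicksort(["pear", "apple", "fig"]))  # ['apple', 'fig', 'pear']
```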
Complexity of Quicksort
Division: Dividing the list into parts less than and greater than the pivot takes \(O(n)\) time because the algorithm must scan through all \(n\) elements of the list, performing a constant number of comparisons on each element; in particular, it determines whether the element is less than or greater than the pivot.
Subproblems:
Worst Case
In the worst case, all elements are either less than or greater than the pivot; this happens when the pivot is the smallest or largest element of the array.
In these cases, solving the subproblems takes \(T(n-1)\) time: the partition produces subarrays of size \(0\) and \(n-1\), the next recursive call produces subarrays of size \(0\) and \(n-2\), and so on, for a total of \(n-1\) recursive calls.
Best Case
The best case occurs when both subarrays have (roughly) the same length, in which case it takes \(2T\big(\frac{n-1}{2}\big)\) time to solve the two subproblems.
Combining: The subarrays are sorted in other steps of the algorithm, so there is no explicit, separate combining step.
Now, let's consider the following three analyses:
Best-case analysis
The best-case recurrence is \[ T(n) = 2\,T \left ( \frac{n-1}{2} \right ) + O(n) .\] The master theorem tells us that the solution to this recurrence is \[T(n) = O(n \log n).\] Quicksort has its best-case running time when the pivot at each recursive call is equal to the median element of the subarray. This means that, at each step, the problem size is halved, so the array can be sorted with \(O(\log n)\) levels of nested calls. Each level takes \(O(n)\) time in total (from the division step), so the total run time of the best-case quicksort is \(O(n \log n)\).
Worst-case analysis
The worst-case recurrence is \[ T(n) = T(n-1) + O(n) .\] The master theorem does not apply here because the subproblem shrinks by subtraction rather than by a constant factor; instead, unrolling the recurrence gives \[T(n) = O(n) + O(n-1) + \cdots + O(1) = O\big(n^2\big).\]
Average-case analysis
When the pivot is chosen uniformly at random, badly unbalanced splits are rare: on average, the pivot lands somewhere in the middle of the subarray, so the subproblem sizes shrink geometrically and the recursion has expected depth \(O(\log n)\). Since each level of the recursion still does \(O(n)\) total partitioning work, the expected run time of the algorithm is \[ T(n) = O(n \log n). \]
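As an informal check (an experiment, not a proof), one can count the comparisons a random-pivot quicksort makes and divide by \(n \log_2 n\); for random inputs the ratio settles near \(2 \ln 2 \approx 1.39\). A sketch, with illustrative helper names:

```python
import math
import random

def quicksort_comparisons(lst):
    """Number of element-to-pivot comparisons made by random-pivot quicksort."""
    if len(lst) <= 1:
        return 0
    rest = list(lst)
    pivot = rest.pop(random.randrange(len(rest)))
    less_equal = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # Each non-pivot element is compared against the pivot once at this level.
    return len(rest) + quicksort_comparisons(less_equal) + quicksort_comparisons(greater)

for n in [1000, 2000, 4000]:
    avg = sum(quicksort_comparisons([random.random() for _ in range(n)])
              for _ in range(20)) / 20
    print(n, round(avg / (n * math.log2(n)), 2))  # ratio approaches roughly 1.39
```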
Recall that big O notation hides constant factors. While the average- and best-case run times of quicksort equal those of other algorithms such as mergesort, a well-implemented quicksort has much lower constant factors than other sorting algorithms. If two algorithms have the same asymptotic running time, the one with smaller constant factors will be faster. In practice, quicksort is often faster than mergesort.
Quicksort is usually implemented as an unstable, in-place sort, with a best- and average-case space complexity of \(O(\log n)\) (for the recursion stack) and a worst-case space complexity of \(O(n)\).
References
- Cormen, T., Leiserson, C., Rivest, R., & Stein, C. (2001). Introduction to Algorithms (2nd edition) (p. 171). The MIT Press.
- The Quick Sort. Interactive Python. Retrieved March 22, 2016, from http://interactivepython.org/runestone/static/pythonds/SortSearch/TheQuickSort.html