Most SQL dialects don't come with a median function. But one can be cobbled together using the procedural programming language that accompanies most databases.
They work like this:
1. Sort the list of values.
2. Count the number of items in the list.
3. Select the element at position n/2.
A function like this can't be expressed purely in SQL because SQL is not procedural.
I've come up with a median function that uses pure SQL:
select age
from person
where (select count(age) from person p1 where p1.age >= person.age) -
(select count(age) from person p2 where p2.age <= person.age) between 0 and 1;
This technique uses two correlated subqueries. The outer query works through the list of values (in this case, age). As it considers each age in turn, it counts the number of ages that are greater than the target age and the number of ages that are less. The difference between the two should be 0 or 1 (depending on whether there are an odd or even number of ages to consider).
When the difference is 0 or 1, you've found the median value.
This solution is O(n2). Expect it to take a long time on large data sets.
My Blog Has Moved!
8 years ago