T-SQL Tutorial

PARTITION BY clause


The SQL Server PARTITION BY clause is a powerful feature that enables developers to split data into partitions and perform calculations or aggregations on each partition. This clause is commonly used with functions such as ROW_NUMBER(), RANK(), and DENSE_RANK().


Syntax

The syntax of the PARTITION BY clause is as follows:

SELECT column1, column2, ..., columnN, aggregate_function(columnX)
OVER (
PARTITION BY partition_column1, partition_column2, ..., partition_columnN
ORDER BY ordering_column1, ordering_column2, ..., ordering_columnN
)
FROM table_name;

In this syntax, partition_column1, partition_column2, ..., partition_columnN specifies the columns by which the data will be partitioned. The aggregate_function(columnX) can be any SQL aggregate function such as COUNT(), SUM(), AVG(), etc., that is applied on the partitioned data.

The ORDER BY clause is optional and can be used to specify the order of the data within each partition.


Example

Let's consider an example to understand the usage of the PARTITION BY clause better. Suppose we have a table employees with the following data:

employee_iddepartmentsalary
1HR50000
2HR60000
3IT70000
4IT80000
5IT90000
6Sales40000
7Sales55000
8Sales65000

Now, if we want to find the rank of each employee within their respective department based on their salary, we can use the following query:

SELECT employee_id, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as department_rank
FROM employees;

This query will partition the data by the department column and then rank the employees within each department based on their salary in descending order.

The result of this query would be:


employee_iddepartmentsalarydepartment_rank
2HR600001
1HR500002
5IT900001
4IT800002
3IT700003
8Sales650001
7Sales550002
6Sales400003

As we can see, the employees have been ranked based on their salary within their respective departments, which is exactly what we wanted to achieve.

In conclusion, the PARTITION BY clause is a powerful feature that can help developers partition data and perform complex calculations or aggregations on each partition. Its flexibility and ease of use make it an essential tool for any developer working with SQL Server.