Monday, 29 July 2013

Lesson 67 : LISTAGG

LISTAGG
This article describes the new LISTAGG function in Oracle 11g Release 2. LISTAGG is a built-in function that enables us to perform string aggregation natively. String aggregation is a popular technique, and there are several methods available on the web, so we will compare their performance to the new LISTAGG function later in this article.
sample data
For reference, we will use the following sample data for our examples.
    DEPTNO ENAME      HIREDATE
---------- ---------- ----------
        10 CLARK      09/06/1981
        10 KING       17/11/1981
        10 MILLER     23/01/1982
        20 ADAMS      12/01/1983
        20 FORD       03/12/1981
        20 JONES      02/04/1981
        20 SCOTT      09/12/1982
        20 SMITH      17/12/1980
        30 ALLEN      20/02/1981
        30 BLAKE      01/05/1981
        30 JAMES      03/12/1981
        30 MARTIN     28/09/1981
        30 TURNER     08/09/1981
        30 WARD       22/02/1981
 

what is string aggregation?

String aggregation is simply the grouping and concatenation of multiple rows of data into a single row per group. For example, consider the following resultset:
   DEPTNO ENAME
--------- ----------
       10 CLARK
       10 KING
       10 MILLER
       20 ADAMS
       20 FORD
       20 JONES
With string aggregation, this resultset would be grouped (by DEPTNO) as follows:
   DEPTNO AGGREGATED_ENAMES
--------- -------------------------
       10 CLARK,KING,MILLER
       20 ADAMS,FORD,JONES
 
We can see that the employee names have simply been grouped and concatenated into a single column (values are delimited by comma) per group. As stated, there are several techniques available to perform this aggregation (references are provided at the end of this article), but the new LISTAGG function makes it much easier, as we will see below.

listagg syntax overview

The LISTAGG function has the following syntax structure:
LISTAGG( [,]) WITHIN GROUP (ORDER BY ) [OVER (PARTITION BY )]
LISTAGG is an aggregate function that can optionally be used as an analytic (i.e. the optional OVER() clause). The following elements are mandatory:
  • the column or expression to be aggregated;
  • the WITHIN GROUP keywords;
  • the ORDER BY clause within the grouping.
We will now see some examples of the function below.

listagg as an aggregate function

We will begin with a simple example that aggregates the employee names for each department in the EMP table, using a comma as delimiter.
SQL> SELECT deptno
  2  ,      LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
  3  FROM   emp
  4  GROUP  BY
  5         deptno;
 
    DEPTNO EMPLOYEES
---------- ------------------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
 
3 rows selected.
Note that we chose to order the employees within each aggregation by the employee name. It should be noted that ordering the elements of a string aggregation is not a trivial task in some of the alternative techniques to LISTAGG.
In the following example, we will aggregate the employee names but order them by their respective hire dates.
SQL> SELECT deptno
  2  ,      LISTAGG(ename, ',') WITHIN GROUP (ORDER BY hiredate) AS employees
  3  FROM   emp
  4  GROUP  BY
  5         deptno;
 
    DEPTNO EMPLOYEES
---------- ------------------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,FORD,SCOTT,ADAMS
        30 ALLEN,WARD,BLAKE,TURNER,MARTIN,JAMES
 
3 rows selected.
We can see that the order of employee names within each group differs from the first example.