The Data Studio

Ugly Cross-Tab Report - Otherwise Known As A Pivot Table

I frequently get asked how to do "cross-tab" or "pivot table" reports in SQL.

Here's the challenge. We have a table that was created like this:

        create table transaction
        (
            id                   integer,
            claimant_id          integer,
            claim_head_id        integer,
            actual_or_reserve    varchar(7),
            payment_or_recovery  varchar(8),
            amount               numeric(8,2),
            created_at           timestamp,
            updated_at           timestamp
        );
    

and we want a report like this:

Actual or Reserve Payment or Recovery 2016Q4 2017Q1 2017Q2 2017Q3
Actual Payment 25,593,120.54 25,724,698.40 26,028,240.14 9,202,088.05
Actual Recovery 9,340,889.44 8,555,264.57 8,680,410.63 3,584,824.21
Reserve Payment 25,806,542.44 25,699,383.72 25,411,909.79 7,814,403.72
Reserve Recovery 9,034,181.32 8,695,679.11 8,810,539.61 2,591,892.73

There's nothing ugly about the report, and there's nothing ugly about the solution I suggest, but producing such a report with SQL alone is ugly.

What makes this tricky is that some of the headings (the quarters) are derived from data values, not from column names. We want each quarter to show as a separate column, so we want to pivot the result set to make rows into columns.

The quarters are derived from the created_at date, so to select the data for our table we write a query like this:

        select
            extract(year from created_at) || 
            'Q' || 
            extract(quarter from created_at) as quarter,
            actual_or_reserve,
            payment_or_recovery,
            sum(amount) as total_amount
        from
            transaction
        where
            created_at >= current_date - interval '1 year'
        group by
            quarter,
            actual_or_reserve,
            payment_or_recovery;
    

and the results come out like this:

         quarter | actual_or_reserve | payment_or_recovery | total_amount 
        ---------+-------------------+---------------------+--------------
         2016Q3  | Actual            | Payment             |  19233782.46
         2016Q3  | Actual            | Recovery            |   5817459.79
         2016Q3  | Reserve           | Payment             |  18497733.82
         2016Q3  | Reserve           | Recovery            |   6415232.78
         2016Q4  | Actual            | Payment             |  25593120.54
         2016Q4  | Actual            | Recovery            |   9340889.44
         2016Q4  | Reserve           | Payment             |  25806542.44
         2016Q4  | Reserve           | Recovery            |   9034181.32
         2017Q1  | Actual            | Payment             |  25724698.40
         2017Q1  | Actual            | Recovery            |   8555264.57
         2017Q1  | Reserve           | Payment             |  25699383.72
         2017Q1  | Reserve           | Recovery            |   8695679.11
         2017Q2  | Actual            | Payment             |  26028240.14
         2017Q2  | Actual            | Recovery            |   8680410.63
         2017Q2  | Reserve           | Payment             |  25411909.79
         2017Q2  | Reserve           | Recovery            |   8810539.61
         2017Q3  | Actual            | Payment             |   9202088.05
         2017Q3  | Actual            | Recovery            |   3584824.21
         2017Q3  | Reserve           | Payment             |   7814403.72
         2017Q3  | Reserve           | Recovery            |   2591892.73
        (20 rows)
    

I suggest that you run this nice simple query and then use a spreadsheet tool or a visualisation tool (such as Tableau or QlikView) to present the results as a cross-tab.

If you really want to use SQL to format it, this is possible. I'll show you how, and you can decide whether you want to do it this way or the easy way.

The Ugly Bit

Here's the first step we need to do:

        
        select
            quarter,
            case
                when quarter = 
                    extract(year from current_date) ||
                    'Q' ||
                    extract(quarter from current_date)
                then total_amount
                else 0
            end as latest_quarter_amount,
            case
                when quarter = 
                    extract(year from (current_date - interval '3 months')) ||
                    'Q' ||
                    extract(quarter from (current_date - interval '3 months'))
                then total_amount
                else 0
            end as quarter_minus_1_amount,
            case
                when quarter = 
                    extract(year from (current_date - interval '6 months')) ||
                    'Q' ||
                    extract(quarter from (current_date - interval '6 months'))
                then total_amount
                else 0
            end as quarter_minus_2_amount,
            case
                when quarter = 
                    extract(year from (current_date - interval '9 months')) ||
                    'Q' ||
                    extract(quarter from (current_date - interval '9 months'))
                then total_amount
                else 0
            end as quarter_minus_3_amount,
            actual_or_reserve,
            payment_or_recovery
        from
        (
            select
                extract(year from created_at) || 
                'Q' || 
                extract(quarter from created_at) as quarter,
                actual_or_reserve,
                payment_or_recovery,
                sum(amount) as total_amount
            from
                transaction
            where
                created_at >= current_date - interval '1 year'
            group by
                quarter,
                actual_or_reserve,
                payment_or_recovery
          ) x;
    

This gives us the following results:

 quarter | latest_quarter_amount | quarter_minus_1_amount | quarter_minus_2_amount | quarter_minus_3_amount | actual_or_reserve | payment_or_recovery 
---------+-----------------------+------------------------+------------------------+------------------------+-------------------+---------------------
 2016Q3  |                     0 |                      0 |                      0 |                      0 | Actual            | Payment
 2016Q3  |                     0 |                      0 |                      0 |                      0 | Actual            | Recovery
 2016Q3  |                     0 |                      0 |                      0 |                      0 | Reserve           | Payment
 2016Q3  |                     0 |                      0 |                      0 |                      0 | Reserve           | Recovery
 2016Q4  |                     0 |                      0 |                      0 |            25593120.54 | Actual            | Payment
 2016Q4  |                     0 |                      0 |                      0 |             9340889.44 | Actual            | Recovery
 2016Q4  |                     0 |                      0 |                      0 |            25806542.44 | Reserve           | Payment
 2016Q4  |                     0 |                      0 |                      0 |             9034181.32 | Reserve           | Recovery
 2017Q1  |                     0 |                      0 |            25724698.40 |                      0 | Actual            | Payment
 2017Q1  |                     0 |                      0 |             8555264.57 |                      0 | Actual            | Recovery
 2017Q1  |                     0 |                      0 |            25699383.72 |                      0 | Reserve           | Payment
 2017Q1  |                     0 |                      0 |             8695679.11 |                      0 | Reserve           | Recovery
 2017Q2  |                     0 |            26028240.14 |                      0 |                      0 | Actual            | Payment
 2017Q2  |                     0 |             8680410.63 |                      0 |                      0 | Actual            | Recovery
 2017Q2  |                     0 |            25411909.79 |                      0 |                      0 | Reserve           | Payment
 2017Q2  |                     0 |             8810539.61 |                      0 |                      0 | Reserve           | Recovery
 2017Q3  |            9202088.05 |                      0 |                      0 |                      0 | Actual            | Payment
 2017Q3  |            3584824.21 |                      0 |                      0 |                      0 | Actual            | Recovery
 2017Q3  |            7814403.72 |                      0 |                      0 |                      0 | Reserve           | Payment
 2017Q3  |            2591892.73 |                      0 |                      0 |                      0 | Reserve           | Recovery
(20 rows)
    

Now we pivot the data to turn the quarters into columns instead of rows. We are also adding the quarter headings as the first row of data. We could hard-code the headings, which would be prettier, but what if we want to run the same report next quarter and the quarter after that? We don't want to have to change the SQL code every time we run the report. Hmmm. Let's run this version and then think about the column headings again.


        select
            ' ' as actual_or_reserve,
            ' ' as payment_or_recovery,
            extract(year from (current_date - interval '9 months')) ||
            'Q' ||
            extract(quarter from (current_date - interval '9 months')) as period_1,
            extract(year from (current_date - interval '6 months')) ||
            'Q' ||
            extract(quarter from (current_date - interval '6 months')) as period_2,
            extract(year from (current_date - interval '3 months')) ||
            'Q' ||
            extract(quarter from (current_date - interval '3 months')) as period_3,
            extract(year from current_date) ||
            'Q' ||
            extract(quarter from current_date) as period_4
        union
        (
            select
                actual_or_reserve,
                payment_or_recovery,
                to_char(sum(quarter_minus_3_amount), '99G999G999D99') as period_1,
                to_char(sum(quarter_minus_2_amount), '99G999G999D99') as period_2,
                to_char(sum(quarter_minus_1_amount), '99G999G999D99') as period_3,
                to_char(sum(latest_quarter_amount), '99G999G999D99') as period_4
            from
            (
                select
                    quarter,
                    case
                        when quarter = 
                            extract(year from current_date) ||
                            'Q' ||
                            extract(quarter from current_date)
                        then total_amount
                        else 0
                    end as latest_quarter_amount,
                    case
                        when quarter = 
                            extract(year from (current_date - interval '3 months')) ||
                            'Q' ||
                            extract(quarter from (current_date - interval '3 months'))
                        then total_amount
                        else 0
                    end as quarter_minus_1_amount,
                    case
                        when quarter = 
                            extract(year from (current_date - interval '6 months')) ||
                            'Q' ||
                            extract(quarter from (current_date - interval '6 months'))
                        then total_amount
                        else 0
                    end as quarter_minus_2_amount,
                    case
                        when quarter = 
                            extract(year from (current_date - interval '9 months')) ||
                            'Q' ||
                            extract(quarter from (current_date - interval '9 months'))
                        then total_amount
                        else 0
                    end as quarter_minus_3_amount,
                    actual_or_reserve,
                    payment_or_recovery
                from
                (
                    select
                        extract(year from created_at) || 
                        'Q' || 
                        extract(quarter from created_at) as quarter,
                        actual_or_reserve,
                        payment_or_recovery,
                        sum(amount) as total_amount
                    from
                        transaction
                    where
                        created_at > current_date - interval '12 months'
                    group by
                        quarter,
                        actual_or_reserve,
                        payment_or_recovery
                ) x
            ) y
            group by
                actual_or_reserve,
                payment_or_recovery
        )
        order by
            actual_or_reserve,
            payment_or_recovery;    

and the results are:

 actual_or_reserve | payment_or_recovery |    period_1    |    period_2    |    period_3    |    period_4    
-------------------+---------------------+----------------+----------------+----------------+----------------
                   |                     | 2016Q4         | 2017Q1         | 2017Q2         | 2017Q3
 Actual            | Payment             |  25,593,120.54 |  25,724,698.40 |  26,028,240.14 |   9,202,088.05
 Actual            | Recovery            |   9,340,889.44 |   8,555,264.57 |   8,680,410.63 |   3,584,824.21
 Reserve           | Payment             |  25,806,542.44 |  25,699,383.72 |  25,411,909.79 |   7,814,403.72
 Reserve           | Recovery            |   9,034,181.32 |   8,695,679.11 |   8,810,539.61 |   2,591,892.73
(5 rows)
    

So we've got the results, with the correct quarter headings and we could copy and paste this into our presentation slides and, with very little work, make it pretty.

There is a way to make the column headings come out right, rather than generating them as a data row in the table. This uses the technique that is to write SQL to generate a SQL statement and then execute the generated statement to get the results we want. Here is the PostgreSQL version:

      \set script_file `date +%Y%m%d%H%M%S` .sql
      \t
      \o :script_file
      select
        'select ' ||
            'actual_or_reserve, ' ||
            'payment_or_recovery, ' ||
            'to_char(sum(quarter_minus_3_amount), ''99G999G999D99'') as "' || minus_3_quarters_label || '", ' ||
            'to_char(sum(quarter_minus_2_amount), ''99G999G999D99'') as "' || minus_2_quarters_label || '", ' ||
            'to_char(sum(quarter_minus_1_amount), ''99G999G999D99'') as "' || minus_1_quarter_label  || '", ' ||
            'to_char(sum(latest_quarter_amount),  ''99G999G999D99'') as "' || current_label          || '" ' ||
        'from ' ||
        '(' ||
            'select ' ||
                'quarter, ' ||
                'case ' ||
                    'when quarter = ''' || year_minus_3_quarters || 'Q' || quarter_minus_3_quarters || ''' ' ||
                    'then total_amount ' ||
                    'else 0 ' ||
                'end as quarter_minus_3_amount, ' ||
                'case ' ||
                    'when quarter = ''' || year_minus_2_quarters || 'Q' || quarter_minus_2_quarters || ''' ' ||
                    'then total_amount ' ||
                    'else 0 ' ||
                'end as quarter_minus_2_amount, ' ||
                'case ' ||
                    'when quarter = ''' || year_minus_1_quarter || 'Q' || quarter_minus_1_quarter || ''' ' ||
                    'then total_amount ' ||
                    'else 0 ' ||
                'end as quarter_minus_1_amount, ' ||
                'case ' ||
                    'when quarter = ''' || current_year || 'Q' || current_quarter || ''' ' ||
                    'then total_amount ' ||
                    'else 0 ' ||
                'end as latest_quarter_amount, ' ||
                'actual_or_reserve, ' ||
                'payment_or_recovery ' ||
            'from ' ||
            '(' ||
                'select ' ||
                    'extract(year from created_at) || ''Q'' || extract(quarter from created_at) as quarter, ' ||
                    'actual_or_reserve, ' ||
                    'payment_or_recovery, ' ||
                    'sum(amount) as total_amount ' ||
                'from ' ||
                    'transaction ' ||
                'group by ' ||
                    'quarter, ' ||
                    'actual_or_reserve, ' ||
                    'payment_or_recovery ' ||
            ') x ' ||
        ') y ' ||
        'group by ' ||
            'actual_or_reserve, ' ||
            'payment_or_recovery ' ||
        'order by ' ||
            'actual_or_reserve, ' ||
            'payment_or_recovery ' || 
        ';' as query
      from
        (
          select
            year_minus_3_quarters,
            quarter_minus_3_quarters,
            year_minus_3_quarters || 'Q' || quarter_minus_3_quarters  as minus_3_quarters_label,
            year_minus_2_quarters,
            quarter_minus_2_quarters,
            year_minus_2_quarters || 'Q' || quarter_minus_2_quarters  as minus_2_quarters_label,
            year_minus_1_quarter,
            quarter_minus_1_quarter,
            year_minus_1_quarter || 'Q' || quarter_minus_1_quarter    as minus_1_quarter_label,
            current_year,
            current_quarter,
            current_year || 'Q' || current_quarter                    as current_label
          from
            (
              select
                extract(year from (current_date - interval '9 months'))     as year_minus_3_quarters,
                extract(quarter from (current_date - interval '9 months'))  as quarter_minus_3_quarters,
                extract(year from (current_date - interval '6 months'))     as year_minus_2_quarters,
                extract(quarter from (current_date - interval '6 months'))  as quarter_minus_2_quarters,
                extract(year from (current_date - interval '3 months'))     as year_minus_1_quarter,
                extract(quarter from (current_date - interval '3 months'))  as quarter_minus_1_quarter,
                extract(year from current_date)                             as current_year,
                extract(quarter from current_date)                          as current_quarter
            ) x
        ) y;    
      \o
      \t
      \i :script_file
    

Here are the results:

 actual_or_reserve | payment_or_recovery |     2016Q4     |     2017Q1     |     2017Q2     |     2017Q3     
-------------------+---------------------+----------------+----------------+----------------+----------------
 Actual            | Payment             |  25,593,120.54 |  25,724,698.40 |  26,028,240.14 |   9,202,088.05
 Actual            | Recovery            |   9,340,889.44 |   8,555,264.57 |   8,680,410.63 |   3,584,824.21
 Reserve           | Payment             |  25,806,542.44 |  25,699,383.72 |  25,411,909.79 |   7,814,403.72
 Reserve           | Recovery            |   9,034,181.32 |   8,695,679.11 |   8,810,539.61 |   2,591,892.73
(4 rows)
    

I have certainly seen this techinique (using SQL to generate SQL and then executing the result) used, and I do use it myself in special cases, but for a pivot table I think it would not be the best solution. This approach can be used with other databases, but the command-line statements (that start with "\" in PostgreSQL) will be different in different database products. If you do this, remember to delete the script file when you have finished with it. As with the previous solution (using the fake aggregates) this works but I do not recommend it. I only show it to illustrate how messy things get when you try to use SQL for fancy presentation; it isn't designed for that so use a tool that is.

What Some Of The Vendors Provide

PostgreSQL

In PostgreSQL, since version 9.6, there is a command to format the results of a query as a crosstab or pivot table. This allows one column for the vertical values, one column for the horizontal values and one column to be the values in the grid. It is possible to work around this limitation by concatenating columns. So, if we take our original aggregate query and concatentate actual_or_reserve and payment_or_recovery then we can run our query like this:

      select
           extract(year from created_at) || 'Q' || extract(quarter from created_at)  as quarter,
           actual_or_reserve || ' ' || payment_or_recovery                           as accounting_type,
           sum(amount)                                                               as total_amount
       from
           transaction
       where
           created_at >= current_date - interval '1 year'
       group by
           quarter,
           accounting_type;
    

This gives the results in the normal way:

       quarter | accounting_type  | total_amount 
      ---------+------------------+--------------
       2016Q4  | Actual Payment   |   5141985.32
       2016Q4  | Actual Recovery  |   1550122.02
       2016Q4  | Reserve Payment  |   4839087.92
       2016Q4  | Reserve Recovery |   1447052.35
       2017Q1  | Actual Payment   |  26350790.34
       2017Q1  | Actual Recovery  |   8568141.87
       2017Q1  | Reserve Payment  |  25186564.26
       2017Q1  | Reserve Recovery |   8796869.92
       2017Q2  | Actual Payment   |  25676224.13
       2017Q2  | Actual Recovery  |   8603254.09
       2017Q2  | Reserve Payment  |  26453971.14
       2017Q2  | Reserve Recovery |   8869103.17
       2017Q3  | Actual Payment   |  26201379.03
       2017Q3  | Actual Recovery  |   8252500.40
       2017Q3  | Reserve Payment  |  24840062.94
       2017Q3  | Reserve Recovery |   9132468.50
       2017Q4  | Actual Payment   |  16193206.91
       2017Q4  | Actual Recovery  |   6641101.45
       2017Q4  | Reserve Payment  |  15220428.39
       2017Q4  | Reserve Recovery |   4724207.15
      (20 rows)
    

Then we enter \crosstabview <vertical column> <horizontal column> <data column> and the result set is repeated, now formatted as a crosstab:

       \crosstabview accounting_type quarter total_amount
    
       accounting_type  |   2016Q4   |   2017Q1    |   2017Q2    |   2017Q3    |   2017Q4    
      ------------------+------------+-------------+-------------+-------------+-------------
       Actual Payment   | 7103424.73 | 26350790.34 | 25676224.13 | 26201379.03 | 16193206.91
       Actual Recovery  | 2403579.12 |  8568141.87 |  8603254.09 |  8252500.40 |  6641101.45
       Reserve Payment  | 7465664.91 | 25186564.26 | 26453971.14 | 24840062.94 | 15220428.39
       Reserve Recovery | 2570623.91 |  8796869.92 |  8869103.17 |  9132468.50 |  4724207.15
      (4 rows)
    

For the simple case this is a nice solution. It only works in the psql command line tool.

It is also possible to get a plug-in to help with cross-tab or pivot-table reports, but I would not recommend this. As my good friend Paul Beckford says, "SQL is a Domain-Specific Language." SQL does one job very well. Trying to get SQL to do tasks beyond its domain never gives the best outcome.

We saw a very simple query that returned the data we wanted, and we made it into a big ugly query to get the formatting closer to what we want. I think that it is better to use SQL for what it is good at - grouping and aggregating large volumes of data to get an interesting set of values - and to use a presentation tool for the presentation.