Jim Dehner

Tableau How to’s,  Use Cases, and Forums Questions

Hey #DataFam

Let’s Talk

Open for

Want to total across the row or column, rank a dimension, or pick out a single value? Then you need a Table Calculation!  But how?

#### Table Calculation Basics

Table calculations should feel familiar.  They operate on the table you see in the view much like those from spreadsheet calculators.  The actual "data table" is a subset of the total data uploaded into Tableau and is created as you place dimensions on the rows or column shelf and apply filters.   While the data table, in reality, is tall and narrow you can think of the table calculation operating across the columns or down the rows of the table you see in the viz.

With a spreadsheet calculator, you write a formula in a cell and then copy it across the rows or down the column to the remaining cells,  Table calculations create a dimension or a measure that does that for you but you need to tell Tableau in which direction the formula is applied.

Table calculations are applied either ACROSS the columns or DOWN the rows:

Addressing is significant,  across increments positively to the right, negative to the left.  Down is positive down and negative up.

#### Order of Operation effect

Table calculations are last in the order of operation.  All other types of calculations are complete, and context or dimension filters have been applied.  Consequently, the results of table calculations can not be used in LOD expressions, and some Grand Totals or Sub Totals can be affected.  On the other hand, the results from LOD's can be used in table calculations and they can be nested within themselves.

#### Syntax

The syntax for table calculations is:

#### Window_sum(Sum(measure),Start,Finish)

A keyword (Window_sum here) tells Tableau which aggregation function to use to calculate the overall total.  The first argument in the parentheses is the measure to use and it is always aggregated. In this example sum() tells Tableau to sum the individual records from the detail data table and use those totals in the overall aggregation.  Start and finish are where to begin and end the overall aggregation – more on that later

#### 1 – Simple window calculations

We'll start with a couple of simple window calculations based on this data set:

The most frequently used types of calculations are the "Window", "Running" and "Rank" calculations found under "Table Calculations" in the wizard.

Starting with Windows the formula for getting the total across the table would be

#### Window_sum(sum([Sales]),FIRST(),LAST())

Which in words says – take the sum of [Sales] for each year/segment combination and total them from the first cell through the last – the calculation order is Across – (note if first and last are omitted they are assumed to be from the first cell through the last).  The value in each cell is the total for the "Row"

To illustrate the differences between the inner aggregation and the window aggregation level, we could also have calculated the sum of the average values as

#### window_sum(avg([Sales]),FIRST(),LAST())

and the average of the sum as

#### window_avg(sum([Sales]),FIRST(),LAST())

and they return this:

One more option, before we move on. What happens when we add a start and end value? This formula will return the 2 year total – in words, it says start totaling 1 cell back (-1 or to the left) and ends with the current cell (0)  and sums the values together

#### window_sum(sum([Sales]),-1,0)

It returns this:

Five table calculations are used to navigate around the table – similar to "Offset" in excel

Using the function will take you to the location with respect to the current cell and return the value found in that cell.  First(), or Last() refer to the first row or column in the table or the last row or column.  Index just increments (counts) across the row or column – +1 is one to the right or down, +2 is 2 cells right or down, etc.

Lookup() goes to a specific cell – +5 goes to the 5 cells in a positive direction and returns the value found there and

Previous_value() looks to the immediate preceding cell.

It is important to note that navigation is based on the location in the table.  New users often get confused believing the value is considered – e.g. in a date field lookup(attr(date),-5) is looking back 5 columns in the date field NOT 5 days.

## 2 – Lookup functions

Lookup() will return the value in the cell distance from the current cell location. In words, this formula says move to the left 1 column and go the rows marked sum(Sales) and return the value found there.

#### Lookup(sum([Sales]),-1)

You are retrieving values from the table that is visible in the worksheet and those values are aggregated – so table calculations are also always an aggregation.

In this case, the formula returns this:

## 3 Previous Value

Unlike Lookup, Previous Value operates in the same row (or column) as the current cell.  This formula:

#### PREVIOUS_VALUE(sum([Sales]))

Using the same table, Previous value returns:

Previous value can also be used to calculate the running total or the running product.   The formula

#### sum(value)+previous_value(0)

will return the running total and

#### sum(value)*previous_value(1)

returns the running product

#### 4 Table Calculation Editor

The direction and scope of the table calculation establish if the calculation is determined across the columns or down the rows and how much of the table is be included.

For this example, I have added Category to the table to create an additional partition

There is a table calculation to determine the percent of the total. Initially, the calculation is set to Across – so each row totals 100%.

To open the Editor – select Edit Table Calculation

When the Editor opens the Direction and Scope are highlighted

Using the Editor the scope and direction can be changed – the highlighter adjust to show how the table calculation will be executed

#### 5 Ranking – Top N

Ranking formulas in any of its many forms are table calculations –

#### RANK_UNIQUE(sum([Sales]),'desc')

Rank will find the relative rank of a dimension (note you don't explicitly include the dimension in the formula)  based on the Sum[Sales]  – and the Rank is descending.

With a direction and scope of Table/Down Phones have the greatest table (Rank=1) and Phones the least (Rank = 17)

Changing the scope to category/subcategory and restarting every category creates a rank within the category :

To show only the top 2 values in each category, and add a separate column with the total.  We will need a filter:

#### [5 top 2 in category]<=[5 top N ]

and the running total

#### window_sum((if [5 rank filter] then sum([Sales]) end), FIRST(),0)

to only show the Top 2 in each category from the finning total add

#### if RANK_UNIQUE(sum([Sales]),'desc')=[5 top N ] then [5 running totals of rank] end

The filter will just limit the viz to a parameter value ( here 2)  and the running total calculation provides the results we want – but we only want to see the value where Top N =2

The final formula is used to place a Null in all other rows.

## 6 – Count consecutive records

Occasionally, a user will want to identify events that happen in consecutive years.  That can be done with a table calculation:

The raw data looks like this and we want the maximum number of consecutive years that each donor has contributed

this places an indicator (1) in each year the donor made a contribution

#### if lookup(sum([Gift Amount]),0)>0 then 1 end

this will count the consecutive years using Previous Value

#### if LOOKUP([Id gift giving years],0)>0 then ([Id gift giving years]) + PREVIOUS_VALUE(0)else 0 end

and the final step is to just find the max number of consecutive years by donor and show the total

and

#### if last()=0 then "show" else "Hide" end

The detail table looks like this:

using a last filter and hiding unneeded columns ends up with this summary

#### 7 Window Statistics

Some users need to do some basic statistics,  e.g. means, standard deviation,  and get confused between then results based on the detailed data in the dataset or the visual data in their viz.

The "window statistic" functions are based on the aggregated data visible in the view

These are based on the detailed data (or calculation) in the data table

For example – this will return the standard deviation of the detail records

where this

#### WINDOW_STDEV(sum([Sales]))

returns the standard deviation of the aggregated records in the table

I hope this helps you understand table calculations and when they are used.

The workbook containing these examples can be found at Link to Workbook

Enjoy

Jim