Jim Dehner 

 Tableau Visionary and Ambassador

Tableau How to’s,  Use Cases, and Forums Questions

Hey #DataFam

Let’s Talk

Open for

Post Links

Want to total across the row or column, rank a dimension, or pick out a single value? Then you need a Table Calculation!  But how?

 

 

 

Table Calculation Basics

Table calculations should feel familiar.  They operate on the table you see in the view much like those from spreadsheet calculators.  The actual "data table" is a subset of the total data uploaded into Tableau and is created as you place dimensions on the rows or column shelf and apply filters.   While the data table, in reality, is tall and narrow you can think of the table calculation operating across the columns or down the rows of the table you see in the viz.   

 

 

With a spreadsheet calculator, you write a formula in a cell and then copy it across the rows or down the column to the remaining cells,  Table calculations create a dimension or a measure that does that for you but you need to tell Tableau in which direction the formula is applied.  

Table calculations are applied either ACROSS the columns or DOWN the rows:

 

 

Addressing is significant,  across increments positively to the right, negative to the left.  Down is positive down and negative up.  

 

Order of Operation effect

Table calculations are last in the order of operation.  All other types of calculations are complete, and context or dimension filters have been applied.  Consequently, the results of table calculations can not be used in LOD expressions, and some Grand Totals or Sub Totals can be affected.  On the other hand, the results from LOD's can be used in table calculations and they can be nested within themselves.

 

 

Syntax

The syntax for table calculations is:

Window_sum(Sum(measure),Start,Finish) 

 

A keyword (Window_sum here) tells Tableau which aggregation function to use to calculate the overall total.  The first argument in the parentheses is the measure to use and it is always aggregated. In this example sum() tells Tableau to sum the individual records from the detail data table and use those totals in the overall aggregation.  Start and finish are where to begin and end the overall aggregation – more on that later

 

 

How do Table Calculations work?

1 – Simple window calculations

We'll start with a couple of simple window calculations based on this data set:

 

 

The most frequently used types of calculations are the "Window", "Running" and "Rank" calculations found under "Table Calculations" in the wizard. 

 

 

Starting with Windows the formula for getting the total across the table would be

 

Window_sum(sum([Sales]),FIRST(),LAST())

 

Which in words says – take the sum of [Sales] for each year/segment combination and total them from the first cell through the last – the calculation order is Across – (note if first and last are omitted they are assumed to be from the first cell through the last).  The value in each cell is the total for the "Row"

 

 

To illustrate the differences between the inner aggregation and the window aggregation level, we could also have calculated the sum of the average values as 

 

window_sum(avg([Sales]),FIRST(),LAST())

and the average of the sum as 

window_avg(sum([Sales]),FIRST(),LAST())

 

and they return this:

 

 

One more option, before we move on. What happens when we add a start and end value? This formula will return the 2 year total – in words, it says start totaling 1 cell back (-1 or to the left) and ends with the current cell (0)  and sums the values together

window_sum(sum([Sales]),-1,0) 

It returns this:

 

 

Five table calculations are used to navigate around the table – similar to "Offset" in excel 

 

 

Using the function will take you to the location with respect to the current cell and return the value found in that cell.  First(), or Last() refer to the first row or column in the table or the last row or column.  Index just increments (counts) across the row or column – +1 is one to the right or down, +2 is 2 cells right or down, etc.

 

Lookup() goes to a specific cell – +5 goes to the 5 cells in a positive direction and returns the value found there and 

Previous_value() looks to the immediate preceding cell.

 

It is important to note that navigation is based on the location in the table.  New users often get confused believing the value is considered – e.g. in a date field lookup(attr(date),-5) is looking back 5 columns in the date field NOT 5 days.

 

2 – Lookup functions

Lookup() will return the value in the cell distance from the current cell location. In words, this formula says move to the left 1 column and go the rows marked sum(Sales) and return the value found there. 

 

Lookup(sum([Sales]),-1)

You are retrieving values from the table that is visible in the worksheet and those values are aggregated – so table calculations are also always an aggregation. 

 

In this case, the formula returns this:

 

 

 3 Previous Value

Unlike Lookup, Previous Value operates in the same row (or column) as the current cell.  This formula:

 

PREVIOUS_VALUE(sum([Sales]))

 

Using the same table, Previous value returns:

 

 

Previous value can also be used to calculate the running total or the running product.   The formula

 

sum(value)+previous_value(0)

will return the running total and 

sum(value)*previous_value(1)

returns the running product

 

4 Table Calculation Editor

The direction and scope of the table calculation establish if the calculation is determined across the columns or down the rows and how much of the table is be included.

 

For this example, I have added Category to the table to create an additional partition  

There is a table calculation to determine the percent of the total. Initially, the calculation is set to Across – so each row totals 100%.

 

To open the Editor – select Edit Table Calculation 

 

 

When the Editor opens the Direction and Scope are highlighted

 

 

Using the Editor the scope and direction can be changed – the highlighter adjust to show how the table calculation will be executed

 


5 Ranking – Top N

Ranking formulas in any of its many forms are table calculations – 

 

RANK_UNIQUE(sum([Sales]),'desc')

 

Rank will find the relative rank of a dimension (note you don't explicitly include the dimension in the formula)  based on the Sum[Sales]  – and the Rank is descending.  

 

 

With a direction and scope of Table/Down Phones have the greatest table (Rank=1) and Phones the least (Rank = 17) 

 

 

Changing the scope to category/subcategory and restarting every category creates a rank within the category :

 

 

To show only the top 2 values in each category, and add a separate column with the total.  We will need a filter:

 

[5 top 2 in category]<=[5 top N ]

and the running total 

window_sum((if [5 rank filter] then sum([Sales]) end), FIRST(),0)

 

to only show the Top 2 in each category from the finning total add

 

if RANK_UNIQUE(sum([Sales]),'desc')=[5 top N ] then [5 running totals of rank] end

 

The filter will just limit the viz to a parameter value ( here 2)  and the running total calculation provides the results we want – but we only want to see the value where Top N =2

 

The final formula is used to place a Null in all other rows.  

 

 

6 – Count consecutive records

Occasionally, a user will want to identify events that happen in consecutive years.  That can be done with a table calculation: 

The raw data looks like this and we want the maximum number of consecutive years that each donor has contributed 

 

 

this places an indicator (1) in each year the donor made a contribution

 

if lookup(sum([Gift Amount]),0)>0 then 1 end

 

this will count the consecutive years using Previous Value

 

if LOOKUP([Id gift giving years],0)>0 then ([Id gift giving years]) + PREVIOUS_VALUE(0)
else 0 end

 

and the final step is to just find the max number of consecutive years by donor and show the total

window_max([count consecutive years ])

and    

if last()=0 then "show" else "Hide" end

The detail table looks like this:

 

 

using a last filter and hiding unneeded columns ends up with this summary

 

 

7 Window Statistics

 

Some users need to do some basic statistics,  e.g. means, standard deviation,  and get confused between then results based on the detailed data in the dataset or the visual data in their viz.  

 

The "window statistic" functions are based on the aggregated data visible in the view

 

 

These are based on the detailed data (or calculation) in the data table

 

 

For example – this will return the standard deviation of the detail records

 

STDEV([Sales]) 

where this 

WINDOW_STDEV(sum([Sales]))

 

returns the standard deviation of the aggregated records in the table

 

 

I hope this helps you understand table calculations and when they are used.

 

The workbook containing these examples can be found at Link to Workbook     

 

 

 

Enjoy

Jim

3 Responses

  1. Hi, this was wonderful and very helpful! I am wracking my brain over something and was hoping you could help:

    I am using DateDiff with Lookup to find the variance in hours between consecutive days, which is easy enough. I'm using a heatmap viz with the dates going across the columns at the top, and the names of the clients on each row. The detail for the heat map is the variance between start times from the previous day. The table calc is correctly computing Across Rows. Here's my calc: // Difference from previous value. abs (DATEDIFF('hour', LOOKUP(MAX([start time)]),-1),LOOKUP(MAX([start time)]),0)))

    The issue I'm having however, is when there there are nulls in between days, i.e. the job date might run every week on Sunday, versus every day, it returns zero because obviously it's just looking for the previous column. How do I get it to skip the null days and compute the difference from the last non-null date? Is there an "Is IfNull" condition I can add perhaps? If so how would that look?

    1. Lookup is a table calculation that returns a value from a cell in the data table based on the position of the cell in the table – not the value associated with the cell – so when you lookup a cell that is in location -1 (looking 1 cell back) it is not like looking back 1 hour or 1 day – if there are nulls in the data set (or voids) that disrupt the sequence of the data tableau will return the value in in cell without regard to a normal sequence
      As to the specific question – Please post it on the Tableau Forums and include your Tableau workbook

Leave a Reply

Your email address will not be published. Required fields are marked *