Want to total across the row or column, rank a dimension, or pick out a single value? Then you need a Table Calculation! But how?
Table Calculation Basics
Table calculations should feel familiar. They operate on the table you see in the view much like those from spreadsheet calculators. The actual "data table" is a subset of the total data uploaded into Tableau and is created as you place dimensions on the rows or column shelf and apply filters. While the data table, in reality, is tall and narrow you can think of the table calculation operating across the columns or down the rows of the table you see in the viz.
With a spreadsheet calculator, you write a formula in a cell and then copy it across the rows or down the column to the remaining cells, Table calculations create a dimension or a measure that does that for you but you need to tell Tableau in which direction the formula is applied.
Table calculations are applied either ACROSS the columns or DOWN the rows:
Addressing is significant, across increments positively to the right, negative to the left. Down is positive down and negative up.
Order of Operation effect
Table calculations are last in the order of operation. All other types of calculations are complete, and context or dimension filters have been applied. Consequently, the results of table calculations can not be used in LOD expressions, and some Grand Totals or Sub Totals can be affected. On the other hand, the results from LOD's can be used in table calculations and they can be nested within themselves.
Syntax
The syntax for table calculations is:
Window_sum(Sum(measure),Start,Finish)
A keyword (Window_sum here) tells Tableau which aggregation function to use to calculate the overall total. The first argument in the parentheses is the measure to use and it is always aggregated. In this example sum() tells Tableau to sum the individual records from the detail data table and use those totals in the overall aggregation. Start and finish are where to begin and end the overall aggregation – more on that later
How do Table Calculations work?
1 – Simple window calculations
We'll start with a couple of simple window calculations based on this data set:
The most frequently used types of calculations are the "Window", "Running" and "Rank" calculations found under "Table Calculations" in the wizard.
Starting with Windows the formula for getting the total across the table would be
Window_sum(sum([Sales]),FIRST(),LAST())
Which in words says – take the sum of [Sales] for each year/segment combination and total them from the first cell through the last – the calculation order is Across – (note if first and last are omitted they are assumed to be from the first cell through the last). The value in each cell is the total for the "Row"
To illustrate the differences between the inner aggregation and the window aggregation level, we could also have calculated the sum of the average values as
window_sum(avg([Sales]),FIRST(),LAST())
and the average of the sum as
window_avg(sum([Sales]),FIRST(),LAST())
and they return this:
One more option, before we move on. What happens when we add a start and end value? This formula will return the 2 year total – in words, it says start totaling 1 cell back (-1 or to the left) and ends with the current cell (0) and sums the values together
window_sum(sum([Sales]),-1,0)
It returns this:
Five table calculations are used to navigate around the table – similar to "Offset" in excel
Using the function will take you to the location with respect to the current cell and return the value found in that cell. First(), or Last() refer to the first row or column in the table or the last row or column. Index just increments (counts) across the row or column – +1 is one to the right or down, +2 is 2 cells right or down, etc.
Lookup() goes to a specific cell – +5 goes to the 5 cells in a positive direction and returns the value found there and
Previous_value() looks to the immediate preceding cell.
It is important to note that navigation is based on the location in the table. New users often get confused believing the value is considered – e.g. in a date field lookup(attr(date),-5) is looking back 5 columns in the date field NOT 5 days.
2 – Lookup functions
Lookup() will return the value in the cell distance from the current cell location. In words, this formula says move to the left 1 column and go the rows marked sum(Sales) and return the value found there.
Lookup(sum([Sales]),-1)
You are retrieving values from the table that is visible in the worksheet and those values are aggregated – so table calculations are also always an aggregation.
In this case, the formula returns this:
3 Previous Value
Unlike Lookup, Previous Value operates in the same row (or column) as the current cell. This formula:
PREVIOUS_VALUE(sum([Sales]))
Using the same table, Previous value returns:
Previous value can also be used to calculate the running total or the running product. The formula
sum(value)+previous_value(0)
will return the running total and
sum(value)*previous_value(1)
returns the running product
4 Table Calculation Editor
The direction and scope of the table calculation establish if the calculation is determined across the columns or down the rows and how much of the table is be included.
For this example, I have added Category to the table to create an additional partition
There is a table calculation to determine the percent of the total. Initially, the calculation is set to Across – so each row totals 100%.
To open the Editor – select Edit Table Calculation
When the Editor opens the Direction and Scope are highlighted
Using the Editor the scope and direction can be changed – the highlighter adjust to show how the table calculation will be executed
5 Ranking – Top N
Ranking formulas in any of its many forms are table calculations –
RANK_UNIQUE(sum([Sales]),'desc')
Rank will find the relative rank of a dimension (note you don't explicitly include the dimension in the formula) based on the Sum[Sales] – and the Rank is descending.
With a direction and scope of Table/Down Phones have the greatest table (Rank=1) and Phones the least (Rank = 17)
Changing the scope to category/subcategory and restarting every category creates a rank within the category :
To show only the top 2 values in each category, and add a separate column with the total. We will need a filter:
[5 top 2 in category]<=[5 top N ]
and the running total
window_sum((if [5 rank filter] then sum([Sales]) end), FIRST(),0)
to only show the Top 2 in each category from the finning total add
if RANK_UNIQUE(sum([Sales]),'desc')=[5 top N ] then [5 running totals of rank] end
The filter will just limit the viz to a parameter value ( here 2) and the running total calculation provides the results we want – but we only want to see the value where Top N =2
The final formula is used to place a Null in all other rows.
6 – Count consecutive records
Occasionally, a user will want to identify events that happen in consecutive years. That can be done with a table calculation:
The raw data looks like this and we want the maximum number of consecutive years that each donor has contributed
this places an indicator (1) in each year the donor made a contribution
if lookup(sum([Gift Amount]),0)>0 then 1 end
this will count the consecutive years using Previous Value
if LOOKUP([Id gift giving years],0)>0 then ([Id gift giving years]) + PREVIOUS_VALUE(0)
else 0 end
and the final step is to just find the max number of consecutive years by donor and show the total
window_max([count consecutive years ])
and
if last()=0 then "show" else "Hide" end
The detail table looks like this:
using a last filter and hiding unneeded columns ends up with this summary
7 Window Statistics
Some users need to do some basic statistics, e.g. means, standard deviation, and get confused between then results based on the detailed data in the dataset or the visual data in their viz.
The "window statistic" functions are based on the aggregated data visible in the view
These are based on the detailed data (or calculation) in the data table
For example – this will return the standard deviation of the detail records
STDEV([Sales])
where this
WINDOW_STDEV(sum([Sales]))
returns the standard deviation of the aggregated records in the table
I hope this helps you understand table calculations and when they are used.
The workbook containing these examples can be found at Link to Workbook
Enjoy
Jim
3 Responses
Thanks for the wonderful post. It was very helpful in understanding the Table calculations.
Hi, this was wonderful and very helpful! I am wracking my brain over something and was hoping you could help:
I am using DateDiff with Lookup to find the variance in hours between consecutive days, which is easy enough. I'm using a heatmap viz with the dates going across the columns at the top, and the names of the clients on each row. The detail for the heat map is the variance between start times from the previous day. The table calc is correctly computing Across Rows. Here's my calc: // Difference from previous value. abs (DATEDIFF('hour', LOOKUP(MAX([start time)]),-1),LOOKUP(MAX([start time)]),0)))
The issue I'm having however, is when there there are nulls in between days, i.e. the job date might run every week on Sunday, versus every day, it returns zero because obviously it's just looking for the previous column. How do I get it to skip the null days and compute the difference from the last non-null date? Is there an "Is IfNull" condition I can add perhaps? If so how would that look?
Lookup is a table calculation that returns a value from a cell in the data table based on the position of the cell in the table – not the value associated with the cell – so when you lookup a cell that is in location -1 (looking 1 cell back) it is not like looking back 1 hour or 1 day – if there are nulls in the data set (or voids) that disrupt the sequence of the data tableau will return the value in in cell without regard to a normal sequence
As to the specific question – Please post it on the Tableau Forums and include your Tableau workbook