This chart has many names
Butterfly | Pyramid | Bikini
Take your pick!
This #WOW2020 challenge from Lorna Brown is so great! It’s an easy, quick introduction to a complex-looking chart but is fairly straight forward. It’s known as butterfly, pyramid, bikini and diverging bar chart. This chart allows the user to see trends and variations for a binary dimension along a higher cardinality dimension. It’s most common to see this chart type showing gender proportions by age bins. Just like in this example.
Continue reading for the step-by-step tutorial! OR if you prefer something a little more interactive – check out the video I made during my weekly live stream on 07.14.2022
Shaping the data
The first thing we have to do is get the data in the right format – you can see when we bring it in, it’s got 4 rows of junk. We can fix this by checking the “Data Interpreter” box in the data source window. When we do that. Tableau will examine the datasource and find a tidy table for you. It’s a pretty slick feature and is really useful when you download open public data excel files from around the internet.
Let's build the chart
//MALE POP
sum(if [Gender] = "Males" then [Population] else 0 END)
//FEMALE POP
sum(if [Gender] != "Males" then [Population] else 0 END)
Now that we’ve got those two calculations we can start to build our pyramid. Take the [Age] pill and add it to rows, then add both of these new fields to the columns shelf as separate pills but DO NOT make them a dual axis. We should get something like this
Well this doesn’t look quite right. But we can fix that but editing axis of the the [Male Pop] field. and simply reverse it. While we’re here, let’s go ahead and add color to our bars as well. Now we should haves something like this
This is looking pretty good. Now we just need to create a new calculation for the total population. Lorna mentioned using an LoD expression. There are a couple ways we could do it. For this solution, I’m going to use an exlude expression to disregard gender anywhere in the view for this calculation.
//Total Population
{ EXCLUDE [Gender]:SUM([Population])}
From here we can add this as the primary axis for both sides of our pyramid, synchronize and format the colors to get this
But then i started to think about it and the LOD isn’t even needed because the we aren’t doing any filtering on this sheet and we want to show the total population at the age group level. That is just the datasource column of Population…
NO LOD NECESSARY!
And that brings me to my last point because all that’s left here is to format and clean up tooltips – don’t over-engineer your solutions. Be skeptical, try things out. Don’t immediately jump to using a solution that could result in a lack of performance.
Now with that…
GO FORTH AND VIZ