Parallel sets are a new way to analyze categorical data such as gender, age group or product segment. They are particularly well suited to answering one question: how many members of a category in column A also belong to a category in column B?
Let’s take this table, the canonical example used for parallel sets: statistics about Titanic survivors.
And turn it into a parallel sets chart:
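To make that question concrete, here is a minimal sketch in Python of the counting that sits underneath the chart. The rows, the column choice (gender and product segment) and the helper name `pair_counts` are all made up for illustration; they are not taken from any real dataset.

```python
from collections import Counter

# Hypothetical rows: (gender, product_segment). Illustrative values only.
rows = [
    ("female", "premium"),
    ("female", "basic"),
    ("male", "basic"),
    ("male", "basic"),
    ("female", "premium"),
    ("male", "premium"),
]


def pair_counts(rows, a_index, b_index):
    """Count how often each (A value, B value) pair occurs in the rows."""
    return Counter((row[a_index], row[b_index]) for row in rows)


counts = pair_counts(rows, 0, 1)

# Each count corresponds to the width of one ribbon between the two axes.
for (gender, segment), n in sorted(counts.items()):
    print(f"{gender:>6} -> {segment:<7} {n}")
```

Each pair count is what a ribbon between two adjacent axes encodes as its width, so a parallel sets chart is essentially this table of counts drawn as bands.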
Start at the top of the visualization and you can see that about two-thirds of the people on the Titanic perished, a very large percentage of them male. Moving down the chart’s far left side, you see that of the females who died, almost all were adults, primarily because there were proportionately very few children on the Titanic. And of the female adults who perished, an eyeball-estimated 85-90% had cabins in 3rd class, a limited few were from 2nd class, and an almost undetectable number were 1st class passengers or crew members. Hmmm. Again, ladies first, as long as you’re rich or work here!
Perhaps this wasn’t as biased as the above statistics suggest? Perhaps rescue from the 3rd class section of the boat was just logistically more difficult? That doesn’t seem to be the case, since the men who perished appear quite evenly split across the three cabin classes. There is lots of good stuff in this visualization. Top-down or bottom-up, just choose a path; most lead to fascinating conclusions.
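Following a path through the chart amounts to filtering step by step and asking what share of the previous group remains. The sketch below shows that idea in Python. The counts are invented for illustration and are NOT the real Titanic figures, and `share_along_path` and `COLUMNS` are hypothetical names chosen for this example.

```python
from collections import Counter

# Made-up counts keyed by (survived, sex, age, class).
# These are NOT the real Titanic figures; they only illustrate the mechanics.
counts = Counter({
    ("no", "female", "adult", "3rd"): 8,
    ("no", "female", "adult", "2nd"): 1,
    ("no", "female", "child", "3rd"): 1,
    ("no", "male", "adult", "1st"): 10,
    ("no", "male", "adult", "3rd"): 10,
    ("yes", "female", "adult", "1st"): 12,
    ("yes", "male", "adult", "1st"): 3,
})

COLUMNS = ("survived", "sex", "age", "class")


def share_along_path(counts, path):
    """For each (column, value) step, return the share of the previous
    subset that also matches that value."""
    shares = []
    subset = dict(counts)
    for column, value in path:
        i = COLUMNS.index(column)
        before = sum(subset.values())
        # Narrow the subset to the rows that match this step of the path.
        subset = {key: n for key, n in subset.items() if key[i] == value}
        after = sum(subset.values())
        shares.append(after / before if before else 0.0)
    return shares


path = [("survived", "no"), ("sex", "female"), ("age", "adult"), ("class", "3rd")]
for (column, value), share in zip(path, share_along_path(counts, path)):
    print(f"{column}={value}: {share:.0%} of the previous group")
```

Reading bottom-up is the same computation with the path reversed, which is why different starting points can tell different stories from the same data.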
So, the parallel set is a new arrow in your quiver for categorical data. As with most visualizations, it has advantages and drawbacks. With high-cardinality columns it becomes a bit messy, and it takes time to learn how to read it quickly. On the other hand, it’s very interesting to use with low-cardinality columns. Extra bonus: it looks intriguing and may draw more engagement on your dashboards.
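A rough way to see why high cardinality gets messy is to count how many ribbons the chart might have to draw. The sketch below assumes a layout in which every combination of categories seen so far gets its own ribbon; that is an assumption about the rendering, and `max_ribbons` is a hypothetical helper, not part of any charting library.

```python
from math import prod


def max_ribbons(cardinalities):
    """Upper bound on the ribbons between each pair of adjacent axes,
    assuming every combination of categories so far gets its own ribbon."""
    return [prod(cardinalities[: i + 1]) for i in range(1, len(cardinalities))]


# Four low-cardinality axes, e.g. survived (2), sex (2), age (2), class (4).
print(max_ribbons([2, 2, 2, 4]))

# Swapping in a single 50-value column makes the bound grow quickly.
print(max_ribbons([2, 50, 4]))
```

Real data rarely hits the upper bound, since many combinations never occur, but the product still gives a quick feel for when a column is too fine-grained to chart this way.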