Four Unclear Graphs About Healthcare and Employment and How You Can Fix Them

To see the interactive versions of these charts, head over to our blog. If you’re inclined to share, here is the link: http://blog.plot.ly/post/103126614582/four-unclear-graphs-about-healthcare-and-unemployment

In this post, we’ll look at four confusing graphs and implement improvements. We’ll call out principles inspired by data visualization expert Edward Tufte. Check out our tutorials to learn more. Data visualization is hard, and we’re certainly not perfect. Please let us know your own improvements and suggestions on these ideas.

Two principles guide our changes for the first two graphs. First, perspective is important. The range of your chart — how much you zoom in or out on data — can distort or clarify data. Second, use ink to represent your data. Where you can, eliminate unnecessary and distracting lines, ticks, and labels.

Image for post

The downturn above seems dramatic. Starting the y-axis at zero puts the change in context. To save ink for data, we’ll put the title on one line, eliminate extra grid lines, and use light gray for the grid lines we keep. The goal is a balance between noisy graphs and spaghetti graphs, those in which lost lines wander a blank canvas in search of context. Here we place a marker at the start of each quarter to show the changes and to make clear how many samples we have.

Image for post
See the interactive plot
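
For readers who want to try these settings in code, here is a minimal sketch of the changes described above, written as a plain Python dict in Plotly's figure format. The quarter labels and values are made-up placeholders, not the data behind our chart, and the function name is ours. The attribute names (rangemode, gridcolor, showgrid, mode) are standard Plotly figure attributes.

```python
import json

# Placeholder values for illustration only; NOT the data behind the chart above.
quarters = ["Q1", "Q2", "Q3", "Q4"]
values = [18.0, 17.5, 16.9, 15.6]


def zero_based_line_figure(x, y, title):
    """Return a Plotly-style figure dict for a line chart whose y-axis includes zero."""
    trace = {
        "type": "scatter",
        "mode": "lines+markers",  # one marker per sample
        "x": list(x),
        "y": list(y),
    }
    layout = {
        "title": {"text": title},      # keep the title on a single line
        "xaxis": {"showgrid": False},  # drop the vertical grid lines
        "yaxis": {
            "rangemode": "tozero",     # extend the automatic range down to zero
            "gridcolor": "lightgray",  # light gray for the grid lines we keep
        },
    }
    return {"data": [trace], "layout": layout}


figure = zero_based_line_figure(quarters, values, "Placeholder series by quarter")
print(json.dumps(figure["layout"], indent=2))
```

If the plotly package is installed, passing this dict to plotly.graph_objects.Figure should render it.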

Here is a box plot with jitter using the same data. The graph below shows one point for each quarter from our graph above, along with the median, quartiles, and whiskers. We show all the samples beside the box and add some jitter so overlapping points are easier to see. The outlier, 13.4, lies beyond the whiskers, which by the usual convention reach no farther than 1.5 times the interquartile range (IQR) from the box. A bit of opacity makes the markers transparent where they overlap.

Image for post
See the interactive plot
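
To make the box plot arithmetic concrete, here is a rough sketch using Python's statistics module. Only the 13.4 value comes from our chart; every other sample is a made-up placeholder. We assume the common 1.5 × IQR fence rule for flagging outliers, and Plotly's own quartile method may give slightly different numbers. The trace dict at the end uses the standard box trace attributes boxpoints, jitter, pointpos, and marker.opacity.

```python
import statistics

# Placeholder samples; only 13.4 is taken from the text above.
samples = [13.4, 15.6, 16.3, 16.8, 17.1, 17.3, 17.5, 17.7, 18.0]


def box_summary(values):
    """Median, quartiles, and 1.5 * IQR fences for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points outside the fences are drawn individually as outliers.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


summary = box_summary(samples)
print(summary["outliers"])

# Plotly-style box trace: show every sample beside the box, with jitter.
box_trace = {
    "type": "box",
    "y": samples,
    "boxpoints": "all",          # draw all samples, not only outliers
    "jitter": 0.3,               # horizontal spread for overlapping points
    "pointpos": -1.8,            # offset the points to the left of the box
    "marker": {"opacity": 0.6},  # overlapping markers appear darker
}
```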

To combine your plots, “insert data” into a pre-existing plot. Data, graphs, and code to reproduce graphs are automatically stored at the shared plot URL. Graphs can be public or private and collaboratively edited with your team.

Image for post
Image for post

The discrepancy between healthcare enrollment and the goal feels substantial in our next plot. Once again, the y-axis does not start at zero, creating a truncated graph.
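
A quick way to see how much a truncated axis exaggerates a gap is to compare the drawn bar heights with and without the truncation. The sketch below uses round, hypothetical numbers rather than the figures from the chart, and the function name is ours.

```python
def apparent_ratio(smaller, larger, baseline=0.0):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    if baseline >= smaller:
        raise ValueError("baseline must be below both values")
    return (larger - baseline) / (smaller - baseline)


# Hypothetical values for illustration only.
enrolled = 6_000_000
goal = 7_000_000

true_ratio = apparent_ratio(enrolled, goal)                           # axis starts at zero
truncated_ratio = apparent_ratio(enrolled, goal, baseline=5_000_000)  # axis starts at 5M

print(f"zero-based axis: goal bar is {true_ratio:.2f}x as tall")
print(f"truncated axis:  goal bar is {truncated_ratio:.2f}x as tall")
```

With these placeholder numbers, the goal is about 17% larger than enrollment, but a baseline of 5 million draws its bar twice as tall.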

Image for post

Plotly’s defaults start the y-axis at zero, place ticks at each million, abbreviate labels, and use light gray grid lines. Making the bars slightly transparent lets the grid lines show faintly through them. We’re using a slight gap between the bars so they won’t feel crowded.

Image for post
See the interactive plot
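
Here is a sketch of how those bar chart settings could be spelled out explicitly in a Plotly-style figure dict, in case you want to set them yourself rather than rely on defaults. The labels and values are hypothetical, and the specific opacity and gap values are our guesses at "slight", not the settings of the chart above.

```python
def clean_bar_figure(labels, values):
    """Return a Plotly-style figure dict for a zero-based, lightly styled bar chart."""
    trace = {
        "type": "bar",
        "x": list(labels),
        "y": list(values),
        "opacity": 0.8,  # slightly transparent bars
    }
    layout = {
        "bargap": 0.2,  # a slight gap so the bars do not feel crowded
        "yaxis": {
            "rangemode": "tozero",     # make the zero baseline explicit
            "dtick": 1_000_000,        # one tick per million
            "exponentformat": "SI",    # abbreviate large tick labels, e.g. 1M
            "gridcolor": "lightgray",  # light gray grid lines
        },
    }
    return {"data": [trace], "layout": layout}


# Hypothetical values for illustration only.
bar_figure = clean_bar_figure(["Enrolled", "Goal"], [6_000_000, 7_000_000])
print(bar_figure["layout"]["yaxis"])
```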

Plotly’s web-based nature adds interactive opportunities. In our first plot, we added a link to our source instead of typing out a full web URL. Plotly automatically shows data when you hover your mouse over a point, so we don’t need sequential annotations. To highlight or explain an event, or when downloading a static graph, you may still want to add them. Plotly also lets you embed interactive graphs in your blogs and dashboards with a short HTML snippet:

<iframe width="675" height="675" frameborder="0" seamless="seamless" scrolling="no" src="https://plot.ly/~MattSundquist/2326.embed?width=675&height=675"></iframe>

Our next example comes from the Huffington Post.

Image for post
Image for post

We can use K as an abbreviation for thousands on the y-axis to save ink and enhance readability. Flipping the chart to a horizontal layout and adding a wide left margin creates space to combine labels and data. We forgo the lines around the chart and eliminate annotations. Placing data in descending order rather than alphabetical order makes visual comparison easier. Again, taking advantage of Plotly’s online options, we’re showing both the numbers and the percentages in the hover text.

Image for post
See the interactive plot
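
The data preparation behind a chart like this can be sketched in a few lines: sort in descending order, abbreviate thousands with K, and build hover text with both the number and its share of the total. The category names and counts below are hypothetical, and the helper names are ours. The trace and layout keys (orientation, hovertext, hoverinfo, margin, autorange) are standard Plotly attributes.

```python
def abbreviate_thousands(n):
    """Format a count in thousands, e.g. 48000 -> '48K'."""
    return f"{n / 1000:g}K"


def ranked_horizontal_bars(counts, left_margin=200):
    """Return a Plotly-style figure dict with bars sorted from largest to smallest."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    labels = [label for label, _ in ranked]
    values = [value for _, value in ranked]
    hover = [
        f"{label}: {abbreviate_thousands(value)} ({value / total:.1%})"
        for label, value in ranked
    ]
    trace = {
        "type": "bar",
        "orientation": "h",   # horizontal bars
        "x": values,
        "y": labels,
        "hovertext": hover,   # number and percentage on hover
        "hoverinfo": "text",
    }
    layout = {
        "margin": {"l": left_margin},         # wide left margin for long labels
        "yaxis": {"autorange": "reversed"},   # put the first (largest) bar on top
        "xaxis": {"showline": False},
    }
    return {"data": [trace], "layout": layout}


# Hypothetical counts for illustration only.
counts = {"Category A": 12_500, "Category B": 48_000, "Category C": 30_500}
ranked_figure = ranked_horizontal_bars(counts)
for line in ranked_figure["data"][0]["hovertext"]:
    print(line)
```

Reversing the y-axis is a design choice: Plotly draws the first category at the bottom of a horizontal bar chart, so without it the largest bar would sit at the bottom.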

At a casual glance at this 3D pie chart, the 19.5% market share Apple holds seems larger than the 21.2% slice in the background.

Image for post

A bar chart with slight opacity, a light border around the pastel bars, and thin, light grid lines allows side-by-side comparison.

Image for post
See the interactive plot

For sequential comparisons of segmented events, consider a stacked bar chart. Be careful: plotting more than a few traces can feel crowded. The example above is not well suited. For a few traces, it can be useful:

Image for post
See the interactive plot
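
A stacked bar chart like the one above comes down to one bar trace per segment plus barmode set to "stack" in the layout. Here is a minimal sketch with hypothetical series. The max_traces guard is our own arbitrary threshold for "more than a few", not a Plotly feature.

```python
def stacked_bar_figure(categories, series, max_traces=4):
    """Return a Plotly-style stacked bar figure dict with one trace per series."""
    if len(series) > max_traces:
        # Arbitrary guard: many stacked segments tend to feel crowded.
        raise ValueError(f"{len(series)} traces is probably too many to stack")
    traces = [
        {"type": "bar", "name": name, "x": list(categories), "y": list(values)}
        for name, values in series.items()
    ]
    return {"data": traces, "layout": {"barmode": "stack"}}


# Hypothetical values for illustration only.
periods = ["Period 1", "Period 2", "Period 3"]
segments = {
    "Segment A": [10, 12, 15],
    "Segment B": [5, 7, 6],
}
stacked = stacked_bar_figure(periods, segments)
print([trace["name"] for trace in stacked["data"]])
```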

If needed, Plotly offers varied line types, marker options, and opacity settings.

Image for post
See the interactive plot
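
As a rough illustration of those options, the sketch below cycles through a few dash styles and marker symbols so that each series is distinguishable by more than color. The series names and values are hypothetical. The values "solid", "dash", "dot", "circle", "square", and "diamond" are standard Plotly choices for line.dash and marker.symbol.

```python
DASHES = ("solid", "dash", "dot")
SYMBOLS = ("circle", "square", "diamond")


def styled_line_traces(x, series, opacity=0.7):
    """Return Plotly-style scatter traces, each with its own dash and marker symbol."""
    traces = []
    for index, (name, values) in enumerate(series.items()):
        traces.append({
            "type": "scatter",
            "mode": "lines+markers",
            "name": name,
            "x": list(x),
            "y": list(values),
            "opacity": opacity,
            "line": {"dash": DASHES[index % len(DASHES)]},
            "marker": {"symbol": SYMBOLS[index % len(SYMBOLS)]},
        })
    return traces


# Hypothetical values for illustration only.
line_traces = styled_line_traces(
    [1, 2, 3],
    {"Series A": [3, 4, 6], "Series B": [2, 5, 5], "Series C": [1, 2, 4]},
)
print([(t["name"], t["line"]["dash"], t["marker"]["symbol"]) for t in line_traces])
```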

Head to the interactive version and hover over the marker points to see their names.

Image for post
See the interactive plot
Image for post
See the interactive plot

Your feedback and thoughts are welcome. If you liked what you read, check us out @plotlygraphs or consider sharing this piece. Happy plotting.

