So much of the data science community focuses on “data storytelling.” We tell ourselves we’re weaving threads of data into beautiful tapestries, then spinning these yarns to audiences on the edge of their seats, hungry for just another helping of data-driven insight.
But the truth is most practical data science “storytelling” is more like reporting. You’re collecting information, synthesizing it, and sending it back out edited and focused for your audience. In some cases these presentations can be investigative exposés, but in many cases they’re more akin to sportswriting: this is what happened (or will likely happen) and some interesting stats. There’s no drama, no characters, no flowery details. In most cases you’re presenting to a business team who wants your insights, quickly, and nothing more.
As a former editor building a career in data science, and a professional who has already sat through a few too many eye-glazing presentations, I want to share some tips on communicating your ideas quickly and efficiently.
1. An Eighth-Grade Data Level
Yes, this is the classic “know your audience” bit. Journalists in the U.S. are generally taught to write at an eighth-grade reading level. For data we must do the same, scaling our technical jargon down to a reasonable base level.
Just as journalists assume the average person knows a couple of ten-dollar words, a data scientist can generally expect that their audience has a decent grounding in the simplest data terms. They will likely have heard of machine learning and linear regression, but perhaps not be familiar with ARIMA or Markov chain Monte Carlo algorithms.
This is a general rule, however, and will vary by audience. If you are presenting to or expecting to be read by other data professionals, you can of course load up your work with the juicy details and fancy packages you used. Conversely, if your audience’s data literacy is on the lower end, you can knock your explanations down another level or two to match. If in doubt, go with the lower level; you can always explain in greater detail later on.
2. The Inverted Pyramid
News stories are structured to get information across quickly. You don’t need to read the entire article to know that the stock market dropped 200 points, but you can read on to learn why and how. Your work should be structured the same way.
When relaying information we often mirror our own thought processes: I thought this, then this, then that, and now I’ve arrived at this. This is ineffective and often hard to follow. In journalism we call it “burying the lede.” By sharing the most important details first, your audience is better able to contextualize the supporting information that comes after. Who, what, where, when, why, and how?
In data science this will typically just be the “what” and “why.” State what you found or what you are recommending, what the impact of that conclusion is, and, briefly, how you arrived at it. After that you can delve deeper into the “how” as your audience’s technical level and your time allow.
Details should flow from most important to least, in a sort of inverted pyramid of importance, with the very least important information you want to include at the end.
3. Trim the Fat
“Be sincere, be brief, be seated.”
– Franklin D. Roosevelt
“Brevity is the soul of wit.”
– William Shakespeare
The world is full of colorful expressions that all mean the same thing: keep it short and simple. Your inverted pyramid does not have to be the Great Pyramid.
In journalism this means excluding anything that doesn’t add to a reader’s understanding of the story. The reader doesn’t need to know that it was a cold day in the capital unless someone slipped on the ice. In data science the same holds true, not just for written or spoken details but for visuals as well.
Exclude anything – any tools, any techniques, any data points – that is not strictly necessary to making your point. For visualizations, this means trimming your timeframe to only the relevant dates, and cleaning out unnecessary labels, dimensions, or annotations. It means making your charts as simple as they can possibly be while still telling your story accurately. If you can, limit the bulk of your argument to one or two perfect, highly impactful visualizations.
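If you prepare your charts in code, the trimming step might look something like the sketch below, written in plain Python. The records, the field names, and the `trim_for_chart` helper are all invented for illustration; the idea is simply to cut the data down to the relevant dates and fields before anything reaches a chart.

```python
from datetime import date

# Hypothetical daily metrics. Every name and number here is made up for illustration.
records = [
    {"day": date(2023, 2, 27), "signups": 118, "region": "NA", "server_temp_c": 41},
    {"day": date(2023, 3, 1), "signups": 121, "region": "NA", "server_temp_c": 40},
    {"day": date(2023, 3, 15), "signups": 64, "region": "NA", "server_temp_c": 43},
    {"day": date(2023, 3, 31), "signups": 119, "region": "NA", "server_temp_c": 39},
    {"day": date(2023, 4, 2), "signups": 122, "region": "NA", "server_temp_c": 42},
]


def trim_for_chart(rows, start, end, keep=("day", "signups")):
    """Keep only rows dated within [start, end] and only the fields the chart needs."""
    return [
        {field: row[field] for field in keep}
        for row in rows
        if start <= row["day"] <= end
    ]


# Suppose the story is the mid-March dip: show March only, and only signups.
trimmed = trim_for_chart(records, date(2023, 3, 1), date(2023, 3, 31))
```

Handing `trimmed` to whatever charting tool you use means the extra dates and columns never get a chance to clutter the visual.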
I’ll take my own advice and close out here. Next time you’re relaying what you’ve found through your data science wizardry, remember the rules above. Your audience will thank you.