
Critical importance of data visualization

 

 

10214063904_6c6cfc7dfc_z

photo by David Blackwell.

Have you ever been in a meeting where you bring in your statspack or AWR report, all 30 pages of it, point out some glaring issues that anybody could see, and propose some precise solutions, only to have the management team’s eyes glaze over? Then, after you finish your pitch, they all start arguing about what the problem might be, despite your clear presentation of the problem and the solution.
Have you ever had that same meeting with a printout of Top Activity from Oracle Enterprise Manager, with its load graph of average active sessions and its breakdown of where the load comes from in terms of CPU and waits and what the top SQL and sessions are, and then you explain the problem and solution and they all nod their heads?
Clear presentation of data using graphics is critical to how fast people can understand the information and how comfortable they are interpreting it.

 

Screen Shot 2013-12-30 at 12.04.52 PM

 

Edward Tufte wrote a seminal analysis of the decision to launch the space shuttle Challenger on January 28, 1986. Some have been critical of the analysis, but for reasons that are orthogonal to what I find important. What I find important is the shockingly huge impact the presentation format of data can have on the viewer’s interpretation.

On the night before the shuttle launch, the engineers who designed the solid rocket boosters were concerned that it would be too cold to launch. Cold was an issue because the joints in the solid rocket boosters were sealed with O-rings made of a type of rubber that becomes stiffer the colder it is. As the rubber became stiffer, its ability to seal the joints declined, which increased the danger of solid rocket fuel burning through.

Screen Shot 2013-12-30 at 12.05.04 PM

 

The engineers stayed up late putting together information and faxing it to launch control in Florida. They were concerned and were trying to prevent the launch the next day. They had information about the damage to the solid rocket boosters from previous flights: after each launch, the boosters fell back to the ocean, where they were collected and analyzed for damage. The engineers used this data to show that, in past launches, damage to the solid rocket boosters had been related to temperature.

Here is a fax showing the “History of O-Ring damage on SRM field joints”:

Screen Shot 2013-12-30 at 12.05.13 PM

The first problem, as Tufte points out, is that this fax uses three different naming conventions for the data from previous launches, which is confusing. Circled in red are the three naming conventions: date, flight number, and SRM number.

Screen Shot 2013-12-30 at 12.05.22 PM

The fax gives overwhelmingly detailed information on the damage but no information on the temperatures, even though the goal was to show a correlation between temperature and damage.

The next fax shows temperatures, but it is missing many of the damaged flights, and it includes damage from test firings in the desert, which were fired horizontally rather than vertically and without the same stresses as an actual flight.

Screen Shot 2013-12-30 at 12.06.07 PM

 

Finally, the inclusion of tangential data and the exclusion of other data led to the comment that there was damage at both the hottest flight and the coldest flight.

Screen Shot 2013-12-30 at 12.10.14 PM

 

But the conclusions in the faxes were clear: the estimated temperature at launch was 29 to 38 degrees Fahrenheit, and the shuttle should not be launched below 53 degrees.

Screen Shot 2013-12-30 at 12.10.35 PM

If we take the data that was faxed and plot the number of damage incidents against the temperature at which they occurred, we get a graph like this:

Screen Shot 2013-12-30 at 12.11.43 PM
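To make the weakness of this count-only view concrete, here is a minimal sketch of how such a chart is built. The flight records are made-up placeholders, not the values from the faxes, and the names `flights` and `damage_counts` are hypothetical. Note how a flight with no damage simply vanishes from the result.

```python
from collections import Counter

# Hypothetical (temperature_F, had_damage) records, for illustration only.
# These are NOT the real values from the faxes.
flights = [
    (50, True),
    (60, True),
    (65, False),
    (70, True),
    (70, False),
    (75, True),
]

def damage_counts(records):
    """Count damage incidents per temperature, ignoring undamaged flights."""
    return Counter(temp for temp, damaged in records if damaged)

counts = damage_counts(flights)

# Crude text chart: one '#' per damage incident at each temperature.
for temp in sorted(counts):
    print(f"{temp:>3} F  {'#' * counts[temp]}")
```

Because only damaged flights are counted, the undamaged flights at 65 and 70 degrees leave no trace in the chart, which is exactly the information a reader would need to judge the trend.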

 

Based on this information do you think there is a correlation between temperature and damage? Would you have launched the shuttle the next day? Remember that there was tremendous pressure to launch the next day.

Well, they did launch, and the rest is history. As seen in the picture below, there is a white flame coming from one of the O-rings in the solid rocket booster. This flame burned into the liquid fuel and the space shuttle exploded.

Screen Shot 2013-12-30 at 12.11.10 PM

 

It was a national tragedy, which led to a congressional investigation. As part of that investigation, the information was drawn up into graphics. The graphics were actually worse than the original faxes because they introduced so much chart junk.

Screen Shot 2013-12-30 at 12.11.18 PM

Screen Shot 2013-12-30 at 12.11.36 PM

 

OK, let’s look back at the original data:

Screen Shot 2013-12-30 at 12.11.43 PM

Now let’s take that data and change the y-axis to represent not a simple count of damage incidents but a scale of how bad the damage was, and we get:

Screen Shot 2013-12-30 at 12.11.51 PM

 

Now include the flights that had no damage, a major piece of information, which already makes a huge difference:

Screen Shot 2013-12-30 at 12.11.57 PM

 

Now mark damage of a different type in a different color; the only such case is the one that occurred at 75 degrees:

Screen Shot 2013-12-30 at 12.12.04 PM

Now, at 70 degrees there were both successes and failures, so normalize (average) the damage there:

Screen Shot 2013-12-30 at 12.12.11 PM
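The last few steps (a severity scale instead of a count, keeping the undamaged flights, and averaging where several flights share a temperature) can be sketched as follows. This is only an illustration: the severity numbers are invented placeholders rather than the real damage scores, and `mean_severity_by_temperature` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (temperature_F, damage_severity) pairs, for illustration only.
# Severity 0 means the flight came back with no damage.
flights = [
    (50, 9),
    (60, 3),
    (65, 0),
    (70, 4),
    (70, 0),
    (70, 0),
    (75, 0),
    (80, 0),
]

def mean_severity_by_temperature(records):
    """Average the damage severity over all flights at each temperature."""
    by_temp = defaultdict(list)
    for temp, severity in records:
        by_temp[temp].append(severity)
    # Sort by temperature so the result reads left to right like the x-axis.
    return {temp: sum(vals) / len(vals) for temp, vals in sorted(by_temp.items())}

averaged = mean_severity_by_temperature(flights)

for temp, severity in averaged.items():
    print(f"{temp:>3} F  mean severity {severity:.2f}")
```

Keeping the zero-severity flights in the input is the important design choice: they pull the average down at the warmer temperatures and give the cold flights something to be compared against.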

 

Now we are starting to see some important information:

 

Screen Shot 2013-12-30 at 12.13.16 PM

 

We are also starting to see a stronger indicator of correlation:

Screen Shot 2013-12-30 at 12.13.24 PM

 

But probably the most important piece of information is still missing – the temperature at which the launch the next day would take place:

Screen Shot 2013-12-30 at 12.13.30 PM

 

X marks the spot of the predicted launch temperature for the next day, January 28, 1986. The launch the next day was well outside the known world. It was so far out that the leap away from the known world was almost as big as the known world of data itself.
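One rough way to quantify "how far outside the known world" is to compare the gap between the forecast and the nearest observed temperature with the width of the observed range. The sketch below does that. The forecast values of 29 and 38 degrees come from the faxes quoted above; treating 53 degrees as the coldest previous launch and 81 degrees as the warmest are assumptions made for illustration, and `extrapolation_ratio` is a hypothetical helper.

```python
def extrapolation_ratio(observed_temps, forecast_temp):
    """Return how far the forecast lies outside the observed range,
    expressed as a fraction of the width of that range (0 if inside)."""
    low, high = min(observed_temps), max(observed_temps)
    span = high - low
    if span == 0:
        raise ValueError("observed temperatures must cover a nonzero range")
    if forecast_temp < low:
        gap = low - forecast_temp
    elif forecast_temp > high:
        gap = forecast_temp - high
    else:
        gap = 0
    return gap / span

# Assumed endpoints of the observed launch temperatures (illustrative only).
observed = [53, 81]

# Forecast range for launch day, as stated in the faxes.
for forecast in (29, 38):
    ratio = extrapolation_ratio(observed, forecast)
    print(f"forecast {forecast} F is {ratio:.2f} range-widths outside the data")
```

A ratio near 1 means the prediction sits about as far from the data as the data is wide, which is the situation the X on the chart is meant to convey.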

In summary

  • NASA engineers, the guys that blew us away putting a man on the moon, can still fail at communicating data clearly.
  • Congressional investigators, some of the top lawyers in the country, can still fail at communicating data clearly.
  • Data visualization seems obvious, but it is difficult.

But lack of clarity can be devastating.

Further reading

 

stevej

photo by Steve Jurvetson
