1. Background
In recent times, data visualization has become a popular tool for analysis and planning. Numerous new visualizations present data in different ways to make fact finding easier.
These benefits come with a side effect: visualizations are getting more and more complex and generally need further information (in textual, audio or visual format) to explain the facts so that the user can draw proper insights.
Dynamic visualizations, where transitions are used to show the changes in data over a period of time, make the problem worse. As the underlying visualization changes continuously, we either need to ‘pause’ the transition or again provide audio or separate text to support the data presented. A few examples of complex visualizations are given below: (Ref: http://www.visualcomplexity.com/vc/)
Use Case
Consider the chart shown below; at a high level it shows the per-capita income of various countries at different points in time. As you will see, over time the data points move towards the top-right corner.
Although attentive users will get some insights from the axis labels (the static information) in the chart, further insights about a particular instant of the dynamic visualization will be difficult to obtain and are often lost entirely.
For example:
Obvious insights:
- During the 1970s and 80s, all but a few countries had low per-capita income
- Over the last few decades, more countries have achieved better per-capita income
Further insights:
- Even though the already developed countries are still growing, their growth rate is much slower than that of the developing nations
- The difference between the highest-earning and lowest-earning nations is still almost the same
- Nation X has made the most progress, etc.
With a bit more complexity in the data or the visualization, it can be almost impossible to convey the underlying message using visual cues alone.
Current Solutions and Associated problems
The current solutions partially solve the problem in one of the following ways:
- Providing separate text outside the visualization:
  - Not user friendly.
  - When sharing the visualization, we need to share both the visualization and the accompanying information.
  - Not real time. If the data changes, we need to change the text accordingly.
- Providing text or an annotation on the visualization itself; in other words, adding text on top of a snapshot of a particular visualization. For example, on any chart I can add a text box and put an insight in it.
  - For static visualizations: although this solves the problem of keeping the insight with the visualization, the insight is still static. If the data changes, we need to go back and write new insights. If we want to reuse a similar insight for the same event in some other data, it can’t be done.
  - For dynamic visualizations: currently there is no feasible way to add an annotation at a particular instant of the visualization.
- Using third-party tools to embed text/audio commentary on the visualization: for data-based visualizations, this means recording the screen and adding the labels on top.
  - Because we have recorded the screen (or taken a snapshot), the visualization is no longer attached to the data itself. Hence the insight can’t be reused in some other chart or data. For any new data we need to create the visualization again and reuse the tool to add the insight.
  - Extra time and effort are required to do the recording and embed the text/audio.
  - Third-party tools are needed to achieve this.
2. Proposed Solution to the Problem
In recent times, many grammar-based visualization engines have been introduced in the data visualization field (Vega (http://trifacta.github.io/vega/editor/), IBM Rave, etc.). These engines allow the user to define the visualization using a predefined template.
The solution proposes to embed insights in the visualization by attaching trigger points and their related insights to the template used to create the visualization.
Based on the trigger, we can show more information about the underlying data in the form of text, audio, graphics or other means. The triggers can be:
- Event based: “when the current visualization is showing the data for 2001” or “when the data for X axis goes beyond the value 1000”, show “…..”.
- Time based: when we are in the 3rd minute of the transition.
Due to this dynamic nature of the attached information, we can:
- change the underlying data while keeping the same insight and the same visualization (text or audio)
- change the insight for the same or different underlying data.
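The trigger idea above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: `Trigger`, `Template`, `active_insights` and the keys of the `state` dictionary are hypothetical names invented here, not part of Vega, IBM Rave or any other engine, and the insight strings are placeholders. "3rd minute" is read as 120 to 180 seconds of elapsed transition time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical structures for illustration; no real engine exposes these names.
@dataclass
class Trigger:
    condition: Callable[[Dict], bool]  # receives the current visualization state
    insight: str                       # placeholder text shown when the condition holds


@dataclass
class Template:
    name: str
    triggers: List[Trigger] = field(default_factory=list)

    def active_insights(self, state: Dict) -> List[str]:
        """Return the insights whose trigger condition holds for this state."""
        return [t.insight for t in self.triggers if t.condition(state)]


template = Template(
    name="income-over-time",
    triggers=[
        # Event based: the visualization is currently showing the data for 2001.
        Trigger(lambda s: s["year"] == 2001, "Showing data for 2001"),
        # Event based: a value on the X axis goes beyond 1000.
        Trigger(lambda s: max(s["x_values"]) > 1000, "X axis value exceeds 1000"),
        # Time based: the transition is in its 3rd minute (assumed 120 s to 180 s).
        Trigger(lambda s: 120 <= s["elapsed_seconds"] < 180,
                "Third minute of the transition"),
    ],
)

state = {"year": 2001, "x_values": [200, 450, 1200], "elapsed_seconds": 30}
print(template.active_insights(state))
```

Because the triggers live with the template and not with a particular data set, the same `template` object can be evaluated against a different `state`, and a trigger's `insight` can be swapped without touching the data.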
3. Benefits
The proposed solution will allow business users to:
- gain more insights from a visualization.
- embed more insights for end users.
- as the insights are embedded, we can increase the complexity of the visualizations without fear of overwhelming the end user.
- as the insights are embedded in the visualization and can be attached to the data, this adds the capability to:
  - show different insights based on the same data
  - show similar insights for different sets of data
4. Sample Implementation and Flow
This type of system can be easily built using the popular Grammar of Graphics or a similar system. A sample flow:
Sample Visualization Template:
Execution Example:
Condition: Difference between max and min value > 200% of average
:- Insight: There is a huge difference between the best-performing and worst-performing nations.
Note: The insight above is shown only if the condition is met. So for any other year’s data, if the difference is not as large, the text will not be shown.
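The rule above could be evaluated roughly as follows. This is a sketch under assumptions: "200% of average" is read as 2.0 times the arithmetic mean, the function name `spread_insight` is hypothetical, and the income figures are made up purely to exercise both branches.

```python
from statistics import mean
from typing import List, Optional

INSIGHT = ("There is a huge difference between the best-performing "
           "and worst-performing nations.")


def spread_insight(values: List[float], threshold: float = 2.0) -> Optional[str]:
    """Return the insight if (max - min) exceeds threshold * average, else None."""
    average = mean(values)
    if max(values) - min(values) > threshold * average:
        return INSIGHT
    return None


# Made-up figures, for illustration only.
wide_year = [500, 800, 1200, 30000]  # spread 29500, average 8125: condition met
narrow_year = [9000, 10000, 11000]   # spread 2000, average 10000: condition not met

print(spread_insight(wide_year))
print(spread_insight(narrow_year))
```

The first call prints the insight text and the second prints `None`, which corresponds to the note that the text is not shown when the difference is not large enough.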
5. Going one step ahead
Once the system is in place, it will be easy to build an implementation in which several insight triggers are attached to the templates based on rules, and over time these triggers will build up. It is quite possible that, on producing a visualization for some set of data, the user will be overwhelmed by a multitude of such insights (because several triggers fire at once). To help the end user, the system can accept a ‘filter’ input that filters the insights shown to the user. In the example above, if the user is interested only in the progress of third world countries, the corresponding input can be provided, and the system will then filter the insights using a simple text-based comparison.
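One possible reading of the "simple text based comparison" is a case-insensitive substring match on each word of the filter. The sketch below assumes that reading; `filter_insights` is a hypothetical helper, and the sample insight about third world countries is made up for illustration.

```python
from typing import List


def filter_insights(insights: List[str], filter_text: str) -> List[str]:
    """Keep the insights that contain every word of the filter (case-insensitive)."""
    words = filter_text.lower().split()
    return [i for i in insights if all(w in i.lower() for w in words)]


# Sample insights; the second one is invented for this example.
insights = [
    "Developed countries are growing more slowly than developing nations.",
    "Third world countries have made steady progress.",
    "Nation X has made the most progress.",
]

print(filter_insights(insights, "third world"))
```

With this matching rule an empty filter keeps every insight, so the unfiltered view remains the default. A real system would likely need something more forgiving than substring matching, for example to handle synonyms.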