1. Background
In recent times, data visualization has become a popular choice for analysis and
planning. Numerous new visualizations present data in different ways to make
fact finding easier.
But these benefits come with a side effect: visualizations are getting more and
more complex and generally need further information (in textual, audio or visual
form) to explain the facts so that the user can draw proper insights.
Dynamic visualizations, where transitions are used to show changes in data over
a period of time, aggravate the problem further. As the underlying visualization
changes continuously, we need either to ‘pause’ the transition or to provide
audio or separate text to support the data presented. Examples of complex
visualizations can be found at http://www.visualcomplexity.com/vc/.
Use Case
Consider the chart shown below; at a high level, it shows the per-capita
income in various countries at different points in time. As you will see, over
time the data points move towards the top-right corner.
[Chart: per-capita income in various countries at different points in time]
Although attentive users will get some insights from the axis labels (the static
information) in the chart, further insights about a particular instant of the
dynamic visualization will be difficult to obtain or, often, lost entirely.
Examples of such insights:
- During the 1970s and 80s, all but a few countries had low per-capita income
- Over the last few decades, more countries have achieved better per-capita income
- Although the already developed countries are still growing, their growth rate is much slower than that of the developing nations
- The difference between the highest-earning and lowest-earning nations is still almost the same
- Nation X has made the most progress, etc.
With a bit more complexity in the data or the visualization, it can be almost
impossible to convey the underlying message using visual cues alone.
Current Solutions and Associated Problems
Current solutions address the problem only partially, in one of the following ways:
- Providing separate text outside the visualization:
  - Not user friendly.
  - When sharing the visualization, both the visualization and the accompanying information must be shared.
  - Not real time: if the data changes, the text must be changed accordingly.
- Providing text or an annotation on the visualization itself, that is, adding text on top of a snapshot of a particular visualization. For example, a text box containing an insight can be added to any chart.
  - For static visualizations: although this keeps the insight with the visualization, the insight is still static. If the data changes, we need to go back and enter new insights. A similar insight cannot be reused at the same event for other data.
  - For dynamic visualizations: there is currently no feasible way to add an annotation at a particular instant of the visualization.
- Using third-party tools to embed text or audio commentary in the visualization. For data-based visualizations, this means recording the screen and adding labels on top.
  - Because the screen has been recorded (or a snapshot taken), the visualization is no longer attached to the data itself. Hence the insight cannot be reused with another chart or data set. For any new data, the visualization must be recreated and the tool used again to add the insight.
  - Extra time and effort are required to do the recording and to embed the text or audio.
  - Third-party tools are needed.
2. Proposed Solution to the Problem
In recent times, many grammar-based visualization engines have been introduced
in the data visualization field, for example Vega
(http://trifacta.github.io/vega/editor/) and IBM Rave. These engines allow the
user to define the visualization using a predefined template.
The proposed solution embeds insights in the visualization by attaching trigger
points and their related insights to the template used to create the
visualization.
Based on the trigger, we can show more information about the underlying data in
the form of text, audio, graphics or other means. The triggers can be:
- Event based: “when the current visualization is showing the data for 2001”, or “when the data for the X axis goes beyond the value 1000”, show “…..”.
- Time based: when we are in the 3rd minute of the transition.
Due to this dynamic nature of the attached information, we can:
- change the underlying data while keeping the same insight (text or audio) and the same visualization
- change the insight for the same or different underlying data
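The idea of attaching triggers to a template can be sketched as follows. This is only an illustrative sketch in Python: the `Trigger` and `Template` classes, the state keys (`year`, `x_max`, `elapsed_seconds`), the spec fields and the insight texts are all hypothetical and are not part of Vega, IBM Rave or any other engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical structures for illustration; no real engine defines these.
State = Dict[str, float]


@dataclass
class Trigger:
    condition: Callable[[State], bool]  # evaluated against the current state
    insight: str                        # text shown when the condition holds


@dataclass
class Template:
    spec: Dict[str, str]                          # the visualization definition
    triggers: List[Trigger] = field(default_factory=list)


def active_insights(template: Template, state: State) -> List[str]:
    """Return the insights whose triggers fire for the given state."""
    return [t.insight for t in template.triggers if t.condition(state)]


template = Template(
    spec={"mark": "point", "x": "per_capita_income"},  # assumed field names
    triggers=[
        # Event based: the visualization is showing the data for 2001.
        Trigger(lambda s: s["year"] == 2001, "Showing data for 2001."),
        # Event based: the data for the X axis goes beyond the value 1000.
        Trigger(lambda s: s["x_max"] > 1000, "X-axis value has gone beyond 1000."),
        # Time based: third minute of the transition (120 s up to 180 s).
        Trigger(lambda s: 120 <= s["elapsed_seconds"] < 180,
                "Third minute of the transition."),
    ],
)

if __name__ == "__main__":
    state = {"year": 2001, "x_max": 500.0, "elapsed_seconds": 10.0}
    for insight in active_insights(template, state):
        print(insight)
```

Because the triggers are stored with the template and not with a rendered image, the same template can be evaluated against a different state or data set, and the insight texts can be changed without touching the data.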
3. Benefits
The proposed solution will allow business users to:
- gain more insights from a visualization.
- embed more insights for end users.
- increase the complexity of visualizations without fear of overwhelming the end user, because the insights are embedded.
- use insights that are embedded in the visualization and attached to the data in order to:
  - show different insights based on the same data
  - show similar insights for different sets of data
4. Sample Implementation and Flow
This type of system can be easily built using the popular Grammar of Graphics or
a similar system. A sample flow:
Sample Visualization Template:
Execution Example:
Condition: Difference between max and min value > 200% of average
Insight: There is a huge difference between the best-performing and
worst-performing nations.
Note: The insight is shown only if the condition is met. For any other year’s
data, if the difference is not as large, the text will not be shown.
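The condition in the execution example could be evaluated along the following lines. This is a minimal sketch in Python; the function name `spread_insight` and the sample values are made up for illustration and do not come from any real data set.

```python
from statistics import mean
from typing import List, Optional

INSIGHT = ("There is a huge difference between the best-performing and "
           "worst-performing nations.")


def spread_insight(values: List[float]) -> Optional[str]:
    """Return the insight if (max - min) exceeds 200% of the average."""
    if not values:
        return None
    spread = max(values) - min(values)
    if spread > 2.0 * mean(values):  # 200% of the average
        return INSIGHT
    return None


if __name__ == "__main__":
    # Made-up per-capita income values for two different years.
    wide_year = [100.0, 200.0, 300.0, 5000.0]    # spread 4900, average 1400
    narrow_year = [900.0, 1000.0, 1100.0]        # spread 200, average 1000
    print(spread_insight(wide_year))    # condition met, insight returned
    print(spread_insight(narrow_year))  # condition not met, None returned
```

The same function can be applied to each year's data in turn, so the text appears only for those years in which the condition holds.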
5. Going one step ahead
Once the system is in place, it will be easy to attach several rule-based insight triggers to a template, and over time these triggers will accumulate. It is quite possible that, on producing a visualization for some set of data, the user will be overwhelmed by a multitude of insights because several triggers fire at once. To help the end user, the system can accept a ‘filter’ input that restricts the insights shown. In the example above, a user interested only in the progress of third-world countries can provide the corresponding input, and the system will then filter the insights using a simple text-based comparison.
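A simple text-based filter of this kind might look like the sketch below. It is written in Python under the assumption that insights are plain strings and that the filter is a case-insensitive substring match; the function name `filter_insights` and the sample insight texts are hypothetical.

```python
from typing import List


def filter_insights(insights: List[str], keyword: str) -> List[str]:
    """Keep only the insights that contain the keyword (case-insensitive)."""
    needle = keyword.strip().lower()
    if not needle:
        return list(insights)  # an empty filter shows everything
    return [text for text in insights if needle in text.lower()]


if __name__ == "__main__":
    # Hypothetical insights produced by several triggers firing together.
    fired = [
        "Developed countries are growing slowly.",
        "Third world countries show faster growth.",
        "Nation X has made the most progress.",
    ]
    for text in filter_insights(fired, "third world"):
        print(text)
```

A substring match is the simplest possible comparison; it misses insights that phrase the same idea differently, so a real system might need tags or categories on each insight.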