Data Visualization Market
By: Neha Khera
Data visualization is not a new concept. It has been used for centuries to distill and communicate information. Think about all the maps, graphs and charts in existence, and the popularity of this form of data analysis will quickly become clear. However, with advancements in technology, data visualizations are taking on more complex forms than ever before. They are being used to unravel the meaning behind big data sets that would otherwise be too difficult to understand. Highlighted in this piece are eight Ontario-based startups whose innovative applications are shaping the future of data visualization.
Data, data and more data. What’s all the hype about?
To understand the importance of data visualization, let’s take a step back and look at the impact of data in today’s modern economy. It has been said that we are living through the Industrial Revolution of data: an era where so much data is being produced on a daily basis by people and machines that we no longer have the capacity to store it all. From the billions of mobile phones to the trillions of RFID sensors, we live in a world where our every action and reaction is being captured and stored. And while it may seem eerily intrusive, the capturing of data has the potential to drastically improve the world in which we live. This is the rise of what’s known as “big data.”
The term “big data” was coined to describe data sets with a size and complexity beyond the ability of typical database software tools to capture, store, manage and analyze them.1 This definition is intentionally subjective and is not meant to limit “big” data sets to a certain number of terabytes.1
Just how big a phenomenon big data actually is was eloquently captured in a remark by Google’s Eric Schmidt. He pointed out that we are creating as much information every two days as we did from the dawn of civilization up until 2003. On a daily basis, this translates into around 2.5 exabytes of data.2
FIGURE 1: Amount of data created daily
With each coming year, the volume of data generated will only grow. For example, the Square Kilometer Array (SKA) Telescope, the world’s largest telescope, is projected to generate in excess of one exabyte of data per day when it goes live in 2024.3 This is roughly twice the amount of data that’s generated every day on the World Wide Web.3 IBM is working feverishly to develop a supercomputer powerful enough to handle this amount of information.
Big data can and will impact every nation, industry, company and individual around the globe, whether it’s in terms of understanding our galaxy, optimizing healthcare, selecting an ideal retail location or finding the perfect date. A study by McKinsey Global Institute estimates that big data can add $300 billion worth of value to the US healthcare system and can increase retailers’ operating margins by as much as 60%.1 There is no doubt that those who collect, analyze and act on their data successfully will gain a competitive advantage in their market.
What is enabling the big data hype
The rise of big data springs from two main factors:
1. The increased generation of information.
2. The ability to store this information.
Both of these factors are tied to advancements in technology. Social media applications have generated huge amounts of sentiment online, where the beliefs, activities and interests of billions of people are being captured as never before. Mobile devices are used by over six billion people today, of which nearly five billion are in developing countries.4 These devices are capturing data in regions where information was previously difficult to extract. And through the rise of networked sensor technologies such as RFID (radio-frequency identification) tags, more than 30 million articles are being tracked across the transportation, industrial and retail sectors.1
As Moore’s law continues to prevail, we now also have the ability to store all the data that’s being generated, and at a cost that is accessible to many. Today, the entire world’s music can be stored on a device that costs less than $600.1 As recently as the turn of this century, storing an average music playlist of 7,000 songs would by itself have cost $500.
Extracting value from big data
The creation and capture of data by itself does not, obviously, benefit anyone; only when analysis is added to the mix is the value of big data unlocked. Unfortunately, this is also an area where significant challenges exist. Big data analysis remains a market in its infancy. As Google’s Chief Economist Hal Varian put it, “Data are widely available; what is scarce is the ability to extract wisdom from them.”5
FIGURE 2: The Digital Intelligence Architecture6
Big data analysis is often hindered by the sheer cost involved in purchasing tools that can process large volumes of information. Another impediment is not being able to process information quickly enough to extract insights in real time. Waiting two days or two weeks for reports is becoming unacceptable given the fast pace of digital interactions. Likely the biggest obstacle, however, is the lack of talent and expertise in the data science field. The McKinsey Global Institute estimates that by 2018, more than half of all big data jobs, nearly 200,000 of them, will go unfilled because skilled candidates will be in short supply.7
However, as we turn our attention to the field of data visualization, one form of data analysis, we start to see many of these roadblocks disappear. The power of data visualizations lies in their ability to transform the most complex of data sets into a rendering that even novice users can interpret. And through technology innovation, data visualization tools have become increasingly easy to adopt, with intuitive user interfaces and cloud-based access.
The rise of data visualization
At its core, data visualization is the use of abstract, non-representational pictures to show numbers.8 It can include points, lines, symbols, words, shading and colour.8 Data visualizations make it easier to spot trends and patterns amid large amounts of information. They also make it possible for data to tell a story. Just as experts in the field of communication propose the use of stories to better convey information verbally, the same holds true when conveying information through data. And one of the best ways to tell a data story is to use a compelling visual.
As industry-renowned data visualization expert Edward Tufte once said about the traditional rows and columns of data tables, “The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?”9
Data illustration techniques have been in use since as early as 6200 BC, when the oldest known map was drawn. However, it was not until the eighteenth century that data visualizations went beyond mapping and more abstract measures were introduced, including the ever-popular pie and bar charts.
The nineteenth century saw the creation of what many have argued to be the world’s best data visualization: Charles Joseph Minard’s 1869 visualization titled Napoleon’s March, which depicts the movement and losses of Napoleon’s army as it invaded Russia in 1812.
FIGURE 3: History of data visualization
After 1975, we witnessed the most rapid advancements in data visualization, which stemmed from the development of software and computer systems. Data visualizations moved beyond pie and bar charts, and more complex formats began to appear and aid us in processing information. For example, through the use of mind maps, our thought patterns can now be visually organized. Apps like Flipboard and Newsmap have completely reinvented the display of news, while tag clouds have provided another way to discover and search for information. And through network graphs, we can now uncover the connectivity between any number of entities, be they our own social circles, groups of companies or globally dispersed cities.
Moreover, visualizations no longer adhere to a static format: they can be interactive in nature. This allows a user to drill down on certain data points, or manipulate and change views of the information to reach deeper insights.
Infographics are another popular visualization form. Their growth since 2009 came with the rise of content marketing, which involves the creation and sharing of content in order to engage with customers.10 Brands and advertisers frequently use infographics as a form of content, as they provide both interesting insights and visual appeal, and are easy for users to share on the web.
FIGURE 4: Search and news reference volume for the word “infographic” on Google.
Data visualization tools
Until about 2007, Microsoft Excel was the de facto standard for developing visualizations, whether they were pivot tables or simple graphs. When analyzing larger data sets or looking for more complex visualizations, knowledge workers would often have to tap into their company’s own business intelligence (BI) units to access highly skilled data scientists and analysts.
Since 2007, however, a new breed of visualization tools has emerged, characterized by simplicity and ease of use. These tools enable non-technical workers to bypass their BI units and model data themselves. This is the rise of what Gartner calls “data visualization applications,” an industry Gartner predicts will reach $1 billion as early as 2013.11
Tableau Software is one of the fastest-growing data visualization applications on the market today and is in use by over 9,000 organizations around the globe. Tableau’s success is a testament to the rise of the data visualization market, which research firms Gartner and IDC predict won’t slow down any time soon.
The challenge with data visualizations
With the advent of these innovative tools, the ability to create a visualization of a data set is no longer difficult. What remains difficult, however, is the creation of a good visualization.
If we break down the field of data illustration, we see that it is essentially the coming together of two contrasting fields of study: art and science. It requires the harmonious work of both the left and right brain, where the most complex of data sets can be gathered and refined and then organized in a simple yet compelling way. Finding this type of expertise is not an easy feat―unless, of course, you’re a Google. Google’s “Big Picture” data visualization group is led by Martin Wattenberg, and a quick look at his resume makes you realize he is among a special breed of people. How many people do you know with both a doctorate in mathematics and an exhibition at New York’s MoMA?
Due to the difficulty in finding the right talent and expertise, data visualizations often end up being too complex to interpret, or they distort the information by focusing on the visual and not the meaning of the data itself. As Tufte explains, “excellent visualizations are those that give the viewer the greatest number of ideas in the shortest amount of time, with the least ink and in the smallest space.”8 In essence, data illustration is about simplifying the complex as much as possible.
Investment in the data visualization space
2011 was a banner year for companies in the field of big data, with an estimated $2.47 billion invested by venture capital firms globally.12 This was a 38% increase from the amount invested in 2010.12
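As a rough check on what these two figures imply, the short Python sketch below backs out the 2010 total from the 2011 amount and the 38% growth rate quoted above. The variable names are illustrative only, and the result (roughly $1.79 billion for 2010) is derived from the rounded figures in the text rather than taken from the cited source.

```python
# Figures quoted in the paragraph above.
invested_2011 = 2.47   # billions of dollars invested in 2011
growth_rate = 0.38     # 38% increase over 2010

# A 38% increase means 2011 = 2010 * (1 + 0.38), so divide to recover 2010.
implied_2010 = invested_2011 / (1 + growth_rate)
implied_increase = invested_2011 - implied_2010

print(f"Implied 2010 investment: ${implied_2010:.2f} billion")
print(f"Implied year-over-year increase: ${implied_increase:.2f} billion")
```

Because both inputs are rounded, the implied values should be read as approximations.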
The following chart depicts some of the top data visualization companies and their respective funding to date.
Excluded from this chart is the analytics application Spotfire, which was acquired for $195 million by TIBCO Software.13 Prior to its acquisition in 2007, Spotfire raised nearly $40 million over the course of ten years.13
Qlik Technologies is another notable software company with powerful visualization techniques. The company went public in July 2010 at a valuation of nearly $900 million.13 Prior to its IPO, the company raised over $80 million over a ten-year period.13
Noteworthy applications of data visualization
Understanding census data
For over a century, visualizations have been used by governments to better understand census data and decide, for instance, how representation should be apportioned and federal dollars distributed.14 A recent example (below) shows Statistics Canada maps depicting population changes in the Greater Toronto Area from 1996 to 2011. They reflect how population growth is slowing in Toronto and Mississauga and rising in areas north of these cities.
FIGURE 5: GTA Population Change by Municipality, 1996-2001 (reference 15)
FIGURE 6: GTA Population Change by Municipality, 2006-2011 (reference 16)
OpenFile, a Toronto-based startup, has used 2011 Canada census data to build their CensusFile application. Through the use of data maps, this application allows anyone to mine the census data and gain insights about their neighbourhood.
One of the most cited examples of a data visualization success story is John Snow’s cholera map. During an 1854 outbreak of cholera in London, England, Snow used a spot map to illustrate how cases of cholera were clustered around the city’s water pumps. This depiction helped prove that cholera was being spread through water and not by air, as was thought at the time.17
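The idea behind a spot map can be sketched in a few lines of Python: assign each case to its nearest pump and compare the tallies. The coordinates, pump names and helper functions below are hypothetical and chosen purely for illustration; they are not Snow's data or his method, which relied on a hand-drawn map.

```python
from collections import Counter
from math import hypot

# Hypothetical positions on an arbitrary grid; NOT Snow's actual data.
pumps = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0), "Pump C": (5.0, 8.0)}
cases = [(1.0, 1.0), (0.5, -1.0), (2.0, 0.5), (9.0, 1.0), (1.5, 2.0), (4.5, 7.0)]


def nearest_pump(case, pumps):
    """Return the name of the pump closest to a case (straight-line distance)."""
    return min(
        pumps,
        key=lambda name: hypot(case[0] - pumps[name][0], case[1] - pumps[name][1]),
    )


def cases_per_pump(cases, pumps):
    """Count how many cases fall nearest to each pump, including pumps with none."""
    counts = Counter({name: 0 for name in pumps})
    counts.update(nearest_pump(case, pumps) for case in cases)
    return counts


counts = cases_per_pump(cases, pumps)

# A crude text "spot map": one mark per case next to its nearest pump.
for name, n in counts.most_common():
    print(f"{name}: {'#' * n} ({n})")
```

With this made-up data, most of the cases fall nearest to a single pump, which is the kind of pattern a spot map makes visible at a glance.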
FIGURE 7: John Snow’s cholera map17
In 2006, the city of Sault Ste. Marie in Ontario was able to eliminate a potentially serious threat related to the West Nile Virus. The Sault Ste. Marie Innovation Centre had done a systematic job of enabling the sharing of data sets between various municipalities within the city. The data sets were then merged using data maps to uncover new insights. Through this activity, the Centre happened to learn about an unusually large collection of mosquitoes within the city’s underground transformer vaults. Due to an absence of draining structures, the vaults had unknowingly become the perfect breeding ground for mosquitoes. Were it not for the use of data visualization, this threat of West Nile Virus would not have been discovered and mitigated.
FIGURE 8: Map reflecting Sault Ste. Marie mosquito trapping efforts18
The Canadian Institute for Health Information (CIHI) has developed the Canadian Hospital Reporting Project (CHRP), which is focused on improving the quality of healthcare across the nation. Visualizations are being used to increase understanding of mortality rates, readmission rates, costs of hospital stays and other health indicators. The project’s goal is to provide data insights to key decision- and policy-makers, so improvements can be made and hospitals can collaborate to achieve efficiencies.
FIGURE 9: Hospital 30-day overall readmission rates by Ontario region, 2009-2010 (reference 19)
The 1986 destruction of the Space Shuttle Challenger, which was due to a damaged O-ring seal, has been attributed in part to a failure of data analysis. Decision-makers at the US space agency, NASA, were uncertain about whether to launch the space shuttle in below-freezing temperatures, and relied on poorly presented data and short bullet points in making their decision. As data visualization expert Edward Tufte later pointed out, this disaster could have been avoided had the data been more clearly conveyed through the use of a graphic. The sample graphic Tufte later developed makes obvious the risk of O-ring damage in extreme cold temperatures.
FIGURE 10: Edward Tufte’s figure on the 1986 Challenger Space Shuttle launch decision20
Today, NASA is heavily involved in the development of visualizations that explain NASA missions and scientific results.
General Electric (GE) is one of the many companies developing extraordinary visualizations, based on the petabytes of data collected through their various technologies. GE is hoping the visualizations will help not only simplify the complex nature of their work, but also drive insights and discoveries that might otherwise be difficult to achieve. For example, GE has developed an interactive visualization to help US residents understand how much of their energy is being supplied by renewable sources.
FIGURE 11: Energy being supplied by renewable sources for US residents21
IBM is also experimenting with data visualization and has developed an application called Many Eyes that invites anyone to upload a data set or to visualize an existing one.
Groups supporting data visualization
Discovery Exhibition is a US-based organization that profiles “visualization impact stories.” Highlights in 2011 included visualizations that helped reveal the mortality rate of African infants, understand traffic patterns in Beijing and optimize car engine injection systems. Information is Beautiful is a UK-based organization focused on celebrating beautiful designs in data visualization. Among the nominated designs for 2012 is one on the Vancouver Canucks’ franchise history.
Here in Ontario, York University and OCAD University have teamed up to develop the Centre for Innovation in Information Visualization and Data-Driven Design (CIV-DDD), which is essentially a data visualization research hub. Leveraging computer scientists from York and designers from OCAD, the group is working to develop data visuals that help solve specific problems across the areas of healthcare, arts, social sciences and engineering. Sample projects underway include understanding the impact of social media content and mapping the origins of Africans liberated from transatlantic slavery.
MaRS’ very own Data Catalyst team is working with data to provide insights on the innovation economy in Ontario. Their outputs will include visualizations and dashboards representing the impact of innovation support in the province, as well as visualizations that highlight opportunities for market and economic growth in key sectors.
There is no question about the potential for growth and innovation in the data visualization space. Otherwise hard-to-understand rows and columns of numbers are brought to life through visualization techniques. Data illustrations not only help to tell a story, but they reveal the true meaning behind a data set.
However, data visualization is only one of a series of analytics techniques. As we continue to collect more and more data every day, an increasing number of techniques will be required to distill the most complex of data sets down to an easily accessible message. This is an existing gap in the big data market, and an area where entrepreneurs should think about focusing their efforts.
Content Lead and Market Analyst:
Neha Khera, MaRS Market Intelligence
Partner & Advisor:
We thank the following individuals and organizations for their participation in this report:
Dr. Kamran Khan, CEO and Founder, Bio.Diaspora
Nick Edouard, EVP Business Development & Marketing, BuzzData
Nadia Amoroso, CEO and Co-Founder, DataAppeal
Haim Sechter, COO, DataAppeal
Niall Wallace, CEO and Founder, Infonaut
Lisa Zhang, Co-Founder, Polychart
Faizal Karmali, Director and Co-Founder, Quinzee
Sam Molyneux, CEO and Co-Founder, Sciencescape
Eugene Woo, Founder, Venngage
1. McKinsey Global Institute: Big data: The next frontier for innovation, competition, and productivity
2. TechCrunch: Eric Schmidt: Every 2 Days We Create As Much Information As We Did Up To 2003
4. The World Bank: Mobile Phone Access Reaches Three Quarters of Planet’s Population
5. The Economist: Data, data everywhere
6. Forrester report: Welcome to the Era of Digital Intelligence
7. Fast Company: Time To Build Your Big-Data Muscles
8. Edward Tufte: The Visual Display of Quantitative Information
9. Forrester report: Advanced Data Visualization (ADV) Platforms, Q3 2012
10. Content Marketing Institute: What is Content Marketing?
11. Gartner report: Emerging Technology Analysis: Visualization-Based Data Discovery Tools
12. SGMarketwatch: Venture Capital Sees Big Returns in Big Data
13. Dow Jones VentureSource
14. Fast Company: Infographic of the Day: What the Census Said About Us…in 1870
15. Toronto Urban Development Services: Population Growth and Aging
17. Visual.ly: John Snow Cholera Map
18. ESRI Canada: Case Study: Sault Ste. Marie Innovation Centre
19. Canadian Institute for Health Information: CHRP Key Findings
20. Edward Tufte: Visual Explanations: Images and Quantities, Evidence and Narrative, p.44
21. GE: Renewable Energy Sources