Polychart is a web-based application for visually analyzing data and creating charts. Through drag-and-drop functionality, it enables managers, marketers, analysts and other users to understand data visually without having to code or perform statistical analysis.
MaRS Market Intelligence spoke with Lisa Zhang, Co-founder of Polychart.
What led to the creation of this technology?
I did a couple of internships on Facebook’s data science team, so I’ve seen some of the trends and opportunities available in the data-analysis market. I’ve also witnessed the rapid growth of the data-analysis software Tableau, which is a great tool. What we felt was missing from it was the ability to bring that type of analysis to the web. The advantage of being web-based is that we don’t have to make assumptions about which operating systems people are working under, or how willing people are to download software and plugins. It’s just very accessible.
Why Polychart rather than a more traditional tool like MS Excel?
The best thing about Polychart is the speed at which you can create a chart. I think the ability to iterate quickly is extremely important when you’re analyzing data, since you tend to think of ideas as you’re working. If there’s a lot of friction between when you think of an idea and when it shows up on the screen, then that idea just gets lost. In data analysis, this can mean the difference between having a key business insight and not.
Based on your experience at Facebook, how well do companies exploit their data? Particularly, how well do web companies leverage their large amounts of user data?
I think there are a lot of ways in which companies could be using their data but are not, because of a lack of talent in the data space. This is particularly true when you’re dealing with big data. A Fast Company article I came across talks about there being 340,000 big data positions in 2012, more than half of which will go unfilled. I think a lot of this is a talent issue, and if we can make data analysis more accessible, companies can go much further with their data.
Why is there such a lack of talent?
Well, to be a good data scientist, you need to understand statistics and you also need programming skills to manipulate data. To present data visually in an impactful way, you need to understand human perception and how to communicate well. That is a lot of different skill sets, and they are difficult to find in one person.
Visualizations can often lead to different interpretations, simply by the way in which the data is displayed. Does Polychart address this challenge?
This is one thing we take very seriously. There is ample research in the field of perception that tells us what our visual system pays attention to. For example, people are very good at comparing areas, and so it’s helpful to start the y-axis of a bar chart at zero. It’s also why 3D effects on bar charts and pie charts can distort the data being displayed. 3D effects do a great job of grabbing someone’s attention, but when doing data analysis, accuracy is much more important.
The fact that people are good at comparing areas is also why, when representing values using the sizes of objects (say, circles), the area rather than the radius should grow in proportion to the value represented. Say you are representing the numbers 1, 2 and 3, and you use circles that have a radius of 1, 2 and 3. The third circle will actually look nine times bigger than the first, because people perceive areas more readily than radii.
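The arithmetic behind the circle example can be checked with a short sketch. This is only an illustration, not Polychart's implementation; the function names (`circle_area`, `radius_for_value`, `area_ratios`) are hypothetical. It compares the areas that result from using the values directly as radii with the areas that result from using the square root of each value as the radius.

```python
import math


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


def radius_for_value(value, unit_radius=1.0):
    """Radius that makes the circle's area proportional to value.

    Area grows with the square of the radius, so the radius must
    grow with the square root of the value.
    """
    return unit_radius * math.sqrt(value)


def area_ratios(radii):
    """Area of each circle relative to the first circle."""
    base = circle_area(radii[0])
    return [circle_area(r) / base for r in radii]


values = [1, 2, 3]

# Using the values as radii: areas come out at roughly 1 : 4 : 9.
print(area_ratios([float(v) for v in values]))

# Using sqrt(value) as the radius: areas come out at roughly 1 : 2 : 3.
print(area_ratios([radius_for_value(v) for v in values]))
```

With the square-root scaling, the circle for 3 covers about three times the area of the circle for 1, which matches the values being shown.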
Colour is something else that is tricky to use. While colours are great for representing categorical values, they’re not very good for representing quantities. We’re very bad at seeing if a shade is one-and-a-half times, two times or three times darker than another.
In terms of choosing the type of charts to use, there is an interesting flowchart that suggests which visualization to use based on the data that you have and the purpose of the visualization.
Any examples of poorly made visualizations?
The chart titled “Percentage of Comments by Identity” is an example of a visualization that ignores best practices. The 3D effect and the different heights shown give a disproportionate area to “Pseudonyms,” and make the area representing “Real Identity” a lot more than 10 times smaller than “Anonymous.”
Similarly, this graphic by Gizmodo about the change in iPad battery size has tied the increase of battery size to the height of the image rather than to the area, misrepresenting the increase.
Fox News is a large source of misleading visualizations! This chart on Bush tax cuts does not start the y-axis of the chart at zero, which magnifies the change in tax rates.
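How much a non-zero baseline magnifies a difference can be worked out directly. The sketch below is a hypothetical illustration: the function name `apparent_ratio` and the figures of 35 and 40 are made up for the example and are not taken from the chart discussed. It computes how many times taller the larger bar is drawn than the smaller one for a given axis starting point.

```python
def apparent_ratio(low, high, baseline=0.0):
    """Ratio of drawn bar heights when the y-axis starts at baseline.

    A bar's drawn height is its value minus the axis baseline, so a
    baseline close to the smaller value inflates the visible gap.
    """
    if baseline >= low:
        raise ValueError("baseline must be below the smaller value")
    return (high - baseline) / (low - baseline)


# Hypothetical values, chosen only for illustration.
low, high = 35.0, 40.0

# Axis starting at zero: the taller bar is about 1.14 times the shorter.
print(apparent_ratio(low, high))

# Axis starting at 34: the taller bar is drawn 6 times the shorter.
print(apparent_ratio(low, high, baseline=34.0))
```

The underlying values differ by about 14 percent in both cases; only the drawn heights change.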
Another chart created by Fox News about unemployment rates is borderline dishonest. The last data point of 8.6% is shown as being a non-change on the graph.
Putting aside all these bad visualizations, what are some of your favourites?
Napoleon’s March is a classic visualization and a great example of an effective way to present statistics.
More recently, the interactive database We Feel Fine is one of the biggest data visualization projects of the past decade.