The latest Anaconda State of Data Science Survey 2018 shows that Matplotlib is the most preferred data visualisation tool. According to the official press release, it continues to enjoy its first-mover advantage in visualisation with 75 percent of the votes, ahead of other popular tools such as Plotly, Tableau, Microsoft Power BI and Tibco Spotfire. The survey has rekindled the old debate, pitting Python’s most used visualisation library against the crowd favourite, Tableau. Data visualisation tools have always been a polarising subject, with the community divided over its favourites: D3, Tableau, R or Python.
In a highly crowded data visualisation space with a spate of strong commercial competitors, the recent survey shows that there is a section of developers proficient in Python who prefer Matplotlib, Python’s core plotting library, over Tableau’s simple drag-and-drop features.
The survey findings come as a bit of a surprise, since Tableau (built from the ground up for data vizzies) is regarded as the most popular data visualisation tool. Tableau also enjoys wide adoption and has become one of the most in-demand skills, with many organisations asking for it on the resumés of candidates for data scientist and analyst positions.
Even within the programming community, Matplotlib is regarded as unwieldy for new users, as it takes a great deal of time to understand it and build graphs with it. But there is a section of users who favour Python’s core plotting library over Tableau. In fact, earlier this year, Matplotlib 2.2 was announced with a slew of new features.
Some Of The New Updates In Matplotlib 2.2 Are:
- Automatic spacing for subplots, with fixed-size padding in inches around subplots and all their decorators, and the space between subplots set as a fraction of subplot size
- Spacing for suptitles, and for colour bars that are attached to more than one axes
- Nested GridSpec layouts using GridSpecFromSubplotSpec
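The three layout features listed above can be tried together in a few lines. What follows is only a minimal sketch, assuming Matplotlib 2.2 or later, where the `constrained_layout` figure option and the `figure` argument of `GridSpec` are available. The data, figure size, titles and the `build_figure` helper are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np


def build_figure():
    """Hypothetical helper: one figure that uses all three layout features."""
    # Automatic spacing: the constrained layout manager pads subplots and
    # their decorators (labels, titles, colour bars) without manual tweaking.
    fig = plt.figure(figsize=(8, 4), constrained_layout=True)
    fig.suptitle("Constrained layout demo")  # space is reserved for the suptitle

    # Outer grid: one row, two columns.
    outer = gridspec.GridSpec(1, 2, figure=fig)

    # Nested grid: split the left outer cell into two stacked panels.
    left = gridspec.GridSpecFromSubplotSpec(2, 1, subplot_spec=outer[0])

    # Illustrative data only: a smooth 10x10 ramp with values in [0, 1].
    data = np.outer(np.linspace(0, 1, 10), np.linspace(0, 1, 10))

    image_axes = []
    for row in range(2):
        ax = fig.add_subplot(left[row])
        im = ax.pcolormesh(data, vmin=0, vmax=1)
        ax.set_ylabel("panel %d" % row)
        image_axes.append(ax)

    # One colour bar attached to more than one axes.
    fig.colorbar(im, ax=image_axes, shrink=0.8)

    # The right outer cell holds an ordinary line plot.
    right = fig.add_subplot(outer[1])
    right.plot(np.arange(10), data[-1])
    right.set_xlabel("index")
    return fig, image_axes, right


fig, image_axes, right = build_figure()
fig.canvas.draw()  # drawing triggers the layout computation
```

Calling `fig.savefig("demo.png")` (a hypothetical file name) instead of `fig.canvas.draw()` would write the result to disk.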
Python For Visualisation: Matplotlib
No Black Box Technique: Matplotlib user David N Berol wrote in a post that even though learning Matplotlib may sound like going off the beaten path, it gives the developer a full set of tools along with open source documentation. Tableau, by contrast, is a GUI where one wrestles with the black-box nature of the software.
Python Purists Abound: Matplotlib also scores because of Python’s numerous libraries and its large user base, which translates into tutorials, documentation, free resources and a dedicated base of developers. Basically, there are a lot of Python purists who swear by Python’s data viz package for high-res visualisation.
Matplotlib is excellent for drumming up charts and graphics, but it also comes with certain disadvantages. It wasn’t designed for exploratory data analysis and visualisation, said a practitioner; the ggplot API is considered the best for that purpose. The library is also not the best fit for time series data, where ggplot is better for defining parameters, since one needs to import a set of helper classes for the year, month, week and day formatters, shared a user on a forum. It is also unwieldy when dealing with multiple datasets, whereas converting a dataset into a long format and plotting it is easy with ggplot. Another big disadvantage is that the library is very low-level, which means that one needs to write more code to get a visualisation, as opposed to Tableau, where the same result can be achieved in a few clicks.
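To give a sense of the helper classes the forum user refers to, here is a small sketch of a time series plot that uses the locators and formatters in `matplotlib.dates`. The daily values are invented for illustration, and the choice of monthly major ticks and Monday minor ticks is an assumption, not a recommendation from the survey or the forum post.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical data: one value per day for 90 days starting 1 January 2018.
start = dt.date(2018, 1, 1)
days = [start + dt.timedelta(days=i) for i in range(90)]
values = [i % 7 for i in range(90)]

fig, ax = plt.subplots()
ax.plot(days, values)

# Each level of date handling needs its own helper class:
ax.xaxis.set_major_locator(mdates.MonthLocator())            # a tick per month
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))  # e.g. "Feb 2018"
ax.xaxis.set_minor_locator(mdates.WeekdayLocator(byweekday=mdates.MO))  # Mondays

fig.autofmt_xdate()  # rotate the date labels so they do not overlap
fig.canvas.draw()    # drawing populates the tick label text

labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

Even this modest plot involves three separate classes from `matplotlib.dates`, which illustrates the "more code" complaint about a low-level library.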
Python Data Visualisation Tools