Python is now being hailed as the programming language of the decade because of its versatile nature. With simple English words being used in coding, it is easy to understand and to debug errors. Python and R are the most popular programming languages used in the analytics industry to understand and analyse data. In this article we shall analyse data from the popular eSport Dota 2. The dataset consists of attributes of the different heroes in the game.
Dota 2 is a game of 10 players split into two teams of five, Radiant and Dire. The teams battle against each other, and the team that destroys the opposing team's Ancient wins the game. In this dataset there are three types of heroes, which will help us analyse how diversified the game is.
Let us import the necessary libraries:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
You can find the dataset here, where you can download the .csv file.
Let us read the data now, with the help of pandas:
Dota = pd.read_csv('hero_stats.csv')
The shape of the dataset is (115, 46), which tells us that there are 115 heroes and 46 attributes, or features, for every hero in this dataset.
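A quick way to check the shape and get a feel for the columns is sketched below. So that it runs without the CSV, it builds a tiny stand-in DataFrame: the four rows and their values are made up, and the category labels ("Strength", "Melee" and so on) are assumptions about how the real file spells them. Only the column names come from this article.

```python
import pandas as pd

# Stand-in for pd.read_csv('hero_stats.csv'): made-up rows, not real hero data.
sample = pd.DataFrame({
    "Primary Stat": ["Strength", "Agility", "Intelligence", "Strength"],
    "Attack Type": ["Melee", "Ranged", "Ranged", "Melee"],
    "Armor": [3.0, 1.5, 0.5, 4.0],
    "Mana": [250, 300, 450, 270],
})

# shape is (number of rows, number of columns), i.e. (heroes, attributes)
rows, cols = sample.shape
print(rows, cols)

# dtypes shows which columns are numeric and which hold text categories
print(sample.dtypes)

# value_counts shows how many heroes fall under each primary stat
stat_counts = sample["Primary Stat"].value_counts()
print(stat_counts)
```

Running the same three calls on the real Dota DataFrame is a cheap sanity check before plotting anything.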
Let us start by plotting the armor of the different heroes in the game. With a single variable, the swarmplot is just a one-dimensional graph in which every hero is drawn as one point.
armor = sns.swarmplot(data=Dota, y='Armor')
We will be plotting the data with respect to the primary statistics:
And their attack type:
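A plot along these lines could be produced by passing a category to x and another to hue. The snippet below is only a sketch on made-up numbers; the toy DataFrame and its category labels are assumptions, and on the real data one would pass data=Dota instead.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration; replace `toy` with the Dota DataFrame.
toy = pd.DataFrame({
    "Primary Stat": ["Strength"] * 3 + ["Agility"] * 3 + ["Intelligence"] * 3,
    "Attack Type": ["Melee", "Melee", "Ranged",
                    "Melee", "Ranged", "Ranged",
                    "Ranged", "Ranged", "Melee"],
    "Armor": [3.0, 4.0, 2.5, 2.0, 1.5, 1.0, 0.5, 1.0, 2.0],
})

# x splits the swarm into one column per primary stat,
# hue colors each point by the hero's attack type.
ax = sns.swarmplot(data=toy, x="Primary Stat", y="Armor", hue="Attack Type")
ax.set_title("Armor by primary stat (toy data)")
```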
Let us look at the joint plot from seaborn. From this graph we can infer that the attacks per second and damage per second are approximately normally distributed. With the argument kind='reg' one can also plot the regression line, which runs through the points and splits the graph into two parts. Matplotlib does not offer this combined plot as a single built-in call.
sns.jointplot(x='Attacks Per Second', y='Damage Per Second', data=Dota, kind='reg')
Let us look at lmplot, which is helpful for showing how a categorical variable relates to two numerical ones. In this example we have taken Attack Damage and Damage Per Second as the X and Y axes, and Primary Stat as the hue (color differentiation). In the legend one can notice how the different colors help us understand the data better.
facett = sns.lmplot(data=Dota, x='Attack Damage', y='Damage Per Second', hue='Primary Stat',
                    fit_reg=False, legend=True, legend_out=True)
This use of hue is what sets seaborn apart and gives its plots a more polished look when compared to Matplotlib. Because categorical values are so easy to plot, seaborn is widely used when analysing data with Python.
Let us look at another example, where two regression lines are drawn, one for each attack type. This feature is quite helpful when dealing with classification problems. Here is the graph:
fig = sns.lmplot(data=Dota, x='Increased Attack Speed', y='Attacks Per Second', hue='Attack Type',
                 fit_reg=True, legend=True, legend_out=True, aspect=16/10)
From this graph we can also infer that the melee heroes have a higher attack rate when compared with ranged heroes during the laning stage.
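What fit_reg=True does with a hue column can be approximated by hand: fit one straight line per category. The sketch below uses numpy.polyfit on made-up numbers, so the slopes say nothing about the real heroes. Seaborn additionally draws a confidence band around each line, which is left out here.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
toy = pd.DataFrame({
    "Increased Attack Speed": [0, 10, 20, 30, 0, 10, 20, 30],
    "Attacks Per Second": [0.60, 0.65, 0.70, 0.75, 0.50, 0.56, 0.62, 0.68],
    "Attack Type": ["Melee"] * 4 + ["Ranged"] * 4,
})

# Fit a degree-1 polynomial (a straight line) separately for each attack type.
lines = {}
for attack_type, group in toy.groupby("Attack Type"):
    slope, intercept = np.polyfit(group["Increased Attack Speed"],
                                  group["Attacks Per Second"], deg=1)
    lines[attack_type] = (slope, intercept)
    print(f"{attack_type}: y = {slope:.4f} * x + {intercept:.4f}")
```

Comparing the slopes and intercepts of the two lines gives a numerical counterpart to what the eye picks up from the plot.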
Now let us look at the violin plot, which is quite nice because it also shows the distribution in the graph itself. It is a graph that analysts use a lot.
ax = sns.violinplot(x="Movement Speed", y="Primary Stat", hue="Attack Type",
                    data=Dota, palette="Set2", split=True)
From this graph we can infer that the strength heroes from Dota 2 are mostly melee, and that they mostly deal physical damage. On the other hand, the intelligence heroes are squishy and soft; they are mostly spell casters and usually stand behind the agility and strength heroes and support them during fights. The agility heroes are distributed equally.
Now let us look at the boxplot. This type of graph divides the distribution of the data into four parts, each holding 25% of the values. The second and third quarters are represented by a box, in which the line marks the median value. The first and fourth quarters are drawn as whiskers at the two ends. This is a statistical way of representing the data for better understanding.
ax = sns.boxplot(x="%Physical Damage Reduction", y="Primary Stat", data=Dota)
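The numbers behind a boxplot can be computed directly with pandas. The sketch below uses made-up values, not the real damage-reduction column. It also applies the 1.5 x IQR rule that seaborn's boxplot uses by default: the whiskers stop at the furthest points inside the fences, and anything beyond is drawn as an individual outlier.

```python
import pandas as pd

# Made-up values for illustration only.
values = pd.Series([2, 4, 6, 8, 10, 12, 14, 16, 40])

# The box spans q1 to q3; the line inside it is the median.
q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1  # interquartile range, the height of the box

# Points outside these fences are plotted as outliers rather than
# being covered by the whiskers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3)
print(lower_fence, upper_fence)
print(outliers.tolist())
```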
Now let us check out the stripplot. Here we are comparing the primary stat with respect to the mana of the heroes. We all know that intelligence heroes have more mana compared to other hero types. Let us represent them graphically to understand them better.
ax = sns.stripplot(x="Primary Stat", y="Mana", hue="Attack Type", data=Dota)
This graphical representation works well with categorical data. The more distinct classes there are, the better one can visualise the data.
(Sidebar: We tried the boxenplot, but it doesn’t seem to work in seaborn anymore. Do let us know in the comments if it has been removed or changed in the newer updates.)
Lastly, we have a stripplot styled with a color palette and large diamond markers, which is cool and trippy at the same time. Here is the graph:
ax = sns.stripplot(x="Primary Stat", y="Base Damage Max", hue="Attack Type", data=Dota,
                   palette="Set1", size=25, marker="D")
With this article, one can practise dealing with all kinds of categorical and continuous data for visual representation. There are other plotting libraries in Python, such as Plotly, which have their own unique features. You can also check them out.