We start this article on the premise that (a) Analytics (also called Data Science) is a widely debated subject in business circles in many parts of the world, and (b) it appears to be an exciting field in which professionals in India can look for opportunities to build a career. Hence, it is presumably a topical issue of importance to many students of management.
However, “Data Science in Developing Economies” appears to be a contradiction of sorts. The basic motivation behind the practice of data science is to build efficiencies in a business system operating in an environment that is well saturated and holds few new areas of opportunity. By contrast, going by the general definition of a “developing economy”, the untapped opportunity in such economies exceeds, by leaps and bounds, the part of the market that businesses currently reach. In this scenario, applying data science and predictive modelling to traditional sources of data is like isolating opportunities from thin files of information and extrapolating them into a vast ocean of unknown and untrodden space. This by itself can be an example of ineffective utilization of limited organizational resources.
The traditional scope of data science, as we know it, is best suited to extracting valuable information from existing data collated from past transactions, and it is usually worthwhile for deciding how to allocate resources in the present business context. The complication in a developing economy is that the actual opportunities for successful deployment of organizational resources are often nascent and cannot be identified from past responses, since the set of observed responses is far smaller than the probable universe of opportunities.
What does a Data Scientist do?
We interpret IBM’s definition of a data scientist as someone who has a combination of skills in math/statistics, computer science and data warehousing, and who is able to combine data of different formats and sources to provide meaningful insights and inferences that are useful for business decision making. We gathered that the emphasis is on extracting information in meaningful ways, using technological skills from associated disciplines, to help managers understand complex issues in simpler terms. Data scientists are the translators who turn the alien languages stored in varied formats in computing infrastructure into business language. However, the driver of such pursuits is access to data, something that appears to be scarce in emerging markets.
The latest trend in data-surfeit economies seemingly is the “internet of data”2, which requires skill sets to mine multiple sets of interconnected data simultaneously and link their insights to form a clearer vision of future business imperatives.
This is perhaps not a suitable model of analytics deployment in a developing economy at the moment. To substantiate this claim, at least partly, it may be worthwhile to look at the context of India, a representative developing economy, which appears to have unique dimensions that may require special attention.
Complications in data sourcing in emerging economies like India:
Our limited fact-finding initiative among India-based business organizations provided some insights into the challenges they face in adopting Analytics for business decision making. We provide a synopsis below.
a) Non-availability of Comprehensive Business Data: A prerequisite for effective data science application is the availability of data. The data may be structured, semi-structured or unstructured, but it is nevertheless important that the coverage of the available data source is close to complete and that the variety of information is broad enough to provide a wholesome view of the business phenomenon being studied. Neither condition is satisfied in the average developing economy, especially India, where the information collection architecture is too basic to generate anything close to a holistic view of the business environment. In fact, such is the scale of incompleteness that many organizations struggle even to begin systematic planning of their existing information resources.
External information about markets and the environment is the most difficult to acquire, simply because few private or government agencies are involved in collecting it. Besides, the high cost of collecting data from relatively inaccessible parts of the country (rural markets, for instance) discourages investment in such initiatives. Even the government of India depends upon the local policing infrastructure to verify the identity of its citizens while issuing passports. It is hoped that the Aadhaar card initiative will help overcome some of these perennial bottlenecks in tracking residents of the country.
A consequence of this non-availability of data is that the impact of data, and of its subsequent processing and insight, on decision making remains largely muted and incomplete.
b) Internal data in multiple and incompatible formats: A second complication affects organizations that do generate quality business data as part of their operations, such as transaction data in banks and retail stores. Their data is held in different formats, which causes immense problems of consolidation. Take, for instance, the banking and financial services institutions in India. In the past decade and a half, there has been rapid development in computerization and automation of operations in most large public sector institutions. As a consequence, recent data is available in standardized electronic formats, but its poor integration with data held in legacy physical systems (read: paper formats) makes it difficult to apply any data science procedure reliably to glean insights for decision making.
c) Dependency on Heuristics for making decisions: Given the above constraints, many business organizations in developing countries, including India, remain steadfastly dependent on heuristic business rules developed over long periods of experience and a firm connect “with the ground”. People-driven decisions override attempts at standardization, and the common refrain is that information is unavailable or too incomplete to substitute the “gut feel” with the rigors of decision support systems based on scientific models. A notable example of this focus is the role of branch operations in managing business in the field. It is most apparent in rural markets, where the local branch is important for taking both operational and, at times, strategic decisions. Here the lack of information is compensated for by the “look and feel” of the environment, which is only possible through decentralized, branch-based operations. A centralized process of decision making using data is therefore dispensed with and replaced by a people-led, decentralized organization structure.
d) Market Growth hides the virtues of Analytics-driven precision in Decisions: The perceived futility of the analytics practice stems from what may be called the “growing market syndrome”. Data scientists are supposed to extract business insights that act as a welcome succour in a highly mature and penetrated market. They are supposed to provide direction and refine decisions so as to home in on the “close to perfect” set of decisions for any environment. However, when markets are in the expanding phase, such extraction of precise insights from past transactions is not quite relevant. In such a situation the importance of factual evidence based on past occurrences can easily be discounted, since the growth in the market overrides the leakages of a sub-optimal decision. Precision in decision making, or the lack of it, has little consequence for the year-end performance of the organization, since the overall market growth covers up all such inefficiencies.
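The masking effect described in (d) can be illustrated with simple arithmetic. The sketch below is only an illustration under our own assumptions: the growth rates, the "capture" fractions and the function name `revenue_growth` are hypothetical and are not drawn from our fact-finding exercise. It treats a firm's revenue as the size of the market multiplied by the fraction of its opportunity that the firm's decisions manage to capture.

```python
def revenue_growth(market_growth, capture_before, capture_after):
    """Year-on-year revenue growth for a firm whose revenue is
    (market size) x (fraction of its opportunity it captures).

    All inputs are fractions, e.g. 0.25 means 25 percent.
    """
    return (1 + market_growth) * capture_after / capture_before - 1


# Hypothetical scenarios (illustrative numbers only).
# Same slippage in decision quality: capture falls from 80% to 72%.
expanding_imprecise = revenue_growth(0.25, 0.80, 0.72)  # fast-growing market
mature_imprecise = revenue_growth(0.02, 0.80, 0.72)     # saturated market

# Better decisions in the saturated market: capture rises from 80% to 84%.
mature_precise = revenue_growth(0.02, 0.80, 0.84)

for label, value in [
    ("expanding market, imprecise decisions", expanding_imprecise),
    ("mature market, imprecise decisions", mature_imprecise),
    ("mature market, precise decisions", mature_precise),
]:
    print(f"{label}: {value:+.1%}")
```

Under these assumed numbers, the firm in the expanding market still reports healthy revenue growth even though its decisions have become worse, while the same slippage in a mature market shows up as an outright decline. Only in the mature market does the gain from more precise decisions become visible in year-end performance.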
It is apparent that the precision-driven, model-based decision systems projected by experts in the realm of data science are far removed from the requirements of most industries (except the offshore domain) in a developing economy context.
But then is there a role for traditional analytical processes in decision making in such economies and what are the potential priority areas for investment? In the next section, we shall attempt to answer this question.