Mobile Operators in Karnataka

This blog uses statistical data analysis to estimate the market share of mobile operators in Karnataka.

Input data

I have had access to the mobile numbers of 1300+ members of an NGO since about June 2013.


As we know, the first four digits of a mobile number in India indicate the mobile operator. I extracted the first four digits of each number and then referred to Wikipedia to find the mobile operator. I wrote some simple Python code using the pandas library.
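
The lookup from prefix to operator can be sketched as below. This is only a minimal illustration, not the code used for this blog: the function name operator_counts is hypothetical, and the prefixes and operator names in PREFIX_TO_OPERATOR are made-up placeholders. The real table would have to be filled in from the Wikipedia list.

```python
import pandas as pd

# Hypothetical lookup table: four-digit prefix -> operator name.
# These entries are placeholders, not real allocations.
PREFIX_TO_OPERATOR = {
    "9000": "Operator A",
    "9001": "Operator B",
    "9002": "Operator A",
}


def operator_counts(numbers):
    """Count mobile numbers per operator, based on the first four digits."""
    s = pd.Series(numbers).astype(str)
    prefixes = s.str[:4]
    # Prefixes missing from the table become NaN and are left out of the counts
    operators = prefixes.map(PREFIX_TO_OPERATOR)
    return operators.value_counts()
```

The returned Series is indexed by operator name, so it could be passed straight to a pie plot, for example operator_counts(numbers).plot(kind='pie', autopct='%.2f').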

Here is the code

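import pandas as pd

# Raw data: one mobile number per row, no header row.
# Raw strings keep the backslashes in the Windows paths literal.
df = pd.read_csv(r"C:\RawData.csv", header=None)

# Integer-divide a 10-digit number by 10^6 to keep its first four digits
df[2] = df[0] // 1000000

# Count how many numbers fall under each four-digit prefix
series = df.groupby(2)[0].count()

# Summary file: column 0 is used as the group label, column 1 holds the values to sum
result = pd.read_csv(r"C:\ssummary.csv", header=None)
plot_data = result.groupby(0).sum()
plot_data = plot_data.sort_values(by=1)

# Pie chart with the percentage share printed on each slice
plot_data.plot(kind='pie', autopct='%.2f', subplots=True, figsize=(10, 10))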

Here is the result


1. The raw input data may not reflect the current mobile numbers of the NGO's members. The members may have changed their numbers.
2. The members may also have opted for Mobile Number Portability. Its impact on the result is ignored.
3. As part of data cleaning, the following entries are ignored:

- numbers with 9 or 11 digits
- numbers that belong to another state
- numbers whose first four digits are not listed on Wikipedia; these may be wrong numbers
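
The cleaning rules above could be sketched roughly as follows. This is an illustration under assumptions, not the cleaning actually performed for this blog: the function name clean_numbers is hypothetical, and KNOWN_PREFIXES is a placeholder set that is assumed to contain only the Karnataka prefixes found on Wikipedia, so that numbers from other states and unknown prefixes are dropped by the same check.

```python
import pandas as pd

# Hypothetical set of valid four-digit prefixes for the state of interest.
# Placeholder values, to be replaced with the real list.
KNOWN_PREFIXES = {"9000", "9001", "9002"}


def clean_numbers(numbers):
    """Keep only 10-digit numbers whose prefix is in KNOWN_PREFIXES."""
    s = pd.Series(numbers).astype(str).str.strip()
    # Exactly 10 digits: drops the 9-digit and 11-digit entries
    is_ten_digits = s.str.match(r"\d{10}$")
    # Prefix must be known: drops other-state and unlisted prefixes
    has_known_prefix = s.str[:4].isin(KNOWN_PREFIXES)
    return s[is_ten_digits & has_known_prefix].reset_index(drop=True)
```

Keeping each rule as a separate boolean mask makes it easy to count how many entries each rule removes before combining them.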


Because of the above assumptions, the sample does not truly represent the population. The purpose of this blog is to show how quickly we can draw inferences using the Python pandas library. This blog is not written for positive or negative publicity of any mobile operator.