Mobile Operators in Karnataka
Posted by Manish Panchmatia on Sunday, February 21, 2016
Labels: Bangalore, Management, Python, software, Telecom Wireless
This post uses statistical data analysis to estimate the market share of mobile operators in Karnataka.
Input data:
I have had access to the mobile numbers of 1300+ members of an NGO since about June 2013.
Method:
In India, the first four digits of a mobile number indicate the mobile operator (and the telecom circle). I extracted the first four digits of each number, then referred to Wikipedia https://en.wikipedia.org/wiki/Mobile_telephone_numbering_in_India to find the mobile operator. I wrote a simple Python script using the pandas library.
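Before the full script, the core idea of the method can be shown on a handful of numbers. This is only a minimal sketch: the prefixes and the names "Operator A" and "Operator B" in PREFIX_TO_OPERATOR are hypothetical placeholders, not entries from the Wikipedia table, and the function name operator_share is made up for illustration.

```python
import pandas as pd

# Hypothetical prefix-to-operator table. Real entries would have to be
# copied from the Wikipedia page; these values are placeholders only.
PREFIX_TO_OPERATOR = {
    "9000": "Operator A",
    "9001": "Operator B",
    "9002": "Operator A",
}


def operator_share(numbers, prefix_table=PREFIX_TO_OPERATOR):
    """Return each operator's share (in percent) among the given numbers."""
    # Treat numbers as text so the first four digits can be sliced off.
    s = pd.Series([str(n) for n in numbers])
    prefixes = s.str[:4]
    # Prefixes missing from the table become NaN and are left out of the count.
    operators = prefixes.map(prefix_table)
    return operators.value_counts(normalize=True) * 100


sample = ["9000123456", "9001123456", "9002123456", "9000654321"]
shares = operator_share(sample)
print(shares)
```

Slicing the number as text avoids the division step, but it assumes every entry really has ten digits, so it is only safe after data cleaning.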
Here is the code:
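import pandas as pd
import matplotlib.pyplot as plt

# Column 0 of the raw file holds the 10-digit mobile numbers.
df = pd.read_csv(r"C:\RawData.csv", header=None)
# Integer division by 10**6 keeps the first four digits of a 10-digit number.
df[2] = df[0] // 1000000
# Count how many numbers fall under each four-digit prefix.
series = df.groupby(2)[0].count()
series.to_csv(r"C:\summary.csv", header=False)
# ssummary.csv is read as (operator, count) rows; presumably it is summary.csv
# with each prefix replaced by its operator name from the Wikipedia table.
result = pd.read_csv(r"C:\ssummary.csv", header=None)
# Add up the counts of all prefixes that belong to the same operator.
plot_data = result.groupby(0).sum()
plot_data = plot_data.sort_values(by=1)
plot_data.plot(kind='pie', autopct='%.2f', subplots=True, figsize=(10,10))
plt.show()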
Here is the result:
Assumptions
1. The raw input data does not necessarily contain the members' current mobile numbers. Members may have changed their numbers.
2. Members may also have opted for Mobile Number Portability, in which case the prefix no longer identifies their operator. The impact on the result is ignored.
3. As part of data cleaning, the following entries are ignored:
- numbers with 9 or 11 digits
- numbers that belong to another state
- numbers whose first four digits are not listed on Wikipedia; these may be wrong numbers
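The cleaning rules in point 3 could be applied in pandas as well. The sketch below is an illustration under stated assumptions, not the script used for this post: KNOWN_PREFIXES is a hypothetical set that is assumed to contain only the four-digit prefixes listed for Karnataka, so a single membership check drops both other-state numbers and prefixes that are not listed at all. The function name clean_numbers is also made up.

```python
import pandas as pd

# Hypothetical set of valid prefixes (placeholders, not real Wikipedia data).
# It is assumed to hold only prefixes listed for Karnataka.
KNOWN_PREFIXES = {"9000", "9001"}


def clean_numbers(raw, known_prefixes):
    """Keep only 10-digit numbers whose first four digits are known."""
    s = pd.Series([str(n).strip() for n in raw])
    # Rule 1: drop entries that are not exactly ten digits (e.g. 9 or 11 digits).
    is_ten_digits = s.str.fullmatch(r"\d{10}")
    # Rules 2 and 3: drop other-state and unlisted prefixes in one check.
    has_known_prefix = s.str[:4].isin(known_prefixes)
    return s[is_ten_digits & has_known_prefix].reset_index(drop=True)


raw = ["9000123456", "900012345", "90001234567", "8000123456", "9001000000"]
cleaned = clean_numbers(raw, KNOWN_PREFIXES)
print(cleaned.tolist())
```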
Disclaimer
Because of the above assumptions, the sample does not truly represent the population. The purpose of this post is to show how quickly we can draw inferences using the Python pandas library. This post is not written for positive or negative publicity of any mobile operator.