Google Play Store Rating Prediction
Competitors
  • Alicia Sparks
  • Mario Sastoque-en
  • felipeamurciapo
  • Esteban Buitrago S
  • Linda Ortiz
  • juano121
  • juan fer
152 Competitors Published at: 08/25/2020
Points
15,000pts

Public Leaderboard


Ranking | Data Scientist | Country | # Submissions | Last submission | Best Score
1 | Sidereus (Featured) | Colombia | 151 | 6 months ago | 0.698709403908066
2 | Pablo Lucero (Featured) | Ecuador | 300 | 5 months ago | 0.686667865555434
3 | Fernando Chica (Featured) | Ecuador | 81 | 6 months ago | 0.667694034607079
4 | Nicolás Dominutti (Featured) | Argentina | 106 | 5 months ago | 0.667574096698846
5 | Juan Fernando Cifuentes Garcia (Featured) | Colombia | 49 | 6 months ago | 0.663858826283701
6 | David Augusto Villabón Borja | Colombia | 397 | 5 months ago | 0.660429949973203
7 | Adam Michaels | United States | 50 | about 1 month ago | 0.655121856160549
8 | Diego Albarracin Mahecha | Colombia | 82 | 7 months ago | 0.651433298488367
9 | Diego Alexander Rueda Plata | Colombia | 22 | 5 months ago | 0.646085976555351
10 | James Jeremy Valencia Becerra | Peru | 6 | 6 months ago | 0.63929545716907
11 | Frank Diego | Peru | 18 | 5 months ago | 0.638895596239485
12 | Michael Guzmán | Colombia | 57 | 6 months ago | 0.635130303583782
13 | Cesar Gustavo Seminario Calle | Peru | 22 | 5 months ago | 0.6257545437472
14 | Sebastian Alibaud (Featured) | Chile | 2 | 9 months ago | 0.625544425087108
15 | Javier J Desario | Argentina | 22 | 8 months ago | 0.622011304521809
16 | Julian David Tellez | Colombia | 79 | 5 months ago | 0.607913434997534
17 | César Arcos Gonzalez | Mexico | 10 | 7 months ago | 0.59881630295099
18 | Alejandro Anachuri | Argentina | 1 | 9 months ago | 0.598314009076341
19 | Christian Farnast Contardo-en | Chile | 15 | 7 months ago | 0.592456026193206
20 | Leandro Ruiz | Argentina | 15 | 7 months ago | 0.590645572308346
21 | oscero90-gmail-com | Argentina | 3 | 8 months ago | 0.56280193236715
22 | Cristian Camilo Hidalgo Garcia (Featured) | Colombia | 13 | 5 months ago | 0.539419790327753
23 | claudio irrazabal tarazona | Peru | 1 | 8 months ago | 0.514572406104051
24 | Matías Poullain | Argentina | 12 | 9 months ago | 0.51160398819031
25 | Mario Rugeles | Colombia | 1 | 5 months ago | 0.49953466099764
26 | Luis Enrique | Colombia | 1 | 10 months ago | 0.489612274517935
27 | Juan Guillermo Gómez Ramírez | Colombia | 18 | 10 months ago | 0.47393923065776
28 | Emiliano Olivares | Argentina | 3 | 9 months ago | 0.457449227402525
29 | Felipe Perez | Colombia | 1 | 3 months ago | 0.429472025216706
30 | Víctor Manuel | Colombia | 2 | 10 months ago | 0.429472025216706



Timeline

Competition start: 2020/08/25 00:00:00
Competition closes on: 2021/01/25 00:00:00
Final Submission Limit: 2021/02/08 00:00:00

This competition runs for a total of 5 months, during which you can make submissions and receive results automatically. Once this first part of the competition is over, you will have one week to choose your best model and submit it to be graded and considered for cash or points prizes.

Once the whole process is completed, you can still submit models as a "Late Submission" for learning purposes. Because the competition is officially over by then, those models will not be eligible for prizes.


Description

The market for mobile applications is increasingly competitive, and understanding which variables matter most when designing such applications can be of great help in making business decisions.

The objective of this competition will be to analyze and classify the rating of mobile applications in the Google Play Store Android market. 


Evaluation

Models are evaluated with the F1 score because the two classes do not contain the same amount of data. Since we are working with an unbalanced dataset, the goal is to optimize the model so that it classifies both classes correctly, and in particular the minority class.

Important

F1 = 2 * (precision * recall) / (precision + recall)
The final F1 score will be the average of the F1 scores of the two classes:

F1_macro = (F1_class0 + F1_class1) / 2
Note:
In the sklearn library, the F1 score described above can be computed with:
f1_score(y_true, y_pred, average='macro')
More information about the F1 score is available at this link:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
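
As an illustration of the formulas above, the following is a minimal sketch in plain Python that computes the per-class F1 and their average. The function names (f1_for_class, macro_f1) and the toy labels are assumptions made for this example and are not part of the competition materials.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted average of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

print(f1_for_class(y_true, y_pred, 1))  # 0.75
print(f1_for_class(y_true, y_pred, 0))  # 0.5
print(macro_f1(y_true, y_pred))         # 0.625
```

On the same inputs, f1_score(y_true, y_pred, average='macro') from sklearn should return the same value, since the macro average gives each class equal weight regardless of how many samples it has.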


Rules

  • Code must not be shared privately. Any code that is shared must be available to all participants of the competition through the platform.
  • The solution should use only publicly available open source libraries.
  • If two solutions get identical scores in the ranking table, the tie-breaker will be the date and time of the submission (the first solution submitted will win).
  • We reserve the right to request any user's code at any time during a challenge. You will have 24 hours to submit your code following the code review rules.
  • We reserve the right to update these rules at any time.
  • Your solution must not infringe the rights of any third party and you must be legally authorized to assign ownership of all intellectual property in and to the winning solution code to DataSource.ai.
  • Competitors may register and submit solutions as individuals (not as teams, at least for now).
  • As this is a learning competition, apart from the rules in the DataSource.ai Terms of Use, no other particular rules apply.
  • Maximum 10 solutions submitted per day.

If you finish in the top 20 at the end of the competition, you must submit your complete model in .ipynb (Jupyter Notebook) format; no other formats will be accepted. You will have 48 hours after the end of the competition to send it to [email protected]. We will use this model to compute the real final evaluations, so the leaderboard may change when the final private evaluation is shown.


This competition lets you earn points and reputation on our platform. It will allow you to measure your knowledge against other data scientists, see how accurate your models are, and improve them. This process will help you improve your real skills!

Note: we are working to have companies sponsoring competitions. If you know someone, don't hesitate to tell them about us, or let us know who might be interested (writing to [email protected]) and we will contact them!

Here you can see the global ranking table of the competitions we have done so far in data science.

These will be the awards in points once the competition is over:

1st Place: 15,000 pts
2nd Place: 14,000 pts
3rd Place: 13,000 pts
4th Place: 12,000 pts
5th Place: 11,000 pts
6th Place: 10,000 pts
7th Place: 9,000 pts
8th Place: 8,000 pts
9th Place: 7,000 pts
10th Place: 6,000 pts



The dataset contains the main features of the applications in the Google Play Store market.

Variable definitions:

ID = Unique application identifier
App = Name of the application
Category = Application category
Reviews = Number of reviews of the application
Size = Size of the application
Installs = Number of downloads/installations of the application
Type = Free or Paid
Price = Price of the application in dollars
Content rating = Age-based content rating of the application
Genres = Genres of the application
Last Updated = Date of the last update
Current Ver = Current version of the application
Android View = Required Android version
Rating = Application rating

The Rating scale was transformed into 2 classes:
  • 0 if the Rating is less than or equal to 4 (Rating <= 4.0) 
  • 1 if the Rating is greater than 4 (Rating > 4.0) 

In this way, we classify the applications as positive/successful if the rating is 1 or as negative if the rating is 0.
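
The threshold rule above can be sketched in a few lines of Python. The function name rating_to_class and the sample ratings are hypothetical and only illustrate the mapping; the competition data already ships with the transformed labels.

```python
def rating_to_class(rating: float) -> int:
    """Return 1 if the rating is greater than 4.0, otherwise 0."""
    return 1 if rating > 4.0 else 0


# Hypothetical raw ratings; note that exactly 4.0 falls in class 0.
raw_ratings = [3.2, 4.0, 4.1, 4.7]
labels = [rating_to_class(r) for r in raw_ratings]
print(labels)  # [0, 0, 1, 1]
```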

The dataset was divided as follows:
  • Train.csv: with this set you will train the machine learning model.
  • Test.csv: with this set you will predict and classify the Rating.
  • SampleSubmission.csv: an example of how you should send the results.

id            rating
3000          1
3001          0
3002          1
3003          1
3004          0
3005          1
etc.

For this stage of the competition, your submission file must meet these requirements:

# of columns: 2
Column names: id,rating
# of rows: 1449
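
Before uploading, it can help to check a file against the requirements above. The following is a minimal sketch using only the Python standard library. The helper names (build_submission, check_submission), the assumption that test ids are consecutive starting at 3000 (inferred from the sample above), and the all-ones predictions are illustrative and not part of the official tooling.

```python
import csv
import io

EXPECTED_HEADER = ["id", "rating"]
EXPECTED_ROWS = 1449


def build_submission(ids, predictions):
    """Return the submission as CSV text with the header id,rating."""
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(EXPECTED_HEADER)
    for app_id, label in zip(ids, predictions):
        writer.writerow([app_id, label])
    return buffer.getvalue()


def check_submission(csv_text):
    """Return a list of problems found; an empty list means the file looks valid."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    problems = []
    if not rows or rows[0] != EXPECTED_HEADER:
        problems.append("header must be id,rating")
    body = rows[1:]
    if len(body) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(body)}")
    if any(len(row) != 2 for row in body):
        problems.append("every row must have exactly 2 columns")
    if any(len(row) == 2 and row[1] not in {"0", "1"} for row in body):
        problems.append("rating must be 0 or 1")
    return problems


# Assumption: ids are consecutive from 3000; placeholder predictions of 1.
ids = list(range(3000, 3000 + EXPECTED_ROWS))
predictions = [1] * EXPECTED_ROWS
submission_text = build_submission(ids, predictions)
print(check_submission(submission_text))  # []
```

In practice the ids should be read from Test.csv rather than generated, and the predictions would come from your trained model.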

