Google Play Store Rating Prediction
Competitors
  • RickDev31-en
  • oy0rzabal.l0pez@gmail.com-en
  • dattran2346
  • phlemus-en
  • Znasim
  • Katherine
  • cam22
173 Competitors
Published at: 08/25/2020
Points: 15,000 pts

Public Leaderboard


Ranking | Data Scientist | Country | # Submissions | Last submission | Best Score
1 | Sidereus (Featured) | Colombia | 151 | about 2 years ago | 0.698709403908066
2 | Pablo Lucero (Featured) | Ecuador | 300 | about 2 years ago | 0.686667865555434
3 | Fernando Chica (Featured) | Ecuador | 81 | about 2 years ago | 0.667694034607079
4 | Nicolás Dominutti (Featured) | Argentina | 106 | about 2 years ago | 0.667574096698846
5 | Juan Fernando Cifuentes Garcia (Featured) | Colombia | 49 | over 2 years ago | 0.663858826283701
6 | David Augusto Villabón Borja | Colombia | 397 | about 2 years ago | 0.660429949973203
7 | Adam Michaels (Featured) | United States | 50 | almost 2 years ago | 0.655121856160549
8 | Diego Albarracin Mahecha | Colombia | 82 | over 2 years ago | 0.651433298488367
9 | JuanSebas7ian | Colombia | 55 | over 1 year ago | 0.650643382352941
10 | juan fer | Colombia | 50 | over 1 year ago | 0.646149730419393
11 | Diego Alexander Rueda Plata | Colombia | 22 | about 2 years ago | 0.646085976555351
12 | Sebastian Salazar Betancur | Colombia | 32 | over 1 year ago | 0.642257908588001
13 | Juan Esteban Orozco Botero | Colombia | 58 | over 1 year ago | 0.640544173737402
14 | James Jeremy Valencia Becerra | Peru | 6 | over 2 years ago | 0.63929545716907
15 | Frank Diego-en | Peru | 18 | about 2 years ago | 0.638895596239485
16 | Esteban Buitrago S | Colombia | 3 | over 1 year ago | 0.636838430457216
17 | Michael Guzmán | Colombia | 57 | over 2 years ago | 0.635130303583782
18 | Carango | Colombia | 33 | over 1 year ago | 0.627683701021391
19 | Cesar Gustavo Seminario Calle-en | Peru | 22 | about 2 years ago | 0.6257545437472
20 | Sebastian Alibaud (Featured) | Chile | 2 | over 2 years ago | 0.625544425087108
21 | Javier J Desario | Argentina | 22 | over 2 years ago | 0.622011304521809
22 | Nildo | Peru | 5 | over 1 year ago | 0.62165308209933
23 | Julian David Tellez | Colombia | 79 | about 2 years ago | 0.607913434997534
24 | César Arcos Gonzalez | Mexico | 10 | over 2 years ago | 0.59881630295099
25 | Alejandro Anachuri | Argentina | 1 | over 2 years ago | 0.598314009076341
26 | jayantsogikar (Featured) | India | 10 | over 1 year ago | 0.594474708874254
27 | Christian Farnast Contardo-en | Chile | 15 | over 2 years ago | 0.592456026193206
28 | Leandro Ruiz | Argentina | 15 | over 2 years ago | 0.590645572308346
29 | Daniel Morales | Colombia | 13 | about 1 year ago | 0.579456127131995
30 | claudiatesi | Colombia | 4 | about 1 year ago | 0.565226686788656
31 | oscero90-gmail-com | Argentina | 3 | over 2 years ago | 0.56280193236715
32 | Cristian Camilo Hidalgo Garcia (Featured) | Colombia | 13 | about 2 years ago | 0.539419790327753
33 | Katherine | Colombia | 9 | about 1 year ago | 0.538947790874845
34 | claudio irrazabal tarazona | Peru | 1 | over 2 years ago | 0.514572406104051
35 | Matías Poullain | Argentina | 12 | over 2 years ago | 0.51160398819031
36 | Mario Rugeles | Colombia | 1 | about 2 years ago | 0.49953466099764
37 | Luis Enrique | Colombia | 1 | over 2 years ago | 0.489612274517935
38 | anyk17 | Colombia | 6 | about 1 year ago | 0.476239761914142
39 | Juan Guillermo Gómez Ramírez | Colombia | 18 | over 2 years ago | 0.47393923065776
40 | Emiliano Olivares | Argentina | 3 | over 2 years ago | 0.457449227402525
41 | sergiodma | Colombia | 4 | about 1 year ago | 0.443465734686936
42 | Víctor Manuel | Colombia | 2 | over 2 years ago | 0.429472025216706
43 | kasati (Featured) | United States | 1 | over 1 year ago | 0.429472025216706
44 | Felipe Perez | Colombia | 1 | almost 2 years ago | 0.429472025216706



Timeline

Competition start: 2020/08/25 00:00:00
Competition closes on: 2021/01/25 00:00:00
Final Submission Limit: 2021/02/08 00:00:00

This competition has a total duration of 5 months, during which you can make submissions and obtain results automatically. Once this first part of the competition is over, you will have one week to choose your best model and submit it to be graded and considered for cash or points prizes.

Once the whole process is completed, you will still be able to submit models as a "Late Submission" for learning purposes. Because the competition will be officially over, those models will not be eligible to win prizes.


Description

The market for mobile applications is increasingly competitive, and understanding which variables to favor when designing such applications can be of great help in making business decisions.

The objective of this competition is to analyze and classify the rating of mobile applications in the Google Play Store Android market.


Evaluation

The model will be evaluated using the F1 score because the two classes do not contain the same amount of data. Since we are working with an unbalanced dataset, the goal is to optimize the model so that it classifies both classes correctly, especially the minority class.

Important:

F1 = 2 * (precision * recall) / (precision + recall)
The final F1 score will be the average of the per-class F1 scores:

F1_macro = (F1_class0 + F1_class1) / 2  
Note:
In the sklearn library, the line of code that calculates the F1 score described above is:
f1_score(y_true, y_pred, average='macro')
More information about the F1 score is available at this link:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
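
As a rough illustration of the formulas above, the following sketch computes the per-class F1 and their average in plain Python. The function names `f1_for_class` and `macro_f1` and the toy labels are assumptions made for this example; for real submissions the sklearn call quoted above is the reference.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 = 2 * (precision * recall) / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined; score it 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * (precision * recall) / (precision + recall)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

f1_class1 = f1_for_class(y_true, y_pred, positive=1)
f1_class0 = f1_for_class(y_true, y_pred, positive=0)
score = macro_f1(y_true, y_pred)
print(f1_class1, f1_class0, score)
```

Because both classes carry equal weight in the average, a model that ignores the minority class is penalized even if most of its predictions are correct.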


Rules

  • Code must not be shared privately. Any code that is shared must be available to all participants of the competition through the platform.
  • The solution should use only publicly available open-source libraries.
  • If two solutions get identical scores in the ranking table, the tie-breaker will be the date and time of the submission (the first solution submitted will win).
  • We reserve the right to request any user's code at any time during a challenge. You will have 24 hours to submit your code following the code review rules.
  • We reserve the right to update these rules at any time.
  • Your solution must not infringe the rights of any third party and you must be legally authorized to assign ownership of all intellectual property in and to the winning solution code to DataSource.ai.
  • Competitors may register and submit solutions as individuals (not as teams, at least for now).
  • As this is a learning competition, apart from the rules in the DataSource.ai Terms of Use, no other particular rules apply.
  • Maximum 10 solutions submitted per day.
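
The tie-breaker rule in the list above can be sketched as a sort on two keys: score in descending order, then submission time in ascending order. The participant names, scores, and timestamps below are hypothetical, and this is only an illustration of the rule as stated, not the platform's actual ranking code.

```python
from datetime import datetime

# Hypothetical entries: (participant, best score, time of that submission).
submissions = [
    ("alice", 0.65, datetime(2020, 10, 2, 9, 30)),
    ("bob", 0.70, datetime(2020, 11, 15, 18, 0)),
    ("carol", 0.65, datetime(2020, 9, 20, 12, 0)),
]

# Higher score first; for identical scores, the earlier submission wins.
ranked = sorted(submissions, key=lambda s: (-s[1], s[2]))
ranking = [name for name, _, _ in ranked]
print(ranking)
```

Here the two entries tied at 0.65 are ordered by timestamp, so the September submission ranks above the October one.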

If you finish in the top 20 at the end of the competition, you must submit the complete model in .ipynb (Jupyter Notebook) format; no other formats will be accepted. You will have 48 hours after the end of the competition to send it to [email protected]. We will use this model to compute the real final evaluations, so the Leaderboard may change when the final private evaluation is shown.


Prizes

This competition awards points and reputation on our platform. It lets you measure your knowledge against other data scientists, learn how accurate your models are, and improve them. This process will help you improve your real skills!

Note: we are working to have companies sponsoring competitions. If you know someone, don't hesitate to tell them about us, or let us know who might be interested (writing to [email protected]) and we will contact them!


These will be the awards in points once the competition is over:

1st Place: 15,000 pts
2nd Place: 14,000 pts
3rd Place: 13,000 pts
4th Place: 12,000 pts
5th Place: 11,000 pts
6th Place: 10,000 pts
7th Place: 9,000 pts
8th Place: 8,000 pts
9th Place: 7,000 pts
10th Place: 6,000 pts



The dataset contains the main features of the applications in the Google Play Store market.

Variables definition:

ID = Unique Application Identifier 
App = Name of the application
Category = Application category 
Reviews = Number of reviews of the application 
Size = Size of the application 
Installs = Number of downloads/installations of the application
Type = Free or Paid
Price = Price of the application in dollars
Content rating = Content rating
Genres = Genres of the application
Last Updated = Date of the last update
Current Ver = Current version of the application 
Android Ver = Required Android version
Rating = Application Rating 

The Rating scale was transformed into 2 classes:
  • 0 if the Rating is less than or equal to 4 (Rating <= 4.0) 
  • 1 if the Rating is greater than 4 (Rating > 4.0) 

In this way, we classify the applications as positive/successful if the rating is 1 or as negative if the rating is 0.
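
The transformation described above can be sketched as a single threshold check. The function name `rating_to_class` and the sample ratings are assumptions made for this example; the 4.0 threshold is the one stated in the list above.

```python
def rating_to_class(rating: float) -> int:
    """Return 1 if the rating is greater than 4.0, otherwise 0."""
    return 1 if rating > 4.0 else 0


# Sample ratings invented for illustration; note that exactly 4.0 maps to class 0.
ratings = [3.5, 4.0, 4.1, 4.7]
labels = [rating_to_class(r) for r in ratings]
print(labels)
```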

The dataset was divided as follows:
  • Train.csv: With this set you will train the machine learning model.
  • Test.csv: With this dataset you will predict and classify the Rating.
  • SampleSubmission.csv: An example of how you should send the results.

id      rating
3000    1
3001    0
3002    1
3003    1
3004    0
3005    1
etc.

For this competition stage, you need to send your submission file with these details:

# of columns: 2
Column names: id,rating
# of rows: 1449
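
A minimal sketch of building a file that matches these details, using only the Python standard library. The helper `build_submission` is hypothetical, the consecutive ids starting at 3000 are an assumption based on the sample shown earlier (in practice the ids should be read from Test.csv), and the all-ones predictions are placeholders for real model output.

```python
import csv
import io

EXPECTED_ROWS = 1449  # number of rows required for this stage


def build_submission(ids, predictions):
    """Return CSV text with the columns id,rating and one row per prediction."""
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "rating"])
    for app_id, label in zip(ids, predictions):
        if label not in (0, 1):
            raise ValueError(f"rating must be 0 or 1, got {label!r}")
        writer.writerow([app_id, label])
    return buffer.getvalue()


# Assumption: consecutive ids starting at 3000; take the real ids from Test.csv.
ids = list(range(3000, 3000 + EXPECTED_ROWS))
predictions = [1] * EXPECTED_ROWS  # placeholder predictions

csv_text = build_submission(ids, predictions)
lines = csv_text.strip().split("\n")
print(lines[0], len(lines) - 1)
```

To write an actual file, the returned text can be saved with `open("submission.csv", "w").write(csv_text)`; the file name is likewise an assumption.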

