Prediction of Online Shoppers Purchasing Intention

Competitors
  • wassidi
  • alejandrodebus-en
  • Mihai7
  • Georg
  • emxme.n
  • gentreex
100 Competitors
Published at: 01/17/2021
Points
10,000pts

Public Leaderboard


Ranking  Data Scientist                             Country                          # Submissions  Last submission  Best Score
1        Cristian Camilo Hidalgo Garcia (Featured)  Colombia                         1631           6 months ago     0.817631034576143
2        Oscar Bartolome Pato (Featured)            Spain                            343            6 months ago     0.816491504853038
3        Santiago Serna (Featured)                  Colombia                         97             6 months ago     0.811802131632503
4        Juan Luis Quiroz Castillo (Featured)       Chile                            96             7 months ago     0.811678832116788
5        SDG (Featured)                             Peru                             87             6 months ago     0.810988554051394
6        Nicolás Dominutti (Featured)               Argentina                        139            6 months ago     0.810703016286773
7        Carlos Eduardo Vázquez Chong               Mexico                           78             6 months ago     0.810131278850246
8        Jonathan Loscalzo                          Argentina                        29             7 months ago     0.809511287502948
9        Julian Ismael Centeno                      Peru                             18             8 months ago     0.807714100654053
10       Lautaro Pacella-en                         Argentina                        163            7 months ago     0.807073735677069
11       Nicolas Santilli                           Argentina                        22             6 months ago     0.807001034624541
12       Willians Carlos Enciso Melgarejo           Peru                             11             6 months ago     0.805839416058394
13       Diego Alexander Rueda Plata                Colombia                         90             7 months ago     0.805777101620981
14       Alan F Dopfel                              United States                    15             6 months ago     0.805621962308745
15       Sidereus (Featured)                        Colombia                         7              6 months ago     0.803892944038929
16       Christian Farnast Contardo-en              Chile                            10             7 months ago     0.802152437294866
17       Diego Albarracin Mahecha                   Colombia                         15             7 months ago     0.802126908386738
18       Felipe Perez                               Colombia                         23             6 months ago     0.796920367701698
19       Nachos                                     Spain                            15             7 months ago     0.796482014986438
20       ANGEL JORGE SALAZAR                        Peru                             3              6 months ago     0.79619071563834
21       Julian Armando Abril Luna                  Colombia                         180            6 months ago     0.794999642448513
22       bprasad26                                  India                            15             3 months ago     0.79463138787232
23       Víctor Manuel Cárdenas (Featured)          Colombia                         7              6 months ago     0.793995639423971
24       johan159097                                Peru                             16             6 months ago     0.793988364042935
25       David Augusto Villabón Borja               Colombia                         9              6 months ago     0.793844138522364
26       GIANCARLOS NOA FLORES                      Peru                             18             8 months ago     0.7918177693992
27       Javier J Desario                           Argentina                        8              6 months ago     0.790670709094499
28       Denis Tsitko                               Russian Federation               2              6 months ago     0.784334315080844
29       Alonso Burgos                              Chile                            1              8 months ago     0.78306858303018
30       diego_corona-en                            Mexico                           10             5 months ago     0.782270935694076
31       Raja Hamza Azhar                           Pakistan                         26             8 months ago     0.775355385533193
32       Purity Nyagweth (Featured)                 Kenya                            4              6 months ago     0.716542631680246
33       Stalyn Quishpe-en                          Ecuador                          5              7 months ago     0.701500805975297
34       Shiv Kumar                                 India                            1              8 months ago     0.69970189385083
35       atuq                                       Bolivia, Plurinational State of  2              6 months ago     0.695728226085127
36       Frank Smith                                United States                    2              8 months ago     0.457863110068885
37       Manav Mehra                                Canada                           1              7 months ago     0.438637465558223



Timeline


Competition start: 2021/01/17 00:00:00
Competition closes on: 2021/03/31 00:00:00
Final Submission Limit: 2021/04/07 00:00:00

This competition has a total duration of 3 months, within which you will be able to make your submissions and obtain results automatically. Once the first part of the competition is over, you will have one week to choose your best model and submit it to be graded and considered for cash or points prizes. 

Once the whole process is completed, you can still submit models as a "Late Submission" for learning purposes. Because the competition is officially over by then, those models are not eligible to win prizes.


Description

In this competition, we will analyze the activity of users who visit a service/product offered online through a website. The objective is to predict which visitors will decide to buy according to the characteristics and interactions they exhibit on the site.

In this special case, we are working with a classification/clustering problem. Of the 12,330 sessions on the website, 84.53% (10,422) did not end in a purchase, and the rest (1,908) ended with a purchase.
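As a quick arithmetic check, the session counts quoted above can be turned into class shares. This is only a sketch using the three numbers from the description; the constant names are illustrative and not part of the competition files.

```python
# Session counts quoted in the competition description.
TOTAL_SESSIONS = 12_330
NO_PURCHASE = 10_422
PURCHASE = 1_908

# The two classes should add up to the total number of sessions.
assert NO_PURCHASE + PURCHASE == TOTAL_SESSIONS

no_purchase_share = NO_PURCHASE / TOTAL_SESSIONS
purchase_share = PURCHASE / TOTAL_SESSIONS

print(f"no purchase: {no_purchase_share:.2%}")  # no purchase: 84.53%
print(f"purchase:    {purchase_share:.2%}")     # purchase:    15.47%
```

The classes are clearly imbalanced, which is worth keeping in mind when reading the evaluation metric.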


Evaluation

Submissions are evaluated with the F1 Score ("Macro") metric.

First we define the F1 score for a single class:

F1 = 2 * (precision * recall) / (precision + recall)

The Macro F1 score is defined as the average of the F1 scores of the classes/labels. With the two classes in this competition, this is equal to

F1_macro = (F1_class0 + F1_class1) / 2
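The two formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the platform's scoring code: the function names `f1_for_class` and `macro_f1` are hypothetical, and returning 0.0 for a class with no true positives is an assumed convention. The toy labels are made up for the example. If scikit-learn is available, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` should compute the same quantity.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # Assumed convention: no true positives means an F1 of 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * (precision * recall) / (precision + recall)


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted average of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, lb) for lb in labels) / len(labels)


# Toy example (made-up labels): 4 sessions without purchase, 2 with purchase.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

print(f1_for_class(y_true, y_pred, 0))  # 0.75
print(f1_for_class(y_true, y_pred, 1))  # 0.5
print(macro_f1(y_true, y_pred))         # 0.625
```

Because both classes weigh the same in the average, a model that always predicts 0 gets no credit for class 1, even though class 0 is the large majority.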


In these links you can find more information about the F1 score:


Note: we automatically perform these evaluations against a validation dataset of ours, but you should take these metrics as a reference for your modeling tests.


Rules

Competition Rules

  • Code must not be shared privately. Any code that is shared must be available to all participants of the competition through the platform
  • The solution should use only publicly available open source libraries
  • If two solutions get identical scores in the ranking table, the tie-breaker will be the date and time of the submission (the first solution submitted will win).
  • We reserve the right to request any user's code at any time during a challenge. You will have 48 hours to submit your code following the code review rules.
  • We reserve the right to update these rules at any time.
  • Your solution must not infringe the rights of any third party and you must be legally authorized to assign ownership of all copyrights in and to the winning solution code to DataSource.ai.
  • Competitors may register and submit solutions as individuals (not as teams, at least for now).
  • As this is a learning competition, apart from the rules in the DataSource.ai Terms of Use, no other particular rules apply.
  • Maximum 10 solutions submitted per day.

At the end of the competition, if you are in the top 20, you must submit the complete model in .ipynb (Jupyter Notebook) format; no other formats will be accepted. You will have 48 hours after the end of the competition to send it to [email protected]. We will use this model to compute the real final evaluations, so the Leaderboard could change when the final private evaluation is shown.


Prizes

For this competition we want to give, in addition to the 10,000 points, a very special gift for first place!

We will send this gift to any country and city in Latin America! (made by https://www.devwear.co/)

*The hoodie is for men or women (unisex).

Score Scale

These will be the awards once the competition is over:

  • 1st Place: 10,000 pts + Python Hoodie (Delivery to any city in Latin America)
  • 2nd Place: 9,000 pts
  • 3rd Place: 8,000 pts
  • 4th Place: 7,000 pts
  • 5th Place: 6,000 pts
  • 6th Place: 5,000 pts
  • 7th Place: 4,000 pts
  • 8th Place: 3,000 pts
  • 9th Place: 2,000 pts
  • 10th Place: 1,000 pts

Points: 10000pts


The data set corresponds to 12,330 unique sessions per user, which are divided into

  • 8,631 for the training set (Train.csv)
  • 3,699 for the test set (Test.csv)

This data was obtained over 12 months to avoid special day trends or specific campaigns. 

The file SampleSubmission.csv shows the format in which you should send your predictions. Its characteristics are:

  • You must send your submission file with only 2 columns
  • Column 0 should be called: 'id'
  • Column 1 should be called: 'revenue'
  • The file must contain a total of 3,700 rows, where:
    • The first row is the header
    • The other 3,699 rows are your predictions
  • If your submission file does not meet these rules, the system will automatically reject it

Note: we recommend that you check the file SampleSubmission.csv, which looks like this:


id           revenue
1            0
2            0
3            1
4            0
5            1
6            1
etc.
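The format rules above can be checked locally before uploading. The following is a rough sketch using only Python's standard library; it assumes the file is comma-separated (as the .csv extension suggests), and the helper names `write_submission` and `check_submission` are hypothetical. The platform's real validation may apply different or additional checks.

```python
import csv
import io

EXPECTED_HEADER = ["id", "revenue"]
EXPECTED_PREDICTIONS = 3699  # 3,700 rows in total, including the header


def write_submission(ids, predictions, handle):
    """Write the header plus one 'id,revenue' row per prediction."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(EXPECTED_HEADER)
    for row_id, prediction in zip(ids, predictions):
        writer.writerow([row_id, int(prediction)])


def check_submission(handle):
    """Return a list of rule violations (empty if none were found)."""
    rows = list(csv.reader(handle))
    problems = []
    if not rows or rows[0] != EXPECTED_HEADER:
        problems.append("header must be exactly id,revenue")
    body = rows[1:]
    if len(body) != EXPECTED_PREDICTIONS:
        problems.append(
            f"expected {EXPECTED_PREDICTIONS} prediction rows, found {len(body)}"
        )
    if any(len(row) != 2 or row[1] not in ("0", "1") for row in body):
        problems.append("every row needs 2 columns and a revenue of 0 or 1")
    return problems


# Placeholder data: real ids come from Test.csv, real predictions from a model.
ids = range(1, EXPECTED_PREDICTIONS + 1)
predictions = [0] * EXPECTED_PREDICTIONS

buffer = io.StringIO()  # stands in for a file opened with open(..., "w", newline="")
write_submission(ids, predictions, buffer)
buffer.seek(0)

problems = check_submission(buffer)
print(problems or "submission looks valid")
```

An in-memory buffer is used here so the sketch runs without touching the disk; writing to a real file works the same way with a file handle.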

Variables definition:

  • id: unique ID of the website visitor
  • administrative: Number of times the user visited the administrative section
  • administrative_duration: Total time the user spent in the administrative section
  • informational: Number of times the user visited the informational section
  • informational_duration: Total time the user spent in the informational section
  • productrelated: Number of times the user visited the related products section
  • productrelated_duration: Total time the user spent in the related products section
  • bouncerates: This is the percentage of visitors who enter the page and immediately "bounce" without interacting with it. This metric is only taken into account if it is the first page they visit within the website.
  • exitrates: Of all the visits to a specific page of the website, the percentage that were the last in the session, that is, the percentage of users whose last visit to the website was this specific page.
  • pagevalues: The average value of the page; it indicates the contribution that the page made to the visitor arriving at the final purchase page or section.
  • specialday: The value that indicates the proximity to a special date such as Valentine's Day. The range of this variable is 0 to 1, with 1 being the exact day of the special date and 0 meaning the visit is not close to any special date.
  • month: Month of the visit to the website.
  • operatingsystems: Type of operating system
  • browser: Name of the web browser
  • region: Visitor's geographic region
  • traffictype: Type of web traffic
  • visitortype: Whether the visitor is new or returning
  • Weekend: 0 indicates that it is not a weekend day and 1 indicates that it is a weekend day.

Target variable:

  • revenue: Variable to be classified, 1 indicates that the visitor has bought and 0 indicates that the visitor has not bought.
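Several of the variables above (for example month and visitortype) are categories rather than quantities, so many models need them converted to numbers first. One common option is one-hot encoding, sketched below with the standard library only. The helper `one_hot` is hypothetical, and the example values ("Feb", "Returning_Visitor", and so on) are assumptions made for illustration; check Train.csv for the values actually used.

```python
def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator column per observed category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Made-up sessions with a subset of the columns described above.
sessions = [
    {"id": 1, "month": "Feb", "visitortype": "Returning_Visitor", "pagevalues": 0.0},
    {"id": 2, "month": "Mar", "visitortype": "New_Visitor", "pagevalues": 12.5},
    {"id": 3, "month": "Feb", "visitortype": "New_Visitor", "pagevalues": 0.0},
]

for column in ("month", "visitortype"):
    sessions = one_hot(sessions, column)

print(sessions[0])
```

The categories are derived from the rows passed in, so the training and test sets should be encoded with the same category list to keep their columns aligned.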

For this competition stage, you need to send your submission file with these details:

# of columns: 2
Column names: id,revenue
# of rows: 3700


10 Comments
  1. Daniel Morales
    5 months ago
    It is working again, you can submit your models!
  2. Daniel Morales
    5 months ago
    Hi Diego, we have a problem on the server. We are working on fixing it.
  3. diego_corona-en
    5 months ago
    Hi, I am trying to make a late submission but I get a '500' error.
  4. Felipe Perez
    6 months ago
    Great to have the "Late Submission" option now, congratulations!
  5. ANGEL JORGE SALAZAR
    6 months ago
    Very good competition, and the certificates are great too, so let's give it our best.
  6. Daniel Morales
    6 months ago
    Excellent, that's the idea!
  7. Pablo Neira Vergara
    6 months ago
    Excellent! They will look good on my LinkedIn and you get some publicity along the way; everybody wins.
  8. Daniel Morales
    6 months ago
    That's right, Pablo. We received several requests about it, so we decided to add it. We have also just launched competition certificates, a feature you requested and one we found very interesting! You can already find it under the "Mi Perfil" tab in your dashboard. The sub-tab is called "Certificates". They can be downloaded as PDF or shared on LinkedIn automatically. They are awarded to the top 10 places in each competition. It would be great to know what you think about it. And of course, any additional ideas or feedback are always welcome!
  9. Pablo Neira Vergara
    6 months ago
    Good to see that you added a discussion section.
  10. Felipe Perez
    6 months ago
    Very good competition! Although I haven't been able to improve my score much further :(
