Plant Diseases Classification

Competitors
  • drdsdaniel-en
  • diegorsdiaz-en
  • gentstats
  • karel
  • jazielinho-en
  • Gilson
  • navinp0304
45 Competitors
Published at: 07/01/2021
Points
20,000pts



Timeline

Begin
2021/09/10
Finish
2021/12/10
Complete
2021/12/17

Competition start: 2021/09/10 00:01:00
Competition closes on: 2021/12/10 23:59:00
Final Submission Limit: 2021/12/17 23:59:00

This competition runs for 3 months, during which you can make submissions and receive results automatically. Once this first stage is over, you will have one week to choose your best model and submit it to be graded and considered for cash or points prizes.

Once the whole process is completed, you can still submit models as a "Late Submission" for learning purposes. Because the competition is officially over by then, those models will not be eligible to win prizes.


Description

Apple crops are one of the most important crops in the world. Foliar (leaf) diseases pose a major threat to the productivity and overall quality of apple crops. The current process of diagnosing diseases in apple crops by farmers is based on manual scanning by humans, which is time-consuming and costly.

Although computer vision-based models have shown promise for pattern identification, there are some limitations that need to be addressed. Large variations in visual symptoms of the same disease among different apple cultivars, or new varieties originating in the crop, are the main challenges for computer vision-based disease identification. 

These variations arise from differences in natural and image-capture environments, such as leaf color and morphology, the age of infected tissues, non-uniform image backgrounds, and varying illumination during image acquisition.


Evaluation

The evaluation metric for this competition is Macro Mean F1-Score: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
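
As a rough illustration of the metric: macro F1 computes one F1 score per class and takes their unweighted mean, so rare classes weigh as much as common ones. The sketch below is a plain-Python version that treats each full "labels" string as one class. That treatment, the helper name macro_f1, and the sample values are assumptions for illustration, not the platform's confirmed scoring code. For inputs like these it should agree with sklearn.metrics.f1_score(y_true, y_pred, average="macro") from the linked page.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); a class with no hits scores 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical ground truth and predictions.
y_true = ["healthy", "scab", "rust", "scab"]
y_pred = ["healthy", "scab", "scab", "rust"]
score = macro_f1(y_true, y_pred)  # healthy=1.0, scab=0.5, rust=0.0
```

Under this string-as-class assumption, "complex scab" and "scab complex" would count as two different classes, so keeping a consistent label order matters.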


Rules

Competition Rules

  • Code must not be shared privately. Any code that is shared must be available to all participants of the competition through the platform.
  • The solution should use only publicly available open source libraries
  • If two solutions get identical scores in the ranking table, the tie-breaker will be the date and time of the submission (the first solution submitted will win).
  • We reserve the right to request any user's code at any time during a challenge. You will have 48 hours to submit your code following the code review rules.
  • We reserve the right to update these rules at any time.
  • Your solution must not infringe the rights of any third party and you must be legally authorized to assign ownership of all copyrights in and to the winning solution code to DataSource.ai.
  • Competitors may register and submit solutions as individuals (not as teams, at least for now).
  • As this is a learning competition, apart from the rules in the DataSource.ai Terms of Use, no other particular rules apply.
  • Maximum 50 solutions submitted per day.

At the end of the competition you must submit the complete model in .ipynb (Jupyter Notebook) format; no other formats will be accepted. Normally, you will have 1 week after the end of the competition to send it through our "Submit Final Model" button. This model is used to compute the final evaluation, so the Private Leaderboard may change once the final private evaluation is shown.


For this competition we want to give, in addition to the 20,000 points, a very special gift for the first place!

We will ship this prize to any country or city in the world! (made by https://www.devwear.co/)




*The hoodie is for men or women (unisex).

Score Scale

These will be the awards once the competition is over:

  • 1st Place: 20,000 pts + TensorFlow Hoodie (Delivery to any city around the world)
  • 2nd Place: 19,000 pts
  • 3rd Place: 18,000 pts
  • 4th Place: 17,000 pts
  • 5th Place: 16,000 pts
  • 6th Place: 15,000 pts
  • 7th Place: 14,000 pts
  • 8th Place: 13,000 pts
  • 9th Place: 12,000 pts
  • 10th Place: 11,000 pts



This is a problem that can be solved using computer vision-based models or other deep learning techniques, since such models have proven effective in identifying patterns in images. 

It is also a multi-label classification problem, where the target column can have one or more possible labels. It is also important to clarify that this is an imbalanced dataset for certain labels.
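
To make the multi-label and imbalance points concrete, here is a minimal sketch that splits space-separated label strings, counts how often each label occurs, and builds multi-hot target vectors. The sample values are hypothetical, and the space-separated layout is an assumption based on the examples quoted in the comments on this page. The helper names are illustrative only.

```python
from collections import Counter

# Hypothetical sample of the "labels" column (space-separated label names).
labels_column = [
    "healthy",
    "scab",
    "complex frog_eye_leaf_spot scab",
    "powdery_mildew complex",
    "rust",
    "scab",
]


def split_labels(value):
    """Split one cell of the "labels" column into individual label names."""
    return value.split()


# Per-label frequency: a quick way to see which labels are under-represented.
label_counts = Counter(
    label for value in labels_column for label in split_labels(value)
)

# Fixed, sorted class order so every vector uses the same positions.
classes = sorted(label_counts)


def to_multi_hot(value, classes):
    """Encode one label string as a 0/1 vector over `classes`."""
    present = set(split_labels(value))
    return [1 if c in present else 0 for c in classes]


encoded = [to_multi_hot(value, classes) for value in labels_column]
```

With real data, the counts from label_counts can guide choices such as class weighting or stratified splitting.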

Datasets:
  • The dataset named Train.csv has 3 columns: "id", "image" and "labels". As its name indicates, this will be the dataset used to train the model. The total number of samples in this dataset is 931. See links to images.
  • The dataset named Test.csv has 2 columns: "id", "image". As its name indicates, this will be the dataset used to test the model, making predictions on the "labels" column and sending the predictions to the platform. The total number of samples in this dataset is 466. See links to images.
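
A small sketch of reading these files and checking the columns described above, using only the standard library. The in-memory sample and its image file names are hypothetical stand-ins; the real files would be opened from wherever you saved Train.csv and Test.csv.

```python
import csv
import io

TRAIN_COLUMNS = ["id", "image", "labels"]
TEST_COLUMNS = ["id", "image"]


def read_rows(handle, expected_columns):
    """Read a CSV into a list of dicts, failing early if the header differs."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != expected_columns:
        raise ValueError(
            f"expected columns {expected_columns}, got {reader.fieldnames}"
        )
    return list(reader)


# Hypothetical stand-in for Train.csv (image names are made up).
sample_train = io.StringIO(
    "id,image,labels\n"
    "1,img_0001.jpg,healthy\n"
    "2,img_0002.jpg,scab complex\n"
)
train_rows = read_rows(sample_train, TRAIN_COLUMNS)
```

With the real files you would call read_rows(open("Train.csv", newline=""), TRAIN_COLUMNS) and could then compare len(train_rows) against the 931 samples stated above (466 for Test.csv).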

The images are in separate links, and it is not necessary to re-send them to the platform. Here are the links to the images:

Links to images:
  • /train_images: is the folder where the reference images for training the model are located and match the names in the "image" column of the Train.csv dataset. Link here: https://drive.google.com/drive/u/1/folders/1FnZZ-0BKrQsqf175Z4iL4lVUUuavyWJ5
  • /test_images: is the folder where the reference images for testing the model and making the predictions to send to the platform are located. The images found there match the names in the "image" column of the Test.csv dataset. Link here: https://drive.google.com/drive/u/1/folders/149dPpgpFA-_Gjh-EJm61Typhh5yiGXqf
  • In the final stage of the competition, the final set of images will be revealed. You will run your best model on it to make a single prediction, which will determine your final score on the private leaderboard. Please keep an eye on this paragraph, as we will post that link here once the final stage begins.


Submission File

For each "id" in the Test.csv set, you must predict the label(s) for the target variable "labels". The file must contain a header and have the following format:

id,labels
1,healthy
2,frog_eye_leaf_spot
3,scab
4,rust
5,rust
6,scab
7,scab
...
463,healthy
464,frog_eye_leaf_spot
465,scab
466,rust

For this competition stage, you need to send your submission file with these details:

# of columns: 2
Column names: id,labels
# of rows: 467 (466 predictions plus the header row)
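
The format above can be produced and sanity-checked with a short script. This is only a sketch: the function names and sample predictions are hypothetical, and the checks cover just the details listed here (header, column count, row count, integer ids). Writing ids through int() is one way to avoid the scientific-notation error reported in the comments below.

```python
import csv
import io


def write_submission(handle, predictions):
    """Write (id, labels) pairs as CSV with the required header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["id", "labels"])
    for sample_id, labels in predictions:
        # int() keeps ids as plain integers, never scientific notation.
        writer.writerow([int(sample_id), labels])


def check_submission(text, expected_predictions):
    """Return a list of format problems; an empty list means none were found."""
    rows = list(csv.reader(io.StringIO(text)))
    problems = []
    if not rows or rows[0] != ["id", "labels"]:
        problems.append("header must be id,labels")
    if len(rows) != expected_predictions + 1:
        problems.append(
            f"expected {expected_predictions + 1} rows including the header"
        )
    for row in rows[1:]:
        if len(row) != 2 or not row[0].isdigit():
            problems.append(f"bad row: {row}")
    return problems


# Hypothetical predictions for three test ids.
predictions = [(1, "healthy"), (2, "frog_eye_leaf_spot"), (3, "scab")]
buffer = io.StringIO()
write_submission(buffer, predictions)
submission_text = buffer.getvalue()
problems = check_submission(submission_text, expected_predictions=3)
```

For the real file you would pass open("submission.csv", "w", newline="") as the handle (the file name is an assumption) and check against expected_predictions=466.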


5 Comments
  1. Nicolás Aldecoa Rodrigo-en
    about 1 month ago
    Thank you for answering Daniel, I've just confirmed that the order matters when the labels are checked, different permutations of my previous submission had different scores.
    Could you please provide the correct ordering for the labels? 
    This is the first competition that I've joined, so it might just be that my first model is really bad, but I'm getting a huge discrepancy between my cross-validation scores and the evaluation score on the test set as reported by the website; and looking at the images, it doesn't seem like there is a huge domain shift between the sets.
  2. Daniel Morales
    about 1 month ago
    Hi Nicolás, thanks for writing to us. Technically the order shouldn't affect the score, so you can test both scenarios and they should give you the same result. However, I recommend sending the results in the same order, to be sure everything is right. Best!
  3. Nicolás Aldecoa Rodrigo-en
    about 1 month ago
    Hi, I'd like to confirm if the labels in the csv are expected to be submitted in the same format as the column in Train.csv, for example:
    ...
    16,powdery_mildew complex
    17,scab
    18,complex
    19,complex frog_eye_leaf_spot scab
    20,healthy
    ...
    are the labels (separated by spaces) permutation invariant for the metric? 
    "complex frog_eye_leaf_spot scab" would be the same encoded label as "scab frog_eye_leaf_spot complex" right? or do I have to respect a predefined ordering?

    Thank you.
  4. Daniel Morales
    2 months ago
    Hi Kudasov. Thanks for letting us know about the issue. We've already solved it, so you can send your model submission now. If you have any other questions, please let me know. Best!
  5. kudasov.dm
    2 months ago
    Hi guys!
    I have this error when I try to submit my model
    Any ideas how I can fix that?

    """ Error: You have the following error in your submission file
    * Scientific notation: The system does not allow scientific notation values similar or equal to this syntax: '5.54538E+11'
    Please make sure your file is correct and run the submission again. """
