This use case showcases the potential of multimodal data analytics methods and machine-learning algorithms for in-field plant phenotyping.

Partners

Forschungszentrum Jülich (FZJ)

University of Bonn

This team of partners is working on the success of this use case.

Background

Sustainable crop production requires knowledge of the structural and functional status of crops in the field to plan and execute precise interventions. For instance, rather than frequent pesticide or fertilizer application, a plant-specific demand should be determined and plant-level interventions planned and executed. An important agricultural research question is how such an increase in precision can be achieved. Drones and field robots, in combination with modern sensor systems and cameras, are promising tools for providing on-demand, real-time data on crop traits; accordingly, plant phenotyping has been adopted in plant breeding and modern agriculture. To quantify complex plant traits such as stress resilience or disease resistance, combining multiple noninvasive phenotyping sensors with advanced data analytics methods and machine-learning algorithms is now state of the art.

While significant progress has been made in recent years on sensor technology, machine learning, and autonomous robotics, we are still in the early stages of realizing the full potential of these data. One possible extension is the fusion and combined analysis of repeatedly acquired sensor data with environmental data, combining information across years and including information on previous field interventions. Another relevant factor is the integration of genetic and biochemical characteristics of different crop types and breeding lines. Furthermore, it remains an open research question whether and how such autonomous multisensor robots can be developed and deployed in the field, and how data streams from combined sensors and algorithms can be merged to identify relevant plant traits or detect stress situations at an early stage.

Objectives

The main objective of this use case (UC) is to investigate and showcase the potential of a FAIR data infrastructure for developing multimodal data analytics methods and machine-learning algorithms for in-field plant phenotyping and agricultural robotics. In particular, this UC aims to define requirements for data curation services, including automated tools for quality assurance and data harmonization that increase the usefulness of data for combined analyses. This UC will also explore ways to visualize field environment data stored in the data infrastructure for researchers, assisting them in operating multisensor systems or controlling agricultural robots (e.g., deciding which field operations a robot should perform). By conducting a pilot study, this UC will act as an early adopter of the services developed in FAIRagro. Furthermore, we are working closely with the data stewards to incorporate this use case's specific requirements for phenotype data used in machine learning into FAIRagro standards and guidelines, and to inform the data stewards on how to curate additional datasets accordingly. Aligning the requirements of machine-learning experts with the potential of field phenotyping remains a scientific challenge: while field phenotyping provides a huge amount of quantitative crop data, essential information may still be missing to make these data usable for machine-learning approaches.

Actions

  1. Development of RDM requirements for field phenotypic data
  2. Generation and release of a benchmark dataset of complete field phenotyping data
  3. Pilot study with data curation and visualization services

Progress & next steps

Making an Impact: FAIRagro UC5 Benchmark Dataset Gains Great Attention

Who has taken notice of your dataset/publication?
Interest in the data publication has been overwhelming; among others, colleagues from NFDI4Bioimage have taken notice.

Why?
The unmanned aerial vehicle (UAV) used for data collection is equipped with both standard red-green-blue (RGB) and multispectral imaging sensors. This setup enables direct side-by-side comparisons of visual and spectral data, making the dataset particularly valuable for diverse imaging communities and methodological comparisons.

Is the dataset relevant for reuse? If so, for whom and for which types of research questions?
Yes, the dataset is relevant for researchers working on:

  • Solar-induced fluorescence
  • Computer vision
  • Image processing workflows

What has your publication/dataset triggered that might not have happened otherwise (e.g. research, collaboration, etc.)? Which challenges has it helped to reduce?
Workflow documentation is currently in development. It will offer step-by-step guidance for data processing and help improve data quality, thereby supporting reproducibility and reuse.

What else is cool about it?
The dataset uses the MIAPPE (Minimum Information About a Plant Phenotyping Experiment) metadata standard.
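As a rough illustration of what a MIAPPE-style metadata record looks like, the sketch below builds a small Python dictionary with a few attributes in the spirit of the MIAPPE checklist and checks it for missing required fields. The field names are simplified, the values are invented examples, and the `missing_required_fields` helper is hypothetical; none of it is taken from the actual UC5 benchmark dataset.

```python
# Illustrative (hypothetical) MIAPPE-style study metadata.
# Field names loosely follow the MIAPPE checklist; the values
# below are invented examples, not drawn from the UC5 dataset.
study_metadata = {
    "Investigation Title": "UAV-based field phenotyping of crop canopies",
    "Study Title": "RGB and multispectral imaging flights, season 1",
    "Start date of study": "2023-05-01",
    "Geographic location (country)": "Germany",
    "Observation unit type": "plot",
    "Biological material": {"Genus": "Triticum", "Species": "aestivum"},
}

# A minimal set of fields treated as required for this sketch.
REQUIRED_FIELDS = [
    "Investigation Title",
    "Study Title",
    "Start date of study",
]

def missing_required_fields(metadata: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not metadata.get(f)]

print(missing_required_fields(study_metadata))  # -> []
print(missing_required_fields({"Study Title": "Incomplete record"}))
```

Automated checks of this kind are one simple form the data curation and quality assurance services mentioned above could take: a record that is missing required metadata can be flagged before publication rather than discovered by a would-be reuser.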

Direct link to the benchmark dataset.


Read the detailed Update and Progress Report by Irek Kleppert and Anne Sennhenn from April 2024.

Any questions about this use case?

Please contact Anne Sennhenn for further information.