BFH-HAFL, ETH Zurich, Utrecht University

Milk MIR Spectra and Machine Learning to Verify Herbage Proportion in Dairy Herd Diets

How to verify grassland-based feeding? An approach based on machine learning and milk mid-infrared (MIR) spectra acquired from routine milk quality testing may be the answer.

Background

Milk from grassland-based production systems is considered sustainable in countries with a high proportion of grasslands, such as Switzerland. Consumers are often willing to pay a premium for the products of such production systems, as they are commonly associated with high-welfare animal husbandry and human health benefits. However, there is currently no standard definition for products of grassland-based production systems. The “grass-fed” claim can only be verified indirectly by measuring the proportion of grassland-based feeds in the ration, the time spent on pasture or the grazing area available to the herd. These methods are time-consuming, complex and error-prone. Therefore, more direct measurement techniques are required. Milk MIR spectral data, already used to determine fat, protein, lactose and urea contents, combined with machine learning, may be the answer.

New methods based on machine learning

To date, partial least squares discriminant analysis (PLS-DA) has been the only machine-learning algorithm employed to attempt to classify the proportion of grassland-based forages in the herd diet using milk MIR spectra. In the recently published study, three further machine-learning algorithms (Least Absolute Shrinkage and Selection Operator [LASSO], Random Forest [RF] and Support Vector Machines [SVM]) were used to predict grassland-based feeding. The data input comprised the MIR spectral wavenumbers themselves, milk quality characteristics (fat and protein) and a newly developed indicator that takes seasonality into account.

Future prospects

Across the 1,132 milk samples analysed, the best-performing method (LASSO) classified total grassland-based feeds ≥ 50% with an accuracy of 79%, a precision of 85%, a sensitivity of 91%, a specificity of 14% and an F1 score of 88%. Fresh grass ≥ 70% was classified with an accuracy of 86%, a precision of 62%, a sensitivity of 39%, a specificity of 95% and an F1 score of 48%. Based on the results presented in the publication, a method could be developed in the future that allows automated and simplified verification of grassland-based feeding, as desired by public and private organisations.
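The F1 scores quoted above can be cross-checked from the reported precision and sensitivity, since F1 is the harmonic mean of the two. The following is a minimal sketch of that check, not code from the study: the function name `f1_score` and the `reported` dictionary are illustrative assumptions, and the inputs are the rounded percentages from the text, so only agreement to two decimal places can be expected.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Return the harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Rounded values as reported in the text, converted from percent to fractions.
# The dictionary layout is a hypothetical convenience for this sketch.
reported = {
    "total grassland-based feeds >= 50%": {
        "precision": 0.85, "sensitivity": 0.91, "f1": 0.88,
    },
    "fresh grass >= 70%": {
        "precision": 0.62, "sensitivity": 0.39, "f1": 0.48,
    },
}

for task, m in reported.items():
    recomputed = f1_score(m["precision"], m["sensitivity"])
    print(f"{task}: recomputed F1 = {recomputed:.2f}, reported F1 = {m['f1']:.2f}")
```

For both classification tasks the recomputed value rounds to the reported F1 score. Note that F1 does not take specificity into account, which is why the first task can combine an F1 score of 88% with a specificity of only 14%.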

Conclusion

  • The combined approach of using dietary information, MIR spectral data, milk quality traits and seasonality in machine learning models seems promising.
  • The LASSO and PLS-DA algorithms seem to be superior to RF and SVM algorithms for the classification of grassland-based forages in the diet.
  • The models analysed have proven successful in classifying milk from grassland-based production systems on a limited number of farms. The models developed must now be validated using larger datasets and, if necessary, supplemented with additional characteristics.