Cheese varieties from Switzerland are characterised according to various criteria. Agroscope analysed the free volatile carboxylic acids in ten cheese varieties and demonstrated that these acids are suitable for characterising and differentiating the varieties.
In recent decades, mainly on behalf of the respective cheese consortia, Agroscope has characterised various Swiss cheese varieties using chemical, biochemical, physical and sensory analyses. The primary aim was descriptive characterisation, but differentiation was always a driving force behind these projects. Differentiation is only possible when varieties are compared, which this project accomplished with supervised Machine Learning (ML) algorithms¹. The free volatile carboxylic acids (FVCAs) of 241 samples from ten cheese varieties were included in the analysis.
90% of the cheese samples can be correctly classified
Several algorithms were tested in parallel using the PyCaret program library. The best results were achieved with the tree-based algorithms Extra Trees and Random Forest. After ten training runs on 70% of the data, over 90% of the test data (the remaining 30% of the cheese samples) was correctly classified, a highly promising result.
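The evaluation described above (train on 70% of the samples, score on the held-out 30%) can be sketched as follows. This is a minimal illustration that calls scikit-learn directly rather than PyCaret, and it runs on synthetic data: the three variety labels, the four acid columns and all concentrations are invented assumptions, not Agroscope measurements.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical feature columns (acid concentrations in arbitrary units).
ACIDS = ["formic", "acetic", "butyric", "caproic"]

# Invented class centres for three made-up varieties; NOT real cheese data.
centres = np.array([
    [1.0, 5.0, 2.0, 0.5],   # variety_A
    [4.0, 5.5, 2.2, 0.6],   # variety_B: high formic acid
    [0.2, 4.5, 2.1, 1.5],   # variety_C: low formic, high caproic acid
])
X = np.vstack([c + rng.normal(scale=0.3, size=(40, len(ACIDS))) for c in centres])
y = np.repeat(["variety_A", "variety_B", "variety_C"], 40)

# 70% of the samples for training, 30% held out for testing,
# stratified so every variety appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

models = {
    "Extra Trees": ExtraTreesClassifier(n_estimators=200, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit each tree-based model and record its accuracy on the held-out samples.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in results.items():
    print(f"{name}: {acc:.1%} of test samples correctly classified")
```

Stratifying the split matters when some varieties are represented by only a few samples, because an unstratified 30% test set could otherwise miss a variety entirely.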
Formic acid is the most important carboxylic acid
Formic acid was of greatest importance for the correct classification of the cheese samples. This acid is produced either by an added facultatively heterofermentative Lactobacillus culture, as in Appenzeller® or Emmentaler AOP, or by the same bacterial group originating from raw milk, as in Raclette du Valais AOP. Extra-hard cheeses, by contrast, contain less formic acid, which is in turn characteristic of them. Acetic acid and butyric acid were less important for classifying the cheese varieties.
Key carboxylic acids for each cheese variety
SHAP values (SHapley Additive exPlanations²) were used to weight the carboxylic acids according to their importance for the correct classification of each cheese variety. For example, a lower proportion of formic acid combined with a comparatively higher proportion of caproic acid is characteristic of Berner Hobelkäse.
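To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by hand for one hypothetical sample. It is not the authors' pipeline: the scoring function, the acids used as features, the baseline and the sample concentrations are all invented, and a real analysis would more likely apply the `shap` package to the fitted tree model.

```python
from itertools import combinations
from math import factorial

# Hypothetical features (acid concentrations in arbitrary units).
ACIDS = ["formic", "acetic", "caproic"]


def score(x):
    """Toy score for a made-up 'extra-hard cheese' class (not a fitted model)."""
    return -2.0 * x["formic"] + 0.5 * x["acetic"] + 3.0 * x["caproic"]


# Invented reference point (e.g. an average cheese) and invented sample.
baseline = {"formic": 2.0, "acetic": 5.0, "caproic": 0.5}
sample = {"formic": 0.5, "acetic": 5.0, "caproic": 1.5}


def value(coalition):
    """Score when only the acids in `coalition` take the sample's values;
    all other acids stay at the baseline."""
    mixed = {a: (sample[a] if a in coalition else baseline[a]) for a in ACIDS}
    return score(mixed)


def shapley_values():
    """Exact Shapley values: weighted average of each acid's marginal
    contribution over all coalitions of the remaining acids."""
    n = len(ACIDS)
    phi = {}
    for acid in ACIDS:
        others = [a for a in ACIDS if a != acid]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = set(subset)
                total += weight * (value(without | {acid}) - value(without))
        phi[acid] = total
    return phi


phi = shapley_values()
for acid, contribution in phi.items():
    print(f"{acid}: {contribution:+.2f}")
```

In this toy case the low formic acid and the high caproic acid each contribute 3.0 to the score, while acetic acid, which equals the baseline, contributes nothing. The contributions sum to the difference between the sample's score and the baseline score, which is the additivity property that gives SHAP its name. Enumerating all coalitions is only feasible for a handful of features.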
¹ Supervised learning is a Machine Learning (ML) process in which an algorithm is presented with a dataset whose target variable is already known. The algorithm learns correlations and dependencies in the data that explain this target variable. Once training is finished, the quality of the prediction is evaluated, after which the learned patterns are applied to unknown data to make predictions.
² SHAP is an approach based on game theory that is used to explain the output of a Machine Learning model.
- Free volatile carboxylic acids (FVCAs) are valuable features for characterising and differentiating Swiss cheese varieties.
- Tree-based algorithms can correctly classify at least 90% of cheese samples on the basis of FVCAs alone.
- Interpreting the SHAP values allows us to weight the importance of the individual carboxylic acids for classification.
- Formic acid is the most important and butyric acid the least important carboxylic acid for correct classification of the cheese varieties.
- The PyCaret program library proved to be a simple tool showing great promise for practical application.