Deep learning-based classification of cattle behavior using accelerometer sensors

ABSTRACT


INTRODUCTION
The world's urban population is growing rapidly, driven by a combination of overall population growth and the movement of people from rural to urban areas [1]. The United Nations projects that by 2050, the world's urban population will increase by 2.5 billion people, with nearly 90% of this growth taking place in Asia and Africa [2]. This growth has significant implications for the demand for food, including meat, as urbanization changes dietary preferences and increases the demand for protein. These shifts in the global economy are affecting the beef cattle industry [3], necessitating a boost in the production and efficiency of high-quality meat. Increased meat production can also generate more profit from sales, which is significant given the crucial role that livestock plays in the economy [4].
Precision livestock is an approach to cattle management that relies on information and communication technology to introduce best practices in meat production [5]. By automating various aspects of the industry, such as optimizing production costs and minimizing environmental effects, this method allows for increased productivity. A significant advantage of precision livestock is that it treats the data of each animal individually [6], enabling decision-making based on each animal's unique potential, including economic objectives and welfare indicators. As a result, more effective management strategies can be implemented [7]. Overall, precision livestock represents a major advancement in the industry [8]. Livestock production activities can be managed either manually or through automation. Manual methods rely on human monitoring of the animals, which can be expensive and lead to inaccuracies in the information recorded [9]. In contrast, automated techniques provide more precise data and can help identify the source of issues and better monitor the animals by enabling quick and accurate tracking of the individual history of each animal [10]. Automation also helps to reduce reading errors, leading to improved quality of production [11].
To keep up with the growing demand for food and the expanding population, farmers must enhance their productivity and performance [12]. To achieve this goal, they need to rely on new technologies based on the Agriculture 4.0 standard and adopt innovative techniques to optimize their livestock farms [13]. These technologies can enable smart and efficient management strategies through real-time automatic monitoring [14] and the use of advanced techniques such as artificial intelligence.
The primary goal of this study is to create classification models that use tri-axial accelerometer sensor data to classify cattle behaviors accurately [15]. The behaviors include Moving, Feeding, Resting, Ruminating, and Salting, which represent the most prominent activities that occupy the animal's time throughout the day [16]. By improving the precision of the classification of these behaviors, the proposed models can contribute to a better understanding of cattle behavior and help in livestock management. Identifying and classifying cattle behavior is very important for farmers and barn managers because it supports decision-making [17]. Machine learning algorithms can classify several behaviors using accelerometer data as well as video scenes. However, monitoring with video scenes and surveillance cameras can be very expensive in terms of data processing, storage, and network bandwidth [18]. In this context, using accelerometers is much more efficient and less expensive [19]. By monitoring cattle behavior, we can detect, among other things, estrus (when too much movement is detected) [20], lameness (short standing times) [21], and signs of disease (little movement) [22]. This work uses accelerometer data to build classifiers that can help improve meat production and livestock management through the automatic identification of cattle behavior. It is based on the Japanese black beef cow behavior classification dataset, which is among the few publicly available datasets. To date, two publications have used this dataset [23], [24]. We tested three classification models: a decision tree, a random forest, and a convolutional neural network (CNN) with our own architecture.
The paper is organized as follows. The next section describes the dataset, the model development process, the model architecture, and the evaluation metrics. The following section presents and discusses the results. Finally, the conclusion summarizes this work and outlines prospects for further research and applications.

METHOD

Dataset
Our approach uses version 2.0 of the Japanese black beef cow behavior classification dataset [25] to classify cow behaviors from tri-axial accelerometer sensors embedded in collars. The data was collected using a commercial accelerometer, the Kionix KX122-1037, with a sensitivity of 16 bits and a range of +/-2g. It was collected on June 12, 2020, from six Japanese black beef cows at a farm owned by Shinshu University in Nagano, Japan, and covers 13 different labeled cow behaviors. The cows were allowed to roam freely in two areas, a grass field and farm pens, and were recorded using Sony FDR-X3000 4K video cameras for one day.
The data was labeled by human observers, including behavior experts and non-experts, who matched the timestamps of the video and accelerometer data. This resulted in 197 minutes of high-quality labeled data at an accelerometer sampling rate of 25 Hz, that is, 25 samples per second. The dataset contains 850,529 labeled samples, with the columns TimeStamp_UNIX and TimeStamp_JST for GPS timestamps in UNIX and JST, respectively, AccX, AccY, and AccZ for acceleration along the X, Y, and Z axes, and a label column. The dataset is divided into six .csv files, one for each cow. We merged the behavior classes into six main categories. Table 1 provides the number of samples per label category for each cow, and Table 2 shows the distribution of data per class. There are 28,023 samples for Feeding (FES), 51,368 for Moving (MOV), 150,894 for Resting (RES), 54,531 for Ruminating (RUS), 10,858 for Salting (SLT), and 554,855 for other behaviors (ETC), for a total of 850,529 samples. A sample of the dataset is shown in Figure 1.
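As a quick check of the class counts quoted above, they can be summed and converted to shares. This is only a sketch built from the numbers in the text; the variable names are illustrative and do not come from the study's code.

```python
# Per-class sample counts as reported in the text.
counts = {
    "FES": 28_023,
    "MOV": 51_368,
    "RES": 150_894,
    "RUS": 54_531,
    "SLT": 10_858,
    "ETC": 554_855,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

print(f"total samples: {total}")
# List classes from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {share:.1%}")
```

The shares make the class imbalance explicit: ETC accounts for roughly 65% of the samples and SLT for about 1.3%, which is the kind of skew that class balancing is meant to address.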

Model development process
Figure 2 shows a schematic of the complete model development process, starting with the input dataset of tri-axial accelerometer data. The pre-processing stage applies filters to eliminate noise caused by sensor malfunction [26] and normalizes the data to remove differences in the magnitude of feature values and facilitate the learning process. In the feature extraction stage, we segment the raw data and split the dataset into a training set (80%), a validation set (10%), and a test set (10%). We then apply three classification models: Random Forest, decision tree, and a deep learning CNN model with our own architecture. Finally, we perform behavior analysis by classifying the six cattle behaviors using the three classifiers.
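The 80/10/10 split can be sketched as follows. The text does not say whether the split is random, stratified, or made per cow, so the shuffled split below is an assumption, and `split_indices` and its `seed` parameter are hypothetical names used only for illustration.

```python
import random


def split_indices(n_samples, train=0.8, val=0.1, seed=42):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test parts.

    Assumption: a plain random split; the test part receives the remainder.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


train_idx, val_idx, test_idx = split_indices(850_529)
print(len(train_idx), len(val_idx), len(test_idx))
```

A random split of individual samples can place near-identical neighbouring readings in both the training and test sets; splitting by cow or by contiguous time block is a common alternative when that is a concern.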

Model architecture
The data pre-processing stage involves normalizing the input features, reshaping the data, converting the labels to categorical variables, and balancing the classes in the training dataset using bootstrap resampling. The data normalization technique used in our model is Z-score normalization: the mean of each variable is subtracted from each of its values, and the result is divided by the standard deviation of the variable. This rescales the values to have a mean of 0 and a standard deviation of 1.
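The two pre-processing steps described above, Z-score normalization and bootstrap resampling, can be sketched with the standard library alone. This is a minimal illustration and not the study's implementation: the toy readings are made up, the function names are hypothetical, and resampling every class up to the size of the largest class is an assumption, since the text does not state the target size.

```python
import random
import statistics
from collections import defaultdict


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


def bootstrap_balance(rows, labels, seed=0):
    """Resample each class with replacement up to the size of the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    balanced_rows, balanced_labels = [], []
    for label, members in by_class.items():
        balanced_rows.extend(rng.choices(members, k=target))
        balanced_labels.extend([label] * target)
    return balanced_rows, balanced_labels


# Toy (AccX, AccY, AccZ) readings; the values are invented for illustration.
raw = [(0.1, -0.9, 0.2), (0.3, -1.1, 0.1), (0.2, -1.0, 0.4), (0.6, -0.8, 0.3)]
labels = ["RES", "RES", "RES", "MOV"]

scaled = zscore_columns(raw)
balanced_rows, balanced_labels = bootstrap_balance(scaled, labels)
print(balanced_labels)
```

In practice the mean and standard deviation would normally be estimated on the training set only and then reused for the validation and test sets, so that no information from held-out data leaks into training.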
Our proposed CNN architecture comprises 8 layers, consisting of 3 convolutional layers, 3 max pooling layers, 1 flatten layer, and 2 dense layers, for a total of 126,598 trainable parameters. Figure 3 displays the details of each layer of the model. The input shape of the model is (3, 1, 1), which corresponds to the time-series dataset with 3 features and a single time step. The three convolutional layers use rectified linear unit (ReLU) activation functions, with max pooling layers in between. The first convolutional layer has 32 filters with a kernel size of (9, 9); the second and third have 96 filters each with a kernel size of (3, 3). The output of the last convolutional layer is flattened into a vector and passed through two fully connected layers, the first with ReLU activation and the second with a Softmax activation that normalizes the class probabilities. The model is compiled with the Adam optimizer, the categorical cross-entropy loss function, and two evaluation metrics, accuracy and the macro-averaged F1-score computed over the 6 possible classes. The CNN is trained for 100 epochs with a batch size of 64, and the Glorot uniform initializer is used to initialize the kernel weights with a random seed. The output classes are mapped to integer values using a dictionary called labels_map.
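The reported total of 126,598 trainable parameters can be checked with simple arithmetic. The sketch below makes two assumptions that the text does not state: the feature map has been reduced to 1 x 1 before the flatten layer (so flattening yields 96 values), and the first dense layer has 128 units. Both were chosen because they reproduce the reported total. Pooling and flatten layers contribute no parameters.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


layer_params = {
    "conv1 (32 filters, 9x9)": conv2d_params(9, 9, 1, 32),
    "conv2 (96 filters, 3x3)": conv2d_params(3, 3, 32, 96),
    "conv3 (96 filters, 3x3)": conv2d_params(3, 3, 96, 96),
    # Assumption: flatten yields 96 values and the hidden layer has 128 units.
    "dense1 (assumed 128 units)": dense_params(96, 128),
    "dense2 (6 classes)": dense_params(128, 6),
}

for name, n in layer_params.items():
    print(f"{name}: {n}")

total_params = sum(layer_params.values())
print(f"total: {total_params}")
```

Since a (9, 9) kernel is larger than the (3, 1) spatial extent of the input, the convolutions presumably use zero padding that preserves the input size; this is also an inference rather than something the text confirms.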

Evaluation metrics
To assess how effectively our classification models perform, we used several evaluation metrics. These metrics provide quantitative measures that allow us to compare the performance of the different models and determine which one is the most effective for our use case.

Precision
Precision is an indispensable evaluation metric. It measures the ability of a model to correctly identify positive instances while minimizing false positives. It is the ratio of true positives to the sum of true positives and false positives.

Accuracy
Accuracy is a fundamental evaluation metric. It measures the overall correctness of a model's predictions. It is the ratio of the number of correct predictions to the total number of predictions.

Recall
Recall is a critical evaluation metric. It measures the ability of a model to identify all relevant instances of a class. It is the ratio of true positives to the sum of true positives and false negatives.

F1-score
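The F1-score serves as a helpful metric to evaluate model performance. It combines precision and recall into a single value as their harmonic mean:

$$F_1 = \frac{2 \cdot P \cdot R}{P + R}$$

where P is the precision and R is the recall of the classification model.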
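Precision, recall, and the F1-score can all be computed from the raw counts of true positives, false positives, and false negatives for a class. The sketch below uses hypothetical counts, and the function name is ours; returning 0 when a denominator is empty is a common convention rather than something the study specifies.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical counts for a single class.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Because the F1-score is a harmonic mean, it stays low whenever either precision or recall is low, which makes it more informative than accuracy for the minority classes.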

Support
Support is the number of instances of a class in the dataset. It is used as the weight when calculating the weighted average of the other metrics.

Micro avg
Micro avg is a way of aggregating the metrics across all classes by treating all instances equally. The true positives, false positives, and false negatives are first summed across all classes, and the metric is then computed from these totals; micro-averaged precision, for example, is the ratio of the summed true positives to the sum of the summed true positives and false positives. Micro avg gives equal weight to each instance and is useful when the dataset is imbalanced.

Weighted avg
Weighted avg is a way of aggregating the metrics across all classes by taking into account the support of each class. It is the weighted average of the metrics for each class, where the weight is the support of the class. Weighted avg gives more weight to the classes with more instances, so the result reflects the class distribution of the dataset.
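The difference between the averaging schemes is easiest to see on a small example. The sketch below compares the macro average (the plain mean over classes, used for the F1-score metric during training), the weighted average, and the micro average for precision. The counts are hypothetical and the helper names are ours.

```python
def macro_average(scores):
    """Plain mean over classes; every class counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean over classes, weighted by the support of each class."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)


def micro_precision(true_positives, false_positives):
    """Precision computed from counts pooled over all classes."""
    tp, fp = sum(true_positives), sum(false_positives)
    return tp / (tp + fp)


# Hypothetical counts for one large class and one small class.
true_positives = [90, 5]
false_positives = [10, 15]
supports = [100, 10]

per_class_precision = [
    tp / (tp + fp) for tp, fp in zip(true_positives, false_positives)
]
macro = macro_average(per_class_precision)
weighted = weighted_average(per_class_precision, supports)
micro = micro_precision(true_positives, false_positives)
print(f"macro={macro:.4f} weighted={weighted:.4f} micro={micro:.4f}")
```

In this example the macro average is pulled down by the poorly classified small class, while the weighted and micro averages stay closer to the score of the large class.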

Training results with the CNN model
The CNN architecture is implemented in Python using several libraries, including pandas, numpy, tensorflow, and scikit-learn. The model was trained for 100 epochs and attained an accuracy of 99.65% with a loss of 0.98%. Figure 4 shows the evolution of the F1-score and the loss during both training and validation. The confusion matrix serves as a concise summary of the classifier's performance. The rows correspond to the actual class instances, and the columns correspond to the predicted class instances. Figure 5 presents the confusion matrix of the CNN model.
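A confusion matrix with rows for actual classes and columns for predicted classes can be built from two label sequences as sketched below. The labels in the example are invented for illustration and are not results from the study; the function name and the class order are assumptions.

```python
from collections import Counter

CLASSES = ["FES", "MOV", "RES", "RUS", "SLT", "ETC"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in classes] for a in classes]


# Invented labels for six samples.
actual = ["RES", "RES", "MOV", "ETC", "ETC", "FES"]
predicted = ["RES", "ETC", "MOV", "ETC", "ETC", "FES"]

matrix = confusion_matrix(actual, predicted)
# Correct predictions lie on the diagonal.
accuracy = sum(matrix[i][i] for i in range(len(CLASSES))) / len(actual)

for label, row in zip(CLASSES, matrix):
    print(label, row)
print(f"accuracy={accuracy:.4f}")
```

Off-diagonal entries show which behaviors are confused with each other; here one Resting sample is counted in the ETC column of the RES row.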
Table 3 summarizes the metric values obtained during the testing phase of our CNN model, which makes it easy to compare the model's performance across metrics and judge its effectiveness. The table reports the classification metrics of the CNN model on 85,053 samples and 6 possible output classes. The precision, recall, and F1-score are computed for each class, along with the support, which is the number of samples in each class. The micro-average and weighted-average metrics are also provided. The results show that the model achieved an overall accuracy of 0.9965. The F1-scores for most of the classes are also high, ranging from 0.9868 to 0.9989. Precision and recall are generally high across all classes, with some classes achieving near-perfect scores. Overall, the results suggest that the CNN model performs well on the classification task.

Training results with the random forest-based model
A Random Forest-based model was also tested on the dataset and achieved an overall accuracy of 72.45% (0.7245), lower than that of the CNN model. The F1-scores for most of the classes are also lower, ranging from 0.0514 to 0.8170. The class with the lowest F1-score is MOV, with a score of 0.0514. Precision and recall are generally lower across all classes, with some classes achieving relatively low scores.
Overall, the results suggest that the Random Forest-based model is less effective on the classification task than the CNN model. This is likely because Random Forests are less suited than CNNs to modeling sequential data such as time series. The inferior performance suggests that the Random Forest-based model does not capture the complex patterns and dependencies present in the sequential data.

Training results with the decision tree based model
We also tested a decision tree-based model on the dataset and achieved an accuracy of 63.39%. Table 5 summarizes the metrics used to evaluate the performance of the decision tree-based model in classifying cattle behavior, which makes it easier to compare the model across metrics and identify areas for improvement. According to these findings, the decision tree model performed poorly in comparison to both the CNN and Random Forest models, with an accuracy of 0.6339 and lower F1-scores ranging from 0.1559 to 0.7443, the MOV class having the lowest score. The precision and recall metrics were also lower across all classes. These results suggest that decision trees are not as effective as CNNs and Random Forests for modeling sequential data such as time series. However, the model's performance may vary with the dataset, and additional testing may be necessary to determine its generalizability.

Discussion of the results
The decision tree-based model achieved an accuracy rate of 63.39%, which is lower than the accuracy rates of both the random forest and CNN models.This indicates that the decision tree model struggled to capture the underlying patterns in the cattle behavior dataset, and was not able to make accurate predictions.This also highlights the limitations of the decision tree model and underscores the need for alternative approaches in handling this type of data.
On the other hand, the random forest model achieved an accuracy rate of 72.45%, which is higher than the decision tree model but still significantly lower than the accuracy rate achieved by the CNN model.The random forest model is a more complex and advanced version of the decision tree model, which uses multiple decision trees to make predictions.This allows it to capture more complex relationships and interactions between the input variables, resulting in improved accuracy compared to the decision tree model.However, the CNN model achieved the best performance, achieving an accuracy rate of 99.65%.This suggests that the CNN model was able to learn highly discriminative features from the cattle behavior dataset, which allowed it to make highly accurate predictions.The CNN model is the most suitable option for this particular task, as it was able to provide the highest accuracy rate and best overall performance compared to the other models.

CONCLUSION
In conclusion, this article highlights the importance of precision livestock management in the beef cattle industry, which is crucial for meeting the increasing demand for food production. To that end, it proposes a methodology that uses accelerometer sensors embedded in collars to automatically classify cattle behaviors, which can help farmers and barn managers in decision-making. The study used the Japanese black beef cow behavior classification dataset to classify cow behaviors using deep learning techniques, achieving promising results. Automated techniques such as precision livestock can help in monitoring and managing the livestock industry, leading to increased productivity, efficiency, and quality of production. Future studies can build on the proposed methodology to advance the development and application of precision livestock management in the industry.

BIOGRAPHIES OF AUTHORS

Khalid El moutaouakil
In 2017, he earned his Master's degree in computer engineering and systems from the Polydisciplinary Faculty of Sultan Moulay Slimane University in Beni Mellal, Morocco.Currently, he is pursuing his Ph.D. studies in the same faculty and works as a computer science teacher in a high school in Marrakech, Morocco.His research interests lie in Digital Agriculture, Deep Learning, and Information Systems.To reach him, you can contact him via email at elmoutaouakil.kh@gmail.com.

Noureddine Falih
In 2013, he obtained a Doctor of Computer Science degree from the Faculty of Sciences and Technologies of Mohammedia, Morocco.Since 2014, he has been working as an associate professor at the Polydisciplinary Faculty of Sultan Moulay Slimane University in Beni Mellal, Morocco.With 18 years of professional experience in several renowned companies, his research interests revolve around Information System Governance, Business Intelligence, Big Data Analytics, and Digital Agriculture.For further communication, he can be reached via email at nourfald@yahoo.fr.

Figure 1. Samples of the dataset

Figure 2. Model development process


ISSN: 2252-8938 Int J Artif Intell, Vol. 13, No. 1, March 2024: 524-532
The F1-score indicates test accuracy: it measures the accuracy of the models by taking into account both the precision and the recall of the test when classifying examples as positive or negative. Figure 4 displays line plots illustrating a steady rise in the F1-score throughout the training and validation of the model for each successive epoch. These plots also depict the loss observed during both training and validation phases.

Figure 4. The evolution of F1-score and loss during both training and validation

Table 2. Distribution of data by class

Table 3. Classification metrics of the CNN model

Table 4 gives an overview of the metrics used to evaluate the performance of the Random Forest model. The model's performance is analyzed through the values of these metrics.

Table 4. Classification metrics of the Random Forest-based model

Table 5. Classification metrics of the decision tree-based model