Automated extraction of domain knowledge for transition-cow management

Authors: J. Zhu, R. Lacroix, and K. Wade

Date: 2023-06-01

Status: Published

In dairy cattle, the transition period (±3 weeks from calving) is a challenging time for management. Major changes in a cow’s physiology, housing, and feeding often result in metabolic or reproductive diseases, leading to a drop in production. Because most metabolic processes are intricately linked, dairy producers and their advisors may have difficulty drawing concise conclusions about transition-cow management. To help with this, machine-learning techniques and knowledge-graph theory were explored with a view to creating a decision-support system that could provide producers and their advisors with knowledge from the domain literature. Specifically, knowledge was modeled as entities and relationships, following knowledge-graph theory, and natural-language models were developed to extract this information as knowledge graphs. A data set comprising 1,152 sentences from 20 papers was created and split into 922 sentences for training and 230 sentences for testing. Two deep-learning models were trained to extract entities and relationships, respectively. In testing, a Bi-LSTM model applied to the entity-extraction task obtained an F1 score of 80%. For relationship extraction, a transformer-based model was deployed but yielded a low F1 score of 23%. Therefore, a pre-trained transformer model with 80.8% accuracy was deployed instead. After the domain literature was fed into the deep-learning models, a knowledge graph of 1,576 nodes and 3,456 edges was constructed and stored in a Neo4j graph database. Subsequently, a semantic parsing method was used to allow users to query the knowledge graph in natural language. To determine the quality of the responses, a sample of answers was scored by human evaluators. On average, the answers scored 7.5 out of 10 and proved informative with respect to the original literature.
Although the final interactive results demonstrated a high degree of visualization and scalability, this study primarily sought to demonstrate the feasibility of the approach. For tailored industrial applications, further improvements could be made to specific knowledge-graph expansion and reasoning.
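
To make the entity-and-relationship modeling described above concrete, the following is a minimal sketch of how extracted (head, relation, tail) triples might be assembled into a graph and queried. It is an assumption-laden illustration, not the study's implementation: the triples, the relation names, and the helper functions `build_graph` and `query` are hypothetical, and a plain in-memory index stands in for the Neo4j database and the semantic parsing step.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples for illustration only;
# they are not taken from the study's extracted knowledge graph.
TRIPLES = [
    ("negative energy balance", "increases_risk_of", "ketosis"),
    ("ketosis", "reduces", "milk yield"),
    ("hypocalcemia", "increases_risk_of", "displaced abomasum"),
    ("ketosis", "increases_risk_of", "displaced abomasum"),
]


def build_graph(triples):
    """Return the set of nodes and an index of outgoing edges per head entity."""
    nodes = set()
    out_edges = defaultdict(list)
    for head, relation, tail in triples:
        nodes.update((head, tail))
        out_edges[head].append((relation, tail))
    return nodes, out_edges


def query(out_edges, head, relation):
    """Return every tail entity linked to `head` by `relation`."""
    return [tail for rel, tail in out_edges.get(head, []) if rel == relation]


nodes, out_edges = build_graph(TRIPLES)
print(len(nodes), "nodes,", len(TRIPLES), "edges")
print(query(out_edges, "ketosis", "increases_risk_of"))
```

In this sketch each entity becomes a node and each triple becomes a directed, labeled edge, which is the same shape of data a graph database stores. A natural-language question would first have to be mapped to a (head, relation) pair before a lookup like `query` could answer it.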