How Can You Improve Your Dairy’s Data?

Written by Katelyn Goldsmith

Introduction
What is Dirty Data?
Pillars of Quality Data
Accuracy
Completeness
Consistency
Uniqueness
Tips and Tricks to Increase Data Quality
Summary
Hand pointing at illuminated checkmark icon above laptop. Text overlays include the UW-Madison "W" logo, "DAIRY" label, the heading "How Can You Improve Your Dairy's Data?" and presenter name "Katelyn Goldsmith" in a gray banner.

Introduction

Data is increasingly collected and used on dairy farms to monitor performance and support decision-making. But for data to provide truly valuable insights, it must be high quality. As the saying goes, “Garbage In, Garbage Out,” or maybe more fittingly, “Quality In, Quality Out”!

This article explains the difference between dirty and clean data, pillars of quality data, and practical tips to prevent entering dirty data.

What is Dirty Data?

Dirty data doesn’t refer to record sheets covered in manure or cow saliva. Instead, it refers to data that contains mistakes or quality issues and requires “cleaning”. Cleaning data is the process of reviewing and correcting data problems such as incorrect entries, missing values, inconsistent units, or duplicated records (1). Errors like these can lead to inaccurate insights or even make analysis impossible (2). While cleaning can help address dirty data, it is time-consuming and cannot fix every issue. A record that was never written down, for example, cannot be recovered later.

Think of your data like your house. You can live in a very messy house, but it’s uncomfortable and harder to function in. Cleaning helps, but some damage from the mess (like stained floors) requires extra effort or may be irreversible. Instead, you can avoid headaches and issues by preventing the mess in the first place.

This same idea applies to your data. You can work with dirty data, but it is more difficult, less reliable, and sometimes unusable. All data will contain a little dirt (it’s unavoidable), but minimizing dirty data from the start makes everything easier.

Pillars of Quality Data

Farmers can minimize the creation of dirty data by following these pillars for data quality: Accuracy, Completeness, Consistency, and Uniqueness.

Infographic showing four pillars of high quality data: Accurate, Complete, Consistent, and Unique.
Highlights of the pillars of high-quality data for dairy farms.

Accuracy

Accurate data reflects true values: measurements are correct, and entries are free of typos. Accuracy is influenced by all the other pillars and is essential for reliable insights.

Real Dairy Farm Scenarios That Cause Inaccurate Data:

  • A calf handler forgets to zero the scale when weighing a group of weaned calves. As a result, every weight for that group is recorded 13 lb heavier than the actual weight.
  • The calf manager enters the data but accidentally skips a few lines, mismatching weights to calves. This results in an inaccurate report of average daily gains. 

How to Avoid Inaccurate Data:

  • Train staff thoroughly and consistently. 
  • Ensure that all sensors and systems are properly calibrated and electronically recording accurate data. 
  • Use the same people for data collection when possible. 
  • Spot-check entries during data input. 
  • Use software tools to flag outliers and catch mistakes early. 
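
As one illustration of the last tip, here is a minimal Python sketch that flags weights far from the group median. The calf IDs, the weights, and the 30% tolerance are made up for this example and are not recommendations; a farm would pick its own limits. A check like this can catch individual typos, but it will not catch an error that shifts a whole group equally, such as a scale that was not zeroed.

```python
from statistics import median


def flag_outliers(weights_by_calf, tolerance=0.30):
    """Return the IDs whose weight differs from the group median
    by more than `tolerance` (a fraction of the median)."""
    group_median = median(weights_by_calf.values())
    flagged = []
    for calf_id, weight in weights_by_calf.items():
        if abs(weight - group_median) > tolerance * group_median:
            flagged.append(calf_id)
    return flagged


# Hypothetical weights (lb) for one group of weaned calves.
# "C103" has an extra digit and "C105" is missing one.
weights = {"C101": 182, "C102": 175, "C103": 1790, "C104": 188, "C105": 17}

print(flag_outliers(weights))
```

Flagged entries are candidates to recheck against the original record sheet, not automatic deletions.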

Completeness

Complete data includes values for all required fields. Missing values reduce the usefulness and reliability of a dataset.

Real Dairy Farm Scenarios That Cause Incomplete Data:

  • During a busy night, an employee forgets to record which calf ID corresponds to which dam for four births. Because of the incomplete data, the farm is unable to reliably match those calves with their dams.
  • Some calving record sheets are thrown away by an employee cleaning the office before they have been entered. This leaves gaps in the farm’s calving data.

How to Avoid Incomplete Data:

  • Emphasize the importance of complete records. 
  • Use organization systems such as folders or trays to safely store unentered data. 
  • Keep original hardcopy sheets or verified scans after entry for reference. 
  • Use spreadsheet filters or herd software tools to check for missing data points. 
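
For farms that export records to a file, a missing-value check might look something like the sketch below. The field names (calf_id, dam_id, birth_date) and the sample records are hypothetical; real herd software will use its own column names.

```python
# Hypothetical list of fields that every calving record should contain.
REQUIRED_FIELDS = ("calf_id", "dam_id", "birth_date")


def find_incomplete(records, required=REQUIRED_FIELDS):
    """Return (row_number, missing_fields) for each record that has
    a blank or absent required field."""
    problems = []
    for row_number, record in enumerate(records, start=1):
        missing = [
            field for field in required
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            problems.append((row_number, missing))
    return problems


# Made-up calving records; rows 2 and 3 are incomplete.
calving_records = [
    {"calf_id": "C201", "dam_id": "D3310", "birth_date": "2025-03-02"},
    {"calf_id": "C202", "dam_id": "", "birth_date": "2025-03-02"},
    {"calf_id": "C203", "birth_date": None},
]

print(find_incomplete(calving_records))
```

Running a check like this soon after entry, while the original sheets and the people involved are still available, gives the best chance of filling in the gaps.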

Consistency

Consistent data uses standardized formats, units, terminology, and definitions. Inconsistent data results in limited and potentially unreliable analysis.

Real Dairy Farm Scenarios That Cause Inconsistent Data:

  • When checking fresh cows, some employees on a farm consider a cow to have ketosis if her urine test result is 15 mg/dL while others on the farm use a threshold of 80 mg/dL. This inconsistency in diagnosis results in misleading and inconsistent health reports. 
  • When entering the fresh cow health information into the herd management software, the manager uses various terms for ketosis including “ketosis”, “ket.”, “k”, and “keto”. This inconsistency in terminology makes generating and analyzing health reports difficult.

How to Avoid Inconsistent Data:

  • Develop clear, consistent protocols for collecting and entering data, and train staff on them.
  • Regularly review protocols and data standards to minimize protocol drift. 
  • Develop and use consistent terms, formats, and naming conventions across all data systems. 
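
When inconsistent terms are already in the records, one possible cleanup is to map each known variant to a single standard term and set aside anything unrecognized for a person to review. The sketch below assumes the four ketosis variants from the scenario above; the mapping and the sample entries are illustrative only.

```python
# Hypothetical mapping from variants seen in the records to one standard term.
TERM_MAP = {
    "ketosis": "ketosis",
    "ket.": "ketosis",
    "k": "ketosis",
    "keto": "ketosis",
}


def standardize_terms(entries, term_map=TERM_MAP):
    """Return (cleaned, unrecognized). Known variants are replaced by the
    standard term; unknown entries are left unchanged and also listed
    separately so someone can review them."""
    cleaned, unrecognized = [], []
    for raw in entries:
        key = raw.strip().lower()
        if key in term_map:
            cleaned.append(term_map[key])
        else:
            cleaned.append(raw)
            unrecognized.append(raw)
    return cleaned, unrecognized


# Made-up health event entries as they might have been typed.
entries = ["Ketosis", "ket.", " K ", "keto", "ketotic?"]
cleaned, unrecognized = standardize_terms(entries)

print(cleaned)
print(unrecognized)
```

Leaving unrecognized entries unchanged, rather than guessing, keeps the cleanup from introducing new errors.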

Uniqueness

Unique data avoids duplication. Redundant records can skew analysis and cause confusion.

Real Dairy Farm Scenario That Caused Duplicated Data:

  • A farm weighs groups of heifers as they return from a custom-raiser. The heifers in one returning group were smaller than normal, and the farm manager accidentally entered the data for this group twice. Later, when the farm evaluates heifer weights, the duplicated entries pull the overall average down, making it look like heifers are returning from the custom-raiser lighter on average than they actually are.

How to Avoid Duplicated Data:

  • Train those entering data to follow consistent input practices.
  • Use spreadsheet tools or herd management software to identify duplicate records.
  • Establish protocols to double-check data entry (e.g., have a second person verify all data entries).
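
To show how a duplicate check could work and why it matters, the sketch below treats two rows with the same heifer ID and weigh date as duplicates. That rule, the IDs, and the weights are assumptions made for this example; a farm that weighs the same animal more than once per day would need a different rule.

```python
from statistics import mean


def split_duplicates(records):
    """Split records into (unique, duplicates), treating rows with the
    same heifer ID and weigh date as duplicates of the first one seen."""
    seen = set()
    unique, duplicates = [], []
    for heifer_id, weigh_date, weight in records:
        key = (heifer_id, weigh_date)
        if key in seen:
            duplicates.append((heifer_id, weigh_date, weight))
        else:
            seen.add(key)
            unique.append((heifer_id, weigh_date, weight))
    return unique, duplicates


# Made-up records: (heifer ID, weigh date, weight in lb).
# The lighter group weighed on 2025-05-08 was entered twice.
records = [
    ("H1", "2025-05-01", 900), ("H2", "2025-05-01", 920),
    ("H3", "2025-05-08", 800), ("H4", "2025-05-08", 780),
    ("H3", "2025-05-08", 800), ("H4", "2025-05-08", 780),
]

unique, duplicates = split_duplicates(records)
average_with_duplicates = mean(weight for _, _, weight in records)
average_without_duplicates = mean(weight for _, _, weight in unique)

print(duplicates)
print(average_with_duplicates, average_without_duplicates)
```

In this made-up example, the average is 830 lb with the duplicated rows and 850 lb without them, which mirrors the scenario above: duplicating a lighter group pulls the overall average down.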

Tips and Tricks to Increase Data Quality

High-quality data does not happen magically. Instead, it takes focused effort and time. Focusing on the pillars of quality data, farms can take steps like these to improve their data.

  • Provide consistent training and review protocols regularly with anyone collecting or entering data. 
  • Design and organize recording sheets or systems so that they are easy to understand and fill out. 
  • Reduce variability by having the same people regularly entering data. 
  • Create and use standardized terms or definitions for health events and labels. Consider using drop-down menus in spreadsheets to only allow selection of pre-defined terms. 
  • Use standardized formats (e.g., for dates and animal IDs) across all systems.
  • Avoid formatting issues in spreadsheets such as merged cells, empty rows, or cell colors used as a form of data.
  • Use consistent file naming and folder organization to make data easy to find. 
  • Check data for errors during entry by using methods like double entry or read-aloud. Double entry, where data is entered twice and the two versions are compared, is the most accurate. The read-aloud method, where one person reads the original hardcopy sheet aloud while the entered data is checked against it, is faster but slightly less accurate (3).
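
The comparison step of double entry can be automated. The sketch below is one simple way to do it, assuming each pass is stored as a lookup from calf ID to weight; the IDs and values are made up for illustration.

```python
def compare_entries(first_pass, second_pass):
    """Return (calf_id, first_value, second_value) for every ID whose two
    entries disagree, including IDs that appear in only one pass."""
    mismatches = []
    for calf_id in sorted(set(first_pass) | set(second_pass)):
        first_value = first_pass.get(calf_id)
        second_value = second_pass.get(calf_id)
        if first_value != second_value:
            mismatches.append((calf_id, first_value, second_value))
    return mismatches


# Made-up weights (lb) typed in twice from the same record sheet.
first = {"C101": 182, "C102": 175, "C103": 179}
second = {"C101": 182, "C102": 157, "C104": 188}

for mismatch in compare_entries(first, second):
    print(mismatch)
```

Each mismatch points to a line on the original sheet to recheck. The comparison cannot tell which of the two entries is correct.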

Summary

Low-quality, “dirty” data has reduced value for performance tracking and decision-making on dairy farms. By focusing on the pillars of data quality (accuracy, completeness, consistency, and uniqueness), farms can improve the reliability and usefulness of their data. High-quality data helps farms make more informed decisions to support their long-term success.

Author

 

Katelyn Goldsmith

Dairy Outreach Specialist: In her role as a statewide Dairy Outreach Specialist, Katelyn connects research with practical farm management practices to create educational programming addressing the needs of Wisconsin dairy producers.

Reviewers

Victor Cabrera
Dairy Systems Management Extension Specialist, Professor, Honorary Associate/Fellow
University of Wisconsin-Madison, Division of Extension

John Shutske
Agricultural Safety and Health Extension Specialist, Professor, Honorary Associate/Fellow
University of Wisconsin-Madison, Division of Extension

References

  1. Guo, M., Wang, Y., Yang, Q., Li, R., Zhao, Y., Li, C., Zhu, M., Cui, Y., Jiang, X., Sheng, S., Li, Q., & Gao, R. (2023). Normal workflow and key strategies for data cleaning toward real-world data: Viewpoint. Interactive Journal of Medical Research, 12:e44310. https://doi.org/10.2196/44310
  2. Barchard, K., & Pace, L. (2011). Preventing human error: The impact of data entry methods on data accuracy and statistical results. Computers in Human Behavior, 27:1834-1839. https://doi.org/10.1016/j.chb.2011.04.004 
  3. Barchard, K.A., Freeman, A.J., Ochoa, E., & Stephens, A.K. (2019). Comparing the accuracy and speed of four data-checking methods. Behavior Research Methods, 52:97-115. https://doi.org/10.3758/s13428-019-01207-3 

 
