This guide explains why data health matters, how we detect data quality issues, how we can improve your data health in the peopleIX platform, and what you need to do to get the most out of your data.
The Data Health Score is a metric that indicates how clean your data is. Its purpose is to give you a quick overview of the reliability and accuracy of your data in a single number. The score ranges from 0 to 100, with a higher score indicating better data quality.
A higher score means that the conclusions drawn from your data are more reliable and accurate. You should rely on your data only if it is clean and of high quality. When analyzing the data, always take the Data Health Score into account when interpreting the results. Achieving a high Data Health Score before you start your analysis is therefore crucial.
When you import your data into our platform, our algorithm runs automatically to detect data quality issues. These include missing values and duplicates, both described below.
The data health algorithm does not simply flag every empty field, because in HR data some fields are intentionally left blank. Instead, it follows logical rules to distinguish intentionally empty fields from genuinely missing values. For example, if a person has been hired but no offer date has been entered, the algorithm recognizes that a value is logically missing in that field.
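To make the idea of a logical rule more concrete, here is a minimal sketch in Python. It is not the actual peopleIX algorithm. The field names (`hire_date`, `offer_date`, `termination_date`), the `RULES` table and the function `find_missing_values` are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rule table: (field that should be filled, field whose presence implies it).
# Illustrative only; the platform's real rules are not documented here.
RULES: List[Tuple[str, str]] = [
    ("offer_date", "hire_date"),  # a hired person should have an offer date
]


def find_missing_values(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the fields that are empty although a rule says they should be filled."""
    missing = []
    for required_field, trigger_field in RULES:
        # Flag the field only if the trigger is set and the required field is empty.
        if record.get(trigger_field) and not record.get(required_field):
            missing.append(required_field)
    return missing


# A hired employee without an offer date: the offer date is genuinely missing.
hired = {"hire_date": "2023-04-01", "offer_date": None, "termination_date": None}
# A person who has not been hired: the empty fields are intentionally blank.
not_hired = {"hire_date": None, "offer_date": None, "termination_date": None}

print(find_missing_values(hired))      # ['offer_date']
print(find_missing_values(not_hired))  # []
```

Note that the empty `termination_date` is never flagged in this sketch, because no rule requires it. This mirrors the distinction between intentionally empty fields and genuinely missing values.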
Duplicates are cases where the same value appears under different or inconsistent labels in the data. For example, if one employee's nationality is listed as "German" and another's as "Deutsch," these values should be merged and standardized into a single category.
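Merging duplicate labels can be pictured as a simple lookup, sketched below in Python. This is only an illustration and not how the platform standardizes values internally. The mapping `CANONICAL_LABELS` and the function `standardize_label` are hypothetical names introduced for this example.

```python
# Hypothetical mapping from known label variants (lowercased) to one canonical label.
CANONICAL_LABELS = {
    "german": "German",
    "deutsch": "German",
}


def standardize_label(value: str) -> str:
    """Map a label variant to its canonical form; unknown labels are only trimmed."""
    cleaned = value.strip()
    return CANONICAL_LABELS.get(cleaned.lower(), cleaned)


nationalities = ["German", "Deutsch", " german ", "French"]
standardized = [standardize_label(v) for v in nationalities]

print(standardized)               # ['German', 'German', 'German', 'French']
print(sorted(set(standardized)))  # ['French', 'German']
```

After standardization, the three variants collapse into the single category "German", so a breakdown by nationality counts those employees together.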