Data Validation with Rule Sets

June 11, 2019

Key Takeaways

Poor data quality leads to duplication, errors, and unreliable analytics—making data validation essential for business accuracy.
Decisions’ rules engine automates data cleaning through validation upon entry, nightly clean-and-replace jobs, or pre-reporting cleanup.
Clean, standardized data improves performance across CRM, ERP, and reporting systems, ensuring trustworthy insights and better decision-making.

“Garbage in, garbage out” was first uttered by an IBM employee back in 1965, and this computer science maxim is just as valid today as it was nearly 55 years ago. In fact, we are swimming in more data than ever, generated by new software applications and IoT devices that add to the noise every second. With all of this data comes the need to ensure that it is cleaned, summarized, or prepared for further analysis or action.

Many software applications lack data validation capabilities, or those capabilities weren’t configured when the systems first went live. As a result, records are often duplicated, with multiple entries for the same customer, vendor, product, complaint, ticket, and so on. Most of us have experienced this, and duplicates likely live in nearly all our systems, creating havoc when we try to roll data up for analysis or associate different data elements.

It doesn’t have to stay this way. Rule sets within Decisions can be used to clean operational and reporting data. There are typically three methods for cleaning and maintaining data in both operational and reporting systems. In each method below, the data is run through a set of validation rules, with each rule operating on a different field or attribute within a single data set.
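To make the idea of a rule set concrete, here is a minimal Python sketch in which each rule checks a different field of the same record. It is not Decisions’ rules engine or its API; the field names (`email`, `postal_code`, `company`) and the rules themselves are assumptions chosen purely for illustration.

```python
import re
from typing import Callable, Dict, List

# Hypothetical record layout and rules, for illustration only.
Record = Dict[str, str]
Rule = Callable[[Record], List[str]]  # each rule returns a list of problems


def email_rule(record: Record) -> List[str]:
    # Loose shape check only; not a full email address validator.
    email = record.get("email", "")
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return []
    return ["email: invalid format"]


def postal_code_rule(record: Record) -> List[str]:
    # Assumes five-digit postal codes.
    if re.fullmatch(r"\d{5}", record.get("postal_code", "")):
        return []
    return ["postal_code: expected 5 digits"]


def company_rule(record: Record) -> List[str]:
    if record.get("company", "").strip():
        return []
    return ["company: must not be empty"]


RULE_SET: List[Rule] = [email_rule, postal_code_rule, company_rule]


def validate(record: Record) -> List[str]:
    """Run every rule in the set and collect all problems found."""
    problems: List[str] = []
    for rule in RULE_SET:
        problems.extend(rule(record))
    return problems
```

Because each rule is independent, new checks can be added to the list without touching the existing ones.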

Method 1 – Validate Upon Entry: This is always the ideal case, although some software applications cannot call external services and therefore cannot use this method. If the application does support it, data can be passed to the rules engine for processing before it is saved back to the application database.
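As a rough sketch of validating upon entry, the Python below checks a record before it is written and saves it only if no rule fails. The in-memory `DATABASE` dictionary, the `check_record` function (standing in for a call to a rules engine), and the field names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the application database.
DATABASE: Dict[str, Dict[str, str]] = {}


def check_record(record: Dict[str, str]) -> List[str]:
    """Stand-in for a call to a rules engine; returns a list of problems."""
    problems = []
    if not record.get("customer_id", "").strip():
        problems.append("customer_id is required")
    if "@" not in record.get("email", ""):
        problems.append("email must contain '@'")
    return problems


def save_record(record: Dict[str, str]) -> List[str]:
    """Validate first; write to the database only if no rule fails."""
    problems = check_record(record)
    if not problems:
        DATABASE[record["customer_id"]] = record
    return problems
```

Returning the list of problems lets the calling application show the user what to fix instead of silently rejecting the entry.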

Method 2 – Clean & Replace: Using this method, nightly jobs grab newly entered data from your operational system (CRM, ERP, and so on) and process it through a rule set within Decisions. These rules can compare the data against previously generated rules that validate addresses, company names, and any other commonly duplicated or misspelled data items. The data can then be standardized to a common spelling. Once cleaned, the data can be replaced, deleted, or repaired in the operational system.
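The standardize-and-deduplicate step of such a nightly job might look something like the Python sketch below. The `CANONICAL_NAMES` lookup and the row layout are assumptions, and matching on company name alone is deliberately simplistic; a real job would likely compare several fields.

```python
import re
from typing import Dict, List

# Hypothetical lookup of known variants to a canonical spelling.
CANONICAL_NAMES = {
    "acme": "Acme Corporation",
    "acme corp": "Acme Corporation",
    "acme corporation": "Acme Corporation",
}


def normalize_key(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace for matching."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def clean_batch(new_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Standardize company names and drop rows that duplicate an earlier one."""
    seen = set()
    cleaned = []
    for row in new_rows:
        key = normalize_key(row["company"])
        company = CANONICAL_NAMES.get(key, row["company"].strip())
        if company in seen:
            continue  # duplicate of a row already kept in this batch
        seen.add(company)
        cleaned.append({**row, "company": company})
    return cleaned
```

The cleaned batch would then be written back to the operational system, replacing or repairing the original rows.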

Method 3 – Clean for Reporting: In this method, operational data is run through the rule set and cleaned before it is added to a data warehouse or data lake. The data is left as-is in the operational system but is cleaned for entry into the data warehouse. If the data warehouse is where business people look to make operational decisions, this can be perfectly acceptable.
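The key property of this method is that the source rows are never modified. A minimal Python sketch, assuming a hypothetical `state` field and a small made-up mapping of state spellings, could be:

```python
from typing import Dict, List


def standardize_state(row: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical rule: map a few state spellings to two-letter codes."""
    codes = {"virginia": "VA", "va": "VA", "texas": "TX", "tx": "TX"}
    state = row.get("state", "").strip()
    cleaned = dict(row)  # copy so the operational row is not modified
    cleaned["state"] = codes.get(state.lower(), state)
    return cleaned


def load_warehouse(operational_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return cleaned copies for the warehouse; the source rows stay untouched."""
    return [standardize_state(row) for row in operational_rows]
```

Values the rule does not recognize are passed through with surrounding whitespace trimmed, so nothing is lost on the way to the warehouse.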

In each method above, the rules themselves are the same. The only difference is when and where the data validation takes place in your process. If you have a particularly tricky data cleaning project, we would love to hear about it. Please feel free to reach out to sales@decisions.com.

Gordon Jones has founded and sold three companies with the last built using Decisions technology. He has also led factories and large IT implementations both in the US and in Asia, where he lived for over seven years.
