Data quality in an organization is the fitness of its recorded information to serve the organization's goals, whether operational, marketing, or compliance. If data quality is poor, the organization lacks a clear picture of its customers, products, and transactions, a situation no organization wants to be in.
Organizations with high data quality gain a competitive edge: higher sales and revenue, greater operational efficiency, lower operational costs, and the ability to stay up to date with regulatory requirements.
Poor data quality in a customer-facing organization is a real-world problem that can arise at many points in the information life cycle. From the data entry stage through the decisioning system and the warehouse to BI/MIS, data is handled for various purposes by real people and is vulnerable to errors as it flows through the system.
It is therefore necessary to adopt proper data quality management practices, supported by a strong data quality management tool or solution.
High-level capabilities expected from a data quality tool:
Data profiling
- Column value frequency analysis and related statistics (number of distinct values, null counts, maximum, minimum, mean, standard deviation)
- Table structure analysis
- Cross-table redundancy analysis
- Data mapping analysis
- Business rule documentation and validation
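The column-level statistics listed above can be illustrated with a small sketch. This is not how any particular tool implements profiling; the function name `profile_column` and the sample `ages` column are hypothetical, and nulls are assumed to be represented as `None`.

```python
import math
from collections import Counter

def profile_column(values):
    """Return basic profiling statistics for one column (None means null)."""
    non_null = [v for v in values if v is not None]
    stats = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        # Value frequency analysis: the three most common values with counts.
        "top_values": Counter(non_null).most_common(3),
    }
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    # Numeric statistics are only reported when every non-null value is numeric.
    if numeric and len(numeric) == len(non_null):
        mean = sum(numeric) / len(numeric)
        # Population variance (divide by n, not n - 1).
        variance = sum((v - mean) ** 2 for v in numeric) / len(numeric)
        stats.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean,
            std_dev=math.sqrt(variance),
        )
    return stats

# Hypothetical sample column with two nulls and one repeated value.
ages = [34, 45, None, 34, 29, None, 52]
age_profile = profile_column(ages)
```

A high null count or an unexpected minimum or maximum in such a profile is usually the first hint that a column needs a validation rule.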
Parsing and standardization
- Flexible definition of patterns and rules for parsing
- Flexible definition of rules for transformation
- Knowledge base of known patterns
- Ability to support multiple data concepts (individual, business, etc.)
- Manageable transformation actions
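As an illustration of pattern-driven parsing followed by a transformation rule, the sketch below standardizes phone numbers. The pattern list `PHONE_PATTERNS`, the function `standardize_phone`, the ten-digit number length, and the `default_cc` parameter are all assumptions made for the example, not part of any specific product.

```python
import re

# Hypothetical knowledge base of known patterns, tried in order.
PHONE_PATTERNS = [
    ("international", re.compile(r"^\+(?P<cc>\d{1,3})(?P<number>\d{10})$")),
    ("local", re.compile(r"^0?(?P<number>\d{10})$")),
]

def standardize_phone(raw, default_cc="91"):
    """Parse a raw phone string and return it as +<country code><number>.

    Returns None when no known pattern matches, so the record can be
    routed to manual review instead of being silently altered.
    """
    # Transformation rule: drop brackets, whitespace, dots and dashes.
    cleaned = re.sub(r"[()\s.-]", "", raw.strip())
    for _name, pattern in PHONE_PATTERNS:
        match = pattern.match(cleaned)
        if match:
            groups = match.groupdict()
            cc = groups.get("cc") or default_cc
            return f"+{cc}{groups['number']}"
    return None

standardized = [
    standardize_phone("+91 98765-43210"),
    standardize_phone("098765 43210"),
    standardize_phone("12345"),  # matches no pattern
]
```

Keeping the patterns in a data structure rather than in code is what makes the rules "flexible": new formats can be added without touching the parsing function.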
Identity resolution
- Entity identification
- Record matching
- Record linkage
- Record merging and consolidation
- Flexible definition of business rules
- Knowledge base of rules and patterns
- Integration with parsing and standardization tools
- Advanced algorithms for deterministic or probabilistic matching
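The difference between deterministic and probabilistic matching can be sketched as follows. The deterministic part compares a normalized key exactly. The second part uses a weighted string-similarity score from Python's `difflib` as a simple stand-in for a true probabilistic model; it is not a full probabilistic record linkage algorithm. The field names, weights, and threshold are hypothetical.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())

def deterministic_match(a, b):
    """Exact match on a normalized key (here the email address)."""
    return normalize(a["email"]) == normalize(b["email"])

def similarity_score(a, b, weights=None):
    """Weighted average of per-field string similarity, between 0 and 1."""
    weights = weights or {"name": 0.6, "city": 0.4}
    score = 0.0
    for field, weight in weights.items():
        ratio = SequenceMatcher(
            None, normalize(a[field]), normalize(b[field])
        ).ratio()
        score += weight * ratio
    return score

def is_match(a, b, threshold=0.85):
    """Treat two records as the same entity if either test succeeds."""
    return deterministic_match(a, b) or similarity_score(a, b) >= threshold

# Hypothetical customer records.
r1 = {"name": "Jon Smith", "email": "jon.smith@example.com", "city": "Boston"}
r2 = {"name": "John Smith", "email": "jsmith@example.org", "city": "Boston"}
r3 = {"name": "Maria Gomez", "email": "maria@example.com", "city": "Austin"}

same_person = is_match(r1, r2)       # emails differ, names and city are close
different_person = is_match(r1, r3)
```

Records flagged as matches would then feed the merging and consolidation step; in practice the threshold is tuned against a manually reviewed sample.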
Auditing and monitoring
- Data validation
- Data controls
- Rule management
- Rule-based monitoring
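Rule-based monitoring can be pictured as a set of named checks, each with a minimum acceptable pass rate, evaluated against every incoming batch. The sketch below is one possible shape for this; the `Rule` class, the two example rules, and the thresholds are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when a record passes
    min_pass_rate: float           # alert when the pass rate falls below this

# Hypothetical rule set for a customer table.
RULES = [
    Rule("customer_id present", lambda r: bool(r.get("customer_id")), 1.0),
    Rule(
        "age between 0 and 120",
        lambda r: r.get("age") is not None and 0 <= r["age"] <= 120,
        0.95,
    ),
]

def monitor(records, rules):
    """Evaluate every rule on a batch and report pass rates and alerts."""
    report = []
    for rule in rules:
        passed = sum(1 for record in records if rule.check(record))
        # An empty batch is treated as fully passing.
        rate = passed / len(records) if records else 1.0
        report.append(
            {"rule": rule.name, "pass_rate": rate, "alert": rate < rule.min_pass_rate}
        )
    return report

batch = [
    {"customer_id": "C1", "age": 34},
    {"customer_id": "", "age": 41},     # fails the first rule
    {"customer_id": "C3", "age": 150},  # fails the second rule
    {"customer_id": "C4", "age": 28},
]
report = monitor(batch, RULES)
```

Storing the report for each batch gives an audit trail, and tracking the pass rate over time shows whether quality is improving or degrading.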
Cleansing and enhancement
- Flexible definition of cleansing rules
- Knowledge base of common patterns (for cleansing)
- Knowledge base of enhancements (e.g., address cleansing, geocoding)
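A cleansing rule set can be as simple as an ordered list of pattern and replacement pairs. The sketch below applies a few hypothetical address rules; the rule list is an assumption, and a real knowledge base would be far larger and would handle ambiguous cases (for example "St" meaning "Saint") that this example ignores. Enhancements such as geocoding normally rely on an external reference data set and are not shown.

```python
import re

# Hypothetical cleansing rules, applied in order: (pattern, replacement).
CLEANSING_RULES = [
    (re.compile(r"\s+"), " "),                             # collapse whitespace
    (re.compile(r"\bst\b\.?", re.IGNORECASE), "Street"),   # St / St. -> Street
    (re.compile(r"\brd\b\.?", re.IGNORECASE), "Road"),     # Rd / Rd. -> Road
    (re.compile(r"\bapt\b\.?", re.IGNORECASE), "Apartment"),
]

def cleanse_address(raw):
    """Apply every cleansing rule in order to a raw address string."""
    text = raw.strip()
    for pattern, replacement in CLEANSING_RULES:
        text = pattern.sub(replacement, text)
    return text

cleaned = cleanse_address("  12   Main St.,  Apt 4 ")
```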