Data quality is very often seen as a brake or an obstacle for companies, even though it is at the center of their thinking.
Today, data quality drives many data-driven initiatives and projects. Implementing a data quality strategy has become essential, yet this step is clearly struggling to become fully integrated into business practices. Companies therefore need to know and master every aspect of the process in order to carry out their data transformation effectively and become “data-driven” organizations.
Get the best data quality
To achieve better data quality, it is paramount to make the data quality process more collaborative across the business. As the various business units increasingly want to regain control over their tasks, the challenge is to process and manage data quality as a team, combining an understanding of the objectives (provided by the business users) and the results generated by the use of data (provided by IT teams) with the necessary governance and control. From this perspective, the level of integration and communication of the Google Suite or Office 365 model, built on a large number of gateways and easy for business users to use, represents the objective to be achieved.
In addition, thanks to the implementation of “self-service” solutions, data quality has been standardized and industrialized, and levels of collaboration within companies have improved significantly. These solutions, such as data stewardship or data preparation, let users control the data they need, apply the necessary rules and ensure the data is available, while IT teams manage governance and data access. However, when data tools prove difficult to use, it is not uncommon to see users retreat to familiar territory, such as the Office suite, as soon as an obstacle stands in their way. Organizations thus find themselves confronted with silos that result mainly from a lack of understanding of data, or from a lack of initiatives aimed at improving “data literacy”.
Put the data in context
In an ideal world, users and the collaborators in charge of data processing would have no trouble finding common ground. The reality is quite different: on the one hand, business users understand “the language of data”, but on the other hand, employees do not master its subtleties and confine themselves to processing tasks, which are much less complex to understand. To overcome this problem, establishing a data culture is an ideal strategy, so that data can finally be treated as well-defined information. Tools occupy a fundamental place in data quality projects, but it is just as essential to ensure that all employees share the same understanding of the information.
It is important to keep in mind that data quality varies with the context. To determine quality, collaborators measure the state of the data against various factors (such as reliability and accuracy), but this is rarely done internally. Take completeness, for example: the data may or may not exist. But if the data does not exist, is that a problem?
Consider the following situation: the customer database contains “opt-in” and “opt-out” fields. If the customer is “opt-in”, it will be possible to find, for example, their mobile number; if the customer is “opt-out”, no personal information about them can be viewed. In that case it is precisely by its absence that the data is “valid” in the eyes of data compliance law. The context around the information thus determines how completeness should be interpreted.
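The opt-in/opt-out situation can be sketched as a completeness check that depends on context. This is only an illustration under stated assumptions: the `Customer` record, its field names and the `mobile_field_is_valid` function are hypothetical, and the rule that an opt-in customer should have a mobile number on file is an assumption made for the example, not a statement of what any regulation requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    # Hypothetical record layout, used only for this illustration.
    customer_id: str
    consent: str  # assumed values: "opt-in" or "opt-out"
    mobile_number: Optional[str] = None


def mobile_field_is_valid(customer: Customer) -> bool:
    """Context-aware completeness check for the mobile number field."""
    if customer.consent == "opt-in":
        # Consent given: the field is expected to be filled in.
        return bool(customer.mobile_number)
    if customer.consent == "opt-out":
        # No consent: the field is valid only if it is absent.
        return customer.mobile_number is None
    # Unknown consent status: flag the record rather than guess.
    return False
```

The same empty field is therefore a defect for one customer and the expected state for another, which is why a plain "is the field filled in?" check is not enough here.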
In putting data into context, certain types of tools and technologies play a key role. This is particularly the case for metadata, which is used via data inventory and data cataloging tools; thanks to them, users are able to find the data, know that it exists and understand it. The more data there is, the better the understanding of the data – and therefore of the information generated – will be.
But these tools are not the only ones to play a decisive role in putting data into context: rule repository, data preparation and data stewardship technologies make it possible to apply rules and turn so-called “raw” data into contextualized information for the business user.
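As a rough illustration of applying rules to raw data, the sketch below keeps a small set of rules in one place and applies them to a record. Everything in it is hypothetical: the `RULES` mapping, the field names and the `apply_rules` function are invented for the example and do not reflect how any particular rule repository or data preparation product works.

```python
from typing import Callable, Dict

Rule = Callable[[str], str]

# Hypothetical rule repository: one cleaning rule per field name.
RULES: Dict[str, Rule] = {
    "country": lambda value: value.strip().upper(),
    "email": lambda value: value.strip().lower(),
}


def apply_rules(raw: Dict[str, str], rules: Dict[str, Rule] = RULES) -> Dict[str, str]:
    """Return a copy of the raw record with each field's rule applied.

    Fields without a rule are passed through unchanged.
    """
    return {
        field: rules[field](value) if field in rules else value
        for field, value in raw.items()
    }


raw_record = {"country": " fr ", "email": " Jane.Doe@Example.COM", "name": "Jane"}
clean_record = apply_rules(raw_record)
```

Keeping the rules in a shared mapping rather than scattered through scripts is one simple way for business users and IT teams to apply the same definitions.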
In other words, it is essential to understand the data in the first place in order to be able to treat it later as a company asset in its own right.
Author: Patrick Peinoit, Principal Product Manager, Talend
(c) Fig. DepositPhotos