BellaDati Blog

Data Cleaning Near the Source

High-quality data drives good business decisions, increases revenue and saves costs. Cleaning data is expensive; it may account for up to 40% of total project costs. Some organizations willingly leave 20 or 30 percent of their data out of their analytics because bits are missing. In effect, they accept blatant inaccuracy as normal. Sweeping this multi-billion dollar challenge under the carpet is fundamentally misguided and creates a shaky foundation on which to build data-driven actions for companies or governments.

It’s time to break through denial and clean up data at the source instead of investing billions in subsequent data cleaning.

Fortunately, there is a proactive approach that offers a more rational path toward being data-driven. Organizations must begin to insist on much higher standards across the entire chain of data acquisition and use. Data needs to be cleansed not just prior to running the analytics, but as it comes in. Cleansing begins by resolving the big, obvious impurities near the source. The cleansing that takes place just prior to analytics should be seen as a final, super-fine filtering. BellaDati Advanced Analytics provides powerful features for data cleaning near the source. Any standard data collection form can be created in BellaDati, allowing data to be collected via browser, mobile, scanning or other sources. Data is cleaned near the source immediately, before it is used for analytics.

The following case demonstrates a data cleaning form for a customs office. When exporting or importing goods, importers, exporters or their agents must fill in standardized forms. Forms can be submitted electronically or on paper. With manual input, one company may be entered under multiple names, with an incorrect address or with a different company ID format. As a result, subsequent analysis of the data by the customs office becomes very difficult, and subsequent data cleaning is very costly and time-consuming; in many cases, achieving 100% clean data is impossible.

BellaDati offers a validation function for forms in any format. For electronic submissions, the BellaDati validation function ensures that only descriptions that match the master file can be entered. For paper submissions, BellaDati Framework gets the data from the scanner or other source, matches it against the master file authorized by the customs office and indicates in real time which fields on the form were filled in incorrectly. For those fields, the BellaDati Framework matching function offers similar values that are authorized. Values that are confirmed by the validator are subsequently stored in the database for further analytics.
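To make the idea of matching a field against a master file more concrete, here is a minimal Python sketch. It is not BellaDati's implementation or API: the function name `validate_field`, the sample company names and the similarity cutoff are all assumptions chosen for illustration, and the fuzzy matching uses Python's standard `difflib` module.

```python
from difflib import get_close_matches

# Hypothetical master file of authorized company names (illustrative data only).
MASTER_COMPANIES = [
    "Acme Trading Ltd",
    "Global Freight GmbH",
    "Blue Harbor Logistics",
]


def validate_field(value, master_values, max_suggestions=3, cutoff=0.6):
    """Check one form field against the master values.

    Returns (is_valid, suggestions). If the value matches a master entry
    (ignoring case and extra whitespace), it is valid and the canonical
    spelling is returned. Otherwise the closest master entries are returned
    so that a human validator can confirm one of them.
    """
    cleaned = " ".join(value.split())  # collapse stray whitespace
    lookup = {m.casefold(): m for m in master_values}

    key = cleaned.casefold()
    if key in lookup:
        return True, [lookup[key]]

    # Assumed similarity threshold; a real deployment would tune this.
    close = get_close_matches(key, list(lookup), n=max_suggestions, cutoff=cutoff)
    return False, [lookup[c] for c in close]


if __name__ == "__main__":
    print(validate_field("acme trading ltd", MASTER_COMPANIES))
    print(validate_field("Acme Tradng Ltd.", MASTER_COMPANIES))
```

In this sketch a misspelled entry is never stored automatically; it is only flagged, together with candidate corrections, which mirrors the step where the validator confirms the value.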

BellaDati analyzes in real time which combinations are authorized, for example the combination of company name, address and ID, and provides the matching.
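A combination check of this kind could look roughly like the following Python sketch. Again, this is an illustration under stated assumptions rather than BellaDati's actual logic: the record layout, the field names and the helper `find_incorrect_fields` are hypothetical, and the company ID is assumed to be the key of the master file.

```python
# Hypothetical master records keyed by company ID (illustrative data only).
MASTER_BY_ID = {
    "CZ12345678": {"name": "Acme Trading Ltd", "address": "1 Dock Road, Prague"},
    "DE87654321": {"name": "Global Freight GmbH", "address": "5 Hafenweg, Hamburg"},
}


def normalize(text):
    """Compare values case-insensitively and ignore extra whitespace."""
    return " ".join(text.split()).casefold()


def find_incorrect_fields(record, master_by_id):
    """Return the names of the fields that do not form an authorized combination.

    An empty list means the combination of ID, name and address is authorized.
    An unknown ID is reported as ["company_id"], because the other fields
    cannot be checked without a master record.
    """
    master = master_by_id.get(record["company_id"].strip())
    if master is None:
        return ["company_id"]
    return [
        field
        for field in ("name", "address")
        if normalize(record[field]) != normalize(master[field])
    ]


if __name__ == "__main__":
    submitted = {
        "company_id": "CZ12345678",
        "name": "ACME Trading Ltd",
        "address": "1 Dock Street, Prague",
    }
    print(find_incorrect_fields(submitted, MASTER_BY_ID))
```

Returning the list of offending fields, rather than a single yes or no, is what lets a form highlight exactly which entries need correction.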

Cleaning data near the source with BellaDati as the tool brings a proven monetary gain that outweighs the related costs many times over.

Check out the related use case video on the BellaDati YouTube channel: