Two of the most important normalization processes are accounting for the many iterations of company names and establishing an accurate company location. See the previously published article, “The ABCs of U.S. Customs Data- Issues & Shortcomings”. There can be dozens of iterations of the same company name, which wreaks havoc with the veracity of the data under analysis. The problem is evident in a cursory review of the trade intelligence applications offered by most data vendors.
To resolve these issues, the name and address fields contained on the bills of lading (for both shipper and receiver) are broken down into “tokens” and compared with a dynamically evolving referential database of “resolved” names and addresses. Accurately “geo-locating” the entity is actually the simpler of the two tasks. Zip codes, for the U.S. at least, follow a predictable pattern and typically occur at the end of the text string in the “address” block of the flat file.
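The tokenizing and zip code steps described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the function names, the list of "noise" tokens and the sample records are all assumptions made for the example.

```python
import re

# Hypothetical set of legal-form words ignored when comparing company names.
NOISE_TOKENS = {"inc", "llc", "ltd", "co", "corp", "corporation", "company"}

# A 5-digit U.S. zip code (optionally ZIP+4) at the very end of the string.
ZIP_AT_END = re.compile(r"(\d{5})(?:-(\d{4}))?\s*$")


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def name_key(raw_name):
    """Build a comparison key from a raw name, dropping noise tokens."""
    tokens = [t for t in tokenize(raw_name) if t not in NOISE_TOKENS]
    return " ".join(tokens)


def extract_zip(address):
    """Return the 5-digit zip code at the end of an address, or None."""
    match = ZIP_AT_END.search(address)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Two iterations of the same (made-up) company reduce to one key.
    print(name_key("ACME Widgets, Inc."))
    print(name_key("Acme Widgets Co"))
    print(extract_zip("123 MAIN ST MIAMI FL 33101-1234"))
```

A key built this way could then be looked up in the referential database of resolved names; anything that fails the lookup is a candidate for manual review.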
The two diagrams below are tables utilized within the fourth database involved in the third step of the transformation. The first diagram shows elements that are utilized to resolve company location. The second shows those necessary to resolve company name.
A separate, complementary and very important utility – called the company-location resolver – is THE essential cornerstone of the A.I. (Artificial Intelligence) Engine and is required to dynamically evolve and “educate” the system. More on that later.
The location – company match utility is a very nifty accessory and vital component of the A.I. Engine. Although the system is set up to quickly, accurately and automatically normalize U.S. Customs data, it also has the capacity to “learn” and improve its performance over time. Some of this learning takes place automatically as it gains more and more experience performing its daily processing rituals. Additional education is supplied manually by an operator.
For instance, perhaps during the last several days, weeks or months of processing routines, our A.I. Engine encountered some company name iterations that it hadn’t handled before and that weren’t in its library of established “tokens”. Conveniently, it would display these unresolved iterations, ranked by the number of occurrences, along with likely matches. With one stroke an operator could resolve and match all the aberrations or variations on a particular supplier or importer name or location… sometimes representing several hundred or several thousand individual BOLs.
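A review queue of this kind might look something like the sketch below. It is only an assumption about how such a utility could work: the fuzzy matching here uses Python's standard `difflib`, which is a stand-in and not necessarily the technique the A.I. Engine uses, and the function names and company names are invented.

```python
from collections import Counter
from difflib import get_close_matches


def review_queue(raw_names, resolved, max_suggestions=3, cutoff=0.6):
    """List unresolved names, most frequent first, with likely matches.

    `resolved` maps a known name iteration to its canonical name.
    """
    unresolved = Counter(n for n in raw_names if n not in resolved)
    canonical = sorted(set(resolved.values()))
    queue = []
    for name, count in unresolved.most_common():
        suggestions = get_close_matches(
            name, canonical, n=max_suggestions, cutoff=cutoff
        )
        queue.append((name, count, suggestions))
    return queue


def resolve(resolved, variants, canonical_name):
    """Record one operator decision for every listed variant at once."""
    for variant in variants:
        resolved[variant] = canonical_name


if __name__ == "__main__":
    resolved = {"ACME WIDGETS INC": "ACME WIDGETS"}
    raw = ["ACME WIDGETS INC", "ACME WIDGET", "ACME WIDGET",
           "ACME WIDGT", "ZENITH TOOLS"]
    for row in review_queue(raw, resolved):
        print(row)
    # One "stroke": the operator accepts the suggestion for both variants.
    resolve(resolved, ["ACME WIDGET", "ACME WIDGT"], "ACME WIDGETS")
    print(review_queue(raw, resolved))
```

Because each decision is written back into the `resolved` mapping, the same variant never reaches the queue a second time.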
Thus the A.I. Engine learned something new. And unlike its human counterparts, it will never have to ask the same question again.
The location – company match utility can also be used to link unlinked branch locations to their respective parent company or regional/divisional headquarters. Furthermore, it can process and link a client’s proprietary database of customers as well. In this fashion, one can monitor customers’ trading activity and supply chain operations on a daily basis! This information can be incorporated into a web application which is distributed within the secure company intranet or a protected proprietary web site. One example is Panalpina, a previous CenTradeX client, for whom we integrated their proprietary information into a customized web application for distribution to their regional sales offices.
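The branch-to-parent linking described above can be pictured as a simple lookup that is followed upward until no further parent is found. The sketch below is purely illustrative: the `parent_of` mapping, the function names and the entity names are hypothetical and are not taken from the actual system.

```python
from collections import Counter


def ultimate_parent(entity, parent_of):
    """Follow branch -> parent links until an entity has no parent.

    `seen` guards against an accidental loop in the link table.
    """
    seen = set()
    while entity in parent_of and entity not in seen:
        seen.add(entity)
        entity = parent_of[entity]
    return entity


def shipments_by_parent(bol_entities, parent_of):
    """Count BOLs per top-level company, rolling branches up."""
    return Counter(ultimate_parent(e, parent_of) for e in bol_entities)


if __name__ == "__main__":
    parent_of = {
        "ACME MIAMI": "ACME SOUTHEAST",  # branch -> regional headquarters
        "ACME SOUTHEAST": "ACME HQ",     # regional HQ -> parent company
    }
    bols = ["ACME MIAMI", "ACME SOUTHEAST", "ZENITH TOOLS"]
    print(shipments_by_parent(bols, parent_of))
```

Rolling activity up in this way is what would let a client watch a customer's total trade, rather than that of each branch separately.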