Cleansing your Customer Data using Dun & Bradstreet DQS service

This post has been republished via RSS; it originally appeared at: SQL Server Integration Services (SSIS) articles.

First published on MSDN on Jun 11, 2013

Starting today you can use SQL Server Data Quality Service (DQS) to cleanse data using Dun & Bradstreet (D&B) extensive business information data. This is done using DQS Reference Data Service (RDS) functionality. You can subscribe to the D&B Company Cleanse & Match for Microsoft SQL Server Data Quality Service from Windows Azure Marketplace , then connect to the service from your DQS client tool or SSIS DQS Cleansing transform task to cleanse your data.



The following are the step to use the D&B for your DQS data cleansing:


A. Subscribe to D&B Reference Data Service from Windows Azure MarketPlace


B. Configure DQS to use Reference Data Service


C. Create DQS knowledge base and attach to D&B Reference Data Service


D. Cleanse Data using D&B Reference Data Service from Data Quality Client


E. Automate data cleansing using SSIS DQS Cleansing transform task



A. Subscribe to D&B Reference Data Service from Windows Azure MarketPlace




  1. Create Windows Azure Marketplace account (if you have previously create an account with Windows Azure Marketplace, you can skip this step and login with your account)




  2. Navigate to Data > Data Quality Services > Company Cleanse & Match for Microsoft SQL Server Data Quality Services and subscribe. For trial subscription, please contact DNB_MS_Partnership_CoreTeam@DNB.com for a promotion code.




  3. Navigate to My Account and copy Primary Account Key information





B. Configure DQS to use Reference Data Service




  1. Start Data Quality Client. For information about this step, refer to Run the Data Quality Client Application .




  2. In the Data Quality Client home screen, under Administration, click Configuration.




  3. In the Reference Data tab, under the Network Settings area, type appropriate values in the Proxy Server and Port boxes if you or your organization uses proxy server to connect to the Internet.




  4. Specify the Marketplace primary account key (See Step A3 above) in the DataMarket Account ID box, and click the Validate DataMarket Account ID icon to validate the account key.




  5. Click close to go back to the main menu.






Note that you may want to extend the DQS RDS timeout setting. Refer to this article for the instruction.



Reference : Configure DQS to use Reference Data (msdn)



C. Create DQS knowledge base and attach to D&B Reference Data Service




  1. From Data Quality Client main menu, click New Knowledge Base




  2. Specify Name for your knowledge base (e.g. DNBCleansing), make sure "Domain Management" is selected as the activity and click Next to continue




  3. Click on create domain icon (see the screenshot below) and create the following domains (you can use default properties for this example; refer to Create a Domain for more information about creating DQS Knowledge Base domain)





  • CompanyName




  • State




  • Country




  • AddressLine ( optional )




  • City ( optional )




  • Zip ( optional )




  • ZipPlus4 ( optional )




  • DunsNumber ( optional )




  • OrderReasonCode ( required only if you are cleansing non US company )








  1. Create a composite domain by clicking on create composite domain icon (see the screenshot below), specify name (e.g. FullCompanyInformation) and select all domains you created in Step C3. The final configuration should look like below:







  1. Make sure the composite domain you created in step 4 is selected in the domain list and click Reference Data tab







  1. Click Browse and from the Online Reference Data Provider Catalog pop-up window, select D&B Cleanse and Match and map the schema to your domain (from step C3 above). If you are cleansing international company (which require "OrderReasonCode") or you want to include more domains, then click Create More Schema icon (see screenshot below), select additional schema, and map to your domains. Make sure that all domains in your composite domain (step C4) are mapped to a schema. The final mapping should look like below, click OK to continue.






  1. Click Finish publish your knowledge base.




Reference : DQS Knowledge Bases and Domains (msdn)



D. Cleanse Data using D&B Reference Data Service from Data Quality Client




  1. Start Data Quality Client. For information about this step, refer to Run the Data Quality Client Application .




  2. In the Data Quality Client home screen, click New data quality project




  3. In the New Data Quality Project screen:





  • In the Name box, type a name for the new data quality project (e.g. New Customer Data Cleansing)




  • In the Use Knowledge base list, click to select a knowledge base created in Step C (e.g. DNBCleansing)




  • Ensure that Cleansing is selected as activity




  • Click Next








  1. In the Map screen:





  • Select data source.  For testing purpose, I used the following file I created using Excel and save the file to C:\temp\NewCustomerData.xls (if you are using x64 machine, you may need to save the file as type "Excel 97-2003 Workbook". If you are cleansing international data, you need to include a column for "order reason code" and populate the column with value of "1" (make sure to set the excel column as string data type as you can only map data with the same type as your domain data type definition).









  • Select the input file, specify worksheet, and map the column to the domains created in Step C3 - the final mapping should looks like below and click Next to continue.








  1. In the Cleanse screen: click Start button, click Next to continue




  2. In the Manage and View Results you can review the results. Refer to Cleanse Data Using Reference Data (external) knowledge for more information on reviewing results.  Note that D&B also provides additional information about your data. You can review this information on the last step. For example, the below screenshot shows additional information such as address and DUNS number added to the valid company.






Reference: Cleanse Data Using Reference Data (External) Knowledge (msdn)



E. Automate data cleansing using SSIS DQS Cleansing transform task




  1. Create a new SQL Server Integration Services project using SQL Server 2012 Data Tool




  2. Configure your data source




  3. Select DQS Cleansing transform task from Data Flow toolbox




  4. Configure the  DQS Cleansing transform task (for more information, refer to Overview of the DQS Cleansing Transform )





  • Connect to Data Quality Server which host the Knowledge Base (created in Step C).




  • Map the column in your data source to the Knowledge Base domain (created in Step C3)





  • Configure the data destination




  • Execute SSIS




  • In this blog article, I describe using SQL Server Data Quality Service (DQS) to connect to 3rd party Service Provider such as Dun & Bradstreet to cleanse customer data. Alternatively, you can also create your own Knowledge Base to cleanse data. In the next article , I describe how to create your own DQS Knowledge Base for data cleansing.


    Leave a Reply

    Your email address will not be published. Required fields are marked *

    *

    This site uses Akismet to reduce spam. Learn how your comment data is processed.