When you are building an integration using SSIS, chances are you will need to update or insert data that has been created or changed in the source system on a regular basis. The SSIS job must be able to identify the newly created records versus the changes to existing records and perform the updates and inserts accordingly. The mechanism needed is fairly simple, and in today’s blog, we will walk you through all the steps needed to perform this action!
The traditional mechanism needed to deal with update and insert transactions consists of four steps:
1. Identify the collection of records to be processed by using the create/modify date of the record or other criteria that business rules may dictate.
2. Compare the records to the target database and identify updates vs. inserts.
3. Route the records to the update or insert step accordingly.
4. Perform the updates and inserts.
Using the approach above coupled with a merge join transformation, your data flow may look something like this:
Though this approach will work, it does have some steps that have a design, maintenance and performance cost such as the sorting, merge join, conditional split, and, most importantly of all, requiring you to map your data twice.
There is an alternative mechanism that can achieve the same results with simpler steps and will yield better performance.
Before you start, you will need to ensure that you have a temporary table that contains all of the attributes of the target table named identically and, ideally, using the same data types to ensure smooth transactions.
You can create the temporary table quickly using this SQL statement:
Now that you have the temporary table, you are ready to build the SSIS job using the steps below:
1. Add a dataflow and two Execute SQL tasks to the canvas.
2. In the dataflow task, add your data source (in this case, MSCRM), and then add an ADO/OLEDB/ODBC destination step. Map your source to insert the records into the temporary table.
3. In the first Execute SQL task, you will merge the records into the permanent table by using a merge statement.
The merge statement works by matching the records in the two tables to the selected key and executing an update or an insert based on the success of the matching criteria.
INTO investments stg
USING investments_tmp tmp ON (stg.investmentid = tmp.investmentid)
stg.CODE = tmp.CODE,
stg.CODEANDNAME = tmp.CODEANDNAME,
stg.CODESET = tmp.CODESET,
stg.ASSETS = tmp.ASSETS,
stg.ACCOUNTNUMBER = tmp.ACCOUNTNUMBER,
stg.TOTALVALUE = tmp.TOTALVALUE
4. In the second Execute SQL task, you will truncate the temporary table by using a statement similar to the one below.
Note: You can delete on a row-by-row basis with a more complex merge statement that does not require the truncate step, however, for today we will be keeping it separate for simplicity.
Now we have a complete job that will perform update and insert transactions much faster and with less components to maintain.