Tuesday, December 7, 2010

Sql Server Integration Services: Merge Join Problem


Unwanted behavior of merge joins in SSIS
Creating a package through Sql Server Integration Services, usage of MERG JOIN is common. Merge join is used when we need to combine data from two related data sources.
Merge join can only be implemented when both data sources are in proper ORDER. Ambiguous results can be observed some time when we use sort option as follow:
1.           Right click on source ---> Select Show Advance Editor… --->Select Input and Output properties --->Click on OLEDB Source Output --->On right side change “IsSorted” to true

2.   Expend OLE Source Output --->Output Columns ---> Select your desired column --->On right side change value for “SortKeyPossition” to 1, and 2 for your next column and so on


Implement merge join and get output in your desired target (Here we will use raw file source). Here are the ambiguous results.


SOLUTION:
                Always use explicit sort operations before merge join to avoid such ambiguous results.

NOTE:  This merge join problem can be observed occasionally. Not every merge join is problematic through  normal sort (through Advance Editor) process.

2 comments:

  1. Each OLE DB source query MUST be explicitly sorted as well, then you wont get such behavior:
    select column1, column2 from table order by column1

    ReplyDelete
  2. So what's the point of calling this a JOIN? It isn't a real JOIN if it requires sorted results prior to the record matching and it's only going to link up matching records.

    ReplyDelete

All suggestions are welcome