Introduction
SSIS is a great ETL tool and has been there now for a long time. But not always we get all the desired component that we need for transformation. For that we go ahead with Script component. But the disadvantage being that it is application specific. If we need to do the same kind of operation in other SSIS packages, then script component won't help. Instead we can go ahead with our own custom transformation component and even that can be reusable. This article will focus on building, deploying and using a custom script transformation component in a step by step manner
SSIS transformation components helps us to transform the source data to some desirable format. Though SSIS provides many transformation components, but, sometime we may need to perform certain actions for which there is no components available. In such cases custom components comes very handy. In this article we will look into how we can make a custom SSIS transformation component. The custom component is a simple email validator that will read the emails from some source and will let us know if it is a valid email or not. We will use C# as the language of choice.
Steps for Implementation
Step 1:
To begin with, let us create a Class Library project (say DFCEmailValidator).
Step 2:
Once the project is created, let us add the following dll references in our project.
Microsoft.SqlServer.Dts.Design
Microsoft.SqlServer.DTSPipelineWrap
Microsoft.SqlServer.DTSRuntimeWrap
Microsoft.SqlServer.ManagedDTS
Microsoft.SqlServer.PipelineHost
Step 3:
Let us create a class by the name CustomEmailValidator.cs .Inherit the class from the PipelineComponent class and decorate it with DtsPipelineComponent attribute
Step 4:
Build and deploy the component into the Gac
Now, we should build the class library, generate a Strong name and then deploy it to the gac by using the command
C:\Program Files\Microsoft Visual Studio 9.0\VC>gacutil -i "D:\DFCEmailValidator\bin\Debug\DFCEmailValidator.dll"
N.B.~ The dll path will vary... So change it accordingly
If successful we will get the message
Assembly successfully added to the cache
We can even verify this browsing to C:\WINDOWS\assembly where we will find the assembly by the name DFCEmailValidator
Step 5: Copy the dll and put into the PipeLineComponents folder
Once the previous step is over, next we can copy the dll to the PipelineComponents folder which will be in C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents.
Step 6:
Open BIDS. Go to the DataFlow editor’s toolbox. Right Click -> Choose Items.Visit the SSIS DataFlow Items tab and locate the CustomEmailValidator. Tick the checkbox option and click the OK button.
We can find that the custom transformation has been added to the toolbox
Step 7:
Drag and drop a Flat File Source and configure it. Let’s say we have a text file (Source.txt) whose content is as under
testmailwrong
testmail@test.com
invalid@invalid
valid@gmail.com
valid_123@yahoo.co.in
123@123@123@1123#456~xud.com
Next drag and drop our CustomEmailValidator and add the data flow path from Flat File Source to it. Double click on the CustomEmailValidator for opening the Editor.
The screen shows that it has three tabs viz Custom Properties, Input columns and Input Output Properties.
The first tab shows the Custom Properties that we set.
Visiting the Input Columns tab we find the available input columns and that needs to be selected. In this case it is Column0.
And in the InputOutput Properties, if we expand the Output columns , we can find that our Output column has been created automatically whose name is IsValidEmail_Column0.
Click Ok.Next drag and drop a Conditional Split and add the data flow path from the CustomEmailValidator to it. And Configure the Conditional Split component as under
Output Name | Condition |
Success | [IsValidEmail_Column0] == TRUE |
Failure | [IsValidEmail_Column0] == FALSE |
Lastly add two flat file destination component one of whose source will be the success path of the Conditional path and the other being the Failure. The destination files are named respectively as ValidEmail.txt and InvalidEmail.txt.
The entire package is as under
Step 8:
Let us run the application and we will see the valid emails in ValidEmail.txt
testmail@test.com
valid@gmail.com
valid_123@yahoo.co.in
while the InvalidEmail.txt contains the content as
testmailwrong
invalid@invalid
123@123@123@1123#456~xud.com
No comments:
Post a Comment