There is a staggeringly large volume of data available on the topic of using c# to create custom components for SSIS. What surprised me was the lack of suitability of that content *in one place* to deal with the issues that arose during my latest implementation.
This is probably more to do with me not being a very proficient developer who is overly reliant on google cut and paste. However. Here is a summary of my experience for my own future reference.
The plan was to create (initially) a custom data flow component that would connect to a web service and pass over the contents of a CSV, outputting the result. The CSV is effectively a data sheet of profile information relating to an email address that I need to populate into our email service providers service. They publish a web service (SOAP), and have published a number of sample API calls in c# which I had previously downloaded and successfully run through Visual Studio.
My first sources of information was the review the MSDN around the objects involved, the methods that need to be used etc. in order to generate a pipeline component (it was through googling this that I found out that it needed to be a pipeline component).
These resources include:
I then found this very useful blog article where someone had created a pipeline component so I nabbed the code and ran through it as an example to test whether I could do it – it worked.
There’s some initial findings along the way – such as building the class and finding the pipelinecomponents folder, and also installing the dlls into the GAC. I’m inexperienced at this sort of thing so it was a good learning curve.
The working credit card number validator had the bones of what I wanted to do – it accepted a file (in this case excel but it could also be a CSV I found out) – and processed that file. I found out during this process that the data flow converts data into a buffer object and then passes this buffer through the components. So it made no difference what file type it was, it was the incoming buffer that needed manipulation within the component.
The one thing this sample did not do was to connect to a web service, and this is where I hit the first real problems. Essentially I started to create a mashup of samples – taking the number validator as a base code and then lifting the references and methods from the API samples for what I wanted to do. This built fine, but when transferred into SSIS (which by this time I had discovered build events to copy the files to the appropriate directories and install into the GAC, closing and opening SSIS and refreshing the toolbox between each test as apparently the toolbox elements are cached on load…) it would fail. It would not be able to find the endpoint. Checking back through, the endpoints were all available in the web service reference files, and the tests in visual studio worked. Turns out , SSIS components ignore app.config and other settings files. I’m sure this is something to do with it being a closed off component but the build in VS generated a .dll.config file that was being ignored by the transfer. I found this very useful explanation as to what was happening and how to fix it:
I’m sure if/when my work moves towards some form of production I’ll need to revisit this when it starts failing again…
This then meant I was moving forwards with just adapting the sample code and plugging in the API examples as and when required.
All was fine using test data and I moved on to some more real tests and it started failing again, this time it was on the volume of data I was trying to extract into the buffer. Turns out there are limits – again set by bindings in settings files. This is obviously an issue as SSIS components ignore settings files, so I needed to understand which bindings that were requiring changing and then coding them in to the component directly so they were accessible. By doing this I then needed to work through issues around security etc.
In the end, all works fine – however it feels a little Heath Robinson with little planning for errors and proper organisation and coding practise, some elements I’d like to think more on if I had the time, including the web service accepts the CSV file as a byte array, I found myself hacking a routine that converted the buffer stream (which was originally a CSV file) back into a string structured like a CSV, so I could convert it into a byte array and pass it to the API. Clearly there are easier ways to do this, but this is what is in there at the moment.
Also, dealing with the buffer in general would be something I need to familiarise myself with more, there’s a lot more there to get my head around – as a reference, one aspect which I managed was:
For reference, custom tasks are obviously different here are some references around this: