iklan banner
MASIGNCLEAN104

How to configure OData SSIS Connection for SharePoint Online

iklan banner
As data warehouse developers, we often have to extract data from a myriad of source systems. Thus, whilst some source systems readily integrate with our ETL tools there are instances whereby we need to install additional drivers and software addons in order to successfully connect and extract data from other source systems. Microsoft SharePoint Online is one such source system that I recently had to extract data from and its connectors are by default not part of the standard SQL Server Integration Services (SSIS) package template. As SSIS developers we often don’t have solid background on environments such as SharePoint, thus figuring out which version of SharePoint Software Development Kit to install in order to enable successful connection from SSIS can sometimes be a frustrating experience. In this article, I try to alleviate some of that frustration by sharing some of my recent experiences relating to getting data out of a SharePoint list using SSIS.

Part 1: SharePoint List as a Data Source

We begin by looking at sample SharePoint environment that will be used as a data source and identifying a list of URL addresses that we will need in our SSIS package. Figure 1 gives a preview of the data contained within our sample ApexSQLFree SharePoint list – basically a list containing SQL addons produced by ApexSQL and licensed to the public for free.
Figure 1: Sample Office 365 SharePoint List
The first URL address to take note of, is the link to your SharePoint list, in my case that will be the web address to the ApexSQLFree SharePoint list, which is as follows:
  • https://mydomain.sharepoint.com/sites/SPSDemo/Lists/ApexSQLFree/sample.aspx
Obviously, assuming that you have a basic understanding of SharePoint, by looking at the URL address itself you would have noted the following:
  • mydomain is a placeholder for an actual domain;
  • SPSDemo is the name of the SharePoint site that this list is stored; and
  • ApexSQLFree is the name of the list.
We can use the SharePoint URL to derive a REST API web address that will be suffixed with listdata.svc. It is important that we get a *listdata.svc URL as we will need it for establishing connection to the SharePoint list using the REST API in an SSIS package. What worked for me in terms of deriving the *listdata.svc URL was appending /_vti_bin/listdata.svc just after the site name in the above SharePoint list URL such that it becomes the following:
  • https://mydomain.sharepoint.com/sites/SPSDemo/_vti_bin/listdata.svc/
Obviously, you can always test the generated *listdata.svc URL by running it into a browser as shown in Figure 2.
Figure 2

Part 2: SSIS Development – Configuring OData Source for SharePoint Online

Having identified the SharePoint list from which data will be sourced, we switch to SSIS and configure the necessary components for SharePoint data extraction.

  1. Configuring the OData Source Connection Manager
    The first SSIS component that ought to be configured is the OData Connection Manager. This connection type is available under New Connection… option in SSIS’s Connection Managers tab as shown in Figure 3.
    Figure 3
    By clicking on the New Connection… option, the OData Connection Manager Editor will popup.
    Fill in the Service document location box with your *listdata.svc URL and on the Authentication Typedropdown, choose “Microsoft Online Services”. The Microsoft Online Services will further require that you specify a username and password. Finally, if everything has been captured correctly, clicking the Test Connection button at the bottom left of the editor should return a “Test connection succeeded” message as shown in Figure 4.
    Figure 4
  2. Configuring the Data Flow Task
    Having successfully created and tested our OData connection, we are now ready to begin data extraction and SSIS facilitates such an exercise through its Data Flow Task component. Within the Data Flow Task, we start off by configuring our data source component – which in this case will be OData Source as indicated in Figure 5.
    Figure 5
    Once the OData Source has been added into the Data Flow Task pane and its editor opened, under the OData connection manager label we select the OData connection we configured above. As indicated in Figure 6, such a selection will result in the rest of the boxes being auto-populated (except for Query options).
    Figure 6
    You can use several query options in your OData source connection as outlined here. For instance, if you want to return a subset of data from your SharePoint list, you can use the $top query option as indicated in Figure 7.
    Figure 7
    For the purposes of showing you how the filtering works, I duplicated the components in the data flow tasks such that one section applies the $top query option and the other doesn’t. As indicated in Figure 8, the first flow retrieves only 2 rows compared to the other section that returns all (7) rows.
Share This :