ssis etl within an hour · pdf filessis etl within an hour ... ssis naming conventions ......

54
SSIS ETL Within an Hour Alex Junk Braden Smith ITS – Business Intelligence Shared Service Center (BISSC)

Upload: doandang

Post on 07-Mar-2018

223 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS ETL Within an Hour Alex Junk

Braden Smith

ITS – Business Intelligence Shared Service Center (BISSC)

Page 2: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

What is ETL?

“Extract-Transform-Load”

Process of accessing and manipulating data from sources to load into Data Warehouse

Process can vary based on situation and goals

Page 3: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

What is SSIS?

SQL Server Integration Services

Microsoft’s platform that can be used to build ETL solutions

Includes built in components for ETL

Control Flow

Task Execution Order, Containers, & Functionality

Data Flow

Sources, Transformations, Destinations

Extended with Task Factory components and others such as Attunity drivers

Task Factory Example - Fact Destination Table

Page 4: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

UI Architecture

Can deploy to:

IOWASSISHOST1.iowa.uiowa.edu (2012 SSIS Server)

IOWASSISTEST3 & IOWASSISHOST3.iowa.uiowa.edu (2016 SSIS Server)

Secure Zone Servers:

IOWASSISHOST2.iowa.uiowa.edu (2012 SSIS Server)

IOWASSISTEST4 & IOWASSISHOST4.iowa.uiowa.edu (2016 SSIS Server)

*** TEST4 & HOST4 are still being provisioned at this time

Page 5: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS Lifecycle

Initial Project Development:

Connect to and develop on REMOTEBI.iowa.uiowa.edu

Deploy the project to the Test SSIS server

Test and verify correct execution of the SSIS project

Promote the project to the Production SSIS server

Ongoing Project Development:

Re-factor or make changes to the project on REMOTEBI.iowa.uiowa.edu

Re-deploy the project to the Test SSIS server

Test and verify correct execution of revisions made to SSIS project

Re-deploy the project to the Production SSIS server

Page 6: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

ETL in an Hour Demo

Install Visual Studio 2015

Visual Studio licensing

https://helpdesk.its.uiowa.edu/software/download/msselect/MS%20Select%20Products%205-19-2017.pdf

Install SQL Server Data Tools (SSDT)

https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt

Setup GIT repository on the University BitBucket service

SSIS Naming Conventions

http://sqlblog.com/blogs/jamie_thomson/archive/2012/01/29/suggested-best-practises-and-naming-conventions.aspx

Page 7: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

ETL in an Hour Resources

Visual Studio licensing

https://helpdesk.its.uiowa.edu/software/download/msselect/MS%20Select%20Products%205-19-2017.pdf

Access to REMOTEBI.iowa.uiowa.edu

Email [email protected] to request access

SQL Server Database Request Form (Workflow)

https://workflow.uiowa.edu/form/its-sql-server-database-request

BISSC

Alex Junk ([email protected])

Braden Smith ([email protected])

Page 8: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Creating Project

Open Visual Studio

File>New>Project…

Templates>Business Intelligence>Integration Services

Name: BIC_SSIS

Solution Name: BIC_Demo

* Make sure “Create Directory for Solution” is checked

Page 9: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Encrypting Solution

Right Click on BIC_SSIS>Properties to change settings

Change protection level to at minimum “Encrypt Sensitive Data with Password”

Choose a password and remember that for a later step

Page 10: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configuration Properties

Navigate to “Configuration Properties”>”Target Server Version” within the properties pane on the right hand side

Select Appropriate SSIS deployment server version (2012, 2014, 2016, 2017)

Warning Message Prompts: Change Package Security Warning and Target Server warning (Click OK or Yes for both

Page 11: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS Package Setup

Delete original Package.dtsx and recreate another one (in the Solution Explorer window, right click SSIS Packages, click on New SSIS Package)

Rename SSIS Package to something more descriptive (GL_Demo)

Creating a new SSIS package will inherit the encryption ProtectionLevel from the project

Page 12: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Create Connection Manager

Right-click in the “Connection Manager” Area, then select “New OLE DB Connection…”

Page 13: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Connection Manager

Change Provider to Oracle Provider for OLE DB

Set Server or File Name to: DWTEST.WORLD

Use a specific user name and password for authentication (EX: Your_app_User)

Test Connection to verify connection is ok

Page 14: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Finish Connection Configuration

Rename connection to something more descriptive EX: Source_OracleDW

Convert connection to Project Connection (This allows SSIS packages to use the same connection)

Right Click Connection> “Convert to Project Connection”

Page 15: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Parameterize Connection

Right Click the connection, then select “Parameterize”

Page 16: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Create Parameters

As a guideline it is good to parameterize at least the following DB Connection Properties depending on authen:

Oracle DB Parameters: ServerName, UserName, Password

SQL DB Parameters: ServerName, InitialCatalog, UserName, Password

Repeat Process for each parameter needing to be parameterized

Page 17: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Parameter Location

These New Parameters will be located within the Project.Params location

Page 18: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Create a Data Flow Task

Drag and drop Data Flow Task from the SSIS Toolbox window onto the workspace area

Rename Data Flow Task to something more descriptive (DFT - Import GL data to SQL)

Double click on the Data Flow Task

Page 19: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Create Data Source

Create a data source by dragging and dropping the OLE DB Source item onto the workspace area

Rename OLE DB Source to something more descriptive (ORA_SRC – GL_DTL_DAILY)

Page 20: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Data Source

Double click the ORA_SRC – GL_DTL_DAILY data source

Data access mode>SQL command

Page 21: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Add SQL Command Paste in SQL query for the data wanting to be retrieved (example below):

SELECT

GL_CORP_ELEM_CD,

GL_JRNL_ID,

GL_JRNL_EFFECT_DT,

GL_JRNL_SEQ_NBR,

GL_JRNL_LN_NBR,

ACCTG_PRD,

ACCTG_FISC_YR,

SUBLEDGER_ACCT,

SOURCE_CD,

GL_JRNL_DESC,

SSN,

CHECK_NBR,

VOUCH_NBR,

PO_NBR,

GL_JRNL_AMT,

GL_LOAD_DT,

SOURCE,

LEDGER

FROM

GL.GL_DTL_DAILY

Page 22: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

View Columns

Click on the Columns on the left pane of the Source Editor to view the columns being loaded from the data source. Modifying the Output Column is optional and may be useful for some data loads.

Page 23: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Add Derived Column

Find Derived Column within the Common section of the SSIS Toolbox and drag it onto the workspace area.

Rename as appropriate for the column being created (DER – Concatenate columns together)

Page 24: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Derived Column

Single click on ORA_SRC - GL_DTL_DAILY

Click and drag the blue arrow to DER – Concatenate columns together

Now the components are connected!

Page 25: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Add Sort

Drag and drop a Sort component onto the workspace area

Rename it SRT - Remove GL_DTL_DAILY duplicates

Single click on DER - Concatenate columns together

Click and drag the blue arrow to SRT - Remove GL_DTL_DAILY duplicates

Page 26: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Sort

Double click on SRT - Remove GL_DTL_DAILY duplicates

Select all the Available Input Columns needing to be sorted (to remove duplicate rows, select all the columns that make up a unique identifier)

Click the checkbox next to Remove rows with duplicate sort values

Page 27: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Add Destination

Drag and drop an OLE DB Destination onto the workspace area

Rename it SS_DST - GL_DTL_DAILY

Single click on SRT - Remove GL_DTL_DAILY duplicates

Click and drag the blue arrow to SS_DST - GL_DTL_DAILY

Page 28: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Destination Connection

Double click on SS_DST -GL_DTL_DAILY

Click on the OLE DB connection manager dropdown

Select the appropriate database destination

Page 29: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Destination

Leave Data access mode as Table or view - fast load

Click to Name of the table or the view dropdown and select the appropriate destination table (GL_DTL_DAILY)

Page 30: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS Create Table

*** If a table is not already setup, SSIS can create the table by clicking New… next to the Name of the table or the view dropdown

Make sure to modify the table name within the CREATE TABLE statement

For the CREATE TABLE statement to work correctly, the credentials used for the Connection Manager must have table create permissions within the database

A recommendation is to copy and paste the CREATE TABLE statement into SQL Server Management Studio (SSMS), DBVisualizer, or a text-editing program in order to add columns, such as an identity column, and constraints, such as a primary key

Also note that SSIS tries to map each column to the correct datatype, but sometimes this can be problematic (i.e. – Oracle timestamp columns)

Page 31: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS Create Table…Cont.

Notepad++ (Changed table name to GL_DTL_DAILY and added an identity column GL_DTL_DAILY_ID

Added a primary key constraint at the end of the CREATE TABLE statement

When the CREATE TABLE statement is finished, copy and paste it back into the SSIS Create Table window

Click OK to execute the CREATE TABLE statement (this will create the table in the database)

Page 32: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Column Mappings

Click on Mappings to display the Available Input Columns, coming from the SSIS package and the Available Destination Columns, residing in the destination database

Make sure every Input Column needing to be stored in the destination database is mapped to the appropriate column

To map the columns, click and drag from one of the Available Input Columns to the appropriate Destination Column

Click OK

Page 33: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Congratulations! This first ETL is complete.

Page 34: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Deployment

Right click on SSIS project wanting to be deployed or re-deployed to SSIS server>Deploy

Click Next

Page 35: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Configure Deployment

Enter Server name if it isn’t populated already (IOWASSISTEST3.iowa.uiowa.edu)

Click Browse next to the Path textbox

Expand SSISDB and select your workgroup folder (BISSC)

Click OK

Page 36: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Advance Deployment

Click Next

Page 37: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Deploy

Click Deploy

Page 38: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Finish Deployment

Click Close when all the Results say Passed

Page 39: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Create Environment

Create Environments on the SSIS server

Connect to SSIS server using SSMS

Expand Integration Services Catalogs>SSISDB>WorkgroupCatalog

Right click Environments>Create Environments…

Page 40: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Name Environment

Enter Environment name (BIC_Demo - TEST)

Click OK

Create another environment (BIC_Demo - PROD)

Page 41: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Environmental Variables - Test

Right click on the test environment (BIC_Demo -TEST)

Click Properties

Select the Variables page (in the upper left hand corner)

*** Name the variables the same as what are in the SSIS Project Params

Click in the Name field to create the first environmental variable (Destination_SQL_DB_ServerName)

Select the Type of variable (String)

Place the value of the variable for the TEST environment (IOWASQLDMTEST1.iowa.uiowa.edu)

Repeat this process for all the TEST environment variables needed

Click OK when all the variables are created

Page 42: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Environmental Variables - PROD

Perform the same variable creation for the PROD environment (BIC_Demo -PROD)

Name the variables the same as what are in the TEST environment

Click in the Name field to create the first environmental variable (Destination_SQL_DB_ServerName)

Select the Type of variable (String) Place the value of the variable for the

TEST environment (IOWASQLDMHOST1.iowa.uiowa.edu)

Repeat this process for all the PROD environment variables needed

Page 43: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Linking Environmental variables to SSIS Projects

Expand Integration Services Catalogs>SSISDB>Workgroup Space (BISSC)>Projects

Right click on the SSIS project

Click Configure…

Page 44: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Linking Environmental Variables – Cont.

Click on References (upper left hand corner)

Click Add… to link the previously created environment to the SSIS project

Expand Local Folder

Click on TEST environment (BIC_Demo - TEST)

Click OK

Repeat to add PROD environment

Page 45: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Map Parameters

Switch back to Parameters (upper left hand corner)

There is a list of the Project Parameters which were created within the SSIS Project

Click on the ellipsis next to the Value of the first variable

Select Use environment variable radio button

Click the dropdown and select the environmental variable to use (they should be the same name)

Click OK Repeat this process for all the variables

listed

Page 46: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Complete Setup

Once completed, the Value column for all the Parameters should have an underlined value (this indicates that it is using an environmental variable)

Click OK

The SSIS Project is now setup to use environmental variables

Page 47: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Execute Using Environmental Variables

Expand Integration Services Catalogs>SSISDB>Workgroup Space (BISSC)>Projects>SSIS Project (BIC_SSIS)>Packages

Right click on the package needing to be executed (GL_Demo.dtsx)

Click Execute…

Page 48: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Select Environment

Click Environment checkbox

Click the dropdown and select the environment for the package execution (.\BIC_Demo - TEST)

Click OK

Page 49: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

BI Architecture

Page 50: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

SSIS Resources

Stairway to Integration Services (21 part tutorial)

http://www.sqlservercentral.com/stairway/72494/

Pluralsight series by Stacia Misner Varga

https://app.pluralsight.com/library/courses/ssis-basic

https://app.pluralsight.com/library/courses/ssis-advanced

https://app.pluralsight.com/library/courses/ssis-advanced-part2

How to get Pluralsight through the university

https://its.uiowa.edu/pluralsight

Page 51: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Roadmap

SSIS Scale Out

Page 52: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Tie Parameterized Variable to Visual Studio Configuration Settings

First, with the Project Parameters window open, click the Add Parameters to Configurations button.

Page 53: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Tie Parameterized Variable to Visual Studio Configuration Settings This opens the Manage Parameters Values window which lets you assign different values to each

configuration for each parameter. Once that is set up, as you change the active configuration in Visual Studio the appropriate values will be automatically assigned to the parameters. This is completely separate from the Environments that are set up in SQL Server, it’s just useful when running within Visual Studio so you can easily and automatically use different values depending on the active configuration.

Page 54: SSIS ETL Within an Hour · PDF fileSSIS ETL Within an Hour ... SSIS Naming Conventions ... practises-and-naming-conventions.aspx. ETL in an Hour Resources

Additional Resources

Visual Studio Licensing Information:

https://helpdesk.its.uiowa.edu/software/download/msselect/MS%20Select%20Products%205-19-2017.pdf

Request access to RemoteBI server (Workflow):

https://workflow.uiowa.edu/form/remotebi-access-request

RemoteBI server information:

https://its.uiowa.edu/support/article/110181

SQL server database request form (Workflow):

https://workflow.uiowa.edu/form/its-sql-server-database-request