advanced etl ms ssis 2012 & talend

Post on 27-Jan-2015

Advanced ETL - SSIS 2012 & Talend

By

Sunny Okoro

Contents

Database Systems

Applications

Microsoft SQL Server Integration Services 2012

Talend Open Studio 5.4

Database Systems

Microsoft SQL Server 2008R2

Microsoft SQL Server 2012

Applications

Microsoft Visio

Microsoft Visual Studio 2010

Microsoft SQL Server Integration Services 2012

Example of Flat File Creation

The connection string ensures that the file is created in the right folder with the right name, as declared in the SSIS variable.
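
As a rough illustration of what such a connection string expression does, the Python sketch below builds an output path from a folder value and the country and state codes. The function name, the folder, and the file-naming pattern are assumptions made for illustration; the package's actual expression appears only in the screenshots.

```python
from pathlib import PureWindowsPath


def build_output_path(folder: str, country_code: str, state_code: str, report: str) -> str:
    """Combine a folder and the two codes into a file path (hypothetical naming pattern)."""
    file_name = f"{country_code}_{state_code}_{report}.csv"
    return str(PureWindowsPath(folder) / file_name)


# Hypothetical folder and report name, used only to show the shape of the result.
print(build_output_path(r"C:\ETL\Output", "AU", "VIC", "SalesByCity"))
```

Because the codes come in as parameters, changing the state code changes every file name without touching the rest of the logic, which is the effect the SSIS variable has on the connection string.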

Example of Pivot Creation

This data flow task contains many tables, files, aggregations, and derived columns; not all of them will be illustrated. The previous demonstrations illustrate some of the key components in this data flow. The following illustrations demonstrate the major expressions used in derived columns to transform the data.
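
For readers who cannot see the screenshots, the pivot step in this section can be pictured with a small Python sketch that turns long-format rows into one row per key with a column per pivot value. The column names and sample figures are made up for illustration and are not taken from the report tables.

```python
from collections import defaultdict


def pivot(rows, row_key, column_key, value_key):
    """Turn long-format rows into one dict per row_key, with one entry per column_key value."""
    table = defaultdict(dict)
    for row in rows:
        cells = table[row[row_key]]
        # Sum values that land in the same cell, as an aggregating pivot would.
        cells[row[column_key]] = cells.get(row[column_key], 0) + row[value_key]
    return dict(table)


# Made-up sample data.
sales = [
    {"city": "Melbourne", "fiscal_year": 2007, "amount": 100.0},
    {"city": "Melbourne", "fiscal_year": 2008, "amount": 150.0},
    {"city": "Geelong", "fiscal_year": 2007, "amount": 40.0},
    {"city": "Melbourne", "fiscal_year": 2008, "amount": 25.0},
]
pivoted = pivot(sales, "city", "fiscal_year", "amount")
print(pivoted)
```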

The stored procedure, executed from SQL Server Management Studio, returns null data that will be transformed to a specific value using an expression in SSIS.
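
The idea of replacing nulls with a specific value can be sketched in Python as follows. This is only an analogy for what the SSIS expression does; the column names and default values are hypothetical.

```python
def replace_nulls(row: dict, defaults: dict) -> dict:
    """Return a copy of row in which None values are swapped for per-column defaults."""
    return {
        column: defaults.get(column, value) if value is None else value
        for column, value in row.items()
    }


# Hypothetical row as it might come back from the stored procedure.
row = {"City": "Melbourne", "MiddleName": None, "Sales": None, "Region": None}
cleaned = replace_nulls(row, {"MiddleName": "N/A", "Sales": 0})
print(cleaned)
```

Columns without a configured default (Region here) are left as they are, so the substitution stays explicit per column.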

Countrycode = AU [AUSTRALIA]

Statecode = VIC [VICTORIA]

EXECUTION

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

For this execution, the countrycode is changed to US (USA) and the statecode to CA. The [SalesRpt_FiscalYr_City] table does not contain any Australian cities from the previous demonstration because the table is truncated at the beginning of each package execution.

The countrycode remained the same, but the statecode was changed to IL. The data contents for the state of Illinois were created in the same folder as the state contents for Victoria. The prefixes were changed to IL for each file name to reflect the countrycode and statecode, which was done using file connection strings.

No data was found for the city, which was in California in the previous execution of this package. I will change the countrycode to CA and the statecode to BC.

The output folder is cluttered, and SSIS will delete all content in the output folder at the beginning of each execution.

The previous content has been deleted by SSIS using the File System Task, which can also be used to create directories, copy files, and so on. The output folder has no content for Great Britain.
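
The clean-up step can be approximated outside SSIS with a few lines of Python. This is a sketch of the same idea (empty the folder, keep the folder itself), not a description of how the File System Task is implemented; the folder and file names are invented for the demo.

```python
import shutil
import tempfile
from pathlib import Path


def clear_folder(folder: Path) -> int:
    """Delete everything inside folder but keep the folder itself; return items removed."""
    folder.mkdir(parents=True, exist_ok=True)
    removed = 0
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


# Demo in a throwaway directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as tmp:
    output = Path(tmp) / "Output"
    (output / "old").mkdir(parents=True)
    (output / "old" / "AU_VIC.csv").write_text("stale")
    (output / "IL_report.csv").write_text("stale")
    removed_count = clear_folder(output)
    remaining = list(output.iterdir())
print(removed_count, remaining)
```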

These files will be imported into the MS SQL Server database using a Foreach Loop that picks up each CSV file and loads it into the product tables.
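
The loop-over-files pattern can be sketched in Python as shown below. An in-memory SQLite database stands in for SQL Server so the example runs anywhere, and the table name, column names, and file names are assumptions for illustration only.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_files(folder: Path, conn: sqlite3.Connection, table: str) -> int:
    """Insert the rows of every *.csv file in folder into table; return rows loaded."""
    loaded = 0
    for csv_path in sorted(folder.glob("*.csv")):
        with csv_path.open(newline="") as handle:
            rows = [(r["ProductID"], r["Name"]) for r in csv.DictReader(handle)]
        # The table name is trusted here; only the values are parameterized.
        conn.executemany(f"INSERT INTO {table} (ProductID, Name) VALUES (?, ?)", rows)
        loaded += len(rows)
    conn.commit()
    return loaded


# Demo with two made-up CSV files and one file the loop should skip.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "products_1.csv").write_text("ProductID,Name\n1,Helmet\n2,Gloves\n")
    (folder / "products_2.csv").write_text("ProductID,Name\n3,Tire\n")
    (folder / "notes.txt").write_text("ignored")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (ProductID INTEGER, Name TEXT)")
    total_loaded = load_csv_files(folder, conn, "Product")
    names = [row[0] for row in conn.execute("SELECT Name FROM Product ORDER BY ProductID")]
print(total_loaded, names)
```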

For this demonstration, the Talend ETL application will be used to transform the data into an XML format that can be recognized by SSIS.

Data Mapping

The Adworks XML document and the Adworks XSD document are created in the XML folder.

Data Validation

Only the pivot-based reports are displayed in full. The rest of the reports are snapshots, not the entire data extracted from the database.

Another way to create the XML format is to use T-SQL XML features such as FOR XML AUTO with the ELEMENTS directive to convert the query result into XML and write it to an XML file that can be read by SSIS. This method is much faster for smaller data sets, but not for big data in a laptop environment.
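
To show the element-centric shape that this kind of output has, here is a minimal Python sketch that serializes query-like rows with one child element per column. It is written in Python rather than T-SQL and only imitates the general layout; the root name, row name, and sample rows are assumptions, and skipping null columns is a choice made in this sketch.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows, root_name: str, row_name: str) -> str:
    """Serialize rows as element-centric XML: one child element per non-null column."""
    root = ET.Element(root_name)
    for row in rows:
        row_element = ET.SubElement(root, row_name)
        for column, value in row.items():
            if value is None:
                continue  # sketch choice: leave out columns that have no value
            ET.SubElement(row_element, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up rows standing in for a query result.
customers = [
    {"CustomerID": 1, "City": "Melbourne"},
    {"CustomerID": 2, "City": None},
]
xml_text = rows_to_xml(customers, "Customers", "Customer")
print(xml_text)
```

A single root element is used so that the result is a well-formed document that an XML reader can load in one pass.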

All of the results are abridged

Instead of inserting data for all countries when the package is executed, SSIS will insert data using the countrycode and statecode highlighted above, plus the additional countrycode, to determine which table to populate.

Only the Australian table is populated. The remaining tables were ignored because the conditions of their expressions in the Conditional Split did not evaluate to true.
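
The routing behaviour described here can be mimicked with a short Python sketch. The output names and the CountryCode column are hypothetical, and the real conditions are the ones configured in the Conditional Split shown in the screenshots.

```python
def conditional_split(rows, selected_country: str):
    """Route rows for the selected country to that country's output; ignore the rest."""
    outputs = {"AU": [], "CA": [], "US": []}
    for row in rows:
        code = row["CountryCode"]
        # A row is kept only when its condition is true for the selected country.
        if code == selected_country and code in outputs:
            outputs[code].append(row)
    return outputs


# Made-up customer rows.
customers = [
    {"Name": "A", "CountryCode": "AU"},
    {"Name": "B", "CountryCode": "US"},
    {"Name": "C", "CountryCode": "AU"},
]
result = conditional_split(customers, "AU")
print({code: len(rows) for code, rows in result.items()})
```

With "AU" selected, only the Australian output receives rows; the Canadian and American outputs stay empty, matching the behaviour described above.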

Australian Customer data. All of the results are abridged.

Canadian Customer data. All of the results are abridged.

American Customer data. All of the results are abridged.

Talend Open Studio 5.4

JDBC drivers have to be added manually to connect to different platforms such as Oracle, MySQL, Sybase SQL Anywhere, PostgreSQL, and DB2. Talend also allows ODBC to be used for connections instead of the traditional JDBC. I have worked with Java-based applications such as Oracle SQL Developer and JDeveloper; in my experience, ODBC does not work well in these environments, even where the option is available.

All of the results are Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Only the Excel files will be read and loaded into the database.
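
Restricting a job to Excel files amounts to filtering the folder listing by file extension. The Python sketch below shows that filter in isolation; the set of extensions and the sample file names are assumptions, and in Talend the equivalent is configured in the job rather than coded by hand.

```python
from pathlib import Path

# Assumed set of Excel extensions for this sketch.
EXCEL_SUFFIXES = {".xls", ".xlsx"}


def excel_files(names):
    """Keep only the names whose extension marks them as Excel workbooks."""
    return [name for name in names if Path(name).suffix.lower() in EXCEL_SUFFIXES]


# Made-up folder listing.
listing = ["sales.xlsx", "sales.csv", "CUSTOMERS.XLS", "adworks.xml"]
selected = excel_files(listing)
print(selected)
```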

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

Results Abridged

All the file names include the countrycode passed through the context.
