DMDW 8. Student Presentation - Groovy to MongoDB

Upload: johannes-hoppe

Post on 18-Dec-2014

DESCRIPTION

8. ETL Project by Maximilian Butterer

TRANSCRIPT

Page 1: DMDW 8. Student Presentation - Groovy to MongoDB

DMDW - ETL-PROJECT THE GROOVY-WAY

BY MAXIMILIAN BUTTERER

Saturday, 14 May 2011

Page 2: DMDW 8. Student Presentation - Groovy to MongoDB

WHAT WAS THE JOB

EXTRACT THE DATA FROM ROOM-PLAN-FILE (EXCEL)

TRANSFORM IT INTO NEW STRUCTURE (DATATYPES)

LOAD IT INTO NEW TARGET (E.G. DATABASE-SYSTEM)

CREATE SOME KIND OF DOCUMENTATION (HOW-TO) OR PRESENT IT TO YOU

Page 3: DMDW 8. Student Presentation - Groovy to MongoDB

#1: EXPORT THE DATA

EXPORTING DATA FROM EXCEL IS PRETTY EASY

SO I EXPORTED THE WHOLE FILE AS A CSV FILE

1ST PROBLEM: THE DELIMITER IS A SEMICOLON, NOT A COMMA

2ND PROBLEM: ENCODING IS NOT UTF-8
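
Both export problems can be handled while reading the file. The following is a minimal Groovy sketch, not the script from the presentation: the helper name `readSemicolonCsv`, the `windows-1252` charset, and the sample columns are assumptions made for illustration. It reads the file with an explicit charset and splits every line at the semicolons.

```groovy
// Hypothetical helper: reads a semicolon-delimited export and returns
// one list of trimmed fields per non-blank line.
// The default charset is an assumption; check what your Excel export really uses.
List<List<String>> readSemicolonCsv(File file, String charset = 'windows-1252') {
    file.readLines(charset)
        .findAll { it.trim() }                          // skip blank lines
        .collect { line ->
            // limit -1 keeps trailing empty fields, e.g. "a;b;;" yields 4 fields
            line.split(';', -1).collect { it.trim() }
        }
}

// Made-up sample data written in the assumed source encoding
def sample = File.createTempFile('roomplan', '.csv')
sample.deleteOnExit()
sample.setText('Room;Date;Event\nA 1.01;14.05.2011;Pr\u00fcfung\n', 'windows-1252')  // \u00fc is "ü"

def rows = readSemicolonCsv(sample)
println "read ${rows.size()} lines, header: ${rows[0]}"
```

A plain `split` is only sufficient as long as no field contains a quoted semicolon; for such files a real CSV parser is the safer choice.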

Page 4: DMDW 8. Student Presentation - Groovy to MongoDB

#2: A HELPER

BECAUSE CSV IS PLAIN TEXT, IT'S EASY TO PARSE

I CREATED A GROOVY SCRIPT FOR THE CONVERSION

MY SOLUTION IS NOT THE CLEANEST WAY, BUT IT IS PROBABLY THE EASIEST

3RD PROBLEM: THE DATA IS NOT CONSISTENT / THERE ARE CORRUPTED DATA SETS
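
One way to deal with inconsistent records is to separate usable rows from corrupted ones before converting anything. This is a sketch under assumptions: the slides do not say which checks were applied, so the helper `isUsable`, the expected field count, and the required columns are hypothetical, and the rows are made-up sample data.

```groovy
// Hypothetical rule: a row is usable when it has the expected number of fields
// and none of the required fields is empty.
boolean isUsable(List<String> fields, int expectedSize, List<Integer> requiredIdx) {
    fields.size() == expectedSize && requiredIdx.every { fields[it]?.trim() }
}

// Made-up sample rows: room, date, event
def rows = [
    ['A 1.01', '14.05.2011', 'Exam'],
    ['A 1.02', '',           'Lecture'],   // required date is missing
    ['A 1.03', '15.05.2011'],              // too few fields
    ['B 2.01', '16.05.2011', '']           // empty optional field is accepted
]

// split() returns two lists: rows that pass the check and rows that fail it
def (good, bad) = rows.split { isUsable(it, 3, [0, 1]) }

println "usable: ${good.size()}, rejected: ${bad.size()}"
```

Keeping the rejected rows in their own list makes it possible to log or repair them later instead of dropping them silently.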

Page 5: DMDW 8. Student Presentation - Groovy to MongoDB

#3: PUT IT INTO THE DATABASE

WITH ALL THE OBJECTS PARSED AND CONVERTED,

JUST PUT EVERYTHING INTO THE DATABASE

4TH PROBLEM: HOW?
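
One possible answer to the "how" is to insert the converted rows in batches. The slides do not show the loading code, so this is only a sketch: `loadInBatches` is a hypothetical helper, and the `collection` argument is duck-typed, meaning any object with an `insertMany(List)` method will do. The commented lines show how it might be wired to the MongoDB Java sync driver; the connection string, the database name `dmdw`, and the collection name `roomplan` are assumptions.

```groovy
// Hypothetical loader: converts each row with 'toDocument' and hands the
// result to collection.insertMany(...) in batches of 'batchSize'.
int loadInBatches(collection, List<Map> rows, int batchSize, Closure toDocument) {
    int inserted = 0
    rows.collate(batchSize).each { batch ->
        collection.insertMany(batch.collect(toDocument))
        inserted += batch.size()
    }
    return inserted
}

// With the MongoDB Java sync driver on the classpath (assumed names):
//   def client = com.mongodb.client.MongoClients.create('mongodb://localhost:27017')
//   def rooms  = client.getDatabase('dmdw').getCollection('roomplan')
//   loadInBatches(rooms, rows, 500) { new org.bson.Document(it) }

// Dry run with an in-memory stand-in instead of a real collection
def received = []
def standIn = new Expando(insertMany: { docs -> received << docs })
def count = loadInBatches(standIn, [[room: 'A 1.01'], [room: 'A 1.02'], [room: 'B 2.01']], 2) { it }
println "inserted ${count} rows in ${received.size()} batches"
```

Passing the conversion as a closure keeps the helper free of driver classes, so it can be tried out without a running database.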

Page 6: DMDW 8. Student Presentation - Groovy to MongoDB

HOW I SOLVED THE PROBLEMS

PARSING A FILE WITH GROOVY IS AS EASY AS 1, 2, 3

I HAD TO SPLIT EACH LINE AT THE SEMICOLONS

TO CONVERT THE DATE STRING INTO A REAL DATE WE NEED A TRICK
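
The slides do not reveal which trick was used for the dates. A common approach, sketched here under assumptions, is to parse the string with an explicit pattern and to treat unparseable values as missing. The helper name `parseGermanDate`, the pattern `dd.MM.yyyy`, and the sample line are hypothetical.

```groovy
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper: returns a LocalDate, or null when the text is
// empty or does not match the assumed day.month.year pattern.
LocalDate parseGermanDate(String text) {
    if (!text?.trim()) {
        return null
    }
    try {
        return LocalDate.parse(text.trim(), DateTimeFormatter.ofPattern('dd.MM.yyyy'))
    } catch (DateTimeParseException ignored) {
        return null
    }
}

// Made-up sample line: room, date, event
def fields = 'A 1.01;14.05.2011;Exam'.split(';', -1)
def booking = [room: fields[0], date: parseGermanDate(fields[1]), event: fields[2]]

println "${booking.room} is booked in month ${booking.date.monthValue} of ${booking.date.year}"
```

Returning `null` instead of throwing lets the caller decide whether a row with a bad date is rejected or stored without one.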

Page 7: DMDW 8. Student Presentation - Groovy to MongoDB

LET'S HAVE A LOOK BEHIND THE SCENES

LOOK INTO THE CODE

HOW DOES THE CODE WORK

QUESTIONS
