bing chen, james hanck, patrick hanck, scott hertel, allen...

Upload: haliem

Post on 10-Jun-2018

TRANSCRIPT

Page 1: Bing Chen, James Hanck, Patrick Hanck, Scott Hertel, Allen …pdf.ebook777.com/034/B01JGOEDIA.pdf · 2017-09-10 · Integrating SAP HANA and Hadoop ISBN 978-1-4932-1293-4 | $12.99
Bing Chen, James Hanck, Patrick Hanck, Scott Hertel, Allen Lissarrague, Paul Médaille

Data Integration with SAP® Data Services

This E-Bite is protected by copyright. Full Legal Notes and Notes on Usage can be found at the end of this publication.

SAP PRESS E-Bites

SAP PRESS E-Bites provide you with a high-quality response to your specific project need. If you’re looking for detailed instructions on a specific task; or if you need to become familiar with a small, but crucial sub-component of an SAP product; or if you want to understand all the hype around product xyz: SAP PRESS E-Bites have you covered. Authored by the top professionals in the SAP universe, E-Bites provide the excellence you know from SAP PRESS, in a digestible electronic format, delivered (and consumed) in a fraction of the time!

Akash Kumar, PlanViz: Improving SAP HANA Performance, ISBN 978-1-4932-1300-9 | $14.99 | 75 pages

Eric Du, SAP HANA Smart Data Streaming and the Internet of Things, ISBN 978-1-4932-1303-0 | $9.99 | 86 pages

Aron MacDonald, Integrating SAP HANA and Hadoop, ISBN 978-1-4932-1293-4 | $12.99 | 85 pages

The Authors of this E-Bite

Bing Chen, James Hanck, Patrick Hanck, Scott Hertel, Allen Lissarrague, and Paul Médaille are the best-selling authors of SAP Data Services: The Comprehensive Guide, an SAP PRESS publication. Learn more about the authors at www.sap-press.com/sap-data-services_3688/authors/.

What You’ll Learn

Dive into data integration with this E-Bite. Learn about data warehousing scenarios to help you solve common integration challenges in SAP Data Services. Discover integration strategies for the distribution and retail industries, and build data flows to understand proper data provisioning to data warehouses.

1 Integration into Data Warehouses

1.1 Dimensional Data Model Overview

1.2 Conformed Dimensions and the Bus Matrix

1.3 Dimensional Model Design Patterns

1.4 Processing Slowly Changing Dimensions

1.5 Loading Fact Tables

2 Integration with POS Systems

2.1 Inquiry Capture Job

2.2 Identify Discount Targets Batch Job

2.3 Product Availability Check Lookup

2.4 Results

3 Integration with SAP BW, SAP APO, and SAP ECC

3.1 SAP ECC Extraction

3.2 Delivery Time Calculations

3.3 SAP APO Interfaces

3.4 SAP ECC Interfaces

3.5 Results

This E-Bite is an excerpt from SAP Data Services by Bing Chen, James Hanck, Patrick Hanck, Scott Hertel, Allen Lissarrague and Paul Médaille.

1 Integration into Data Warehouses

One of the most common Data Services integration scenarios is processing and moving data into a data warehouse. A data warehouse (DW) is distinct from an Online Transaction Processing (OLTP) type of system in that it’s designed and optimized for analytic operations on large sets of data versus transactional operations on a single record in an OLTP system. A DW operation can aggregate by multiple dimensional attributes over millions of records versus many processes operating on a single record in an OLTP scenario.

Kimball Methodology

To fully understand some of the techniques and design patterns that you’ll use to integrate into a custom DW with Data Services, you first need some background on Kimball Methodology. There have been many approaches to DW design, but the Kimball methodology is probably the most commonly accepted best practice approach today.

It combines specific data modeling and design patterns with an iterative process and a data framework called a bus matrix.

The Kimball methodology is a full lifecycle that includes parallel implementation tracks for technical architecture, database design, and BI applications as well as project planning, management, deployment, and maintenance. However, for the purposes of this E-Bite, the discussion is limited to dimensional modeling, physical design, and use of Data Services for the Extraction, Transformation, and Loading (ETL) integration aspects of the Kimball lifecycle.

The following are the fundamental principles of the Kimball approach:

Focus on business process

Dimensionally structured data models

Iterative development in manageable increments

This approach ensures that the focus remains on delivering business value in an incremental approach.

For comprehensive coverage of the Kimball approach, see The Data Warehouse Lifecycle Toolkit, by Kimball et al. (2nd ed., Wiley, 2008).

In Section 1.1 to Section 1.5 we will take you through data warehousing scenarios and strategies to solve common integration challenges when leveraging dimensional and factual data with Data Services. We will also build data flows to highlight proper provisioning of data to data warehouses.

A Note on SAP BW

This resource specifically discusses Kimball-based dimensional DW design and integrations using Data Services. This is typically a data warehouse or data mart where the developer must integrate into a custom-designed dimensional data model on various types of databases.

SAP Business Warehouse (SAP BW), on the other hand, implements many of the concepts discussed in this section, but there are specific tools and methodologies used by Data Services to integrate with SAP BW, which are discussed in Section 3.

Now, let’s explore various techniques that are used to load dimension and fact tables into a dimensional DW using Data Services.

1.1 Dimensional Data Model Overview

When loading a dimensional DW model with Data Services, the fundamental tasks are processing and loading master and descriptive data into dimension tables, and loading transactions, aggregates, and measures into fact tables. To effectively design and implement Data Services jobs to do this, you need to understand the purpose and functionality of the dimensional data model and its various design patterns.

A dimensional data model, sometimes known as a star schema, consists of a fact table with measures linked and keyed to a set of descriptive tables called dimensions. Dimensions describe how data is sliced. In the example in Figure 1, orders are seen as a fact table that you can dimension by date, channel, product, and so on.

Figure 1 Example Star Schema Model for Orders

The dimensional model is a data structure that is optimized for query performance and usability. The dimensional model’s key benefits include the following:

Usability

Consistency

Performance

Extensibility and flexibility

1.2 Conformed Dimensions and the Bus Matrix

Conformed dimensions have equal meaning across different fact tables. Thus, in Figure 2, the dimension records that are linked and keyed with the order fact records can also be linked and keyed to inventory and shipments. The date, product, currency, and other dimension records will therefore have the same meaning across all facts.

Figure 2 Star Schema with Conformed Dimensions

One part of the Kimball approach is the enterprise DW bus matrix, a tool that is extremely helpful for understanding business processes and for facilitating communication and planning. A simplified bus matrix is shown in Table 1.

Business Process   Date   Product   Salesperson   Currency   Promo   Channel
DC Inventory       X      X                       X
Sales Order        X      X         X             X          X       X
Order Delivery     X      X         X             X          X       X

Table 1 Enterprise DW Bus Matrix

The bus matrix lists the organization’s business processes as rows, with the dimensions running horizontally across the top. During planning discussions, the crucial business processes can be captured and associated with the dimensions with which they should be linked. In the bus matrix in Table 1, the DC inventory process is associated with the date, product, and currency dimensions.
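As an illustration only, the bus matrix can be held as a small data structure during planning. The sketch below uses the processes and dimensions from Table 1; the helper names shared_dimensions and processes_using are hypothetical and not part of Data Services.

```python
# Dimension columns and business processes as listed in Table 1.
DIMENSIONS = ["Date", "Product", "Salesperson", "Currency", "Promo", "Channel"]

BUS_MATRIX = {
    "DC Inventory": {"Date", "Product", "Currency"},
    "Sales Order": set(DIMENSIONS),
    "Order Delivery": set(DIMENSIONS),
}


def shared_dimensions(matrix):
    """Return the dimensions used by every business process, in column order."""
    shared = set.intersection(*matrix.values())
    return [dim for dim in DIMENSIONS if dim in shared]


def processes_using(matrix, dimension):
    """Return the business processes that are linked to one dimension."""
    return sorted(name for name, dims in matrix.items() if dimension in dims)


print(shared_dimensions(BUS_MATRIX))
print(processes_using(BUS_MATRIX, "Promo"))
```

Dimensions that appear in every row are natural candidates to build first as conformed dimensions, because every fact table in the matrix depends on them.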

From the planning approach, key outcomes include the following:

The fact tables that will represent the business process

The granularity at which those facts will be captured

After the business processes and dimensions are defined in a bus matrix, you can create the dimensional model. The dimensional modeling process can be broken down into an iterative four-step process:

1. Choose the business process.

2. Declare the grain. Which fact tables do you have, and what is the granularity of each?

3. Identify the dimensions. Which dimensions apply to each fact table?

4. Identify the facts. What are the measures on the fact tables?

This process is repeated to enable an incremental implementation.

1.3 Dimensional Model Design Patterns

A typical dimensional star schema is shown in Figure 3. There are many tools available to create logical and physical models for the database you’re using.

Figure 3 Typical Star Schema

Next, we’ll briefly discuss some of the more common design patterns applied in dimensional models.

Dimensions

Dimensions are tables that represent descriptive data about facts and measures. This can include master data such as customer and product, date and time, or code definitions. It’s also common to represent hierarchies in these dimensions, such as product categories or geographical regions, to facilitate drill-down and drill-up as well as aggregation.

One of the basic requirements in a DW environment is to preserve the change history of these dimensions. While fact records are typically much higher in volume, the dimensions can change as well, such as a customer updating his email address or an employee changing departments. These types of changes in a dimension table are commonly referred to as slowly changing dimensions (SCDs) because these changes are usually much less frequent than fact record changes.

Following are the most commonly used types or patterns for SCDs:

Type 1: Update or insert into the dimension table, and overwrite any existing record with changes. No change history is preserved.

Type 2: Keep versions of all changes by inserting rows. End date any existing record, and create a new record that is the current version. Fact records are keyed to the appropriate version of each dimension record.

Type 3: Additional columns are used to store previous versus current versions of specific attributes. This limits how many attributes can be tracked for change history, and typically only one or a few versions can practically be tracked for each of these attributes.

Type 4: Two tables are used, a current and a historical table. Anytime a change occurs, the old record is moved to the history table, and the new record is inserted into the current table. This is very similar to the OLTP audit type of tables.

There are several other SCD types as well as hybrid approaches, but we’ll focus on exploring implementation of types 1 through 4 using Data Services.

Surrogate Keys

A surrogate key is a machine-generated value that has no business meaning. Its sole purpose in a dimensional model is to uniquely identify dimension records. When you implement SCDs, you’ll often have multiple versions of a single dimension entity. You can’t use the natural or business key in the fact table to reference this dimension because it will be duplicated, so you generate surrogate keys. This means that there will be lookups and joins when you process these dimension and fact tables in Data Services as well as when you query the star schema.

We’ll look at how to process and look up these surrogate keys in Data Services in Section 1.5.

Dimension Table Design Patterns

Preserving change history using SCDs applies to all types of dimensions. In addition, there are many types of dimension design patterns that can be applied as well.

Degenerate Dimension

This is used when there is a dimension (not a measure) that doesn’t have descriptive attributes. Often, this is a number or ID such as a sales order number or a transaction ID from a source system. In this case, it doesn’t necessarily make sense to create a separate dimension table to store this number because there would likely be as many dimension records as facts, and there is nothing descriptive to report by with this dimension. Instead, the attribute is kept directly in the fact table, which avoids an unnecessary join.

Role-Playing Dimension

This is used when you have a single dimension table that is used in many “roles” in a fact table. A common example is the date dimension. You can have many dates on a fact table record (e.g., order date and ship date), but you don’t want to create multiple physical date dimension tables that have the exact same data. Instead, you can create one dimension view per date role, with all views sharing the same underlying table.
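A minimal sketch of the role-playing pattern, using Python's built-in sqlite3 module and an in-memory database. The table, view, and column names (DIM_DATE, DIM_ORDER_DATE, DIM_SHIP_DATE, FACT_ORDERS) are illustrative assumptions, not names from the example schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DIM_DATE (DATE_SK INTEGER PRIMARY KEY, CAL_DATE TEXT);
    INSERT INTO DIM_DATE VALUES (1, '2014-10-16'), (2, '2014-10-20');

    -- One physical date table, exposed once per role as a view.
    CREATE VIEW DIM_ORDER_DATE AS
        SELECT DATE_SK AS ORDER_DATE_SK, CAL_DATE AS ORDER_DATE FROM DIM_DATE;
    CREATE VIEW DIM_SHIP_DATE AS
        SELECT DATE_SK AS SHIP_DATE_SK, CAL_DATE AS SHIP_DATE FROM DIM_DATE;

    CREATE TABLE FACT_ORDERS (ORDER_DATE_SK INTEGER, SHIP_DATE_SK INTEGER, AMOUNT REAL);
    INSERT INTO FACT_ORDERS VALUES (1, 2, 99.5);
""")

# Each role is joined through its own view, so the query reads naturally.
rows = conn.execute("""
    SELECT o.ORDER_DATE, s.SHIP_DATE, f.AMOUNT
    FROM FACT_ORDERS f
    JOIN DIM_ORDER_DATE o ON o.ORDER_DATE_SK = f.ORDER_DATE_SK
    JOIN DIM_SHIP_DATE s ON s.SHIP_DATE_SK = f.SHIP_DATE_SK
""").fetchall()
print(rows)
```

Only DIM_DATE has to be loaded and maintained; the views add no storage and give each date role its own column names.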

Mini Dimension

This is a dimension table that represents segmentation or banding of another dimension. This is often used when you have very large dimensions that have frequent changes, and you don’t need to track every change on every dimension. Customer segmentation and demographics is a good example, where there are millions of customers with many changing attributes, and you don’t need the overhead of tracking all those changes. You can create a mini dimension that has groups or “bands,” such as age ranges or regions, and dimension fact records by the bands in the mini dimension.
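The banding idea can be sketched in a few lines of Python. The band boundaries, labels, and the names AGE_BANDS, DIM_AGE_BAND, and age_band_key are hypothetical and chosen only for illustration.

```python
# Hypothetical age bands: (lowest age, highest age, label).
AGE_BANDS = [
    (0, 17, "Under 18"),
    (18, 34, "18-34"),
    (35, 54, "35-54"),
    (55, 200, "55+"),
]

# The mini dimension holds one row per band, each with its own surrogate key.
DIM_AGE_BAND = {label: sk for sk, (_, _, label) in enumerate(AGE_BANDS, start=1)}


def age_band_key(age):
    """Return the mini-dimension surrogate key for a customer's age."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return DIM_AGE_BAND[label]
    raise ValueError(f"age out of range: {age}")


print(age_band_key(40))
```

A fact record stores the band key rather than the customer's exact age, so a birthday changes nothing in the dimension unless the customer crosses into a new band.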

Junk Dimension

This is used for miscellaneous, low-cardinality attributes that don’t have much descriptive data, such as indicators and flags. They may not justify their own dimensions, so several of these can be combined into a general-purpose “junk” dimension.

Snowflaking

A snowflake pattern is when a dimension is normalized into multiple tables. Usually you should try to avoid this because it requires more joins when querying and can make the star schema more difficult to use. It can be appropriate, however, when you have multiple levels in a hierarchy, and the fact tables are dimensioned at these various levels separately.

Many-Valued Dimension

A many-valued dimension represents a many-to-many relationship between dimension and fact table. This must be used and queried carefully to avoid errors in aggregation.

Fact Table Design Patterns

Fact tables are a specialized type of table used to store measures and facts, along with all of the surrogate keys for the corresponding dimensions. Usually, fact tables are very long and narrow, with many rows but relatively few columns. Each row in the fact table is essentially a unique combination of all of the dimensions, and each fact table should have a consistent grain. This means that every record should be either the lowest level of grain or be aggregated at the same level.

Most fact tables will fall into one of the following three types:

Transaction fact: The most atomic and common type of fact table with one record per transaction, such as orders, point-of-sale (POS) transactions, or inventory movements.

Periodic snapshot: Aggregated facts at a consistent time period, such as monthly account balances and daily sales. New records are aggregated and inserted into the fact table for each period.

Accumulating snapshot: This is used to track a long-running process, such as an insurance claim or order fulfillment process. You can create a new fact record when the process initiates, and then update dates and measures on that fact record as the process continues over time.

Other types of fact tables:

Fact-less fact: Measure the existence or occurrence of an event that doesn’t necessarily have measures, such as attendance.

Consolidated fact: Combine multiple facts in a single table at the same level of grain. This is used to store measures from multiple processes in the same fact table, such as sales orders and sales forecast.

Aggregate fact: A rollup and aggregation of transaction fact data that is usually used for performance optimization.

Next, we’ll explore how to use Data Services in various scenarios to process and load dimension and fact tables.

1.4 Processing Slowly Changing Dimensions

The SCD is a foundational element of the dimensional DW as described previously. This section will explore the most commonly used types of SCDs (types 1 through 4) and how these can be implemented in Data Services.

We’ll use the following simple example of a star schema shown in Figure 4 to demonstrate how to use Data Services to process and load SCDs and associated fact tables in a typical DW scenario. In this example, a sales fact table is dimensioned by customer, employee, product, and date.

Figure 4 Example Orders Star Schema

SCD Type 1

This is the most basic scenario when updating a dimension table. Type 1 inserts new records and overwrites any existing records with new data. This type doesn’t preserve historical data. In Table 2, we see an existing record in the DIM_EMPL table for “Veronica Meyer” with a current pay rate of “22”.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE
328      2714      Meyer       Veronica    22

Table 2 Existing Employee Record in the DIM_EMPL Table

This employee’s pay rate has changed from 22 to 31. The next time we load this dimension as SCD Type 1, we update the existing record with new data. The results can be seen in Table 3 where EMP_PAY_RATE has been updated to “31”, and we no longer have the previous historical value of “22”.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE
328      2714      Meyer       Veronica    31

Table 3 Updated Employee Record in the DIM_EMPL Table

Figure 5 illustrates an example of how to implement SCD Type 1 in a Data Services data flow.

Figure 5 SCD Type 1 Data Flow

The following steps describe how to implement each component when creating a data flow to update an SCD Type 1 dimension table.

1. The data source is the EMPLOYEE table, which is queried with no transformation.

2. The Table_Comparison step is used to compare the source table to the target dimension table using the EMP_NUM column to compare EMP_PAY_RATE (Figure 6).

Figure 6 Comparing the Source EMPLOYEE Table to the Target DIM_EMPL Table on the EMP_PAY_RATE Column Only Using EMP_NUM as the Key

3. The Map_Operation transform then determines whether to insert a new record, or update/overwrite the existing record. In Figure 7 the MAP_OPERATION inserts new rows based on EMP_NUM and updates existing rows based on changes to EMP_PAY_RATE.

Figure 7 Map Operation

4. The Key_Generation transform creates surrogate keys for newly inserted dimension records. In Figure 8, KEY_GENERATION is used to create a new surrogate key for the EMP_SK column and is incremented by 1 for each new record.

Figure 8 Using the Key_Generation Transform to Generate a New Surrogate Key
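The logic of the four steps above can be sketched in plain Python. This is not Data Services code: the function load_scd_type1 is a hypothetical stand-in that compares on EMP_NUM, overwrites EMP_PAY_RATE on a change, and assigns the next EMP_SK to new rows, with tables held as lists of dictionaries.

```python
def load_scd_type1(dim_rows, source_rows):
    """Apply SCD Type 1 changes to dim_rows in place and return it."""
    # Comparison step: index the target dimension by its natural key.
    by_num = {row["EMP_NUM"]: row for row in dim_rows}
    # Key generation step: continue from the highest existing surrogate key.
    next_sk = max((row["EMP_SK"] for row in dim_rows), default=0) + 1

    for src in source_rows:
        existing = by_num.get(src["EMP_NUM"])
        if existing is None:
            # Insert: a new employee gets a new surrogate key.
            new_row = {"EMP_SK": next_sk, **src}
            dim_rows.append(new_row)
            by_num[src["EMP_NUM"]] = new_row
            next_sk += 1
        elif existing["EMP_PAY_RATE"] != src["EMP_PAY_RATE"]:
            # Update: overwrite the value; no history is kept.
            existing["EMP_PAY_RATE"] = src["EMP_PAY_RATE"]
    return dim_rows


dim_empl = [{"EMP_SK": 328, "EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 22}]
employee = [{"EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 31}]
print(load_scd_type1(dim_empl, employee))
```

After the call, the single DIM_EMPL row keeps surrogate key 328 and carries the new pay rate, matching Table 3.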

SCD Type 2

SCD Type 2 is probably the most commonly used pattern to preserve change history for dimension tables. Type 2 preserves history by creating a new record with the latest version of the entity, end dating the previous version of the record, and often maintaining a current record indicator as well. In Table 4, we see an existing record in the DIM_EMPL table for “Veronica Meyer” with a current pay rate of “22”.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE   EMP_START_DATE
328      2714      Meyer       Veronica    22             2014.08.25

Table 4 Existing Employee Record in DIM_EMPL

This employee’s pay rate changed from 22 to 31 on 10/16/2014 (see Table 5). The next time we load this dimension as SCD Type 2, the existing employee record is end dated by updating the EMP_END_DATE with the current date and the EMP_CUR_IND to N. Then a new current version of the employee record is created with the new data.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE   EMP_START_DATE
328      2714      Meyer       Veronica    22             2014.08.25
329      2714      Meyer       Veronica    31             2014.10.17

Table 5 End Dated and New Employee Record in the DIM_EMPL Table

The following data flow (Figure 9) illustrates an example of how to implement SCD Type 2 in Data Services.

This data flow is very similar to the previous example for Type 1. The source is our EMPLOYEE table, which is queried with no transformation. The Table_Comparison step is used to compare the source table to the target dimension table using the EMP_NUM column to compare EMP_PAY_RATE. The History_Preserving transform then manages the Type 2 updates and inserts before a key is generated and the target dimension table is updated. This is specifically used in this Type 2 scenario to manage multiple versions and rows of the employee record over time and would be configured as shown in Figure 10.

Figure 9 SCD Type 2 Data Flow

Figure 10 History Preserving for SCD Type 2 with Valid Dates, Current Column, and Compare Column
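As a rough Python analogue of the Type 2 flow (not the History_Preserving transform itself), the sketch below end dates the current row, clears its current indicator, and inserts a new current version. The function name load_scd_type2 is hypothetical, and leaving EMP_END_DATE empty (None) on the open record is an assumption; many designs use a far-future date instead.

```python
def load_scd_type2(dim_rows, source_rows, load_date):
    """Apply SCD Type 2 changes to dim_rows in place and return it."""
    # Only the current version of each employee takes part in the comparison.
    current = {row["EMP_NUM"]: row for row in dim_rows if row["EMP_CUR_IND"] == "Y"}
    next_sk = max((row["EMP_SK"] for row in dim_rows), default=0) + 1

    for src in source_rows:
        old = current.get(src["EMP_NUM"])
        if old is not None and old["EMP_PAY_RATE"] == src["EMP_PAY_RATE"]:
            continue  # compare column unchanged, nothing to do
        if old is not None:
            # End date the previous version and clear its current flag.
            old["EMP_END_DATE"] = load_date
            old["EMP_CUR_IND"] = "N"
        # Insert the new current version under a new surrogate key.
        new_row = {"EMP_SK": next_sk, **src, "EMP_START_DATE": load_date,
                   "EMP_END_DATE": None, "EMP_CUR_IND": "Y"}
        dim_rows.append(new_row)
        current[src["EMP_NUM"]] = new_row
        next_sk += 1
    return dim_rows


dim_empl = [{"EMP_SK": 328, "EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 22,
             "EMP_START_DATE": "2014.08.25", "EMP_END_DATE": None,
             "EMP_CUR_IND": "Y"}]
employee = [{"EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 31}]
for row in load_scd_type2(dim_empl, employee, "2014.10.17"):
    print(row)
```

Because each version has its own EMP_SK, fact records loaded before the change keep pointing at key 328, while later facts pick up key 329.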

SCD Type 3

SCD Type 3 preserves history by using two columns for current and previous versions of the dimension entity. This allows a limited number of versions to be tracked, usually only two (current and previous or original). In Table 6, we see an existing record in the DIM_EMPL table for “Veronica Meyer” with a current pay rate of “22”.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   ORIG_EMP_PAY_RATE
328      2714      Meyer       Veronica    22

Table 6 Existing Employee Record in the DIM_EMPL Table

This employee’s pay rate changed from 22 to 31 on 10/16/2014. The next time we load this dimension as SCD Type 3, we update the CUR_EMP_PAY_RATE and EFFECTIVE_DATE with the new pay rate and current date. The results can be seen in Table 7, where CUR_EMP_PAY_RATE has been updated to “31”, and we preserve the original pay rate of “22” in ORIG_EMP_PAY_RATE.

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   ORIG_EMP_PAY_RATE
328      2714      Meyer       Veronica    22

Table 7 Updated Employee Record in the DIM_EMPL Table

Figure 11 illustrates an example of how to implement SCD Type 3 in a Data Services data flow.

The example in Figure 11 illustrates an alternative method of comparing data source and target by using a query join operation, which is sometimes more efficient if the source and target tables are in the same database and the comparison can be pushed down.

Figure 11 SCD Type 3 Data Flow

The sources are our EMPLOYEE table and DIM_EMPL target dimension table, which are queried with a left outer join to determine whether the record is new or existing. The existing records are updated using the Map_Operation transform, while new records are inserted with surrogate keys.
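The Type 3 logic can be sketched in Python as follows. The function load_scd_type3 is a hypothetical illustration rather than Data Services code; the dictionary lookup plays the part of the left outer join, and the column names follow the text above.

```python
def load_scd_type3(dim_rows, source_rows, load_date):
    """Apply SCD Type 3 changes to dim_rows in place and return it."""
    by_num = {row["EMP_NUM"]: row for row in dim_rows}
    next_sk = max((row["EMP_SK"] for row in dim_rows), default=0) + 1

    for src in source_rows:
        row = by_num.get(src["EMP_NUM"])  # None means no match in the target
        if row is None:
            # New employee: original and current rate start out identical.
            new_row = {"EMP_SK": next_sk, "EMP_NUM": src["EMP_NUM"],
                       "EMP_LNAME": src["EMP_LNAME"], "EMP_FNAME": src["EMP_FNAME"],
                       "ORIG_EMP_PAY_RATE": src["EMP_PAY_RATE"],
                       "CUR_EMP_PAY_RATE": src["EMP_PAY_RATE"],
                       "EFFECTIVE_DATE": load_date}
            dim_rows.append(new_row)
            by_num[src["EMP_NUM"]] = new_row
            next_sk += 1
        elif row["CUR_EMP_PAY_RATE"] != src["EMP_PAY_RATE"]:
            # Existing employee: only the "current" columns are overwritten.
            row["CUR_EMP_PAY_RATE"] = src["EMP_PAY_RATE"]
            row["EFFECTIVE_DATE"] = load_date
    return dim_rows


dim_empl = [{"EMP_SK": 328, "EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "ORIG_EMP_PAY_RATE": 22,
             "CUR_EMP_PAY_RATE": 22, "EFFECTIVE_DATE": "2014.08.25"}]
employee = [{"EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 31}]
print(load_scd_type3(dim_empl, employee, "2014.10.17"))
```

ORIG_EMP_PAY_RATE is written once on insert and never touched again, which is why this pattern can track only the original and the current value.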

SCD Type 4

SCD Type 4 is often referred to as a history table because it keeps a table for current records and a separate table for historical records. Some or all of the changes can be kept in the history table, and surrogate keys from both tables are referenced from the fact tables.

This employee’s pay rate changed from 22 to 31 on 10/16/2014. The next time we load this dimension as SCD Type 4, we update the current employee record EMP_PAY_RATE in the DIM_EMPL table (see Table 8). We then create a new record in the DIM_EMPL_HIST table to preserve the previous version with the current date as CREATE_DATE (see Table 9).

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE
328      2714      Meyer       Veronica    31

Table 8 Current Employee Record in the DIM_EMPL Table

EMP_SK   EMP_NUM   EMP_LNAME   EMP_FNAME   EMP_PAY_RATE   CREATE_DATE
472      2714      Meyer       Veronica    22             2014.10.16

Table 9 End Dated Employee Record in the DIM_EMPL_HIST Table, and New Current Record

Figure 12 illustrates an example of how to implement SCD Type 4 in a Data Services data flow.

Figure 12 SCD Type 4 Data Flow

The sources are our EMPLOYEE table and DIM_EMPL target dimension table, which are queried with a left outer join to determine whether the record is new or existing. The existing records are updated using the Map_Operation transform, and new historical records are inserted into the DIM_EMPL_HIST table. New employee records are inserted into the DIM_EMPL table with surrogate keys.
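A comparable Python sketch for Type 4 follows, again as an illustration rather than Data Services code. The function load_scd_type4 is hypothetical, and giving the history table its own surrogate key sequence is an assumption suggested by the different EMP_SK values in Table 8 and Table 9.

```python
def load_scd_type4(current_rows, history_rows, source_rows, load_date):
    """Apply SCD Type 4 changes in place to the current and history tables."""
    by_num = {row["EMP_NUM"]: row for row in current_rows}
    next_sk = max((row["EMP_SK"] for row in current_rows), default=0) + 1
    next_hist_sk = max((row["EMP_SK"] for row in history_rows), default=0) + 1

    for src in source_rows:
        row = by_num.get(src["EMP_NUM"])  # None means the employee is new
        if row is None:
            new_row = {"EMP_SK": next_sk, **src}
            current_rows.append(new_row)
            by_num[src["EMP_NUM"]] = new_row
            next_sk += 1
        elif row["EMP_PAY_RATE"] != src["EMP_PAY_RATE"]:
            # Copy the outgoing version to history under its own key...
            history_rows.append(dict(row, EMP_SK=next_hist_sk, CREATE_DATE=load_date))
            next_hist_sk += 1
            # ...then overwrite the current record.
            row["EMP_PAY_RATE"] = src["EMP_PAY_RATE"]
    return current_rows, history_rows


dim_empl = [{"EMP_SK": 328, "EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 22}]
dim_empl_hist = []
employee = [{"EMP_NUM": 2714, "EMP_LNAME": "Meyer",
             "EMP_FNAME": "Veronica", "EMP_PAY_RATE": 31}]
load_scd_type4(dim_empl, dim_empl_hist, employee, "2014.10.16")
print(dim_empl)
print(dim_empl_hist)
```

The current table stays small and fast to join, while the history table grows by one row per change.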

Late-Arriving Dimension Data

Normally, you load the dimension tables first in a star schema and the fact tables last because the fact table records reference the surrogate keys in the dimension tables. In real life, however, data integrations aren’t necessarily synchronized in the correct order. Sometimes, you may have fact records to load where the corresponding dimension record hasn’t yet arrived.

Consider an order fact table situation in which the orders are ready to be inserted into the fact table, but there is a delay in the latest product or customer dimension records being available to load into your DW. If you look up the surrogate key, you won’t find some of these dimension records, and the fact record can’t be inserted with a foreign key constraint. This is sometimes called a late-arriving dimension record.

A typical way to handle this situation is to have a default record seeded into each dimension table that has a business meaning of “unspecified” or “N/A,” and a reserved surrogate key value of 0 or 1. This way, you can insert all fact records even if they don’t yet have a corresponding dimension record loaded. Also, when you query the dimensional model, the facts and measures will correctly aggregate and be designated in the “unspecified” or “N/A” category for any missing dimension records.
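The default-record approach can be sketched in Python as follows. The reserved key value 0, the helper names seed_default and resolve_sk, and the product columns are assumptions made for illustration.

```python
UNSPECIFIED_SK = 0  # reserved surrogate key for the seeded default record


def seed_default(dim_rows, sk_column, name_column):
    """Make sure the dimension contains the 'Unspecified' default record."""
    if not any(row[sk_column] == UNSPECIFIED_SK for row in dim_rows):
        dim_rows.insert(0, {sk_column: UNSPECIFIED_SK, name_column: "Unspecified"})
    return dim_rows


def resolve_sk(dim_rows, sk_column, key_column, natural_key):
    """Return the surrogate key for a natural key, or the default key."""
    for row in dim_rows:
        if row.get(key_column) == natural_key:
            return row[sk_column]
    return UNSPECIFIED_SK  # the dimension record has not arrived yet


dim_product = seed_default(
    [{"PROD_SK": 7, "PROD_NUM": "P1", "PROD_NAME": "Widget"}],
    "PROD_SK", "PROD_NAME",
)
print(resolve_sk(dim_product, "PROD_SK", "PROD_NUM", "P1"))
print(resolve_sk(dim_product, "PROD_SK", "PROD_NUM", "P404"))
```

Facts that land on the default key still satisfy the foreign key constraint and can be re-keyed in a later load once the real dimension record arrives.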

1.5 Loading Fact Tables

After your dimension tables have been processed, you can then load your fact tables. Remember that a fact table is essentially a cross-reference of all the dimensions in the star schema, consisting of a unique combination of surrogate keys as well as measures and facts.

Surrogate Key Pipeline

Loading a fact table requires deriving all of the surrogate keys for each fact record to be processed. This process can be seen as a logical pipeline where the fact record goes through a sequence of lookups against the dimensions using the natural business keys to obtain the surrogate keys.

This can theoretically be accomplished in a join operation within the database, but fact table data sets are typically very large, and this approach often won’t perform well as a large set operation when attempting to join 15 or more tables in a single query. Alternatively, executing a separate join query against the database for each fact record incurs substantial overhead and is usually not the most efficient way to derive surrogate keys and load the fact table.

The preferred way to implement the surrogate key pipeline in a fact table load within Data Services is to use the lookup_ext function call within a query object. The lookup_ext function allows several options for caching lookup data. This allows highly efficient lookups in memory, as well as allowing tight control over execution plan and caching options.

The preferred way to implement the lookup_ext function is to create a new function call by right-clicking on the query in question and choosing NEW FUNCTION CALL, as shown in Figure 13.

Figure 13 Creating a New Function Call in the Query Editor

Create a new function call for each dimension lookup. Specify the LOOKUP TABLE and CACHE SPEC options:

PRE_LOAD_CACHE: Cache in memory.

DEMAND_LOAD_CACHE: Partial caching in memory with additional caching on demand as necessary.

NO_CACHE: Row-by-row lookup; no caching.

Select PRE_LOAD_CACHE whenever possible (see Figure 14).

Figure 14 Selecting Cache Options for lookup_ext
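The idea behind a pre-loaded lookup cache can be illustrated outside Data Services with a short Python sketch. This is not the lookup_ext function; build_cache and surrogate_key_pipeline are hypothetical helpers, and the customer and product columns are assumed for the example. Each dimension is read into memory once, and every fact row is then resolved with in-memory lookups.

```python
def build_cache(dim_rows, natural_key, surrogate_key):
    """Load one dimension into memory once, keyed by its natural key."""
    return {row[natural_key]: row[surrogate_key] for row in dim_rows}


def surrogate_key_pipeline(source_rows, lookups, default_sk=0):
    """Swap natural keys for surrogate keys, one cached lookup per dimension.

    lookups is a list of (source_column, fact_column, cache) tuples.
    """
    natural_columns = {source_column for source_column, _, _ in lookups}
    for src in source_rows:
        # Measures and other non-key columns pass through unchanged.
        fact = {col: val for col, val in src.items() if col not in natural_columns}
        for source_column, fact_column, cache in lookups:
            # A missing dimension record falls back to the default key.
            fact[fact_column] = cache.get(src[source_column], default_sk)
        yield fact


customer_cache = build_cache([{"CUST_NUM": "C1", "CUST_SK": 10}], "CUST_NUM", "CUST_SK")
product_cache = build_cache([{"PROD_NUM": "P1", "PROD_SK": 20}], "PROD_NUM", "PROD_SK")
sales = [{"CUST_NUM": "C1", "PROD_NUM": "P1", "AMOUNT": 5.0}]
lookups = [("CUST_NUM", "CUST_SK", customer_cache),
           ("PROD_NUM", "PROD_SK", product_cache)]
print(list(surrogate_key_pipeline(sales, lookups)))
```

The cost of reading each dimension is paid once per load instead of once per fact row, which is the same trade-off that makes in-memory caching attractive when the dimension fits in memory.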

Next, we’ll look at some example data flows in typical fact table loading scenarios.

Transaction Fact Loading

A typical transaction fact table data flow using the lookup_ext function with a lookup for each surrogate key is shown in Figure 15.

Figure 15 Multiple Cached Surrogate Key Lookups in the Query Editor

Figure 16 shows a simple transaction fact loading data flow with source sales data, our surrogate key lookup pipeline as described previously, and our sales fact table as the target.

Figure 16 Simple Transaction Fact Processing Data Flow

Periodic Snapshot and Aggregate Fact Loading

A periodic snapshot or aggregate fact table data flow is very similar to the transaction fact, but you aggregate your measures by reporting periods. The mapped derivation to perform this summation is shown in Figure 17.

Figure 17 Aggregation of Measures for Aggregated Fact Table Load
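The summation by reporting period can be sketched in Python. The monthly period, the function name monthly_snapshot, and the columns SALE_DATE, PROD_SK, and AMOUNT are assumptions for illustration; the actual mapping lives in the query shown in Figure 17.

```python
from collections import defaultdict


def monthly_snapshot(transactions):
    """Aggregate transaction facts to one row per month and product."""
    totals = defaultdict(float)
    for txn in transactions:
        period = txn["SALE_DATE"][:7]  # 'YYYY-MM' prefix of an ISO date
        totals[(period, txn["PROD_SK"])] += txn["AMOUNT"]
    return [
        {"PERIOD": period, "PROD_SK": prod_sk, "TOTAL_AMOUNT": amount}
        for (period, prod_sk), amount in sorted(totals.items())
    ]


sales = [
    {"SALE_DATE": "2014-10-16", "PROD_SK": 20, "AMOUNT": 10.0},
    {"SALE_DATE": "2014-10-17", "PROD_SK": 20, "AMOUNT": 5.0},
    {"SALE_DATE": "2014-11-02", "PROD_SK": 20, "AMOUNT": 2.5},
]
for row in monthly_snapshot(sales):
    print(row)
```

Every output row has the same grain (month and product), which keeps the snapshot table consistent with the rule that a fact table holds a single level of grain.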

Accumulating Snapshot Fact Loading

Loading an accumulating snapshot requires updating existing fact records as well as inserting new records. You can use a query object with a left outer join that compares the source fact records to the fact table to determine whether the record is new or needs to be updated, as shown in Figure 18.

Figure 18 Accumulating Snapshot Fact Table Data Flow
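A minimal Python sketch of the insert-or-update decision follows. The function load_accumulating_snapshot and the columns ORDER_NUM, ORDER_DATE, and SHIP_DATE are hypothetical; the dictionary lookup stands in for the left outer join, where a missing match means the record is new.

```python
def load_accumulating_snapshot(fact_rows, source_rows):
    """Insert new process records and update milestones on existing ones."""
    by_order = {row["ORDER_NUM"]: row for row in fact_rows}
    for src in source_rows:
        existing = by_order.get(src["ORDER_NUM"])  # None: no match in the fact table
        if existing is None:
            new_row = dict(src)
            fact_rows.append(new_row)
            by_order[src["ORDER_NUM"]] = new_row
        else:
            # Fill in only the milestones the source has values for.
            for column, value in src.items():
                if value is not None:
                    existing[column] = value
    return fact_rows


fact_orders = [{"ORDER_NUM": 1001, "ORDER_DATE": "2014-10-16", "SHIP_DATE": None}]
updates = [
    {"ORDER_NUM": 1001, "ORDER_DATE": None, "SHIP_DATE": "2014-10-20"},
    {"ORDER_NUM": 1002, "ORDER_DATE": "2014-10-18", "SHIP_DATE": None},
]
for row in load_accumulating_snapshot(fact_orders, updates):
    print(row)
```

Each process instance keeps a single fact row for its whole life, and later loads only fill in the dates and measures that have become known.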

2 Integration with POS Systems

The next two sections show Data Services jobs being applied to solve integration requirements in two different industries. Let’s first explore the retail industry where integrations are being leveraged to facilitate customer loyalty solutions that increase consumer sales.

A large retailer determined that its primary key interaction points with consumers came when they inquired about a product online. The company found it could realize a competitive advantage if it integrated its existing mobile storefront and in-store applications with its point-of-sale (POS) data to determine which inquiries didn’t end in conversion. With these inquiries, a follow-up touch point with the consumer could be created to increase conversion of product inquiries online into sales.

The retailer decided to tie product inquiry, current location, and proximity to the store to generate a consumer product-specific discount that had a very short expiration. The solution would involve Data Services jobs and a function to effect the following:

Inquiry capture

Store proximity

Product availability check

We’ll explore these solutions in the following subsections.

2.1 Inquiry Capture Job

The company needs to capture who made the inquiry, what products were looked at, and what time they were looked at. This information will be merged with transactions from the company’s e-commerce channel to determine whether the inquiry resulted in a sale and to identify the nearest store with stock to the location where the inquiry was made.

To effect the creation of the inquiry capture job, the following primary objects need to be created:

Real-time job

XSD XML schema definition file

XML Schema File Format object

Data flow from source XML message to inquiry table

Data flow to target XML message

Real-Time Job

Create a new real-time job, as shown in Figure 19.

Figure 19 Creation of New Real-Time Job

Insert a data flow between the RT_Process_begins and Step_ends placeholders as shown in Figure 20.

Figure 20 Real-Time Job Layout

XSD XML Schema Definition File

Anytime a real-time job is created, one XML message source and one XML message target must exist within one or two of its data flows. To create an XML message source or target file, an XML schema definition file (XSD file) must exist.

An XSD file describes the layout of an XML document. This file typically is denoted with an .xsd file extension. To create an XSD file, you can use a third-party editor or use a method that creates a temporary data flow. Here, we provide the steps to create an XSD file by leveraging a temporary data flow:

1. Drag the inquiry table onto the data flow workspace as both a source and a target. Then add a Query transform and connect them (Figure 21).

Figure 21 Temporarily Connecting Table as Source, Query, and Target

2. Right-click QUERY within the output schema as shown, and choose GENERATE XML SCHEMA (Figure 22).

3. After the new XSD file has been saved, delete the temporary source, query, and target objects from the data flow.

Figure 22 Choosing Generate XML Schema from the Query Output Schema

XML Schema File Format Object

Now, you’ll use the created XSD file(s) as the structure of an XML Schema File Format object.

1. After the XSD file has been created, navigate to the LOCAL OBJECT LIBRARY’s file FORMAT tab, right-click, and choose NEW • XML SCHEMA (Figure 23).

Figure 23 Choosing to Create a New XML Schema

2. Using the dialog that appears, specify the newly created XSD file within the FILE NAME property, and set the ROOT ELEMENT NAME field to “Query” (see Figure 24). Use a different value if the XSD file was created with a method other than the one described here.

3. Click OK, and the XML Schema File Format is ready to be used within the first data flow.

Page 30: Bing Chen, James Hanck, Patrick Hanck, Scott Hertel, Allen …pdf.ebook777.com/034/B01JGOEDIA.pdf · 2017-09-10 · Integrating SAP HANA and Hadoop ISBN 978-1-4932-1293-4 | $12.99

Figure24SpecifyingtheXMLSchemaFormat

Data Flow from Source XML Message to Inquiry Table

Now that the XML Schema Format has been created, it can be used in the first data flow.

1. Drag it onto the workspace area. Upon dropping, a context menu will appear as shown in Figure 25. Enter the name for the source file in the XML FILE field.

Figure 25 XML Format Options When Dropping onto the Data Flow Workspace

2. After dropping the XML Schema format onto the data flow workspace and specifying a name, the following other objects are added (as shown in Figure 26):

A transform (within the Data Integrator Transform group)

A target table

Figure 26 Data Flow: XML Message Source to Inquiry Table

Data Flow to Create Target XML Message

In the second data flow, a target XML message is created (see Figure 27). This will be used to send a response back to the requestor of the real-time service.

Figure 27 Target XML Message

2.2 Identify Discount Targets Batch Job

Inquiries that have been converted to sales need to have their respective inquiry records updated with the corresponding POS transaction identifier that was created in the e-commerce POS system. To effect this update, the following processes occur:

Updating inquiries with the e-commerce POS identifier

Identifying the closest stores to the site where the online inquiry was made

Identifying whether the closest store has the product in stock

These three processes are implemented within one job that has two data flows, as shown in Figure 28.

Figure 28 Identifying Discount Targets

Updating Inquiries with the E-Commerce POS Identifier

The first data flow in the job performs a pushed-down update that adds POS transaction identifiers from the e-commerce POS system to the inquiry records, which identifies those inquiries that have been converted to a sale without discount. The matching is performed using the loyalty identifier, the product, and a parameterized time interval in which the sale could occur. The data flow is shown in Figure 29.

Figure 29 Data Flow That Updates the Inquiry with E-Commerce POS Identifiers
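The matching rule (same loyalty identifier, same product, sale within a parameterized time interval) can be sketched in a few lines of standalone Python. In the actual job this logic is pushed down to the database as an update; the in-memory version below only illustrates the rule. The function name, the dictionary keys, and the 24-hour default window are assumptions for this example.

```python
from datetime import datetime, timedelta


def match_pos_transactions(inquiries, pos_sales, window_hours=24):
    """Return {inquiry_id: pos_txn_id} for inquiries that converted to a sale.

    A sale matches an inquiry when the loyalty ID and product agree and the
    sale happened within window_hours after the inquiry (hypothetical rule).
    """
    window = timedelta(hours=window_hours)
    matches = {}
    for inq in inquiries:
        for sale in pos_sales:
            same_customer = sale["loyalty_id"] == inq["loyalty_id"]
            same_product = sale["product_id"] == inq["product_id"]
            elapsed = sale["sold_at"] - inq["inquired_at"]
            if same_customer and same_product and timedelta(0) <= elapsed <= window:
                matches[inq["inquiry_id"]] = sale["pos_txn_id"]
                break  # keep the first qualifying sale for this inquiry
    return matches


if __name__ == "__main__":
    inquiries = [{"inquiry_id": 1, "loyalty_id": "L1", "product_id": "P1",
                  "inquired_at": datetime(2016, 1, 1, 10)}]
    sales = [{"pos_txn_id": "T9", "loyalty_id": "L1", "product_id": "P1",
              "sold_at": datetime(2016, 1, 1, 12)}]
    print(match_pos_transactions(inquiries, sales))
```

Inquiries that do not appear in the returned mapping correspond to the unconverted records that the next data flow evaluates.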

Inquiry records that are unconverted, that is, those records without an e-commerce POS transaction identifier, are evaluated in the next data flow, as shown in Figure 30.

Figure 30 Data Flow to Determine Proximate Store and On-Hand Quantity

Determine Store Proximity of Unconverted Sales Data Flow

Using the latitude and longitude of the inquiry, the proximate STORE ID is determined by comparing the inquiry position with the stores in the region. To do this, the two sets of latitude and longitude are passed into a User-Defined transform that implements the Haversine formula (a formula first published in the early 1800s to calculate distance between two points on a sphere).

In this case, the sphere is earth, and the two points are the latitude and longitude of the Internet connection from which the inquiry was made and those of each store within that region. The implementation of that formula is shown in Figure 31.

Figure 31 Input Options of User-Defined Transforms Implementing Haversine Formula

In Listing 1, you’ll see the Python code used for this formula to calculate the distance between two geographic positions that will be used to determine the closest store to the inquiry position.

from math import asin, cos, radians, sin, sqrt

def haversine(lat1, lon1, lat2, lon2):
    # convert the incoming field values to floats
    lon1 = float(lon1)
    lon2 = float(lon2)
    lat1 = float(lat1)
    lat2 = float(lat2)
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    d_lon = lon2 - lon1
    d_lat = lat2 - lat1
    a = sin(d_lat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(d_lon / 2) ** 2
    c = 2 * asin(sqrt(a))
    # 6367 km is the approximate radius of the Earth
    km = 6367 * c
    return km

print('Assign UDT input record column values to variables')
p_lat1 = record.GetField(u'LATITUDE')
p_lat2 = record.GetField(u'LATITUDE_1')
p_long1 = record.GetField(u'LONGITUDE')
p_long2 = record.GetField(u'LONGITUDE_1')

print('Make call to defined haversine function')
# pass the arguments in (lat, lon) order to match the function signature
dKm = haversine(p_lat1, p_long1, p_lat2, p_long2)

print('Assign result value to UDT output column')
record.SetField(u'distanceKm', unicode(dKm))

Listing 1 Python Code to Apply the Haversine Formula

After determining the distance to each store within the region in which the inquiry was made, the store ID with the least distance is added to the inquiry record, as shown in Figure 32.

Figure 32 Portion of Data Flow to Capture Closest Store to Inquiry Site
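Selecting the store with the least distance amounts to taking a minimum over the candidate stores. The standalone Python sketch below shows that step outside of Data Services; it restates the Haversine calculation so that it runs on its own. The function names and the dictionary keys (lat, lon, store_id) are assumptions for illustration, and the sample coordinates are made up.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6367.0):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(
        radians, [float(lat1), float(lon1), float(lat2), float(lon2)]
    )
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return radius_km * 2 * asin(sqrt(a))


def closest_store(inquiry, stores):
    """Return the store_id of the store nearest to the inquiry position."""
    nearest = min(
        stores,
        key=lambda s: haversine_km(inquiry["lat"], inquiry["lon"], s["lat"], s["lon"]),
    )
    return nearest["store_id"]


if __name__ == "__main__":
    # Hypothetical inquiry position and regional stores.
    inquiry = {"lat": 40.0, "lon": -75.0}
    stores = [
        {"store_id": "S1", "lat": 45.0, "lon": -75.0},
        {"store_id": "S2", "lat": 40.1, "lon": -75.0},
    ]
    print(closest_store(inquiry, stores))
```

In the data flow itself, the same effect is achieved with transforms rather than a Python loop; the sketch is only meant to show the selection rule.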

2.3 Product Availability Check Lookup

After the proximate store is found, the store identifier and product identifier are joined with the current stock snapshot to determine whether that store has on-hand quantity, because the organization doesn’t want to offer discounts with quick expirations for products that aren’t in stock. This process is highlighted in Figure 33. After checking for a positive quantity (with a parameterized margin), the closest store and in-stock indicator are updated on the inquiry table.

Figure 33 Process to Perform If There Is On-Hand Quantity of the Product at the Store
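The availability check can be read as a lookup against the stock snapshot followed by a comparison with the margin. The following standalone Python sketch shows one way to express that rule; the function name, the snapshot structure keyed by (store, product), and the interpretation of the margin as a quantity that on-hand stock must exceed are assumptions, because the text does not spell them out.

```python
def in_stock(stock_snapshot, store_id, product_id, margin=0):
    """Return True when on-hand quantity exceeds the margin.

    stock_snapshot maps (store_id, product_id) to on-hand quantity.
    A missing entry is treated as zero on hand (assumption).
    """
    on_hand = stock_snapshot.get((store_id, product_id), 0)
    return on_hand > margin


if __name__ == "__main__":
    # Hypothetical snapshot rows.
    snapshot = {("S2", "P1"): 3, ("S2", "P2"): 0}
    print(in_stock(snapshot, "S2", "P1", margin=2))
    print(in_stock(snapshot, "S2", "P2"))
```

Keeping the margin as a parameter mirrors the parameterized margin mentioned above, so the threshold can be tuned without changing the logic.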

2.4 Results

This was a solution with a very aggressive design and development cycle. Data Services proved again that it’s able to deliver the required functionality and performance. The data that is being provisioned is valuable beyond the original operational function of encouraging more sales and is now being used and analyzed for tuning discounts to increase customer sales and loyalty.

3 Integration with SAP BW, SAP APO, and SAP ECC

In this section, we will look at the distribution industry and how to use Data Services to reduce integration times between SAP Advanced Planning and Optimization (SAP APO) and SAP ECC.

A large industrial distribution company had to calculate the delivery times of products between internal sites as well as to customer sites as one of its business processes. There were three types of delivery times: planned delivery time, actual delivery time, and average delivery time. The delivery times were derived from purchase order data residing in the company’s SAP ECC system. The results of the planned delivery time and actual delivery time were to be used by the company’s SAP Advanced Planning and Optimization (SAP APO)/Project System (SAP PS) to calculate a product’s safety stock and pipeline forecast.

At a high level, the need seems straightforward and easy, especially if viewed backwards. The business users need data accessible in SAP ECC. The data comes from SAP APO, which is provisioned from Data Services. The data from Data Services is derived from SAP ECC. Initially, data is extracted from SAP ECC, delivery times are calculated in Data Services, and then the data is loaded into SAP APO. SAP APO runs as part of SAP Business Warehouse (SAP BW), so to effect the load into SAP APO, we need to load it into the instance of SAP BW on which SAP APO is running. Additional calculations are made in SAP APO, and the results are then extracted via Data Services and loaded into SAP ECC.

What seemed straightforward and easy had been attempted with a different extraction, transformation, and loading (ETL) toolset. The details of the business requirements for calculating delivery times turned out to be very complex. Average delivery times were calculated monthly. Planned and actual delivery times were calculated weekly and for the previous week. The delivery times were calculated differently depending on whether the transaction was internal or external, whether the vendor was domestic or foreign, whether a statistically relevant number of data points existed, and whether there were outliers. As a result, the ETL tool was unable to integrate systems as needed. The delivery time calculation took 23 hours and was unreliable.

The company had just made an investment to implement Data Services 4.2, partly for data governance and partly to take over processes like this that negatively impact the business.

A solution was created that leverages integration with SAP ECC through extractors and IDoc messages, and that leverages integration with SAP APO through InfoPackages and Business Application Programming Interface (BAPI) function calls. The first job (Figure 34) was set up to run on a weekly basis to extract purchase order data from SAP ECC and calculate delivery times.

Figure 34 Data Services Job for SAP ECC Extraction and Calculation

SAP ECC extractors were used to extract the purchase order data. Where extractors weren’t available or set up, such as for material and vendor data, ABAP data flows were used.

Data flows were created to calculate delivery times per business requirements.

A second job was then created to load the calculations into SAP APO. SAP APO then used that data to run calculations on safety stock and pipeline forecasts. After the calculations took place in SAP APO, results were extracted with a BAPI function call. Data was then loaded into SAP ECC using IDoc messages (Figure 35).

Figure 35 Data Services Job for SAP APO Extraction and SAP ECC Load

3.1 SAP ECC Extraction

Leveraging the SAP ECC extractors and their delta queues allowed for an efficient process to extract the data changes. A transform was used in conjunction with the extractors to manage the inserts and updates into the staging tables. The data flow is shown in Figure 36.

Figure 36 SAP ECC Extractor

3.2 Delivery Time Calculations

The calculations required to determine delivery times involved taking a year’s worth of data and retrieving a specified number of records for each material-location combination. To make these calculations at the data level and not individually row by row, we created a series of data flows that produced calculations for all scenarios. The scenarios were different sources for internal versus external transactions, different volumes of records for domestic versus foreign, and different outlier factors for external domestic versus external foreign versus internal transactions. If a minimum volume of records didn’t exist, then the results were replaced with a default value.
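The interplay of minimum record volume, outlier factor, and default value can be illustrated with a small standalone Python sketch. The business rules themselves are not given in the text, so the outlier rule used here (drop values farther than outlier_factor standard deviations from the mean) is purely an assumption, as are the function names and the sample numbers.

```python
from collections import defaultdict
from statistics import mean, pstdev


def average_delivery_days(days, min_records, outlier_factor, default_days):
    """Average delivery time for one material-location combination.

    Falls back to default_days when too few records exist. The outlier
    rule is a hypothetical stand-in for the actual business rule.
    """
    if len(days) < min_records:
        return default_days
    center = mean(days)
    spread = pstdev(days)
    kept = [d for d in days if abs(d - center) <= outlier_factor * spread]
    return mean(kept) if kept else default_days


def delivery_days_by_key(rows, min_records, outlier_factor, default_days):
    """rows are (material, location, days) tuples; returns one average per key."""
    grouped = defaultdict(list)
    for material, location, days in rows:
        grouped[(material, location)].append(days)
    return {
        key: average_delivery_days(values, min_records, outlier_factor, default_days)
        for key, values in grouped.items()
    }


if __name__ == "__main__":
    rows = [("M1", "A", d) for d in [5, 5, 6, 5, 6, 5, 40]] + [("M2", "A", 9)]
    print(delivery_days_by_key(rows, min_records=5, outlier_factor=2.0, default_days=14))
```

Different scenarios (internal, external domestic, external foreign) would call the same function with different values for min_records and outlier_factor, which is the set-based equivalent of what the series of data flows does.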

After the calculations were completed, Case transforms were used to associate the appropriate results with the material-location combination based on business requirements criteria. Figure 37 shows the series of case statements to map the results for each given scenario. The transforms that follow the Case transforms map the calculated results into the proper output fields.

Figure 37 Calculation Output Data Flow

3.3 SAP APO Interfaces

After the calculations are completed, the data needs to be loaded into SAP APO and used to help determine the safety stock and pipeline. The interface from Data Services into SAP APO uses the InfoPackage object. After the SAP APO calculations are made, a Data Services job will read those calculated values from SAP APO using a BAPI function call.

To use the InfoPackage, a couple of setup steps were needed. The datastore containing the connection information and metadata is set up as an SAP BW target datastore. Data Services needed to be set up as a source system in SAP BW/SAP APO. Once set up and activated, it’s available under the external metadata sources.

Figure 38 shows the data flow required to consolidate the data and structure it properly to load into SAP APO.

Figure 38 Loading Data from Data Services into SAP APO

To read data from SAP APO, we set up a different datastore, configured as an SAP BW source system. We used a BAPI function call to read data into Data Services as shown in Figure 39. The function call is a standard function on the SAP APO/SAP BW system, but to use it in this flow, the SAP APO team needed to set up the planning area to have the necessary data available. The function call required input data in a nested table. To set up the nested table structure and iteration correctly, we connected the source tables to the function through a transform.

Figure 39 Reading from SAP APO Using a BAPI Function Call

3.4 SAP ECC Interfaces

The final step of this process was to load the SAP APO calculation data into SAP ECC. To accomplish this, we used the IDoc message transform as the interface (Figure 40). The IDoc was created and exposed by the SAP ECC team. Once exposed, it’s available to import from the external metadata section of the SAP ECC datastore. To match the input structure needed for the IDoc, we used one transform to structure and iterate the payload and another transform to set the header information. The expected volume for this loading process was 100,000 individual messages, so we opted for IDoc messages rather than IDoc files.

Figure 40 Loading Data into SAP ECC with IDoc Messages

3.5 Results

Consistent data was an ongoing challenge for several different reasons. The original design in the development environment used a subset of the data that was present in production. The smaller volumes of data hid the performance issues we would find later in the interfaces. The primary issue was seen in sending large volumes of input data to the BAPI function.

The BAPI function required input in the form of material and location keys. If a material-location combination was not present in the SAP APO system, then an error would be raised. We started down the path of sending only valid material-location combinations, but found the volume of individual combinations was ~20 million records. Sending those records to the function would bog down the SAP APO system and eventually raise an error. Trying to throttle what would be sent by adding a loop prevented the overload errors, but the time required to process individual requests was excessive. Adjusting the inputs, we found that omitting the material and only sending location keys greatly improved the performance. The request was being handled more in a batch mode in SAP APO by pulling all materials associated with that location, which removed the need to spend resources on the Data Services side for grouping the material and location keys and avoided processing each individual record request in SAP APO.
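The effect of sending only location keys can be seen in a short standalone Python sketch: collapsing material-location pairs to their distinct locations shrinks the request list from one entry per combination to one entry per location. This does not call any SAP function; the function name and sample keys are assumptions for illustration.

```python
def location_requests(material_location_pairs):
    """Collapse (material, location) pairs into a sorted list of distinct locations."""
    return sorted({location for _material, location in material_location_pairs})


if __name__ == "__main__":
    # Hypothetical keys: four combinations reduce to three location requests.
    pairs = [("M1", "A"), ("M2", "A"), ("M1", "B"), ("M3", "C")]
    requests = location_requests(pairs)
    print(len(pairs), "combinations ->", len(requests), "location requests")
```

With millions of combinations spread over a much smaller number of locations, the reduction in individual requests is what lets the receiving system process the work in a batch-like mode.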

We also found that SAP ECC extractors, while set up for change data capture (CDC), didn’t actually produce the operation types. The company’s team spent several cycles tweaking some of the extractors, but ultimately found they could accomplish the necessary data extraction using ABAP data flows and pulling directly from the tables.

The extractors were also customized to fit the company’s needs; as a result, we found multiple sources for the same data that didn’t match. Some customized data wasn’t current. Naming conventions were not consistent between systems, or, even worse, they used the same name for different sets of data. To remedy this in the project, we spent several cycles mapping the necessary data and validating that it was the correct data. For a larger picture remedy, the company’s data governance team is looking into leveraging SAP Information Steward’s Metapedia to have a common data dictionary.

Starting with the original goal of improving on a run time of 23 hours, we were able to run the full process in 3 hours. This includes the processing time in SAP APO and the addition of loading data into SAP ECC (which wasn’t part of the original process). The results helped justify the company’s move to one standardized ETL platform, providing a system consistent and tightly integrated with its business applications, rather than having several different tools.

Usage, Service, and Legal Notes

Notes on Usage

This E-Bite is protected by copyright. By purchasing this E-Bite, you have agreed to accept and adhere to the copyrights. You are entitled to use this e-book for personal purposes. You may print and copy it, too, but also only for personal use. Sharing an electronic or printed copy with others, however, is not permitted, neither as a whole nor in parts. Of course, making them available on the Internet or in a company network is illegal.

For detailed and legally binding usage conditions, please refer to the section Legal Notes.

Service Pages

The following sections contain notes on how you can contact us.

Praise and Criticism

We hope that you enjoyed reading this E-Bite. If it met your expectations, please do recommend it. If you think there is room for improvement, please get in touch with the editor of the E-Bite: Hareem Shafi.

We welcome every suggestion for improvement but, of course, also any praise! You can also share your reading experience via Twitter, Facebook, or email.

Technical Issues

If you experience technical issues with your e-book or e-book account at SAP PRESS, please feel free to contact our reader service: [email protected].

About Us and Our Program

The website http://www.sap-press.com provides detailed and first-hand information on our current publishing program. Here, you can also easily order all of our books and e-books. Information on Rheinwerk Publishing Inc. and additional contact options can also be found at http://www.sap-press.com.

Legal Notes

This section contains the detailed and legally binding usage conditions for this e-book.

Copyright Note

This publication is protected by copyright in its entirety. All usage and exploitation rights are reserved by the author and Rheinwerk Publishing; in particular the right of reproduction and the right of distribution, be it in printed or electronic form.

© 2016 by Rheinwerk Publishing Inc., Boston (MA)

Your Rights as a User

You are entitled to use this e-book for personal purposes only. In particular, you may print the e-book for personal use or copy it as long as you store this copy on a device that is solely and personally used by yourself. You are not entitled to any other usage or exploitation.

In particular, it is not permitted to forward electronic or printed copies to third parties. Furthermore, it is not permitted to distribute the e-book on the Internet, in intranets, or in any other way or make it available to third parties. Any public exhibition, other publication, or any reproduction of the e-book beyond personal use are expressly prohibited. The aforementioned does not only apply to the e-book in its entirety but also to parts thereof (e.g., charts, pictures, tables, sections of text). Copyright notes, brands, and other legal reservations may not be removed from the e-book.

Limitation of Liability

Regardless of the care that has been taken in creating texts, figures, and programs, neither the publisher nor the author, editor, or translator assume any legal responsibility or any liability for possible errors and their consequences.

Imprint

This E-Bite is a publication many contributed to, specifically:

Editor Hareem Shafi
Acquisitions Editor Kelly Grace Weaver
Copyeditor Julie McNamee
Layout Design Graham Geary
Cover Design Graham Geary
Production E-Book Nicole Carpenter
Typesetting E-Book III-satz, Husby (Germany)

ISBN 978-1-4932-1309-2

© 2016 by Rheinwerk Publishing Inc., Boston (MA)
1st edition 2016

All rights reserved. Neither this publication nor any part of it may be copied or reproduced in any form or by any means or translated into another language, without the prior consent of Rheinwerk Publishing, 2 Heritage Drive, Suite 305, Quincy, MA 02171.

Rheinwerk Publishing makes no warranties or representations with respect to the content hereof and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. Rheinwerk Publishing assumes no responsibility for any errors that may appear in this publication.

“Rheinwerk Publishing” and the Rheinwerk Publishing logo are registered trademarks of Rheinwerk Verlag GmbH, Bonn, Germany. SAP PRESS is an imprint of Rheinwerk Verlag GmbH and Rheinwerk Publishing, Inc.

All of the screenshots and graphics reproduced in this E-Bite are subject to copyright © SAP SE, Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany.

SAP, the SAP logo, ABAP, Ariba, ASAP, Duet, hybris, SAP Adaptive Server Enterprise, SAP Advantage Database Server, SAP Afaria, SAP ArchiveLink, SAP Business ByDesign, SAP Business Explorer (SAP BEx), SAP BusinessObjects, SAP BusinessObjects Web Intelligence, SAP Business One, SAP BusinessObjects Explorer, SAP Business Workflow, SAP Crystal Reports, SAP d-code, SAP EarlyWatch, SAP Fiori, SAP Ganges, SAP Global Trade Services (SAP GTS), SAP GoingLive, SAP HANA, SAP Jam, SAP Lumira, SAP MaxAttention, SAP MaxDB, SAP NetWeaver, SAP PartnerEdge, SAPPHIRE NOW, SAP PowerBuilder, SAP PowerDesigner, SAP R/2, SAP R/3, SAP Replication Server, SAP SI, SAP SQL Anywhere, SAP Strategic Enterprise Management (SAP SEM), SAP StreamWork, SuccessFactors, Sybase, TwoGo by SAP, and The Best-Run Businesses Run SAP are registered or unregistered trademarks of SAP SE, Walldorf, Germany.

All other products mentioned in this E-Bite are registered or unregistered trademarks of their respective companies.
