High Availability in SAP HANA Multitier Cost Optimized Scenario: Case Study

Cleber Paiva de Souza, Gabriel Dieterich Cavalcante
S-SYS Sistemas e Soluções Tecnológicas

Post on 03-Apr-2018

Who we are

• S-SYS was founded in 2014.

• SUSE Partner since foundation.

• Formed by professionals with experience in SUSE's entire portfolio.

• Formed by certified professionals: CLP, CLE, CNI, SAP HANA.

Aché Farmacêutica

Aché is a 100% Brazilian pharmaceutical laboratory founded in 1966. Over five decades of success it has kept strengthening and expanding its operations.

The company is present in the prescription, over-the-counter (prescription-exempt), generic, dermatological and dermocosmetic segments. It also offers nutraceutical and probiotic products.

Today the portfolio covers more than 23 medical specialties, with more than 316 brands in 762 presentations. By 2020 the company will launch 184 more products, and 120 in the next three years.

Aché also holds a stake in Bionovis, which will be inaugurated soon and is focused on research, development, production, distribution and commercialization of biotechnological medications.

• 1st company to implement SAP APO on HANA in Latin America
• 1st pharmaceutical company to implement SAP ECC on HANA Multitenant (MDC)

Example of performance gains (I)

Creation of consolidated files from Accounts Payable, Accounts Receivable and Purchasing

[Chart: ORACLE vs. HANA]

Example of performance gains (II)

Invoice Processing job

[Chart: processing time (ms) by timestamp]

SUSE for SAP

SUSE Linux Enterprise Server for SAP Applications

Latest release for x86-64 servers

SAP HANA

What's SAP HANA
• In-memory database
• Advanced analytics
• Versions:
  • Basic, Platform and Enterprise Editions
  • Additional capabilities as add-ons
• Sold as appliance, TDI (Tailored Datacenter Integration) or in the cloud
  • Appliance: by certified hardware partners
  • TDI: using customer-owned hardware
    • Certified SAP HANA appliances without storage, as listed in the SAP HANA Hardware Directory
    • Certified storage systems, as listed in SAP HANA Hardware Certification – Enterprise Storage
    • Certified professional (E_HANAINS151)

SAP HANA deployment types

Single Instance vs Multi-tenant (MDC)

Single instance:
• One database per instance.
• Available in all SPS (Support Package Stack) versions.
• Indexserver port 3NN15 (NN = instance number).
• It is possible to convert to MDC.

Multi-tenant (MDC):
• Multiple databases per instance.
• Only in SPS09 and later. High isolation only in SPS10 and later.
• Indexserver port 3NN40, 3NN43, 3NN99… (NN = instance number).
• Revert not possible, except by export/import.
• Creating tenants:
CREATE DATABASE DB1 SYSTEM USER PASSWORD "Linux123"
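
The 3NNxx port patterns above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `hana_port` is made up for illustration, and the code does not verify which service actually listens on a given port.

```python
def hana_port(instance_number: int, suffix: int) -> int:
    """Return the TCP port 3NNxx for instance number NN and two-digit suffix xx."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    if not 0 <= suffix <= 99:
        raise ValueError("port suffix must be between 00 and 99")
    return 30000 + instance_number * 100 + suffix


if __name__ == "__main__":
    # Single-instance indexserver of instance 01 (pattern 3NN15).
    print(hana_port(1, 15))
    # First MDC indexserver pattern listed on the slide (3NN40), instance 02.
    print(hana_port(2, 40))
```

With instance number 01 the single-instance indexserver port is 30115; with instance number 02 the first MDC pattern gives 30240.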

SAP HANA System Replication Scenarios

SAP HANA Failover Automation (System Replication)

[Diagram: Pacemaker cluster, active / active. SAP HANA (PROD) primary and SAP HANA (PROD) secondary are linked by System Replication; a vIP is shown for PROD.]

SAP HANA Failover Automation (Cost Optimized Scenario)

[Diagram: Pacemaker cluster, active / passive. SAP HANA (PROD) primary is linked by System Replication to a second node that hosts the SAP HANA (PROD) secondary plus QA/DEV; a vIP is shown for PROD.]

SAP HANA Failover Automation (Cost Optimized Scenario + Disaster Recovery)

[Diagram: Pacemaker cluster, active / passive. SAP HANA (PROD) primary is linked by System Replication to a second node that hosts the SAP HANA (PROD) secondary plus QA/DEV; a vIP is shown for PROD. The diagram is split into Site A and Site B, and a SAP HANA (PROD) DR system is fed by System Replication (async).]

Log replication modes
• mode=sync (Full Sync):
  • No data loss.
  • Available in SPS8 and higher.
  • Enabled by global.ini -> [system_replication] -> enable_full_sync OR
hdbnsutil -sr_fullsync --enable
• mode=sync:
  • Data loss can occur when a takeover is executed while the secondary system is disconnected.
  • Timeout controlled by global.ini -> [system_replication] -> logshipping_timeout.
• mode=syncmem:
  • Data loss can occur when primary and secondary fail at the same time, and when a takeover is executed.
  • Timeout controlled by global.ini -> [system_replication] -> logshipping_timeout.
• mode=async:
  • The most vulnerable to data loss.

Operation modes for system replication
• delta_datashipping:
  • Establishes a system replication in which a delta data shipping takes place occasionally (by default every 10 minutes) in addition to the continuous log shipping.
  • Shipped redo log is not replayed on the secondary site.
• logreplay:
  • Does not require a delta data shipping.
  • Shipped redo log is continuously replayed on the secondary site.

System replication states
• UNKNOWN:
  • Secondary did not connect to primary since last restart of primary.
• INITIALIZING:
  • Initial data transfer in progress.
• SYNCING:
  • Secondary is syncing again.
• ACTIVE:
  • Initialization or sync with primary is complete.
• ERROR:
  • Error occurred on the connection.
• Monitoring via system table:
hdbsql -U userkey 'select distinct REPLICATION_STATUS from SYS.M_SERVICE_REPLICATION'
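
A monitoring script could reduce the statuses returned by that query to a single verdict. The sketch below assumes the values have already been extracted from the hdbsql output into a list; the severity ordering and the function names are assumptions made for illustration, not something defined by SAP.

```python
# Severity order is an assumption of this sketch: higher means worse.
SEVERITY = {"ACTIVE": 0, "SYNCING": 1, "INITIALIZING": 2, "UNKNOWN": 3, "ERROR": 4}


def worst_status(statuses):
    """Return the most severe replication status among the given values."""
    # Drop empty entries and the double quotes hdbsql may print around strings.
    cleaned = [s.strip().strip('"').upper() for s in statuses if s.strip()]
    if not cleaned:
        # No rows at all is treated like a secondary that never connected.
        return "UNKNOWN"
    unexpected = [s for s in cleaned if s not in SEVERITY]
    if unexpected:
        raise ValueError(f"unexpected replication status: {unexpected[0]}")
    return max(cleaned, key=SEVERITY.__getitem__)


def replication_in_sync(statuses):
    """True only when every service reports ACTIVE."""
    return worst_status(statuses) == "ACTIVE"


if __name__ == "__main__":
    print(worst_status(["ACTIVE", "SYNCING", "ACTIVE"]))
    print(replication_in_sync(["ACTIVE", "ACTIVE"]))
```

Treating an empty result as UNKNOWN is a deliberately cautious choice: a script that sees no rows should not report the replication as healthy.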

Parameters
• SAP HANA parameters should be equal between nodes.
• Starting with SAP HANA SPS 12 it is possible to automatically replicate parameters from the primary to the secondary site by activating the following parameter:
  • global.ini -> [inifile_checker] -> replicate = 'true'

Environment consideration

Configuration details
• Production database: HANA01 / SID: HPA / Instance number: 01
• QA/DEV database: HANA02 / SID: HQA / Instance number: 02
• Stonith: IPMI (production), but SBD for labs
• Cluster resources for SAP: ocf:suse:SAPHanaTopology, ocf:suse:SAPHana and ocf:heartbeat:SAPDatabase
• stonith-enabled="true" (set to false during configuration)
• no-quorum-policy="ignore"
• stonith-action="poweroff"
• PREFER_SITE_TAKEOVER="false"
  • False: try to restart service locally
  • True: prefer to takeover to remote site
• AUTOMATED_REGISTER="false"
  • False: former primary instance should NOT register after DUPLICATE_PRIMARY_TIMEOUT
  • True: former primary instance should register after DUPLICATE_PRIMARY_TIMEOUT

High Availability Configuration

Setup SUSE High Availability
• Install SAP HANA
• Initiate cluster on first node
  • Interfaces using bond
  • Exclusive channel for cluster communication
  • Openais service disabled at boot
  • Configure name resolution in SAP HANA for replication network.
    • global.ini -> system_replication_hostname_resolution
sleha-init
• Change configurations before joining the second node
  • Communication via udpu or multicast
  • Secure channel cluster communication (secauth)
  • Check files to sync via csync2
sleha-join

Procedures for SAP HANA System Replication
• Verify that SAP HANA Host Agent is installed on all nodes:
sapcontrol -nr 01 -function CheckHostAgent
SAPHostAgent Installed
• Create an initial full backup:
hdbsql -u SYSTEM -i 01 "BACKUP DATA USING FILE ('COMPLETE_DATA_BACKUP')"
  • System replication will not work without an initial full backup
  • For multitenant, all tenants should have a full backup
• Set up HDB user store for SAP HANA Host Agent:
hdbsql -u SYSTEM -i 01 "CREATE USER SLEHASYNC PASSWORD Password1"
hdbsql -u SYSTEM -i 01 "GRANT PUBLIC TO SLEHASYNC"
hdbsql -u SYSTEM -i 01 "GRANT MONITORING TO SLEHASYNC"
hdbsql -u SYSTEM -i 01 "ALTER USER SLEHASYNC DISABLE PASSWORD LIFETIME"
hdbsql -u SYSTEM -i 01 "ALTER USER SLEHASYNC SET PARAMETER PRIORITY = '8'"
hdbuserstore SET slehaloc localhost:30113 slehasync Password1

Details about SAP HANA System Replication
• Validate query
• SAP HANA software version of the secondary has to be equal to or newer than the version on the primary
  • Near zero downtime procedures rely on this
• Replication information:
hdbsql -U slehaloc "select STATUS from M_SERVICE_REPLICATION"
hdbcons -e hdbindexserver "replication info"

srTakeover
• Set a script to hook takeover operations and adjust SAP HANA configurations to fit in memory:
  • When takeover occurs, the hook is invoked on the secondary:
ALTER SYSTEM ALTER CONFIGURATION ('global.ini','SYSTEM') UNSET \
('memorymanager','global_allocation_limit') WITH RECONFIGURE
ALTER SYSTEM ALTER CONFIGURATION ('global.ini','SYSTEM') UNSET \
('system_replication','preload_column_tables') WITH RECONFIGURE
• SAP Note 2196941:
  • https://launchpad.support.sap.com/#/notes/2196941/E
• https://wiki.scn.sap.com/wiki/display/ATopics/HOW+TO+SET+UP+SAPHanaSR+IN+THE+COST+OPTIMIZED+SAP+HANA+SR+SCENARIO+-+PART+I
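
The two statements run by the hook differ only in section and key, so a hook script could generate them from a short list. The following is a sketch of the string construction only; `unset_statement`, `TAKEOVER_UNSETS` and `takeover_statements` are hypothetical names, and nothing here connects to SAP HANA or reproduces the actual hook shipped with SAPHanaSR.

```python
# (section, key) pairs that the slide unsets in global.ini at takeover time.
TAKEOVER_UNSETS = [
    ("memorymanager", "global_allocation_limit"),
    ("system_replication", "preload_column_tables"),
]


def unset_statement(section, key, ini_file="global.ini", layer="SYSTEM"):
    """Build the ALTER SYSTEM statement that unsets one configuration key."""
    return (
        f"ALTER SYSTEM ALTER CONFIGURATION ('{ini_file}','{layer}') "
        f"UNSET ('{section}','{key}') WITH RECONFIGURE"
    )


def takeover_statements():
    """Return the statements to execute on the secondary after a takeover."""
    return [unset_statement(section, key) for section, key in TAKEOVER_UNSETS]


if __name__ == "__main__":
    for statement in takeover_statements():
        print(statement)
```

Keeping the pairs in one list makes it easy to see which memory restrictions of the cost optimized secondary are lifted once it becomes the production primary.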

Fixed Bugs

SAPDatabase Resource Agent (sapdb.sh)
• In Multi-Tenant Databases, SAP HANA starts multiple indexserver daemons, one for each database
hana01:/usr/sap/hostctrl/exe> saphostctrl -function GetDatabaseStatus -dbname HPA -dbtype hdb 01 slehaloc
Database Status: Warning
Component name: hdbdaemon (HDB Daemon), Status: Running (Running)
Component name: hdbcompileserver (HDB Compileserver), Status: Running (Running)
Component name: hdbnameserver (HDB Nameserver), Status: Running (Running)
Component name: hdbpreprocessor (HDB Preprocessor), Status: Running (Running)
Component name: hdbwebdispatcher (HDB Web Dispatcher), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB1), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB2), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB3), Status: Running (Running)
Component name: hdbconnectivity (HDB Connectivity), Status: Running (connect possible)
Component name: hdbalertmanager (HDB Alertmanager), Status: Warning (alerts on database.)

SAPDatabase Resource Agent (sapdb.sh)
• The resource agent uses a command to get status from monitored services, as specified in MONITOR_SERVICES
• sapdb.sh was crafted to parse output based only on service name and check if status is "Running"
hana01:/usr/sap/hostctrl/exe> echo "$output" | grep -i "Component[ ]*Name *[:=] *hdbindexserver (" | sed 's/^.*Status *[:=] *\([A-Za-z][A-Za-z0-9_]*\).*$/\1/i'
Running
Running
Running
#
• Now we have N services with the same name, which causes a "Running Running Running ..." status
• Pull request sent to ClusterLabs (#858), merged on Oct 12, 2016
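
The effect is easy to reproduce outside the resource agent. The sketch below is written in Python rather than the agent's shell and is not the patch from the pull request; it parses a shortened copy of the saphostctrl output, shows why comparing the joined result with "Running" fails, and shows one possible check that tolerates several components with the same name. All function names are illustrative.

```python
import re

# Shortened sample of the saphostctrl output shown on the previous slide.
SAMPLE_OUTPUT = """\
Component name: hdbdaemon (HDB Daemon), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB1), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB2), Status: Running (Running)
Component name: hdbindexserver (indexserver-DB3), Status: Running (Running)
"""

# Captures the service name and the first word of its status on each line.
LINE_RE = re.compile(
    r"Component\s*name\s*[:=]\s*(\S+)\s*\(.*?Status\s*[:=]\s*([A-Za-z][A-Za-z0-9_]*)",
    re.IGNORECASE,
)


def statuses_for(output, service):
    """Return the status of every component whose name matches the service."""
    return [
        match.group(2)
        for match in LINE_RE.finditer(output)
        if match.group(1).lower() == service.lower()
    ]


def naive_is_running(output, service):
    """Mimics the old behaviour: the joined statuses must equal 'Running'."""
    return " ".join(statuses_for(output, service)) == "Running"


def is_running(output, service):
    """Running only if the service was found and every instance is Running."""
    statuses = statuses_for(output, service)
    return bool(statuses) and all(status == "Running" for status in statuses)


if __name__ == "__main__":
    print(naive_is_running(SAMPLE_OUTPUT, "hdbindexserver"))
    print(is_running(SAMPLE_OUTPUT, "hdbindexserver"))
```

With three tenant indexservers the naive comparison sees "Running Running Running" and reports a failure, while the per-instance check accepts the healthy system.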

Troubleshooting

Stress testing
• HANA Stress Tool
  • https://github.com/Centiq/HanaStress
./hanastress.py -v --host localhost -i 01 -u SLEHASYNC \
-p Password1 -g anarchy --tables 200 \
--rows 1000000 --threads 20
(This will create 200 tables with 1000000 rows of information each, using 20 threads)
• After stress:
  - Remove database fragmentation:
ALTER SYSTEM RECLAIM DATAVOLUME 120 DEFRAGMENT
ALTER SYSTEM RECLAIM LOG
  - Force flushing log data to disk:
ALTER SYSTEM SAVEPOINT

Detecting failures
• /var/log/messages
  • Inspect this file to see cluster actions about SAP HANA operations
  • All environment messages will be in it
  • HanaSR consists of editable scripts: /usr/lib/ocf/
  • You can increase the verbosity of bash scripts by adding "-x" to the shebang
• hb_report
hb_report -u root -f "2016/02/13 08:45" -t "2016/02/13 10:45" /tmp/incident
• Monitor SAP HANA logs (look for *.crashdump files)
  • When SAP HANA has problems running, or one of its services hangs, it writes a crashdump
  • The files are inside the SAP HANA instance files folder

Failover SAP HANA

Operations on primary node
• Stop primary node for maintenance:
  • It is usually better to perform a graceful shutdown of Openais/Pacemaker:
rcopenais stop
• After a failure or when returning from maintenance:
  • It's better to rely on ad hoc sync first (HanaSR operations can take a long time to bring the primary to SOK status and take PROD back; this also avoids split-brain)
  • Clean up all replication status (primary node):
hdbnsutil -sr_cleanup --force

Operations on primary node
• Reestablishing synchronization:
  • Register primary as slave node:
hdbnsutil -sr_register --remoteHost=<hostname> \
--remoteInstance=<instance_number> --mode=syncmem --name=<SITENAME>
  • Start HANA, wait until sync, then stop it
  • Start OpenAIS
    • It will start HANA and replication, then migrate HANA Production to the primary node after a while

Operations on second node
• Simple, because no resource reallocation is needed.
• Stop/Start openais should be fine.

DR (Disaster Recovery) considerations

Operations DR Node
• Enable secondary node as sync source:
hdbnsutil -sr_enable
• Start DR replication:
hdbnsutil -sr_register --remoteHost=<remote_hostname> \
--remoteInstance=<instance_number> --mode=async \
--operationMode=<delta_datashipping|logreplay> --name=<DR_NAME>
• Stop DR replication:
hdbnsutil -sr_unregister --name=<DR_NAME>
• Avoid shutting down the DR node without unregistering it:
  • HanaSR uses HANA commands to monitor the system replication
  • When a node is unreachable, these commands can take more than one minute to return.
  • With this latency, some cluster operations can time out.
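
Because the registration command is typed by hand during recovery, a small wrapper that validates the arguments first can prevent mistakes. The sketch below only assembles the argument list from the flags shown on the slides; it assumes that `--remoteInstance` takes the instance number and that the operation mode values are `delta_datashipping` and `logreplay`. The host and site names in the usage example are placeholders.

```python
import shlex

REPLICATION_MODES = {"sync", "syncmem", "async"}
OPERATION_MODES = {"delta_datashipping", "logreplay"}


def sr_register_command(remote_host, remote_instance, mode, site_name,
                        operation_mode=None):
    """Build the hdbnsutil -sr_register argument list after basic validation."""
    if mode not in REPLICATION_MODES:
        raise ValueError(f"unknown replication mode: {mode}")
    if operation_mode is not None and operation_mode not in OPERATION_MODES:
        raise ValueError(f"unknown operation mode: {operation_mode}")
    args = [
        "hdbnsutil",
        "-sr_register",
        f"--remoteHost={remote_host}",
        f"--remoteInstance={remote_instance}",
        f"--mode={mode}",
    ]
    # The operation mode is optional, matching the primary re-registration slide.
    if operation_mode is not None:
        args.append(f"--operationMode={operation_mode}")
    args.append(f"--name={site_name}")
    return args


if __name__ == "__main__":
    # Placeholder host and site name; print the command instead of running it.
    print(shlex.join(sr_register_command("hana02", "01", "async", "DR", "logreplay")))
```

Returning a list rather than a string lets a caller pass it straight to `subprocess.run` without shell quoting concerns.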

References

Related SUSECon Presentations
• CAS89126 - SAP HANA High Availability with SUSE HA: Tales of clustering from the real world
• FUT92716 - SUSE Linux Enterprise Server for SAP Applications Roadmap
• SPO98283 - Scaling Your SAP HANA Data Warehouse with Lenovo and SUSE
• TUT89539 - SUSE High Availability for SAP HANA TDI in a VMware Environment
• TUT90846 - Towards Zero Downtime - How to Maintain SAP HANA System Replication Clusters
• TUT91496 - Live Patching Demo: Keep SAP Running When Patching the Linux Kernel

SAP Notes
• SAP Note 2165547 - FAQ: SAP HANA Database Backup & Recovery in an SAP HANA System Replication Landscape
• SAP Note 1999880 - FAQ: SAP HANA System Replication
• SAP Note 1702224 - Disable password lifetime for technical users
• SAP Note 2222250 - FAQ: SAP HANA Workload Management

References
• SAP HANA Administration Guide: http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf
• http://help.sap.com/hana/SAP_HANA_Server_Installation_Guide_en.pdf
• https://help.sap.com/saphelp_hanaplatform/helpdata/en/54/01f498b2c84fb5b3bcdcbda948d991/content.htm
• https://help.sap.com/saphelp_hanaplatform/helpdata/en/74/418e86b48542ffb38b54072e0b66ce/content.htm
• https://www.suse.com/docrep/documents/19rp0i23ol/sap_hana_sr_performance_optimized_scenario_11_sp4.pdf

Question?

Thank you!
