SLES15SP1 Pacemaker Cluster on HLI for SAP HANA 2.0SP5 Patch 52 Based on Fibre Channel

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

SLES15SP1 Pacemaker Cluster for SAP HANA 2.0SPS5 on HANA Large Instances, based on Fibre Channel or iSCSI SBD 
with Dynamic DNS reconfiguration

 

 

 

 


 

The usage of sensitive wording in this document like master, slave, blacklist and whitelist is because of the underlying cluster software components. It is not the wording of the author and/or Microsoft.

 

Sponsor: Juergen Thomas (Microsoft)

Author: Ralf Klahr (Microsoft)

Support: Fabian Herschel (SUSE), Lars Pinne (SUSE), Peter Schinagl (SUSE), Ralitza Deltcheva (Microsoft), Ross Sponholtz (Microsoft), Abbas Ali Mir (Microsoft), Momin Qureshi (Microsoft)

 

 

Overview

This document describes how to configure a Pacemaker cluster on SLES 15 SP1 that automates an SAP HANA database failover on HANA Large Instances and also takes care of the DNS IP reconfiguration. It assumes that the consultant has good Linux, SAP HANA, and Pacemaker knowledge.

References:

Azure HANA Large Instances control through Azure portal - Azure Virtual Machines | Microsoft Docs

SAP HANA System Replication Scale-Up - Performance Optimized Scenario | SUSE Linux Enterprise Server for SAP Applications 12 SP4

System Configuration

This illustration is a short overview of the system setup.

RalfKlahr_1-1629719012444.png

 

Node 1: dbhlinode1, IP 192.168.0.32, vIP 192.168.0.50

Node 2: dbhlinode2, IP 192.168.1.32, vIP 192.168.1.50

SAP HANA SID: HN1

DNS name of the virtual host: hanavip.hli.azure

It is very helpful to set up the SSH key exchange before starting the cluster configuration; the two nodes must trust each other anyway.

 

 

Requirements and Limitations

 

  1. The solution depends on a DNS server outside the cluster. This server must allow dynamic DNS changes for host entries and a short time-to-live (TTL). See the man pages ocf_heartbeat_dnsupdate(7) and nsupdate(8) for the required features.

  2. Allowing dynamic DNS updates from the HA cluster nodes must comply with your security standards.

  3. The DNS TTL must be aligned with the expected recovery time objective (RTO). That means around 30-60 seconds, which might increase the load on the DNS server. Even a TTL of 30 seconds is many times longer than the usual ARP update done by IPaddr2.

  4. Applications might ignore the TTL and cache hostnames. In that case applications might get stuck in tcp_retries2. (TODO: test that)

  5. A failed stop action of the dnsupdate resource will cause a node fence.

  6. The solution is meant for the SAP HANA system replication scale-up performance-optimized scenario. Due to its complexity, active/active read-enabled is not targeted.

  7. Any administrative takeover of the HANA primary should follow a described procedure. This procedure must rule out any chance of a duplicate HANA primary. It should also ensure that all clients follow the takeover.

  8. Due to these requirements and limitations, the solution is not a direct replacement for existing HA concepts that are based on moving a single IP address. The solution based on dynamic DNS is more of an automated disaster recovery (DR) solution.
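The TTL requirement above can be checked from a cluster node. The following is a minimal sketch, assuming that `dig` is installed and that the authoritative DNS server is queried directly (a caching resolver reports the remaining TTL, not the configured one). The function names and the 60-second limit are illustrative assumptions, not part of the original setup.

```shell
# Hypothetical helpers (names are illustrative).
# Read `dig +noall +answer` output on stdin and print the largest TTL of all A records.
# Answer lines look like: hanavip.hli.azure. 5 IN A 192.168.0.50
max_answer_ttl() {
    awk '$4 == "A" { seen = 1; if ($2 + 0 > max) max = $2 + 0 }
         END { if (!seen) exit 1; print max }'
}

# Succeed only if an A record was returned and its TTL is at most $1 seconds.
ttl_within_limit() {
    ttl=$(max_answer_ttl) || return 1
    [ "$ttl" -le "$1" ]
}

# Usage, assuming dnsvm01 is the authoritative server for the zone:
#   dig @dnsvm01 +noall +answer hanavip.hli.azure A | ttl_within_limit 60 && echo "TTL ok"
```

The check fails when no A record is returned at all, so a missing entry is not mistaken for a short TTL.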

 

 

Network VLAN Definition

In this setup we use several VLANs for the communication to the storage and to the outside world:

VLAN 90: User, application server, and Corosync ring0

VLAN 91: NFS for HANA shared and log backup

VLAN 92: (not specified)

VLAN 93: iSCSI, HANA System Replication (HSR), and Corosync ring1

VLAN 94: HANA data volume

VLAN 95: HANA log volume

 

Maintain /etc/hosts (it must be identical on both nodes). Do NOT declare the vIP here! It must be resolved by the DNS server.

[dbhlinode1~]# cat /etc/hosts

127.0.0.1   localhost localhost.azlinux.com

172.0.0.5    dnsvm01.nbagfh3o2y4urhrtvelgr2nvnf.bx.internal.cloudapp.net dnsvm01

172.0.0.6    dnsvm02.nbagfh3o2y4urhrtvelgr2nvnf.bx.internal.cloudapp.net dnsvm02

172.0.1.9    dnsvm03.nbagfh3o2y4urhrtvelgr2nvnf.bx.internal.cloudapp.net dnsvm03

 

192.168.0.32 dbhlinode1.hli.azure dbhlinode1 node1    #VLAN90

192.168.1.32 dbhlinode2.hli.azure dbhlinode2 node2    #VLAN90

 

10.25.91.51  dbhlinode1-nfs        #VLAN91

10.23.91.51  dbhlinode2-nfs        #VLAN91

 

10.25.95.51  dbhlinode1-log        #VLAN95

10.23.95.51  dbhlinode2-log        #VLAN95

 

10.25.94.51  dbhlinode1-data       #VLAN94

10.23.94.51  dbhlinode2-data       #VLAN94

 

10.25.93.51  dbhlinode1-iscsi      #VLAN93

10.23.93.51  dbhlinode2-iscsi      #VLAN93
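Because a vIP entry in /etc/hosts would silently override the dynamic DNS record, it can be worth checking for it on both nodes. The following is a minimal sketch; the function name is hypothetical and the comparison is case-sensitive, which is an assumption about how the names are written.

```shell
# Hypothetical helper: succeed only if NAME is not declared in the given hosts file.
# Text after '#' is stripped first, so a commented-out entry does not count.
vip_absent_from_hosts() {
    hosts_file=$1
    vip_name=$2
    if awk -v name="$vip_name" '
            { sub(/#.*/, "") }
            { for (i = 1; i <= NF; i++) if ($i == name) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file"; then
        echo "ERROR: $vip_name is declared in $hosts_file"
        return 1
    fi
    echo "OK: $vip_name is not declared in $hosts_file"
}

# Usage on each node:
#   vip_absent_from_hosts /etc/hosts hanavip.hli.azure
```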

 

HANA System Replication

 

A much better option is a performance-optimized scenario in which the database can be switched over directly. Only this scenario is described in this document. We recommend installing one cluster for the QAS system and a separate cluster for the PRD system. Only then is it possible to test all components before they go into production.

 

RalfKlahr_2-1629719124330.png

This process is based on the SUSE description on the following page:

SAP HANA System Replication Scale-Up - Performance Optimized Scenario | SUSE Linux Enterprise Server for SAP Applications 12 SP4

 

These are the actions to execute on node1 (primary).

 

Make sure that the database log mode is set to normal.

su - hn1adm

hdbsql -u system -p <password> -i 00 "select value from SYS.M_INIFILE_CONTENTS where key='log_mode'"

 

VALUE

"normal"

 

SAP HANA system replication only works after an initial backup has been performed. The following command creates an initial backup in the /tmp/ directory. Please select a proper backup file system for the database.

hdbsql -u SYSTEM -d SYSTEMDB \

   "BACKUP DATA FOR FULL SYSTEM USING FILE ('/tmp/backup')"

 

Backup files were created

ls -l /tmp

total 2031784

-rw-r----- 1 hn1adm sapsys     155648 Oct 26 23:31 backup_databackup_0_1

-rw-r----- 1 hn1adm sapsys   83894272 Oct 26 23:31 backup_databackup_2_1

-rw-r----- 1 hn1adm sapsys 1996496896 Oct 26 23:31 backup_databackup_3_1

 

 

Backup all database containers of this database.

hdbsql -i 00 -u system -p <password> -d SYSTEMDB "BACKUP DATA USING FILE ('/tmp/sydb')"

 

hdbsql -i 00 -u system -p <password> -d SYSTEMDB "BACKUP DATA FOR HN1 USING FILE ('/tmp/rh2')"

 

Enable the HSR process on the source system

hdbnsutil -sr_enable --name=DC1

nameserver is active, proceeding ...

successfully enabled system as system replication source site

done.

 

 

Check the status of the primary system

hdbnsutil -sr_state

 

System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~

online: true

mode: primary

operation mode: primary

site id: 1

site name: DC1

 

is source system: true

is secondary/consumer system: false

has secondaries/consumers attached: false

is a takeover active: false

 

Host Mappings:

~~~~~~~~~~~~~~

Site Mappings:

~~~~~~~~~~~~~~

DC1 (primary/)

Tier of DC1: 1

Replication mode of DC1: primary

Operation mode of DC1:

 

done.
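For scripted checks, the "key: value" lines of this output can be parsed instead of read by eye. The following is a minimal sketch that only assumes the output format shown above; the function names are hypothetical.

```shell
# Hypothetical helpers for `hdbnsutil -sr_state` output read from stdin.
# Print the value of the first line whose key matches $1 exactly,
# so "mode" does not match the "operation mode" line.
sr_state_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}

# Succeed only if the local site reports itself as an online primary.
is_online_primary() {
    state=$(cat)
    [ "$(printf '%s\n' "$state" | sr_state_field online)" = "true" ] &&
    [ "$(printf '%s\n' "$state" | sr_state_field mode)" = "primary" ]
}

# Usage as <sid>adm:
#   hdbnsutil -sr_state | is_online_primary && echo "node is an online primary"
```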

 

 

 

 

These are the actions to execute on node2 (secondary).

Stop the database

su - hn1adm

sapcontrol -nr 00 -function StopSystem

 

(SAP HANA2.0 only) Copy the SAP HANA system PKI SSFS_HN1.KEY and SSFS_HN1.DAT files from primary node to secondary node.

scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY root@dbhlinode2:/usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY

 

scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT root@dbhlinode2:/usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT

 

 

 

Log Replication Mode

Description

Synchronous in-memory (default)

Synchronous in memory (mode=syncmem) means the log write is considered as successful, when the log entry has been written to the log volume of the primary and sending the log has been acknowledged by the secondary instance after copying to memory.

When the connection to the secondary system is lost, the primary system continues transaction processing and writes the changes only to the local disk.

Data loss can occur when primary and secondary fail at the same time as long as the secondary system is connected or when a takeover is executed, while the secondary system is disconnected. This option provides better performance because it is not necessary to wait for disk I/O on the secondary instance, but is more vulnerable to data loss.

 

Synchronous

Synchronous (mode=sync) means the log write is considered as successful when the log entry has been written to the log volume of the primary and the secondary instance.

 

When the connection to the secondary system is lost, the primary system continues transaction processing and writes the changes only to the local disk.

 

No data loss occurs in this scenario as long as the secondary system is connected. Data loss can occur, when a takeover is executed while the secondary system is disconnected.

 

Additionally, this replication mode can run with a full sync option. This means that the log write is successful when the log buffer has been written to the log file of the primary and the secondary instance. In addition, when the secondary system is disconnected (for example, because of network failure), the primary system suspends transaction processing until the connection to the secondary system is reestablished. No data loss occurs in this scenario. You can set the full sync option for system replication only with the parameter [system_replication]/enable_full_sync. For more information on how to enable the full sync option, see Enable Full Sync Option for System Replication.

Asynchronous

Asynchronous (mode=async) means the primary system sends redo log buffers to the secondary system asynchronously. The primary system commits a transaction when it has been written to the log file of the primary system and sent to the secondary system through the network. It does not wait for confirmation from the secondary system.

This option provides better performance because it is not necessary to wait for log I/O on the secondary system. Database consistency across all services on the secondary system is guaranteed. However, it is more vulnerable to data loss. Data changes may be lost on takeover.

 

Source: SAP Help Portal, https://help.sap.com/

 

 

As hn1adm, register the secondary:

hdbnsutil -sr_register --remoteHost=dbhlinode1 --remoteInstance=00 --replicationMode=syncmem --name=DC2

 

adding site ...

--operationMode not set; using default from global.ini/[system_replication]/operation_mode: logreplay

nameserver node2:30001 not responding.

collecting information ...

updating local ini files ...

done.

 

Start the DB

sapcontrol -nr 00 -function StartSystem

 

hdbnsutil -sr_state

 

System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~

online: true

mode: syncmem

operation mode: logreplay

site id: 2

site name: DC2

 

is source system: false

is secondary/consumer system: true

has secondaries/consumers attached: false

is a takeover active: false

active primary site: 1

 

primary primarys: node1

Host Mappings:

~~~~~~~~~~~~~~

node2 -> [DC2] node2

node2 -> [DC1] node1

 

Site Mappings:

~~~~~~~~~~~~~~

DC1 (primary/primary)

    |---DC2 (syncmem/logreplay)

 

Tier of DC1: 1

Tier of DC2: 2

 

Replication mode of DC1: primary

Replication mode of DC2: syncmem

Operation mode of DC1: primary

Operation mode of DC2: logreplay

 

Mapping: DC1 -> DC2

done.

 

 

 

It is also possible to get more information on the replication status:

hn1adm@node1: > cdpy

hn1adm@node1: > python systemReplicationStatus.py

| Database | Host       | Port  | Service Name | Volume ID | Site ID | Site Name | Secondary  | Secondary | Secondary | Secondary | Secondary     | Replication | Replication | Replication    |

|          |            |       |              |           |         |           | Host       | Port      | Site ID   | Site Name | Active Status | Mode        | Status      | Status Details |

|          |            |       |              |           |         |          

| SYSTEMDB | dbhlinode2 | 30301 | nameserver   |         1 |       2 | SITE2     | dbhlinode1 |     30301 |         1 | SITE1     | YES           | SYNC        | ACTIVE      |                |

| HN1      | dbhlinode2 | 30307 | xsengine     |         3 |       2 | SITE2     | dbhlinode1 |     30307 |         1 | SITE1     | YES           | SYNC        | ACTIVE      |                |

| NW1      | dbhlinode2 | 30340 | indexserver  |         2 |       2 | SITE2     | dbhlinode1 |     30340 |         1 | SITE1     | YES           | SYNC        | ACTIVE      |                |

| HN1      | dbhlinode2 | 30303 | indexserver  |         2 |       2 | SITE2     | dbhlinode1 |     30303 |         1 | SITE1     | YES           | SYNC        | ACTIVE      |                |

 

status system replication site "1": ACTIVE

overall system replication status: ACTIVE

 

Local System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 

mode: PRIMARY

site id: 2

site name: SITE2

 

 

Network Setup for HANA System Replication

To make sure that the replication traffic uses the right VLAN, it is important to configure it properly in global.ini. If you skip this step, HANA will use the access VLAN for the replication, which is usually not what you want.

Examples

The following examples show the host name resolution configuration for system replication to a secondary site. Three distinct networks can be identified:

  • Public network with addresses in the range of 10.0.1.*
  • Network for internal SAP HANA communication between hosts at each site: 192.168.1.*
  • Dedicated network for system replication: 10.5.1.*

In the first example, the [system_replication_communication]listeninterface parameter has been set to .global and only the hosts of the neighboring replicating site are specified.

In the following example, the [system_replication_communication]listeninterface parameter has been set to .internal and all hosts (of both sites) are specified.

 

RalfKlahr_3-1629719189647.png

 

 

Source: SAP AG, SAP HANA HSR Networking

 

Since the additional VLAN is now available, the node-to-node communication can be set up directly on VLAN 93 with an MTU of 9000, so replication no longer has to go over the user VLAN. Modify the global.ini file to create a dedicated network for system replication; the syntax is as follows:

vi global.ini

 

[system_replication_communication]

listeninterface = .internal

 

[system_replication_hostname_resolution]

10.25.93.51 = dbhlinode1

10.23.93.51 = dbhlinode2
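To confirm that both nodes carry the same replication network settings, the values can be read back from global.ini. The following is a minimal sketch of a small INI reader; the function name is hypothetical, and the file path in the usage comment is an assumption about the usual location of the custom global.ini.

```shell
# Hypothetical helper: print the value of KEY in [SECTION] of an INI file.
# Usage: ini_get FILE SECTION KEY
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        # A section header: remember it with all blanks removed.
        /^[ \t]*\[/ { gsub(/[ \t]/, ""); current = $0; next }
        current == section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$1"
}

# Usage (path is an assumption, adjust it to your installation):
#   ini=/usr/sap/HN1/SYS/global/hdb/custom/config/global.ini
#   ini_get "$ini" system_replication_communication listeninterface
#   ini_get "$ini" system_replication_hostname_resolution 10.25.93.51
```

An empty result means the section or key is missing, in which case HANA falls back to its default network for replication.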

Modify the DNS Server:

Reference and general documentation:

The Domain Name System | Administration Guide | SUSE Linux Enterprise Server 15 SP2

Setting Up IP Relocation via DNS Update | Geo Clustering Guide | SUSE Linux Enterprise High Availability Extension 15 SP2

 

The resolv.conf on the DNS server must point to localhost.

 

dnsvm01:~ # cat /etc/resolv.conf

### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf

### autogenerated by netconfig!

#

# Before you change this file manually, consider to define the

# static DNS configuration using the following variables in the

# /etc/sysconfig/network/config file:

#     NETCONFIG_DNS_STATIC_SEARCHLIST

#     NETCONFIG_DNS_STATIC_SERVERS

#     NETCONFIG_DNS_FORWARDER

# or disable DNS configuration updates via netconfig by setting:

#     NETCONFIG_DNS_POLICY=''

#

# See also the netconfig(8) manual page and other documentation.

#

### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.

search reddog.microsoft.com

search reddog.microsoft.com

nameserver 127.0.0.1

nameserver 172.0.0.10

 

Add the dynamic DNS server to the HLI nodes

dbhlinode1:~ # yast dns edit nameserver2=172.0.0.5

dbhlinode2:~ # yast dns edit nameserver3=172.0.1.9

 

dbhlinode1:~ # yast dns list                    

 

DNS Configuration Summary:

* Hostname: dbhlinode1.hli.azure

* Name Servers: 172.0.0.10, 172.0.0.5

 

 

 

Create a TSIG security key on the DNS server

 

dnsvm01:~ # tsig-keygen -a hmac-md5 hli.azure > ddns_key.txt

 

dnsvm01:~ # cat ddns_key.txt

key "hli.azure" {

        algorithm hmac-md5;

        secret "j2SsKLzBs/AMfzY0bHHqUA==";

};

 

Define the created key in the named.conf

dnsvm01:/var/log # vi /etc/named.conf

key "hli.azure" {

        algorithm hmac-md5;

        secret "j2SsKLzBs/AMfzY0bHHqUA==";

};

 

options {

        directory "/var/lib/named";

        managed-keys-directory "/var/lib/named/dyn/";

        dump-file "/var/log/named_dump.db";

        statistics-file "/var/log/named.stats";

        listen-on-v6 { any; };

        #allow-query { 127.0.0.1; };

        notify no;

        disable-empty-zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";

        include "/etc/named.d/forwarders.conf";

};

zone "." in {

        type hint;

        file "root.hint";

};

zone "localhost" in {

        type master;

        file "localhost.zone";

};

zone "0.0.127.in-addr.arpa" in {

        type master;

        file "127.0.0.zone";

};

zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" in {

        type master;

        file "127.0.0.zone";

};

include "/etc/named.conf.include";

zone "example.com" in {

        file "master/example.com";

        type master;

};

zone "hli.azure" {

        type master;

        file "/var/lib/named/master/hli.azure.hosts";

        };

 

Restart named to see if the changes are okay.

dnsvm01:/var/log # systemctl restart named.service

If no error occurs, the change to named was successful.

 

The next change allows the DNS entry to be updated dynamically by holders of the key.

 

zone "hli.azure" {

        type master;

        file "/var/lib/named/master/hli.azure.hosts";

        allow-update { key hli.azure; };

        };

 

Restart named to check the change.

dnsvm01:/var/log # systemctl restart named.service

If no error occurs, the change to named was successful.

 

The configuration on the DNS server is done.

Now log in to the DB server and check the current DNS entry.

dbhlinode1:~ # nslookup hanavip.hli.azure

Server:         172.0.0.10

Address:        172.0.0.10#53

Non-authoritative answer:

Name:   hanavip.hli.azure

Address: 192.168.0.32

 

dbhlinode1:~ # host hanavip.hli.azure

hanavip.hli.azure has address 192.168.0.32

 

Now let's try to change the DNS entry.

dbhlinode1:~ # nsupdate

> server dnsvm01

> key hli.azure j2SsKLzBs/AMfzY0bHHqUA==

> zone hli.azure

> update delete hanavip.hli.azure A

> update add hanavip.hli.azure 5 A 192.168.0.50

> send

> ^C
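The same update can also be sent non-interactively, which keeps the secret off the command line and out of the terminal history. The following is a minimal sketch that prints the nsupdate commands used above so they can be piped into `nsupdate -k <keyfile>`; the function name is hypothetical, and the key file path in the usage comment refers to the file created on the nodes later in this document.

```shell
# Hypothetical helper: print an nsupdate batch that replaces the A record of NAME.
# Usage: ddns_batch SERVER ZONE NAME TTL IP
ddns_batch() {
    server=$1 zone=$2 name=$3 ttl=$4 ip=$5
    printf 'server %s\n' "$server"
    printf 'zone %s\n' "$zone"
    printf 'update delete %s A\n' "$name"
    printf 'update add %s %s A %s\n' "$name" "$ttl" "$ip"
    printf 'send\n'
}

# Usage: nsupdate reads the batch from stdin and the TSIG key from the file.
#   ddns_batch dnsvm01 hli.azure hanavip.hli.azure 5 192.168.0.50 |
#       nsupdate -k /etc/ddns/DNS_update_05.key
```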

Check the result.

dbhlinode1:~ # nslookup hanavip.hli.azure

Server:         172.0.0.10

Address:        172.0.0.10#53

Non-authoritative answer:

Name:   hanavip.hli.azure

Address: 192.168.1.32

 

It seems to work pretty well.

 

Create the DNS key files on each node.

dbhlinode1:~ # mkdir /etc/ddns

dbhlinode2:~ # mkdir /etc/ddns

dbhlinode1:~ # vi /etc/ddns/DNS_update_05.key

key "hli.azure" {

        algorithm hmac-md5;

        secret "j2SsKLzBs/AMfzY0bHHqUA==";

};

 

dbhlinode2:~ # vi /etc/ddns/DNS_update_06.key

key "hli.azure" {

        algorithm hmac-md5;

        secret "YKJCN8/x/r0zzJOkpOaikA==";

};

dbhlinode1:~ # chmod 600 /etc/ddns/DNS_update*.key

dbhlinode1:~ # scp /etc/ddns/DNS* dbhlinode2:/etc/ddns/

OS Preparation for the cluster installation

Install the Cluster packages (on all nodes)

dbhlinode1:~ # zypper in -t pattern ha_sles

 

S  | Name                | Summary           | Type

---+---------------------+-------------------+--------

i  | ha_sles             | High Availability | pattern

i+ | patterns-ha-ha_sles | High Availability | package

 

dbhlinode1:~ # zypper in SAPHanaSR SAPHanaSR-doc

 

S  | Name         | Summary                        | Type

---+--------------+---------------------------------------------------

i+ | SAPHanaSR    | Resource agents to control the HANA database in system

i+ | SAPHanaSR-doc| Setup Guide for SAPHanaSR                                               

 

 

Create and exchange the SSH keys

[dbhlinode1~]# ssh-keygen -t rsa -b 1024

[dbhlinode2~]# ssh-keygen -t rsa -b 1024

 

[dbhlinode1~]# ssh-copy-id -i /root/.ssh/id_rsa.pub dbhlinode1

[dbhlinode1~]# ssh-copy-id -i /root/.ssh/id_rsa.pub dbhlinode2

[dbhlinode2~]# ssh-copy-id -i /root/.ssh/id_rsa.pub dbhlinode1

[dbhlinode2~]# ssh-copy-id -i /root/.ssh/id_rsa.pub dbhlinode2

 

Disable selinux on both nodes

[dbhlinode1~]# vi /etc/selinux/config

...

SELINUX=disabled

 

[dbhlinode2 ~]# vi /etc/selinux/config

...

SELINUX=disabled

 

Reboot the servers

[dbhlinode1~]# sestatus

SELinux status:                disabled

 

Configure NTP

It is a very bad idea to have a different time or time zone on the cluster nodes.

Therefore, it is absolutely mandatory to configure NTP!

vi /etc/chrony.conf

# Use public servers from the pool.ntp.org project.

# Please consider joining the pool (http://www.pool.ntp.org/join.html).

server 0.rhel.pool.ntp.org iburst

 

systemctl enable chronyd

systemctl start chronyd

 

chronyc tracking

Reference ID    : CC0BC90A (voipmonitor.wci.com)

Stratum         : 3

Ref time (UTC)  : Thu Jan 28 18:46:10 2021

 

chronyc sources

210 Number of sources = 8

MS Name/IP address         Stratum Poll Reach LastRx Last sample

===============================================================================

^+ time.nullroutenetworks.c>     2  10   377  1007  -2241us[-2238us] +/-   33ms

^* voipmonitor.wci.com           2  10   377    47   +956us[ +958us] +/-   15ms

^- tick.srs1.ntfo.org            3  10   177   801  -3429us[-3427us] +/-  100ms

 

Update the System

First install the latest updates on the system before we start to install the SBD device.

 

dbhlinode1:~ # zypper up


Install OpenIPMI and ipmitool on all nodes

node1:~ # zypper install OpenIPMI ipmitool

 

The default Linux watchdog installed during the installation is the iTCO watchdog, which is not supported by UCS and HPE SDFlex systems. Therefore, this watchdog must be disabled.

 

 

The wrong watchdog is installed and loaded on the system:

dbhlinode1:~ # lsmod |grep iTCO

iTCO_wdt               13480  0

iTCO_vendor_support    13718  1 iTCO_wdt

Unload the wrong driver from the environment:

dbhlinode1:~ # modprobe -r iTCO_wdt iTCO_vendor_support

dbhlinode2:~ # modprobe -r iTCO_wdt iTCO_vendor_support
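`modprobe -r` only unloads the modules until the next boot. A common way to keep them from being loaded again automatically is a modprobe.d blacklist file. The original text does not show this step, so the following is a sketch under that assumption; the function name and the file name are hypothetical.

```shell
# Hypothetical helper: print one modprobe.d "blacklist" line per module name.
blacklist_lines() {
    for module in "$@"; do
        printf 'blacklist %s\n' "$module"
    done
}

# Usage as root on both nodes (the file name is an arbitrary choice):
#   blacklist_lines iTCO_wdt iTCO_vendor_support > /etc/modprobe.d/50-blacklist-itco.conf
```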

 

Implementing the Python Hook SAPHanaSR

This step must be done on both sites. SAP HANA must be stopped to change the global.ini and allow SAP HANA to integrate the HA/DR hook script during start.

-Install the HA/DR hook script into a read/writable directory

-Integrate the hook into global.ini (SAP HANA needs to be stopped for doing that offline)

-Check integration of the hook during start-up

Use the hook from the SAPHanaSR package (available since version 0.153). Optionally copy it to your preferred directory like /hana/share/myHooks. The hook must be available on all SAP HANA cluster nodes.

 

Stop the HANA system – both systems

sapcontrol -nr <instanceNumber> -function StopSystem

Add the HA/DR provider section to the global.ini (on both nodes)

[ha_dr_provider_SAPHanaSR]

provider = SAPHanaSR

path = /usr/share/SAPHanaSR

execution_order = 1

 

[trace]

ha_dr_saphanasr = info

 

 

 

Allowing <sidadm> to access the Cluster – on both nodes

The current version of the SAPHanaSR python hook uses the command sudo to allow the <sidadm> user to access the cluster attributes. In Linux you can use visudo to start the vi editor for the /etc/sudoers configuration file.

##

## User privilege specification

##

root ALL=(ALL) ALL

# SAPHanaSR-ScaleUp entries for writing srHook cluster attribute

hn1adm ALL=(ALL) NOPASSWD: /usr/sbin/crm_attribute -n hana_hn1_site_srHook_*

 

Copy the sudoers file to the second node

dbhlinode1:~ # scp /etc/sudoers dbhlinode2:/etc/
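A syntax error in /etc/sudoers can lock out sudo entirely, so it can help to generate the rule and check it before it is installed or copied. The following is a minimal sketch; the function name and the temporary file are hypothetical, and the rule text is the one shown above with the SID filled in.

```shell
# Hypothetical helper: print the srHook sudoers rule for a given SAP system ID.
# The SID is lowercased because both the <sid>adm user and the attribute name use lowercase.
srhook_sudoers_line() {
    sid=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf '%sadm ALL=(ALL) NOPASSWD: /usr/sbin/crm_attribute -n hana_%s_site_srHook_*\n' "$sid" "$sid"
}

# Usage as root:
#   srhook_sudoers_line HN1 > /tmp/saphanasr.sudoers
#   visudo -c -f /tmp/saphanasr.sudoers    # syntax check only, nothing is installed
```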

IPMI Watchdog

 

Test if the ipmi service is started. It is important that the IPMI Timer is not running. The timer management will be done from the SBD pacemaker service.

dbhlinode2:~ # ipmitool mc watchdog get

Watchdog Timer Use:     BIOS FRB2 (0x01)

Watchdog Timer Is:      Stopped

Watchdog Timer Logging: On

Watchdog Timer Action:  No action (0x00)

Pre-timeout interrupt:  None

Pre-timeout interval:   0 seconds

Timer Expiration Flags: None (0x00)

Initial Countdown:      0.0 sec

Present Countdown:      0.0 sec

 

Note: If this command reports an error, IPMI might not be enabled in the BMC.

BMC Config on the HPE SDFlex

!!! HPE Start -->

Per the Superdome Flex administration guide (https://support.hpe.com/hpesc/public/docDisplay?docId=a00038167en_us):

 

All IPMI watchdog settings are done through the o.s., with the exception of 2 commands in the RMC command line:

 

show ipmi_watchdog

This shows the current status of the IPMI watchdog.

 

set ipmi_watchdog (disabled | os_managed)

This either disables the ipmi_watchdog status in the RMC or allows the o.s. to manage it

 

From the hardware side, our recommendation is to go into the RMC command line and run the command:

show ipmi_watchdog

 

If it shows as disabled, turn it on using the command:

set ipmi_watchdog os_managed

 

After making the changes, you may need to reboot the o.s. for it to detect the change.

 

Also confirm that the HPE foundation software for the Superdome Flex is installed in Linux; it is mandatory and can cause problems if it is not installed.

It can be confirmed with the command:

# rpm -qa | grep foundation

<-- HPE Stop !!!

 

 

 

By default, the required device /dev/watchdog is not created.

No watchdog device was created:

dbhlinode1:~ # ls -l /dev/watchdog

ls: cannot access /dev/watchdog: No such file or directory

 

Configure the IPMI watchdog

dbhlinode1:~ # mv /etc/sysconfig/ipmi /etc/sysconfig/ipmi.org

dbhlinode1:~ # vi /etc/sysconfig/ipmi

IPMI_SI=yes

DEV_IPMI=yes

IPMI_WATCHDOG=yes

IPMI_WATCHDOG_OPTIONS="timeout=20 action=reset nowayout=0 panic_wdt_timeout=15"

IPMI_POWEROFF=no

IPMI_POWERCYCLE=no

IPMI_IMB=no

 

dbhlinode1:~ # scp /etc/sysconfig/ipmi dbhlinode2:/etc/sysconfig/ipmi

 

Enable and start the ipmi service.

[dbhlinode1~]# systemctl enable ipmi

Created symlink from /etc/systemd/system/multi-user.target.wants/ipmi.service to /usr/lib/systemd/system/ipmi.service.

 

[dbhlinode1~]# systemctl start ipmi

 

[dbhlinode2 ~]# systemctl enable ipmi

Created symlink from /etc/systemd/system/multi-user.target.wants/ipmi.service to /usr/lib/systemd/system/ipmi.service.

 

[dbhlinode2 ~]# systemctl start ipmi

 

 

 

Now the IPMI service is started and the device /dev/watchdog is created, but the timer is still stopped. Later, SBD will manage the watchdog reset and enable the IPMI timer.

Check that /dev/watchdog exists but is not in use.

dbhlinode2:~ # ipmitool mc watchdog get

Watchdog Timer Use:     SMS/OS (0x04)

Watchdog Timer Is:      Stopped

Watchdog Timer Logging: On

Watchdog Timer Action:  No action (0x00)

Pre-timeout interrupt:  None

Pre-timeout interval:   0 seconds

Timer Expiration Flags: None (0x00)

Initial Countdown:      20.0 sec

Present Countdown:      20.0 sec

 

The /dev/watchdog device must exist now, but nothing is managing it yet.

[dbhlinode1~]# ls -l /dev/watchdog

crw------- 1 root root 10, 130 Nov 28 23:12 /dev/watchdog

 

[dbhlinode1~]# lsof /dev/watchdog

[dbhlinode1~]#
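The watchdog checks above can be combined into one scripted test before SBD is configured. The following is a minimal sketch that only assumes the `ipmitool mc watchdog get` output format shown above; the function name is hypothetical.

```shell
# Hypothetical helper: read `ipmitool mc watchdog get` output on stdin
# and print the value of the "Watchdog Timer Is:" line (for example "Stopped").
watchdog_timer_state() {
    sed -n 's/^Watchdog Timer Is:[[:space:]]*//p'
}

# Usage as root: the device must exist and the timer must still be stopped.
#   [ -c /dev/watchdog ] &&
#       [ "$(ipmitool mc watchdog get | watchdog_timer_state)" = "Stopped" ] &&
#       echo "watchdog is ready for SBD"
```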

SBD configuration

Make sure the iSCSI or FC disk is visible on both nodes. This example uses three iSCSI-based SBD devices.

The LUN ID must be identical on all nodes!

Check the iSCSI SBD disks. You must see the identical LUNs on both systems.

 

List the available iSCSI LUNs on the storage systems

dbhlinode2:~ # systemctl start iscsid

 

dbhlinode2:~ # iscsiadm -m discovery -t sendtargets -p 10.23.93.41

10.23.93.41:3260,1474 iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34

10.23.93.42:3260,1476 iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34

10.23.93.32:3260,1475 iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34

10.23.93.31:3260,1473 iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34



dbhlinode2:~ # iscsiadm -m discovery -t sendtargets -p 10.25.93.41

10.25.93.41:3260,1355 iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29

10.25.93.42:3260,1357 iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29

10.25.93.32:3260,1356 iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29

10.25.93.31:3260,1354 iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29

 

dbhlinode2:~ # iscsiadm -m discovery -t sendtargets -p 172.0.1.9

172.0.1.9:3260,1 iqn.2006-04.dbnw1.local:dbnw1

 

Now login to the iSCSI target

dbhlinode2:~ # iscsiadm -m node -T iqn.2006-04.dbnw1.local:dbnw1 -p 172.0.1.9:3260 -l

dbhlinode2:~ # iscsiadm -m node -T iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29 -p 10.25.93.41 -l

dbhlinode2:~ # iscsiadm -m node -T iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34 -p 10.23.93.41 -l

 

Now persist the iSCSI login in the boot.local file so that the LUNs are visible after a reboot of the OS.

dbhlinode2:~ # vi /etc/init.d/boot.local

#! /bin/sh

#

# Copyright (c) 2019 SuSE Linux AG Nuernberg, Germany.  All rights reserved.

#

echo 0 > /sys/kernel/mm/ksm/run

cpupower frequency-set -g performance

cpupower set -b 0

# Start the iSCSI login

iscsiadm -m node -T iqn.2006-04.dbnw1.local:dbnw1 -p 172.0.1.9:3260 --login

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.a4bf93c0925211eb940300a098f7d6a5:vs.29 -p 10.25.93.41 -l

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.69fd49d5925911eb8cfa00a098e0061d:vs.34 -p 10.23.93.41 -l

 

 

Copy the boot.local to the other node

dbhlinode2:~ # scp /etc/init.d/boot.local dbhlinode1:/etc/init.d/boot.local

 

List the new block devices (in this case two from NetApp and one from the VM).

dbhlinode2:~ # lsblk

...

sde                                           8:64   0   50M  0 disk 

└─360014058196174ae0bd473588adbd050         254:3    0   50M  0 mpath

sdf                                           8:80   0  178M  0 disk 

└─3600a09803830476a49244e594a66576f         254:4    0  178M  0 mpath

sdg                                           8:96   0  178M  0 disk 

└─3600a09803830472f332b4d6a4732377a         254:5    0  178M  0 mpath

 

dbhlinode2:~ # multipath -ll

3600a09803830476a49244e594a66576f dm-4 NETAPP,LUN C-Mode

size=178M features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw

`-+- policy='service-time 0' prio=50 status=active

  `- 5:0:0:1 sdf 8:80 active ready running

3600a09803830472f332b4d6a4732377a dm-5 NETAPP,LUN C-Mode

size=178M features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw

`-+- policy='service-time 0' prio=50 status=active

  `- 6:0:0:1 sdg 8:96 active ready running
3600a09803830472f332b4d6a47323778 dm-0 NETAPP,LUN C-Mode

size=50G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw

|-+- policy='service-time 0' prio=50 status=active

| |- 1:0:3:0 sdb 8:16 active ready running

| `- 2:0:0:0 sdc 8:32 active ready running

`-+- policy='service-time 0' prio=10 status=enabled

  |- 1:0:1:0 sda 8:0  active ready running

  `- 2:0:2:0 sdd 8:48 active ready running
360014058196174ae0bd473588adbd050 dm-4 LIO-ORG,sbddbnw1

size=50M features='0' hwhandler='1 alua' wp=rw

`-+- policy='service-time 0' prio=0 status=active

  `- 4:0:0:0 sdi 8:128 active i/o pending running
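Since the SBD LUNs must be the same on both nodes, the WWIDs can be extracted from the `multipath -ll` output and compared. The following is a minimal sketch, assuming multipath maps are listed by WWID without user-friendly aliases, as in the output above; the function name and the temporary files are hypothetical.

```shell
# Hypothetical helper: read `multipath -ll` output on stdin and print the sorted WWIDs.
# Map header lines look like: <wwid> dm-<n> <vendor>,<product>
multipath_wwids() {
    awk '$2 ~ /^dm-[0-9]+$/ { print $1 }' | sort
}

# Usage on dbhlinode2: list the WWIDs that both nodes can see.
# The three SBD devices should all appear in the result.
#   multipath -ll | multipath_wwids > /tmp/wwids.local
#   ssh dbhlinode1 multipath -ll | multipath_wwids > /tmp/wwids.remote
#   comm -12 /tmp/wwids.local /tmp/wwids.remote
```

`comm -12` is used instead of `diff` because each node can also have LUNs of its own, such as the boot LUN, that are not shared.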

 

 

 

Cluster initialization

Setup the cluster user password (all nodes)

passwd hacluster

 

Stop and disable the firewall (all nodes)

systemctl disable firewalld

systemctl mask firewalld

systemctl stop firewalld

 

 

Initial Cluster Setup Using ha-cluster-init

We do not create a vIP now… this comes later

dbhlinode1:# ha-cluster-init -u -s "/dev/mapper/360014058196174ae0bd473588adbd050;/dev/mapper/3600a09803830476a49244e594a66576f;/dev/mapper/3600a09803830472f332b4d6a4732377a"

  Configure Corosync (unicast):

  This will configure the cluster messaging layer.  You will need

  to specify a network address over which to communicate (default

  is vlan90's network, but you can use the network address of any

  active interface).

 

  Address for ring0 [192.168.0.32]

  Port for ring0 [5405]

  Initializing SBD......done

  Hawk cluster interface is now running. To see cluster status, open:

    https://192.168.0.32:7630/

  Log in with username 'hacluster'

  Waiting for cluster..............done

  Loading initial cluster configuration

 

Configure Administration IP Address:

  Optionally configure an administration virtual IP

  address. The purpose of this IP address is to

  provide a single IP that can be used to interact

  with the cluster, rather than using the IP address

  of any specific cluster node.

 

Do you wish to configure a virtual IP address (y/n)? n

  Done (log saved to /var/log/crmsh/ha-cluster-bootstrap.log)

Add the second node to the cluster:

dbhlinode2:~ # ha-cluster-join

  Join This Node to Cluster:

  You will be asked for the IP address of an existing node, from which

  configuration will be copied.  If you have not already configured

  passwordless ssh between nodes, you will be prompted for the root

  password of the existing node.

IP address or hostname of existing node (e.g.: 192.168.1.1) []dbhlinode1

User hacluster will be changed the login shell as /bin/bash, and

be setted up authorized ssh access among cluster nodes

Continue (y/n)? y

  Generating SSH key for hacluster

  Configuring SSH passwordless with hacluster@dbhlinode1

  Configuring csync2...done

  Merging known_hosts

  Probing for new partitions...done

  Address for ring0 [192.168.1.32]

  Hawk cluster interface is now running. To see cluster status, open:

    https://192.168.1.32:7630/

  Log in with username 'hacluster'

  Waiting for cluster.....done

  Reloading cluster configuration...done

  Done (log saved to /var/log/crmsh/ha-cluster-bootstrap.log)

 

The SBD config file /etc/sysconfig/sbd

Adjust the timeouts to more realistic values.

SBD_PACEMAKER=yes

SBD_STARTMODE=always

SBD_DELAY_START=no

SBD_WATCHDOG_DEV=/dev/watchdog

SBD_WATCHDOG_TIMEOUT=30

SBD_TIMEOUT_ACTION=flush,reboot

SBD_MOVE_TO_ROOT_CGROUP=auto

SBD_DEVICE="/dev/mapper/360014058196174ae0bd473588adbd050;/dev/mapper/3600a09803830476a49244e594a66576f;/dev/mapper/3600a09803830472f332b4d6a4732377a"

 

Now copy it over to node 2

dbhlinode1:~ # scp /etc/sysconfig/sbd dbhlinode2:/etc/sysconfig/
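When adjusting the timeouts, the dependent values have to follow the watchdog timeout. The sketch below assumes the commonly documented rule of thumb for SBD (msgwait is at least twice the watchdog timeout, and the cluster's stonith-timeout is msgwait plus 20 %). The function name sbd_timeouts is hypothetical; it only does the arithmetic and changes nothing on the system.

```shell
#!/bin/sh
# sbd_timeouts: derive dependent timeouts (seconds) from the watchdog timeout.
# Assumed rule of thumb: msgwait = 2 x watchdog, stonith-timeout = msgwait + 20 %.
sbd_timeouts() {
    watchdog=$1
    msgwait=$((watchdog * 2))
    # integer arithmetic, rounded up so the result is never below msgwait * 1.2
    stonith=$(( (msgwait * 12 + 9) / 10 ))
    echo "watchdog=$watchdog msgwait=$msgwait stonith-timeout=$stonith"
}

# Example with the SBD_WATCHDOG_TIMEOUT value from the config file above
sbd_timeouts 30
```

For disk-based SBD, the timeouts that were written to the device header when it was initialized can be displayed with sbd -d <device> dump, which is a good cross-check against the values in /etc/sysconfig/sbd.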

The corosync config is created as a multicast config by default. Many switches have issues with this, so we switch it over to unicast with dedicated IP connections. For more reliability we add a second cluster ring as well.

For the ring networks I prefer the “user” and the “iSCSI” VLAN. Here we need an additional route for the iSCSI VLAN.

dbhlinode1:~ # vi /etc/corosync/corosync.conf

# Please read the corosync.conf.5 manual page

totem {

        version: 2

        secauth: on

        crypto_hash: sha1

        crypto_cipher: aes256

        cluster_name: hacluster

        clear_node_high_bit: yes

        token: 5000

        token_retransmits_before_loss_const: 10

        join: 60

        consensus: 6000

        max_messages: 20

        transport: udpu

}

logging {

        fileline: off

        to_stderr: no

        to_logfile: no

        logfile: /var/log/cluster/corosync.log

        to_syslog: yes

        debug: off

        timestamp: on

        logger_subsys {

                subsys: QUORUM

                debug: off

        }

}

 

nodelist {

        node {

                ring0_addr: dbhlinode1

                nodeid: 1

        }

        node {

                ring0_addr: dbhlinode2

                nodeid: 2

        }

#       node {

#               ring1_addr: 10.25.91.51

#               nodeid: 3

#       }

#       node {

#               ring1_addr: 10.23.91.51

#               nodeid: 4

#       }

}

quorum {

        provider: corosync_votequorum

        expected_votes: 2

        two_node: 1

}

 

Copy the corosync.conf over to the secondary node.

dbhlinode1:~ # scp /etc/corosync/corosync.conf dbhlinode2:/etc/corosync/corosync.conf
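The token and consensus values in the totem section depend on each other: according to the corosync.conf(5) manual page, consensus must be at least 1.2 times the token timeout. A small sketch to check this before copying the file around; the function name check_totem_timing is made up for this example.

```shell
#!/bin/sh
# check_totem_timing FILE: print "ok" if consensus >= 1.2 * token,
# "too-low" if not, and "missing" if one of the two values is not set.
check_totem_timing() {
    awk '
        $1 == "token:"     { token = $2 }
        $1 == "consensus:" { consensus = $2 }
        END {
            if (token == "" || consensus == "") { print "missing"; exit }
            # compare with integers only: consensus * 10 >= token * 12
            result = (consensus * 10 >= token * 12) ? "ok" : "too-low"
            print result
        }
    ' "$1"
}
```

With the values used here (token: 5000, consensus: 6000), check_totem_timing /etc/corosync/corosync.conf should print ok.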

 

 

Check the new cluster status, which now shows one resource

crm_mon -r 1

Stack: corosync

Current DC: dbhlinode1 (version 2.0.1+20190417.13d370ca9-3.15.1-2.0.1+20190417.13d370ca9) - partition with quorum

Last updated: Mon Apr 19 03:50:23 2021

Last change: Fri Apr 16 12:27:01 2021 by hacluster via crmd on dbhlinode1

 

2 nodes configured

1 resource configured

 

Online: [ dbhlinode1 dbhlinode2 ]

Full list of resources:

 

stonith-sbd     (stonith:external/sbd): Started dbhlinode1

 

After the cluster is started and the SBD is active, you will notice that the IPMI watchdog is active and that /dev/watchdog is held open by a process.

dbhlinode1:~ # ipmitool mc watchdog get

Watchdog Timer Use:     SMS/OS (0x44)

Watchdog Timer Is:      Started/Running

Watchdog Timer Logging: On

Watchdog Timer Action:  Hard Reset (0x01)

Pre-timeout interrupt:  None

Pre-timeout interval:   0 seconds

Timer Expiration Flags: (0x10)

Initial Countdown:      5.0 sec

Present Countdown:      4.9 sec

 

dbhlinode1:~ # lsof /dev/watchdog

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME

sbd     34667 root    4w   CHR 10,130      0t0 92567 /dev/watchdog

 

Now the IPMI timer must run and the /dev/watchdog device must be opened by sbd.
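If you want to check the watchdog state from a script instead of reading the output by eye, a sketch like the following could be used. It only looks for the "Started/Running" state line shown above; the function name watchdog_running is an assumption of this example.

```shell
#!/bin/sh
# watchdog_running: read "ipmitool mc watchdog get" output on stdin and
# return success only if the timer is reported as Started/Running.
watchdog_running() {
    grep -Eq '^Watchdog Timer Is:[[:space:]]+Started/Running'
}
```

A possible call on a node would be: ipmitool mc watchdog get | watchdog_running && echo "watchdog armed".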

Check the SBD status

sbd -d /dev/mapper/360014058196174ae0bd473588adbd050 list

0       dbhlinode1       clear

1       dbhlinode2       clear
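Both slots should show "clear". As a sketch, the list output can be filtered so that only slots with a pending message are printed; the helper name sbd_pending_slots is hypothetical and it assumes the three-column layout shown above (slot, node name, message).

```shell
#!/bin/sh
# sbd_pending_slots: read "sbd -d <device> list" output on stdin and print
# "<node>: <message>" for every slot whose message is not "clear".
sbd_pending_slots() {
    awk 'NF >= 3 && $3 != "clear" { print $2 ": " $3 }'
}
```

Usage: sbd -d /dev/mapper/360014058196174ae0bd473588adbd050 list | sbd_pending_slots. An empty result means nothing is pending. A leftover message can be reset with sbd -d <device> message <node> clear.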

 

Test the SBD fencing by crashing the kernel

Trigger the Kernel Crash → The secondary must now reboot the primary. You can monitor this by checking crm_mon -r.

echo c > /proc/sysrq-trigger

 

The system must reboot after 5 minutes (BMC timeout), or after the value that is set as panic_wdt_timeout in the /etc/sysconfig/ipmi config file.

Second test to run is to fence a node using CRM commands.

dbhlinode1:~ # crm node fence dbhlinode2

Fencing dbhlinode2 will shut down the node and migrate any resources that are running on it! Do you want to fence dbhlinode2 (y/n)? y

 

Okay, now we have set up the cluster with active SBD / IPMI fencing.

Next we have to create the configuration for HANA.

HANA Integration into the Cluster

Display the attributes from the cluster before we integrate HANA

dbhlinode1:~ # SAPHanaSR-showAttr

Global cib-time                

--------------------------------

global Fri Apr 16 12:27:01 2021

Hosts      node_state

----------------------

dbhlinode1 online    

dbhlinode2 online

 

Set up the HANA topology

crm config edit

 

primitive rsc_SAPHanaTopology_HN1_HDB03 ocf:suse:SAPHanaTopology \

        operations $id=rsc_sap2_HN1_HDB03-operations \

        op monitor interval=10 timeout=600 \

        op start interval=0 timeout=600 \

        op stop interval=0 timeout=300 \

        params SID=HN1 InstanceNumber=03

 

Now the HANA instance:

primitive rsc_SAPHana_HN1_HDB03 ocf:suse:SAPHana \

        operations $id=rsc_sap_HN1_HDB03-operations \

        op start interval=0 timeout=3600 \

        op stop interval=0 timeout=3600 \

        op promote interval=0 timeout=3600 \

        op monitor interval=60 role=Master timeout=700 \

        op monitor interval=61 role=Slave timeout=700 \

        params SID=HN1 InstanceNumber=03 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true

 

Finally, configure the master/slave and clone resources:

ms msl_SAPHana_HN1_HDB03 rsc_SAPHana_HN1_HDB03 \

        meta notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true maintenance=false

clone cln_SAPHanaTopology_HN1_HDB03 rsc_SAPHanaTopology_HN1_HDB03 \

        meta clone-node-max=1 target-role=Started interleave=true

After the HANA config it should look like:

dbhlinode1:/etc/ddns # SAPHanaSR-showAttr

Global cib-time                

--------------------------------

global Thu Apr 22 10:58:29 2021

 

Resource              maintenance

----------------------------------

grp_ip_HN1_dbhlinode1 false      

grp_ip_HN1_dbhlinode2 false      

msl_SAPHana_HN1_HDB03 false      

 

Sites srHook

-------------

SITE1 PRIM  

SITE2 SOK   

 

Hosts      clone_state lpa_hn1_lpt node_state op_mode   remoteHost roles                            score site  srmode sync_state version                vhost     

--------------------------------------------------------------------------------------------------------------------------------------------------------------------

dbhlinode1 PROMOTED    1619103509  online     logreplay dbhlinode2 4:P:master1:master:worker:master 150   SITE1 sync   PRIM       2.00.052.00.1599235305 dbhlinode1

dbhlinode2 DEMOTED     30          online     logreplay dbhlinode1 4:S:master1:master:worker:master 100   SITE2 sync   SOK        2.00.052.00.1599235305 dbhlinode2
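The Hosts table is wide, so for a quick check it can be handy to pull out a single column, for example sync_state. The following is a rough sketch: hosts_column is a made-up helper, and it assumes that every host has a value in every column (an empty attribute would shift the columns).

```shell
#!/bin/sh
# hosts_column NAME: read SAPHanaSR-showAttr output on stdin and print
# "<host> <value>" for the column NAME of the Hosts table.
hosts_column() {
    awk -v col="$1" '
        $1 == "Hosts" {
            # remember the position of the requested column in the header line
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            next
        }
        # data rows of the Hosts table; skip the dashed separator line
        idx && NF >= idx && $1 !~ /^-+$/ { print $1, $idx }
    '
}
```

Called as SAPHanaSR-showAttr | hosts_column sync_state, this should list PRIM for the primary and SOK for a secondary that is in sync.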

 

 

Configure the dynamic DNS and the vIPs

Create the dDNS entries in the cluster (two per node, one for each DNS server)

crm config edit

 

primitive pri_dnsupdate1_dbhlinode1 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.0.50 ttl=5 keyfile="/etc/ddns/DNS_update_05.key" server=172.0.0.5 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0 \

        meta target-role=Started

 

primitive pri_dnsupdate2_dbhlinode1 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.0.50 ttl=5 keyfile="/etc/ddns/DNS_update_06.key" server=172.0.0.6 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0 \

        meta target-role=Started

 

primitive pri_dnsupdate1_dbhlinode2 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.1.50 ttl=5 keyfile="/etc/ddns/DNS_update_05.key" server=172.0.0.5 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0 \

        meta target-role=Started

 

primitive pri_dnsupdate2_dbhlinode2 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.1.50 ttl=5 keyfile="/etc/ddns/DNS_update_06.key" server=172.0.0.6 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0 \

        meta target-role=Started

 

 

 

 

Create the two vIPs

primitive rsc_ip_HN1_dbhlinode1 IPaddr2 \

        op monitor interval=10s timeout=20s \

        params ip=192.168.0.50 cidr_netmask=26 \

        meta maintenance=false

primitive rsc_ip_HN1_dbhlinode2 IPaddr2 \

        op monitor interval=10s timeout=20s \

        params ip=192.168.1.50 cidr_netmask=26 \

        meta is-managed=true

 

Now we configure the groups and the location constraints for the vIP and dDNS update

group grp_ip_HN1_dbhlinode1 rsc_ip_HN1_dbhlinode1 pri_dnsupdate1_dbhlinode1  pri_dnsupdate2_dbhlinode1 \

        meta resource-stickiness=1

 

location loc_ip_dbhlinode1_not_on_dbhlinode2 grp_ip_HN1_dbhlinode1 -inf: dbhlinode2

 

location loc_ip_on_primary_dbhlinode1 grp_ip_HN1_dbhlinode1 \

        rule -inf: hana_hn1_roles ne 4:P:master1:master:worker:master

 

group grp_ip_HN1_dbhlinode2 rsc_ip_HN1_dbhlinode2 pri_dnsupdate1_dbhlinode2 pri_dnsupdate2_dbhlinode2 \

        meta resource-stickiness=1

 

location loc_ip_dbhlinode2_not_on_dbhlinode1 grp_ip_HN1_dbhlinode2 -inf: dbhlinode1

 

location loc_ip_on_primary_dbhlinode2 grp_ip_HN1_dbhlinode2 \

        rule -inf: hana_hn1_roles ne 4:P:master1:master:worker:master
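The location rules compare the node attribute hana_hn1_roles with a fixed string. To make that string easier to read, the sketch below splits it into its colon-separated fields. The field labels are my reading of the SAPHanaSR attribute (landscape status code, site role P or S, then configured and actual roles of name server and index server) and should be treated as an assumption; describe_roles is a hypothetical helper.

```shell
#!/bin/sh
# describe_roles VALUE: split a hana_<sid>_roles value on ":" and label the fields.
describe_roles() {
    roles=$1
    old_ifs=$IFS
    IFS=:
    # word splitting on ":" fills the positional parameters $1..$6
    set -- $roles
    IFS=$old_ifs
    echo "status=$1 site_role=$2 ns_configured=$3 ns_actual=$4 is_configured=$5 is_actual=$6"
}

describe_roles 4:P:master1:master:worker:master
```

With this reading, the rule keeps each vIP group away from any node that is not a healthy primary (status 4, site role P).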

 

 

 

Test the Cluster

Test the DNS server by querying the vIP hostname

Initially the DNS entry points to the vIP of dbhlinode1

dbhlinode2:~ # dig -p 53 @172.0.0.5 hanavip.hli.azure. A +short 2>/dev/null

192.168.0.50

 

Then we stop the primitive and the entry gets deleted from the DNS server

dbhlinode2:~ # dig -p 53 @172.0.0.5 hanavip.hli.azure. A +short 2>/dev/null

 

Now we switch over to the second DB server

dbhlinode2:~ # dig -p 53 @172.0.0.5 hanavip.hli.azure. A +short 2>/dev/null

192.168.1.50
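During a switch-over test it is convenient to wait for the DNS entry to change instead of repeating the dig call by hand. A minimal polling sketch; wait_for_output is a generic helper invented for this example and takes the expected output, the number of tries, the delay in seconds and the command to run.

```shell
#!/bin/sh
# wait_for_output EXPECTED TRIES DELAY COMMAND [ARGS...]
# Re-run COMMAND until its output equals EXPECTED; fail after TRIES attempts.
wait_for_output() {
    expected=$1
    tries=$2
    delay=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

For the test above this could be called as: wait_for_output 192.168.1.50 30 2 dig -p 53 @172.0.0.5 hanavip.hli.azure. A +short. With ttl=5 on the record, the new address should show up within a few polls once the cluster has moved the group.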

 

To test the cluster, first use the cluster commands.

Switch the resource to the secondary node

dbhlinode1:~ # crm resource move msl_SAPHana_HN1_HDB03 force

INFO: Move constraint created for msl_SAPHana_HN1_HDB03

 

Wait until the cluster has switched over and the process is complete.

During this time the HANA on dbhlinode2 is still stopped.

dbhlinode1:~ # crm resource clear msl_SAPHana_HN1_HDB03

INFO: Removed migration constraints for msl_SAPHana_HN1_HDB03

 

To clear generic messages use

dbhlinode1:~ # crm resource cleanup

 

Now we get a bit meaner and crash the kernel.

Trigger the kernel crash

echo c > /proc/sysrq-trigger

 

or immediately reboot the system by:

echo b > /proc/sysrq-trigger

In all cases the cluster must behave as expected.

 

 

 

With the option AUTOMATED_REGISTER=false you cannot switch back and forth.

If this option is set to false, the former primary must be re-registered manually with:

 

hdbnsutil -sr_register --remoteHost=dbhlinode2 --remoteInstance=00 --replicationMode=syncmem --name=DC1

 

Now node2 is the primary and node1 acts as the secondary host.

Consider setting this option to true to automate the registration of the demoted host.

 

System – Database Maintenance

Pre Update Task

For the master/slave resource, set the maintenance mode:

 

dbhlinode1:~ # crm resource maintenance msl_SAPHana_HN1_HDB03

 

Run the update tasks, then refresh the resource so that the cluster re-detects its current state.

dbhlinode1:~ # crm resource refresh msl_SAPHana_HN1_HDB03

 

After the SAP HANA update is complete on both sites, tell the cluster about the end of the maintenance process. This allows the cluster to actively control and monitor SAP HANA again.

dbhlinode1:~ # crm resource maintenance msl_SAPHana_HN1_HDB03 off

 

 

Check the cluster status

crm_mon -r 1

2 nodes configured

9 resources configured

 

Online: [ dbhlinode1 dbhlinode2 ]

 

Full list of resources:

 

stonith-sbd     (stonith:external/sbd): Started dbhlinode2

 Resource Group: grp_ip_HN1_dbhlinode1

     rsc_ip_HN1_dbhlinode1      (ocf::heartbeat:IPaddr2):       Started dbhlinode1

     pri_dnsupdate1_dbhlinode1  (ocf::heartbeat:dnsupdate):     Started dbhlinode1

 Resource Group: grp_ip_HN1_dbhlinode2

     rsc_ip_HN1_dbhlinode2      (ocf::heartbeat:IPaddr2):       Stopped

     pri_dnsupdate1_dbhlinode2  (ocf::heartbeat:dnsupdate):     Stopped

 Clone Set: msl_SAPHana_HN1_HDB03 [rsc_SAPHana_HN1_HDB03] (promotable)

     Masters: [ dbhlinode1 ]

     Slaves: [ dbhlinode2 ]

 Clone Set: cln_SAPHanaTopology_HN1_HDB03 [rsc_SAPHanaTopology_HN1_HDB03]

     Started: [ dbhlinode1 dbhlinode2 ]

 

 

corosync.conf

Listing of the corosync.conf file

dbhlinode1:~ # cat /etc/corosync/corosync.conf

totem {

        version: 2

        secauth: on

        crypto_hash: sha1

        crypto_cipher: aes256

        cluster_name: hacluster

        clear_node_high_bit: yes

        token: 5000

        token_retransmits_before_loss_const: 10

        join: 60

        consensus: 6000

        max_messages: 20

        transport: udpu

        rrp_mode: passive

}

logging {

        fileline: off

        to_stderr: no

        to_logfile: no

        logfile: /var/log/cluster/corosync.log

        to_syslog: yes

        debug: off

        timestamp: on

        logger_subsys {

                subsys: QUORUM

                debug: off

        }

 

}

nodelist {

        node {

                ring0_addr: 192.168.0.32

                ring1_addr: 10.25.93.51

                nodeid: 1

        }

 
       node {

                ring0_addr: 192.168.1.32

                ring1_addr: 10.23.93.51

                nodeid: 2

        }

}

quorum {

 

        # Enable and configure quorum subsystem (default: off)

        # see also corosync.conf.5 and votequorum.5

        provider: corosync_votequorum

        expected_votes: 2

        two_node: 1

}

 

Make sure the corosync.conf is identical on both nodes

scp /etc/corosync/corosync.conf dbhlinode2:/etc/corosync/corosync.conf
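To verify the result rather than just copy again, the remote file can be compared with the local one. A small sketch using cmp, which reads one side from standard input; matches_stdin is a hypothetical helper name, and passwordless ssh between the nodes is assumed.

```shell
#!/bin/sh
# matches_stdin FILE: succeed if the data on stdin is byte-identical to FILE.
matches_stdin() {
    cmp -s - "$1"
}
```

On dbhlinode1 this could be used as: ssh dbhlinode2 cat /etc/corosync/corosync.conf | matches_stdin /etc/corosync/corosync.conf && echo identical.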

 

 

 

Test the corosync ring config

On node dbhlinode1

dbhlinode1:~ # corosync-cfgtool -s

Printing ring status.

Local node ID 1

RING ID 0

        id      = 192.168.0.32

        status  = ring 0 active with no faults

RING ID 1

        id      = 10.25.93.51

        status  = ring 1 active with no faults

 

 

 

On node dbhlinode2

dbhlinode2:~ # corosync-cfgtool -s

Printing ring status.

Local node ID 2

RING ID 0

        id      = 192.168.1.32

        status  = ring 0 active with no faults

RING ID 1

        id      = 10.23.93.51

        status  = ring 1 active with no faults
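For monitoring, the ring status can be reduced to the IDs of rings that are not healthy. This sketch assumes the output format shown above, where a healthy ring reports "active with no faults"; faulty_rings is a made-up helper name.

```shell
#!/bin/sh
# faulty_rings: read "corosync-cfgtool -s" output on stdin and print the ID
# of every ring whose status line does not say "active with no faults".
faulty_rings() {
    awk '
        /^RING ID/ { ring = $3 }
        $1 == "status" && !/active with no faults/ { print ring }
    '
}
```

Usage on each node: corosync-cfgtool -s | faulty_rings. No output means both rings are fine.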

 

CRM Configuration

dbhlinode1:~ # crm config show > crm_config.txt

 

dbhlinode1:~ # cat crm_config.txt

 

node 1: dbhlinode1 \

        attributes hana_hn1_vhost=dbhlinode1 hana_hn1_srmode=sync hana_hn1_site=SITE1 lpa_hn1_lpt=1620219197 hana_hn1_remoteHost=dbhlinode2 hana_hn1_op_mode=logreplay

node 2: dbhlinode2 \

        attributes hana_hn1_vhost=dbhlinode2 hana_hn1_site=SITE2 hana_hn1_srmode=sync lpa_hn1_lpt=30 hana_hn1_remoteHost=dbhlinode1 hana_hn1_op_mode=logreplay

primitive pri_dnsupdate1_dbhlinode1 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.0.50 ttl=5 keyfile="/etc/ddns/DNS_update_dnsvm03.key" server=172.0.0.8 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0

primitive pri_dnsupdate1_dbhlinode2 dnsupdate \

        params hostname=hanavip.hli.azure ip=192.168.1.50 ttl=5 keyfile="/etc/ddns/DNS_update_dnsvm03.key" server=172.0.0.8 serverport=53 unregister_on_stop=true \

        op monitor timeout=30 interval=20 \

        op_params depth=0

primitive rsc_SAPHanaTopology_HN1_HDB03 ocf:suse:SAPHanaTopology \

        operations $id=rsc_sap2_HN1_HDB03-operations \

        op monitor interval=10 timeout=600 \

        op start interval=0 timeout=600 \

        op stop interval=0 timeout=300 \

        params SID=HN1 InstanceNumber=03

primitive rsc_SAPHana_HN1_HDB03 ocf:suse:SAPHana \

        operations $id=rsc_sap_HN1_HDB03-operations \

        op start interval=0 timeout=3600 \

        op stop interval=0 timeout=3600 \

        op promote interval=0 timeout=3600 \

        op monitor interval=60 role=Master timeout=700 \

        op monitor interval=61 role=Slave timeout=700 \

        params SID=HN1 InstanceNumber=03 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true

primitive rsc_ip_HN1_dbhlinode1 IPaddr2 \

        op monitor interval=10s timeout=20s \

        params ip=192.168.0.50 cidr_netmask=26

primitive rsc_ip_HN1_dbhlinode2 IPaddr2 \

        op monitor interval=10s timeout=20s \

        params ip=192.168.1.50 cidr_netmask=26

primitive stonith-sbd stonith:external/sbd \

        params pcmk_delay_max=30s

group grp_ip_HN1_dbhlinode1 rsc_ip_HN1_dbhlinode1 pri_dnsupdate1_dbhlinode1 \

        params resource-stickiness=1 \

        meta target-role=Started maintenance=false

group grp_ip_HN1_dbhlinode2 rsc_ip_HN1_dbhlinode2 pri_dnsupdate1_dbhlinode2 \

        params resource-stickiness=1 \

        meta maintenance=false target-role=Started

ms msl_SAPHana_HN1_HDB03 rsc_SAPHana_HN1_HDB03 \

        meta notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true maintenance=false

clone cln_SAPHanaTopology_HN1_HDB03 rsc_SAPHanaTopology_HN1_HDB03 \

        meta clone-node-max=1 target-role=Started interleave=true

location loc_ip_dbhlinode1_not_on_dbhlinode2 grp_ip_HN1_dbhlinode1 -inf: dbhlinode2

location loc_ip_dbhlinode2_not_on_dbhlinode1 grp_ip_HN1_dbhlinode2 -inf: dbhlinode1

location loc_ip_on_primary_dbhlinode1 grp_ip_HN1_dbhlinode1 \

        rule -inf: hana_hn1_roles ne 4:P:master1:master:worker:master

location loc_ip_on_primary_dbhlinode2 grp_ip_HN1_dbhlinode2 \

        rule -inf: hana_hn1_roles ne 4:P:master1:master:worker:master

property SAPHanaSR: \

        hana_hn1_site_srHook_SITE2=SOK \

        hana_hn1_site_srHook_SITE1=PRIM

property cib-bootstrap-options: \

        have-watchdog=true \

        dc-version="2.0.1+20190417.13d370ca9-3.15.1-2.0.1+20190417.13d370ca9" \

        cluster-infrastructure=corosync \

        cluster-name=hacluster \

        stonith-enabled=true \

        last-lrm-refresh=1620163248

rsc_defaults rsc-options: \

        resource-stickiness=1 \

        migration-threshold=3

op_defaults op-options: \

        timeout=600 \

        record-pending=true
