Thursday 28 February 2019

Step by Step Exadata Infiniband Switch Patching 18c

Overview
  • The Exadata network grid consists of multiple Sun QDR InfiniBand switches.
  • IB Switches are used for the storage network as well as the Oracle RAC interconnect.
  • Exadata compute nodes and storage cells are configured with dual-port InfiniBand host channel adapters (HCAs) and connect to each of the two leaf switches.
  • You can access the IB switches from the command line or through the web-based ILOM interface.
  • IB switches run a Linux operating system.

In this article I will demonstrate how to patch or upgrade Oracle Exadata IB Switches.


About Infiniband Switch Patching
  • Starting with release 11.2.3.3.0, the patchmgr utility is used to upgrade and downgrade the InfiniBand switches.
  • The IB switch patch is delivered with the Exadata storage server patch.
  • IB switch patches are released once or twice a year.
  • IB switches can be patched in a rolling fashion only.

Environment
  • Exadata Half Rack X4-2
  • 4 Compute nodes, 7 Storage cells and 2 IB Switches
  • Current IB Switch Version 2.2.7-1

Step by Step Infiniband Switch Patching
 

  • Identify the IB switches in the cluster.

[root@dm01dbadm01 ~]# ibswitches
Switch  : 0x002128469b8aa0a0 ports 36 "SUN DCS 36P QDR dm01sw-iba01 10.209.41.246" enhanced port 0 lid 5 lmc 0
Switch  : 0x002128469b97a0a0 ports 36 "SUN DCS 36P QDR dm01sw-ibb01 10.209.41.247" enhanced port 0 lid 4 lmc 0
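
The switch names reported by ibswitches can also be collected with a short script instead of being typed by hand. The following is only a minimal sketch: the function name list_ib_switch_names is hypothetical, and it assumes that the quoted node description always ends with the switch hostname followed by its IP address, as in the output above.

```shell
#!/bin/bash
# Hypothetical helper: read `ibswitches` output on stdin and print one
# switch hostname per line.
# Assumption: the quoted description ends with "<hostname> <ip-address>".
list_ib_switch_names() {
    awk -F'"' '/^Switch/ {
        n = split($2, words, " ")   # words inside the quoted description
        if (n >= 2) print words[n - 1]
    }'
}

# Demo on the two lines shown above (no switch access needed).
list_ib_switch_names <<'EOF'
Switch  : 0x002128469b8aa0a0 ports 36 "SUN DCS 36P QDR dm01sw-iba01 10.209.41.246" enhanced port 0 lid 5 lmc 0
Switch  : 0x002128469b97a0a0 ports 36 "SUN DCS 36P QDR dm01sw-ibb01 10.209.41.247" enhanced port 0 lid 4 lmc 0
EOF
```

On a live system the input would come from a pipe such as "ibswitches | list_ib_switch_names". Review the result before using it as a patchmgr group file.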


  • Identify the current IB switch software version on all the Switches

[root@dm01db01 patch_18.1.12.0.0.190111]# ssh dm01sw-iba01 version
SUN DCS 36p version: 2.2.7-1
Build time: Aug  4 2017 12:20:53
SP board info:
Manufacturing Date: 2014.05.20
Serial Number: "NCDFxxxxx"
Hardware Revision: 0x0107
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010


  • Log in to Exadata compute node 1 as the root user and navigate to the Exadata Storage Software staging area

[root@dm01dbadm01 ESS_121220]# cd /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111
 
[root@dm01dbadm01 patch_18.1.12.0.0.190111]# pwd
/u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111


  • Create a file named ibswitch_group in the root user's home directory and enter the IB switch names, one per line, as follows:

[root@dm01dbadm01 patch_18.1.12.0.0.190111]# vi ~/ibswitch_group
dm01sw-ibb01
dm01sw-iba01

[root@dm01dbadm01 patch_18.1.12.0.0.190111]# cat ~/ibswitch_group
dm01sw-ibb01
dm01sw-iba01


  • Execute the following to perform the IB Switch precheck

[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -ibswitches ~/ibswitch_group -upgrade -ibswitch_precheck

2019-02-10 03:02:46 -0600 1 of 1 :Working: DO: Initiate pre-upgrade validation check on InfiniBand switch(es).
 ----- InfiniBand switch update process started 2019-02-10 03:02:47 -0600 -----
[NOTE     ] Log file at /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/upgradeIBSwitch.log

[INFO     ] List of InfiniBand switches for upgrade: ( dm01sw-ibb01 dm01sw-iba01 )
[SUCCESS  ] Verifying Network connectivity to dm01sw-ibb01
[SUCCESS  ] Verifying Network connectivity to dm01sw-iba01
[SUCCESS  ] Validating verify-topology output
[INFO     ] Master Subnet Manager is set to "dm01sw-ibb01" in all Switches
[INFO     ] Upgrade to 2.2.11_2 requires that the InfiniBand switch be at 2.2.7-2. Upgrading dm01sw-ibb01 first to 2.2.7-2

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-ibb01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-ibb01.
[INFO     ] Starting pre-update validation on dm01sw-ibb01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-ibb01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-ibb01, found 26M
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:03:03
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-ibb01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-ibb01
[INFO     ] Finished pre-update validation on dm01sw-ibb01
[SUCCESS  ] Pre-update validation on dm01sw-ibb01

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-ibb01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-ibb01.
[INFO     ] Starting pre-update validation on dm01sw-ibb01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-ibb01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-ibb01, found 26M
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:03:34
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-ibb01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-ibb01
[INFO     ] Finished pre-update validation on dm01sw-ibb01
[SUCCESS  ] Pre-update validation on dm01sw-ibb01
[SUCCESS  ] Prereq check on dm01sw-ibb01
[INFO     ] Upgrade to 2.2.11_2 requires that the InfiniBand switch be at 2.2.7-2. Upgrading dm01sw-iba01 first to 2.2.7-2

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-iba01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-iba01.
[INFO     ] Starting pre-update validation on dm01sw-iba01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-iba01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-iba01, found 26M
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:04:06
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-iba01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-iba01
[INFO     ] Finished pre-update validation on dm01sw-iba01
[SUCCESS  ] Pre-update validation on dm01sw-iba01

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-iba01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-iba01.
[INFO     ] Starting pre-update validation on dm01sw-iba01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-iba01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-iba01, found 26M
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:04:36
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-iba01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-iba01
[INFO     ] Finished pre-update validation on dm01sw-iba01
[SUCCESS  ] Pre-update validation on dm01sw-iba01
[SUCCESS  ] Prereq check on dm01sw-iba01
[SUCCESS  ] Overall status

 ----- InfiniBand switch update process ended 2019-02-10 03:05:00 -0600 -----
2019-02-10 03:05:00 -0600 1 of 1 :SUCCESS: DONE: Initiate pre-upgrade validation check on InfiniBand switch(es).
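
The precheck output is long, and most of it is SUCCESS and INFO lines. A small filter can make the remaining lines easier to spot. This is a sketch under assumptions: the function name show_attention_lines is hypothetical, and it assumes every status line starts with a bracketed upper-case tag padded with spaces, as in the output above.

```shell
#!/bin/bash
# Hypothetical helper: print every tagged patchmgr line whose tag is not
# SUCCESS, INFO or NOTE, so that warnings and failures stand out.
# Assumption: status lines start with a tag such as "[SUCCESS  ]".
show_attention_lines() {
    awk '/^\[[A-Z]+ *\]/ && !/^\[(SUCCESS|INFO|NOTE) *\]/'
}

# Demo on a few lines copied from the precheck output above.
show_attention_lines <<'EOF'
[SUCCESS  ] Validating verify-topology output
[INFO     ] Master Subnet Manager is set to "dm01sw-ibb01" in all Switches
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
[SUCCESS  ] Overall status
EOF
```

The same filter can be pointed at the log file named in the NOTE line, for example "show_attention_lines < upgradeIBSwitch.log" from the patch directory.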


  • Upgrade the IB Switches using the following command:

[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -ibswitches ~/ibswitch_group -upgrade

2019-02-10 03:07:26 -0600 1 of 1 :Working: DO: Initiate upgrade of InfiniBand switches to 2.2.11-2. Expect up to 40 minutes for each switch
                                                  
 ----- InfiniBand switch update process started 2019-02-10 03:07:27 -0600 -----
[NOTE     ] Log file at /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/upgradeIBSwitch.log

[INFO     ] List of InfiniBand switches for upgrade: ( dm01sw-ibb01 dm01sw-iba01 )
[SUCCESS  ] Verifying Network connectivity to dm01sw-ibb01
[SUCCESS  ] Verifying Network connectivity to dm01sw-iba01
[SUCCESS  ] Validating verify-topology output
[INFO     ] Proceeding with upgrade of InfiniBand switches to version 2.2.11_2
[INFO     ] Master Subnet Manager is set to "dm01sw-ibb01" in all Switches
[INFO     ] Upgrade to 2.2.11_2 requires that the InfiniBand switch be at 2.2.7-2. Upgrading dm01sw-ibb01 first to 2.2.7-2

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-ibb01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-ibb01.
[INFO     ] Starting pre-update validation on dm01sw-ibb01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-ibb01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-ibb01, found 26M
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-ibb01
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:07:43
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-ibb01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-ibb01
[INFO     ] Finished pre-update validation on dm01sw-ibb01
[SUCCESS  ] Pre-update validation on dm01sw-ibb01
[INFO     ] Package will be downloaded at firmware update time via scp
[SUCCESS  ] Disable Subnet Manager on dm01sw-ibb01
[SUCCESS  ] Execute plugin check for Patching on dm01sw-ibb01
[INFO     ] Starting upgrade on dm01sw-ibb01 to 2.2.7_2. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO     ] Rebooting dm01sw-ibb01 to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
[SUCCESS  ] Load firmware 2.2.7_2 onto dm01sw-ibb01
[SUCCESS  ] Disable Subnet Manager on dm01sw-ibb01
[SUCCESS  ] Verify that /conf/configvalid is set to 1 on dm01sw-ibb01
[INFO     ] Set SMPriority to 5 on dm01sw-ibb01
[INFO     ] Rebooting dm01sw-ibb01. Wait for 4 minutes before continuing
[SUCCESS  ] Reboot dm01sw-ibb01
[SUCCESS  ] SUCCESS
[INFO     ] Starting post-update validation on dm01sw-ibb01
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-ibb01
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:29:11
[INFO     ] /conf/configvalid is 1
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Execute plugin check for Post Patch on dm01sw-ibb01
[INFO     ] Finished post-update validation on dm01sw-ibb01
[SUCCESS  ] Post-update validation on dm01sw-ibb01

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-ibb01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-ibb01.
[INFO     ] Starting pre-update validation on dm01sw-ibb01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-ibb01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-ibb01, found 28M
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-ibb01
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:29:39
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-ibb01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-ibb01
[INFO     ] Finished pre-update validation on dm01sw-ibb01
[SUCCESS  ] Pre-update validation on dm01sw-ibb01
[INFO     ] Package will be downloaded at firmware update time via scp
[SUCCESS  ] Disable Subnet Manager on dm01sw-ibb01
[SUCCESS  ] Execute plugin check for Patching on dm01sw-ibb01
[INFO     ] Starting upgrade on dm01sw-ibb01 to 2.2.11_2. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO     ] Rebooting dm01sw-ibb01 to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
[SUCCESS  ] Load firmware 2.2.11_2 onto dm01sw-ibb01
[SUCCESS  ] Disable Subnet Manager on dm01sw-ibb01
[SUCCESS  ] Verify that /conf/configvalid is set to 1 on dm01sw-ibb01
[INFO     ] Set SMPriority to 5 on dm01sw-ibb01
[INFO     ] Rebooting dm01sw-ibb01. Wait for 4 minutes before continuing
[SUCCESS  ] Reboot dm01sw-ibb01
[SUCCESS  ] SUCCESS
[INFO     ] Starting post-update validation on dm01sw-ibb01
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-ibb01
[SUCCESS  ] NTP daemon is running on dm01sw-ibb01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:51:03
[INFO     ] /conf/configvalid is 1
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-ibb01
[SUCCESS  ] Execute plugin check for Post Patch on dm01sw-ibb01
[INFO     ] Finished post-update validation on dm01sw-ibb01
[SUCCESS  ] Post-update validation on dm01sw-ibb01
[SUCCESS  ] Update InfiniBand switch dm01sw-ibb01 to 2.2.11_2
[INFO     ] Upgrade to 2.2.11_2 requires that the InfiniBand switch be at 2.2.7-2. Upgrading dm01sw-iba01 first to 2.2.7-2
[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-iba01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-iba01.
[INFO     ] Starting pre-update validation on dm01sw-iba01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-iba01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-iba01, found 26M
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-iba01
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[SUCCESS  ] opensm.conf passed all validations
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 03:51:38
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-iba01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-iba01
[INFO     ] Finished pre-update validation on dm01sw-iba01
[SUCCESS  ] Pre-update validation on dm01sw-iba01
[INFO     ] Package will be downloaded at firmware update time via scp
[SUCCESS  ] Disable Subnet Manager on dm01sw-iba01
[SUCCESS  ] Execute plugin check for Patching on dm01sw-iba01
[INFO     ] Starting upgrade on dm01sw-iba01 to 2.2.7_2. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO     ] Rebooting dm01sw-iba01 to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
[SUCCESS  ] Load firmware 2.2.7_2 onto dm01sw-iba01
[SUCCESS  ] Disable Subnet Manager on dm01sw-iba01
[SUCCESS  ] Verify that /conf/configvalid is set to 1 on dm01sw-iba01
[INFO     ] Set SMPriority to 5 on dm01sw-iba01
[INFO     ] Rebooting dm01sw-iba01. Wait for 4 minutes before continuing
[SUCCESS  ] Reboot dm01sw-iba01
[SUCCESS  ] SUCCESS
[INFO     ] Starting post-update validation on dm01sw-iba01
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-iba01
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 04:13:06
[INFO     ] /conf/configvalid is 1
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Execute plugin check for Post Patch on dm01sw-iba01
[INFO     ] Finished post-update validation on dm01sw-iba01
[SUCCESS  ] Post-update validation on dm01sw-iba01

[INFO     ] ---------- Starting with InfiniBand Switch dm01sw-iba01
[WARNING  ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.9-3 with the current package.
     To downgrade to other versions:
     - Manually download the InfiniBand switch firmware package to the patch directory
     - Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
     - Run patchmgr command to initiate downgrade.
[SUCCESS  ] Verify SSH access to the patchmgr host dm01db01.netsoftmate.com from the InfiniBand Switch dm01sw-iba01.
[INFO     ] Starting pre-update validation on dm01sw-iba01
[SUCCESS  ] Verifying that /tmp has 120M in dm01sw-iba01, found 246M
[SUCCESS  ] Verifying that / has 20M in dm01sw-iba01, found 28M
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-iba01
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 04:13:35
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Verifying that the patchmgr host dm01db01.netsoftmate.com is recognized on the InfiniBand Switch dm01sw-iba01 through getHostByName
[SUCCESS  ] Execute plugin check for Patch Check Prereq on dm01sw-iba01
[INFO     ] Finished pre-update validation on dm01sw-iba01
[SUCCESS  ] Pre-update validation on dm01sw-iba01
[INFO     ] Package will be downloaded at firmware update time via scp
[SUCCESS  ] Disable Subnet Manager on dm01sw-iba01
[SUCCESS  ] Execute plugin check for Patching on dm01sw-iba01
[INFO     ] Starting upgrade on dm01sw-iba01 to 2.2.11_2. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO     ] Rebooting dm01sw-iba01 to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
[SUCCESS  ] Load firmware 2.2.11_2 onto dm01sw-iba01
[SUCCESS  ] Disable Subnet Manager on dm01sw-iba01
[SUCCESS  ] Verify that /conf/configvalid is set to 1 on dm01sw-iba01
[INFO     ] Set SMPriority to 5 on dm01sw-iba01
[INFO     ] Rebooting dm01sw-iba01. Wait for 4 minutes before continuing
[SUCCESS  ] Reboot dm01sw-iba01
[SUCCESS  ] SUCCESS
[INFO     ] Starting post-update validation on dm01sw-iba01
[SUCCESS  ] Service opensmd is running on InfiniBand Switch dm01sw-iba01
[SUCCESS  ] NTP daemon is running on dm01sw-iba01.
[INFO     ] Manually validate the following entries Date:(YYYY-MM-DD) 2019-02-10 Time:(HH:MM:SS) 04:35:00
[INFO     ] /conf/configvalid is 1
[INFO     ] Validating the current firmware on the InfiniBand Switch
[SUCCESS  ] Firmware verification on InfiniBand switch dm01sw-iba01
[SUCCESS  ] Execute plugin check for Post Patch on dm01sw-iba01
[INFO     ] Finished post-update validation on dm01sw-iba01
[SUCCESS  ] Post-update validation on dm01sw-iba01
[SUCCESS  ] Update InfiniBand switch dm01sw-iba01 to 2.2.11_2
[INFO     ] InfiniBand Switches ( dm01sw-ibb01 dm01sw-iba01 ) updated to 2.2.11_2
[SUCCESS  ] Overall status

 ----- InfiniBand switch update process ended 2019-02-10 04:35:25 -0600 -----
2019-02-10 04:35:25 -0600 1 of 1 :SUCCESS: DONE: Upgrade InfiniBand switch(es) to 2.2.11-2.


  • Verify that all the IB switches have been upgraded to the new version.

[root@dm01db01 ~]# ssh dm01sw-ibb01 version
SUN DCS 36p version: 2.2.11-2
Build time: Aug 27 2018 11:18:39
SP board info:
Manufacturing Date: 2014.05.19
Serial Number: "NCDFxxxxx"
Hardware Revision: 0x0107
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
[root@dm01db01 ~]#

[root@dm01db01 ~]# ssh dm01sw-iba01 version
SUN DCS 36p version: 2.2.11-2
Build time: Aug 27 2018 11:18:39
SP board info:
Manufacturing Date: 2014.05.20
Serial Number: "NCDFxxxxx"
Hardware Revision: 0x0107
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
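
The version comparison can be scripted so that it does not depend on reading the output by eye. The sketch below uses a hypothetical function name, firmware_matches, and assumes that the first line of the version output keeps the "SUN DCS 36p version: <version>" form shown above.

```shell
#!/bin/bash
# Hypothetical helper: read the output of the switch `version` command on
# stdin and succeed only if the firmware matches the expected version.
# Assumption: the line has the form "SUN DCS 36p version: <version>".
firmware_matches() {
    local expected="$1" actual
    actual=$(awk -F': ' '/^SUN DCS 36p version/ {print $2; exit}')
    [ "$actual" = "$expected" ]
}

# Demo on output copied from above (no switch access needed).
if firmware_matches "2.2.11-2" <<'EOF'
SUN DCS 36p version: 2.2.11-2
Build time: Aug 27 2018 11:18:39
EOF
then
    echo "firmware OK"
else
    echo "firmware mismatch"
fi
```

Against a real switch the call would look like "ssh dm01sw-ibb01 version | firmware_matches 2.2.11-2", repeated for each name in the group file.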

 


Conclusion
In this article we have demonstrated how to patch Exadata IB switches using the patchmgr utility. Patching an Exadata IB switch is straightforward and can be done in a rolling fashion without any downtime.

Wednesday 20 February 2019

Step by Step Exadata Storage Server Patching 18c

The patchmgr utility can be used to upgrade, roll back, and back up Exadata storage cells. It can upgrade the storage cells in a rolling or non-rolling fashion; non-rolling is the default. Storage server patches apply operating system, firmware, and driver updates.

Launch patchmgr from compute node 1, which has SSH user equivalence set up to all the storage cells.

In this article I will demonstrate how to upgrade Exadata storage cells using the patchmgr utility.

MOS Notes
Read the following MOS notes carefully.

  • Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1)
  • Exadata 18.1.12.0.0 release and patch (29194095) (Doc ID 2492012.1)   
  • Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)   

Software Download

  • Download the following patches required for Upgrading Storage cells.
  • Patch 29194095 - Storage server (18.1.12.0.0.190111) and InfiniBand switch software (2.2.11-2)

Current Environment

  • Exadata X4-2 Half Rack (4 Compute nodes, 7 Storage Cells and 2 IB Switches) running ESS version 12.2.1.1.6

Current Image version

  • Execute the “imageinfo” command on one of the storage cells to identify the current Exadata image version
[root@dm01cel01 ~]# imageinfo

Kernel version: 4.1.12-94.7.8.el6uek.x86_64 #2 SMP Thu Jan 11 20:41:01 PST 2018 x86_64
Cell version: OSS_12.2.1.1.6_LINUX.X64_180125.1
Cell rpm version: cell-12.2.1.1.6_LINUX.X64_180125.1-1.x86_64

Active image version: 12.2.1.1.6.180125.1
Active image kernel version: 4.1.12-94.7.8.el6uek
Active image activated: 2018-05-08 00:42:57 -0500
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8

Cell boot usb partition: /dev/sdac1
Cell boot usb version: 12.2.1.1.6.180125.1

Inactive image version: 12.1.2.3.6.170713
Inactive image activated: 2017-10-03 00:57:25 -0500
Inactive image status: success
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7

Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub/grub.conf.inactive
Inactive kernel version for the rollback: 2.6.39-400.297.1.el6uek.x86_64
Rollback to the inactive partitions: Possible
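
When the version has to be recorded for several cells, it can help to pull just the active image version out of the imageinfo output. This is a minimal sketch with a hypothetical function name, active_image_version, and it assumes the "Active image version:" label appears exactly as shown above.

```shell
#!/bin/bash
# Hypothetical helper: read `imageinfo` output on stdin and print only
# the active image version.
# Assumption: the label is exactly "Active image version:".
active_image_version() {
    awk -F': ' '/^Active image version:/ {print $2; exit}'
}

# Demo on lines copied from the output above.
active_image_version <<'EOF'
Cell version: OSS_12.2.1.1.6_LINUX.X64_180125.1
Active image version: 12.2.1.1.6.180125.1
Active image kernel version: 4.1.12-94.7.8.el6uek
EOF
```

Running it before and after patching, for example with "ssh dm01cel01 imageinfo | active_image_version", gives two values that are easy to compare.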



Prerequisites

  • Install and configure VNC Server on Exadata compute node 1. It is recommended to use VNC or screen utility for patching to avoid disconnections due to network issues.

  • Enable a blackout for monitoring and scheduled jobs (OEM, crontab and so on) for the duration of the patching

  • Verify disk space on storage cells
[root@dm01db01 ~]# dcli -g ~/cell_group -l root 'df -h /'
dm01cel01: Filesystem      Size  Used Avail Use% Mounted on
dm01cel01: /dev/md6        9.8G  4.4G  4.9G  48% /
dm01cel02: Filesystem      Size  Used Avail Use% Mounted on
dm01cel02: /dev/md6        9.8G  4.5G  4.8G  49% /
dm01cel03: Filesystem      Size  Used Avail Use% Mounted on
dm01cel03: /dev/md6        9.8G  4.5G  4.8G  49% /
dm01cel04: Filesystem      Size  Used Avail Use% Mounted on
dm01cel04: /dev/md6        9.8G  4.5G  4.8G  49% /
dm01cel05: Filesystem      Size  Used Avail Use% Mounted on
dm01cel05: /dev/md6        9.8G  4.5G  4.8G  49% /
dm01cel06: Filesystem      Size  Used Avail Use% Mounted on
dm01cel06: /dev/md6        9.8G  4.6G  4.7G  50% /
dm01cel07: Filesystem      Size  Used Avail Use% Mounted on
dm01cel07: /dev/md6        9.8G  4.5G  4.8G  48% /
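
With many cells, the df output can be filtered so that only cells above a chosen usage level are listed. The sketch below is illustrative: the function name cells_above_usage is hypothetical, and the limit passed to it is a value you choose yourself, not a threshold taken from the patch readme.

```shell
#!/bin/bash
# Hypothetical helper: read `dcli ... 'df -h /'` output on stdin and print
# the cells whose Use% is above the given limit.
# Assumption: each data line looks like
#   "<cell>: <device> <size> <used> <avail> <use>% <mount>"
cells_above_usage() {
    local limit="$1"
    awk -v limit="$limit" '$2 != "Filesystem" && NF >= 7 {
        use = $6;  sub(/%/, "", use)    # "50%" -> "50"
        host = $1; sub(/:$/, "", host)  # "dm01cel06:" -> "dm01cel06"
        if (use + 0 > limit) print host, use "%"
    }'
}

# Demo on lines copied from the output above, with an arbitrary limit of 49.
cells_above_usage 49 <<'EOF'
dm01cel05: Filesystem      Size  Used Avail Use% Mounted on
dm01cel05: /dev/md6        9.8G  4.5G  4.8G  49% /
dm01cel06: Filesystem      Size  Used Avail Use% Mounted on
dm01cel06: /dev/md6        9.8G  4.6G  4.7G  50% /
EOF
```

An empty result means that no cell exceeded the chosen limit.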


  • Run Exachk before starting the actual patching. Correct any critical issues and failures that could conflict with patching.

  • Check for hardware failures. Make sure there are no hardware failures before patching
[root@dm01db01 ~]# dcli -g ~/cell_group -l root 'cellcli -e list physicaldisk where status!=normal'
[root@dm01db01 ~]# dcli -l root -g ~/cell_group "cellcli -e list physicaldisk where diskType=FlashDisk and status not = normal"
[root@dm01db01 ~]# dcli -g ~/dbs_group -l root 'dbmcli -e list physicaldisk where status!=normal'

[root@dm01db01 ~]# dcli -g ~/dbs_group -l root 'ipmitool sunoem cli "show -d properties -level all /SYS fault_state==Faulted"'
[root@dm01db01 ~]# dcli -g ~/cell_group -l root 'ipmitool sunoem cli "show -d properties -level all /SYS fault_state==Faulted"'


  • Clear or acknowledge alerts on db and cell nodes
[root@dm01db01 ~]# dcli -l root -g ~/cell_group "cellcli -e drop alerthistory all"
[root@dm01db01 ~]# dcli -l root -g ~/dbs_group "dbmcli -e  drop alerthistory all"


  • Download the patch listed below and copy it to the staging directory on compute node 1
Patch 29194095 - Storage server software (18.1.12.0.0.190111) and InfiniBand switch software (2.2.11-2)

  • Copy the patches to the staging area on compute node 1 and unzip them
[root@dm01db01 ~]# cd /u01/app/oracle/software/exa_patches
[root@dm01db01 ~]# unzip p29194095_181000_Linux-x86-64.zip


  • Read the readme file and document the steps for storage cell patching.

Steps to perform Storage Cell Patching

  • Open a VNC session to compute node 1

  • Log in as the root user and confirm the login
[root@dm01db01 ~]# id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)


  • Check SSH user equivalence
[root@dm01db01 ~]# dcli -g cell_group -l root uptime
dm01cel01: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.17, 0.50, 0.61
dm01cel02: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.05, 0.29, 0.45
dm01cel03: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.25, 0.64, 0.63
dm01cel04: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.12, 0.44, 0.53
dm01cel05: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.15, 0.55, 0.65
dm01cel06: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.33, 0.48, 0.55
dm01cel07: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.09, 0.37, 0.52
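
A missing cell is easy to overlook in a long dcli listing, so the check can be automated by comparing the group file with the hosts that answered. This is a sketch under assumptions: the function name missing_cells is hypothetical, and it assumes every dcli output line starts with "<cell>:".

```shell
#!/bin/bash
# Hypothetical helper: given a dcli group file and the dcli output on
# stdin, print the cells that are listed in the file but did not answer.
# Assumption: every dcli output line starts with "<cell>:".
missing_cells() {
    local group_file="$1" responded cell
    responded=$(awk -F':' '{print $1}' | sort -u)
    while read -r cell || [ -n "$cell" ]; do
        [ -z "$cell" ] && continue
        printf '%s\n' "$responded" | grep -qxF "$cell" || echo "$cell"
    done < "$group_file"
}

# Demo with a made-up three-cell group file; the sample output
# deliberately leaves out dm01cel03 to show what a gap looks like.
group_file=$(mktemp)
printf '%s\n' dm01cel01 dm01cel02 dm01cel03 > "$group_file"
missing_cells "$group_file" <<'EOF'
dm01cel01: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.17, 0.50, 0.61
dm01cel02: 01:46:18 up 194 days, 40 min,  0 users,  load average: 0.05, 0.29, 0.45
EOF
rm -f "$group_file"
```

In practice the input would be the real command, for example "dcli -g cell_group -l root uptime | missing_cells cell_group". No output means every cell answered.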

  • Check the disk_repair_time attribute of the Oracle ASM disk groups and adjust it if necessary.
SQL> col value for a40
SQL> select dg.name,a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and a.name='disk_repair_time';

NAME                           VALUE
------------------------------ --------------------------
DATA_DM01                      3.6H
RECO_DM01                      3.6h


  • Shut down and stop the Oracle components on each database server using the following commands:
[root@dm01db01 ~]# dcli -g dbs_group -l root '/u01/app/11.2.0.4/grid/bin/crsctl stop cluster -all'
[root@dm01db01 ~]# dcli -g dbs_group -l root '/u01/app/11.2.0.4/grid/bin/crsctl stop crs'


  • Get the current Exadata Storage Server software version of the cells
[root@dm01cel01 ~]# imageinfo

Kernel version: 4.1.12-94.7.8.el6uek.x86_64 #2 SMP Thu Jan 11 20:41:01 PST 2018 x86_64
Cell version: OSS_12.2.1.1.6_LINUX.X64_180125.1
Cell rpm version: cell-12.2.1.1.6_LINUX.X64_180125.1-1.x86_64

Active image version: 12.2.1.1.6.180125.1
Active image kernel version: 4.1.12-94.7.8.el6uek
Active image activated: 2018-05-08 00:42:57 -0500
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8

Cell boot usb partition: /dev/sdac1
Cell boot usb version: 12.2.1.1.6.180125.1

Inactive image version: 12.1.2.3.6.170713
Inactive image activated: 2017-10-03 00:57:25 -0500
Inactive image status: success
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7

Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub/grub.conf.inactive
Inactive kernel version for the rollback: 2.6.39-400.297.1.el6uek.x86_64
Rollback to the inactive partitions: Possible


  • Shut down all cell services on all cells to be updated. Use the dcli command to stop the services on all cells at the same time:
[root@dm01db01 ~]# dcli -g cell_group -l root "cellcli -e alter cell shutdown services all"
dm01cel01:
dm01cel01: Stopping the RS, CELLSRV, and MS services...
dm01cel01: The SHUTDOWN of services was successful.
dm01cel02:
dm01cel02: Stopping the RS, CELLSRV, and MS services...
dm01cel02: The SHUTDOWN of services was successful.
dm01cel03:
dm01cel03: Stopping the RS, CELLSRV, and MS services...
dm01cel03: The SHUTDOWN of services was successful.
dm01cel04:
dm01cel04: Stopping the RS, CELLSRV, and MS services...
dm01cel04: The SHUTDOWN of services was successful.
dm01cel05:
dm01cel05: Stopping the RS, CELLSRV, and MS services...
dm01cel05: The SHUTDOWN of services was successful.
dm01cel06:
dm01cel06: Stopping the RS, CELLSRV, and MS services...
dm01cel06: The SHUTDOWN of services was successful.
dm01cel07:
dm01cel07: Stopping the RS, CELLSRV, and MS services...
dm01cel07: The SHUTDOWN of services was successful.
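
With seven or more cells it is easy to overlook one that did not stop cleanly. The following is a minimal sketch, not part of the original procedure: the function name cells_not_shut_down is hypothetical, and it assumes the dcli output has the "cellname: message" shape shown above. It reads that output on stdin and prints every cell that never reported a successful shutdown.

```shell
# Hypothetical helper: list cells whose dcli output lacks the success message.
# Assumes each line starts with "<cellname>:" as in the transcript above.
cells_not_shut_down() {
    awk -F': ' '
        /^[ \t]*$/ { next }                        # ignore blank lines
        { cell = $1; sub(/:$/, "", cell); seen[cell] = 1 }
        /The SHUTDOWN of services was successful\./ { ok[cell] = 1 }
        END { for (c in seen) if (!(c in ok)) print c }
    ' | sort
}

# Usage (on compute node 1):
#   dcli -g cell_group -l root "cellcli -e alter cell shutdown services all" | cells_not_shut_down
```

An empty result means every cell in the output reported success.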


  • Reset the patchmgr state to a known state using the following command:
[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -cells ~/cell_group -reset_force

2019-02-10 01:56:19 -0600        :Working: DO: Force Cleanup
2019-02-10 01:56:21 -0600        :SUCCESS: DONE: Force Cleanup


  • Clean up any previous patchmgr utility runs using the following command:
[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -cells ~/cell_group -cleanup

2019-02-10 01:57:39 -0600        :Working: DO: Cleanup
2019-02-10 01:57:40 -0600        :SUCCESS: DONE: Cleanup


  • Verify that the cells meet prerequisite checks using the following command.
[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -cells ~/cell_group -patch_check_prereq

2019-02-10 02:01:53 -0600        :Working: DO: Check cells have ssh equivalence for root user. Up to 10 seconds per cell ...
2019-02-10 02:01:55 -0600        :SUCCESS: DONE: Check cells have ssh equivalence for root user.
2019-02-10 02:02:00 -0600        :Working: DO: Initialize files. Up to 1 minute ...
2019-02-10 02:02:01 -0600        :Working: DO: Setup work directory
2019-02-10 02:02:02 -0600        :SUCCESS: DONE: Setup work directory
2019-02-10 02:02:04 -0600        :SUCCESS: DONE: Initialize files.
2019-02-10 02:02:04 -0600        :Working: DO: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction. Up to 40 minutes ...
2019-02-10 02:02:17 -0600        :INFO   : Wait correction of degraded md11 due to md partner size mismatch. Up to 30 minutes.
2019-02-10 02:02:18 -0600        :SUCCESS: DONE: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction.
2019-02-10 02:02:18 -0600        :Working: DO: Check space and state of cell services. Up to 20 minutes ...
2019-02-10 02:03:40 -0600        :SUCCESS: DONE: Check space and state of cell services.
2019-02-10 02:03:40 -0600        :Working: DO: Check prerequisites on all cells. Up to 2 minutes ...
2019-02-10 02:03:49 -0600        :SUCCESS: DONE: Check prerequisites on all cells.
2019-02-10 02:03:49 -0600        :Working: DO: Execute plugin check for Patch Check Prereq ...
2019-02-10 02:03:49 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22909764 v1.0.
2019-02-10 02:03:49 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:03:49 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 17854520 v1.3.
2019-02-10 02:03:49 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:03:49 -0600        :SUCCESS: No exposure to bug 17854520 with non-rolling patching
2019-02-10 02:03:49 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22468216 v1.0.
2019-02-10 02:03:49 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:03:49 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 22468216
2019-02-10 02:03:49 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 24625612 v1.0.
2019-02-10 02:03:49 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:03:49 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 24625612
2019-02-10 02:03:49 -0600        :SUCCESS: No exposure to bug  with non-rolling patching
2019-02-10 02:03:49 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22651315 v1.0.
2019-02-10 02:03:49 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:03:51 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 22651315
2019-02-10 02:03:51 -0600        :SUCCESS: DONE: Execute plugin check for Patch Check Prereq.
2019-02-10 02:03:51 -0600        :Working: DO: Check ASM deactivation outcome. Up to 1 minute ...
2019-02-10 02:04:02 -0600        :SUCCESS: DONE: Check ASM deactivation outcome.

  • If the prerequisite checks pass, then start the update process.
[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -cells ~/cell_group -patch
********************************************************************************
NOTE Cells will reboot during the patch or rollback process.
NOTE For non-rolling patch or rollback, ensure all ASM instances using
NOTE the cells are shut down for the duration of the patch or rollback.
NOTE For rolling patch or rollback, ensure all ASM instances using
NOTE the cells are up for the duration of the patch or rollback.

WARNING Do not interrupt the patchmgr session.
WARNING Do not alter state of ASM instances during patch or rollback.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot cells or alter cell services during patch or rollback.
WARNING Do not open log files in editor in write mode or try to alter them.

NOTE All time estimates are approximate.
********************************************************************************

2019-02-10 02:08:27 -0600        :Working: DO: Check cells have ssh equivalence for root user. Up to 10 seconds per cell ...
2019-02-10 02:08:28 -0600        :SUCCESS: DONE: Check cells have ssh equivalence for root user.
2019-02-10 02:08:33 -0600        :Working: DO: Initialize files. Up to 1 minute ...
2019-02-10 02:08:34 -0600        :Working: DO: Setup work directory
2019-02-10 02:09:13 -0600        :SUCCESS: DONE: Setup work directory
2019-02-10 02:09:15 -0600        :SUCCESS: DONE: Initialize files.
2019-02-10 02:09:15 -0600        :Working: DO: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction. Up to 40 minutes ...
2019-02-10 02:09:28 -0600        :INFO   : Wait correction of degraded md11 due to md partner size mismatch. Up to 30 minutes.
2019-02-10 02:09:30 -0600        :SUCCESS: DONE: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction.
2019-02-10 02:09:30 -0600        :Working: DO: Check space and state of cell services. Up to 20 minutes ...
2019-02-10 02:10:05 -0600        :SUCCESS: DONE: Check space and state of cell services.
2019-02-10 02:10:05 -0600        :Working: DO: Check prerequisites on all cells. Up to 2 minutes ...
2019-02-10 02:10:13 -0600        :SUCCESS: DONE: Check prerequisites on all cells.
2019-02-10 02:10:13 -0600        :Working: DO: Copy the patch to all cells. Up to 3 minutes ...
2019-02-10 02:12:01 -0600        :SUCCESS: DONE: Copy the patch to all cells.
2019-02-10 02:12:03 -0600        :Working: DO: Execute plugin check for Patch Check Prereq ...
2019-02-10 02:12:03 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22909764 v1.0.
2019-02-10 02:12:03 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:12:03 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 17854520 v1.3.
2019-02-10 02:12:03 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:12:03 -0600        :SUCCESS: No exposure to bug 17854520 with non-rolling patching
2019-02-10 02:12:03 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22468216 v1.0.
2019-02-10 02:12:03 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:12:03 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 22468216
2019-02-10 02:12:03 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 24625612 v1.0.
2019-02-10 02:12:03 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:12:03 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 24625612
2019-02-10 02:12:03 -0600        :SUCCESS: No exposure to bug  with non-rolling patching
2019-02-10 02:12:03 -0600        :INFO   : Patchmgr plugin start: Prereq check for exposure to bug 22651315 v1.0.
2019-02-10 02:12:03 -0600        :INFO   : Details in logfile /u01/app/oracle/software/exa_patches/patch_18.1.12.0.0.190111/patchmgr.stdout.
2019-02-10 02:12:05 -0600        :SUCCESS: Patchmgr plugin complete: Prereq check passed for the bug 22651315
2019-02-10 02:12:06 -0600        :SUCCESS: DONE: Execute plugin check for Patch Check Prereq.
2019-02-10 02:12:12 -0600 1 of 5 :Working: DO: Initiate patch on cells. Cells will remain up. Up to 5 minutes ...
2019-02-10 02:12:16 -0600 1 of 5 :SUCCESS: DONE: Initiate patch on cells.
2019-02-10 02:12:16 -0600 2 of 5 :Working: DO: Waiting to finish pre-reboot patch actions. Cells will remain up. Up to 45 minutes ...
2019-02-10 02:13:16 -0600        :INFO   : Wait for patch pre-reboot procedures
2019-02-10 02:14:34 -0600 2 of 5 :SUCCESS: DONE: Waiting to finish pre-reboot patch actions.
2019-02-10 02:14:34 -0600        :Working: DO: Execute plugin check for Patching ...
2019-02-10 02:14:34 -0600        :SUCCESS: DONE: Execute plugin check for Patching.
2019-02-10 02:14:35 -0600 3 of 5 :Working: DO: Finalize patch on cells. Cells will reboot. Up to 5 minutes ...
2019-02-10 02:14:39 -0600 3 of 5 :SUCCESS: DONE: Finalize patch on cells.
2019-02-10 02:15:41 -0600 4 of 5 :Working: DO: Wait for cells to reboot and come online. Up to 120 minutes ...
2019-02-10 02:16:41 -0600        :INFO   : Wait for patch finalization and reboot
2019-02-10 02:44:33 -0600 4 of 5 :SUCCESS: DONE: Wait for cells to reboot and come online.
2019-02-10 02:44:33 -0600 5 of 5 :Working: DO: Check the state of patch on cells. Up to 5 minutes ...
2019-02-10 02:44:52 -0600 5 of 5 :SUCCESS: DONE: Check the state of patch on cells.
2019-02-10 02:44:52 -0600        :Working: DO: Execute plugin check for Pre Disk Activation ...
2019-02-10 02:44:53 -0600        :SUCCESS: DONE: Execute plugin check for Pre Disk Activation.
2019-02-10 02:44:53 -0600        :Working: DO: Activate grid disks...
2019-02-10 02:44:54 -0600        :INFO   : Wait for checking and activating grid disks
2019-02-10 02:45:00 -0600        :SUCCESS: DONE: Activate grid disks.
2019-02-10 02:45:03 -0600        :Working: DO: Execute plugin check for Post Patch ...
2019-02-10 02:45:03 -0600        :SUCCESS: DONE: Execute plugin check for Post Patch.
2019-02-10 02:45:04 -0600        :Working: DO: Cleanup
2019-02-10 02:45:56 -0600        :SUCCESS: DONE: Cleanup


  • If e-mail alerts are not set up, monitor the log files and the cells being updated manually. Open a new session and tail the log file as shown below:
[root@dm01db01 patch_18.1.12.0.0.190111]# tail -f patchmgr.stdout
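
When reviewing a long patchmgr log it can help to filter out the routine lines. The sketch below is only an illustration: the function name patchmgr_unexpected is hypothetical, and it assumes that the only routine status tags are the three seen in the output above (Working, INFO and SUCCESS), so any timestamped line carrying a different tag is printed for review. The exact tags patchmgr uses for failures are not shown in this article, so none are assumed here.

```shell
# Hypothetical helper: print timestamped log lines whose status tag is
# not one of the routine tags seen in this article (Working, INFO, SUCCESS).
patchmgr_unexpected() {
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} ' |
        grep -Ev ':(SUCCESS|Working|INFO *):' || true
}

# Usage (from the patch staging directory):
#   patchmgr_unexpected < patchmgr.stdout
```

No output means every timestamped line carried one of the three routine tags.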

  • Verify the update status after the patchmgr utility completes as follows:
[root@dm01cel01 ~]# imageinfo

Kernel version: 4.1.12-94.8.10.el6uek.x86_64 #2 SMP Sat Dec 22 21:26:11 PST 2018 x86_64
Cell version: OSS_18.1.12.0.0_LINUX.X64_190111
Cell rpm version: cell-18.1.12.0.0_LINUX.X64_190111-1.x86_64

Active image version: 18.1.12.0.0.190111
Active image kernel version: 4.1.12-94.8.10.el6uek
Active image activated: 2019-02-10 02:43:36 -0600
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

Cell boot usb partition: /dev/sdac1
Cell boot usb version: 18.1.12.0.0.190111

Inactive image version: 12.2.1.1.6.180125.1
Inactive image activated: 2018-05-16 00:58:24 -0500
Inactive image status: success
Inactive system partition on device: /dev/md6
Inactive software partition on device: /dev/md8

Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub/grub.conf.inactive
Inactive usb grub config for the rollback: /boot/grub/grub.conf.usb.inactive
Inactive kernel version for the rollback: 4.1.12-94.7.8.el6uek.x86_64
Rollback to the inactive partitions: Possible
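
To compare the image details before and after patching without reading the whole report, a single field can be pulled out of the imageinfo output. This is a minimal sketch under the assumption that the output keeps the "Label: value" layout shown above; the function name imageinfo_field is hypothetical.

```shell
# Hypothetical helper: print the value of one imageinfo field.
# $1 is the label exactly as printed before the colon.
imageinfo_field() {
    awk -F': ' -v label="$1" '$1 == label { print $2; exit }'
}

# Usage:
#   ssh dm01cel01 imageinfo | imageinfo_field "Active image version"
#   ssh dm01cel01 imageinfo | imageinfo_field "Rollback to the inactive partitions"
```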




  • Check the imagehistory
[root@dm01cel01 ~]# imagehistory
Version                              : 12.1.1.1.1.140712
Image activation date                : 2014-11-23 00:34:06 -0800
Imaging mode                         : fresh
Imaging status                       : success

Version                              : 12.1.1.1.2.150411
Image activation date                : 2015-05-28 21:40:16 -0500
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 12.1.2.3.2.160721
Image activation date                : 2016-10-14 02:45:04 -0500
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 12.1.2.3.4.170111
Image activation date                : 2017-04-04 00:25:08 -0500
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 12.1.2.3.6.170713
Image activation date                : 2017-10-19 03:40:28 -0500
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 12.2.1.1.6.180125.1
Image activation date                : 2018-05-16 00:58:24 -0500
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 18.1.12.0.0.190111
Image activation date                : 2019-02-10 02:43:36 -0600
Imaging mode                         : out of partition upgrade
Imaging status                       : success
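
The history grows with every upgrade, so a quick filter for entries that did not end in success can be handy. The sketch below assumes the "Version" and "Imaging status" lines look like the ones above; the function name failed_image_versions is hypothetical, and any status other than the word success is simply treated as worth a look.

```shell
# Hypothetical helper: print the version of every imagehistory entry
# whose "Imaging status" is anything other than "success".
failed_image_versions() {
    awk '
        /^Version/                              { version = $NF }
        /^Imaging status/ && $NF != "success"   { print version }
    '
}

# Usage:
#   ssh dm01cel01 imagehistory | failed_image_versions
```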


  • Verify the image on all cells
[root@dm01db01 ~]# dcli -g cell_group -l root 'imageinfo | grep "Active image version"'
dm01cel01: Active image version: 18.1.12.0.0.190111
dm01cel02: Active image version: 18.1.12.0.0.190111
dm01cel03: Active image version: 18.1.12.0.0.190111
dm01cel04: Active image version: 18.1.12.0.0.190111
dm01cel05: Active image version: 18.1.12.0.0.190111
dm01cel06: Active image version: 18.1.12.0.0.190111
dm01cel07: Active image version: 18.1.12.0.0.190111
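
On a larger rack the same check can be reduced to the list of distinct versions. This is a small sketch, assuming the dcli output format shown above; the function name distinct_image_versions is hypothetical. If all cells were patched, exactly one version is printed.

```shell
# Hypothetical helper: print each distinct active image version once.
distinct_image_versions() {
    awk -F'Active image version: ' 'NF == 2 { print $2 }' | sort -u
}

# Usage:
#   dcli -g cell_group -l root 'imageinfo | grep "Active image version"' | distinct_image_versions
```

More than one line of output means at least one cell is still on a different image.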


  • Run patchmgr with the -cleanup option to remove all the temporary update or rollback files from the cells.
[root@dm01db01 patch_18.1.12.0.0.190111]# ./patchmgr -cells ~/cell_group -cleanup

2019-02-10 02:58:37 -0600        :Working: DO: Cleanup
2019-02-10 02:58:39 -0600        :SUCCESS: DONE: Cleanup


  • Start Clusterware and databases
[root@dm01db01 ~]# /u01/app/11.2.0.4/grid/bin/crsctl check crs
CRS-4639: Could not contact Oracle High Availability Services

[root@dm01db01 ~]# dcli -g dbs_group -l root '/u01/app/11.2.0.4/grid/bin/crsctl start crs'
dm01db01: CRS-4123: Oracle High Availability Services has been started.
dm01db02: CRS-4123: Oracle High Availability Services has been started.
dm01db03: CRS-4123: Oracle High Availability Services has been started.
dm01db04: CRS-4123: Oracle High Availability Services has been started.

[root@dm01db01 ~]# dcli -g dbs_group -l root '/u01/app/11.2.0.4/grid/bin/crsctl check crs'
dm01db01: CRS-4638: Oracle High Availability Services is online
dm01db01: CRS-4537: Cluster Ready Services is online
dm01db01: CRS-4529: Cluster Synchronization Services is online
dm01db01: CRS-4533: Event Manager is online
dm01db02: CRS-4638: Oracle High Availability Services is online
dm01db02: CRS-4537: Cluster Ready Services is online
dm01db02: CRS-4529: Cluster Synchronization Services is online
dm01db02: CRS-4533: Event Manager is online
dm01db03: CRS-4638: Oracle High Availability Services is online
dm01db03: CRS-4537: Cluster Ready Services is online
dm01db03: CRS-4529: Cluster Synchronization Services is online
dm01db03: CRS-4533: Event Manager is online
dm01db04: CRS-4638: Oracle High Availability Services is online
dm01db04: CRS-4537: Cluster Ready Services is online
dm01db04: CRS-4529: Cluster Synchronization Services is online
dm01db04: CRS-4533: Event Manager is online


[root@dm01db01 ~]# /u01/app/11.2.0.4/grid/bin/crsctl stat res -t | more
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA_dm01.dg
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.DBFS_DG.dg
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.LISTENER.lsnr
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.RECO_dm01.dg
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.asm
               ONLINE  ONLINE       dm01db01                 Started
               ONLINE  ONLINE       dm01db02                 Started
               ONLINE  ONLINE       dm01db03                 Started
               ONLINE  ONLINE       dm01db04                 Started
ora.gsd
               OFFLINE OFFLINE      dm01db01
               OFFLINE OFFLINE      dm01db02
               OFFLINE OFFLINE      dm01db03
               OFFLINE OFFLINE      dm01db04
ora.net1.network
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.ons
               ONLINE  ONLINE       dm01db01
               ONLINE  ONLINE       dm01db02
               ONLINE  ONLINE       dm01db03
               ONLINE  ONLINE       dm01db04
ora.registry.acfs
               ONLINE  OFFLINE      dm01db01
               ONLINE  OFFLINE      dm01db02
               ONLINE  OFFLINE      dm01db03
               ONLINE  OFFLINE      dm01db04
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dm01db04
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       dm01db03
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       dm01db01
ora.cvu
      1        ONLINE  ONLINE       dm01db02
ora.dbm01.db
      1        OFFLINE OFFLINE
      2        OFFLINE OFFLINE
      3        OFFLINE OFFLINE
      4        OFFLINE OFFLINE
ora.dm01db01.vip
      1        ONLINE  ONLINE       dm01db01
ora.dm01db02.vip
      1        ONLINE  ONLINE       dm01db02
ora.dm01db03.vip
      1        ONLINE  ONLINE       dm01db03
ora.dm01db04.vip
      1        ONLINE  ONLINE       dm01db04
ora.oc4j
      1        ONLINE  ONLINE       dm01db02
ora.orcldb.db
      1        ONLINE  ONLINE       dm01db01                 Open
      2        ONLINE  ONLINE       dm01db02                 Open
      3        ONLINE  ONLINE       dm01db03                 Open
      4        ONLINE  ONLINE       dm01db04                 Open
ora.nsmdb.db
      1        ONLINE  ONLINE       dm01db01                 Open
      2        ONLINE  ONLINE       dm01db02                 Open
      3        ONLINE  ONLINE       dm01db03                 Open
      4        ONLINE  ONLINE       dm01db04                 Open
ora.scan1.vip
      1        ONLINE  ONLINE       dm01db04
ora.scan2.vip
      1        ONLINE  ONLINE       dm01db03
ora.scan3.vip
      1        ONLINE  ONLINE       dm01db01
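
The resource listing is long, and the interesting rows are the ones where the target is ONLINE but the state is still OFFLINE (ora.registry.acfs in the output above). The following sketch picks those out. It assumes the 11.2 "crsctl stat res -t" layout shown above, where a resource name starts with "ora." and cluster resource rows begin with an instance number; the function name wanted_but_offline is hypothetical.

```shell
# Hypothetical helper: list resources with TARGET=ONLINE but STATE=OFFLINE.
wanted_but_offline() {
    awk '
        /^ora\./ { name = $1; next }               # remember the current resource
        {
            # Cluster resource rows start with an instance number
            if ($1 ~ /^[0-9]+$/) { target = $2; state = $3 }
            else                 { target = $1; state = $2 }
            if (target == "ONLINE" && state == "OFFLINE" && !(name in seen)) {
                seen[name] = 1
                print name
            }
        }
    '
}

# Usage:
#   /u01/app/11.2.0.4/grid/bin/crsctl stat res -t | wanted_but_offline
```

Resources whose target is OFFLINE, such as ora.gsd and ora.dbm01.db above, are deliberately not reported.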


  • Verify the databases and start them if needed
$ srvctl status database -d orcldb
$ srvctl status database -d nsmdb

 

Conclusion

In this article we have learned how to upgrade Exadata storage cells using the patchmgr utility. patchmgr can upgrade, roll back, and back up Exadata storage cells, and it can patch them in a rolling or non-rolling fashion; non-rolling is the default. Storage server patches apply operating system, firmware, and driver updates. Launch patchmgr from compute node 1, which must have root user equivalence set up to all the storage cells.

Monday 18 February 2019

Change root User Password on Exadata Infiniband Switch

I was working on changing the passwords for the administrative user accounts on all Exadata components when I encountered a strange issue with the root password on the Infiniband switches: we were unable to change it from the command line. We tried a couple of different command line methods, but all of them failed. This could be a bug, a firmware issue, or something else.

In this article we demonstrate how to change the root password on an Exadata infiniband switch using Browser User Interface.

Issue 1: Using passwd command

I first tried to change the root user password by running the passwd command through dcli. This method assumes you have ssh equivalence set up from compute node 1. As you can see, the command failed with a message to use the ILOM shell. In the past I have used the same command successfully to change the root password on IB switches.

[root@dm01db01 ~]#  dcli -g ibswitch_group -l root "echo welcome1 | passwd --stdin root"
dm01sw-ibb01: This command should not be used for ILOM users.
dm01sw-ibb01: Please use ILOM shell to handle password for this user.
dm01sw-ibb01: Example:
dm01sw-ibb01: -> set /SP/users/root password
dm01sw-ibb01:
dm01sw-iba01: This command should not be used for ILOM users.
dm01sw-iba01: Please use ILOM shell to handle password for this user.
dm01sw-iba01: Example:
dm01sw-iba01: -> set /SP/users/root password
dm01sw-iba01:


So I decided to log in to the IB switch directly and run the passwd command there instead of through dcli. The passwd command failed again with the same error.

[root@dm01sw-iba01 ~]# ssh dm01sw-ibb01
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.

[root@dm01sw-ibb01 ~]# hostname
dm01sw-ibb01

[root@dm01sw-iba01 ~]# passwd root
This command should not be used for ILOM users.
Please use ILOM shell to handle password for this user.
Example:
   -> set /SP/users/root password



Issue 2: Using ILOM Shell

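As the passwd command pointed to the ILOM shell, I logged in to the IB switch as ilom-admin and ran the password change command there. That attempt failed at the ILOM prompt as well, this time with a syntax error. Note that the usage message expects a property after the target (the passwd hint above names it as password), which the command I typed did not include, so this attempt may have failed because of the syntax rather than a problem on the switch.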

[root@dm01sw-iba01 ~]# su - ilom-admin

Oracle(R) Integrated Lights Out Manager
Version 2.2.7-1 ILOM 3.2.6 r118629
Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
 

Hostname: dm01sw-iba01

-> set /SP/users/root welcome1
set: Invalid command syntax
Usage: set [-script] [target] <property>=<value> [<property>=<value>...]



 

Solution: Using Browser User Interface

I decided to use the Browser User Interface (BUI) to change the password.

Steps:

  • Open a Browser and enter the IB Switch hostname or IP address
https://dm01sw-ibb01.netsoftmate.com
  • Accept the security warning and proceed to connect to the IB Switch
  • Enter the username and password to connect to the IB Switch

  • The summary page is displayed

  • On the left pane, expand ILOM Administration and select User Management

  • Click on User Accounts, select the root user and click on the Edit button

  • Enter the new password, confirm it, and click on the Save button to change the password.

  • To verify the new password, open a PuTTY session and ssh to the IB switch using the new password.
[root@dm01db01 ~]# ssh dm01sw-ibb01
Password:
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.

[root@dm01sw-ibb01 ~]# hostname
dm01sw-ibb01



Conclusion

In this article we have learned how to change the root password on Infiniband Switch using Browser User Interface when the command line option doesn't work.

Friday 15 February 2019

How To Clear Hardware Fault on Exadata Infiniband Switch Manually

Introduction

We had a fan failure (FAN2) on an Exadata Infiniband switch and scheduled the replacement of the faulty hardware with Oracle. The Oracle Field Engineer came to the customer data center and replaced the faulty fan. The replacement was successful, but the fault was not cleared automatically: both the Infiniband BUI and the CLI still showed the fan as faulted.

From Infiniband Browser User Interface



In this article we will demonstrate how to clear the fault on Infiniband Switch after hardware replacement.


  • Log in to the Infiniband switch as the root user (for example with PuTTY) and check the switch health. The output below shows that all the fans are OK.
[root@dm01sw-iba01 ~]# env_test
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.39 V
Measured 12V = 11.97 V
Measured 5V = 5.02 V
Measured VBAT = 3.14 V
Measured 2.5V = 2.49 V
Measured 1.8V = 1.79 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 40
Front temperature 41
SP temperature 57
Switch temperature 55, maxtemperature 59
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 17004
Fan 2 running at rpm 15696
Fan 3 running at rpm 17004
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Starting SSD test:
SSD test returned OK
Starting Auto-link-disable test:
Auto-link-disable test returned OK
Environment test PASSED

  • Check the fan speeds. All installed fans are running.
[root@dm01sw-iba01 ~]# getfanspeed
Fan 0 not present
Fan 1 running at rpm 17004
Fan 2 running at rpm 15478
Fan 3 running at rpm 17004
Fan 4 not present
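
If you want to flag a fan that is spinning noticeably slower than the others, the getfanspeed output can be filtered against a minimum rpm. This is only a sketch: the function name slow_fans is hypothetical, and the threshold is a value you choose yourself, not an Oracle specification. It prints the fan number and rpm of every running fan below the threshold.

```shell
# Hypothetical helper: print "<fan number> <rpm>" for running fans below $1 rpm.
# The threshold is caller-supplied; no official limit is assumed here.
slow_fans() {
    awk -v min="$1" '
        $1 == "Fan" && $3 == "running" && ($NF + 0) < (min + 0) { print $2, $NF }
    '
}

# Usage (on the switch, with an example threshold of 16000 rpm):
#   getfanspeed | slow_fans 16000
```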


  • Switch to the ilom-admin user
[root@dm01sw-iba01 ~]# su - ilom-admin

Oracle(R) Integrated Lights Out Manager

Version 2.2.9-3 ILOM 3.2.11 r124039

Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.

Warning: HTTPS certificate is set to factory default.

Hostname: dm01sw-iba01.netsoftmate.com

->


  • Check the fault table for any faulty components. FAN2 is still marked as Faulted even though the fan was replaced with a new one.
-> show / -a -l 4 -o table fault_state
Target                                  | Property                                     | Value
----------------------------------------+----------------------------------------------+--------------------------------------------------------------------
/SYS                                    | fault_state                                  | OK
/SYS/MB                                 | fault_state                                  | OK
/SYS/PSU0                               | fault_state                                  | OK
/SYS/PSU1                               | fault_state                                  | OK
/SYS/FAN1                               | fault_state                                  | OK
/SYS/FAN2                               | fault_state                                  | Faulted
/SYS/FAN3                               | fault_state                                  | OK

->


  • You can also execute the below command to identify the fault
-> show -d targets /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell
        0 (/SYS/FAN2)


  • Clear the fault as shown below
-> set /SYS/FAN2 clear_fault_action=true
Are you sure you want to clear /SYS/FAN2 (y/n)? y
Set 'clear_fault_action' to 'true'


  • Verify the fault is cleared
-> show / -a -l 4 -o table fault_state
Target                                  | Property                                     | Value
----------------------------------------+----------------------------------------------+--------------------------------------------------------------------
/SYS                                    | fault_state                                  | OK
/SYS/MB                                 | fault_state                                  | OK
/SYS/PSU0                               | fault_state                                  | OK
/SYS/PSU1                               | fault_state                                  | OK
/SYS/FAN1                               | fault_state                                  | OK
/SYS/FAN2                               | fault_state                                  | OK
/SYS/FAN3                               | fault_state                                  | OK

-> show -d targets /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell


  • Verify from the Infiniband BUI


Conclusion

In this article we have learned how to identify a fault and clear it manually on an Exadata Infiniband switch. The ILOM commands come in handy for clearing the fault. You can also clear the fault using the Browser User Interface (BUI).
