Overview
As we know, the Exadata Database Machine is a combination of hardware and software. Over time, either can fail or cause performance issues. In my experience, I have seen hardware failures during the Exadata install and also immediately after the installation completed. Newer generations of Exadata machines are more stable, so you might see fewer hardware failures.
When you work with Oracle Support on hardware, software, or performance issues, they will ask you to run the following diagnostic utilities and upload the diagnostic data.
Examples of hardware, software, and performance issues:
- Hardware failures: hard disk, flash disk, motherboard, processor, DIMM, and so on
- Software issues: operating system, firmware, Oracle software, and so on
- Performance issues: operating system and database
In this article I will demonstrate how to execute these utilities with a live example.
Diagnostic Utilities at a Glance
| Utility Name | Description |
|---|---|
| SOSREPORT | Collects detailed information about the hardware and configuration of Oracle Linux server |
| SUNDIAG | The utility is used for gathering hardware related information |
| ILOM SNAPSHOT | The utility is used for gathering hardware related information |
| EXAWATCHER | Collects the system data and reporting utilities. This information is mostly used for troubleshooting OS or performance issues. |
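Before opening a service request, it can help to confirm which of these utilities are present on a node. The sketch below is a hypothetical helper (check_utilities is not an Exadata command); it only tests the three paths used later in this article, and those paths are assumed to match your image version.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given paths exist and are executable.
check_utilities() (
    for path in "$@"; do
        if [ -x "$path" ]; then
            printf 'found    %s\n' "$path"
        else
            printf 'missing  %s\n' "$path"
        fi
    done
)

# Paths taken from this article; adjust them if your system differs.
check_utilities \
    /usr/sbin/sosreport \
    /opt/oracle.SupportTools/sundiag.sh \
    /opt/oracle.ExaWatcher/GetExaWatcherResults.sh
```

The ILOM snapshot is not in this list because it runs on the service processor, not on the node's operating system.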
Now let’s take a look at these utilities in a little more detail.
SOSREPORT UTILITY
The SOSREPORT utility collects detailed information about the hardware and configuration of an Oracle Linux server.
Steps to run SOSREPORT:
- Log in as root to the compute node or storage cell on which you want to run SOSREPORT (example: dm01db01)
[root@dm01db01 ~]# id
uid=0(root) gid=0(root) groups=0(root), 1(bin), 2(daemon), 3(sys), 4(adm), 6(disk), 10(wheel)
- The sosreport utility is located under /usr/sbin. You can also use the Linux command “locate” to search for the utility.
[root@dm01db01 ~]# locate sosreport
/usr/sbin/sosreport
- Execute the sosreport utility at the shell as follows:
[root@dm01db01 ~]# /usr/sbin/sosreport
The utility prompts you for input:
a. Press ENTER to continue, or CTRL-C to quit.
Press ENTER on your keyboard.
b. Please enter your first initial and last name [dm01db01]:
Press ENTER to accept the default, or enter a value of your choice.
c. Please enter the case number that you are generating this report for:
Enter the SR number.
The utility then runs for a while (approximately 5-6 minutes) and generates a compressed archive file in the /tmp directory.
- Use WinSCP or a similar utility to copy the output file to your desktop.
- Upload the output file to Oracle Support for review.
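sosreport prints the archive name at the end of its run, but if that terminal output is gone you have to find the file in /tmp again before copying it. A minimal sketch, assuming the archive name starts with sosreport- and ends in .tar.xz as in the sample run in this article; newest_archive is a hypothetical helper, not part of sosreport.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified file in a directory
# that matches a glob pattern. Prints nothing and returns 1 if none match.
newest_archive() (
    dir="$1"
    pattern="$2"
    newest=""
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for f in "$dir"/$pattern; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Example (run on the node after sosreport has finished):
#   newest_archive /tmp 'sosreport-*.tar.xz'
```

The printed path is the file to copy to your desktop and upload.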
Sample SOSREPORT Run:
[root@dm01db01 ~]# locate sosreport
/usr/sbin/sosreport
[root@dm01db01 ~]# /usr/sbin/sosreport
sosreport (version 2.2)
This command will collect diagnostic and configuration information from
this Oracle Linux system and installed applications.
An archive containing the collected information will be generated in
/tmp and may be provided to a Oracle USA support representative.
Any information provided to Oracle USA will be treated in accordance
with the published support policies at:
https://linux.oracle.com/
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.
No changes will be made to system configuration.
Press ENTER to continue, or CTRL-C to quit.
Please enter your first initial and last name [dm01db01]:
Please enter the case number that you are generating this report for [None]: 3-1386095xxxx
Running plugins. Please wait ...
Completed [66/66] ...
Creating compressed archive...
Your sosreport has been generated and saved in:
/tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz
The md5sum is: b1ccc01a773cbd36d463ba07b57c9f83
Please send this file to your support representative.
[root@dm01db01 ~]#
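The md5sum printed at the end of the run can be used to confirm that the archive was not corrupted while copying it off the node. A minimal sketch, assuming the standard md5sum command is available wherever you run the check; verify_md5 is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's MD5 checksum with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_md5() (
    file="$1"
    expected="$2"
    [ -r "$file" ] || { echo "cannot read $file" >&2; exit 2; }
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        exit 1
    fi
)

# Example, using the values from the sample run above:
#   verify_md5 /tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz \
#       b1ccc01a773cbd36d463ba07b57c9f83
```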
SUNDIAG UTILITY
The SUNDIAG utility gathers hardware-related information. Oracle Support uses this diagnostic data to assess hardware failures.
Steps to run SUNDIAG report:
Follow the steps listed below to run the sundiag.sh utility.
- Log in as root to the compute node or storage cell on which you want to run SUNDIAG (example: dm01cel01)
- The sundiag utility is located under /opt/oracle.SupportTools. You can also use the Linux command “locate” to search for the utility.
[root@dm01cel01 ~]# locate sundiag
/opt/oracle.SupportTools/sundiag.sh
- Run the sundiag.sh utility
[root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh
- Use WinSCP or a similar utility to copy the output file to your desktop.
- Upload the output file to Oracle Support for review.
Sample sundiag Run:
[root@dm01cel01 ~]# locate sundiag
/opt/oracle.SupportTools/sundiag.sh
[root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh
Oracle Exadata Database Machine - Diagnostics Collection Tool
Last alert date is beyond 7 days. Skipping OSW/Metrics collection
Gathering Linux information
Skipping collection of OSWatcher/ExaWatcher logs, Cell Metrics and Traces
Skipping ILOM collection. Use the ilom or snapshot options, or login to ILOM
over the network and run Snapshot separately if necessary.
/var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30
Gathering Cell information
==============================================================================
Done.
The report files are bzip2 compressed in
/var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30.tar.bz2
==============================================================================
If you read the output carefully, you will see that this sundiag run skipped both the ILOM data and the ExaWatcher data. That is why we run separate utilities to gather that data.
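When sundiag is run from a script, the archive path has to be picked out of its output before the file can be copied. A minimal sketch, assuming the output ends with the .tar.bz2 path on its own line as in the sample run above; sundiag_archive_path is a hypothetical helper, not an option of sundiag.sh.

```shell
#!/bin/sh
# Hypothetical helper: read sundiag output on stdin and print the last line
# that looks like an absolute path to a sundiag .tar.bz2 archive.
sundiag_archive_path() {
    grep -E '^/.*sundiag_.*\.tar\.bz2$' | tail -n 1
}

# Example (run as root on the node; the tee target is an arbitrary choice):
#   /opt/oracle.SupportTools/sundiag.sh | tee /tmp/sundiag.out | sundiag_archive_path
```

The working directory line printed earlier in the output has no .tar.bz2 suffix, so it is not matched.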
ILOM SNAPSHOT
Oracle Support requires this data to troubleshoot a hardware issue.
Follow the steps listed below to run the snapshot utility in the GUI.
Using GUI Interface:
- For ILOM 2.x and 3.0
- Open a web browser (use something other than Internet Explorer) and enter the address of the ILOM.
Note: You may see security warnings. Ignore or override them: click I understand the risks/Add exception/Confirm Security Exception.
- Enter root as the User Name, enter its password, and click Log In. This takes you to the Home screen.
- Select Maintenance -> Snapshot. (ILOM 2.x and 3.0)
The Service Snapshot Utility page appears.
[Screenshot: Service Snapshot Utility page, ILOM 2.x and 3.0]
- On that page, select Data Set “Normal”, select Transfer Method “Browser”, and click “Run”.
Normal - Specifies that ILOM, operating system, and hardware information is collected.
The download file will be saved according to your browser settings.
Important Note: Do not enable this option: 'Collect Only Log Files from Data Set'. Doing so will limit the snapshot to a much smaller subset of log files.
- In the dialog box, specify the directory in which to save the file and the file name. Click OK.
The file is saved to the specified directory.
- For ILOM version 3.1
ILOM 3.1 is the latest version shipped with X3/X4 Exadata, and its interface is laid out a little differently.
- Open a web browser (use something other than Internet Explorer) and enter the address of the ILOM.
Note: You may see security warnings. Ignore or override them: click I understand the risks/Add exception/Confirm Security Exception.
- Enter root as the User Name, enter its password, and click Log In. This takes you to the Home screen.
- Select ILOM Administration -> Maintenance -> Snapshot (ILOM 3.1)
The Service Snapshot Utility page appears.
[Screenshot: Service Snapshot Utility page, ILOM 3.1]
- Select Data Set “Normal”, select Transfer Method “Browser”, and click “Run”.
Normal - Specifies that ILOM, operating system, and hardware information is collected.
The download file will be saved according to your browser settings.
Important Note: Do not enable this option: 'Collect Only Log Files from Data Set'. Doing so will limit the snapshot to a much smaller subset of log files.
- In the dialog box, specify the directory in which to save the file and the file name. Click OK.
The file is saved to the specified directory.
Using CLI:
Follow the steps listed below to run the snapshot utility from the command line.
- Log in to the ILOM CLI
[root@dm01db01 ~]# ssh dm01db01-ilom
Password:
- You will see output similar to this:
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.15.j r101695
Copyright (c) 2015, Oracle and/or its affiliates. All rights reserved.
->
- At the '->' prompt, set the data set to normal:
-> set /SP/diag/snapshot dataset=normal
Set 'dataset' to 'normal'
- Set the transfer destination (here SFTP as user root with its password, to host 10.10.10.51, directory /tmp):
-> set /SP/diag/snapshot dump_uri=sftp://root:welcome@10.10.10.51/tmp
Set 'dump_uri' to 'sftp://root:welcome@10.10.10.51/tmp'
- Next cd to the snapshot directory and view the status:
-> cd /SP/diag/snapshot
/SP/diag/snapshot
-> show
/SP/diag/snapshot
Targets:
Properties:
dataset = normal
dump_uri = (Cannot show property)
encrypt_output = false
result = Running
Commands:
cd
set
show
->
Wait for the snapshot process to complete. It may take several minutes.
Continue to check until the status shows 'Snapshot Complete'.
Do not use, access, view, copy, or move the snapshot file until the snapshot has completed.
-> show
/SP/diag/snapshot
Targets:
Properties:
dataset = normal
dump_uri = (Cannot show property)
encrypt_output = false
result = Collecting data into sftp://root:*****@10.10.10.51/tmp/dm01db01-ilom_10.10.23.56_2016-12-29T08-44-09.zip
Snapshot Complete.
Done.
Commands:
cd
set
show
- You can now exit the CLI and find your snapshot in the directory you specified.
-> exit
Connection to dm01db01-ilom closed.
- The file name will look similar to this example:
dm01db01-ilom_10.10.10.56_2016-12-29T08-44-09.zip
Do not rename the snapshot file.
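Checking the result property by eye is easy to get wrong when the output is long. The sketch below classifies the text of a show /SP/diag/snapshot listing using only the strings visible in the sample output above ('Running', 'Collecting data', 'Snapshot Complete'); snapshot_state is a hypothetical helper, and other ILOM versions may word the status differently.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "show /SP/diag/snapshot" on stdin
# and print one of: complete, running, unknown.
snapshot_state() (
    text=$(cat)
    if printf '%s\n' "$text" | grep -q 'Snapshot Complete'; then
        echo complete
    elif printf '%s\n' "$text" |
         grep -Eq 'result[[:space:]]*=[[:space:]]*(Running|Collecting data)'; then
        echo running
    else
        echo unknown
    fi
)

# Example, with the listing saved to a file of your choice:
#   snapshot_state < show_output.txt
```

Only a result of complete means the file on the SFTP target is safe to copy or upload.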
EXAWATCHER
The /opt/oracle.ExaWatcher directory contains the Oracle ExaWatcher system data gathering and reporting utilities. This information is mostly used for troubleshooting OS or performance issues.
Steps for ExaWatcher collection:
- Navigate to the ExaWatcher directory and execute the GetExaWatcherResults.sh script.
[root@dm01db01 ~]# cd /opt/oracle.ExaWatcher/
[root@dm01db01 oracle.ExaWatcher]# ls -ltr GetExaWatcherResults.sh
-rwx------ 1 root root 21012 Oct 21 2015 GetExaWatcherResults.sh
[root@dm01db01 oracle.ExaWatcher]# ./GetExaWatcherResults.sh -h
Usage:
./GetExaWatcherResults.sh {--from $FromTime [--to $ToTime] | --at $AtTime [--range $Hours]}
[--archivedir $ArchiveDir]
[--scp $UserName@SrvName]
[--filter $SamplerName]
[--resultdir $ResultDir]
- To collect from/to a certain date and time:
# ./GetExaWatcherResults.sh --from 07/31/2015_00:00:00 --to 07/31/2015_23:00:00
Time format: mm/dd/yyyy_hh:mm:ss
Default output location: /opt/oracle.ExaWatcher/archive/ExtractedResults
[root@dm01cel07 ExtractedResults]# cd /opt/oracle.ExaWatcher/archive/ExtractedResults
- To collect for a time range. In this case, we are collecting for 4 hours before and after 13:00:
# ./GetExaWatcherResults.sh --at 08/05/2015_13:00:00 --range 4
Time format: mm/dd/yyyy_hh:mm:ss
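The timestamp format is easy to mistype, so it can be generated instead. A minimal sketch, assuming a date command that accepts -d with a "YYYY-MM-DD HH:MM:SS" string and with @epoch (GNU date does); exa_time and exa_window are hypothetical helpers, and exa_window only illustrates the window that --at with --range covers as described above.

```shell
#!/bin/sh
# Hypothetical helper: convert "YYYY-MM-DD HH:MM:SS" into the
# mm/dd/yyyy_hh:mm:ss form shown in the examples above.
exa_time() {
    date -d "$1" '+%m/%d/%Y_%H:%M:%S'
}

# Hypothetical helper: print the --from/--to pair equivalent to
# "--at <time> --range <hours>", i.e. <hours> before and after <time>.
exa_window() (
    at_epoch=$(date -d "$1" '+%s') || exit 1
    span=$(( $2 * 3600 ))
    from=$(date -d "@$(( at_epoch - span ))" '+%m/%d/%Y_%H:%M:%S')
    to=$(date -d "@$(( at_epoch + span ))" '+%m/%d/%Y_%H:%M:%S')
    printf '%s\n' "--from $from --to $to"
)

# Example:
#   ./GetExaWatcherResults.sh --from "$(exa_time '2015-07-31 00:00:00')" \
#       --to "$(exa_time '2015-07-31 23:00:00')"
```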
The default archive directory is /opt/oracle.ExaWatcher/archive/ExtractedResults; however, you can change this using the [-d|--archivedir] flag.
Example that changes the default archive location to /tmp/ExaWatcherArchive:
# ./GetExaWatcherResults.sh --from 01/25/2014_13:00:00 --to 01/25/2014_14:00:00 --archivedir /tmp/ExaWatcherArchive
Conclusion
In this article we have seen the different Exadata diagnostic utilities and how to execute them to collect diagnostic data. These utilities are used on a daily basis to assess hardware, software, and performance issues.