Sunday, January 5, 2014

Corrections for troubleshooting I/O fencing procedures

How vxfen driver checks for pre-existing split-brain condition
Replace this topic in the Veritas Cluster Server User's Guide for 5.0 MP3 with the following content.
The vxfen driver functions to prevent an ejected node from rejoining the cluster after the failure of the private network links and before the private network links are repaired.

For example, suppose the cluster of system 1 and system 2 is functioning normally when the private network links are broken. Also suppose system 1 is the ejected system. When system 1 restarts before the private network links are restored, its membership configuration does not show system 2; however, when it attempts to register with the coordinator disks, it discovers system 2 is registered with them. Given this conflicting information about system 2, system 1 does not join the cluster and returns an error from vxfenconfig that resembles:

vxfenconfig: ERROR: There exists the potential for a preexisting 
split-brain. The coordinator disks list no nodes which are in 
the current membership. However, they also list nodes which are 
not in the current membership.
I/O Fencing Disabled!

Also, the following information is displayed on the console:

<date> <system name> vxfen: WARNING: Potentially a preexisting
<date> <system name> split-brain.
<date> <system name> Dropping out of cluster.
<date> <system name> Refer to user documentation for steps
<date> <system name> required to clear preexisting split-brain.
<date> <system name>
<date> <system name> I/O Fencing DISABLED!
<date> <system name>
<date> <system name> gab: GAB:20032: Port b closed
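
The rule stated in the vxfenconfig error text can be sketched as a small shell function. This only illustrates the comparison; it is not the vxfen driver's actual code. The function name and the space-separated node-ID lists are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: illustrates the rule in the vxfenconfig error text.
# $1 = node IDs registered on the coordinator disks (space-separated)
# $2 = node IDs in the current cluster membership (space-separated)
preexisting_split_brain_check() {
    registered=$1
    members=$2

    # No registrations at all: nothing conflicts with the membership.
    if [ -z "$registered" ]; then
        echo "ok"
        return 0
    fi

    # If any registered node is also a current member, the two views agree.
    for r in $registered; do
        for m in $members; do
            if [ "$r" = "$m" ]; then
                echo "ok"
                return 0
            fi
        done
    done

    # Registered nodes exist, but none of them is in the current membership.
    echo "potential-preexisting-split-brain"
    return 0
}
```

With made-up node IDs, `preexisting_split_brain_check "1" "0"` reports the potential split-brain (the disks list only a node that is not a member), while `preexisting_split_brain_check "0 1" "0"` reports ok.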


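However, the same error can also occur when the private network links are working: both systems go down, system 1 restarts, and system 2 fails to come back up. From system 1's view of the cluster, system 2 may still have its registrations on the coordinator disks.

In either case, the VCS engine log on the affected node shows the fencing driver failing to configure and VCS stopping. For example (captured on a host named system2):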

<date> 07:29:25 VCS CRITICAL V-16-1-10029 VxFEN driver not configured. VCS Stopping. Manually restart VCS after configuring fencing
<date> 07:30:12 VCS NOTICE V-16-1-11022 VCS engine (had) started
<date> 07:30:12 VCS NOTICE V-16-1-11050 VCS engine version=4.1
<date> 07:30:12 VCS NOTICE V-16-1-11051 VCS engine join version=4.1000
<date> 07:30:12 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 04/27/07-03:15:00
<date> 07:30:12 VCS NOTICE V-16-1-10114 Opening GAB library
<date> 07:30:12 VCS NOTICE V-16-1-10619 'HAD' starting on: system2
<date> 07:30:12 VCS WARNING V-16-1-11030 HAD not ready to receive this command. Message was: 0xc31
<date> 07:30:12 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
<date> 07:30:17 VCS INFO V-16-1-10077 Received new cluster membership
<date> 07:30:17 VCS NOTICE V-16-1-10080 System (system2) - Membership: 0x1, Jeopardy: 0x0
<date> 07:30:17 VCS NOTICE V-16-1-10086 System system2 (Node '0') is in Regular Membership - Membership: 0x1
<date> 07:30:17 VCS NOTICE V-16-1-10322 System system2 (Node '0') changed state from CURRENT_DISCOVER_WAIT to LOCAL_BUILD
<date> 07:30:18 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
<date> 07:30:20 VCS CRITICAL V-16-1-10029 VxFEN driver not configured. VCS Stopping. Manually restart VCS after configuring fencing
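
To spot this pattern quickly in a long engine log, the two message IDs from the excerpt above can be counted. The following is a minimal sketch; the function name is hypothetical, and it only matches the IDs V-16-1-10037 and V-16-1-10029 that appear in this excerpt.

```shell
#!/bin/sh
# Hypothetical helper: reads VCS engine log text on stdin and counts
# the two fencing-related message IDs seen in the excerpt above.
fencing_log_summary() {
    awk '
        /V-16-1-10037/ { retries++ }   # "VxFEN driver not configured. Retrying..."
        /V-16-1-10029/ { stops++ }     # "VxFEN driver not configured. VCS Stopping."
        END { printf "retries=%d stops=%d\n", retries, stops }
    '
}
```

Feed it the log on standard input, for example `fencing_log_summary < engine_A.log` (the file name is an assumption; use your own engine log). A non-zero `stops` count means VCS gave up waiting for fencing and must be restarted manually after fencing is configured.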

To resolve actual and apparent potential split-brain conditions
Depending on the split-brain condition that you encountered, do the following:

Actual potential split-brain condition - system 2 is up and system 1 is ejected

Determine whether system 1 is up.
If system 1 is up and running, shut it down and repair the private network links to remove the split-brain condition.
Restart system 1.

Apparent potential split-brain condition - system 2 is down and system 1 is ejected

Physically verify that system 2 is down.

Verify the systems currently registered with the coordinator disks. Use the following command:
# vxfenadm -g all -f /etc/vxfentab

The output of this command identifies the keys registered with the coordinator disks.
Clear the keys on the coordinator disks as well as the data disks using the vxfenclearpre command.
# /opt/VRTSvcs/rac/bin/vxfenclearpre

To check whether the current host is a cluster member, display the I/O fencing information:
system2:root:/var/VRTSvcs/log # vxfenadm -d
I/O Fencing Cluster Information:
================================
VCS FEN vxfenadm ERROR V-11-2-1115 Local node is not a member of cluster!

Check whether a pre-existing split-brain condition exists:

system2:root:/var/VRTSvcs/log # /sbin/vxfenconfig -c
VCS FEN vxfenconfig NOTICE Driver will use SCSI-3 compliant disks.
VCS FEN vxfenconfig ERROR V-11-2-1016 There exists the potential for a preexisting split-brain
        The coordinator disks list no nodes which,
        are in the current membership.  However, they,
        also list nodes which are not in the,
        current membership.
        I/O Fencing Disabled!

Verify the systems currently registered with the coordinator disks. Use the following command:
system2:root:/var/VRTSvcs/log # vxfenadm -g all -f /etc/vxfentab
Device Name: /dev/sde
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdep
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdbw
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdhh
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdd
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdbv
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdhg
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdeo
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdc
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdhf
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sden
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdbu
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdb
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
Device Name: /dev/sdem
Total Number Of Keys: 2
key[0]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
key[1]: 
        Key Value [Numeric Format]:  66,45,45,45,45,45,45,45
        Key Value [Character Format]: B-------
system2:root:/var/VRTSvcs/log # 
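
With this many coordinator disks the listing is long, so it can help to reduce it to one line per key. The sketch below assumes the output layout shown above; both function names are hypothetical. `decode_key` shows how the "Numeric Format" bytes map to the "Character Format" string (plain ASCII), and `list_keys` prints each device next to the keys registered on it.

```shell
#!/bin/sh
# Hypothetical helper: convert a comma-separated list of decimal byte
# values (the "Numeric Format" line) to the matching ASCII string.
decode_key() {
    result=""
    old_ifs=$IFS
    IFS=','
    for byte in $1; do
        # Inner printf gives the octal code; %b turns \0NNN into that byte.
        result="$result$(printf '%b' "\\0$(printf '%o' "$byte")")"
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Hypothetical helper: read `vxfenadm -g all -f /etc/vxfentab` output on
# stdin and print "<device> <key in character format>" for every key.
list_keys() {
    awk '
        /^Device Name:/ { dev = $3 }
        /Key Value \[Character Format\]:/ { print dev, $NF }
    '
}
```

For example, `decode_key 66,45,45,45,45,45,45,45` prints `B-------`, and `vxfenadm -g all -f /etc/vxfentab | list_keys` condenses the listing above to two lines per device.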

Clear the fencing keys:
system2:root:/ # /opt/VRTSvcs/vxfen/bin/vxfenclearpre
This script recovers from a preexisting split brain condition.
This script removes the SCSI-3 registrations and reservations on
the coordinator disks as well as the data disks contained in shared
disk groups.
         ******** WARNING!!!!!!!! ********
THIS SCRIPT CAN ONLY BE USED IF THERE ARE NO OTHER ACTIVE NODES IN
THE CLUSTER!  VERIFY ALL OTHER NODES ARE POWERED OFF OR INCAPABLE OF
ACCESSING SHARED STORAGE.
If this is not the case, data corruption will result.
Do you still want to continue: [y/n] (default : n)
y
Cleaning up the coordinator disks...
Cleaning up the data disks for all shared disk groups...
Successfully removed SCSI-3 persistent registration and reservations
from the coordinator disks as well as the shared data disks.
Reboot the server to proceed with normal cluster startup...


system2:root:/ # /sbin/vxfenconfig -c            
VCS FEN vxfenconfig NOTICE Driver will use SCSI-3 compliant disks.

system2:root:/ # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   906801 membership 0                               
Port b gen   906816 membership 0
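
In the `gabconfig -a` output, port a is GAB itself and port b is the I/O fencing driver, so a membership entry on port b indicates that fencing has been configured on this node. A minimal sketch for checking this in a script follows; the function name is hypothetical and it assumes the line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `gabconfig -a` output on stdin and prints
# the membership field for one port letter (for example "b" for fencing).
# Prints nothing if the port has no membership line.
gab_port_membership() {
    awk -v port="$1" '$1 == "Port" && $2 == port && $5 == "membership" { print $6 }'
}
```

For the output above, `gabconfig -a | gab_port_membership b` prints `0`; empty output would mean port b is not open yet.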

Finally, start VCS normally on system2 by running hastart.


Thank you for reading.
For other articles, visit https://sites.google.com/site/unixwikis/
