
Friday, January 31, 2014

Intermittent disk failures

Intermittent disk failures occur off and on and involve problems that cannot be consistently reproduced. These failures are therefore the most difficult for the operating system to handle, and they can slow the system down considerably while the operating system tries to determine the nature of the problem. If you encounter intermittent failures, move the data off the disk and remove the disk from the system to avoid an unexpected failure later. Intermittent disk failures are, however, very rare.

With intermittent disk failures, you can sometimes see VxVM label a disk as failing. If Volume Manager experiences occasional I/O failures on a disk but can still access the private region of the disk, it marks the disk as failing.

Note: If the failing flag is set on a disk, it is not turned off until the administrator executes the following command:
# vxedit -g diskgroup set failing=off dm_name

1.  List the disks under Veritas Volume Manager control to determine which disk is marked as failing:

# vxdisk list 

DEVICE       TYPE      DISK         GROUP        STATUS

c0t1d0s2     sliced    disk01       rootdg       online failing

c0t3d0s2     sliced    rootdisk     rootdg       online

........(more)


2.  Clear the failing flag for the disk that is marked as failing:

# vxedit -g rootdg set failing=off disk01


3.  Verify that the flag has been cleared:

# vxdisk list
  
DEVICE       TYPE      DISK         GROUP        STATUS

c0t1d0s2     sliced    disk01       rootdg       online

c0t3d0s2     sliced    rootdisk     rootdg       online


........(more)
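
On a host with many disks, picking out the failing ones by eye is error-prone. The following is a minimal sketch, assuming the `vxdisk list` column layout shown above (DEVICE, TYPE, DISK, GROUP, STATUS); the function name `list_failing_disks` is hypothetical, and the sample rows are copied from the listing in step 1 so the snippet runs without VxVM installed.

```shell
#!/bin/sh
# Print "DISK GROUP" for every row whose STATUS column contains the word
# "failing". Expects `vxdisk list` output on standard input.
list_failing_disks() {
    awk '$1 != "DEVICE" && NF >= 5 {
        # STATUS can be several words ("online failing"), so scan from field 5 on.
        for (i = 5; i <= NF; i++)
            if ($i == "failing") { print $3, $4; break }
    }'
}

# Sample input taken from the listing above. On a live system this would be:
#   vxdisk list | list_failing_disks
list_failing_disks <<'EOF'
DEVICE       TYPE      DISK         GROUP        STATUS
c0t1d0s2     sliced    disk01       rootdg       online failing
c0t3d0s2     sliced    rootdisk     rootdg       online
EOF
```

Each disk name and disk group printed this way maps directly onto the arguments of the vxedit command in step 2. Clear the flag only after the underlying fault has been investigated.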

Checking Free Memory Slot in HP Unix Host


# echo "selclass qualifier memory;info;wait;infolog" | /usr/sbin/cstm

or

root> cstm
cstm>map
(2 is the device number for memory on this host, so select 2 in the next step)
cstm>sel dev 2
cstm>info
cstm>il

or

stm

Preventing users from logging in


If you want to prevent users from logging in to the system but don't want to change the run level to single-user mode, there is another option. The file /etc/default/security contains a variable called NOLOGIN. Setting it to 1 (in practice, uncommenting that line) gives you a way to keep new users from logging in. When it is set to 1, every application that uses session management with pam_hpsec (such as ssh) checks for the presence of /etc/nologin. If /etc/nologin exists, no more users can log in, and every user who attempts to log in is shown the contents of that file. Root is immune to this, so you can't lock yourself out of the system. For example:

# echo "System Maintenance until 4am - logins disallowed" > /etc/nologin

This is also how the shutdown process works. When the system reboots, this file is erased automatically, whether you created it manually or the shutdown process created it.
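
The two conditions described above (NOLOGIN=1 set in the security file, and the nologin file present) can be checked together. This is a minimal sketch, not an HP-supplied tool: the function name `logins_blocked` is hypothetical, both paths are passed as parameters, and the demonstration uses throwaway files rather than the real /etc paths. The `mktemp -d` call is a convenience for the demo and may behave differently on HP-UX.

```shell
#!/bin/sh
# Return 0 if new logins would be refused under the rule described above:
# an uncommented NOLOGIN=1 line in the security file AND an existing nologin file.
logins_blocked() {
    security_file=$1
    nologin_file=$2
    grep -q '^[[:space:]]*NOLOGIN=1[[:space:]]*$' "$security_file" 2>/dev/null &&
        [ -f "$nologin_file" ]
}

# Demonstration against temporary copies, not /etc/default/security or /etc/nologin.
demo_dir=$(mktemp -d)
printf 'NOLOGIN=1\n' > "$demo_dir/security"
echo "System Maintenance until 4am - logins disallowed" > "$demo_dir/nologin"

if logins_blocked "$demo_dir/security" "$demo_dir/nologin"; then
    echo "logins blocked"
else
    echo "logins allowed"
fi
rm -rf "$demo_dir"
```

On a real host the call would be `logins_blocked /etc/default/security /etc/nologin`.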

lvcreate can return the error: "Argument out of domain"

PROBLEM
lvsplit and lvcreate can return the error: "Argument out of domain".
How can this error be resolved?
Example 1:

# lvsplit /dev/vg01/lvol4  
lvsplit: The logical volume "/dev/vg01/lvol4b" could not be created:   
Argument out of domain 
lvsplit: Couldn't delete logical volume "/dev/vg01/lvol4b": 
The supplied lv number refers to a non-existent logical volume. 

Example 2:

# lvcreate -L 528 -n lvol5 /dev/vg05  
lvcreate: the logical volume /dev/vg01/lvol05 could not be created: 
Argument out of domain 

or

# lvcreate -D y -s g -L 361200 -n lvol9 /dev/vg201           
Warning: rounding up logical volume size to extent boundary at size "361216" MB.
lvcreate: The logical volume "/dev/vg201/lvol9" could not be created:
Argument out of domain

RESOLUTION
The formal definition of "Argument out of domain" is "You probably specified an argument a command does not support, or you specified a value to an argument that lies outside the acceptable range. Examine the syntax for the command, make adjustments, and try again." Both of the example commands above have valid arguments, which indicates the volume group configuration should be checked.

# vgdisplay /dev/vg01 
 Max LV   5 
 Cur LV   5 

This output indicates that the volume group has reached its maximum number of logical volumes: Cur LV equals Max LV.
max_lv is defined by the -l option when vgcreate was used to create the volume group. From the man page: "-l max_lv  Set the maximum number of logical volumes that the volume group is allowed to contain. The default value for max_lv is 255. The maximum number of logical volumes can be a value in the range 1 to 255."
Because this number cannot be changed on the fly at 11.x, the volume group has to be recreated with a higher -l setting.
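
The Max LV and Cur LV comparison can be done before running lvcreate or lvsplit. The sketch below assumes the vgdisplay output contains "Max LV" and "Cur LV" lines in the form shown above; the function name `lv_slots_left` is hypothetical, and the sample input is the excerpt from this post so the snippet runs without LVM.

```shell
#!/bin/sh
# Read vgdisplay output on standard input and print how many more logical
# volumes the volume group can hold (Max LV minus Cur LV).
lv_slots_left() {
    awk '$1 == "Max" && $2 == "LV" { max = $3 }
         $1 == "Cur" && $2 == "LV" { cur = $3 }
         END {
             # Fail if either line was missing from the input.
             if (max == "" || cur == "") exit 1
             print max - cur
         }'
}

# Sample input from the excerpt above. On a live system this would be:
#   vgdisplay /dev/vg01 | lv_slots_left
left=$(lv_slots_left <<'EOF'
 Max LV   5
 Cur LV   5
EOF
)

if [ "$left" -le 0 ]; then
    echo "volume group is full: no free logical volume slots"
else
    echo "$left logical volume slot(s) free"
fi
```

A result of 0 matches the situation in the examples, where lvcreate and lvsplit fail even though their arguments are valid.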