
ID #1077

Why do COMSTAR views sometimes fail to fail over with a ZFS service?

COMSTAR views have an associated LUN (logical unit number), which must be unique on that server. If the view is created without specifying a LUN, then the STMF framework will automatically assign a unique number. However, the STMF framework on one node knows nothing about the views on other nodes in the same cluster, so it is possible for it to assign a LUN that is already in use on a different node.

Consider a two-node cluster (solaris1 and solaris2) with two services configured: vol1 is running on solaris1 and vol2 on solaris2. An LU is created in each pool with:

root@solaris1:~# zfs create -V 100M vol1/test1
root@solaris1:~# stmfadm create-lu /dev/zvol/rdsk/vol1/test1 
Logical unit created: 600144F0C9D00C00000054087EF30001
root@solaris1:~# stmfadm add-view 600144F0C9D00C00000054087EF30001

and

root@solaris2:~# zfs create -V 100M vol2/test1
root@solaris2:~# stmfadm create-lu /dev/zvol/rdsk/vol2/test1 
Logical unit created: 600144F0389A8300000054087EA80001
root@solaris2:~# stmfadm add-view 600144F0389A8300000054087EA80001

Because the two views were created on different nodes, each node's STMF framework assigns a LUN independently, so both views can easily end up with the same LUN:

root@solaris1:~# stmfadm list-view -l 600144F0C9D00C00000054087EF30001
View Entry: 0
    Host group   : All
    Target Group : All
    LUN          : 0

and

root@solaris2:~# stmfadm list-view -l 600144F0389A8300000054087EA80001
View Entry: 0
    Host group   : All
    Target Group : All
    LUN          : 0

This is perfectly valid and will work until one service is moved. When a failover is performed, stmfha backs up the view information to the .mapping directory in the pool's root directory (/vol1/.mapping/ and /vol2/.mapping/), deletes the views and LUs from COMSTAR, and then adds them on the other node after the pool is imported.

In the scenario described above, when vol1 is moved to solaris2, the LU is loaded on the other node but the view fails to be added. The following error message also appears in /opt/HAC/RSF-1/log/stmfha/stmfha.log:

[7944 Sep 04 17:48:23 (UTC +0100)] ERROR: Failed to add view entry code 32782: STMF_ERROR_VE_CONFLICT: Adding this view entry is in conflict with one or more existing view entries - sending event notify.

and the view is not loaded into COMSTAR:

root@solaris2:~# stmfadm list-view -l 600144F0C9D00C00000054087EF30001
stmfadm: 600144f0c9d00c00000054087ef30001: no views found

As the error message suggests, the problem is that stmfha is trying to add a view with LUN = 0, when there is already a view with LUN = 0 on the system (from vol2).
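One way to spot such a clash before moving a service is to compare the LUNs reported on each node. The following is only a minimal sketch: it assumes the output of `stmfadm list-view -l <GUID>` has been saved to one file per node (the file names and both function names are hypothetical), and that every view uses the default All/All host and target groups, as in the example above.

```shell
#!/bin/sh
# Print the LUN from each "LUN : N" line of `stmfadm list-view -l <GUID>`
# output supplied on stdin. All other lines are ignored.
extract_luns() {
    sed -n 's/^[[:space:]]*LUN[[:space:]]*:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Print every LUN that appears in both saved outputs.
# $1 and $2 are hypothetical files, one per node, holding list-view output.
# Assumes all views use the All/All host and target groups.
find_lun_conflicts() {
    {
        extract_luns < "$1" | sort -u   # LUNs in use on the first node
        extract_luns < "$2" | sort -u   # LUNs in use on the second node
    } | sort -n | uniq -d               # keep only numbers seen on both nodes
}
```

With the two list-view outputs shown earlier saved as, say, solaris1.views and solaris2.views, running `find_lun_conflicts solaris1.views solaris2.views` would print 0, the LUN that both nodes assigned.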

 

To avoid this problem, the LUN should be specified when creating any views. That way, it is possible to ensure the LUN is unique within the cluster, rather than just within a single node:

root@solaris2:~# stmfadm add-view -n 2 600144F0C9D00C00000054087EF30001

The above command adds a view to the LU 600144F0C9D00C00000054087EF30001 with LUN = 2.
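Choosing a cluster-unique number by hand becomes error-prone as the number of views grows. As a rough sketch, a small helper could take the LUNs already in use on every node (for example, gathered from the `LUN` lines of `stmfadm list-view -l` output) and print the lowest number not yet taken. The function name `next_free_lun` is hypothetical and not part of stmfadm or stmfha.

```shell
#!/bin/sh
# Read LUNs already in use across the cluster (one per line) on stdin and
# print the lowest non-negative integer that is not in the list.
next_free_lun() {
    # After a numeric, de-duplicated sort, walk the list and advance the
    # candidate n each time it is found to be taken.
    sort -n -u | awk 'BEGIN { n = 0 } NF && $1 == n { n++ } END { print n }'
}
```

For instance, `printf '0\n0\n' | next_free_lun` prints 1, which could then be passed to `stmfadm add-view -n`. The helper only reflects the LUNs it was given at that moment, so views created on another node afterwards would still need to be checked.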

Last update: 2014-09-04 17:08
Author: Matt
Revision: 1.0
