Part 2/2: Troubleshooting VSAN errors. VSAN misconfiguration?

The VSAN cluster isn’t 100% healthy: there are problems seeing all of the storage.
Symptoms:
1. Cannot write to the VSAN Datastore
2. Correct amount of capacity isn’t present
How to resolve with ESXCLI
Log into each host via SSH. You do remember the root password, right? This example is a three-node VSAN cluster.
NODE 1:
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T01:11:33Z
Local Node UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
NODE 2:
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:12:02Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: MASTER  << a different master for a different UUID
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4  <<< That is a different UUID!
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: 60df1955-b9f1-d685-13b0-a0369f58b8e4
~ # esxcli vsan cluster leave
~ # esxcli vsan cluster join -u 551374b5-03f9-7bd6-6257-a0369f58b8e8   <<<- join the correct UUID (cluster)
 
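The leave/join step above can be planned from the captured output before touching any host. What follows is only a rough sketch, not part of the original fix: it groups nodes by the Sub-Cluster UUID they report and prints the two commands used above for any node outside the largest group. The "largest group wins" rule, the function names and the node labels are my assumptions; on a tie the first group seen is chosen, so check the target UUID by hand.

```python
from collections import defaultdict


def parse_cluster_get(text):
    """Collect the 'Key: value' lines of `esxcli vsan cluster get` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (prompt, header) are skipped
            info[key.strip()] = value.strip()
    return info


def plan_rejoin(outputs):
    """Return (target_uuid, {node: [commands]}) for nodes outside the largest group.

    Assumption: the Sub-Cluster UUID reported by the most nodes is the one to keep.
    """
    groups = defaultdict(list)
    for node, text in outputs.items():
        groups[parse_cluster_get(text).get("Sub-Cluster UUID", "")].append(node)
    target = max(groups, key=lambda uuid: len(groups[uuid]))
    plan = {}
    for uuid, nodes in groups.items():
        if uuid != target:
            for node in nodes:
                plan[node] = [
                    "esxcli vsan cluster leave",
                    f"esxcli vsan cluster join -u {target}",
                ]
    return target, plan


# Captures shortened to the one line the sketch reads (labels are hypothetical).
captured = {
    "node1": "Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8",
    "node2": "Sub-Cluster UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8",
    "node3": "Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8",
}

target, plan = plan_rejoin(captured)
for node, commands in plan.items():
    print(node)
    for command in commands:
        print("  ~ #", command)
```

The sketch only prints commands; it never runs them, so the decision to leave and rejoin stays with whoever is logged into the host.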
Validate on NODE 2
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:12:52Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: AGENT
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4  <<< three members
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: 17dd1955-0bdf-abac-aba6-a0369f58b8e4
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group Multicast Port: 12345
Multicast TTL: 5
Validate on NODE 3 and 1
 
NODE 3:
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:14:33Z
Local Node UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Local Node State: BACKUP
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: f16e1355-c174-11f6-2602-a0369f58b5a8
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group Multicast Port: 12345
Multicast TTL: 5
NODE 1
FIXED
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T01:12:22Z
Local Node UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4 <<< all three members!
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: 3b5e1955-5eb6-2bbc-57bc-a0369f58b8e8
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3 <<< all the same Multicast address
Master Group Multicast Port: 12345
Multicast TTL: 5
~ #
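Comparing the multicast settings by eye across three terminals is easy to get wrong, so here is a rough sketch that does the comparison on saved `esxcli vsan network list` output. It is only an illustration under a few assumptions: each host has a single VSAN interface, the function and variable names are made up for this example, and the mismatching address in the second sample is invented to show what a failure looks like.

```python
MULTICAST_FIELDS = (
    "Agent Group Multicast Address",
    "Agent Group Multicast Port",
    "Master Group Multicast Address",
    "Master Group Multicast Port",
    "Multicast TTL",
)


def parse_fields(text):
    """Collect 'Key: value' lines, dropping any '<<<' note pasted after a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.split("<<", 1)[0].strip()
    return fields


def multicast_mismatches(outputs):
    """Return {field: {host: value}} for every multicast field that differs."""
    parsed = {host: parse_fields(text) for host, text in outputs.items()}
    mismatches = {}
    for field in MULTICAST_FIELDS:
        values = {host: fields.get(field) for host, fields in parsed.items()}
        if len(set(values.values())) > 1:
            mismatches[field] = values
    return mismatches


TEMPLATE = """Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
Master Group Multicast Address: {master_addr}
Master Group Multicast Port: 12345
Multicast TTL: 5
"""

# Values as shown in the three outputs above.
good = {node: TEMPLATE.format(master_addr="224.1.2.3")
        for node in ("node1", "node2", "node3")}
# Hypothetical broken case: node3 given a made-up master group address.
bad = dict(good, node3=TEMPLATE.format(master_addr="224.1.2.9"))

print("good:", multicast_mismatches(good))
print("bad: ", multicast_mismatches(bad))
```

An empty result means every host reports the same five values; anything else lists the field and what each host reported for it.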

VMware Virtual SAN networking

VSAN networking can be a bit tricky to troubleshoot. Before I go deeper into the topic, here is a very important concept to remember about VSAN clusters.

Given any VSAN cluster, remember the following:

** “Introduction to Virtual SAN Networking

Before getting into network in detail, it is important to understand the roles that nodes/hosts can play in Virtual SAN. There are three roles in Virtual SAN: master, agent and backup. There is one master that is responsible for getting CMMDS (clustering service) updates from all nodes, and distributing these updates to agents. Roles are applied during cluster discovery, when all nodes participating in Virtual SAN elect a master. A vSphere administrator has no control over roles.”

** from Cormac’s troubleshooting guide

That is a lot to digest, but if you break it down you can see some key principles to remember about a VSAN cluster.

The roles in VSAN:
A. master
B. agent
C. backup

There is one master.
If you see more than one master, there is something not quite right with your VSAN cluster.

The VSAN admin does not control which node will be the master.
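The "one master" rule is easy to check mechanically once the output of `esxcli vsan cluster get` has been saved from each host. Below is a minimal sketch of that check, not an official tool: the function names, the host labels and the shortened sample text are all made up for illustration, and it assumes the output keeps its `Key: value` layout.

```python
def parse_cluster_get(text):
    """Collect the 'Key: value' lines of `esxcli vsan cluster get` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip the prompt line and the "Cluster Information" header
            info[key.strip()] = value.strip()
    return info


def find_masters(outputs):
    """Return the labels of every host that reports Local Node State: MASTER."""
    return sorted(
        host for host, text in outputs.items()
        if parse_cluster_get(text).get("Local Node State") == "MASTER"
    )


# Hypothetical, shortened captures: only a few lines of each output are kept.
captured = {
    "host-a": "Cluster Information\nEnabled: true\nLocal Node State: MASTER\n",
    "host-b": "Cluster Information\nEnabled: true\nLocal Node State: MASTER\n",
    "host-c": "Cluster Information\nEnabled: true\nLocal Node State: BACKUP\n",
}

masters = find_masters(captured)
if len(masters) == 1:
    print("OK: one master:", masters[0])
else:
    print("Check the cluster, masters reported by:", masters)
```

Paste each host's real output in place of the sample strings; anything other than exactly one master is the cue to start looking at cluster membership.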

Example:
Log into each node of a three-node VSAN cluster. The usual prerequisite for troubleshooting applies: make sure SSH is enabled on each host.

Run the following command on each node:
~ # esxcli vsan cluster get

The Cluster Information output from each node is shown below.
NODE 1

Cluster Information
Enabled: true

Current Local Time: 2015-03-30T22:38:38Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: a5ce1955-f5e5-5663-d338-a0369f58b8e4

Node 2
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-30T22:38:38Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: a5ce1955-f5e5-5663-d338-a0369f58b8e4

Node 3
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-30T22:56:46Z
Local Node UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Local Node State: BACKUP
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8

See the image below for the error seen in the web client.

From the output above, can you see the problem?

IMG_2285.PNG

IMG_2285-0.PNG