Design, Storage, Uncategorized, Virtualization, vmware, VSAN

Core Knowledge vSAN HBA

The fundamentals cannot be over-emphasized. You need to ensure that the key components of your vSAN hosts are configured per recommendations.

Just a reminder of the HBA controller configuration.

  1. Make sure the device is on the Hardware Compatibility Guide (HCG).
  2. Verify that the firmware is up to date.

I have seen firsthand what impact different firmware can have on your environment.

Example: Dell PERC H310

Controller queue depth impacts rebuild/resync times. A low controller queue depth may impact the availability of your production VMs during a rebuild/resync. vSAN requires a minimum queue depth of 256. Some vSAN Ready Node profiles, such as All Flash configurations, require a minimum queue depth of 512.
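
To make the two thresholds concrete, here is a minimal Python sketch that checks a controller queue depth against the minimums quoted above. The function name and the sample depth of 25 are illustrative assumptions, not output from any VMware tool. Take the real value from the compatibility guide entry for your controller and firmware.

```python
# Minimum controller queue depths quoted in the text above.
MIN_QUEUE_DEPTH = 256            # baseline vSAN requirement
MIN_QUEUE_DEPTH_ALL_FLASH = 512  # some All Flash Ready Node profiles


def meets_queue_depth(queue_depth, all_flash=False):
    """Return True if the queue depth meets the quoted minimum."""
    required = MIN_QUEUE_DEPTH_ALL_FLASH if all_flash else MIN_QUEUE_DEPTH
    return queue_depth >= required


# Hypothetical controller that reports a queue depth of 25.
print(meets_queue_depth(25))                    # False
print(meets_queue_depth(600, all_flash=True))   # True
```

A check like this is only a gate. It does not replace validating the exact controller, driver and firmware combination against the compatibility guide.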

For more details see this: vSAN Hardware Quick Reference Guide

vSAN and VMFS can end up vying for the same resource: the HBA.

Do NOT mix disk access modes on your Host Bus Adapter (HBA), also called an I/O controller. Pass-through configuration is preferred, but RAID-0 can work. vSAN prefers more direct access to the devices attached to the I/O controller. So, for example, if the HBA is set up with a logical configuration that groups all the devices together before presenting them to the ESXi host, then you have some prep work to do. Several array controllers do not support pass-through mode. To use this type of controller for vSAN, you need to create a single-disk RAID-0 group for every SSD and HDD.

 


Dell PERC 740

Examples of mixing:

  • Different RAID levels or access modes for the attached devices.
  • vSAN and VMFS devices on the same HBA.

From the VMware KB:

  • Do not mix the controller mode for vSAN and non-vSAN disks.
    • If the vSAN disks are in pass-through/JBOD mode, the non-vSAN disks must also be in pass-through/JBOD mode.
    • If the vSAN disks are in RAID mode, the non-vSAN disks must also be in RAID mode.
    • Mixing the controller mode will mean that various disks will be handled in different ways by the storage controller. This introduces the possibility that issues affecting one configuration could also affect the other, with possible negative consequences for vSAN.
    • https://kb.vmware.com/s/article/2129050

If you absolutely must use the same HBA:

  1. Limit the use of the VMFS datastore that shares the HBA with vSAN.
  2. Do NOT use RDMs on that shared device/HBA.
  3. Do NOT put the boot device on the same controller as vSAN.
  • If the non-vSAN disks are in use for VMFS, the VMFS datastore should be used only for scratch, logging and coredumps.
    • Virtual machines should not be running from a disk or RAID group that shares its controller with vSAN disks or RAID groups.
    • ESXi host installation is permitted on non-vSAN disks attached to same controller.
  • Do not pass through non-vSAN disks to virtual machine guests as Raw Device Mappings (RDMs).
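
The "do not mix controller modes" rule is easy to check once you have an inventory of disks per controller. Here is a minimal Python sketch. The controller names, the mode labels ("passthrough", "raid0") and the inventory itself are made up for illustration, so you would have to build the list from your own hosts.

```python
from collections import defaultdict


def mixed_mode_controllers(disks):
    """Return the controllers whose disks do not all share one access mode.

    `disks` is an iterable of (controller, mode) pairs. The mode labels
    are illustrative names, not values reported by any VMware tool.
    """
    modes = defaultdict(set)
    for controller, mode in disks:
        modes[controller].add(mode)
    return sorted(c for c, seen in modes.items() if len(seen) > 1)


# Hypothetical inventory for one host.
inventory = [
    ("vmhba0", "passthrough"),  # vSAN cache device
    ("vmhba0", "passthrough"),  # vSAN capacity device
    ("vmhba0", "raid0"),        # non-vSAN VMFS disk: mixes modes
    ("vmhba1", "raid0"),
    ("vmhba1", "raid0"),
]

print(mixed_mode_controllers(inventory))  # ['vmhba0']
```

Any controller in the returned list breaks the rule from the KB and deserves a closer look.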

The number and type of drives, plus their disk group configuration, are not covered here, but that is another important topic for discussion!

 

 

 

Design, servers, Storage, Uncategorized, Virtualization, vmware

vSphere Content Libraries (CL)


The Content Libraries feature was introduced with vSphere 6. The goal is to reduce the complexity of managing the VM templates, vApps, ISO images, and scripts that your virtual environment needs for day-to-day operations. Content libraries are container objects.
A content library can be:

  1. Local to the vCenter you create it in.
  2. Published externally to other vCenters, optionally with password authentication.
  3. Subscribed to another (published) library.

The flexibility of content library topologies will help your organization maximize its operational efficiency. How? Here are some questions that administrators face:
“What template did you use to build this VM?”
“Is it patched? Is it the latest one?”

Now imagine this conversation across business units that span geographic regions, time zones, etc.
What and where?
A key thing that a CL helps prevent is the bad practice of building workflows and processes around a single person. By using a central repository of essential files, you avoid using the “wrong” VM template. That answers the question “What version is the latest?” It also makes it quicker to answer “Where is the latest version?”

How do you set up a CL?

  1. In the vSphere Web Client navigator, select vCenter Inventory Lists > Content Libraries.
  2. Click the Objects tab.
  3. Click the Create a New Library icon (create a content library).
  4. Enter a name for the content library, and in the Notes text box, enter a description for the library and click Next.
  5. Select the type of content library that you want to create.

Option and description:

  • Local content library: A local content library is accessible only in the vCenter Server instance where you create it.
  • Published content library: Select Publish externally to make the content of the library available to other vCenter Server instances. If you want the users to use a password when accessing the library, select Enable authentication and set a password.
  • Optimized published content library: Select Optimize for syncing over HTTP to create an optimized published library. This library is optimized to ensure lower CPU usage and faster streaming of the content over HTTP. Use this library as a main content depot for your subscribed libraries. You cannot deploy virtual machines from an optimized library. Use an optimized published content library when the subscribed libraries reside on a remote vCenter Server system and Enhanced Linked Mode is not used.
  • Subscribed content library: Creates a content library that is subscribed to a published content library. You can sync the subscribed library with the published library to see up-to-date content, but you cannot add or remove content from the subscribed library. Only an administrator of the published library can add, modify, and remove contents from the published library.

Provide the following settings to subscribe to a library:

  1. In the Subscription URL text box, enter the URL address of the published library.

  2. If authentication is enabled on the published library, enter the publisher password.

  3. Select a download method for the contents of the subscribed library.

    • If you want to download a local copy of all the items in the published library immediately after subscribing to it, select Download all library content immediately.

    • If you want to save storage space, select Download library content only when needed. You download only the metadata for the items in the published library.

      If you need to use an item, you can synchronize it to download its content.

  4. When prompted, accept the SSL certificate thumbprint.

    The SSL certificate thumbprint is stored on your system until you delete the subscribed content library from the inventory.

6. Click Next.
7. Select a datastore, or enter the path to a remote storage location where to keep the contents of this library.

Option and description:

  • Enter an SMB or an NFS server and path: If you use a vCenter Server instance that runs on a Windows system, enter the SMB machine and share name. If you use vCenter Server Appliance, enter a path to an NFS storage. You can store your templates on an NFS storage that is mounted to the appliance. After the create a new library operation is complete, the vCenter Server Appliance mounts the shared storage to the host OS.
  • Select a datastore: Select a datastore from your vSphere inventory. A vSAN datastore will appear here as a choice.

8. Review the information on the Ready to Complete page and click Finish.

Great, now you have a Content Library. What next?

ADD CONTENT to your Content Library.
You can clone a VM as a template into your Content Library: right-click the VM and choose
Actions –> Clone –> Clone to Template in Library.


Now for another time saver!
So, you already realize the importance of a repository, and you have a single folder on a datastore called /iso-templates. Now what? You need to copy all of that into your new Content Library so you can publish the CL and enable other vCenters to subscribe.
The tricky part is dealing with ISO images.

Sure, templates and VMs can be handled with the Clone to Template action, but here is an option for ISO files that already exist in your datastore. It will save you a bit of time, because you do not have to re-copy the ISO into the content library.

When I first started to use the CL, I didn’t see an option in the CL to add ISO files. I reached out to Roman Konarev and he provided this excellent guide.

 

How to import your ISOs from a datastore (DS):
1)    Get a URL to the ISO file that you want to import into the Content Library. The structure of that URL is the following: [DataStore url]/[ISOs folder]/[file_name].

Here is my ISOs folder:
1.png
Here is my DS url:

So, the final URL will be the following: ds:///vmfs/volumes/56cd1758-86602854-5166-020019640efe/RK_ISOs/small_ISO.iso

2)    Open the standard “Import library item” wizard and paste the URL above into it.
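
If you have a whole folder of ISOs to import, building the URLs by hand gets old quickly. Here is a minimal Python sketch that assembles the [DataStore url]/[ISOs folder]/[file_name] structure from step 1. The function name is my own, and the sample values are simply the ones from the example above.

```python
def iso_import_url(datastore_url, folder, file_name):
    """Join [DataStore url]/[ISOs folder]/[file_name] into one import URL."""
    parts = [
        datastore_url.rstrip("/"),  # tolerate a trailing slash on the DS url
        folder.strip("/"),
        file_name.strip("/"),
    ]
    return "/".join(parts)


url = iso_import_url(
    "ds:///vmfs/volumes/56cd1758-86602854-5166-020019640efe/",
    "RK_ISOs",
    "small_ISO.iso",
)
print(url)
```

The printed URL is the string you paste into the import wizard.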

 

** vSphere 6.5 update: upgrade to vSphere 6.5 and make it easier! **

What a difference a version makes!

Procedure

  1. In the vSphere Web Client navigator, select vCenter Inventory Lists > Content Libraries.
  2. Right-click a content library and select Import Item.

    The Import Library Item dialog box opens.

  3. Under the Source section, select the option to import an item from a local file. Click Browse to navigate to the file that you want to import from your local system. You can use the drop-down menu to filter files in your local system.
  4. Under the Destination section, enter a name and description for the item, and click OK.

Content Libraries can even extend into the Cloud!

Create a content library that is subscribed to the content library you published from your on-premises data center. Content is synchronized from your on-premises data center to your SDDC in VMware Cloud on AWS.

backup, vdp, vmware

Snapshots are not backups! or VDP and YOU

Ominous words would be echoed in the meeting… “You do have a BACKUP right?”


Working in production environments brings a constant challenge: maintain uptime, aka ‘steady state’, while at the same time moving forward, slowly or as quickly as feasible, with the changing demands of the business.

Change can come in many forms. It is a driver for your organization.

A simple response to a vulnerability: patching is a necessity.

New features are required. Upgrades will be needed.

And, more importantly, disaster avoidance. The idea is to prepare in advance to avoid disaster. It is akin to shifting and dodging BEFORE some bump comes in the road. There are many approaches to this, like having a stretched metro cluster across geographic locations.

Whatever the driver, you have to have a fallback plan. What if the post-change activity fails, or there is an unforeseen after-effect? Things do not always work 100% as planned. What is your fallback plan? What? You have a VMware environment and you did click the snapshot button? Well, that does work, but it isn’t a full backup.

From KB 1025279

  • Snapshots are not backups. A snapshot file is only a change log of the original virtual disk.
  • Snapshots are not complete copies of the original vmdk disk files….it only copies the delta disks. The change log in the snapshot file combines with the original disk files to make up the current state of the virtual machine. If the base disks are deleted, the snapshot files are useless.
  • Delta files can grow to the same size as the original base disk file, which is why the provisioned storage size of a virtual machine increases by an amount up to the original size of the virtual machine multiplied by the number of snapshots on the virtual machine.
  • The maximum supported amount of snapshots in a chain is 32. However, VMware recommends that you use only 2-3 snapshots in a chain. — [ed The reason is there is a performance hit]
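
The delta-file bullet above is worth doing the arithmetic on. Here is a minimal Python sketch of the worst case the KB describes, where every snapshot delta grows to the size of the base disk. The function name and the 100 GB example VM are illustrative assumptions.

```python
def worst_case_provisioned_gb(base_disk_gb, snapshots):
    """Upper bound on provisioned storage for a VM with snapshots.

    Base disk plus one delta per snapshot, where each delta can grow
    to the size of the original base disk.
    """
    return base_disk_gb + base_disk_gb * snapshots


# Hypothetical 100 GB VM.
print(worst_case_provisioned_gb(100, 3))   # 400
print(worst_case_provisioned_gb(100, 32))  # 3300
```

So a 100 GB VM with the recommended maximum of 3 snapshots can need up to 400 GB, and a full chain of 32 could need up to 3300 GB.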

In fact, VMware’s recommendation is to set up an alarm in vCenter that fires when a VM is running from a snapshot, so you can avoid this condition.

See KB 1018029 “Configuring VMware vCenter Server to send alarms when virtual machines are running from snapshots”

Now the question still remains.. What options do you have?

Well, there is good news!! From VMware, as of March 1, 2015: “VMware vSphere Data Protection Advanced will be consolidated into VMware vSphere Data Protection (available through vSphere Essentials Plus Kit or higher vSphere editions, all vSphere with Operations Management editions and all vCloud Suite editions) and will no longer require purchase of a separate license. All functionality available with vSphere Data Protection Advanced, previously available as a standalone product, is now included in VMware vSphere Data Protection 6.0.” – See more at: Announcement

WOOHOO.

Why is this cool? There are many reasons but to sum things up.

vSphere Data Protection (VDP) is very cool. It is built on modern backup technology:

  • There are no tapes
  • There is deduplication – Variable length up
  • There is replication
  • File recovery
  • VM recovery
  • Application aware backups
  • Efficient, bandwidth throttling
  • Changed Block Tracking (CBT) Restore
    vSphere Data Protection uses Changed Block Tracking (CBT) during image-level backups. CBT is also utilized with image-level restores in some cases to improve speed and efficiency
  • It can plug into something really big (Data Domain and Avamar)
    • Data Domain allows for “Consolidate backup, archive, and disaster recovery with high-speed deduplication”
    • Avamar  DEDUPLICATION BACKUP SOFTWARE AND SYSTEM — VDP is based on Avamar. See the announcement.

It is super easy to install and use. I did say easy, and it is, because VDP gives you:

  • Linux-based virtual appliance: Easily install and configure backups.
  • Self-Service File Level Recovery: Enable guest OS administrators to restore individual files and folders.
  • Wizard-driven backup policies: Assign backup jobs to individual virtual machines or larger containers such as a cluster or resource pool, with specific schedules and retention policies.
  • There is no need for agents in the VM for normal backups.
  • Application aware backups. Backup agents for Microsoft SQL Server, Exchange, and SharePoint. The agents enable application consistent backup and recovery of these applications on virtual and physical machines

– See more at: http://www.vmware.com/products/vsphere/features/data-protection.html

Some tips about VDP deployment.

Do not put all your eggs in the same basket

Don’t set up your backup volumes in the same datastore your VMs reside in. Using Data Domain is a great option! Data Domain can be the backup target.

DNS. Have it working!

Is it fast enough? Avoid problems and run a performance test before your backups. Make sure your backup targets are validated for performance.

The initial deployment is via OVF.

Log on via https://ip-address-assigned:8543/vdp-configure/

Here you log in as root/changeme

BUT if you need to ssh in later via IP or hostname, you cannot use the root account. You must use the admin account, which has the same password, and then su to root.

“Currently, users can access the VDP appliance command line using the vSphere Client console, SSH, or Putty sessions. With the VDP 5.8 and later releases, the ability to use SSH or Putty to log on to the VDP appliance with the root user has been removed.” — Administration Guide

and lastly

VAMI is your friend and so is the log.

VAMI stands for Virtual Appliance Management Infrastructure. It provides end users of virtual appliances with a Web console and command line interface that can:

  • Configure network settings
  • Check for updates and install them, manually or automatically
  • Review basic system information for the virtual appliance
  • Stop or restart the virtual appliance

Where is the magical vami?

From the command line you can find it here: /opt/vmware/share/vami/


and if you run into problems..

Log in via ssh to the vdp appliance. Run the following while you attempt the action where you see the error.

root@vdp01:~/#: tail -f /usr/local/avamar/var/vdr/server_logs/vdr-server.log

Then watch the log and try to reproduce the error.

Additional Resources:

Here is a great overview from the VMware HOL team!

VDP overview install and backups! VDP DEMO

and more VDP feature walk through DEMO

and learn about:

  1. Creating a Virtual Machine Backup Job
  2. Creating a Replication Job
  3. Creating an Application Backup Job
  4. File Level Restore
  5. Restoring a Virtual Machine
  6. Restoring an Application

EMC, vmware, VSAN

Is your IT infrastructure an Oil Tanker?

I had the opportunity to attend an EVO-RAIL VSPEX BLUE BootCamp for channel partners, sponsored by Avnet, EMC, VMware and Brocade.

In a nutshell, it was all-you-can-drink information from a firehose about EVO-RAIL, specifically the EMC VSPEX BLUE.

EVO RAIL is a new beast of an animal. It is a different breed. No, not in any single dimension you can measure. The combination of technology presented is a synergy. Definition: synergy is the creation of a whole that is greater than the simple sum of its parts.

Yes, you can get the form factor for compute separately. You can also do the same for VSAN, ESXi/vSphere 5.5, and networking. But you cannot get what the entire VSPEX BLUE offering of EVO RAIL provides TOGETHER.

I get ahead of myself.

You have to have perspective to understand where we are today. To me that means if you don’t know where you came from, you cannot know where you are today or where you will be tomorrow. It is all relative.

EVO RAIL is a clustered system. It is a Datacenter in a 2U form factor. Not just compute but a modern hyper converged solution.


Sure. Another IT buzzword. Is it just talk? I would say no.

Everyone is talking about it but only a few are “doing it”… more often than not the IT industry is abuzz with the new technology of the day.
In this case it is hyper-converged.
To put it simply, storage is local to compute. Wait a minute, how is that different from 15 years ago, when the client-server model was the norm and storage was already local to compute? (Compute meaning the processing done by the server CPU.) Well, lots has changed.
How is it better? A snapshot of the currently available technology:
  • Compute is way, way faster and more dense.
  • Networks are 10 gigabit; Fast Ethernet 100Base-T or FDDI optical rings are no longer the only viable choices.
  • Storage is IOPS crazy.
  • And add to that the agility of VMware virtualization!
So look at the speed of compute, network and storage. Technology will continue to get faster and better (lower cost for the return on investment).
But what really hasn’t changed much is the complexity of the solution. There are many moving parts, so how do you deliver your solution today to support legacy applications and still have the agility to respond to changing business objectives?
Remember the old phrase “turning an oil tanker on a dime”? Is IT today an oil tanker? Does your private cloud have the agility your business requires? How will the current toolset respond? How will your staff? Oh, what was that? “IT staffing has been reduced and that is a trend that hasn’t gone away.”
IT organizations are forced to do more with less… cue the viable alternatives…
Do you outsource? It is the simplest short-term gain, but not always the best long-term investment.
I view EVO RAIL as a datacenter in a box. EVO-RAIL VSPEX BLUE is not the cluster-in-a-box solution but much, much more.
This reminds me of its forerunner, the MSCS cluster in a box. Like I said: perspective. Where has the IT industry been, relative to where it is today? I recall back in the day, circa 2000, when cluster in a box was a viable solution (and the best one at the time). That was only a high-availability MSCS cluster with one node active at a time.
Yes, I deployed and supported a few of these solutions. It was cutting edge back then.
Adjusted for inflation:
$1000.00 USD in 2000 is $1395.20
so the CL1850 was $27,864.00 (2000 $)
or $38,569 in 2015 $.
But what did you get for your IT dollar?
Each “node” in the CL1850
RAM:
  • 1 GB RAM (128-MB 100-MHz registered ECC SDRAM memory)
Compute:
  • 2 Pentium III processors @550MHz
Network
  • 3 NICs (one dedicated to internode communication), 100Base-T
Storage:
  • 2 RAID controllers
Shared Storage System:
  • 218.4 GB (6 x 36.4-GB 1′′ Ultra3 10,000 drives)
Form Factor 10U
Fast forward to 2015… What does EVO-RAIL VSPEX BLUE provide?
Each EMC VSPEX BLUE appliance includes:
AKA WHAT’S UNDER THE HOOD??
FORM FACTOR
  • 4 nodes of integrated compute and storage, including flash (SSD) and HDD, in 2U
  • VMware EVO:RAIL software including VMware Virtual SAN (VSAN), Log Insight
COMPUTE
  • 12 cores @ 2.1GHz per node
RAM
  • 128 or 192 GB memory per node
NETWORKING
  • Choice of 10 Gigabit Ethernet network connectivity: SFP+ or RJ45
STORAGE
  • Drives: up to 16 (four per node)
  • Drives per node: 1 x 2.5” SSD, 3 x 2.5” HDD
  • Drive capacities
    • HDD: 1.2TB (max total 14.4TB)
    • SSD for caching: 400GB (max total 1.6TB)
  • Raw capacity: 14.4TB
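
The storage totals above follow directly from the per-node drive counts. Here is a minimal Python sketch of that arithmetic, using the numbers from the spec list. The variable names are my own, and sizes are kept in whole GB to avoid floating-point noise.

```python
NODES = 4
HDDS_PER_NODE = 3
SSDS_PER_NODE = 1
HDD_GB = 1200  # 1.2TB per HDD
SSD_GB = 400   # 400GB per caching SSD

total_drives = NODES * (HDDS_PER_NODE + SSDS_PER_NODE)
raw_capacity_gb = NODES * HDDS_PER_NODE * HDD_GB
cache_gb = NODES * SSDS_PER_NODE * SSD_GB

print(total_drives)     # 16
print(raw_capacity_gb)  # 14400, i.e. 14.4TB raw
print(cache_gb)         # 1600, i.e. 1.6TB of SSD cache
```

Note that this is raw capacity. Usable capacity depends on the VSAN storage policy you apply, which is not covered here.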
Additional Software Solutions – Exclusive to VSPEX-BLUE
  • EMC VSPEX BLUE Manager providing a system health dashboard and support portal
  • EMC CloudArray to expand storage capacity into the cloud (license for 1 TB cache and 10 TB cloud storage included)
  • VMware vSphere Data Protection Advanced (VDPA) for centralized backup and recovery
  • EMC RecoverPoint for Virtual Machines for continuous data protection of VMs. Includes licenses for 15 VMs.

The question still remains. Is your IT infrastructure an oil tanker? OR can you turn on a dime??

How does your IT respond to the ever-changing business demands?

Is EVO RAIL for everyone? There are a lot of use cases that EVO RAIL VSPEX BLUE is perfect for. But no, it isn’t for everyone. What it does is usher in a new way of consuming IT. You no longer have to provision your datacenter piecemeal. You can commoditize that into a pre-validated solution that is supported by a single vendor.

FF9DB928-E9E8-4718-8B16-320FC5B28333

This graphic has a lot more details than this single blog post can explain! I will try to explain each section that helps to make VSPEX BLUE a different redefined EVO RAIL solution.

EMC VSPEX BLUE MANAGER

– VSPEX BLUE Manager users can conveniently access electronic services, such as the EMC knowledge base articles, access to the VSPEX Community for online and real-time information and EMC VSPEX BLUE best practices.

EMC VSPEX BLUE WITH ESRS

– ESRS is a two-way, secure remote connection between your EMC environment and EMC Customer Service that enables remote monitoring, diagnosis, and repair – assuring availability and optimization of your EMC products

EMC VSPEX BLUE SUPPORT

EMC VSPEX BLUE WITH RECOVERPOINT FOR VMs

– Protects at VM-LEVEL GRANULARITY, OPERATIONAL AND DR AT ANY POINT IN TIME

EMC VSPEX Blue with VDPA and DATA DOMAIN

– Built in Deduplicated Backups, powered by EMC Avamar

EMC VSPEX BLUE & CLOUD ARRAY

– Block and File up to 10 TB FREE. “EMC CloudArray software provided scalable cloud based storage with your choice of many leading cloud providers enabling limitless Network Attached Storage, offsite backup and disaster recovery and the ability to support both Block and File simply”

VSPEX BLUE MARKET

– Built into the VSPEX MANAGER dashboard. This unique feature enables customers to browse complementary EMC and 3rd party products that easily extend the capabilities of the appliance.

 ===
Again, there are a lot of takeaways for the VSPEX BLUE – EVO RAIL solution. Contact me if you want more information.
Design, Troubleshooting, Virtualization, vmware

vSphere Web Client cool feature! Topology maps

Anyone who works with or administers VMware vSphere needs a top-down view. With a topology map you can review uplink settings and the uplinks per host, see how each distributed port group relates to the VMs defined on it, and see the VMkernel ports (vmk) and their IP addresses, all at a glance. It is very helpful for seeing what is online or offline.

The steps below show how to access a topology map of the distributed vSwitch and virtual machine networking.

There are also advanced features to check out, for example filtering and saving views!

Procedure

  1. Navigate to the vSphere distributed switch in the vSphere Web Client.
  2. On the Configure tab, expand Settings and select Topology.


ssh, Troubleshooting, vmware

part 2/2 Troubleshooting VSAN errors. VSAN misconfiguration??

The VSAN cluster isn’t 100%: there are problems with seeing all the storage.
Symptoms:
1. Cannot write to the VSAN datastore
2. The correct amount of capacity isn’t present
How to resolve with ESXCLI
Log into each host via ssh. You do remember the root password, right?? This example is a three-node VSAN cluster.
node 1::::::
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T01:11:33Z
Local Node UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
NODE 2:::::
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:12:02Z
   Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
   Local Node State: MASTER  << a different master for a different UUID
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4  <<< That is a different UUID!
   Sub-Cluster Backup UUID:
Sub-Cluster UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: 60df1955-b9f1-d685-13b0-a0369f58b8e4
~ # esxcli vsan cluster leave
~ # esxcli vsan cluster join -u 551374b5-03f9-7bd6-6257-a0369f58b8e8   <<<- join the correct UUID (cluster)
 
Validate on NODE 2
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:12:52Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: AGENT
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4  <<< three members
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: 17dd1955-0bdf-abac-aba6-a0369f58b8e4
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
Master Group Multicast Port: 12345
Multicast TTL: 5
Validate on NODE 3 and 1
 
NODE 3::::::
 
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T00:14:33Z
Local Node UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Local Node State: BACKUP
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: f16e1355-c174-11f6-2602-a0369f58b5a8
Agent Group Multicast Address: 224.2.3.4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group Multicast Port: 12345
   Multicast TTL: 5
NODE 1
FIXED
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-31T01:12:22Z
Local Node UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
   Sub-Cluster Membership Entry Revision: 2
   Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4 <<— all three members!
   Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8
~ # esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IPv4
Interface UUID: 3b5e1955-5eb6-2bbc-57bc-a0369f58b8e8
Agent Group Multicast Address: 224.2.3.4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3 <<< all the same Multicast address
   Master Group Multicast Port: 12345
Multicast TTL: 5
~ #
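
Comparing these outputs by eye across hosts is error-prone. Here is a minimal Python sketch that parses the `Key: value` lines of `esxcli vsan cluster get` and checks whether a node sees every expected member. The function names are my own. The two samples are trimmed from the Node 2 output above (before and after the leave/join), and the sketch assumes you have already captured the command output as text.

```python
def parse_cluster_get(text):
    """Turn the 'Key: value' lines of `esxcli vsan cluster get` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, e.g. "Cluster Information"
            info[key.strip()] = value.strip()
    return info


def member_uuids(info):
    """Split the comma-separated Sub-Cluster Member UUIDs field."""
    raw = info.get("Sub-Cluster Member UUIDs", "")
    return [u.strip() for u in raw.split(",") if u.strip()]


def sees_all_members(text, expected_hosts):
    """True when this node lists as many members as the cluster has hosts."""
    return len(member_uuids(parse_cluster_get(text))) == expected_hosts


# Node 2 before and after the leave/join, trimmed from the output above.
NODE2_BEFORE = """\
Local Node State: MASTER
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
"""
NODE2_AFTER = """\
Local Node State: AGENT
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8, 55197cee-f530-4966-5ea6-a0369f58b8e4
"""

print(sees_all_members(NODE2_BEFORE, 3))  # False
print(sees_all_members(NODE2_AFTER, 3))   # True
```

Run the same check against every host. A node that does not list all members is the one to look at first.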
Troubleshooting, Uncategorized, Virtualization, vmware, VSAN

VMWARE Virtual SAN networking

VSAN networking can be a bit tricky to troubleshoot. Before I go deeper into the topic here is a very important concept to remember about VSAN clusters.

Given any VSAN cluster remember the following:

** “Introduction to Virtual SAN Networking

Before getting into network in detail, it is important to understand the roles that nodes/hosts can play in Virtual SAN. There are three roles in Virtual SAN: master, agent and backup. There is one master that is responsible for getting CMMDS (clustering service) updates from all nodes, and distributing these updates to agents. Roles are applied during cluster discovery, when all nodes participating in Virtual SAN elect a master. A vSphere administrator has no control over roles.”

** from Cormac’s troubleshooting guide

That is a lot to digest, but if you break it down you can see some key principles about a VSAN cluster to remember.

The roles in VSAN:
A master
B agent
C backup.

There is one master.
If you see more than one master, there is something not quite right with your VSAN cluster.

The VSAN admin does not control which node will be the master.
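
The "one master" rule is simple enough to check automatically. Here is a minimal Python sketch that counts how many nodes report `Local Node State: MASTER`. The function names and the heavily trimmed sample outputs are hypothetical. In practice you would feed in the full text of `esxcli vsan cluster get` collected from each host.

```python
def local_node_state(output):
    """Return the value of the 'Local Node State' line, or '' if absent."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Local Node State":
            return value.strip()
    return ""


def count_masters(outputs):
    """Count how many nodes report themselves as MASTER."""
    return sum(1 for out in outputs if local_node_state(out) == "MASTER")


# Hypothetical, heavily trimmed outputs from a three-node cluster.
outputs = [
    "Local Node State: MASTER\nLocal Node Health State: HEALTHY",
    "Local Node State: MASTER\nLocal Node Health State: HEALTHY",
    "Local Node State: BACKUP\nLocal Node Health State: HEALTHY",
]

masters = count_masters(outputs)
print(masters)  # 2
if masters > 1:
    print("more than one master: check the VSAN network")
```

More than one master across the hosts is the signal to start looking at the VSAN networking.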

Example:
Log into each node of a three-node VSAN cluster. The normal pre-req for troubleshooting: make sure ssh is enabled.

Run the following command on each node:
~ # esxcli vsan cluster get

The Cluster Information output follows.
NODE 1

Cluster Information
Enabled: true

Current Local Time: 2015-03-30T22:38:38Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: a5ce1955-f5e5-5663-d338-a0369f58b8e4

Node 2
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-30T22:38:38Z
Local Node UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 55197cee-f530-4966-5ea6-a0369f58b8e4
Sub-Cluster Membership UUID: a5ce1955-f5e5-5663-d338-a0369f58b8e4

Node 3
~ # esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-30T22:56:46Z
Local Node UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Local Node State: BACKUP
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Backup UUID: 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster UUID: 551374b5-03f9-7bd6-6257-a0369f58b8e8
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member UUIDs: 551374b5-03f9-7bd6-6257-a0369f58b8e8, 54f9dc6f-8674-f412-364d-a0369f58b5a8
Sub-Cluster Membership UUID: d6da1955-e2f8-38eb-d7f0-a0369f58b8e8

See the image below for the error seen in the web client.

From the output above can you see the problem?

IMG_2285.PNG

IMG_2285-0.PNG
