Storage Hardware Acceleration

Hardware Acceleration lets ESXi offload storage and virtual machine operations to the storage hardware, consuming less CPU, memory and storage bandwidth on the host. It is supported on block storage devices (Fibre Channel and iSCSI SANs) and on NAS devices

You can view the status in the Hardware Acceleration column of the Devices view and the Datastores view in the vSphere Client

vSphere uses ESXi extensions referred to as Storage APIs – Array Integration (formerly known as VAAI). As of vSphere 5, these extensions are implemented as T10 SCSI-based commands; if the array does not support the T10 SCSI standard, ESXi reverts to using the VAAI plug-ins

With Hardware Acceleration, the host can get hardware assistance with:

–          Storage vMotion

–          Cloning/deploying VMs from a template

–          VMFS file locking with Atomic Test and Set (ATS), avoiding the use of SCSI reservations

–          Writes to Thin and Thick provisioned disks

–          Creating/Cloning thick disks on NAS devices

Hardware Acceleration on NAS devices is implemented through vendor-specific NAS plug-ins, and no claim rules are required. The plug-ins enable the host to:

–          Clone files; this is similar to VMFS block cloning, except that the NAS device clones entire files instead of file segments

–          Create Thick disks

–          Accurately report Space Utilization for Virtual Machines

Hardware Acceleration Considerations

The VMFS data mover does not leverage hardware offloads and instead uses software data movement when one of the following occurs:

–          The source and destination VMFS datastores have different block sizes (a quick way to check the block size is sketched after this list).

–          The source file type is RDM and the destination file type is non-RDM (regular file).

–          The source VMDK type is eagerzeroedthick and the destination VMDK type is thin.

–          The source or destination VMDK is in sparse or hosted format.

–          The source virtual machine has a snapshot.

–          The logical address and transfer length in the requested operation are not aligned to the minimum alignment required by the storage device. All datastores created with the vSphere Client are aligned automatically.

–          The VMFS datastore has multiple extents (spans multiple LUNs), and they reside on different arrays.
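For the block-size case above, one hedged way to check a datastore's VMFS block size from the ESXi shell is vmkfstools (the datastore name below is a placeholder):

vmkfstools -Ph /vmfs/volumes/datastore1

The output includes the VMFS version, capacity and file block size, which is enough to compare the source and destination datastores.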

Some quick Command lines:

Display Hardware Acceleration Plug-Ins and Filter

esxcli --server=server_name storage core plugin list --plugin-class=value (Filter or VAAI)

Verify Hardware Acceleration Support Status

esxcli --server=server_name storage core device list -d=device_ID

Verify Hardware Acceleration Support Details (the output lists the ATS, Clone, Zero and Delete primitive status for the device)

esxcli --server=server_name storage core device vaai status get -d=device_ID

List Hardware Acceleration Claim Rules

esxcli --server=server_name storage core claimrule list --claimrule-class=value (Filter or VAAI)

Add Hardware Acceleration Claim Rules; this is required only for storage devices that do not support T10 SCSI commands

esxcli --server=server_name storage core claimrule add --claimrule-class=Filter --plugin=VAAI_FILTER

esxcli --server=server_name storage core claimrule add --claimrule-class=VAAI

esxcli --server=server_name storage core claimrule load --claimrule-class=Filter

esxcli --server=server_name storage core claimrule load --claimrule-class=VAAI

esxcli --server=server_name storage core claimrule run --claimrule-class=Filter
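Putting the sequence together, here is an illustration only of adding and activating VAAI claim rules for a hypothetical array whose vendor string is ACME (the vendor value, plug-in name and server name are placeholders; use the exact values supplied by your storage vendor):

esxcli --server=esxi01 storage core claimrule add --claimrule-class=Filter --plugin=VAAI_FILTER --type=vendor --vendor=ACME --autoassign

esxcli --server=esxi01 storage core claimrule add --claimrule-class=VAAI --plugin=VMW_VAAIP_ACME --type=vendor --vendor=ACME --autoassign

esxcli --server=esxi01 storage core claimrule load --claimrule-class=Filter

esxcli --server=esxi01 storage core claimrule load --claimrule-class=VAAI

esxcli --server=esxi01 storage core claimrule run --claimrule-class=Filter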

Verify Hardware Acceleration Status for NAS; this requires the vendor NAS plug-in to be installed first (the plug-ins are distributed as VIBs)

esxcli --server=server_name storage nfs list
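As a hedged sketch of the install-and-verify flow for a vendor NAS plug-in (the bundle path is a placeholder; the actual package and install instructions come from your NAS vendor):

esxcli software vib install -d /vmfs/volumes/datastore1/VendorNASPlugin-offline-bundle.zip

esxcli storage nfs list

After a reboot, if the vendor requires one, the Hardware Acceleration column in the nfs list output should show Supported for the accelerated NFS datastores.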

**END**

P2V Considerations and Pre/Post-Migration Checklist

Candidate selection:

Use capacity tools (VMware Capacity Planner, for example) to qualify a physical server as a P2V candidate. Some points to note:

  • Small to medium-sized workloads with low SPECint ratings are the best fit for virtualization; the achievable SPECint ceiling keeps rising as newer hardware with faster CPUs arrives, and most applications are not CPU-intensive anyway
  • Check memory utilization at the 95th percentile; allocated memory can be huge (30+ GB), but keep an eye on active memory, which is always the key figure
  • Average disk reads/writes at the 95th percentile should not be excessively high; each organization sets its own thresholds based on available bandwidth and usage, and the same goes for network throughput
  • The storage protocol dictates your disk space requirements; SAN can be expensive, but thin provisioning (NFS, for example) is worth looking at when there are large disk space requirements with low actual disk usage
  • Set benchmarks for OS drives only if your applications and databases reside on a separate data volume
  • Applications that do not have low-latency requirements
  • The candidate can be anything (Exchange, SQL Server, Oracle, SAP or any other business-critical application) as long as it gets the required resources

Some showstoppers which might prevent candidate selection:

  • Applications with low-latency requirements
  • Very large workloads and large data sets
  • Non-standard hardware requirements: modems, fax cards, dongles, etc.

Pre Migration Checks:

  • Use Capacity tools to qualify a physical server as a P2V candidate, preferably Capacity Planner
  • Hostname
  • OS Type
  • Server Model
  • # of CPU sockets and Cores
  • Physical memory installed
  • Disk capacity requirements: any FC, iSCSI or SCSI LUNs and any NFS/CIFS mount points
  • CPU, Memory and Disk usage (Capacity tools can give an insight)
  • Decide on the vCPUs, virtual memory and disk space to be allocated
  • It's always good to have a local Administrator account on the physical box prior to the P2V; otherwise, log in at least once with your domain credentials so that they are cached on the local machine
  • Record the IP configuration, possibly a screen dump of ipconfig
  • iLO information
  • Check for disk defragmentation
  • Check whether the applications are hardcoded to any IP/Mac addresses
  • RDP access
  • Gather information about all the applications and determine which services need to be stopped during migration
  • Maintain a runbook that tracks the milestones: P2V started, in progress, successful, failed, etc.
  • Onboard resources from the OS and application teams and record their contact information
  • Ensure there are no hardware dongles and take note of hardware compatibility; otherwise the effort is wasted
  • Ensure the firewall rules are open for the destination network, if applicable
  • Ensure your ESX/ESXi host's management port group is connected to at least a 1 Gb port (a quick check is sketched below)
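A hedged quick check for the last point: on ESXi 5.x, esxcli network nic list shows the link speed of each physical NIC, while on 4.x hosts esxcfg-nics -l gives the same information:

esxcli --server=server_name network nic list

esxcfg-nics -l

Confirm the uplinks behind the management (and vMotion/P2V) port groups report 1000 Mbps or better before starting the conversion.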

Post Migration Checklist:

  • Power off the physical server before powering on the VM; otherwise, ensure the network adapter is disabled on the VM
  • Install VMware tools and ensure the Hardware Version is compliant with the ESX/ESXi version
  • Remove unwanted serial ports and uninstall all hardware-specific services and agents (HP Insight Manager, etc.)
  • Ensure there are no yellow exclamation marks in the Device Manager
  • Check and Adjust the Hardware Abstraction Layer, if required
  • Once the server is powered on, enable the NIC and assign the IP address
  • Start any application services that were disabled and check that all Automatic services have started
  • Ensure the physical server is no longer pingable on the network; otherwise you may run into duplicate entries in AD
  • Engage the OS team (check the Windows event logs) and the application team for stress testing
  • Test Network and Disk latency
  • Most importantly “User Experience”

Please feel free to add if I have missed anything critical

Oracle Design Best Practices on vSphere

Let me start with some general considerations while deploying Oracle on vSphere

  • Do not install any software components that aren’t necessary, such as Office Suites, Graphics, Sound & Video Programs and Instant Messaging services
  • It is also recommended to disable unnecessary foreground and background processes in the guest
  • For Linux: anacron, apmd, atd, autofs, cups, cupsconfig, gpm, isdn, iptables, kudzu, netfs and portmap (a sketch for disabling these at boot follows this list)
  • For Windows, alerter, automatic updates, clip book, error reporting, help and support, indexing, messenger, netmeeting, remote desktop, and system restore services
  • For Linux installs, the database administrator (DBA) should request that the system administrator compile a monolithic kernel, which will only load the necessary features
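As an illustration only (service names vary by distribution; cupsconfig and iptables are omitted below, the former because it is distribution-specific and the latter because it should only be disabled if your security policy allows it), on a RHEL 5/6-style guest most of the services above could be turned off with chkconfig and service:

# run as root; disables each service at boot and stops it immediately
for svc in anacron apmd atd autofs cups gpm isdn kudzu netfs portmap; do
    chkconfig "$svc" off
    service "$svc" stop
done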

CPU:

  • Oracle databases are not usually heavy CPU consumers and therefore are not characterized as CPU-bound applications
  • As always, start with a smaller number of vCPUs and increase the count depending on the workload
  • Enable Hyper-Threading for Intel Core i7 (Nehalem-class and later) processors
  • Use the esxtop %RUN, %RDY and %CSTP metrics to gauge CPU performance

Memory:

  • Set memory reservations equal to the size of the Oracle SGA, at least for production systems
  • It is acceptable to introduce more aggressive over-commitment in non-production environments such as development, test and QA
  • Set the virtual machine's CPU/MMU Virtualization option to Automatic
  • Use Large Memory pages, enabled by default on ESX 3.5 and later

Network:

  • Oracle is not heavy when it comes to network utilization
  • Use separate virtual switches, with each switch connected to its own physical network adapter to avoid contention between the ESX service console (applicable for 4.1 and earlier), the VMkernel, and virtual machines (especially virtual machines running heavy networking workloads).
  • To establish a network connection between two virtual machines that reside on the same ESX/ESXi host, connect both virtual machines to the same virtual switch. If the virtual machines are connected to different virtual switches, traffic will go over the physical wire and incur unnecessary CPU and network overhead
  • Use the VMXNET family network adapter (VMXNET3 where the guest OS supports it) for optimal performance
  • For IP-based storage, enable jumbo frames end to end (a vSwitch MTU sketch follows this list)
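For reference, a hedged sketch of setting a 9000-byte MTU on a standard vSwitch (vSwitch1 is a placeholder; the VMkernel ports, the physical switches and the storage array must also be configured for jumbo frames for it to be truly end to end):

# ESX/ESXi 4.x
esxcfg-vswitch -m 9000 vSwitch1

# ESXi 5.x and later
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000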

Storage:

  • Use a dedicated LUN for database VMs if the applications have a demanding I/O profile
  • Use VMFS for single-instance Oracle databases
  • Make sure the VMFS is aligned properly; VMFS partitions created through vCenter are aligned automatically
  • Use Oracle Automatic Storage Management, preferably ASM disk groups with equal disk types and geometries. At a minimum, create two ASM disk groups; one for log files, which are sequential in nature; and one for datafiles, which are random in nature
  • Create a primary controller for use with a disk that will host the system software (boot disk) and a separate PVSCSI controller for the disk that will store the Oracle data files

General guidelines for deploying SQL on VMware vSphere

CPU:

  • Start with a thorough understanding of your workload; VMware Capacity Planner can help determine it
  • If workloads cannot be determined, start with 1 vCPU; a single-vCPU VM can support high transaction throughput
  • Account for virtualization overheads (8%-15%, depending on the workload)
  • In Windows Server 2003 guests, when using single-processor virtual machines, configure with a UP HAL (hardware abstraction layer) or kernel. Multi-processor virtual machines must be configured with an SMP HAL/kernel. Windows Server 2008 will automatically select the HAL appropriate for the underlying hardware
  • Avoid CPU overcommitment; otherwise, reserve the full CPU capacity of the virtual machine
  • Install/update the latest VMware Tools
  • Ensure CPU Compatibility is met for vMotion

Memory:

  • Start with a thorough understanding of your workload; VMware Capacity Planner can help determine it
  • Increase the database buffer cache to reduce or avoid disk I/O and thus improve SQL Server performance
  • Avoid memory overcommitment; otherwise, apply reservations to avoid ballooning and swapping
  • If you set the SQL Server lock pages in memory parameter, be sure to set the virtual machine’s reservations to match the amount of memory you set in the virtual machine configuration
  • Using large pages helps improve performance

Storage:

There are no concrete recommendations on VMFS versus RDM for SQL Server deployments; both have their advantages and disadvantages. Fibre Channel may provide maximum I/O throughput, but iSCSI and NFS may offer a better price-performance ratio. VMware test results show that aligning VMFS partitions to 64 KB track boundaries results in reduced latency and increased throughput.

It is considered best practice to:

  • Use RDMs when required by third-party clustering software, array-based backup to disk, or third-party storage management software
  • Install the guest OS on VMFS; the SQL databases and logs can be placed on RDMs
  • Maintain a 1:1 mapping between the number of virtual machines and LUNs to avoid any disk I/O contention
  • Provision VMDKs as eagerzeroedthick (a vmkfstools sketch follows this list)
  • Align VMFS partitions to 64 KB track boundaries for reduced latency and increased throughput; VMFS partitions created from within vCenter are aligned by default
  • Set up a minimum of four paths from each ESX/ESXi host to the storage array, which means each host requires at least two HBA ports
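A hedged example of creating an eagerzeroedthick data disk from the ESXi shell with vmkfstools (the size, datastore and file names are placeholders; the same result can be achieved from the vSphere Client when creating the virtual disk):

vmkfstools -c 50G -d eagerzeroedthick /vmfs/volumes/datastore1/sqlvm01/sqlvm01_data.vmdk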

Network:

  • Use NIC teaming and segregate network traffic using VLANs
  • Use the VMXNET3 network adapter for optimal performance; VMXNET3 also supports jumbo frames and TSO for better network performance
  • Network communication between co-located virtual machines usually outperforms physical 1 Gbps network speeds, so, if possible, place the virtual machines that make up an application stack on the same ESXi host

Performance Counters of Interest to SQL Administrators

  • CPU – esxtop: %RDY, %USED; vCenter: Ready (milliseconds in a 20,000 ms window), Usage
  • Memory – esxtop: %ACTV, SWW/s, SWR/s; vCenter: Active, Swapin Rate, Swapout Rate
  • Storage – esxtop: ACTV, DAVG/cmd, KAVG/cmd; vCenter: Commands, deviceWriteLatency & deviceReadLatency, kernelWriteLatency & kernelReadLatency
  • Network – esxtop: MbRX/s, MbTX/s; vCenter: packetsRx, packetsTx
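To collect these esxtop counters over time for offline analysis, a hedged example of esxtop batch mode (the interval, sample count and output path are placeholders; the resulting CSV can be opened in perfmon or Excel):

# 60 samples at 5-second intervals, i.e. a 5-minute capture
esxtop -b -d 5 -n 60 > /vmfs/volumes/datastore1/esxtop-capture.csv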

*The above content assumes a VMware environment running vSphere 4.x or later

What makes an ESXi?

Just what exactly makes up ESXi?

  • There is no Service Console; ESXi consists of the VMkernel and the VMM
  • The traditional command-line interface with access to management, troubleshooting and configuration tools is gone
  • Third-party agents, backups and customized settings have to be applied differently

ESXi has three components

VMkernel: A 64-bit, POSIX-style microkernel OS designed by VMware, not as a general-purpose OS but one specifically tuned to operate as a hypervisor

VMkernel Extensions: Special kernel modules and device drivers that help the VMkernel interact with the hardware

Worlds:

System worlds: Processes like idle and helper run as system worlds

VMM worlds: Let the guest OS see its own virtualized x86 hardware; each VM runs in its own scheduled VMM world

User worlds: They can make system calls to the VMkernel to interact with VMs or the system itself

ESXi Agents:

DCUI: The Direct Console User Interface, the yellow console screen that lets you set basic configuration, permit access and restart management agents

CIM Broker: Common Information Model, provides agentless access to hardware monitoring via an externally accessible API

TSM: Tech Support Mode, used to run some of the command-line tools that were previously available in the Service Console

Let's now talk about the ESXi flavors, i.e. Installable and Embedded

ESXi Installable:

  • ESXi can boot from CD, PXE, local USB storage, local disk, or an FC or iSCSI SAN (private LUN). The image files can reside on CD, USB, FTP, HTTP, HTTPS or an NFS export, but booting from NFS is not supported
  • Installation can be either interactive or scripted with a kickstart file, which can be stored on CD, USB, FTP, HTTP, HTTPS or an NFS export
  • The system image can be deployed to a local hard disk or USB device; since 4.1 it can also reside on a SAN LUN (FC, FCoE or iSCSI). Booting from an iSCSI LUN requires a NIC that supports iBFT (iSCSI Boot Firmware Table)
  • The scratch partition is a 4 GB vFAT partition created by default if a local disk is found on first boot. It captures running-state files such as logs and core dumps

ESXi Embedded: A version of ESXi that is preinstalled as firmware in the factory or burned to a USB flash drive installed in an internal USB socket on the main system board.

What different tools are available to manage ESXi?

vSphere Client: Connect directly to the hosts

vCenter: Add the hosts to vCenter and take advantage of DRS, Host Profiles, Storage DRS, etc.

vCLI: A Perl-based set of scripts that mimics most of the commands available at the ESX console; the esxcfg- prefix has been renamed to vicfg-

vMA: The vSphere Management Assistant, a small "just enough OS" prepackaged Linux virtual appliance with the vCLI preinstalled. It can also be used as a syslog server

PowerCLI: Run PowerShell scripts against vCenter inventory objects such as hosts, VMs, storage and networks

vSphere Update Manager: Patching/Upgrading ESXi hosts

DCUI: Yellow interface that lets you set basic configuration, permit access and restart management agents

TSM: Tech Support Mode, used to run some of the command-line tools that were previously available in the Service Console. It is based on a small executable called BusyBox (www.busybox.net)

Host Profiles: Used to apply customized settings to many hosts at a time; also helps check the compliance of a host or cluster

Local Authentication: Can have local users with root privileges

Lockdown Mode: Prevents users from accessing the ESXi host directly; only root can log in through the DCUI, and the vpxuser account can access the host through vCenter. It also affects CIM access to hardware information, which then requires a ticket from vCenter so that vpxuser can fetch the data. Do not enable lockdown mode from the DCUI, as that also restricts access for local users; enable it from vCenter instead

Logging: Many of the logs are combined into three files:

  • VMkernel: /var/log/messages (contains the hostd log as well)
  • Management daemon (hostd): /var/log/vmware/hostd.log
  • vCenter agent (vpxa): /var/log/vmware/vpxa.log
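As a hedged example (ESXi 5.x esxcli syntax; the syslog host is a placeholder, and a vMA appliance is one common target as noted above), these logs can also be redirected to a remote syslog server:

esxcli system syslog config set --loghost='udp://vma01.example.com:514'

esxcli system syslog reload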

-End

vSphere Design, Get your basics right the first time!!!!!

How to start designing a vSphere environment, what is involved, whom to involve?

What is a Design?

“A streamlined process that helps the various groups in the organization determine how to assemble and configure a virtual infrastructure that is robust and flexible. A design should contain all the important information needed to meet the functional requirements of the organization.”

The functional requirements unify three different facets of the design:

Technical – What to deploy?
Operational – How to deploy and Support?
Organizational – Who will deploy and support?

Why VMware, or any other virtualization product for that matter?

There should be a strong reason/objective for deploying virtualization in the organization; some examples are:

  • Datacenter consolidation, which helps save datacenter space, power and cooling costs
  • New application rollout (Exchange 2010, for example)
  • Disaster recovery/business continuity: deploy a new DR/BC solution using VMware vSphere and SRM
  • Virtual Desktop Infrastructure

Facets of the design:

Technical:

  • What type of servers: blades or rack-mounted?
  • What type of physical CPUs in the servers?
  • What type and quantity of storage?
  • What kind of networking?

Organizational:

  • Who will manage the whole environment?
  • Who will provision what? Storage, Network etc?
  • Who will support the VMs and their backups?
  • Who will be responsible for Security Policies?

Operational:

  • How will the Hosts be managed?
  • How will the VM’s be provisioned?
  • How will I fail over to the DR site?
  • How will compliance be verified?

How to go about designing any infrastructure?

  • Review the infrastructure documents provided by the client; they may not capture all of the functional requirements, but they will cover most of them
  • Interview the IT teams and IT management teams (everyone and anyone, as required) to understand the environment better
  • Identify the top five user-level issues that can be resolved using the VMware platform

Assemble the bits and pieces

  • It's always acceptable to remove functional requirements; if at any point they don't serve a purpose, what's the point in keeping them?
  • Set standards and best practices, but remember that not every best practice recommended by a vendor will suit your environment; take a deeper look at each one and ensure it meets your functional requirements
  • Document every bit of the design, so that the implementation teams find it easy to follow

All in all, “get your basics right” before you jump into any technical jargon 🙂

*Most of this information is taken from Scott Lowe’s design book, which I think is the best information available so far 🙂

General Best Practices for deploying Exchange 2010 on VMware vSphere

Best practices are not hard-and-fast rules; rather, they are indicators that help you deploy the best possible solution 🙂

CPU:

  • Start with a smaller number of vCPUs and increase on demand
  • 2 vCPUs minimum for the Mailbox (ideally 6 vCPUs), Unified Messaging and Client Access roles; the maximum is 12 cores (vSphere 4 has a limit of 8 vCPUs per VM)
  • Ensure the total number of vCPUs is equal to or less than the total number of cores on the ESX/ESXi host
  • Performance Counters of Interest to Exchange Administrators are CPU %RDY and %USED (a resxtop sketch follows this list)
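These counters can be watched live with resxtop from the vMA or a vCLI installation; a hedged sketch (the host name and user are placeholders, and resxtop prompts for the password):

# press 'c' for the CPU screen and watch %RDY and %USED for the Exchange VMs
resxtop --server esxi01.example.com --username root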

Memory:

  • 4GB minimum, 10 GB minimum for multiple roles
  • Size it as per workload, if workloads cannot be determined, use the MS Exchange Design best practices
  • No overcommitment, period
  • Memory reservations are preferred but not strictly required; be aware that they may limit vMotion
  • Do not disable the balloon driver (installed with VMware tools)
  • Performance Counters of Interest to Exchange Administrators are Memory %ACTV, SWW/s & SWR/s

Storage:

  •  Use RDMs if you need in-guest clustering (note: Storage vMotion is not available for them)
  •  Install the guest OS on a VMFS datastore
  •  Install the log files and Exchange databases on RDMs
  •  Maintain a 1:1 mapping between the number of virtual machines and LUNs to avoid any disk I/O contention
  •  Microsoft does not currently support NFS for the Mailbox Server role (clustered or standalone)
  • Performance Counters of Interest to Exchange Administrators are ACTV, DAVG/cmd & KAVG/cmd

Network:

  • Allocate separate network adapters/networks for vMotion, VMware FT logging traffic and management traffic
  • Use the VMXNET3 network adapter (available once VMware Tools is installed)
  • Do not use the paravirtualized SCSI (PVSCSI) driver when a VM generates less than 2,000 IOPS; for more information see VMware KB 1017652. This issue has been fixed in 4.1 and later
  • Use VST (Virtual Switch Tagging) for VLAN tagging; it is the most commonly used mode
  • Enable Jumbo Frames
  • Performance Counters of Interest to Exchange Administrators are MbRX/s & MbTX/s

General Recommendations:

  • Use smaller VMs, e.g. a mailbox server with 2 vCPUs and 8 GB RAM (vMotion completes much faster)
  • Size the VM building blocks with license costs in mind; more VMs may mean more licenses
  • Design the Mailbox role not to exceed 70% utilization during peak periods
  • If deploying multiple roles on the server, design the Mailbox role not to exceed 35% utilization
  • The typical guideline is to use N+1 HA/DRS clusters
  • A typical ESX/ESXi host might have 16 cores (4 × 4 pCores), 128 GB of RAM, 2 HBAs and 4 Gigabit network adapters
  • Run on a DRS cluster for balanced workloads

HA & DRS Solutions

Local site recovery:

  • Either use VMware HA for VM-level failover, or use VMware HA + DAG (Database Availability Groups) for VM- and database-level failover
  • vMotion, HA and DRS are not supported for MSCS nodes

Remote Site Availability Options:

  • SRM with DAG
  • Third party Software Replication

Backup & Restore Options:

  • Traditional LAN based backups, Agents installed within the Guest OS
  • Use vDR (VMware Data Recovery) for the other Exchange Server roles, for example Client Access or Hub Transport
  • Array based backup Solutions

The links below are also worth visiting:

Microsoft Exchange Server Profile Analyzer

Exchange 2010 Mailbox Server Role Requirements Calculator to calculate processor, memory, and storage requirements

Mailbox Server Storage Design

VMware White papers on Exchange

*The above content assumes a VMware environment running vSphere 4.x or later