Virtualization and ST25XX/6000/99XX Said Syed's Weblog

Sunday Oct 11, 2009

Hello All,


Here is my presentation for my session at Oracle Open World 2009 San Francisco, CA.


Feedback welcome. I will have the recording uploaded soon.


PRESENTATION SLIDES:


Oracle11g in Virtualized Environments - A Storage Perspective by Said A. Syed


PRESENTATION SLIDE SHOW WITH VOICE OVER:



Enjoy-

Wednesday Sep 09, 2009

This Oracle site has a good explanation of how Oracle VM storage multipathing works.


http://wiki.oracle.com/page/Oracle+VM+Server+Configuration-+multipathed+SAN+storage

Thursday Jul 02, 2009

Building on the Bare Minimum


    When the need arises to upgrade from a bare minimum storage configuration, or to start off with a more scalable storage configuration for VMware, Sun Storage arrays include all of the configuration combinations and choices needed. Virtualized data centers rely on on-demand, highly scalable storage infrastructure in order to perform optimally. The need for more storage can increase exponentially over time, and the storage infrastructure, including storage arrays and the SAN, needs to be highly flexible in terms of scalability. The table below illustrates the scalability of Sun Storage arrays:




For latest and detailed Sun Storage architectural information, please refer to www.sun.com/storage.


Given the extensive possibilities on scalability for multiple architectural scenarios, Sun Storage arrays provide a very robust set of products and capabilities specifically for virtualization technologies such as VMware. In order to achieve optimal storage availability, performance and other architectural considerations, simple additions to the bare minimum designs may be sufficient.


Segregating Different Data Types


For an optimal VMware infrastructure, segregating different types of data is essential. VMware uses storage for several kinds of data, such as:

- Virtual Machine home directories, which include the VM operating system, virtual memory and other essential objects
    - Critical VM home directories should be segregated from non-critical VMs
- Software ISO images for easy sharing and access of OS and application install files
- User and application data, which may include critical and non-critical data such as user home directories and employee payroll


The need for data type segregation arises from the unique data access requirements of each type. VM home directories contain the operating system and virtual memory of the VM, so VM performance could suffer if they are placed alongside random I/O data such as a high-performance Oracle OLTP database. Here are recommendations and strategies to address data segregation:


1) Segregate Virtual Machine Home Directories based on the type of application and I/O. For example:
    - Test VMs and Production VMs
    - Database VMs which have higher memory requirements and may have higher swap file usage should not be on the same array volumes

2) Random I/O and Sequential I/O data should reside in different RAID Groups or Virtual Disks.
3) Use Luns or Volumes from different RAID Groups or Virtual Disks for different data types.
    - Luns for Virtual Machine Home Directories should not share the same RAID Group or Virtual Disk as an Oracle OLTP Database Lun
    - Non-Critical data such as ISOs and Non-Critical VM home directories may share the same RAID Group.

4) Use Low-Cost Modular Sun Storage arrays such as the Sun Storage 2510 for ISOs and non-critical VMs.
    - VMs can boot from iSCSI targets on the ESX Server
    - Sun Storage 2510 can be shared across multiple ESX servers in order to take advantage of all VMware features

5) Mid-range modular for mid-range performance
    - Based on user requirements
6) High performance I/O intensive data such as Oracle OLTP and Data Warehousing should reside on the high-end 6580/6780 and 9900 series arrays
7) Use of Storage Domains on Sun Storage 2500 and 6000 series products is highly recommended in order to isolate ESX servers from other servers. Storage Domains allow the creation of virtual partitions on the storage array and enable lun mapping per host initiator, host or host group. The following table shows Storage Domain feature scalability of the arrays. For more details on Storage Domains, please refer to the Sun Storage Common Array Manager Software Installation Guide.



8) SAN switch zoning should also be used to segregate traffic for each initiator or HBA, along with lun mapping and storage domains. Zoning is primarily used to isolate initiator-to-target traffic and to prevent SAN events from propagating across zones, the most important being Registered State Change Notification (RSCN) events. An RSCN is a required notification sent to all devices in a particular SAN, switch, zone or set of zones every time a change occurs in the SAN, causing I/O traffic to halt momentarily. The change could be a host reboot, a cable plug or unplug, a hardware error, a new storage device or a similar event. This I/O halt, although brief, has been known to cause I/O exchange failures, path failures and, in some very extreme situations, data corruption. See Figure 2 below. With zoning implemented, the effects of an RSCN are limited to the specific zone which experienced the change event. Best practice for SAN zoning is to use one initiator (Host Bus Adapter) per zone and, if possible, one target (array port) per zone. Zoning also helps isolate non-ESX server traffic from ESX server traffic.



In the figure above, there are three zones illustrated. Zone A and Zone C each have one initiator and one target, while Zone B has two initiators and two targets. Both switches in the SAN are connected via an Inter-Switch Link. If Host X reboots and its HBA in Zone B logs out of the SAN, an RSCN will be sent to Host Y’s initiator in Zone B, causing all I/O going to that initiator to halt momentarily and recover within seconds. Another RSCN will be sent to Host Y’s initiator in Zone B when Host X’s HBA logs back in to the SAN, causing another momentary halt in I/O. Initiators in Zone A and Zone C are protected from these events because there are no other initiators in these zones. Most recent SAN switches provide RSCN suppression methods; however, suppressing RSCNs is not recommended, since RSCNs are the primary way for initiators to determine that an event has occurred and to act on it, such as loss of access to targets. It is important to follow established SAN best practices such as single initiator zones in order to avoid the situations described here and others not listed.
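To make the zoning argument concrete, here is a minimal Python sketch of which initiators would be disturbed when a device logs in or out. The zone and member names are hypothetical stand-ins for the three zones described above, and the model is deliberately simplified: it only tracks initiators that share a zone with the changed device and is not how any switch firmware is actually implemented.

```python
# Hypothetical zone membership modeled on the three-zone description above.
# Member names are illustrative only, not taken from a real fabric.
ZONES = {
    "zone_a": {"host_w_hba", "array_port_1"},
    "zone_b": {"host_x_hba", "host_y_hba", "array_port_2", "array_port_3"},
    "zone_c": {"host_z_hba", "array_port_4"},
}
INITIATORS = {"host_w_hba", "host_x_hba", "host_y_hba", "host_z_hba"}


def initiators_notified(changed_device, zones=ZONES, initiators=INITIATORS):
    """Return the other initiators that share a zone with the changed device."""
    notified = set()
    for members in zones.values():
        if changed_device in members:
            notified |= (members & initiators) - {changed_device}
    return notified


def multi_initiator_zones(zones=ZONES, initiators=INITIATORS):
    """List zones that break the single-initiator-per-zone best practice."""
    return sorted(name for name, members in zones.items()
                  if len(members & initiators) > 1)


print(initiators_notified("host_x_hba"))  # Host Y is affected by Host X's reboot
print(initiators_notified("host_w_hba"))  # nobody else shares Zone A
print(multi_initiator_zones())            # zones worth splitting up
```

Running a check like `multi_initiator_zones()` against an exported zone configuration is one simple way to spot zones that should be split.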



Enhancing Availability


    VMware simply requires shared storage to enable all advanced features addressing availability at the Virtual Machine level, as long as more than one ESX server shares the same volume in a storage array. There is, however, a need to address data availability at the storage array level as well. Sun Storage arrays have built-in redundancy for data access, cache mirroring across controllers and power, and the likelihood of a complete failure is very small. Even so, it is recommended, where possible, to spread the data across multiple disk trays via mirroring and/or striping, on the low cost arrays specifically. Sun Storage 2500 series, 6000 series and StorageTek 9900V series offer modular add-on capabilities to attach disk trays, chassis or frames to an existing set of array controllers with no downtime in most cases. Striping data across multiple trays using RAID 5, 6 or 1+0 enhances data availability by avoiding loss of access due to a complete disk tray failure. Consider three virtual-disks, a 4 disk RAID-5, a 5 disk RAID-6 and a 4 disk RAID-1+0, striped/mirrored across a Sun Storage 2540 with four trays as shown in the figure below:






In this example, all virtual-disks will continue to function properly even with the loss of one entire disk tray. The RAID-6 virtual-disk will continue to function even with the loss of two complete disk trays, as long as the two trays do not include the bottom one. Similarly, the RAID-1+0 virtual-disk can sustain certain combinations of entire disk tray failures. This is just an example of how striping or mirroring across disk trays enhances availability and eliminates array single points of failure. For VMware environments where multiple servers are consolidated onto just a few physical machines and the data, including operating systems and virtual machine home directories, resides on shared storage, it is strongly recommended that RAID configurations similar to the example above be used to stripe and/or mirror data across multiple disk trays to avoid storage array single points of failure.
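The tray-failure reasoning can be checked with a small Python sketch. The disk-to-tray layout below is an assumption made for illustration (one disk per tray, with the fifth RAID-6 disk sharing the bottom tray, and RAID-1+0 mirror pairs split across trays); the actual layout in the figure may differ.

```python
def survives(raid_level, disk_trays, failed_trays, mirror_pairs=None):
    """Return True if the virtual disk stays accessible after the given trays fail.

    disk_trays   -- tray index for each disk in the virtual disk
    failed_trays -- set of tray indexes that have failed
    mirror_pairs -- RAID-1+0 only: pairs of disk positions that mirror each other
    """
    lost = {i for i, tray in enumerate(disk_trays) if tray in failed_trays}
    if raid_level == "raid5":
        return len(lost) <= 1      # tolerates one lost disk
    if raid_level == "raid6":
        return len(lost) <= 2      # tolerates two lost disks
    if raid_level == "raid10":
        # Survives as long as no mirror pair loses both of its disks.
        return all(not (a in lost and b in lost) for a, b in mirror_pairs)
    raise ValueError(f"unsupported RAID level: {raid_level}")


# Assumed layout: trays numbered 0 (top) to 3 (bottom).
RAID5 = [0, 1, 2, 3]
RAID6 = [0, 1, 2, 3, 3]          # fifth disk shares the bottom tray
RAID10 = [0, 1, 2, 3]
RAID10_PAIRS = [(0, 1), (2, 3)]  # disk positions that mirror each other

print(survives("raid5", RAID5, {2}))                      # one tray lost
print(survives("raid6", RAID6, {0, 1}))                   # two trays, not the bottom one
print(survives("raid6", RAID6, {0, 3}))                   # two trays including the bottom one
print(survives("raid10", RAID10, {0, 2}, RAID10_PAIRS))   # one disk from each mirror pair
```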



Enhancing Performance


    Storage I/O performance for VMware can be enhanced using different strategies. VMware VI3 does not officially support load balancing for storage multipathing, so only one path for each Lun is active on the ESX server at any given time regardless of the total number of paths available. Some important strategies include:


1) Lun mapping should be balanced across all available Target Host Ports.
     - Spreading luns across all available target host ports is a best practice. This avoids overloading one target host port while other available ports are idle.
2) Avoid mapping multiple heavily used Luns to the same Target Host Port unless the luns are mapped to a group of ports to further enhance availability.
     - Preferred Owning Controller on the array and/or Preferred Path on VMware should be configured for each heavily used Lun mapped to multiple target ports, so that each heavily used lun uses a different preferred target port. It is recommended to use VMware to specify Preferred Path for Active/Active arrays such as the Sun StorageTek 9900 series, and to set Preferred Owning Controller at the array for Active/Passive or Asymmetric arrays such as the Sun Storage 2500 and 6000 series arrays.



- In Figure-4 above, four Fibre Channel Host Bus Adapter ports on a Sun Fire X4600-M2 are mapped to a Sun Storage 6780 array with one CSM200 disk tray through a Fibre Channel SAN. All four volumes are mapped to four Target Host Ports, while one controller is the Preferred Owning Controller for each of the volumes as indicated above. Sun Common Array Manager can be used to configure Preferred Owning Controller for Sun Storage 2500 and 6000 series asymmetric arrays as shown in Figure-5 below.



- VMware Preferred Path selection can be used for Active/Active storage arrays where all controllers on the subsystem present all paths to the ESX server as active paths for I/O and the server uses the specified path to be the Preferred Path when available. This can be done using VMware vCenter client as shown in Figure-6 below.
- It is not recommended to use both VMware and the array to specify the preferred path. This can lead to Path Thrashing in case of a path failure or similar event. VMware will try to use the next available path, while the array may want to force the Lun to use a different target port on the preferred controller if the lun was mapped to more than one target port on that controller. It is also not recommended to apply the VMware Fixed Path Policy to an Asymmetric or Active/Passive controller array, as it can lead to performance and path thrashing problems as well. For further details on Path Thrashing, refer to VMware's SAN Configuration Guide.




3) Striping and mirroring across disk trays also enhances performance since I/O is separated across different back-end loops instead of the same loop being used to access disks on the same disk tray
4) RDM luns in VMware may offer slight performance improvement over VMFS volumes. If performance is a priority over some VMware features such as Storage VMotion, RDM can be the disk choice in order to slightly improve storage I/O performance. RDM volumes may be required for clustering such as Microsoft Cluster Service (MSCS) or data services on the storage arrays. More information on VMware RDM luns can be found in VMware's SAN Configuration Guide.
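As a planning aid for strategies 1 and 2 above, here is a hedged Python sketch that spreads luns across target host ports, heaviest first. The lun names, port names and load figures are hypothetical, and this is only a paper exercise: it does not call Common Array Manager or vCenter, and the resulting plan would still be applied by hand as a Preferred Owning Controller or Preferred Path setting.

```python
import heapq


def balance_luns(lun_loads, target_ports):
    """Assign each lun to the currently least-loaded port, heaviest luns first."""
    heap = [(0, port) for port in target_ports]
    heapq.heapify(heap)
    assignment = {}
    # Sort by descending load, then by name so the result is deterministic.
    for lun, load in sorted(lun_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        port_load, port = heapq.heappop(heap)
        assignment[lun] = port
        heapq.heappush(heap, (port_load + load, port))
    return assignment


# Hypothetical relative load per lun (for example, expected IOPS).
loads = {"oltp_data": 900, "oltp_logs": 700, "vm_home": 300, "iso": 50}

plan = balance_luns(loads, ["A1", "A2", "B1", "B2"])
for lun, port in sorted(plan.items()):
    print(f"{lun} -> preferred target port {port}")
```

With as many ports as luns, each lun lands on its own port; with fewer ports, the two heaviest luns still end up on different ports, which is the point of strategy 2.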


Tiered Storage Architecture


    Sun Storage Tiered Storage Architecture offers a multitude of possibilities, specifically for virtualization technologies such as VMware. Tiered Storage can simplify storage infrastructure management by consolidating many storage sub-systems behind one, which allows management and provisioning of the entire storage infrastructure from one storage array. Some uses of Tiered Storage are:
1) Consolidating Storage Sprawl behind one Storage Sub-System
    - Allows single point storage infrastructure management and administration
    - Allows single point lun storage provisioning

2) Protecting investment by attaching older storage sub-systems or disk trays behind newer and improved sub-system controllers. Luns resident on slower, older sub-systems or disk trays can be used for:
    - Non-critical data such as ISO images, Test/Development Virtual machine home directories
    - Data Archiving and backup using in-system replication of production Luns to tiered sub-system and then provisioning the replicated luns to backup servers
    - Data not requiring high performance I/O
    - Data migration
    - Virtual Machines on older arrays may not have to be migrated to newer sub-systems. Tiered Storage can be used to provision the same luns to the ESX servers. A simple rescan of the luns from the ESX servers is sufficient to recognize the volumes and virtual machines on the luns. See Figure-7 below.



Sun Storage 6000 Series Tiered Storage Capabilities





    Sun Storage 6000 series arrays offer unique Tiered Storage Architecture capabilities. A Sun Storage 6540 can be converted into the next generation Sun Storage 6780/6580 simply by replacing the 6540 array controller tray with a 6580/6780 controller tray. An existing 6140 can be moved under 6580/6780 control by converting the 6140 controller tray into a Sun Storage CSM200 disk tray and then attaching it to the 6780/6580 controller tray along with any existing CSM200 disk trays in the 6580/6780. This procedure allows migration of existing data and storage arrays, with the data kept intact, under newer next generation 6580/6780 arrays which offer enhanced performance, scalability and reliability. See Tables 1 and 3 above. Users are able to re-use and protect capital investment as a result of this capability.


NOTE: This procedure requires Sun Support Services on-site assistance. Attempting it without that assistance may result in complete data loss.


See Figure-8 below for an illustration of how a Sun Storage 6140 and 6540 are moved under next-generation 6780 control.



Sun StorageTek 9900V Series Tiered Storage Capabilities





With Sun StorageTek 9900V series arrays' External Storage support, a complete Tiered Storage architecture can be created. Sun StorageTek 9900V offers enterprise class Tier-1 storage scalability, availability and performance (see Tables 1 and 3). Tier-2 and Tier-3 storage arrays can be virtualized behind Sun StorageTek 9900V arrays as external arrays. Luns or volumes from external arrays are provisioned through the front-end Sun StorageTek 9900V to the SAN for access by servers. A wide range of Sun Storage and non-Sun storage arrays are supported as External Storage for the Sun StorageTek 9900V. Additional capacity licenses may be needed to enable External Storage support and to take advantage of premium features such as Shadow Image (in-system replication), TrueCopy (remote replication), Dynamic Provisioning (thin provisioning) and others. Table-5 provides information on total external storage capacity currently supported with Sun StorageTek 9900V arrays.



Figure-9 is an illustration of a Sun StorageTek 9990V virtualizing an EMC Clariion and a Sun StorageTek 9970 array using the External Storage Feature for a Tiered Storage Architecture spreading older and non-critical data to older arrays while keeping critical Guest OS data, non-virtualized server data and high-performance data on the high performance newer array.



More details on Sun StorageTek 9900V External Storage Feature can be found on www.sun.com/storage


References

Sun Storage 6580 and 6780 Release Notes
- http://dlc.sun.com/pdf/820-5776-11/820-5776-11.pdf
Sun Storage 6580 and 6780 Hardware Installation Guide
- http://dlc.sun.com/pdf/820-5773-10/820-5773-10.pdf
Sun Storage 2500 Series Array Hardware Installation Guide
- http://dlc.sun.com/pdf/820-0015-12/820-0015-12.pdf
Sun Storage 6140 Array Hardware Installation Guide
- http://dlc.sun.com/pdf/819-7497-11/819-7497-11.pdf
Sun Storage Common Array Manager Software Installation Guide
- http://dlc.sun.com/pdf/820-5747-10/820-5747-10.pdf
Using RAID 6 for Increased Reliability Sun Blueprint
- http://wikis.sun.com/display/BluePrints/Using+RAID+6+for+Increased+Reliability+and+Performance
VMware Storage VMotion Best Practices for Sun Storage 2500 and 6000 Series Arrays
- http://www.sun.com/storage/white-papers/wmware-storage-VMotion.pdf
VMware VI3 and Virtual Infrastructure Basic System Administration Guide
- http://www.VMware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_admin_guide.pdf
VMware Storage/SAN Compatibility Guide
- http://www.VMware.com/resources/compatibility/pdf/vi_san_guide.pdf
VMware Remote Command-Line Interface Installation and Reference Guide
- http://www.VMware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_rcli.pdf
VMware vCenter Site Recovery Manager Compatibility Guidelines
- http://www.VMware.com/pdf/srm_storage_partners.pdf
VMware Fibre Channel SAN Configuration Guide
- http://www.VMware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_san_cfg.pdf






Building Blocks for Optimal Storage Design

    Storage architecture and design for VMware can become complex if not well planned and well thought out. To determine the optimal storage design for VMware, the following considerations need to be taken into account. All VMware features have a direct impact on how a storage infrastructure needs to be designed and architected. Cost, capacity and performance are also important points to consider while creating a storage layout for VMware. A “Cost Conscious”, peak performing, scalable design is possible as long as all aspects of the design are understood. Two aspects of an optimal storage design are the bare minimum design needed to create a functioning VMware infrastructure based on minimum configuration requirements, and how scalable that design is.

The Bare Minimum Storage Design


    The “Bare Minimum Storage Design” is the minimum configuration essential for a functioning storage array. Bare minimum storage configurations are very limited in terms of capacity, bandwidth and performance. If an application requires fully redundant, higher performance and higher capacity storage I/O, a bare minimum storage configuration may not be an ideal choice. Sun Storage bare minimum configurations will support all VMware features. The following are the absolute bare minimum Sun Storage configurations supported with VMware and VMware features:





If there is a need to use Sun Storage bare minimum configurations for VMware environments, the following best practices are recommended for optimal integration.

Disk and RAID layout


    With the limited number of disk drives available in the bare minimum Sun Storage array, choosing the right RAID type is essential. In order to take maximum advantage of a 5 disk configuration, all five disk drives should be of the same type and capacity. One drive needs to be designated as a spare in case of a drive failure. With four remaining drives, there are limited options for choosing a RAID group based on cost, capacity and performance. For example, RAID-5 will provide available capacity of 3 out of the 4 disks, with space equal to one disk used for parity stripes. RAID-1 will allow for usable space of up to two drives, with the remaining two drives used for mirroring, and with decreased performance. RAID-0 will offer the most available space with usable capacity of all four drives, but with no redundancy built in. Table-2 below provides a general comparison of different RAID types for cost, capacity, performance and availability.




In order to maximize capacity and achieve optimal $/GB value for a four disk virtual disk, a RAID 5 RAID group is the best option. RAID 5 provides the best balance of capacity, cost, performance and availability. RAID 6 requires a minimum of five disks on the 2500 and 6140 arrays, so it may not be an option in the “Bare Minimum configuration”. RAID 1 and 1+0 provide higher availability, but at very high cost and with a third less capacity than RAID 5. RAID 0 offers the best cost, capacity and performance, but no availability protection. For Sun StorageTek 9900 series, 3D + 1P RAID 5 would be the Parity Group of choice. For more information on Sun Storage hardware specifications go to www.sun.com/storage.
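The capacity trade-off between RAID types is simple arithmetic, sketched below in Python. The 1 TB disk size is only an example input, and the minimum disk counts are assumptions for the sketch, apart from the five-disk RAID 6 minimum mentioned above for the 2500 and 6140 arrays.

```python
# Minimum disk counts are assumptions for this sketch, except RAID 6,
# where a five-disk minimum is stated for the 2500 and 6140 arrays.
MIN_DISKS = {"raid0": 2, "raid1": 2, "raid10": 4, "raid5": 3, "raid6": 5}


def usable_capacity_tb(raid_level, disks, disk_size_tb):
    """Usable capacity of one RAID group built from identical disks."""
    if raid_level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {raid_level}")
    if disks < MIN_DISKS[raid_level]:
        raise ValueError(f"{raid_level} needs at least {MIN_DISKS[raid_level]} disks")
    if raid_level == "raid0":
        data_disks = disks        # striping only, no redundancy
    elif raid_level in ("raid1", "raid10"):
        data_disks = disks // 2   # half the disks hold mirror copies
    elif raid_level == "raid5":
        data_disks = disks - 1    # one disk's worth of parity
    else:                         # raid6
        data_disks = disks - 2    # two disks' worth of parity
    return data_disks * disk_size_tb


# Four data disks remain after setting one of five aside as a spare.
for level in ("raid0", "raid5", "raid10"):
    print(level, usable_capacity_tb(level, 4, 1.0), "TB usable")
```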


Designing for Maximum Capacity


    It is possible, with the limited configuration options available, to maximize capacity. SATA drives provide higher capacity at lower cost. Users most concerned with capacity can take advantage of Sun's high-capacity SATA disk drive options across the entire Sun Storage product lineup. Currently, SATA-II drives of up to 1 Terabyte are available, providing 3 Terabytes of usable space in a four disk RAID-5 configuration. It should be noted that SATA drives, although good for high capacity usage, are usually not ideal for higher performance I/O operations.


Consider a Sun Storage 2540 Fibre Channel Storage array with the bare minimum configuration of five 1 TB SATA-II drives. Considering one disk drive as a SPARE based on storage design best practices, Table-3 illustrates that RAID-5 configuration provides the maximum capacity for a RAID type using four disk drives offering the right mix of performance and data availability:






Designing for Maximum Performance


    Performance can be enhanced at multiple levels of Sun Storage arrays. However, with bare minimum configurations, two components typically have the highest impact: controller cache and disk drives. Depending on the type of I/O, one or the other or both may need to be optimized. For heavy sequential reads, a larger cache is recommended, with cache pre-fetch enabled for optimal performance. The cache pre-fetching feature places data blocks into cache ahead of an I/O request based on past read requests, hence enhancing read performance. The larger the cache, the more data can be pre-fetched into it at a given time. In addition to larger cache, if high performance storage I/O is a key requirement for a given configuration, Sun Storage arrays can be configured with high performance SAS or Fibre Channel drives. For capacity and other SAS and Fibre Channel drive specifications, refer to www.sun.com/storage.




Volume Layout for VMware


    For optimal VMware volume layout on a Sun Storage bare minimum configuration, it is recommended that only VMFS volumes be used. VMFS takes complete advantage of all VMware features, including VMware level snapshots and Storage VMotion, which would be an ideal tool for data migration if and when additional storage arrays are added to the bare minimum layout. The following recommendations apply if a bare minimum Sun Storage array is being configured for VMware:



  • Follow Virtual Machine sizing recommendations outlined in this document

  • Create a volume or a lun on the array equal to, or larger than, the sizing calculation for Virtual Machine home directories and use VMware to create VMFS filesystem on this volume or lun. This space will be used to create Virtual Machines and their home directories

  • If needed, create a volume or lun for ISO images which can be shared with ESX servers to install operating systems, applications and updates on Virtual Machines without the need for a CD-ROM drive on the servers or having to physically swap CDs.

  • Create subsequent volumes or luns for user/application data as necessary for each Virtual Machine, ideally two to three volumes or luns based on application/data type.


STAY TUNED FOR PART III....!!!!!!!!!!!!!!!





Introduction


This document discusses considerations for accurately sizing the storage infrastructure for VMware virtualized environments and takes a pragmatic approach as opposed to common generalities. VMware is a multifaceted product with options and add-ons which have a direct impact on the amount of raw storage space required for a successful implementation. Without proper planning, there is room for errors and a potential lack of scalability and support for some of these features. The purpose of the document is to provide architectural guidelines to implement a VMware configuration from the very basic storage array, such as the iSCSI Sun Storage 2510, to modular midrange and high-end Sun Storage arrays. Guidelines will be based on general VMware storage capabilities along with advanced features such as VMware VMotion, VMware HA, VMware DRS, VMware Fault Tolerance and disaster recovery. All of these features have an impact on how a storage array or a SAN, both IP based and FC based, needs to be implemented in order to achieve optimal availability, reliability and performance. Detailed best practices for such implementations will also be discussed. Application storage considerations are not included in this white paper.


General Storage Sizing Considerations


There are four main considerations to keep in mind for storage sizing:
1. Basic Space Requirement
2. Shared Storage
3. Storage VMotion
4. Disaster Recovery


Basic Space Requirement


VMFS Space Overhead


An additional amount of disk space is used by VMware to lay down the VMFS filesystem on any given lun. The overhead is not a set percentage of the space being formatted; it is determined by an algorithm based on the space required to save metadata. Currently, there isn't a specific calculation to determine the exact overhead for VMFS. As a general rule, about 500MB should be counted as space used for metadata when creating a VMFS volume.


Virtual Machine Home Directory


VMware has specific requirements to calculate basic Virtual Machine Home Directory space. A Virtual Machine Home Directory includes



  • Virtual Machine Operating System

  • Virtual Memory for the Virtual Machine; In the form of a flat file

  • Log files

  • Virtual Machine Config file (vmx)

  • Swap file (vswp)

  • Snapshots (If any)


To determine the basic VM home directory space requirement, the following questions need to be answered:



  • # of VM: Number of virtual machines being consolidated

  • Size of OS: OS + Swap + Patches

  • “Size of OS” x 2: Accounts for the snapshot space VMware creates when a VM is set to “Suspend” mode

  • Memory: Virtual Memory for each VM

  • Log Size: VM Status, error logs


Once all of the required information has been gathered, the following calculation can be used to determine space required:


(# of VM x OS Size x 2) + (# of VM x Memory) + (# of VM x 100MB for logs) + (10% free space) = Space  for VM Home Directory

Example:


(100 VMs x 10 GB x 2) + (100 VMs x 4 GB) + (100 VMs x 100 MB for logs) + (10% of the total) = 1.55 TB (approximately)

This calculation lays the foundation for a very basic VMware storage sizing. Additionally, for configurations which may include snapshots and virtual machine templates it is recommended to reserve 15% of space for snapshots, another 15% for possible Virtual Machine Templates and 10% to 15% of free space. Example continuing from above:



1.55 TB + (1.55 TB x 15% for snapshots) + (1.55 TB x 15% for Virtual Machine Templates) =  2.02TB of space needed approximately
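Here is a small Python sketch of the sizing calculation, using decimal units (1 GB = 1000 MB, 1 TB = 1000 GB). The function and parameter names are made up for illustration. One thing worth checking before reusing the numbers: with the example inputs, the formula as written (OS size counted twice) evaluates to roughly 2.65 TB, while the 1.55 TB quoted in the example matches the same inputs with the OS size counted once. The `os_factor` parameter is exposed so both can be reproduced.

```python
def vm_home_space_gb(vm_count, os_size_gb, memory_gb, log_mb=100,
                     os_factor=2, free_fraction=0.10):
    """Space for VM home directories in GB, following the formula above.

    os_factor=2 reserves OS-sized space a second time for suspend snapshots.
    """
    base_gb = (vm_count * os_size_gb * os_factor
               + vm_count * memory_gb
               + vm_count * log_mb / 1000)
    return base_gb * (1 + free_fraction)


def with_snapshots_and_templates(space_gb, snapshot_fraction=0.15,
                                 template_fraction=0.15):
    """Add the recommended reserves for snapshots and VM templates."""
    return space_gb * (1 + snapshot_fraction + template_fraction)


as_written = vm_home_space_gb(100, 10, 4)                  # OS size counted twice
os_counted_once = vm_home_space_gb(100, 10, 4, os_factor=1)
print(f"formula as written: {as_written / 1000:.2f} TB")
print(f"OS counted once:    {os_counted_once / 1000:.2f} TB")
print(f"1.55 TB plus reserves: {with_snapshots_and_templates(1550) / 1000:.3f} TB")
```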




Shared Storage


    Almost all VMware advanced features require shared storage, with the exception of Storage VMotion. Storage VMotion will allow for a live migration of storage between internal hard disks in the ESX Server, direct attach or SAN (IP or FC) storage arrays in many different combinations. Features such as VMware VMotion, Distributed Resource Scheduling, Fault Tolerance and High Availability require shared storage configuration to work. Following is an example of a basic configuration which would allow all of these VMware advanced features to function:



Shared storage enables VMware advanced features: two servers sharing a Sun Storage 2510 over the network

    In the example above, all Virtual Machines stored on the 2510 can be migrated using VMotion between both servers as long as the Virtual Machine Home Directories are saved on volumes mapped to both servers. DRS, Fault Tolerance and VMware HA can be configured on both servers for the given set of virtual machines as long as both servers have access to the storage where the virtual machines are stored. All virtual machines which require high availability should be part of a DRS and/or VMware HA cluster to ensure the virtual machine can withstand resource related issues and hardware failures. In a direct attach configuration, all virtual machines will be exposed to downtime because the server or the direct attach storage (a JBOD, for example) is a single point of failure. However, the need for high availability depends purely on what the specific Virtual Machines are going to be used for. A company payroll database running on Oracle 11g with an SAP front-end probably requires the high availability capabilities of VMware such as VMotion, VMware HA and DRS. A Virtual Machine created to test operating system and/or application patch upgrades probably does not.


Storage VMotion


    Storage VMotion requires additional space on the source Datastore (storage device or volume) equal to the Virtual Machine Home Directory, any existing snapshots and, if required, the space used by application and/or user data. Storage VMotion requires this additional space in order to create a snapshot of the existing data on the current storage device. Once the data is copied over to the new storage device, the snapshot is removed. However, if the required space is not available, the Storage VMotion process will fail.
    It is unlikely that all Virtual Machines on an ESX server will require Storage VMotion, and those which do will likely not use the capability at the same time, because CPU and memory resource usage for the VM doubles during the process. It is therefore recommended to run no more Storage VMotion migrations at any given time than absolutely needed, given the amount of CPU, memory, storage I/O paths and storage space available. In most cases, it can be very costly and unnecessary to allocate enough hardware resources to allow Storage VMotion on all VMs at the same time. One or two at a time is likely. Since Storage VMotion temporarily requires twice the amount of space in the same volume where the VM data (home directory and application data) is located, storage sizing needs to be determined on a per VM basis along with the determination of how and where the VMs reside.

Continuing with the example above, assuming the ESX server has enough CPU and memory resources to accommodate two Storage VMotion processes at one time, 2% of additional space is required:


2.02TB x 2% + 2.02TB =  2.06TB approximately

Refer to VMware Storage VMotion Best Practices for the 2500 and 6000 Series Arrays for further details.


Disaster Recovery


    Disaster recovery doubles the storage requirement, as the exact same amount of storage is needed at both the Protected and DR sites. The most important determination to make is exactly how many Virtual Machines at the Protected site need to be replicated to the DR site. Assuming the entire virtualized infrastructure needs to be replicated, the total storage space requirement needs to be doubled. Another question is whether Storage VMotion will be used at the DR site as well, as the answer will have a significant impact on the storage space requirement at the DR site. For example, assuming complete mirroring of the Protected site to the DR site via remote replication along with Storage VMotion, the amount of storage space required at the DR site will be equal to the amount of space used at the Protected site. Continuing with the example above, a total of approximately 2.06TB will be required at the DR site.
    The example is generalized. The actual calculation will be based on the number of Virtual Machines which require the use of Storage VMotion and protection via a disaster recovery implementation.
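
As a rough sketch of the same reasoning, the Python below computes the space needed at each site. The function and the `replicated_fraction` parameter are hypothetical, introduced here only to show how replicating a subset of the VMs changes the DR-site number; the 2.06TB figure comes from the example above.

```python
def dr_storage_plan(protected_site_tb: float, replicated_fraction: float = 1.0) -> dict:
    """Estimate storage at the Protected and DR sites.

    protected_site_tb   -- space used at the Protected site, in TB
    replicated_fraction -- share of that space replicated to the DR site
                           (1.0 means complete mirroring; an assumption of this sketch)
    """
    dr_site_tb = protected_site_tb * replicated_fraction
    return {
        "protected_tb": protected_site_tb,
        "dr_tb": dr_site_tb,
        "total_tb": protected_site_tb + dr_site_tb,
    }


# Complete mirroring of the 2.06TB example: the total requirement doubles.
plan = dr_storage_plan(2.06)
print(f"DR site: {plan['dr_tb']:.2f}TB, both sites: {plan['total_tb']:.2f}TB")
```

If only some Virtual Machines are protected, passing a smaller `replicated_fraction` gives the correspondingly smaller DR-site figure.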


PART-II NOW POSTED.....!!!




Sunday Mar 29, 2009

Hello from New York. That’s right. I am attending the Cloud Computing Conference and Expo in NYC this week. I am looking to have a one-on-one with Amazon's CTO Dr. Vogels, IBM's Cloud CTO Kloeckner and, of course, Sun's Cloud SVP Dave Douglas. I will hang out at Sun's Cloud Booth and have conversations with all of these and other Cloud experts regarding the future of enterprise storage with Cloud Computing. I will blog about the entire event for the next three days.......!

Day 1

Amazon CTO Dr. Vogels:

Main Points of the presentation:

- Infrastructure as a service – Already a Reality

- Why Amazon needed to develop these services for itself

He went on to discuss a few companies using Amazon. An example:

- Animoto is a startup which creates videos but owns no infrastructure. It runs 100% on Amazon's services: EC2, S3 and all of Amazon's other services.

What is Amazon’s Storage as a Service:

Key Value Storage Service: UPLOAD => ANALYSIS => RENDERING => DISTRIBUTION

All of this is done with Amazon SQS to EC2 to S3 and back. Maintaining the SLA is the key objective of the infrastructure. Scaling is required, sometimes overnight. Close to 500K developers are using the Amazon Cloud. Infrastructure becomes a variable cost as opposed to a capital expense. 600TB of pictures from SmugMug are stored on Amazon S3. "Cloud Computing: Style of computing where massively scalable IT-related capabilities are provided as a service across the internet to multiple external customers." Servers should be available On-Demand and Pay as you Go. Eli Lilly's HPC was moved to Amazon WS. Provisioning and retiring become easy, taking as little as a minute at times.

Then he went on to discuss the history of Amazon’s cloud:

Amazon went from App Server & Database Architecture => Service Orientation => Massively Scalable Services

Amazon virtualized three different pieces:

- Compute => Amazon EC2, deployment of servers and infrastructure and retirement within minutes

- Messaging => Amazon's SQS (Simple Queue Service)

- Storage => Amazon S3, SimpleDB, EBS (Elastic Block Store: a hard disk in the sky, a virtual hard disk between 1GB and 1TB which can be mounted to EC2, highly available and replicated)

Infrastructure Services: Scalable (increase or decrease in minutes), Cost-Effective, Reliable, Secure

Hey, he actually mentioned OpenSolaris and MySQL as part of the supported operating systems and software on the Amazon Cloud. Coool.

Really, none of the discussion was groundbreaking. Everyone out in the industry struggles with these issues day in and day out. Amazon seems to have understood this beforehand and actually created services usable by everyone out there on the WWW.

Since I was particularly interested in their S3 offering (Simple Storage Service), I went and looked up whatever details I could find on their storage infrastructure for S3.

And here it is:
http://aws.amazon.com/s3/#requirements

Two points I noticed among the S3 design requirements they have set for themselves on this page:

- Inexpensive: Amazon S3 is built from inexpensive commodity hardware components. As a result, frequent node failure is the norm and must not affect the overall system. It must be hardware-agnostic, so that savings can be captured as Amazon continues to drive down infrastructure costs.

- Simple: Building highly scalable, reliable, fast, and inexpensive storage is difficult. Doing so in a way that makes it easy to use for any application anywhere is more difficult. Amazon S3 must do both.

Building storage from inexpensive commodity hardware is just that, I guess: inexpensive. They do understand that they have to ensure complete redundancy and availability. I wonder just how much of this commodity hardware they had to use to create the S3 offering. How much time was spent on creating the application which manages, administers and controls the storage? I guess the real question is, is this model a good idea, OR even replicable, for companies looking to build internal clouds? OR, in reality, is it scalable enough as cloud computing becomes bigger and bigger? Is commodity storage sustainable? I am currently not sure. I am sure that as the days pass and I get a chance to actually talk to some of these folks, I will have a much better understanding.

NEXT UP - IBM:

So the talk was primarily a high-level, general discussion on why virtualization is key to the future of IT: how to attack management cost, human cost and so on. It covered investment in Solid State Disk and all the innovations going on in the IT world, and how keeping IT efficient is going to continue to get more difficult. I agree with the gentleman that the appetite for compute, more apps and more data is not going to subside but actually increase. And of course, STORAGE is going to be front and center to all of this. Why? Because everyone is going to need somewhere to store this increasing amount of data, so the DEMAND for storage is going to continue to increase exponentially and, in my opinion, much much faster than the demand for compute. The key thing I need to understand, again, is how enterprise storage fits into this, or maybe I come up with that idea and differentiator myself as opposed to waiting. Let’s see. Once I talk to these folks during the next couple of days, I will have a better understanding of all this and will be in a better position to make a sound, technical judgment on the future of Enterprise storage in the Cloud.

Day 2

Well, Day 1 was a little uneventful, quite honestly. I didn't get a chance to pin down some of the speakers to talk to them about Enterprise Storage etc. I will have a chance to have a one-on-one with them today, Day 2, on the EXPO floor (hopefully). I attended several other sessions but, to my surprise, most were marketing speeches on one product or another. I was disappointed with EMC's session, which seemed like it was going to discuss how to manage and design Clouds and storage for clouds but was primarily a talk about one of their virtualization management tools.

Anyways, I am spending all day today at the Cloud Boot Camp and three other STORAGE specific sessions for cloud and a couple of hours at the expo floor talking to folks.

The Cloud Computing Boot Camp

I have to say, for a conference, this was the best session I have ever attended. Completely vendor neutral, and full of deep technical details, without the spin, of what the CLOUD really is.

So what did I learn at the daylong Boot Camp?

The presenter described a CLOUD as having 6 layers:

1. The Infrastructure Layer

2. The Storage Layer

3. Platform Layer

4. Application Layer

5. Services Layer

6. Client Layer

The focus was primarily on the Infrastructure and Storage layers, which made me happy since I wanted to get more information on how storage fits into the cloud.

The main point driven home here is that Cloud is not for everyone. I completely agree. Especially for those who want to use the Cloud for Storage, such as Amazon S3 or any provider's Storage as a Service. There are many providers in the market today but three are the main players: Nirvanix, Mosso and Amazon. All supposedly use commodity components to put together their public clouds. The key point I took from Day 2 is that one would not put just anything on the cloud. And one would definitely not want to put everything on one cloud, say several TBs of data, as that would really LOCK one in: it would be a huge logistical undertaking to move that data over to a new provider, especially if the bandwidth is as slow as seen.
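
To get a feel for how big that logistical undertaking can be, here is a back-of-the-envelope sketch in Python. Every number in it is an assumption picked for illustration (5TB of data, a 100 Mbit/s link, perfect link efficiency); none of them were quoted at the conference, and real transfers will usually be slower.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to move data between providers over a network link.

    data_tb    -- amount of data in decimal terabytes (1TB = 10**12 bytes)
    link_mbps  -- link speed in megabits per second
    efficiency -- usable fraction of the link (1.0 is an idealized assumption)
    """
    bits_to_move = data_tb * 1e12 * 8
    seconds = bits_to_move / (link_mbps * 1e6 * efficiency)
    return seconds / 86400  # 86400 seconds in a day


# Hypothetical case: 5TB over a fully used 100 Mbit/s link.
days = transfer_days(data_tb=5, link_mbps=100)
print(f"Roughly {days:.1f} days to move the data")
```

Even under these idealized assumptions the move takes several days, which is why putting several TBs with a single provider deserves some thought up front.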

Is Cloud Storage, or Cloud overall, for everyone? Probably not. For transactional type databases or I/O, cloud is a very bad idea. For those who want to run 24/7/365 operations, it could actually be more expensive to run the operation on the cloud, but that is relative as some offer as low as 10 cents per hour (Or was it a month???) for compute with a nice size disk drive (not cloud storage, mind you). Again, still not high performance, high capacity, high I/O capable, especially over the cloud (S3 that is).

Do clouds experience outages? Do clouds lose data? Of course they do. Go to some of the forums and you will read horror stories all over the place. We all remember what happened to Amazon's S3 back in the summer of last year. Why is that? Does that have something to do with "Commodity" components for disks? I can't really say. Mind you, some of the issues I raise here are in some cases extreme, but we all need to plan for the worst and hope for the best, especially when it comes to our data. There is something to be said for thoroughly tested code and cross-platform interoperability testing performed by industry leading engineers. But that is just my view. That is one reason why Sun's Open strategy works very well: it’s not only the few who get to test our code, it’s the entire community. The community gets to write device drivers and everyone else gets to test them, comment on them and even make them better.

So what about "INTERNAL CLOUDS"? Well, my personal feeling is that medium and large organizations which already have well-established datacenter practices and processes will probably jump on the CLOUD concept, but more Private than Public. New startups and very small businesses will probably jump on the public cloud to save on the upfront cost of starting and running the business initially. Eventually, once their business grows and all of a sudden they need scalability beyond just a basic web front end and so on, I think they will look into a traditional private datacenter or a co-located private cloud. WHY? Because public clouds are "SHARED" without much in terms of resource management among different virtual machines. Most guarantees (SLAs) are primarily geared towards Uptime and not much else. If someone wants to actually be guaranteed specific CPU, Memory and I/O at ALL times, they are going to have to start paying higher dollars as opposed to an Amazon EC2 type cloud. This is where I think virtualization technologies which have specific distributed resource management technologies built in come into play: VMware VI3 and the soon to be released VI4 (with very cool new features which I can't really comment on right now due to NDA), Sun xVM | Server of course, and others such as MS Hyper-V. All offer resource management tools such as VMware DRS, by which pools of CPU and memory resources along with I/O can be created and managed automagically by the application itself. One can specify how much compute and I/O they want guaranteed, and the resource manager tool sets policies which then manage these resource needs for the particular virtual machine. A win-win situation for the Cloud user and the Cloud provider. But this level of control, today, is possible primarily in an internal Cloud, with all the cool things which go along with virtualization such as dynamic disaster recovery. Obviously, Storage is at the middle of it all.
SHARED storage allows the CLOUD to even exist. Even if it is commodity storage, it still has to be shared somehow. But for medium to large companies, a private cloud becomes more viable because they can move in existing enterprise-class storage resources (such as the Sun Storage 6000 series or the Sun Storage 9900 series) and take advantage of built-in technologies such as thin provisioning (think: provision ONLY what is used and manage the scalability as needed) and Remote Replication, such as RVM for the Sun Storage 6000 series and UVR / TrueCopy for the Sun Storage 9000 series, for Disaster Recovery with integrated tools such as VMware Site Recovery Manager. ZFS can also be used to spread the data across several arrays and several types of arrays.

Later on in the day, I had a one-on-one conversation with EMC's Senior Product Manager for Cloud Infrastructure Group (EMC ATMOS). I had made a point of looking at EMC's ATMOS product a couple of days ahead of the conference to understand what EMC's cloud storage offering was. We had a very good discussion on the future of block storage (specifically enterprise storage) in terms of Cloud Computing. I suggested that enterprise storage is here to stay for a while until Cloud S3 matures enough to provide the bandwidth, the availability and reliability required for high performance, high bandwidth and high availability applications such as Oracle OLTP, SAP etc. This is obviously my personal view and nothing to do with Sun's view on this.

I attended a session by IBM's Chief Technical Strategist (sounds like my counterpart position at IBM but at a higher level). A very knowledgeable gentleman in terms of storage and the future of storage, I think. This was the second-best session (in my opinion) I attended, after the Cloud Computing Boot Camp. He actually pointed out and confirmed several of my personal thoughts on how storage fits into the Cloud story: Commodity Components VS Homogeneous components. Not surprisingly, IBM's private cloud storage offering currently includes IBM 3200 and 4700 storage arrays along with others. More reliable, industry tested, scalable storage arrays with built-in DR and thin provisioning capabilities are ideal for Internal Storage and Compute Clouds built on virtualization applications such as Sun xVM Server, VMware and others.

My Conclusions on the Future of Block Storage in terms of Cloud:

Block storage is extremely good at block access response times and transaction processing and RAS (reliability, availability and serviceability). I do not see enterprise storage going away anytime soon just because we can store TBs of data in the cloud. What matters is what we can and cannot do with the data. Enterprise block storage is here to stay and will continue to do what it does best. But why the push towards "Commodity Storage Components" (many many blade servers with large commodity disks)? Well, COST of course. Huge clouds such as Amazon S3 and others sit on commodity storage for a reason. They do not have to worry about offering high bandwidth, high I/O to the users because the users seem to understand what they can and can’t do with Cloud Storage as a Service offerings and they use it as such: Archiving, long term retention and other similar types of usage.

P.S. The views in ALL of my blogs are mine and not in any way of Sun Microsystems Inc.

Friday Feb 06, 2009

This MSDN blog post provides a very good in-depth discussion of Microsoft Windows Server 2008 multipathing policies:


 http://blogs.msdn.com/san/archive/2008/07/27/multipathing-support-in-windows-server-2008.aspx

Friday Nov 14, 2008

As promised, here is a link to my presentation video for my session at the Sun Forum 2.0 event in Menlo Park.


I am also including the link to my teammate's Oracle Sizing Best Practices presentation here for you all. Enjoy; feedback and/or questions welcome:


- Storage for Virtualized Data Centers


http://www.sun.com/events/forum2.0/media_shell.jsp?id=%202168293001


- Oracle Sizing Best Practices
http://www.sun.com/events/forum2.0/media_shell.jsp?id=%202164211001

Tuesday Nov 04, 2008

Sun is hosting a Storage Forum November 11th through the 13th in Menlo Park, CA. The cool thing about this year's Sun Forum is that we are going to use the latest technologies to enable individuals to participate remotely via Live Online Virtualization Participation. Participants will have real-time access to the conference keynotes, breakout sessions and roundtable birds-of-a-feather sessions, and will also be able to ask questions and chat with fellow FORUM attendees without having to leave their office. Since the event is by invitation only, not everyone will be able to participate. However, I am speaking at the event on Block Storage Best Practices for Virtualization Technologies like Sun xVM | Server, VMware and Microsoft Hyper-V. I will make my presentation available on my blog after the event, along with the recording.

Friday Oct 31, 2008

I wrote a detailed White Paper on what RAID-6 is and the best practices for using RAID-6 on Sun Storage 2500 series, 6140, 6580 and 6780 arrays. Here is the link to the paper:

http://www.sun.com/offers/docs/820-7395.pdf

Happy reading. Feedback/Comments welcome.

I drove up to Oklahoma City University from Dallas yesterday to speak at the Oklahoma City OpenSolaris Users Group Meeting. Great audience. We had a very interactive discussion regarding Sun's Virtualization portfolio (Sun xVM | Server, xVM HyperVisor - OpenSolaris, VMware, MS Hyper-V, VirtualBox, xVM OPS Center) and how Sun's Block Storage (ST9000 series, ST6000 series and ST2000 series) fits into a virtualized data center environment. We discussed best practices in calculating and designing virtualized storage infrastructures for virtualized datacenters. I will post the slide deck and the recording here once I have them compiled.

Their website link is here:

http://opensolaris.org/os/project/okcosug/

Wednesday Oct 15, 2008

VMware has obviously revolutionized the concept of Disaster Recovery by introducing Site Recovery Manager earlier this year. SRM takes advantage of a storage array's remote replication capabilities to replicate the Virtual Machine data necessary to bring up a Virtual Machine at the DR site without having to create and install a new OS and applications. Currently, Remote Replication configuration needs to be performed at the storage array level. All VMs that need to be part of the DR plan need to reside on Primary Volumes (P-VOLs) which are part of a Remote Replication consistency group or pair. SRM simply creates a DR plan, and then makes use of Storage Replication Adapters (provided by the specific array's vendor) to work with the array's replication of the data over to the secondary site. Once replication is complete, SRM allows users to perform DR tests without bringing the Production site down, and even to bring up the DR site if the Production site does go down. Now this is a very general description of how SRM works. I am not going into the intricate details of SRM's functionality at this time.

       So what if one doesn't have SRM, can't use it, can't afford it, or the arrays are not currently supported with SRM? Well, the remote replication capabilities of the arrays are already there and can be used without SRM. Configuration of remote replication is performed at the storage array in any case, which is as simple as using the Sun StorageTek Common Array Manager software for the Sun StorageTek 6000 series and Storage Navigator for the Sun StorageTek 9000 series arrays.

       Once RemoteCopy (ST6000 series) or TrueCopy or Universal Replicator (ST9000 series) has been configured and P-VOL <=> S-VOL pairs have been established, it is just a matter of creating VMs using the P-VOLs, syncing the pairs and one will have a DR site ready.

       Obviously I make it sound simplistic. It isn't, of course, and all of the nuances and caveats apply whether SRM is being used or not. Soon, I will post technical specifics on what these nuances and caveats are, along with configuration best practices for creating a successful DR configuration for VMware environments.

Sunday Sep 28, 2008

Hello World. Welcome to my Weblog. Here, we will participate in discussions regarding how Sun Block Storage (Sun StorageTek 2500 Series, Sun StorageTek 6000 Series and Sun StorageTek 9000 Series) products integrate with Virtualization technologies such as the Sun xVM portfolio, VMware, Microsoft Hyper-V and so on.

I plan on posting several topics here in the coming weeks which will be open for community discussion. Please do not hesitate to openly participate. 

Said-

===

Said A. Syed has over 13 years of industry experience, including over 7 years with Sun. Said started with Sun as a System Support Engineer in Chicago supporting high-end and mid-range servers and Sun storage products. Said joined Sun’s Storage Product Technical Support group in 2004 as the Sun Support Services global lead for Brocade SAN products. In this position, Said managed the Sun Support Services relationship with Brocade Support Services directly and supported world-wide Sun customers on high visibility, high severity escalations involving SAN infrastructure products and Sun’s high-end and low-cost storage products, the Sun StorageTek 3000 and 9000 series arrays. In 2008, Said was promoted to a Staff Engineer role within Sun’s NPI and OEM array engineering group and is currently chartered with gaining an in-depth understanding of how virtualization applications such as VMware ESX server, the Sun xVM platform, Sun Logical Domains (LDoms), Solaris™ Containers and other similar applications interact with Sun’s modular and high-end storage arrays, the Sun StorageTek 6000 and 9000 series arrays.
