NetWorker Performance Tuning
Welcome to EMC NetWorker Performance Tuning. The AUDIO portion of this course is supplemental to the material and is not a replacement for the student notes accompanying this course. EMC recommends downloading the Student Resource Guide from the Supporting Materials tab, and reading the notes in their entirety.
These materials may not be copied without EMC's written consent. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. EMC is a registered trademark, and NetWorker is a trademark, of EMC Corporation. All other trademarks used herein are the property of their respective owners.
Course Objectives
Upon completion of this course, you will be able to:
• Describe components that impact performance
• Describe tests to isolate issues and discuss simple fixes
• Discuss optimization features
The objectives for this course are shown here. Please take a moment to read them. This course is designed to address non-disruptive performance tuning options. Some physical devices may not deliver the expected performance, but this course assumes that the option of replacing hardware has already been exhausted, and that replacing one physical component with a faster one simply moves the bottleneck to another component. The course therefore addresses NetWorker performance tuning with minimal disruption to the existing environment and looks at features that can be fine-tuned to achieve better performance with the same set of hardware. It also provides pointers for debugging the environment.
The objectives for this module are shown here. Please take a moment to read them.
Devices on backup networks can be grouped into four component groups: system, storage, network, and target devices. These groupings are based on how and where the devices are used. In a typical backup network, all four components are present. For example, a disk used as a source device is considered part of the storage component; the same kind of disk used as the backup target is part of the target devices component.
The components that impact performance in system and storage configurations are listed here. Please take a moment to review them.
The components that impact performance in network and target device configurations are listed here. Please take a moment to review them.
NetWorker clients (application servers) hold mission critical data and are resource intensive. Applications on these NetWorker clients are the primary users of CPU, network, and I/O resources. These NetWorker clients own the data that needs to be protected and act as a source for backups. In NetWorker clients, system, storage, and network components like CPU, memory, I/O, HBA, disk, RAID, IP networks, and storage networks affect the performance of backups.
Target devices, like tape libraries, are attached to the storage nodes, which are dedicated for backup operations. Storage nodes are I/O intensive as they primarily move data from memory to tape. In storage nodes, system, network and target device components like CPU, memory, I/O, HBA, IP networks, storage networks, SCSI, and fibre configurations affect the performance of backups.
The NetWorker server is the backup server that manages all backup operations. NetWorker servers, which can also act as storage nodes, are typically CPU and memory intensive due to the index processing operations. In NetWorker servers, system and network components like CPU, memory, I/O, HBA, IP networks, and storage networks affect the performance of backups.
Connectivity Overview
• Components may perform well as standalone devices, but how well they perform with the other devices on the chain is what makes the configuration optimal
• Components on the chain are of no use if they cannot talk to each other
• Backups are data intensive operations and may generate large amounts of data. The data needs to be transferred at optimal speeds to meet the business needs
• There will always be a component that is considered a bottleneck
The backup environment consists of various devices from the system, storage, network, and target device components, with hundreds of models from various vendors available for each of them. With all the available devices, we could end up with millions of unique backup environment setups. These devices perform at their advertised maximum speeds when used in isolation, but they are of no real use in isolation; only when they are chained together can they be used to perform backups. The interconnectivity of these devices typically determines the performance. Although devices can understand each other's protocols, it is important that the devices at both ends of a link perform on par with each other. A GigE network adapter can transfer data at about 60-80 MB/sec. However, if the device in the middle is a 100Base-T switch, it can only receive and send at about 8 MB/sec. Similarly, if disk reads can be performed at only 20 MB/sec and we use a GigE network to transfer the data, we may never use the GigE network at its optimal speed. There will always be a device that is considered to be the bottleneck. It is important to use devices that perform on par with the other devices in the chain instead of mixing high and low performance devices on the chain.
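The chain reasoning above can be sketched in a few lines of Python: the end-to-end rate of a data path is limited by its slowest component. The component names are illustrative, and the 70 MB/sec figure is simply an assumed midpoint of the 60-80 MB/sec GigE range quoted above.

```python
def find_bottleneck(chain):
    """Return (name, rate) of the slowest component in a data path.

    chain maps a component name to its sustained throughput in MB/sec.
    The end-to-end rate of a chain can be no higher than its slowest link.
    """
    name = min(chain, key=chain.get)
    return name, chain[name]


# Illustrative figures only: 70 MB/sec is an assumed midpoint of the
# 60-80 MB/sec GigE range; 8 MB/sec is the 100Base-T switch figure.
path_with_slow_switch = {
    "GigE adapter": 70.0,
    "100Base-T switch": 8.0,
}

# Second case from the text: a 20 MB/sec disk feeding a GigE network.
path_with_slow_disk = {
    "disk reads": 20.0,
    "GigE network": 70.0,
}

for path in (path_with_slow_switch, path_with_slow_disk):
    name, rate = find_bottleneck(path)
    print(f"bottleneck: {name} at about {rate:.0f} MB/sec")
```

Listing every component in the data path of a client this way makes it easier to see which upgrade would actually raise the end-to-end rate.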
Any device belonging to one of the four component groups can be a bottleneck. In this example, you can see that although the system is able to read more data, the network is not able to gather and send the same amount of data, thus slowing down the entire backup process. Just one network device on the chain, such as a hub, switch, or a NIC, can be a bottleneck and slow down the entire operation.
GigE Network
As you can see in this example, the network has been upgraded from a 100Base-T network to a GigE network. Now that the network has been upgraded, the bottleneck has moved to another device. This host is the bottleneck because it is not able to generate data fast enough to utilize the available network bandwidth. System bottlenecks can be due to a lack of CPU, memory, or other resources.
In this example, the NetWorker client has been upgraded to a larger system to remove it as the bottleneck. With a better system and more network bandwidth, the bottleneck has moved to the target device. The tape devices are not able to perform as well as the other components. This can be due to various factors, such as limited SCSI bandwidth compared to the available network bandwidth, or the tape drives having reached their maximum performance. Improve the target device performance by introducing higher performance tape devices, such as fibre channel based drives. Utilizing a SAN environment may also improve performance. Note: The example describes one client backup being sent to one storage node. In typical environments, where a single storage node receives data from multiple NetWorker clients, target devices handle a large amount of data movement and often end up being the bottleneck when the network is not.
[Figure labels: GigE Network, SAN, NetWorker Client, NetWorker Storage Node, Control Path, Data Path]
This example introduces higher performance tape devices and adds them to the SAN so that they are no longer the bottleneck. We now have plenty of system, network, and target device resources, so none of these is the bottleneck. We have now hit a bottleneck in storage. Although the local volumes are performing at their optimal speeds, they cannot supply data fast enough to utilize the available system, network, and target device resources. To improve the storage performance, move the data volumes to high performance external RAID arrays.
Now that we have introduced external RAID arrays and improved the storage performance, storage performs nearly on par with the other components in the chain, which ensures that the performance expectations are met. Although there will always be a bottleneck, the impact of the bottleneck device is very limited here, as all the devices in the chain are performing nearly on par with each other. Note: You do not have to move to a SAN environment to improve your backup performance. Also, this course does not suggest that you need to upgrade all your components to improve performance. These slides merely help you understand the bottlenecks in backup environments and the importance of having devices in the chain that perform at about the same speed as the other devices in the chain.
Data Overview
• The type of data to be backed up impacts the backups
• Millions of files slow down traditional backups
• Large files with changes impact Incremental/Differential backups
A full backup of 5 million 20 KB files takes longer than a full backup of 500,000 200 KB files, although the total size of each save set is 100 GB. Conversely, an incremental/differential backup of a 100 GB save set made up of 1,000 files of 100 MB each, with 50 files modified, takes longer than one of a 100 GB save set made up of 100,000 files of 1 MB each, with 50 files modified. It is very important to understand the nature of the data and the type of backup being performed. In the case of 5 million files, the full backup takes longer because NetWorker is a file based solution and needs more resources to process that many files and move the data to the target devices. In the case of 1,000 large files, because NetWorker is a file based backup solution, it backs up the entire file again during incremental/differential backups, even if just one block of data has changed. Hence, NetWorker moves more data to the target devices for 50 files of 100 MB each (5 GB) than for 50 files of 1 MB each (50 MB).
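The arithmetic behind these comparisons can be checked with a short Python sketch. It assumes decimal units (1 GB = 1,000 MB), which is what makes the file counts and sizes quoted above add up to 100 GB; the function names are illustrative.

```python
KB = 1000
MB = 1000 * KB
GB = 1000 * MB


def save_set_size(file_count, file_size):
    """Total bytes in a save set of equally sized files."""
    return file_count * file_size


def incremental_size(modified_files, file_size):
    """Bytes moved by a file based incremental backup.

    Every modified file is backed up whole, however small the change.
    """
    return modified_files * file_size


# Full backups: same total size, very different file counts.
many_small = save_set_size(5_000_000, 20 * KB)
fewer_large = save_set_size(500_000, 200 * KB)

# Incrementals: 50 modified files in each 100 GB save set.
incr_large_files = incremental_size(50, 100 * MB)
incr_small_files = incremental_size(50, 1 * MB)

print(many_small // GB, fewer_large // GB)   # both save sets in GB
print(incr_large_files // MB, incr_small_files // MB)  # incrementals in MB
```

With the same number of modified files, the save set made of 100 MB files moves 100 times as much data during an incremental backup as the one made of 1 MB files.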
Backups are resource intensive operations and tend to impact the performance of primary applications. When sizing systems for applications, backups and the related bandwidth requirements are commonly overlooked, which has an immediate impact on the performance of the applications whenever backups are performed. Backups are not the primary processes on the NetWorker clients, but they are nevertheless critical. Backups consume a good amount of resources on the NetWorker client; hence, when a NetWorker client is low on resources, both the backup operations and the primary applications are affected. Backups are slower on NetWorker clients with millions of files. Because typical backup applications are file based solutions, the time lost to disk rotational latency and seek times across millions of files impacts their performance. Both encryption and compression require large amounts of resources on the NetWorker client and can significantly affect the performance of the backups.
Contrary to the notion that backup servers need high I/O bandwidth, it is the storage nodes that need high I/O bandwidth as they handle the data transfer from network to target devices. In most instances, the NetWorker server also acts as a storage node, which increases workload on the NetWorker server. Storage nodes buffer data received from the NetWorker clients and send it to specific target devices as directed by the NetWorker server. Storage nodes act as intermediaries that receive the data from the NetWorker clients and direct it to the target devices.
Index and media management operations are some of the primary processes of the NetWorker server. For every save set backed up from a NetWorker client, the NetWorker server stores the associated metadata in the client's index database, which can grow considerably over time, depending on the environment.
Module Summary
Key points covered in this module are:
• Components that impact performance
• Role of connectivity and bottlenecks
• Role of data
• Typical factors that affect backups in client, storage node and NetWorker server
These are the key points covered in this module. Please take a moment to review them.
The objectives for this module are shown here. Please take a moment to read them.
Backup Environment
• Verify your backup environment with NetWorker server, storage nodes, clients and the network and storage infrastructure information
• Review your Recovery Time Objective (RTO) for each client
• Verify the backup window for each NetWorker client
• List the amount of data to be backed up for each client during full and incremental backups
• Determine the data growth rate for each client
It is easy to say that the backup environment is not performing up to one's expectations. However, it is difficult to list the performance expectations precisely while keeping in mind the environment and the devices used. It is good to know the bottlenecks in the setup and set the expectations accordingly. Look at your backup environment and create a diagram if you do not have one available. List all system, storage, network, and target device components, along with the data path. Mark the bottleneck component in the data path of each client. It is very important to know how much downtime you can afford for each of your NetWorker clients, as this dictates your Recovery Time Objective (RTO). Review and document your RTO for each of the NetWorker clients. Verify the available backup window for each NetWorker client, and list the amount of data that needs to be backed up from each client for full and incremental backups, along with the average daily/weekly/monthly data growth on each NetWorker client.
Backup Configurations
• Verify the backup policies created and ensure that the policies are tuned towards meeting the Recovery Time Objective (RTO) of each of the clients
• Estimate backup window for each NetWorker client based on the information collected
• Verify the clients are organized in logical groups based on parameters like backup window, business criticality, physical location, retention policy
Ensure that you will be able to meet your Recovery Time Objectives with your backup schedules for each NetWorker client. The shorter the acceptable downtime, the more expensive your backups are. If your acceptable downtime is very short, it may not be possible to reconstruct a backup image from a full backup and multiple incremental backups in time. You may have to perform full backups more frequently, which requires a longer backup window on additional days and also increases network bandwidth requirements. Create backup time estimates for each of your NetWorker clients, considering the level of backup to be performed, the amount of data to be backed up, and the slowest component in the chain. Remember that actual performance will be less than theoretical performance. Organize the NetWorker clients into logical groups based on parameters such as start time of the backup window, backup window duration, business criticality, physical location of the clients, and data retention policy.
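A backup time estimate of the kind described above can be sketched as follows. This is a rough model, not a NetWorker calculation: the `efficiency` derating factor is a hypothetical parameter standing in for the gap between theoretical and actual performance, and all rates in the example are made-up values.

```python
def estimate_backup_hours(data_gb, component_rates_mb_s, efficiency=0.7):
    """Estimate the hours needed to back up data_gb gigabytes.

    component_rates_mb_s maps each component in the data path to its
    theoretical throughput in MB/sec. The slowest component sets the pace.
    efficiency is an assumed derating factor (0 < efficiency <= 1).
    """
    slowest = min(component_rates_mb_s.values())
    effective_rate = slowest * efficiency          # MB/sec actually achieved
    seconds = data_gb * 1000 / effective_rate      # 1 GB taken as 1000 MB
    return seconds / 3600


def fits_window(estimated_hours, window_hours):
    """True when the estimated backup completes inside the backup window."""
    return estimated_hours <= window_hours


# Hypothetical client: 500 GB full backup, tape drive is the slowest link.
rates = {"disk": 60.0, "network": 70.0, "tape": 30.0}
hours = estimate_backup_hours(500, rates)
print(f"estimated {hours:.1f} h, fits 8 h window: {fits_window(hours, 8)}")
```

Running the same estimate with the incremental data size for each client gives a second figure to compare against the backup window on non-full days.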
Before concluding that the performance of a particular NetWorker client is abnormally poor, it is important to observe how the client performs across various parameters. The issue can be a simple defect in the software or firmware, which in most cases shows up as inconsistent backup speeds. For each specific NetWorker client, observe:
• Is the performance consistent for the entire duration of the backup?
• Is there a change in performance if the backup is started in a different time window?
• Is it consistent across all clients using a specific storage node?
• Is it consistent across all save sets for the client?
• Is it consistent across all clients in the same subnet?
• Is it consistent across all clients with similar operating systems, service packs, applications?
• Does the backup performance improve during the save or does it decrease?
These and similar questions help you understand the issues at hand.
One of the most common reasons for poor NetWorker performance is related to name resolution. NetWorker expects forward and reverse lookups to match; any issues with these reduce backup performance and, at times, prevent backups from starting. Please take a moment to review the suggestions shown above. Verify that your setup has no name resolution conflicts.
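A quick consistency check of forward and reverse lookups can be sketched in Python with the standard `socket` module. This is only an approximation of what NetWorker expects: the matching rule (the name must equal the primary name or one of the aliases returned by the reverse lookup, ignoring case) is an assumption, and the function name is illustrative.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that forward and reverse lookups of hostname agree.

    Returns (ok, detail). The forward and reverse callables default to the
    system resolver and can be replaced for testing.
    """
    try:
        address = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"forward lookup of {hostname} failed: {exc}"
    try:
        primary, aliases, _addresses = reverse(address)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"reverse lookup of {address} failed: {exc}"
    names = {name.lower() for name in [primary, *aliases]}
    if hostname.lower() in names:
        return True, f"{hostname} -> {address} -> {primary}"
    return False, f"{hostname} -> {address} -> {primary} (mismatch)"


# Example call against the local resolver:
#   ok, detail = check_name_resolution("client1.example.com")
```

Running the check from the NetWorker server, the storage node, and the client for each of the other hosts helps reveal entries that resolve differently depending on where the lookup is made.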
Another typical issue that slows data transfer and NetWorker performance is related to the path the data travels from the client to the storage node. The fewer the hops, the faster the data reaches its destination. You can verify the path using the trace route tools provided by the operating system. Verify that your setup uses the minimum number of hops to reach the destination.
Data does not travel via the dedicated backup network when it is not configured appropriately. Backups then share the primary network with the other applications, which tends to slow them down. Ensure that the Server Network Interface attribute in the NetWorker client resource reflects the appropriate backup network interface. Also make sure that all the network interfaces used for backups are in the same subnet. Having a client's network address in a subnet that is different from the storage node's subnet slows down the backups.
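The subnet check described above can be sketched with Python's standard `ipaddress` module. The addresses below are hypothetical; the interfaces are given in CIDR notation so that the network can be derived from the address and prefix length.

```python
import ipaddress


def same_subnet(client_interface, storage_node_interface):
    """True when both interfaces (e.g. '192.168.10.21/24') share a subnet."""
    client = ipaddress.ip_interface(client_interface)
    node = ipaddress.ip_interface(storage_node_interface)
    return client.network == node.network


# Hypothetical backup interfaces of a client and two storage nodes.
print(same_subnet("192.168.10.21/24", "192.168.10.5/24"))
print(same_subnet("192.168.10.21/24", "192.168.20.5/24"))
```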
A simple way to verify whether backups are performing at expected speeds, without using NetWorker components, is to perform an ftp test. An ftp test helps determine whether the network or the tape device is the bottleneck. Do not use local volumes to create and transfer files for ftp tests; use backup volumes. Create a large data file on your client, or use the backup data, and send it to the storage node using ftp. Note the time taken for the data transfer and compare it with the backup time for a similar amount of data. If the ftp transfer is much faster than your backups, the issue might be with your tape devices. If the performance is similar, the network might be the reason for the slow performance.
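The decision rule for the ftp test can be written down as a small Python sketch. The `tolerance` threshold that separates "much faster" from "similar" is an assumption, since the text does not define one, and the third outcome (ftp clearly slower than the backup) is added only so that the function returns something sensible in that case.

```python
def interpret_ftp_test(ftp_mb_s, backup_mb_s, tolerance=0.25):
    """Interpret an ftp transfer rate against a backup rate (both MB/sec).

    tolerance is an assumed fraction of the backup rate within which the
    two rates are treated as similar.
    """
    if abs(ftp_mb_s - backup_mb_s) <= backup_mb_s * tolerance:
        return "similar: suspect the network"
    if ftp_mb_s > backup_mb_s:
        return "ftp much faster: suspect the tape devices"
    return "ftp slower than backup: inconclusive, repeat the test"


# Hypothetical measurements, each computed as data size divided by time.
print(interpret_ftp_test(ftp_mb_s=40.0, backup_mb_s=10.0))
print(interpret_ftp_test(ftp_mb_s=10.5, backup_mb_s=10.0))
```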
Generic Tests: dd
• Create a large data file on your storage node and use dd to send it to the target device
Example: date; dd if=/tmp/5GBfile of=/dev/rmt/0cbn bs=1024k; date
Note the time taken for transferring the file and compare it with the tape performance
dd tests are another way to test the performance of the setup without using any of the NetWorker components. Using dd, you can send data to a target device and compare the device's throughput with the maximum throughput stated by the manufacturer of the target device. Create large data files and use scripts similar to the one shown in the slide. Back up these files to tape and observe the performance.
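Turning the two timestamps printed around a dd run into a throughput figure is simple arithmetic, sketched here in Python. The timestamps and the 30 MB/sec rated speed are made-up values for illustration; use the figures from your own run and from the drive manufacturer.

```python
from datetime import datetime


def dd_throughput_mb_s(bytes_written, started, finished):
    """Throughput in MB/sec (1 MB = 1,000,000 bytes) between two timestamps."""
    elapsed = (finished - started).total_seconds()
    if elapsed <= 0:
        raise ValueError("finished must be later than started")
    return bytes_written / elapsed / 1_000_000


def percent_of_rated(measured_mb_s, rated_mb_s):
    """Measured throughput as a percentage of the rated throughput."""
    return 100.0 * measured_mb_s / rated_mb_s


# Hypothetical run: a 5 GB file written in 4 minutes 10 seconds.
start = datetime(2006, 1, 1, 22, 0, 0)
end = datetime(2006, 1, 1, 22, 4, 10)
measured = dd_throughput_mb_s(5_000_000_000, start, end)
print(f"{measured:.1f} MB/sec, {percent_of_rated(measured, 30.0):.0f}% of rated")
```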
Native performance monitoring tools can be used to monitor I/O, disk, CPU, and network performance. When used to observe performance over a period of time, these tools help us understand the resources consumed by each application, including NetWorker. The observations may show, for example, that backups are slow because the network is being used to its maximum capacity by other applications. In such cases, backups can be scheduled to start at a different time.
BIGASM and UASM are NetWorker based tests that are typically used to verify performance. BIGASM generates a file of a specified size, transfers it over a network or SCSI connection, and then writes it to tape or another target device. Because BIGASM creates the stream of bytes in memory and saves it to the target device, it eliminates disk access. This helps to test the speed of the NetWorker clients, the network, and the tape devices while ignoring disk access. The UASM command can be used to test disk read speeds; by writing the data to a null device, it may help in identifying disk-based bottlenecks.
Contact EMC NetWorker support and ask for performance testing tools, including the blaster utility. These tools are not distributed freely and help to identify bottlenecks in a heterogeneous NetWorker environment.
Module Summary
Key points covered in this module are:
• Reviewing backup configurations
• Performing generic tests to verify configuration
• Performing NetWorker based tests to verify configuration
These are the key points covered in this module. Please take a moment to review them.
Module 3 - Optimization
Upon completion of this module, you will be able to:
• Describe NetWorker based optimization features
• Describe fine tuning servers, storage, network and target devices
• Describe disk backup advantages
The objectives for this module are shown here. Please take a moment to read them.
EMC NetWorker has various optimization features used to tune the backup environment to achieve better performance. Server parallelism, client parallelism, target sessions, and Dynamic Drive Sharing are some of the features that directly impact performance when used alone or in conjunction with the other features.
The server parallelism attribute setting controls the maximum number of save sets that may be backed up simultaneously. This attribute, when used with the other performance attributes like client parallelism and target sessions, helps to optimize the setup and keep the data flowing to the target devices at maximum levels. When the server parallelism value is set too high, it might overload the network, clients, storage nodes, or NetWorker server and impact the performance of other applications. When it is set too low, there may not be enough data coming into a storage node to keep the target devices performing at desired levels.
The client parallelism attribute setting enables the client to back up more than one save set at the same time. This attribute, when used with the other performance attributes like server parallelism and target sessions, helps optimize the setup and keep the data flowing to the target devices at maximum levels. When the client parallelism value is set too high, it might overload the NetWorker client and consume most of the CPU, memory, and network resources, thereby having an adverse effect on the primary application. When it is set too low, the backups may not complete within the backup window. Setting the client parallelism to a low value also impacts the overall performance, as there may not be enough data coming into the storage node to keep the target devices performing at desired levels.
The target sessions attribute setting controls the number of simultaneous save streams directed to a device before additional save streams are directed to other devices. Target sessions allow multiple save streams to flow to the target device at the same time. This attribute, when used with the other performance attributes like server parallelism and client parallelism, helps optimize the setup and keep the data flowing to the target devices at optimum levels. When the target sessions value is set too high, the save streams on the tapes are fragmented, which might slow down restores, as one save set might span tapes and increase seek times. Target sessions is not a hard limit: when the value is low and all devices are already in use and receiving their target number of sessions, the NetWorker server overrides the value and uses the device with the least activity for the next session.
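The way these three attributes interact can be illustrated with a toy model in Python. It is a simplified reading of the description above, not NetWorker's actual scheduler: devices are filled in order up to the target sessions value, the least active device is used once every device has reached that value, and server parallelism caps the number of save sets running at once. Function and device names are hypothetical.

```python
def pick_device(active_sessions, target_sessions):
    """Choose the device for the next save stream.

    active_sessions maps device name to its current number of sessions.
    """
    for device, count in active_sessions.items():
        if count < target_sessions:
            return device  # first device still below its target sessions
    # Every device is at or above target sessions: use the least active one.
    return min(active_sessions, key=active_sessions.get)


def assign_streams(devices, target_sessions, save_sets, server_parallelism):
    """Distribute concurrently running save streams across devices."""
    running = min(save_sets, server_parallelism)  # server parallelism cap
    active = {device: 0 for device in devices}
    for _ in range(running):
        active[pick_device(active, target_sessions)] += 1
    return active


# Two drives, target sessions of 4.
print(assign_streams(["drive1", "drive2"], 4, save_sets=6, server_parallelism=32))
print(assign_streams(["drive1", "drive2"], 4, save_sets=20, server_parallelism=10))
```

In the first call the second drive only starts receiving streams after the first has four; in the second call server parallelism limits the run to ten streams, and both drives end up above their target sessions value.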
The dynamic drive sharing feature allows NetWorker to recognize the shared drives in a SAN environment. Dynamic drive sharing controls application requests for media and allows the NetWorker server and storage nodes to access and share all attached devices. It enables NetWorker to skip the shared drives that are in use and route backups or recoveries to other available shared drives. Dynamic drive sharing also allows clients to be configured as SAN storage nodes, which routes the backup traffic via the SAN, reduces LAN traffic, and speeds up backup operations.
• NetWorker consumes lots of CPU and memory resources for skewed directories
  - Example: More resources are needed to process one directory with 1 million files than 1000 directories with 1000 files each
In most cases, particularly with small files, seek time and rotational latency of the disk take up more time than controller overhead and data transfers. A highly fragmented file system can have a significant impact on the performance of applications as well as backups. Operating system tools and other freely available tools can be used to defragment the file system for better performance. A large page file/swap space also helps increase the performance of both applications and backups. Moving the page file to a separate volume helps reduce the I/O on the file system being backed up and provides more resources for backups. Skewed directories, which hold a very large number of files, consume much more CPU and memory resources than typical directories with fewer files, and this at times impacts the performance of the backups. Identifying and correcting such file systems helps improve performance.
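Skewed directories can be located before a backup with a short Python sketch that walks the file system and reports directories holding more files than a chosen threshold. The threshold of 100,000 files is an arbitrary assumption; the text does not say at what count a directory becomes a problem.

```python
import os


def find_skewed_directories(root, max_files=100_000):
    """Return (path, file_count) for directories with more than max_files
    files directly inside them, largest first."""
    skewed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > max_files:
            skewed.append((dirpath, len(filenames)))
    return sorted(skewed, key=lambda item: item[1], reverse=True)


# Example call (the path is hypothetical):
#   for path, count in find_skewed_directories("/export/home"):
#       print(count, path)
```

Directories reported this way are candidates for being split into several smaller directories.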
• Compression:
  - If the network is the bottleneck, ensure that the data is compressed before sending it to the storage nodes
  - If CPU and memory are the bottlenecks, ensure that the data is not compressed on the file system
• In virtual environments like VMware, remember that the physical resources are shared between multiple virtual machines
  - Do not start the backups of all virtual clients belonging to the same physical system at the same time
© 2006 EMC Corporation. All rights reserved. EMC NetWorker Performance Tuning
Some anti-virus software is known to slow down backups. Changing the anti-virus software settings so that backup applications like NetWorker bypass anti-virus checking for each file helps improve performance. If the network is identified as the bottleneck, enabling compression at the NetWorker client helps improve performance, as it reduces the amount of data that needs to be transferred over the network. Compression can be achieved using the compressasm directive in NetWorker. If CPU and memory on the NetWorker clients are the bottleneck, ensure that compression is not enabled, as it consumes more of those resources. When backing up servers in virtual environments, it is important to know that multiple servers use the same physical hardware, and their collective performance cannot be much greater than the performance of a single server on the physical host. In such scenarios, ensure that the virtual hosts on the same physical server are not all backed up at the same time.
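The compression trade-off can be illustrated with a small Python sketch. It uses the standard `zlib` module purely as a stand-in to show the idea; it says nothing about the algorithm or ratios of the compressasm directive, and the sample data and the 8 MB/sec network rate are assumptions.

```python
import time
import zlib


def measure_compression(data, level=6):
    """Return (ratio, cpu_seconds) for compressing data with zlib."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(compressed), elapsed


def network_seconds(size_mb, network_mb_s, ratio=1.0):
    """Seconds needed to send size_mb of data once reduced by ratio."""
    return size_mb / ratio / network_mb_s


# Highly repetitive sample data, so it compresses far better than real data.
sample = b"NetWorker save set sample line\n" * 20_000
ratio, cpu_seconds = measure_compression(sample)
print(f"ratio {ratio:.1f}:1, compression took {cpu_seconds:.4f} s")

# 800 MB over an 8 MB/sec link, uncompressed and at an assumed 2:1 ratio.
print(network_seconds(800, 8.0), network_seconds(800, 8.0, ratio=2.0))
```

Compression pays off when the transfer time saved on the network exceeds the time the client spends compressing; on a client that is already short of CPU, the balance tips the other way.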
• Verify maximum open files value is a large one
• When backup volumes are filled more than 90%, backup performance falls drastically
In the case of large file systems and file systems with large files, performance can be improved by adjusting the kernel parameters to read large blocks and setting the NSR_READ_SIZE variable to an appropriate value. Ensure that the kernel values match with the values set in NetWorker. Changing the maximum open files kernel parameter in the operating system can help improve performance in instances where a large number of files are opened and NetWorker cannot open as many files as it desires.
Storage RAID levels have a direct impact on the performance of the backups. The better the read transaction rate, the faster the backups are. However, the RAID level has to be balanced against cost and application requirements; it cannot be chosen with only backups in mind. When disks are upgraded, the primary motivation is usually to increase capacity. If disk performance is overlooked in the process, the result can be a slowdown. For example, replacing a 320 GB 15000 RPM disk with a larger 500 GB 10000 RPM disk can actually reduce performance and slow down backups.
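One reason the slower-spinning disk can hurt is rotational latency, which can be worked out directly from the spindle speed: on average, the platter must turn half a revolution before the requested sector arrives under the head. The Python sketch below covers only this one component; seek time and transfer rate also affect real disk performance.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


for rpm in (15000, 10000):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.1f} ms average latency")
```

Moving from 15000 RPM to 10000 RPM raises the average rotational latency from 2 ms to 3 ms, a cost paid on every random access, which matters most when backing up many small files.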
• Verify that the operating system settings match the optimal speeds supported by the network card
  - Ensure a GigE card is set to full duplex and the proper speed
  - Ensure the entire path, including the switches, supports GigE
  - Use switches instead of hubs where possible
Here are some suggestions for fine tuning your network. Please take a moment to review them.
Most problems with tape devices relate to the firmware, driver, or configuration. Drivers provided with the operating system may not be the appropriate or most current drivers. Checking with the tape library vendor and ensuring that the drivers have been tested and qualified with NetWorker helps resolve most of the driver related issues. Verify that the block size setting is compatible, and change the block size configuration in NetWorker if needed. Device compression helps increase performance in tape devices.
Disk Backups
• Backup to disk targets can reduce the backup windows and speed up the entire backup and restore process
• Hard disks are robust and more reliable than they were years ago
• Disks, being random access devices, offer better performance than tape drives, which are sequential devices
  - Disks can perform multiple simultaneous read and write operations
  - Disks eliminate robotic movements, cartridge loads, seek times and slow transfer rates of tapes and increase backup and recovery speeds
One of the primary reasons disk backups have gained popularity is their performance advantage over traditional tape devices. Although tape devices are the best medium for backups when offsite archiving and disaster recovery are needed, disk backups reduce the backup windows and help meet recovery time objectives much better than tapes. Also, over the years, hard disks have grown more robust and are now comparable to tapes in reliability and cost. Apart from faster reads and writes, disks do not have overheads like robotic movements, cartridge loads, and slow seek times. Disks also give you the ability to read and write to the same volume at the same time, while a tape drive performing a backup cannot be used for restores until the backup is complete.
Disk Backups
• Disk backups help reduce backup windows and free resources for NetWorker client applications during critical business hours
• Cloning and staging features of NetWorker allow backup data to be sent to tape, allowing offsite storage for Disaster Recovery
• Disk backups allow restores from the same device while backups are happening, eliminating potential wait time
The performance advantages of disk backups help shrink the backup windows and free up the resources of the NetWorker client for primary applications. NetWorker's staging and cloning features provide the flexibility needed to move the data from disk backup targets to tape and allow offsite archiving for disaster recovery. When tape devices are identified as bottlenecks and the backup windows are growing unmanageable, it is best to consider a disk backup option to get the backups under control.
Module Summary
Key points covered in this module are:
• NetWorker based optimization features
• Fine tuning of servers, storage, network and target devices
• Disk backup advantages
These are the key points covered in this module. Please take a moment to review them.
Course Summary
Key points covered in this course are:
• Components that impact performance
• Tests to isolate issues
• NetWorker optimization features
• Fine tuning of components
These are the key points covered in this training. Please take a moment to review them. This concludes the training. Please proceed to the course completion slide to take the assessment.