StorageBackup DataCenterMagazine 02 2011 EN
Visit: http://datacentermag.com/newsletter/
DataCENTER
FOR IT PROFESSIONALS
MAGAZINE
Content issue 02/2011

news
6. Storage news

interview
Interview with David Tuhy, General Manager of Intel's Storage Group

storage
24. Building a Flexible and Reliable Storage on Linux
Eduardo Ramos dos Santos Júnior

backup
42. Data backups just don't cut it anymore
Herman van Heerden
In our Storage section we decided to give you a closer look at storage on Linux systems. We also decided to bring up the topic of virtual storage security, as storage virtualization is a very important data center area nowadays.
You can also find more information about data center and network security, and for those who are interested in Cloud Computing, we also have something in this issue.
Magdalena Mojska
Editor in Chief
http://datacentermag.com
Data Center Magazine team

Editor in Chief: Magdalena Mojska, [email protected]
DTP, Graphics & Design: Marcin Ziółkowski, Graphics & Design Studio, www.gdstudio.pl, tel.: 509 443 977, [email protected]
Senior Consultant/Publisher: Paweł Marciniak
CEO: Ewa Dudzic, [email protected]
Production Director: Andrzej Kuca, [email protected]
Marketing Director: Magdalena Mojska, [email protected]
Editorial Contributors: Stephen Breen, Dhananjay D. Garg, Mohsen Mostafa Jokar, Mahdi Yelodari, Richard C. Batka, Mulyadi Santosa, BJ Gleason.
Special thanks to: Sean Curry, Stephen Breen, Erik Schipper, John Co, Adriel Torres, Kelley Dawson.
Publisher: Software Press Sp. z o.o. SK, 02-682 Warszawa, ul. Bokserska 1
Phone: 1 917 338 3631
www.datacentermag.com
News
SAN JOSE, Calif., February 22, 2011 – Force10 Networks, Inc., a global technology leader, today announced that Belvedere Trading LLC, a leading proprietary trading firm specializing in equity index and commodity derivatives, has selected and deployed high-performance Force10 S-Series™ top-of-rack (ToR) switches as part of a successful network design aimed at providing ultra-low latency, which better supports the highly transactional nature of their business.
Electronic traders, such as Belvedere, are diligently working to ensure that their network infrastructure can support their massive amount of transactions as well as drive down the latency to nanosecond levels. With each transaction representing potentially millions of dollars, ultra-low latency is critical in high-performance trading environments.
"In current trading environments, delivering reliable and superior performance and ultra-low latency can represent the difference between being profitable and losing opportunities," said Yezdaan Baber, Director of Technology, Belvedere Trading LLC. "Our continued investment in Force10 switches reflects our confidence in their technology mapping seamlessly to the needs of our business."
Based on a recent network performance and power test, the S4810 access switch demonstrated as much as 5% to 70% lower latency than comparable switches in a 10 Gigabit Ethernet (GbE) configuration. The test was conducted by independent analyst and publisher Nick Lippis of The Lippis Report and Ixia, a leader in converged IP network test solutions.
Belvedere selected the newly offered S4810 access switches to be deployed strategically at their network edge and collocation facilities to leverage the performance of 10 Gigabit Ethernet (GbE) for Belvedere's financial derivatives trading practice.

S4810 Purpose-Built for High Frequency Trading-Type Applications
Recognizing the immediate need to ensure line-rate, non-blocking throughput and ultra-low latency throughout its performance-sensitive high-frequency trading (HFT) environment, Belvedere deployed the S4810 switches in geographical proximity to U.S. trading exchanges. Its 1.28 Tbps (full-duplex) non-blocking, cut-through switching fabric delivers line-rate performance under full load with ultra-low latency. The compact S4810 switch design provides 64 10 GbE ports worth of throughput in a combination of 10 GbE and 40 GbE feeds...
FTOS, the modular Force10 operating system software, runs on both the C-Series resilient core switches and the S4810, providing Belvedere with L2 and L3 feature richness, including IGMP multicasting and high availability provided by stacking. FTOS also provides underlying code stability and advanced monitoring and serviceability functions, including real-time network and application traffic monitoring.
"Force10 has a rich heritage of delivering reliable, high-performance switching solutions to environments grappling with huge files or high transaction levels," said Arpit Joshipura, chief marketing officer, Force10 Networks. "Coupling low latency requirements and advanced L3 software features provides a unique value to customers in HFT environments."

About Force10 Networks
Force10 Networks is a global technology leader that data center, service provider and enterprise customers rely on when the network is their business. The company's high performance solutions are designed to deliver new economics by virtualizing and automating Ethernet networks. Force10's high density and resilient Ethernet switching and routing products increase network availability, agility and efficiency while reducing power and cooling costs. Force10 provides 24x7 service and support capabilities to its global customer base in more than 60 countries worldwide. For more information on Force10 Networks, please visit www.force10networks.com.

Force10 Networks, the Force10 Networks logo, Force10, E-Series, Traverse, and TraverseEdge are registered trademarks and ExaScale, S-Series, TeraScale, FTOS, Open Automation, JumpStart, SwitchLink, SmartScripts, and HyperLink are trademarks of Force10 Networks, Inc. All other company names are trademarks of their respective holders.

Contact: Kevin Kimball
Force10 Networks Inc.
+1 408-571-3544
[email protected]
OpenStack
As of February 3rd, Cisco Systems has announced that it has joined the OpenStack community. For those that don't keep up with the emerging utility-based computing platforms, OpenStack is an open-source infrastructure platform that has the ability to offer services similar to AWS EC2 and S3. OpenStack is important to the datacenter community because, as its adoption grows in maturity, it offers the potential of a true utility model, with customers migrating applications and services to the most optimal service for their needs, while still maintaining the ability to transport those applications into private cloud scenarios where required. For developers and software-as-a-service cloud providers, this means they can exercise scaling models not available in traditional virtualization models.
As of today, OpenStack has a list of collaborators and contributors that includes some true heavyweights, including NTT, Dell, Citrix, Extreme Networks, and many others. OpenStack's primary drivers have come from NASA, which contributed Nova, providing compute services, and Swift, developed by Rackspace, providing storage services.
OpenStack has some real competition out there, though, from enterprise software vendors like VMware, with their vCloud Director platform, and CA's acquisition of 3Tera's AppLogic. VMware is the 10,000-pound gorilla in the enterprise virtualization market, and it will be interesting to watch as options like OpenStack evolve. CA's purchase of 3Tera will put the force of the software giant behind a strong existing community of managed service providers running the platform, and could signal the beginning of some competition for the enterprise internal "cloud".
The obvious conclusion about Cisco's interest is that it will be able to contribute to the codebase and help enable advanced networking integration and features, which it has already done in other virtualization products such as its introduction of the Nexus 1000v and the Virtual Security Gateway. A less obvious alternative is that we might see Cisco utilizing the solution in some of its routing platforms, which already use virtualization technologies to provide network-oriented application services. OpenStack will be packaged into Ubuntu 11.04, due out in April.
Read more about OpenStack at: http://www.openstack.org/
Interview
with David Tuhy
Could you briefly introduce yourself to our readers?
David Tuhy, General Manager of Intel's Storage Group.

Could you say a few words about your company?
Intel is a world leader in computing innovation. The company designs and builds the essential technologies that serve as the foundation for the world's computing devices.

Which areas does Intel currently focus on? What can we expect from this company in the upcoming months when it comes to Storage?
Intel's Storage Group is currently focused on solutions for enterprises, small businesses and consumer homes. The Intel® Xeon® storage roadmap provides architectural enhancements that will bring further optimization and performance for storage solutions.

Can you tell us about the latest trends in storage virtualization?
We've noted an increased interest in storage virtualization as CIOs search for more efficient ways to manage the enormous growth of digital data. Companies are using virtualization in storage to aggregate multiple pieces of physical storage, such as hard disk drives and multiple NAS or SAN boxes, into one or more pieces of logical storage. This creates much more dynamic and flexible storage that can service parallel storage business functions (data de-duplication in parallel with IO accesses). […] needs. It also requires processing power and integrated storage features (e.g. de-duplication, faster disaster recovery, optimized thin provisioning, etc.) that enable critical functions without the added cost of purchasing additional physical units when more storage is needed.

Could you briefly describe the difference between your processors and the products of this kind provided by other companies?
Scalable storage based on the Intel Xeon processor architecture integrates features such as XOR RAID functions to deliver faster workloads that use less computing resources and power for RAID 5 writes. Intel Xeon also minimizes downtime by supporting Asynchronous DRAM Refresh (ADR), which operates DIMMs in self-refresh mode to retain cached memory even through a power failure, and Non-Transparent Bridging (NTB) to keep nodes in sync for high availability. Future platforms will also integrate additional storage capabilities, such as integrated Serial Attached SCSI (SAS), to eliminate the need for extra I/O controllers and reduce costs.
Businesses Beware
Application Performance Matters to Your Users!
As many businesses struggle to look forward and work out how to leverage many of the newer technologies, predictions can be made based on current trends. Cloud computing has been hyped, but it gains traction as businesses see the value it brings.

The promise it offers of cost savings, scalability and agility is very attractive to many businesses, even those that plan to innovate or transform their own IT operations. As we evaluate these trends, we should look at the possible future of cloud computing and what IT infrastructure will remain in IT's direct control. As business owners and IT managers determine the requirements for what to push to the cloud, the role of IT will be less focused on managing infrastructure than on ensuring service providers are meeting the agreed-upon level of service. Additionally, IT Service Management frameworks such as ITILv3 or Six Sigma can assist businesses with defining processes for improving service quality and defining service level agreements. IT will also need to manage agreements and track service levels over time. Additionally, IT needs a means to measure the performance of these hybrid cloud applications that are both inside and outside of the control of the business. IT will need to understand what pieces of the application are performing poorly—CDNs, ISPs, etc.

Future Customer Expectations
Customers will have very little tolerance for poor-performing applications, due to steep competition. The shift will continue toward self-service capabilities versus having "face-to-face" customer interactions. Additionally, users' technical knowledge of how to work around problems or find something better will flourish, making companies strive to ensure they maintain high application performance to retain and increase customers.
Furthermore, technology will continue to push itself into a new era where computing is all about the real-user experience. A hypothesis for application technology might look something like:
Super-rich interfaces: 3-D application interfaces, hologram imaging and smart phones that display pictures on any surface.
Smarter devices: easy-to-run operating systems, Internet applications for any business or personal use, quick viewing of streaming media/video, web access everywhere.
Optimized networks and telecommunications: the new 4G vs. 3G, broadband+, vLANs, vWANs and VPNs.
As these newer and more sophisticated ways of computing spread worldwide, customer performance demands will continue to escalate.

Performance Matters
Achieving results that increase the company's revenues, customer satisfaction and brand perception is how to thrive in the business world today. Here are some typical results of successful application performance management:
It becomes clear that application architectures, zones of responsibility and customer expectations must all be considered to manage performance and availability efficiently and effectively. As businesses around the world continue to transform and leverage new technologies, it is necessary to:

• Manage performance globally as seen by the end-user
• Gain visibility into end-to-end application performance
• Align IT to meet business goals
• Optimize mean time to resolution

Before a crisis becomes a catastrophe, ensure your business stays out of the news headlines about application outages and avoids performance issues that could hammer your bottom line, with application performance management that is end-to-end, from the enterprise to the Internet.

Kristen Allmacher is a Product Marketing Manager at Compuware Corporation. She has been working in the IT Service Management industry for more than ten years in the areas of R&D, strategic planning, product management, sales engineering and process improvements. In addition, her side hobby as a graphic designer has been utilized to design many software interfaces that are implemented and used today. Kristen is a Six Sigma Green Belt and is ITIL Foundation v3 certified. Her passion is in working with people, processes and technology.
Be Prepared Against
Accidental File Deletions
Sooner or later, disasters happen to every computer user. They delete
important files, on purpose or accidentally. Or they update a document and
save it, overwriting the original version. The common practice of using an
existing document, spreadsheet, or presentation as a starting point for a new
one often ends in catastrophe when the user forgets to save the changes
under a new file name.
Careers end on such errors. Defining the problem is simple: files that should not be deleted often are, on individual PCs and file servers. The immediate challenge is recovering them. Dealing with the larger, long-term issue of prevention and retention on a corporate-wide scale is more complex.
Products that attempt to recover deleted files have been around as long as personal computers themselves, but they have a history of delivering mixed results. The shortcoming is that these solutions, by definition, are after-the-fact fixes rather than preventative solutions. They cannot resurrect a file that has been overwritten, and they make no attempt to archive the many revisions that a typical document goes through in its lifetime. Stronger measures are needed.
The mathematics of accidental file erasure is alarming. A PC user is likely to spend an average of one hour in a frantic effort to recover the file (or files, or an entire directory) before turning to the help desk. Just two occurrences per day – a conservative estimate for a corporate environment – translates to a minimum annual productivity loss of 520 hours, nearly 14 40-hour weeks. Office colleagues, in their attempts to help, add to lost productivity and are more likely to hurt, not help, any chance of success.
Once the IT department gets involved, costs add up quickly. With an IT technician earning $30 an hour, a single 30-minute venture to locate and restore a file from a backup tape – if it's there at all – costs $15. While $15 does not seem like much, in a corporate environment, repeating that process twice a day costs $7,800 over a full year. Tied up for 260 hours, the total quantifiable cost for user and IT is nearly 21 work weeks, the equivalent of five months from one full-time employee.
Clearly, recovering a deleted file from a local hard drive, or perhaps from a prior day's backup tapes, is an expensive, time-consuming, productivity-robbing process. That's if it can be done at all. And should a crucial file be unrecoverable, the cost to the business itself could be incalculable.
A corporate retention strategy looks at the big picture. What such a plan is likely to miss is the workaday fine details down in the trenches, the often unintentional actions taken by individual rank-and-file workers who spend hours at a computer every day. An average employee deletes or overwrites an individual file by accident, then desperately tries to get it back.
To deal with file erasure at the individual PC level, Windows offers a safety net, the recycle bin. The idea is that deleted files are held there for a while, allowing a user to recover a file before it disappears for good. Though an excellent idea, it has shortcomings that limit its usefulness for the enterprise.
There are many situations where a deletion will bypass the recycle bin, and, in the case of "save overs", the recycle bin does not offer protection. "Save overs" are what happens when you "save" a document over another document. For example, if you make changes to document "A" and then simply hit the Save icon, you have just lost all of the data in your original document.
To gain greater control over users' files, it is increasingly common for IT departments to configure users' PCs to save files to a network directory rather than the computer's local hard drive. Good reasons for doing so are plentiful, yet this scheme can further harm the already small chances of recovering a file. The reason is simple: items deleted from a network drive are permanently deleted; they are not sent to a Windows recycle bin.
There is no shortage of products that tackle various aspects of file recovery. Most of these are after-the-fact solutions that help a desperate worker in an attempt to recover an already deleted file. A more advanced approach to the problem is to prevent the situation from occurring in the first place.
A complete solution for deletion protection must encompass several key aspects. Foremost, it needs to capture any deleted file, regardless of size or location. It must provide automatic version protection, saving multiple copies of Word, Excel and PowerPoint documents as they change throughout their lifetime, and doing so without any user action. It must be network savvy, providing protection for users who save their work to a network directory. From the IT perspective, a proper solution should have the ability to be push-installed over the network to individual systems and allow the entire environment to be managed centrally.
Diskeeper Corporation developed Undelete® 2009 to deal with these circumstances. Undelete replaces the Windows recycle bin with its own more powerful "Recovery Bin" that intercepts all deleted files, regardless of their size, their location, how they were deleted, or who did so.
The Recovery Bin employs a Windows Explorer-like interface for navigation and a search capability. Protection is provided that guards against losses that occur when a changed Word, Excel or PowerPoint document is saved under the same name, overwriting the prior version. Undelete 2009 implements version protection, allowing users to restore these overwritten files. In use, users need only click on the file and select the View Versions option. At that point, version restoration is available.
Preventing users from ever deleting or overwriting files is a nice idea, but one that isn't possible. It happens frequently, causing anguish for the user, aggravation for the IT department, and possible financial and legal harm for the business. The use of software that backs up files or attempts to recover them is logical. Unfortunately, these products are often reactive in nature. They cannot recover files across network directories, nor do they provide protection for existing files overwritten by a newer version. Furthermore, backups are intended for disaster recovery from catastrophic systems failure, not as a method for file-by-file undeletion. Undelete 2009 was built with these challenges in mind.
Data Center World is presented by AFCOM, the premier data center association representing 4,500 of the largest data centers around the world.
To learn more about the association, visit www.afcom.com
Datacenter Dynamics – New York: Designing for Demand: Single Tier to Multi Tier to
Dynamic Tier
In our annual visit to New York, the DatacenterDynamics Conference and Expo series will address the challenges faced by owners and operators
of legacy structures in need of upgrading and new builds required to meet the complex combination of required service delivery, technology
advances, energy efficiency, resilience and security.
To learn more, visit: http://www.datacenterdynamics.com/conferences/2011/new-york
BACKUP AND RECOVERY
Most businesses back up their data so that, in case of a potential loss due to deletion or data corruption, they can recover the original data and protect themselves from major financial losses. Nowadays it has also become a regulatory requirement for businesses to back up their data. Most companies prepare a backup so that they can execute a disaster recovery after an accidental deletion, erasure or hard disk corruption.
Disaster recovery is the term used for a complete plan to get a computer system up and running again after a disaster. It is always important to have a disaster recovery plan for recovering computers, networks and data. Strong backup strategies can be implemented for restoring data after any kind of accident.
a. Full backup – A full backup makes a copy of all data.
b. Incremental backup – An incremental backup makes a copy of all files that were modified after the last backup, whether full or incremental. It has a faster backup because fewer files need to be backed up, but a slower restore because the last full backup and all subsequent incremental backups must be applied.
c. Differential backup – A differential backup makes a copy of all the files that were changed after the last full backup. In other words, a differential backup ignores any intermediate incremental backups and saves every change made since the last full backup. A differential backup is slower to make because a larger number of files needs to be backed up, but it is faster to restore because only the last full and the last differential backup need to be applied.
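None of this is tied to a particular product; as one concrete, hedged illustration of the three backup types, GNU tar can implement them from the command line with its --listed-incremental (-g) snapshot mode. The /backup and /srv/data paths are hypothetical:

$ tar -cf /backup/full.tar -g /backup/data.snar /srv/data
$ cp /backup/data.snar /backup/data.snar.full
# Incremental: reusing the live snapshot file stores only what changed since the previous run
$ tar -cf /backup/incr-1.tar -g /backup/data.snar /srv/data
# Differential-style: restore the post-full snapshot first, so each run
# captures everything changed since the full backup
$ cp /backup/data.snar.full /backup/data.snar
$ tar -cf /backup/diff-1.tar -g /backup/data.snar /srv/data

The first command is the full backup (it also primes the snapshot file), and the copy of the snapshot taken right after it is what lets the later runs behave as incremental or differential backups.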
…CD drives are standard and can store up to 700 MB, and the CD-RW format allows unlimited rewrites to the same CD. But the biggest disadvantage with them is that the maximum storage size isn't enough. Then comes the DVD, which has almost replaced CD-ROMs because this storage medium allows a storage capacity of at least 4.7 GB (single layer) to 8.5 GB (dual layer). The Blu-ray Disc may replace the DVD in the near future because it allows for a storage of 25 GB (single layer) to 50 GB (dual layer), which is approximately six times more storage than on a DVD.
c. Hard Drives – An external hard drive is a fast method of backing up and restoring. The disks in these drives spin at very high speed and file retrieval by the backup program is as quick as it could be. The only problem with them is that they are only good for a small company or for individual users who back up their own data to a backup server. Hard drives in a large network utilize too much precious network bandwidth and they are difficult to handle, especially because of their price.
d. Removable Disks – Over the years, the floppy has quickly become obsolete for backup purposes, as its storage capacity is as small as 1.4 MB, and it has been replaced by USB flash drives, both in terms of size and capacity. Flash drives are much smaller than a floppy disk, weigh less than 30 g and can store as much as 256 GB. Some flash drives allow 1 million write or erase cycles and can easily be used to back up various types of data.
e. Network Attached Storage (NAS) – NAS is basically a combination of hardware and software that is designed to serve only a single purpose, i.e., file sharing. The working of NAS is simple: data sharing software first connects the LAN with the NAS server, and then the NAS server, which is the brain of the NAS system, passes the data from the LAN onto the storage devices. NAS is a simple device and is good for file availability, but the only problem with NAS is that it consumes network bandwidth and thus presents itself as only a partial solution for disaster recovery.
f. Storage Area Networking (SAN) – The only difference between SAN and NAS is that a SAN is an entire network that is dedicated to file storage and not just a single file storage system. A SAN allows quick file access and it greatly reduces the network bandwidth requirement, but there are some things that should be kept in mind before using a SAN, viz.:
– SANs are complex,
– SANs are costly,
– A SAN requires you to purchase all components (hubs and fibre channel switches) from the same vendor for everything to work.

BACKUP STRATEGIES
a. Server based backup – In a server based backup, the backup software is stored on the backup server and the backup storage devices are connected to it. The file server is the place where all the data files are stored. The backup software on the backup server helps with the transfer of the files from the file server to the physical storage device.
b. Client based backup – In a client based backup, each network user is allowed to have a certain amount of control over the backup of their files. In this case the user decides which files to back up and which not to, and to do so a backup configuration software is installed on each user's machine. Just like the server based backup, backup software is installed on the backup server and this software controls the backup on the backup server side.
c. Frozen image backup – It often happens that when a file is backed up by a client or server based backup, access to the file is blocked. The access to the file is not given back to the user unless and until the backup of that file completes. But in a frozen image backup, the backup software creates an image of the data and then backs up that image instead of the actual file. The main disadvantages of this backup are that it uses up storage and network resources, and that during the restore the whole image is restored, not just a single file. Server and client based backups, in contrast, allow you to restore a single file from your backup.
d. SAN based backup – In a SAN based backup, the SAN, file server and backup server are all connected together. The backup server has the backup software installed on it and it initiates the backup process. The backup software initiates the backup process in such a way that the file server sends the data files directly to the SAN, where the physical storage device is located.

Graphic 1.

SUMMARY
It is always important to have a disaster recovery plan for recovering computers, networks and data. The disaster recovery plan should be made such that it suits the needs and budget of the company. A full backup means the backup of all data, an incremental backup means the backup of all files that were modified after the last backup, and a differential backup means a backup of all files that were modified after the last full backup. Most administrators use a combination of two out of the three backup types.
There are numerous hardware technologies available in the market, but before choosing one you should examine factors like the size of your company, the frequency of data modification, the amount of data to be backed up, and most importantly the budget of your company. These factors will help you decide which hardware technology to go for.
After deciding which hardware technology to go with, you can decide which backup strategy should be used by your backup software. Large companies can go with SAN based backups, because these require a significant economic investment. Smaller companies with a single LAN usually go with client based and/or server based backups.

DHANANJAY D. GARG
The author is an information security enthusiast based in India. He loves working on projects related to information security.
• Host Discovery,
• Port Scanning,
• Version Detection,
• OS Detection,
• Scriptable interaction with the target.
Figure 1. Working with Zenmap is easy and it provides a good working environment.
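The capabilities listed above can be exercised together in a single run. As a hedged illustration, the -A option bundles OS detection, version detection, the default scripts and traceroute; scanme.nmap.org is the host the Nmap project provides for test scans:

Usage syntax: nmap -A [target]
# nmap -A scanme.nmap.org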
SYSTOR 2011: The 4th Annual International Systems and Storage Conference
SYSTOR 2011, the 4th Annual International Systems and Storage Conference, promotes computer systems and storage research, and will take
place in Haifa, Israel. SYSTOR fosters close ties between the Israeli and worldwide systems research communities, and brings together academia
and industry.
To learn more, visit: http://www.research.ibm.com/haifa/conferences/systor2011/
Nmap works in two modes: command-line mode and GUI mode. The graphical version of Nmap is known as Zenmap. The official GUI for Nmap versions 2.2 to 4.22 was NmapFE, originally written by Zach Smith. For Nmap 4.50, NmapFE was replaced with Zenmap, a new graphical user interface based on UMIT, developed by Adriano Monteiro Marques.
Nmap has more features than we can cover in this article; we describe only some of the important ones.

Scan a Single Target
To scan a single target, the target can be specified as an IP address or host name.

Usage syntax: nmap [target]
$ nmap 192.168.10.1
Starting Nmap 5.00 ( http://nmap.org ) at 2009-08-07 19:38 CDT
Interesting ports on 192.168.10.1:
Not shown: 997 filtered ports
PORT STATE SERVICE
20/tcp closed ftp-data
21/tcp closed ftp
80/tcp open http
Nmap done: 1 IP address (1 host up) scanned in 7.21 seconds

In the above example, PORT shows the port number and protocol, STATE shows the state of the port, and SERVICE shows the type of service associated with the port.

Scan Multiple Targets
You can scan multiple targets with the following syntax:

Usage syntax: nmap [target1 target2 etc]
$ nmap 192.168.10.1 192.168.10.100 192.168.10.101

Scan a Range of IP Addresses
A range of IP addresses can be used for target specification, as in the example below.

Usage syntax: nmap [range of IP addresses]
$ nmap 192.168.10.1-100

Scan an Entire Subnet
Nmap can be used to scan an entire subnet using CIDR notation.

Usage syntax: nmap [network/CIDR]
$ nmap 192.168.10.1/24

You can also create a text file containing your targets and give this file to Nmap to scan; see the example below:

Usage syntax: nmap -iL [list.txt]
$ nmap -iL list.txt

Exclude Targets from a Scan
To exclude a target from a scan, you can use this syntax:

Usage syntax: nmap [targets] --exclude [target(s)]
$ nmap 192.168.10.0/24 --exclude 192.168.10.100

Scan an IPv6 Target
In addition to IPv4, Nmap can scan IPv6 targets; the -6 parameter is used to perform an IPv6 scan.

Usage syntax: nmap -6 [target]
# nmap -6 fe80::29aa:9db9:4164:d80e

A summary of the host discovery options is given in Table 1 below for quick reference. In this article we refrain from explaining the details and only show the general form of use.

Table 1.
Feature Option
Don't Ping -PN
Perform a Ping Only Scan -sP
TCP SYN Ping -PS
TCP ACK Ping -PA
UDP Ping -PU
SCTP INIT Ping -PY
ICMP Echo Ping -PE
ICMP Timestamp Ping -PP
ICMP Address Mask Ping -PM
IP Protocol Ping -PO
ARP Ping -PR
Traceroute --traceroute

Don't Ping

Usage syntax: nmap -PN [target]
$ nmap -PN 10.10.5.11

The other discovery features are used similarly.

Advanced Scanning Options
Nmap supports a number of user-selectable scan types. By default, Nmap will perform a basic TCP scan on each target system. In some situations, it may be necessary to perform more complex TCP (or even UDP) scans to find uncommon services or to evade a firewall. Table 2 below lists the options needed to perform advanced scanning. As before, we only show the general form of each scan and explain the options that require special settings; advanced scans are used like the other scans, and the example below shows how to use these options against a target.

Note: you must be logged in with root/administrator privileges (or use the sudo command) to execute many of these scans.

Table 2.
Feature Option
TCP SYN Scan -sS
TCP Connect Scan -sT
UDP Scan -sU
TCP NULL Scan -sN
TCP FIN Scan -sF
Xmas Scan -sX
TCP ACK Scan -sA
Custom TCP Scan --scanflags
IP Protocol Scan -sO
Send Raw Ethernet Packets --send-eth
Send IP Packets --send-ip

TCP SYN Scan
To perform a TCP SYN scan, use the -sS option.

Usage syntax: nmap -sS [target]
# nmap -sS 10.10.1.48

The other options are used in the same form; only some of them require special settings.

Custom TCP Scan
The --scanflags option is used to perform a custom TCP scan.

Usage syntax: nmap --scanflags [flag(s)] [target]
# nmap --scanflags SYNURG 10.10.1.127

The --scanflags option allows users to define a custom scan using one or more of the TCP header flags listed in Table 3.

Table 3.
Flag Usage
SYN Synchronize
ACK Acknowledgment
PSH Push
URG Urgent
RST Reset
FIN Finished

Port Scanning Options
There are a total of 131,070 TCP/IP ports (65,535 TCP and 65,535 UDP). Nmap, by default, only scans 1,000 of the most commonly used ports. Table 4 below lists the options you need to perform port scanning; examples follow for the options that require special settings.

Table 4.
Feature Option
Do a quick scan -F
Scanning a specific port -p [port]
Scanning a port through a name -p [name]
Scanning ports by protocol -p U:[UDP ports],T:[TCP ports]
Scan all ports -p "*"
Scan top ports --top-ports [number]
Perform consecutive port scanning -r

Do a quick scan
The -F option instructs Nmap to perform a scan of only the 100 most commonly used ports:

Usage syntax: nmap -F [target]
$ nmap -F 10.10.1.44

Nmap scans the top 1,000 commonly used ports by default; the -F option reduces that number to 100.

Scanning a port through a name
The -p option can be used to scan ports by name:

Usage syntax: nmap -p [port name(s)] [target]
$ nmap -p smtp,http 10.10.1.44

Scanning Ports by Protocol
Specifying a T: or U: prefix with the -p option allows you to search for a specific port and protocol combination:

Usage syntax: nmap -p U:[UDP ports],T:[TCP ports] [target]
# nmap -sU -sT -p U:53,T:25 10.10.1.44

Table 5.
Feature Option
Operating System Detection -O
Trying to guess the unknown operating system --osscan-guess
Service Version Detection -sV
Perform an RPC Scan -sR
Troubleshooting Version Scans --version-trace
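Table 5's Service Version Detection option (-sV) is not demonstrated elsewhere in this article; a minimal, hedged example, reusing one of the sample addresses from above:

Usage syntax: nmap -sV [target]
$ nmap -sV 10.10.1.48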
These options are used like the other options; for example:

Operating System Detection
The -O parameter enables Nmap's operating system detection feature.

Usage syntax: nmap -O [target]
# nmap -O 10.10.1.48

Attempt to Guess an Unknown Operating System
If Nmap is unable to accurately identify the OS, you can force it to guess by using the --osscan-guess option.

The -f option is used to fragment probes into 8-byte packets. In this example 10.10.1.41 is the zombie and 10.10.1.252 is the target system.

To specify the source port manually:

Usage syntax: nmap --source-port [port] [target]
# nmap --source-port 53 scanme.insecure.org

Append Random Data

Usage syntax: nmap --data-length [number] [target]
# nmap --data-length 25 10.10.1.252

In the above example, 25 additional bytes are added to all packets sent to the target.

Randomize Target Scan Order
Send Bad Checksums

Output options:
Save Output to a Text File -oN
Save Output to an XML File -oX
Grepable Output -oG
Output All Supported File Types -oA
133t Output -oS

These features are used like each other, but with the Output All Supported File Types option you do not need to specify extensions. This option is used as follows:

Usage syntax: nmap -oA [filename] [target]
$ nmap -oA scans 10.10.1.1

References:
• http://en.wikipedia.org/wiki/Port_scanner
• http://en.wikipedia.org/wiki/Vulnerability_scanner
• http://nmap.org
• Nmap® Cookbook – The Fat-Free Guide to Network Scanning
Be the first one to get all the latest news!
Subscribe to our newsletter: http://datacentermag.com/newsletter/
Data Centers
– Security First!
Data centers must be secure in order to provide a safe environment for running the enterprise, achieving maximum productivity and protecting profitability, productivity and reputation. What would happen if a data center had an outage or security breach that disrupted operations, access, and services? We expect data centers to deliver timely, secure and trusted information to the organizations that consume it.
Their complex physical and virtual structures face ever-evolving security demands, and large companies need to build secure, dynamic information infrastructures to accelerate their innovations. Administration and access control considerations are the main issues in secure data centers.

Administration – Do the right users have access to the right information? This is the main question that administrators deal with. Various solutions in this area are proposed by different companies. These solutions provide comprehensive identity management, access management, and user compliance auditing capabilities. Security solutions target business agility and reliability, and operational efficiency and productivity, for dynamic infrastructures. Comprehensive security solutions can provide these features:

• …with a wide choice of strong authentication factors supported out of the box;
• Facilitate compliance with privacy and security regulations by leveraging centralized auditing and reporting capabilities;
• Improve productivity and simplify the end-user experience by automating sign-on and using a single password to access all applications;
• Enable comprehensive session management of kiosk or shared workstations to improve security and user productivity;
• Enhance security by reducing poor end-user password behavior.

Reference: IBM Data Center Security
Cisco Live!
Cisco Live is Cisco’s annual marquee event and Cisco Australia and New Zealand’s largest annual forum that will offer three distinct event
programs (Networkers, IT Management and Service Provider) with the perfect mix of high-level visionary insights and deep-dive technical
education.
Learn more: http://www.ciscolive.com
Building a Flexible
and Reliable Storage
on Linux
There are many storage solutions that can meet today's storage needs; the main requirements guiding the choice are the level of reliability, the complexity, the cost and the flexibility of the solution.
In a dynamic world, surrounded by all kinds of technologies, the persistence of information is seen as a critical requirement, since it is on top of it that knowledge and business are modeled. With that, its availability and reliability become important too, since the major asset of modern institutions is information and access to it.
These requirements are met by keeping the information on persistent storage devices, with enough reading and writing performance and the possibility of fast and efficient recovery in case of failures. Technologies like RAID and LVM prove to be useful tools for building a redundant, flexible and reliable storage scenario.

RAID
RAID (Redundant Array of Inexpensive Disks) is a storage mode that allows you to achieve high storage capacity with reliability through redundancy, achieved by using multiple storage devices.
For the purposes of this article, RAID-5, for which the minimum number of storage devices required is three, is the most appropriate, because it provides a balance between:

• I/O performance;
• reliability and redundancy;
• recovery capacity in case of failures.

There are different types of RAID, each of which stores the information in a different way and has different modes of keeping and reading/writing the data.
The performance improvement in reading and writing operations is achieved because these procedures are distributed among the members of the array of storage devices.
Reliability and redundancy are achieved through the use of parity bits (see Figure 1), which are stored in a distributed way among all the devices of the array.
The parity is responsible for the data fault tolerance. These bits ensure that, in case of failure of a single disk, the information isn't lost. The information is lost only in the case where more than one disk fails.
The possibility of data recovery in the case of a disk failure is related to the fact that, without one array member, the information that was stored on it can be rebuilt by reading all sectors in the same regions of the remaining disks together with the parity bits. With that, the contents of the original sector can be computed based on this information.

Figure 1.
Figure 2.
Take Figure 2 as an example. Suppose the C drive fails. No information is lost; however, the data which was stored on the failed drive must be computed on the fly. This is the reason the performance is degraded when a drive fails.
Considering the first sector, the data in A1 and B1 will be compared with Dp. This is a logical comparison called XOR (exclusive OR) on the binary values of the data block. The attraction of this type of comparison is that if a value is missing (a drive fails), it can be recovered by doing an XOR comparison of the remaining values.
Suppose that A1=0110, B1=1010 and C1=1100, and consider Figure 3, the truth table of XOR. The Dp value, the parity block for sector 1, is calculated as follows:

Dp == D1 == (A1 XOR B1) XOR C1 ==
== (0110 XOR 1010) XOR 1100 == 1100 XOR 1100 == 0000

When the C drive fails, its value can be recovered as follows:

C1 = (A1 XOR B1) XOR Dp == (0110 XOR 1010) XOR 0000 ==
== 1100 XOR 0000 == 1100
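The parity arithmetic above can be verified from any Bash shell. This is only a checking aid, not part of the RAID tooling; it assumes bash (for the 2#… binary literals) and the bc calculator are available:

$ echo "obase=2; $(( 2#0110 ^ 2#1010 ^ 2#1100 ))" | bc
0
$ echo "obase=2; $(( 2#0110 ^ 2#1010 ^ 2#0000 ))" | bc
1100

The first command reproduces the parity block Dp, the second reproduces the recovered contents of C1.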
This procedure is used both for reading the information when a drive fails and for the recovery process, when a replacement is allocated.
It's noteworthy that, when using RAID-5 to build a block device, there is a loss of storage capacity: 1/n of the total capacity is reserved to store parity bits, with 'n' being the total number of disks.

Creating an MD – Multiple Devices
There is the possibility to build a block device in RAID-5 by hardware or by software. Here it will be done by software, using the mdadm tool [1], assuming a scenario with 3 SATA hard disks of 2 TB capacity (/dev/sda, /dev/sdb and /dev/sdc). If done by hardware, the performance would be better, but the server motherboard would need to have support for this kind of operation.
To build the MD block device it is necessary to first create a partition of the type "Linux raid autodetect", type "fd", on the disks involved in the scheme, which can be done using the fdisk command [2].
The creation of the RAID-5 device with mdadm can be done using the following command:

# mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1
/dev/sdb1 /dev/sdc1

That way, the block device /dev/md0 will be created in RAID-5 using the 3 devices, with the equivalent of one device reserved for parity bits which, as said earlier, won't be stored on only one drive, but spread across all member drives of the array.
To check the status of the newly created MD, one can use mdstat in the proc filesystem:

# cat /proc/mdstat

Or even mdadm with the option --detail, passing as parameter the path of the block device (/dev/md0):

# mdadm --detail /dev/md0

If the RAID-5 setup succeeded, it should be active, with clean status and 3 associated devices.

Simulating a fault
It's possible to simulate a failure in one of the drives, in order to confirm that the information is still available even after a failure. This scenario can be obtained through hardware, turning off the server and unplugging the disk, or through software, using mdadm:

# mdadm /dev/md0 -f /dev/sda1

Next, just use one of the commands mentioned before to check the status of the block device; it should now show a fault indication on one of the devices and the array status changed to clean, degraded.

State : clean, degraded

After that fault simulation, the array must be adjusted, which goes from removing the failed disk:

# mdadm /dev/md0 -r /dev/sda1

to adding its replacement:

# mdadm /dev/md0 -a /dev/sda1

With that, it is necessary to wait for the array reconstruction, since its status at that moment indicates clean, still degraded, but recovering, distributing the parity bits again among the array members.

State : clean, degraded, recovering
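Although the article does not cover it, it is common practice to record the newly created array in mdadm's configuration file so that it is assembled automatically at boot, and to let mdadm watch it for failures. A hedged sketch — the file may be /etc/mdadm.conf or /etc/mdadm/mdadm.conf depending on the distribution, and the e-mail address is only a placeholder:

# mdadm --detail --scan >> /etc/mdadm.conf
# mdadm --monitor --scan --daemonise --mail [email protected]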
LVM
LVM (Logical Volume Management) is a way to manage a server's storage devices, allowing you to create, remove and resize partitions according to the needs of the applications. Logically it is organized as shown in Figure 4.
Volume Groups consist of Physical Volumes, where the latter can be partitions, disks or files. And it is in the Volume Groups that the Logical Volumes that will be used by the system and/or applications are created. They necessarily need to be mounted on the system to be available.
There are several situations where LVM is useful.
This flexibility in operations, which would be hard to achieve if dealing with the physical devices directly, is the great advantage of LVM.

Creating Physical Volumes and Volume Groups
With the block device built earlier using the disk array, it is now possible to use it as a physical volume. For that, there is the lvm tool [3], and the creation can be done with the following command:

# pvcreate /dev/md0

To check the attributes of the physical volume, just use pvdisplay, which will show some information about it, like the device name, the physical volume size, the space allocated and the space free.
Now, as represented in Figure 4, one can create a Volume Group:

# vgcreate volume-files /dev/md0

The success of the operation above can be checked through the output of the command below:

# vgscan
Reading all physical volumes. This may take a while...
Found volume group "volume-files" using metadata type lvm2

Now, with the Volume Group created, it is possible to create and work with Logical Volumes. If you want to create a storage pool to keep the e-mail files of the organization:

# lvcreate -n mailserver --size 800g volume-files

After that, a logical volume with 800 GB capacity has been created; it is ready to be formatted with some file system format that is available [4], and it can be made available by mounting it [5].

Extending a Logical Volume
Depending on the evolution of the use of the storage server by the applications spread across the local network, it's possible to adjust the partition sizes (logical volumes) according to the needs of the context in question.
Besides the common questions taken into account when choosing the file system to use (performance, space allocated, read latency, write latency, latency of creating and removing large and small files, etc.), which must be guided by the type of application that will use it, attention must also be paid to the possibility of extending the file system when using LVM, checking whether the file system supports this operation.
A Logical Volume can have its size reduced so another can have its size extended, or, if there is still some space left in the physical volume, the logical volume can simply extend its size. For that, it is necessary to initially unmount the file system.
Then the expansion operation must be done, passing as parameter how much space you want to add to the logical volume in question:

# lvextend -L +100g /dev/volume-files/mailserver

A check of the status of the logical volume (lvdisplay) is advised, and so is a check of the file system consistency after the resize.
For the case in which reiserfs was chosen as the file system for the logical volume in question, because it supports expansion, to perform the check and resizing of the file system:
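The commands themselves are missing from this excerpt; a hedged sketch of what this step typically looks like for the volume created above, assuming the reiserfs tools (reiserfsprogs) are installed and the volume is still unmounted — resize_reiserfs without a size argument grows the file system to fill the logical volume:

# reiserfsck --check /dev/volume-files/mailserver
# resize_reiserfs /dev/volume-files/mailserver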
Website:
• [1] – http://linux.die.net/man/8/mdadm
• [2] – http://linux.die.net/man/8/fdisk
• [3] – http://linux.die.net/man/8/lvm
• [4] – http://linux.die.net/man/8/mkfs
• [5] – http://linux.die.net/man/8/mount
• [6] – http://unixhelp.ed.ac.uk/CGI/man-cgi?df
Figure 4.

Author
Eduardo Ramos dos Santos Júnior
Understanding
Linux Filesystem
for Better Storage Management
If you had a large warehouse, how would you organize it? Would you
want everything inside it to be scattered around? Or would you have
an arrangement pattern in mind? If so, what kind of arrangement
pattern? Would you group things based on size or similar? Of course,
the way a room is organized will determine the room’s first and final
impression to clients or customers for example. The benefit to having
an organized space is that it becomes much easier to locate items.
The same idea goes for storage, or to be precise, physical discs. What I am referring to here are the regular IDE/ATA/SATA hard disk types, SSD, flash and so on. (A note about SSD or flash based discs: most filesystems are designed with the idea of rotating discs in mind and consequently do not take their limited rewrite capacity into consideration.)
Today, we are seeing disks with bigger capacities and faster data transfer rates; at the same time, we are seeing smaller dimensions and lighter disks (usually). Most importantly, disks are becoming less expensive, which we are always thankful for! The question becomes, however: how do we manage our storage effectively?
Here is where the file system comes in. A File System (filesystem) is a layout structure that, through its properties, tells us how the data will be stored on a disk. It utilizes a "driver" of sorts that performs management tasks.
One of the properties usually assigned is the Linux access permissions, which tell us, for example, which users may access certain data and which users cannot.
What choices do we have for filesystem types today? With any major Linux distribution serving as a reference, here are some of the options given to us:

ext2 / ext3
ext4
reiserfs (version 3 and 4)
XFS
btrfs

Basic understanding
To really understand filesystems, we need to define a few terms.

What is a super-block?
A superblock is the meta-data describing the properties of an entire file system. Some of the properties you can find here are:

block size
total inodes
total blocks
number of free blocks
number of free inodes
last mount and last write time

To get this information, you can use tools like dumpe2fs (for ext2/3/4) or debugreiserfs (for reiserfs). For example:

$ sudo dumpe2fs /dev/sda1
...
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 6094848
Block count: 12177262
Reserved block count: 608863
Free blocks: 2790554
Free inodes: 6090656
First block: 0
Block size: 4096
Fragment size: 4096
...

A superblock in one part of a filesystem has multiple copies of itself propagated throughout the entire filesystem. This is intended as a backup in case file corruption occurs. In the case of ext2 or ext3, you can use dumpe2fs to find their locations:

$ sudo dumpe2fs /dev/sda2 | grep -i superblock
dumpe2fs 1.39 (29-May-2006)
This is helpful if your filesystem refuses to mount, for example. In this situation, the fsck utility would at least restore the main superblock, thus making the whole filesystem somewhat readable. Of course, expect a few missing pieces of data here and there.
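For instance, e2fsck can be pointed at one of the backup copies when the primary superblock is damaged. A hedged example — 32768 is the usual location of the first backup superblock on an ext2/ext3 filesystem with 4 KiB blocks, but you should use the locations reported by dumpe2fs for your own device:

# e2fsck -b 32768 /dev/sda2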
What is an inode?
Similar to the super-block describing the whole filesystem, an inode describes a file's properties (and this includes directory information as well, as a directory is just another form of a file). You can find out partially what's inside an inode by running the "stat" command:
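The stat output itself is not reproduced in this excerpt; as a hedged illustration, the -c option prints selected inode fields (all of these format sequences are documented in "man stat"):

$ stat -c "inode: %i  links: %h  size: %s bytes  blocks: %b  perms: %a" /bin/ls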
What is a block?
A block is the smallest storage unit of a file system. You can think of it like the smallest brick that makes up a large building. Don't confuse it with the sector, which is the smallest storage unit of a physical device (e.g. your hard drive). This means that, physically, your disc addresses the data in terms of the sector size, while the filesystem uses the block size.
Today, most storage utilizes a 512-byte logical sector size, while file systems usually use a 4 kB (kilobyte) block size. What do we mean when we say "logical" sector size? Physically, the sector size is 4 kilobytes. It used to be 512 bytes, but such a size proved inadequate to reach the larger and larger capacities that we are seeing today. Recall that our definition of a filesystem included a driver to perform management tasks. This internal controller performs a 512-byte-to-4-kilobyte conversion, so it is therefore transparent to the block device driver, which still might utilize a 512-byte unit size, for example.
Storage devices, as we mentioned, have started to adopt this new 4-kilobyte sector size and operating systems have started adapting in turn [2]. There are still problems, however, due to issues with addressing alignment. At the time of this writing, kernel developers are still working on addressing this issue. In the near future, we should be able to fully utilize the 4-kilobyte-sector devices to their fullest capabilities.
Block size indirectly affects the percentage of storage utilization. Why? Because the bigger the block size is, the bigger the chance is that you will introduce partially occupied space. If you store 5 KiB of data on a file system that uses 4 KiB blocks, then it will occupy 2 blocks. You can't cut the 2nd block in half: it's all or nothing. This wasted area is called slack space. So, theoretically, if the average of your file sizes is 3 KiB while the block size is 4 KiB, then roughly 25% of your disk space is wasted.
Again, the stat command is useful to find out how many blocks a file will occupy. For example:

$ stat -c "%b %B" /bin/ls
208 512

This tells us that /bin/ls takes 208 blocks. If we multiply that by 512 bytes (the block unit size reported by stat), we get 106496 bytes = 104 kilobytes, while in fact the file size is 95116 bytes (equal to 93 kilobytes). Please read "man stat" for further information on the valid format sequences you can use to get various information regarding a file or filesystem.
Fortunately, a technique called "tail packing" was recently introduced to overcome this situation. The kernel can insert data of another file (or files) into these slack spaces. In other words, a single block could host the content of more than one file. A few filesystems already support this: btrfs and reiserfs (versions 3 and 4). Touching on the BSD systems, the UFS2 filesystem also does tail packing.
A metadata change is always logged before being committed to disc. As for the data, it could be logged or not logged at all. The benefit of logging both is that reliability is increased; the consequence is that throughput is reduced due to the extra I/O operations to the journal (data is written twice: to the log and to the disc blocks). Journalling metadata only cuts the I/O fairly significantly, but in the case of a disc catastrophe, you might end up with garbage or lost data.
The compromise is to do ordering, like the ext3 ordered method does. Quoting from the mount manual page: "ordered: This is the default mode. All data is forced directly out to the main file system prior to its metadata being committed to the journal."
Thus logged metadata in a journal acts like a barrier. If you see the metadata update in the log, then we are assured that the data is already written. If you don't see one, then either nothing happened or the update does not reflect the current metadata state. All in all, at least the file is still consistent.
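The journalling behaviours discussed above correspond to ext3's data= mount options; a hedged example, with a placeholder device and mount point:

# mount -o data=journal /dev/sda3 /mnt/test
# mount -o data=ordered /dev/sda3 /mnt/test
# mount -o data=writeback /dev/sda3 /mnt/test

data=journal logs both data and metadata, data=ordered is the default described in the mount manual page quote above, and data=writeback journals metadata only.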
hda (before the swap). In this case, every operation that points to it must also be adjusted. Truly a headache for large system management.
To solve that, label the filesystems and point to the labels instead. First, name the filesystem on hda1 "/data" and refer to it in /etc/fstab:

LABEL=/data /data ext2 defaults 0 0
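The label itself is normally set beforehand with a filesystem-specific tool such as e2label (for ext2/ext3); a minimal sketch using the same device and label as above:

$ sudo e2label /dev/hda1 /data
$ sudo e2label /dev/hda1
/data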
Now try to reattach the disc to another channel. Linux can still mount it, because all it cares about now is that there is a partition labelled "/data".
For a UUID, you simply grab it from a filesystem-specific tool such as dumpe2fs:

$ sudo dumpe2fs /dev/sda7 | grep -i uuid
dumpe2fs 1.39 (29-May-2006)
Filesystem UUID: 38bf953a-3e12-429b-867a-833966567793
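If it is installed, blkid (part of util-linux) is presumably a quicker way to obtain the same identifier; the TYPE value below is just an assumption for this example device:

$ sudo blkid /dev/sda7
/dev/sda7: UUID="38bf953a-3e12-429b-867a-833966567793" TYPE="ext3"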
Below is a snippet of grub.cfg that refers to a root filesystem by using a UUID:

linux /boot/vmlinuz-2.6.32-28-generic root=UUID=38bf953a-3e12-429b-867a-833966567793 ro quiet splash
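The same UUID can also be used in /etc/fstab in place of a label; a sketch (the filesystem type and options here are illustrative):

UUID=38bf953a-3e12-429b-867a-833966567793  /data  ext3  defaults  0 2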
Discussing various filesystems
Now let's discuss some characteristics of different filesystems:

• ext2 (second extended file system)
• ext4

Unlike ext3, which remains backward compatible with ext2, ext4 is not compatible with ext3/2 due to several new internal designs. This means that ext4 cannot be mounted as ext3. On the other hand, ext2 and ext3 filesystems can be mounted as ext4 (http://en.wikipedia.org/wiki/Ex4).
The most notable features of ext4 are:
By using extents, ext4 will allocate a fair amount of physically contiguous blocks. The old method, block mapping, does that block by block. When dealing with large files, extents are superior, since the blocks are already contiguous and writing can be done in one sweep. The block-mapping information kept in the file's metadata is also simpler, since one extent represents more than one block. Where it used to take 128 records to represent the whole file mapping, it might now be reduced to 2 (two) or 4 (four) records only.
Persistent pre-allocation is a way to reserve blocks on disc before data is actually stored. This was once done by writing many zeros to the disc blocks. Now it is fully managed by the Linux kernel, and this behaviour cannot normally be changed by a user. In ext4, via the fallocate() system call, a developer can decide how many blocks to reserve at the moment they are really wanted. In general, pre-allocation helps a user get contiguous blocks, which means faster data access. Extents and pre-allocation are good allies here.
Delayed allocation works by delaying write access until the very last moment possible. By doing so, merge operations are expected to happen more often. The final effect is fewer physical accesses and reduced head seeks.
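From user space, persistent pre-allocation can be exercised with the fallocate utility, which wraps the same idea; the file name below is hypothetical:

$ fallocate -l 1G /data/db/prealloc.dat
$ stat -c "%s %b" /data/db/prealloc.dat
1073741824 2097152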
It’s a moderate choice if you still need an access time record • Whenever possible, use an extra disc and put the swap
but don’t want to be burdened with too many inode updates. partition on it. By doing this, paging and file I/O operations
You might need this field for occassions where a mail client can happen concurrently. Of course, paging will still slow
relies on access time to determine whether new e-mail has ar- down system perfomance, but this way, paging and I/O op-
rived or not. erations will go into their own data channels.
Fortunately, the latest distro versions already use this flag
when mounting all of the filesystem (except swap of course). Here is the filesystem suggestion for a webserver:
Prepare a separate partition for /var/www as this is the lo-
• nosuid: suid (executables marked to be running on behalf cation most distros choose to place their web content. Mark
of its owner) effectively disabled. A quick way to lessen it as noexec if you are absolutely sure you don’t need to have
system compromise due to improper permission assign- anything other than static HTML in there. For scripts like PHP,
ment I suggest creating another partition and setting it as nosuid so
that no scripts are accidentically run as root.
As we all know, a root-owned suid binary means it runs as root A filesystem such as ext3 is a good choice here. It’s a com-
without the need for the usual privilege escalation first (such promise between simplicity and data safety.
as sudo or su). As an example, the passwd binary has to allow
some administrative delegation to let a user to change his/her • Database server:
own password.
However, if the binary is not properly audited and has a se- Choose a journalling filesystem whenever possible. Specifically,
curity hole (such as a buffer overflow), a well-crafted attack will pick one that uses delayed write (such as XFS) and/or extent
“promote” one as root very easily. based (such as ext4) operations. Delayed write, aside from the
benefits listed before, would route subsequent read operations
• noexec: forbid program execution. It’s probably a good during a high hit rate to the page cache. My personal experi-
idea to put this flag in a data-only partition, such as the ence shows that recently written data is also likely data that is
one that acts as a cache area for the caching proxy appli- soon to be read. Therefore, holding them for awhile in memory
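Putting those flags together, an /etc/fstab sketch for a cache partition and a web-content partition might look like this (the devices and mount points are examples only):

/dev/sdb1  /var/cache/proxy  ext3  defaults,noatime,nosuid,noexec  0 2
/dev/sdb2  /var/www          ext3  defaults,noatime,nosuid         0 2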
• Whenever possible, use an extra disc and put the swap partition on it. By doing this, paging and file I/O operations can happen concurrently. Of course, paging will still slow down system performance, but this way paging and I/O operations will go into their own data channels.

• Web server
Prepare a separate partition for /var/www, as this is the location most distros choose to place their web content. Mark it as noexec if you are absolutely sure you don't need to have anything other than static HTML in there. For scripts like PHP, I suggest creating another partition and setting it as nosuid so that no scripts are accidentally run as root. A filesystem such as ext3 is a good choice here. It's a compromise between simplicity and data safety.

• Database server
Choose a journalling filesystem whenever possible. Specifically, pick one that uses delayed write (such as XFS) and/or extent-based (such as ext4) operations. Delayed write, aside from the benefits listed before, would route subsequent read operations during a high hit rate to the page cache. My personal experience shows that recently written data is also likely data that is soon to be read, so holding it for a while in the memory cache is a good idea. As for extent-based filesystems, they might help reduce fragmentation. Tables that hold big entries will really benefit from extent-based filesystems. Using both delayed-write and extent-based filesystems would enhance read and write capabilities.
Note that some databases, such as Oracle, perform direct I/O. This means any virtual file system and filesystem-specific functions are bypassed and data goes directly to the underlying block device. In this case, no matter which filesystem you choose, the performance will be fairly much the same. In this scenario, using a striping RAID setup will enhance speed.
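As a sketch of that advice, a dedicated database volume could be created and mounted roughly like this (the device and data directory are assumptions, not a recommendation for any particular product):

$ sudo mkfs.xfs /dev/sdc1
$ sudo mount -o noatime /dev/sdc1 /var/lib/mysql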
• Web proxy/cache
What I am referring to here is a machine that does any kind of caching, be it http proxying, ftp proxying and so on. This also applies to forward or reverse proxies.
One important characteristic of these workloads is that the content is safely thrashable at any time. To elaborate, you don't really need any kind of data safeguard, since you are simply dealing with a cache.
In this situation, what you need to focus on is I/O throughput, while at the same time lowering I/O latency. Reiserfs was once a good choice. When dealing with small-to-medium sized files (roughly under 1 MiB) it was quite good. Scenarios where files are largely distributed across a stack of directories are the area where reiserfs shines. (It's too bad reiserfs is no longer really under development these days.)
As an alternative, though, you can try and test XFS. Although it's still in the journalling filesystem category, its capability to handle massive I/O operations is impressive.

• File server
A setup that falls under this category is anything that serves files using any kind of protocol: ftp, rsync, torrent, http, revision control (for example git, subversion, mercurial, CVS), SMB/CIFS and so on. Usually file servers are used as private servers, but they could also be public servers for the entire Internet, for example.
Whatever the case may be, be prepared for non-stop and massive concurrent read access or, as is the case for revision control systems, massive write or upload access. You would also require a filesystem that can help you perform backups.
To meet all these needs, your choice might be the ZFS or Btrfs filesystems. ZFS is a filesystem ported from Solaris which has proven to be reliable. It has a nice feature called deduplication (or "dedup" for short). Dedup-ing means blocks which have exactly the same content are merged into one. As a best-case scenario, suppose you have a file "B" which is an exact copy of file "A". With ZFS, both are stored as one file and two separate metadata files are created pointing to the occupied blocks. This equals a 50% saving on storage space!
Dedup is appealing because these days we are looking at
more and more data which is actually duplicated over and over
in some places.
Besides dedup, you can consider on-the-fly compression. Btrfs provides this feature, along with snapshotting, online defragmentation and many others. Think of it as gzipping every file in the filesystem whenever you finish writing it and gunzipping it before reading; now it's the filesystem that does this for you. The compression methods currently supported are lzo and zlib. Zlib is the one used by gzip.
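As a rough illustration of both features (the pool, dataset and device names below are made up), deduplication and compression are typically switched on like this:

$ sudo zfs set dedup=on tank/files
$ sudo mount -o compress=lzo /dev/sdd1 /srv/files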
Credits:
The author owes a lot to Greg Freemyer for his thorough review and feed-
back. This article would not have become any better had it not been for his
voluntary assistance. Thanks Greg!
Reference:
• http://www.unix.com/tips-tutorials/20526-mtime-ctime-atime.html
• http://lwn.net/Articles/322777/
• Ext2: http://en.wikipedia.org/wiki/Ext2
• Getting to know the Solaris file system, Part 1, Sunworld, May 1999: http://www.solarisinternals.com/si/reading/sunworldonline/swol-05-1999/swol-05-filesystem.html
Companies are also starting to see storage differently. Rather than simply being a place to copy files, storage has become the cornerstone of effective high availability and disaster recovery.
"All manner of companies are now moving towards centralised storage. Without centralised storage, you just don't get high availability or effective disaster recovery," says Herman van Heerden.
Van Heerden says there are other factors that contribute to the increasing focus and spend on storage – the recession being one of them.
"Focusing limited IT spend on storage just seems to make good business sense. Compared with planning for new servers to carry additional load, investing in NAS/SAN units, which simplify storage use, allocation and management, is a much more cost-effective way of keeping existing business servers running and allowing them to grow.
"Purchasing storage also gives companies a greater sense of value for money because it's physical hardware. Companies can see the value in something that's tangible. Software, on the other hand, in the face of shrinking budgets, now has a question mark over whether or not it will really make a difference if it is implemented.
"Furthermore, with the advances in technology, even smaller businesses can have enterprise-size storage. The development of fast SATA and cheaper SAS technologies and the rapid growth of disk sizes allows for super fast and very cost-effective storage units."
Another catalyst for increased storage spending, according to van Heerden, is server virtualisation.
"Virtual environments are simply not viable without a sound networked storage solution to allocate space for services across multiple virtual machines. As the adoption of virtualisation technologies is widening, so the need for network storage solutions is growing," he says.
He classifies network storage units into two categories: enterprise storage units that allow for high throughput, making them ideal for server virtualisation or carrying the heavy load of large databases or mail boxes, and desktop storage units, which run a little slower and are suited to data backup.
Typically, the higher the price tag, the more redundancy is built into the unit, allowing it to cope with hardware failures, network failures and power outages. He points out, though, that with smart planning companies can overcome the limitations of smaller, non-redundant storage units.
"Desktop storage units can be, and are, used in the enterprise. With smart planning, for instance by adding more network cards, isolating network traffic to a single switch, and adding a number of units in an array, companies get the effect of enterprise storage but at a lower cost," says van Heerden.
"What you need is a smart storage solution that is ideal for adding storage to networks or servers in both physical and virtual environments," he concludes.

Herman van Heerden
Herman van Heerden is from NEWORDER INDUSTRIES (Pty) Limited, a South Africa-based technology company that specialises in enterprise risk management, virtualisation and storage solutions.
AC power is literally the life blood of any data center. Bad power, such as sags, spikes, noise, and surges, can cause systems to fail. And with no power, all of our servers are just very heavy, expensive chunks of metal. All critical systems should be on uninterruptible power supplies (UPS), and it would probably be a good idea for you to look up the difference between an online and a standby UPS. You should also review voltage, amperes, and watts while you are at it. The need for clean power cannot be over-emphasized. In December 2010, a Toshiba chip fabrication plant experienced a 0.07 second power outage, and it caused a 20% drop in shipments in the first few months of 2011 (http://online.wsj.com/article/SB10001424052748703766704576009071694055878.html).
While I could just rattle off a bunch of rules for data center security and power (install alarms, put all equipment on UPSs, etc.), I thought it would be more interesting to share with you some war stories from my past 30 years of experience working in, building, and managing data centers in universities, banks, and military installations. No matter how carefully one plans, mistakes, accidents and oversights happen. Names, dates, locations, and other identifying information have been changed to protect the innocent, the guilty, and the oblivious.

Experience is a tough teacher: it gives the test first and the lesson afterwards. – Vernon Law

Scenario: I sat in the newly finished data center, with the racks of PC servers and Sun Blades performing their modeling and simulation tasks. For a variety of reasons, we had just finished moving the data center about 150 feet to the other side of the building. As I sat behind my desk, I suddenly realized how quiet the room was. There was no sound from the servers or the air conditioning unit. And it was pitch dark; not a single LED could be seen in the very dark room. It turned out there was a local-area power outage, but since this facility was running on a very large UPS, with a generator, I was left literally in the dark as to what had gone wrong.

Lessons learned: It was simple. The UPS had been tested a few days earlier, during the final phases of the move, and was left in "bypass mode": handy if the UPS is the cause of the problem, unfortunate if the local power goes out. We would develop a checklist for the testing of the UPS, and would include the final step of "Switch UPS to ONLINE." We also needed to install power failure lights and have flashlights handy. Since I had neither, I had to stumble in the dark to find my way out of the center and get into the UPS room to get things back online. It was only after I got to the locked UPS room that I realized that the keys for it were still in the darkened data center. We would also invest in a glow-in-the-dark keychain.

Good judgment comes from experience, and experience comes from bad judgment. – Barry LePatner

Scenario: Shortly after having the main UPS at a secure data center repaired, a wide-area power outage was scheduled for the installation of a new transformer, from 3AM to 5AM. I decided to keep the facility open and make sure that the UPS would work and that all our systems were properly protected. At 3AM, the power was cut, and with the exception of a wall clock (purposely plugged into a non-protected outlet), all the systems were still online. I moved the clock to a protected outlet and started the process to secure the facility until the morning. When I called the security monitoring center, I was told that because of the wide-area power outage the alarms wouldn't work, so I could not secure the facility and would have to remain there until the power came back on.

Lessons learned: It is not enough to make sure that all your own equipment is protected in a power outage situation; any external or remote services you are using must also be protected by redundant power systems. Also make sure that any communication links are powered as well.

Experience is that marvelous thing that enables you to recognize a mistake when you make it again. – Franklin P. Jones
Scenario: We were having problems with a server randomly rebooting. We tested the UPS, replaced the power supplies, and checked the motherboard and RAM, but every few days the server would simply reboot. One day, when returning from lunch, we saw one of our supervisors exiting the server room. When we walked in, we saw the server booting up. We approached the supervisor and asked him about it. He told us that he had been having problems with his PC, and if we weren't around, he would simply reboot the server to "solve" his problem.

Lessons learned: We installed additional physical barriers between the supervisors and the power button. We normally ran the server racks without the glass doors, so we re-installed them and locked them. We also re-ran the AC power cords to secure them. The keys for the cabinets were then properly controlled, and the supervisor was not given access to them.

You cannot create experience. You must undergo it. – Albert Camus (1913–1960)

Scenario: We had an emergency power off switch (a big red button) on a wall of the data center. When it was designed, there was a nice clear space around the button, but as the data center grew, the space squeeze began. The area around the kill switch became the location for printer paper storage. Late one evening, as paper was being stacked, a box of paper was slid across the stack and came into contact with the emergency power off button. The lights in the facility stayed on, but all the power to the servers went down. The night shift didn't know how to reset the power (it had never happened before), but was directed over the phone to the power room in the basement, which was locked. The data center director had to come in, contact building maintenance, sign out the keys, and get an electrician to come over to reset the UPS and bring the power back on.

Lessons learned: Be careful of big red buttons. More importantly, make sure that your crew is trained on emergency procedures, has access to the required locations, and knows who to contact. Simulated drills should be conducted.

Experience teaches slowly and at the cost of mistakes. – James A. Froude

Scenario: Back in the days of the huge glass CRTs, the manager of the center didn't want to plug the CRTs into the UPS since it would reduce the UPS run time. The center suffered a power outage, and while the servers remained running, we had no way to log into them and do a clean shutdown. We tried to plug the monitors into the UPS, only to find out there were no free outlets on it, and because of the wiring mess, it was impossible to determine what could be unplugged to plug in the monitors. Unplugging random cords would have been like playing "Russian roulette" with the servers.

Lessons learned: The entire system, including monitors, should be on UPSs. Install UPS monitoring software for controlled shutdowns on low battery conditions. Don't use all of the outlets on a UPS; leave some empty for expansion. Label the power cords.

Experience is the name everyone gives to their mistakes. – Oscar Wilde

Scenario: One of the administrators was a smoker, and didn't like having to leave the secured facility to smoke, so he would often just open a security door (it was not alarmed during the day) and stand outside, often walking around the corner of the building for a smoke break. He would prop open the door (since it could not be opened from the outside). One day, while he was out smoking, a construction worker doing some building maintenance needed an electrical outlet, saw the open door, walked in and plugged his high-powered device into a UPS outlet. When he started the equipment, it overloaded the UPS and knocked one of the servers offline.

Lessons learned: You can't prop secured doors open, not even long enough to have a smoke (or two). Aside from the obvious power problems here, it only takes seconds to install keystroke recorders, plug other devices into the network, or do other damage. Keep the doors closed.

Power corrupts. Absolute power is kind of neat. – John Lehman, Secretary of the Navy, 1981–1987

But the best kind of power is nicely conditioned and protected AC power flowing into your servers. So test your UPS, make sure devices are plugged into the proper outlets, make sure your power infrastructure is properly protected, and perform routine maintenance on it. Make sure your people are properly trained on basic security and know how to handle power-related emergencies.
INTRODUCTION
We are going to examine Facebook's cloud computing environment. It's not easy supporting 500+ million friends, so how exactly do they do it? First we need to take a look at the stack that is now embraced by most of the world's largest websites (including Facebook). The term LAMP was first used in 1998 in a German computing magazine in the hope of promoting competition against Microsoft Windows NT.

LAMP
LAMP maps to the following functions:

• Linux (Operating System)
• Apache (Web Server)
• MySQL (Database Server)
• PHP (Programming Language)

When all elements of this stack are working together, you can build truly scalable websites.

INSTALL
LAMP is pervasive and easy to install. In some environments, like Ubuntu or Debian, you can get started with LAMP in as little as a single command line, such as:
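The exact command depends on the distribution; on Ubuntu-style systems it is commonly something like one of the following (package and task names vary by release, so treat these as illustrative):

$ sudo apt-get install lamp-server^
$ sudo tasksel install lamp-server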
CACHE
MemCacheD sort of sneaked into the LAMP stack. Here is where MemCacheD fits into the stack:

Operating System
Webserver
[CACHE: MemCacheD]
Database
Programming Language
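For reference, a memcached instance is typically started with a handful of flags like these (the memory size, port and user are illustrative values):

$ memcached -d -m 1024 -p 11211 -u nobody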
LANGUAGE
These days the P in LAMP (which used to exclusively mean PHP) can mean almost anything: Perl, Python, and even Ruby. Even Apache is not immune from radical reinterpretation. Some are swapping it out for Nginx ("EngineX") or Lighttpd (pronounced "Lightie").

FACT
The new thinking is to use open source components to build scalable and successful websites.

SCRIPTING LANGUAGES
The thing to remember when using scripting languages is that they perform poorly in production environments.
HIPHOP FOR PHP
Facebook's solution to this problem was to go custom and create HipHop for PHP.

FACEBOOK CUSTOM SOLUTION
Facebook created HipHop for PHP. Development took about three-plus years, and it was open sourced in 2010. HipHop for PHP takes:

PHP source code -> transforms it into C++ -> compiles it using g++ -> produces an executable binary.

FACT
Facebook has achieved a 50% reduction across the web tier using HipHop for PHP.

TIP
You should evaluate HipHop for PHP for use in your enterprise.

The folks at Facebook are running the innovation engine at 100% and have implemented two key changes around the LAMP stack. The areas of key innovation are taking place around:

• Databases
• Large data analysis

Data analysis is at the heart of understanding what a friend ("user") is doing on the Facebook website. People are trying to find areas of efficiency wherever they can, even if that means at the database level. Take, for example, the NoSQL movement. These days you have so many databases to choose from, and the recommendation is that you choose the one that best suits your needs.

ARCHITECTING THE WORLD'S LARGEST WEBSITE
Facebook stores all of the user data in MySQL (except the news feed), and Facebook uses thousands of nodes in MySQL clusters. The people whose responsibility it is to manage the database portion of the website don't care that MySQL is a relational database. They don't use joins, complex queries, or pull multiple tables together using views or anything like that. One would believe that the database teams don't care about the relational aspect of the database. This is not true. They care a great deal, but they use the relational database capabilities in a nontraditional way.

FACT
At Facebook the fundamental ideas of relational databases have not gone away. They are just implemented in nonconventional ways.

ARCHITECTURE
When you look at the Facebook database architecture, it consists of the following layers:

1) Web Servers
2) MemCacheD Servers (distributed secondary indexes)
3) Database Servers (MySQL - primary data store)

JOINS
When it comes to joins, they use the webserver to combine the data. The join action takes place at the webserver layer. This is where HipHop for PHP becomes so important. Facebook web server code is CPU intensive because the code runs and does all these important things with the data that would traditionally be handled by the relational database.

HISTORY
These are database issues that people talked about 30 or 40 years ago, except they are now being discussed and implemented at different layers in the stack. If you're using MySQL or NoSQL, you're not escaping the fact that you need to combine all that data together; you'll still need a way to go and look it up quickly.

So if we look at the NoSQL technology stack, there are a number of NoSQL families of databases:

• Document stores
• Column family stores
• Graph databases
• Key-value pair DBs
Facebook runs a hadoop cluster with 2,200+ servers and about 23,000 CPU cores inside it. Facebook is seeing that the amount of data they need to store is growing rapidly.

FACT
Facebook storage requirements grew by 70x between 2009 and 2010.

By the time you read this article, Facebook will be storing over 50 petabytes of uncompressed information, which is more than all the works of mankind combined. All of this is a direct result of increased user activity on Facebook. Currently over 500M people use the site every month and over half use it every day.

DATA ANALYSIS
Facebook has learned a lot about how important effective data analysis is to the running of a large, successful website.

KEY
You need to look at things from a product perspective.

Take into consideration a common problem of corporate communications. What is the most effective email that you could send to users who have not logged on to the site for over a week? In other words, what email type would be the most compelling?

• A: An email that is a combination of text, status updates and other data.
• B: An email that is simple text saying "hello, there are 92 new photos."

The data shows that the 'B' email performs 3x better.

ANALYSIS
Analysis is the key. Data helps the people at Facebook make better product decisions. You can see clear examples of this in the emails that they send, the ranking of the 'news feed' and the addition of the 'like' button. Analysis of large data helps product managers understand how the different features affect how people use the site.

SCRIBE
Facebook uses an open source technology that they also created, called Scribe, which takes the data from tens of thousands of webservers and funnels it into the hadoop warehouse.

PLATINUM CLUSTER
The initial problem they encountered was that too many websites were trying to funnel data into one place, so hadoop tries to break it out into a series of funnels and collect the data over time. The data is then pushed into the platinum hadoop cluster about every 5 to 15 minutes. They also pull in data from MySQL clusters on a daily schedule.
The apache hadoop cluster is at the center of the Facebook architecture and is vital to the business. It's the cluster that, if it went down, would directly affect the business, so Facebook will need to focus redundancy efforts in this area. Currently it is highly maintained and monitored, and everyone takes a lot of time and care considering every query before they run it against the cluster.

FACT
Facebook leadership, product management, and sales (a group of 300 to 400 people) run hadoop and apache hive jobs every single month in an attempt to make better business decisions. Facebook created this level of data access to help people across the organization make better product decisions because the data is readily available.

APACHE HIVE
Apache hive is one of the other technologies that Facebook uses to provide a SQL interface. This technology sits on top of hadoop and provides a way for people to do data analysis. The great thing is that all of these components are open source, things which you can use today!

SUMMARY
The LAMP stack is evolving. You will now see the introduction of a cache layer. The number of choices you have at the database layer is also increasing, and the separation of relational database functions has also occurred. Facebook, like most companies, has decided to implement technologies from a number of different providers to avoid lock-in to any one solution. Data analysis will continue to grow in importance (something global enterprises have known for a very long time, even though they may have failed to act or may not have had enough time to act on that knowledge).
Great tools are now available: consider evaluating apache hadoop, apache hive, and scribe. Use these tools to really understand what's happening at your site. Follow Facebook's lead and evaluate developments in the open source space. Facebook is committed to this and has even created an open source website which can be found at facebook.com/opensource.

RESOURCES
• Facebook Open Source: http://facebook.com/opensource
• The Delta Cloud Project: http://www.deltacloud.org
• GoGrid: http://www.gogrid.com
• Amazon Elastic Compute Cloud (Amazon EC2): http://aws.amazon.com/ec2
• Microsoft Cloud Services: http://www.Microsoft.com/Cloud
• NIST - National Institute of Standards and Technology: http://www.nist.gov/index.html
• Book: The Challenge of the Computer Utility by Douglas Parkhill
• Book: The Mythical Man-Month: Essays on Software Engineering by Fred Brooks
• InterviewTomorrow.Net - Helping America get to work. Free access to the 2011 executive recruiter database

ABOUT THE AUTHOR
Richard C. Batka - Business & Technology Executive. Author. Mr. Batka is based in New York and provides advisory services to a select group of clients. Mr. Batka has worked for global leaders Microsoft, PricewaterhouseCoopers, Symantec, Verizon, Thomson Reuters and JPMorgan Chase. A graduate of New York University with honors, he can be reached at [email protected] or followed on Twitter at http://twitter.com/RichardBatka.
Although most companies recognise the necessity of disaster recovery plans and data restoration procedures, some are still relying on old backup and recovery methods that can affect the prognosis for a full recovery.
"The method, frequency and medium used for data backups can make all the difference between a complete or no recovery from a disaster. Companies with one foot in the dark ages, when it comes to backups, and the other in 'the now', where there are lots of data systems at play and business moves at a frenetic pace, will struggle and take longer to get up and running again in the event of a disaster or major systems failure.
"Disaster recovery is more than just doing backups so that data can be restored should disaster strike. Disaster recovery is also about restoring entire systems quickly to ensure business continuity. The fact is that data backups just don't cut it anymore," says Herman van Heerden.
He says that although incremental backups, a method on which many companies rely, may sound good enough, they're not.
"Companies need to question how long it will take them to restore a system, never mind a whole host of systems, if they only do incremental backups. A few hours might be digestible, but a day is too long, and in reality it can take much longer than that. This is a waste of time, productivity and money," he says.
The frequency of backups is another cause for concern. He says that some companies conduct backups once a day, while others dare to walk on the wild side and only back up once a month or quarterly. Obviously, the greater the gap between backups, the greater the risk of losing bigger amounts of data.
Then there's the medium for backups that companies use. Tape backups were once king and some companies still use them. Inherently, tape backup software only caters for file backups and not full systems, and van Heerden says companies that have actually successfully restored from tape are few and far between. "Companies still using tapes for backups should switch to disk," he advises.
Failure to have off-site backups is another shortfall in some companies' disaster recovery plans. "Having copies of your system on the same site that your systems run on is good for a quick restore if hardware crashes or software malfunctions. However, if an event like a fire or theft takes out your whole business and your backups are destroyed or stolen along with your systems, you're in a bit of trouble. If you're serious about disaster recovery, you should keep off-site copies of data and full systems at a secure data centre. If data is not backed up off-site, it is not backed up.
"It is important to have stringent security measures in place to ensure the protection of your data at the off-site data centre as well. While no data centre will allow just anyone into the server room, you are not exempt from hackers gaining access from outside. Your storage and servers must be secured with your own security systems. A shared firewall is like a public toilet; you can never tell when someone will leave the seat up. So own your own security," he stresses.
For the best prognosis for disaster recovery, van Heerden recommends the following:

• Clones of full systems instead of file backups. Cloning is the new tape. Ideally, companies should clone first, and then incremental data backups, on disk, should be done to complement the system clones.
• Working backups housed on-site as well as off-site in a secured data centre.
• Secure and encrypted electronic communications between on-site and off-site storage. Industrial espionage is real, and not just a thing for late-night B-rate Hollywood movies anymore.
• Data backups of differences (updates) rather than full data replicates over the wire.
• Virtualisation of systems.
• Smart planning of the use of storage systems like SANs and NASs to allow 50% of the capacity for snapshots, with the off-site backups being fed from these as near-real-time snapshots.
• Enough technology resources on standby to enable systems to get up and running again even in the event of a complete infrastructure failure or loss. Hardware for this purpose can be hired.

Herman van Heerden
Herman van Heerden is from NEWORDER INDUSTRIES (Pty) Limited, a South Africa-based technology company that specialises in enterprise risk management, virtualisation and storage solutions.
out of multiple data centers which are quite similar to the ones that support companies' internal resources. Biswas (2011, January 20) also made the point that, like all data centers, they are susceptible to security issues such as data theft, natural disasters, and man-made threats. This is where the trustworthiness of service providers comes in. A GoGrid (2010) white paper made the point that a provider's industry background and how long they have been around are important factors in security (p. 3).
Anne Roach researched virtual storage providers to determine their trustworthiness and to provide a framework for determining the background of service providers. In Roach's paper, she made the point that the ultimate responsibility for virtual storage decisions lies with the person who decides to trust critical information to the cloud, and that it is the virtual storage providers' responsibility to inform end users about their physical storage facilities and to help users make educated decisions about where the data or data backups are stored (p. 1). In her paper, she asked thirty-one (31) virtual storage providers the following eight questions:

1. Do you own or lease your servers?
2. Are your servers redundant?
3. Where are your servers/server clusters located?
4. Which vendors do you use for your servers?
5. Do you use a tape library to back up your servers?
6. What is your total storage capacity?
7. How are your servers monitored?
8. How quickly would a damaged server be replaced? (p. 4)

These questions can determine the trustworthiness of a virtual storage provider. Here are some possible vulnerabilities that were uncovered by Roach's survey.

1. Do you own or lease your servers?
If the virtual storage provider leases their servers and is unable to pay for the lease, or if the lease cannot be renewed, then the data on the servers must be migrated to a different location en masse, which means down-time for the virtual storage user or possible loss. This question is also critical to determine that, if data needs to be removed from the server, no data remains on those servers.

2. Are your servers redundant?
Roach (n.d.) explains that this question was designed to determine if virtual storage providers were duplicating the information. Her findings revealed that some of the providers felt that RAID in a single location compensated for the need to create redundancy, and at least two providers stated that their systems provide for two or more copies of data within their server (Roach, n.d., p. 4).

5. Do you use a tape library to back up your servers?
This question addresses the legal issue, similar to the story of the dentist's office, newspaper office and Five and Dime burning down. Harris (2010) made the argument that most of the time computer documents are considered hearsay and are not normally admissible in court unless there is firsthand evidence that can be used to prove the evidence's accuracy, trustworthiness, and reliability (p. 898). Roach pointed out that only digital linear tape is admissible in a court of law, because of the linear mode of recording data (p. 4). Regardless of the expense, tape libraries also provide a storage format redundancy that allows data to be recovered even in a major data disaster, when the tapes are stored in a different location from the servers (Roach, n.d., p. 4).

6. What is your total storage capacity?
This question is a security factor that addresses availability. A very interesting finding was that many companies refuse to admit that their virtual storage has a limit.

7. How are your servers monitored?
Since humans monitor and maintain the health of a virtual storage system, it is important to determine whether the virtual storage provider offers 24/7 assistance. Roach (n.d.) noted in her study that at least one provider has a redundant-redundancy system to consistently keep information online, which acts as a backup to the provider's backup software system (p. 4).

8. How quickly would a damaged server be replaced?
Roach (n.d.) did not discuss all of the aspects of physical vulnerabilities such as smoke or fire damage, cooling systems, power redundancy and physical security. This question will help decision makers find a reliable service (p. 4).

Roach drew some very enlightening conclusions about what can create a lack of responsibility and a vulnerability for end-users:

1. Some companies claim that they manage and maintain their servers, which are actually located far from the physical location of the end user.
2. Power and data redundancy and physical security elements may not be clearly communicated.
3. Some providers offer guarantees such as refunding the client's money if data is lost. This may be hollow when the company has little involvement in the protection and maintenance of information and is not the actual data guardian (p. 13 – 14).

Readers are encouraged to read Anne Roach's paper, because she provided a synopsis of the thirty-one (31) virtual storage providers she studied.
Simard (as cited in Gruman, 2008, March 13) said that "graphics cards and network cards today are really miniature computers that see everything in all the VMs". In other words, they could be used as spies across all the VMs, letting a single PC spy on multiple networks.

Conclusion
Even though virtualization decreases server costs, companies are realizing that virtualization is simultaneously increasing management and storage costs, and without a plan to protect these environments they may not realize the full return on investment (Symantec, 2011). As a result, an improper plan to protect these environments causes fragmented implementation and a lack of standardization of virtual infrastructures, and this will continue to expose gaps in the security, backup and high availability of virtual environments (Symantec, 2011).

References:
• Biswas, S. (2011, January 20). Is Cloud Computing Secure? Yes, Another Perspective. In CloudTweaks Plugging In the Cloud. Retrieved February 10, 2011, from http://www.cloudtweaks.com/2011/01/the-question-should-be-is-anything-truly-secure
• Butler, J. M., & Vandenbrink, R. (2009, September). IT Audit for the Virtual Environment. In SANS. Retrieved February 13, 2011, from http://www.sans.org/reading_room/analysts_program/VMware_ITAudit_Sep09.pdf
• Data Loss Statistics (n.d.). In Boston Computing Network. Retrieved February 11, 2011, from http://www.bostoncomputing.net/consultation/databackup/statistics/
• Crump, G. (n.d.). Storage Switzerland Report: The Complexity of VMware Storage Management. In Storage Switzerland LLC. Retrieved February 12, 2011, from http://img.en25.com/Web/IsilonSystemsInc/The Complexity of VMware Storage.pdf
• GoGrid (2010). Cloud Infrastructure Security and Compliance [White paper]. Retrieved February 11, 2011, from http://storage.pardot.com/3442/12401/Cloud_Infrastructure_Security.pdf
• Gruman, G. (2008, March 13). Virtualization's secret security threats. In InfoWorld. Retrieved February 10, 2011, from http://www.infoworld.com/d/security-central/virtualizations-secret-security-threats-159?page=0,0
• Harris, S. (2010). All in One CISSP Exam Guide (5th ed.). New York, NY: McGraw Hill.
• Kay, R. (2008, October 6). QuickStudy: Storage virtualization. In ComputerWorld. Retrieved February 11, 2011, from http://www.computerworld.com/s/article/325633/Storage_Virtualization?taxonomyId=19&pageNumber=1
• Lindstrom, P. (2008, January 3). Attacking and Defending Virtual Environments. In Burton Group. Retrieved February 11, 2011, from http://www.burtongroup.com/Download/Media/AttackingAnd.pdf
• Littlejohn Shinder, D. (2009, March 25). 10 security threats to watch out for in 2009. In TechRepublic. Retrieved February 10, 2011, from http://www.techrepublic.com/blog/10things/10-security-threats-to-watch-out-for-in-2009/602
• Murphy, A. (n.d.). Virtualization Defined - Eight Different Ways [White paper]. Retrieved February 12, 2011, from http://www.f5.com/pdf/white-papers/virtualization-defined-wp.pdf
• Roach, A. (n.d.). Keeping it Spinning: A Background on Virtual Storage Providers. Retrieved February 11, 2011, from http://fht.byu.edu/prev_workshops/workshop08/papers/2/2-3.pdf
• Schofield, D. (n.d.). Data Integrity, Data Resilience and Data Security. In CloudTweaks Plugging In the Cloud. Retrieved February 13, 2011, from http://www.cloudtweaks.com/2011/02/security-for-the-cloud-data-integrity-data-resilience-and-data-security/
• Symantec (2011). Symantec Reveals Top Security and Storage Predictions for 2011 [Press release]. In Symantec. Retrieved February 11, 2011, from http://www.symantec.com/about/news/release/article.jsp?prid=20101209_02&om_ext_cid=biz_socmed_twitter_facebook_marketwire_linkedin_2010Dec_worldwide_2011predictio

Stephen Breen
Stephen Breen obtained two master's degrees: a Master of Science in Information Technology, Information Security Specialization, from Capella University, and a Master of Science in Computer and Information Sciences from Nova Southeastern University. He worked as a system analyst, programmer and quality assurance engineer for a software company. He is currently an information security consultant. His email address is [email protected]
MAGAZINE 3/2011: Our April issue will cover the Critical Power & Cooling topic!