
Want to have all the issues of Data Center magazine?

Need to keep up with the latest IT news?


Think you’ve got what it takes to cooperate with our team?

Check out our website and subscribe to Data


Center magazine’s newsletter!

Visit: http://datacentermag.com/newsletter/
DataCENTER MAGAZINE
FOR IT PROFESSIONALS
Contents – issue 02/2011

news
6. News

interview
9. Interview with David Tuhy (Intel)

application performance
10. Businesses Beware: Application Performance Matters to Your Users! – Kristen Allmacher

data recovery
12. Be Prepared Against Accidental File Deletions

basics
14. Backup and Disaster Recovery: The Basics – Dhananjay D. Garg
16. Examine your Network with Nmap – Mohsen Mostafa Jokar

security
22. Data Centers – Security First! – Mahdi Jelodari
36. Powering Data Centers – Lessons Learned – BJ Gleason

storage
24. Building a Flexible and Reliable Storage on Linux – Eduardo Ramos dos Santos Júnior
28. Understanding Linux Filesystem for Better Storage Management – Mulyadi Santosa
34. Companies Have Their Sights Set on Storage – Herman van Heerden

cloud computing
38. Facebook Cloud Computing SaaS, PaaS and IaaS: Standards & Practices – Richard C. Batka

backup
42. Data Backups Just Don't Cut It Anymore – Herman van Heerden

virtualization
44. Virtual Storage Security – Stephen Breen



Dear Readers,
The March issue of our magazine is dedicated to the Storage & Backup topic. We hope you will find here much useful information from this field of data center knowledge.

In our Storage section we decided to give you a closer look at Storage on Linux systems. We also decided to bring up the Virtual Storage Security topic, as Storage Virtualization is a very important data center area nowadays.

You can also find more information about data center and network security, and for those who are interested in Cloud Computing, we also have something in this issue.

I wish you a good read!

Magdalena Mojska
Editor in Chief

http://datacentermag.com

Data Center magazine team
Editor in Chief: Magdalena Mojska, [email protected]
Editorial Contributors: Stephen Breen, Dhananjay D. Garg, Mohsen Mostafa Jokar, Mahdi Jelodari, Richard C. Batka, Mulyadi Santosa, BJ Gleason
DTP: Marcin Ziółkowski, Graphics & Design Studio, tel.: 509 443 977, [email protected], www.gdstudio.pl
Special thanks to: Sean Curry, Stephen Breen, Erik Schipper, John Co, Adriel Torres, Kelley Dawson
Senior Consultant/Publisher: Paweł Marciniak
CEO: Ewa Dudzic, [email protected]
Production Director: Andrzej Kuca, [email protected]
Marketing Director: Magdalena Mojska, [email protected]
Publisher: Software Press Sp. z o.o. SK, 02-682 Warszawa, ul. Bokserska 1, Phone: 1 917 338 3631
www.datacentermag.com
News

Belvedere Trading Selects Force10 Networks Switches to Drive Down Latency as it Rounds Out 10 Gigabit Ethernet Migration

S-Series™ S4810 top-of-rack access switches complement existing Force10 C-Series™ core implementation for high-frequency trading environment

SAN JOSE, Calif., February 22, 2011 – Force10 Networks, Inc., a global technology leader, today announced that Belvedere Trading LLC, a leading proprietary trading firm specializing in equity index and commodity derivatives, has selected and deployed high-performance Force10 S-Series™ top-of-rack (ToR) switches as part of a successful network design aimed at providing ultra-low latency, which better supports the highly transactional nature of their business.

Electronic traders, such as Belvedere, are diligently working to ensure that their network infrastructure can support their massive amount of transactions as well as drive down the latency to nanosecond levels. With each transaction representing potentially millions of dollars, ultra-low latency is critical in high-performance trading environments.

"In current trading environments, delivering reliable and superior performance and ultra-low latency can represent the difference between being profitable and losing opportunities," said Yezdaan Baber, Director of Technology, Belvedere Trading LLC. "Our continued investment in Force10 switches reflects our confidence in their technology mapping seamlessly to the needs of our business."

Based on a recent network performance and power test, the S4810 access switch demonstrated as much as 5% to 70% lower latency than comparable switches in a 10 Gigabit Ethernet (GbE) configuration. The test was conducted by independent analyst and publisher Nick Lippis of The Lippis Report and Ixia, a leader in converged IP network test solutions.

Belvedere selected the newly offered S4810 access switches to be deployed strategically at their network edge and collocation facilities to leverage the performance of 10 Gigabit Ethernet (GbE) for Belvedere's financial derivatives trading practice.

S4810 Purpose-Built for High Frequency Trading-Type Applications
Recognizing the immediate need to ensure line-rate, non-blocking throughput and ultra-low latency throughout its performance-sensitive high-frequency trading (HFT) environment, Belvedere deployed the S4810 switches in geographical proximity to U.S. trading exchanges. Its 1.28 Tbps (full-duplex) non-blocking, cut-through switching fabric delivers line-rate performance under full load with ultra-low latency. The compact S4810 switch design provides 64 10 GbE ports worth of throughput in a combination of 10 GbE and 40 GbE feeds...

FTOS, the modular Force10 operating system software, runs on both the C-Series resilient core switches and the S4810, providing Belvedere with L2 and L3 feature richness, including IGMP multicasting and high availability provided by stacking. FTOS also provides underlying code stability and advanced monitoring and serviceability functions, including real-time network and application traffic monitoring.

"Force10 has a rich heritage of delivering reliable, high-performance switching solutions to environments grappling with huge files or high transaction levels," said Arpit Joshipura, chief marketing officer, Force10 Networks. "Coupling low latency requirements and advanced L3 software features provides a unique value to customers in HFT environments."

About Force10 Networks
Force10 Networks is a global technology leader that data center, service provider and enterprise customers rely on when the network is their business. The company's high performance solutions are designed to deliver new economics by virtualizing and automating Ethernet networks. Force10's high density and resilient Ethernet switching and routing products increase network availability, agility and efficiency while reducing power and cooling costs. Force10 provides 24x7 service and support capabilities to its global customer base in more than 60 countries worldwide. For more information on Force10 Networks, please visit www.force10networks.com.

Force10 Networks, the Force10 Networks logo, Force10, E-Series, Traverse, and TraverseEdge are registered trademarks and ExaScale, S-Series, TeraScale, FTOS, Open Automation, JumpStart, SwitchLink, SmartScripts, and HyperLink are trademarks of Force10 Networks, Inc. All other company names are trademarks of their respective holders.

Contact: Kevin Kimball, Force10 Networks Inc., +1 408-571-3544, [email protected]




AXI announces new Mobile Fuel Tank Cleaning System MTC HC-80

AXI (Algae-X International) announces the latest addition to their line of standard Mobile Tank Cleaning Systems, the HC-80, a high capacity, multi-stage, automated fuel tank cleaning system equipped with a Smart Filtration Controller. This new system stabilizes and decontaminates diesel fuel, bio-diesel, light oils and hydraulic fluids while restoring fuel to a "clear & bright" condition.

The HC-80 is ideal for service providers in the tank cleaning industry. With changes in fuel production and the growing implementation of bio-fuels, the need for fuel maintenance, including periodic tank cleaning, has increased dramatically. The MTC HC-80 system efficiently cleans tanks, removes water and sludge and restores optimal fuel quality. The system is equipped with a fully automated controller that provides an instant visual status report of system power, pump operation and alarms for high pressure, high vacuum and high water levels.

AXI's Mobile Tank Cleaning (MTC) systems excel in combining high capacity fuel optimization, filtration and water separation with low operating costs and a compact design to provide optimal fuel quality for reliable, peak engine performance.

The MTC HC-80 Mobile Tank Cleaning System:
• fully automated
• compact design
• high capacity fuel processing
• fuel flow and transfer of 80 GPM
• decontaminates fuel and tanks
• removes sludge and solids
• large contaminant holding capacity

Algae-X International (AXI) designs, engineers and manufactures innovative fuel filtration, fuel conditioning and tank cleaning equipment, now available in over 40 countries around the globe.

For further information please contact:
Bill O'Connell
239 690-9589 or 877 425-4239
E-mail [email protected]
Visit Algae-X International online at www.algae-x.net.

Juniper Networks Introduces QFabric™


New Paradigm in Data Center Fabric Provides Breakthrough Scale, Speed, Savings and Simplicity to Power the Cloud Computing Era

SAN FRANCISCO, Feb. 23, 2011 — Juniper Networks (NYSE: JNPR) today unveiled QFabric, the world's first true data center fabric. An outcome of "Project Stratus," Juniper's multiyear data center network research and development initiative, QFabric delivers quantum leap improvements in data center performance, operating cost and business agility, from the enterprise to the large scale cloud provider. Engineered as a simplified, highly scalable data center network solution, QFabric enables a superior approach to building and securing virtualized data centers that eliminates the tradeoff between quality of experience and economics that plagues today's legacy networks.

QFabric enables exponential improvements in data center speed, scale and efficiency by removing legacy barriers and improving business agility. QFabric's flat architecture also enables the industry's first integrated security solution that provides visibility, enforcement and scale across the entire physical and virtual data center fabric. Juniper has invested three years and more than $100 million in research and development to address these constraints, creating a new architecture that is designed to be the foundation of data centers for the next decade. QFabric delivers on Juniper's 3-2-1 data center network architecture – collapsing the traditional three-layer network down to a single, high-performance layer.

"Data center compute and storage technologies have advanced over the last decade, and the legacy approach to networking has not kept pace," said Kevin Johnson, CEO of Juniper Networks. "As cloud computing and the mobile internet accelerate, demand is intensifying for a quantum leap forward in data center capabilities. With QFabric, Juniper is transforming data center economics by introducing the only network fabric that is able to eliminate multiple layers of cost and complexity."

Juniper's QFabric architecture is up to ten times faster, uses 77 percent less power, requires 27 percent fewer networking devices, occupies 90 percent less data center floor space, and delivers a nine-fold reduction in operating resources compared with the nearest competitive offering. Improvements in operating expenses of this magnitude provide significant benefit to customers coping with the growing cost of continuing to scale data centers to drive revenue and meet escalating demands driven by cloud computing and the mobile Internet.

"QFabric is an evolutionary path to a revolution in computing," said Pradeep Sindhu, founder and CTO of Juniper Networks. "We have fundamentally reengineered the datacenter network, and with QFabric we address the demand for exponential speed, scale, security and efficiency for the next decade."

A New Class of Networking Products: The QFabric Product Family
QFabric is composed of three components that create a high-performance, low latency fabric, unleashing the full power of the data center. The QF/Node acts as the distributed decision engine of the fabric; the QF/Interconnect is the high speed transport device; and the QF/Director delivers a common window, controlling all devices as one.

Available for order today, the Juniper Networks® QFX3500 is the first product in the QFabric family and a stepping stone into the QFabric architecture. It is also capable of operating as a stand-alone 64-port 10 Gigabit Ethernet switch with FCoE and Fibre Channel gateway functionality. The QFX3500 offers the fastest unicast and multicast performance in the industry.

The QF/Interconnect and QF/Director will be available for order in Q3.

From: http://www.juniper.net/us/en/company/press-center/press-releases/2011/pr_2011_02_23-13_04.html




Keeping your Data Center ready for Cloud


Today most IT firms are looking forward to adopting cloud computing in their working environments. According to the National Institute of Standards and Technology (NIST), cloud computing is defined as follows: "Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."

Some key features of the cloud are:

1. Cost: Companies can greatly reduce costs by using cloud computing.
2. Processing is dynamic: "Cloud" is used as a metaphor for the internet, and the computing in this cloud is not static, i.e., the processing doesn't take place on one or more specifically known servers.
3. Security: Due to the centralization of data, security is better than in traditional systems. Most often the providers ensure that the data is protected and all the critical security issues are well addressed.
4. Independence: Regardless of their location or the system platform they are using, clients can access the data they store in the cloud with just a web browser.

Although cloud computing is good, not many data centers are ready for the transformation. Given below are some of the most common factors that an IT firm should keep in mind before moving towards the cloud.

1. Data capacity: Above all, sufficient data transmission capacity should be available for cloud computing in data centers. Multiplexing protocols such as Synchronous Optical Networking (SONET) and Synchronous Digital Hierarchy (SDH) can be used for transferring multiple digital data bit streams over optical fiber. SONET and SDH serve as a replacement for the Plesiochronous Digital Hierarchy (PDH) systems.
2. All-time support: Data centers should provide 24-hour support for all 365 days of the year. Properly trained monitoring, engineering and technical staff should be in place to provide non-stop support. In case of a breakdown, round-the-clock telephone support and on-site assistance should be made available.
3. Internet availability: To counter potential downtime, organizations should maintain two- or three-tier internet connectivity.
4. Fix service: An individual doesn't need to maintain physical or virtualized servers because the cloud service provider takes this responsibility. A data center is expected to regularly apply security patches for vulnerability correction and maintain operating system and hardware support. Thus, separate staff should be in place for fixing any faults in the data center.
5. Redundant hardware support: Data centers should keep redundant hardware like routers, uninterrupted power sources and switches so that, in case of a breakdown, spares for commonly damaged hardware parts are quickly available.
6. Power support: Data centers should maintain an online electrical power source that provides emergency power to a load when the input power source fails. This power support is also called an uninterruptible power source, UPS or battery/flywheel backup. Apart from UPSes, data centers should have spare electrical grid connections which can serve as an independent power supply. This power supply can be useful in case of a power failure from the utility mains or in case of a natural disaster.

DHANANJAY D. GARG
The author is an information security enthusiast based in India. He loves working on projects related to information security.

OpenStack
As of February 3rd, Cisco Systems has announced that it has joined the OpenStack community. For those that don't keep up with the emerging utility-based computing platforms, OpenStack is an open-source infrastructure platform that has the ability to offer services similar to AWS EC2 and S3. OpenStack is important to the datacenter community because, as its adoption grows in maturity, it offers the potential of a true utility model, with customers migrating applications and services to the most optimal service for their needs, while still maintaining the ability to transport those applications into private cloud scenarios where required. For developers and software-as-a-service cloud providers, this means they can exercise scaling models not available in traditional virtualization models.

As of today, OpenStack has a list of collaborators and contributors that includes some true heavyweights, including NTT, Dell, Citrix, Extreme Networks, and many others. OpenStack's primary drivers have come from NASA, which contributed Nova, providing compute services, and Swift, developed by Rackspace, providing storage services.

OpenStack has some real competition out there though, with enterprise software vendors like VMware, with their vCloud Director platform, and CA's acquisition of 3Tera's AppLogic. VMware is the 10,000-pound gorilla in the enterprise virtualization market, and it will be interesting to watch as options like OpenStack evolve. CA's purchase of 3Tera will put the force of the software giant behind a strong existing community of managed service providers running the platform, and could signal the beginning of some competition for the enterprise internal "cloud".

The obvious conclusion about Cisco's interest is that it will be able to contribute to the codebase and help enable advanced networking integration and features, which it has already done in other virtualization products such as its introduction of the Nexus 1000v and the Virtual Security Gateway. A less obvious alternative is that we might see Cisco utilizing the solution in some of its routing platforms, which already use virtualization technologies to provide network-oriented application services.

OpenStack will be packaged into Ubuntu 11.04, due out in April.
Read more about OpenStack at: http://www.openstack.org/



Interview

Interview with David Tuhy
Could you briefly introduce yourself to our readers?
David Tuhy, General Manager of Intel's Storage Group.

Could you say a few words about your company?
Intel is a world leader in computing innovation. The company designs and builds the essential technologies that serve as the foundation for the world's computing devices.

Which areas does Intel currently focus on? What can we expect from this company in the upcoming months when it comes to Storage?
Intel's Storage Group is currently focused on solutions for enterprises, small businesses and consumer homes. The Intel® Xeon® storage roadmap provides architectural enhancements that will bring further optimization and performance for storage solutions.

Can you tell us about the latest trends in storage virtualization?
We've noted an increased interest in storage virtualization as CIOs search for more efficient ways to manage the enormous growth of digital data. Companies are using virtualization in storage to aggregate multiple pieces of physical storage such as hard disk drives and multiple NAS or SAN boxes into one or more pieces of logical storage. This creates much more dynamic and flexible storage that can service parallel storage business functions (e.g., data de-duplication in parallel with I/O accesses).

What are the current Intel solutions in this area?
Intel® Xeon® multicore processors are ideal for virtualized enterprise storage systems. They provide the processing capabilities and integrated advanced storage technologies to create a unified foundation that delivers the dynamic, flexible and parallel processing required by today's storage systems.

What is the most important feature of Intel® Xeon® processors?
The high performance multicore capability of Intel Xeon processors is key to the energy efficient performance, data protection capabilities and intelligence necessary for mission critical environments.

What makes it ideal for virtualized enterprise storage systems?
The multicore technology of Intel Xeon processors provides a strong foundation for storage systems that scale the entire storage spectrum. Virtualized storage requires dynamically reconfiguring storage assets based on ever-changing business needs. It also requires processing power and integrated storage features (e.g. de-duplication, faster disaster recovery, optimized thin provisioning, etc.) that enable critical functions without the added cost of purchasing additional physical units when more storage is needed.

Could you briefly describe the difference between your processor and similar products provided by other companies?
Scalable storage based on Intel Xeon processor architecture integrates features such as XOR RAID functions to deliver faster workloads that use less computing resources and power for RAID 5 writes. Intel Xeon also minimizes downtime by supporting Asynchronous DRAM Refresh (ADR), which operates DIMMs in self-refresh mode to retain cached memory even through a power failure, and Non-Transparent Bridging (NTB) to keep nodes in sync for high availability. Future platforms will also integrate additional storage capabilities, such as integrated Serial Attached SCSI (SAS), to eliminate the need for extra I/O controllers and reduce costs.



Application Performance

Businesses Beware
Application Performance Matters to Your Users!
As many businesses struggle to look ahead at how to leverage the newer technologies, predictions can be made based on current trends. Cloud computing has been hyped, but it gains traction as businesses see the value it brings.

The promise that it offers of cost savings, scalability and agility is very attractive to many businesses, even those that plan to innovate or transform their own IT operations. As we evaluate these trends, we should look at the possible future of:

• Application architecture
• Who's responsible or who's accountable
• Customer expectations

Future Application Architecture
IT will continue to extend legacy and distributed systems to add virtualization and other technologies, such as SOA, Web 2.0 or cloud computing, which will increase the complexity of application architectures. Business counterparts will continue to demand, "Do more with less," and IT will need to look for ways to save money and manage service expectations more efficiently and effectively.

Cloud computing will be the pathway to meeting many of the business needs to save money and operate more proficiently. The once-prevalent concern about "how secure will the data be if I move it to the cloud?" will continue to be a concern, regardless of who has to secure the data. But security of the application's data is now in the hands of the provider. This means IT will control less of the business's operations, including managing data security, and become more responsible for managing service providers. IT will need to understand how to build contracts, manage agreements and track service levels over time. Additionally, IT needs a means to measure the performance of these hybrid cloud applications that are both inside and outside of the control of the business. IT will need to understand what pieces of the application are performing poorly – CDNs, ISPs, etc.

An end-to-end performance management solution will be needed to ensure the applications that reside both within the enterprise and outside in the cloud are monitored and measured to meet service levels and ensure the quality of service users expect. Furthermore, IT operations will need to optimize performance in the cloud application delivery chain and be able to quickly identify problems prior to users feeling any impacts.

Who's Responsible – Who's Accountable
As IT organizations continue to utilize cloud computing, the role of IT operations in managing infrastructure will change. IT teams will shift roles from managing and monitoring the infrastructure components – network, server, database, etc. – to having a role much like a service manager who will be responsible for:

• Selecting the service provider
• Defining the service level agreements
• Ensuring service level compliance

This shift will force IT and its business counterparts to be aligned and agree to the type of service delivered. Business requirements will determine what applications will be delivered through cloud computing and what IT infrastructure will remain in IT's direct control. As business owners and IT managers determine the requirements for what to push to the cloud, the role of IT will be less focused on managing infrastructure than on ensuring service providers are meeting the agreed-upon level of service. Additionally, IT Service Management frameworks such as ITILv3 or Six Sigma can assist businesses with defining processes for improving service quality and defining service level agreements.

Future Customer Expectations
Customers will have very little tolerance for poor-performing applications, due to steep competition. The shift will continue toward self-service capabilities versus having "face-to-face" customer interactions. Additionally, users' technical knowledge of how to work around or find something better will flourish, making companies strive to ensure they maintain high application performance to retain and increase customers.

Furthermore, technology will continue to push itself into a new era where computing is all about real-user experience. A form of hypothesis for application technology might look something like:
Super-rich interfaces: 3-D application interfaces, hologram imaging and smart phones that display pictures on any surface.
Smarter devices: easy-to-run operating systems, Internet applications for any business or personal use, quick viewing of streaming media/video, web access everywhere.
Optimized networks and telecommunications: the new 4G vs. 3G, broadband+, vLANs, vWANs and VPNs.
As these newer and more sophisticated ways of computing spread worldwide, customer performance demands will continue to escalate.

Performance Matters
Achieving results that increase the company's revenues, customer satisfaction and brand perception is how to thrive in the business world today. The typical results of successful application performance management make it clear that application architectures, zones of responsibility and customer expectations must all be considered to manage performance and availability efficiently and effectively. As businesses around the world continue to transform and leverage new technologies, it is necessary to:

• Manage performance globally as seen by the end-user
• Gain visibility into end-to-end application performance
• Align IT to meet business goals
• Optimize mean time to resolution

Before a crisis becomes a catastrophe, ensure your business stays out of the news headlines about application outages and avoids performance issues that could hammer your bottom line, with application performance management that is end-to-end, from the enterprise to the Internet.

Kristen Allmacher is a Product Marketing Manager at Compuware Corporation. She has been working in the IT Service Management industry for more than ten years in the areas of R&D, strategic planning, product management, sales engineering and process improvements. In addition, her side hobby as a graphic designer has been utilized to design many software interfaces that are implemented and used today. Kristen is a Six Sigma Green Belt and is ITIL Foundation v3 certified. Her passion is in working with people, processes and technology.



Data Recovery

Be Prepared Against
Accidental File Deletions
Sooner or later, disasters happen to every computer user. They delete
important files, on purpose or accidentally. Or they update a document and
save it, overwriting the original version. The common practice of using an
existing document, spreadsheet, or presentation as a starting point for a new
one often ends in catastrophe when the user forgets to save the changes
under a new file name.

Careers end on such errors. Defining the problem is simple: files that should not be deleted often are, on individual PCs and file servers. The immediate challenge is recovering them. Dealing with the larger, long-term issue of prevention and retention on a corporate-wide scale is more complex.

Products that attempt to recover deleted files have been around as long as personal computers themselves, but have a history of delivering mixed results.

The shortcoming is that these solutions, by definition, are after-the-fact fixes rather than preventative solutions. They cannot resurrect a file that has been overwritten, and they make no attempt to archive the many revisions that a typical document goes through in its lifetime. Stronger measures are needed.

The mathematics of accidental file erasure is alarming. A PC user is likely to spend an average of one hour in a frantic effort to recover the file (or files, or an entire directory) before turning to the help desk. Just two occurrences per day – a conservative estimate for a corporate environment – translates to a minimum annual productivity loss of 520 hours, nearly 14 40-hour weeks. Office colleagues, in their attempts to help, add to lost productivity and are more likely to hurt, not help, any chance of success.

Once the IT department gets involved, costs add up quickly. With an IT technician earning $30 an hour, a single 30-minute venture to locate and restore a file from a backup tape – if it's there at all – costs $15. While $15 does not seem like much, in a corporate environment, repeating that process twice a day costs $7,800 over a full year. Tied up for 260 hours, the total quantifiable cost for user and IT is nearly 21 work weeks, the equivalent of five months from one full-time employee.

Clearly, recovering a deleted file from a local hard drive, or perhaps from a prior day's backup tapes, is an expensive, time-consuming, productivity-robbing process. That's if it can be done at all. And should a crucial file be unrecoverable, the cost to the business itself could be incalculable.

A corporate retention strategy looks at the big picture. What such a plan is likely to miss is the workaday fine details down in the trenches, the often unintentional actions taken by individual rank-and-file workers who spend hours at a computer every day. An average employee deletes or overwrites an individual file by accident, then desperately tries to get it back.

To deal with file erasure at the individual PC level, Windows offers a safety net, the recycle bin. The idea is that deleted files are held here, for a while, allowing a user to recover a file before it disappears for good. Though an excellent idea, this contains shortcomings that limit its usefulness for the enterprise.

There are many situations where the deletion will bypass the recycle bin, and, in the case of "save overs", the recycle bin does not offer protection. "Save overs" are what happens when you "save" a document over another document. For example, if you make changes to document "A", and then simply hit the Save icon, you just lost all of the data in your original document.

To gain greater control over users' files, it is increasingly common for IT departments to configure users' PCs to save files to a network directory, rather than the computer's local hard drive. Good reasons for doing so are plentiful, yet this scheme can further harm the already small chances of recovering a file. The reason is simple: items deleted from a network drive are permanently deleted; they are not sent to a Windows recycle bin.

There is no shortage of products that tackle various aspects of file recovery. Most of these are after-the-fact solutions that help a desperate worker in an attempt to recover an already deleted file. A more advanced approach to the problem is to prevent the situation from occurring in the first place.

A complete solution for deletion protection must encompass several key aspects. Foremost, it needs to capture any deleted file, regardless of size or location. It must provide automatic version protection, saving multiple copies of Word, Excel and PowerPoint documents as they change throughout their lifetime, and doing so without any user action. It must be network savvy, providing protection for users that save their work to a network directory. From the IT perspective, a proper solution should have the ability to be push installed over the network to individual systems and allow the entire environment to be managed centrally.

Diskeeper Corporation developed Undelete® 2009 to deal with these circumstances. Undelete replaces the Windows recycle bin with its own more powerful "Recovery Bin" that intercepts all deleted files, regardless of their size, their location, how they were deleted, or who did so.

The Recovery Bin employs a Windows Explorer-like interface for navigation and a search capability. Protection is provided that guards against losses that occur when a changed Word, Excel or PowerPoint document is saved under the same name, overwriting the prior version. Undelete 2009 implements version protection, allowing users to restore these overwritten files. In use, users need only click on the file and select the View Versions option. At that point, version restoration is available.

Preventing users from deleting or overwriting files is a nice idea, but one that isn't possible. It happens frequently, causing anguish for the user, aggravation for the IT department, and possible financial and legal harm for the business. The use of software that backs up files or attempts to recover them is logical. Unfortunately, these products are often reactive in nature. They cannot recover files across network directories, nor do they provide protection for existing files overwritten by a newer version. Furthermore, backups are intended for disaster recovery from catastrophic systems failure, not as a method for file-by-file undeletion. Undelete 2009 was built with these challenges in mind.



Upcoming Conferences – March 2011

Data Center World


Data Center World is the largest global event of its kind and has been named one of the 50 fastest growing tradeshows in the U.S. It is the leading
educational conference for data center professionals.

Data Center World is presented by AFCOM, the premier data center association representing 4,500 of the largest data centers around the world.
To learn more about the association, visit www.afcom.com

When? March 27-31, 2011


Where? Mirage Hotel and Convention Center, Las Vegas, Nevada

Gartner Infrastructure, Operations & Data Center Summit


The 6th Annual Gartner Data Center Conference delivers a wealth of strategic guidance and tactical recommendations on the full spectrum of issues reshaping the 21st-century data center. It is the premier event IT operations executives and Data Center professionals turn to for unbiased expertise, best practices, and sound management approaches to help them better manage their data centers' agility, optimize costs, navigate critical next steps in emerging technologies like virtualization and cloud computing, and deliver world-class services.
To learn more, visit: http://www.gartner.com/technology/summits/apac/data-center/index.jsp

When? March 15-16, 2011


Where? Sydney Convention Center, Sydney, Australia

Geek Day – Rock Stars of the Data Center


Geek Day is one of the nation’s premier technology events focused exclusively on virtualization and data center solutions. This year’s event
will feature a packed agenda filled with innovative technologies, interesting and informative breakout sessions and multiple hands-on learning
labs.
To learn more, visit: http://www.geekday.com

When? March 10, 2011


Where? Renaissance St. Louis Grand Hotel, 800 Washington Avenue, St. Louis, Missouri 63101

Datacenter Dynamics – New York: Designing for Demand: Single Tier to Multi Tier to
Dynamic Tier
In our annual visit to New York, the DatacenterDynamics Conference and Expo series will address the challenges faced by owners and operators
of legacy structures in need of upgrading and new builds required to meet the complex combination of required service delivery, technology
advances, energy efficiency, resilience and security.
To learn more, visit: http://www.datacenterdynamics.com/conferences/2011/new-york

When? March 10, 2011


Where? Marriot Marquis, New York
Storage & Backup

Backup and Disaster Recovery: The Basics
A backup is basically an additional copy of the original data that is used primarily to recover from an accidental loss. The backup copy is usually secured and kept for possible future use for a pre-determined time period. A backup copy can be created in two ways: as a simple copy or as a mirrored copy. The difference between the two is that a mirrored copy is always updated with the recent data being written to the primary copy, while a simple copy is not.

BACKUP AND RECOVERY
Most businesses back up their data so that, in case of a potential loss due to deletion or data corruption, they can recover the original data and protect themselves from major financial losses. Nowadays it has also become a regulatory requirement for businesses to back up their data. Most companies prepare a backup so that they can execute a disaster recovery after an accidental deletion, erasure or hard disk corruption.

Disaster recovery is the term used for a complete plan to get a computer system up and running again after a disaster. It is always important to have a disaster recovery plan for recovering computers, networks and data. Strong backup strategies can be implemented for restoring data after any kind of accident.

BACKUP TYPES

a. Full backup – As the name indicates, this backup backs up all the data on the target volume, regardless of any changes made to the data itself. After this backup, you'll be able to restore all files on any given computer but not the core operating system files. The system state is the common term for operating system-specific data. A system state backup allows you to restore core operating system files like the Windows registry files or the system boot files. To restore the operating system, a system state backup needs to be done.
b. Incremental backup – An incremental backup backs up all the files that were modified since the last backup. This kind of backup utilizes the least amount of resources, both in terms of storage and bandwidth. An incremental backup is faster to perform because fewer files need to be backed up, but it is slower to restore because the last full backup and all subsequent incremental backups must be applied.
c. Differential backup – A differential backup makes a copy of all the files that were changed after the last full backup, regardless of any incremental backups taken in between. A differential backup is slower to perform because a larger number of files needs to be backed up, but it is faster to restore because only the last full and the last differential backup need to be applied.

HARDWARE TECHNOLOGIES
Hardware technologies used for disaster recovery are an important part of any backup strategy. Backup hardware refers to all kinds of storage media that can be used to store the backup data. This section briefly discusses the physical storage devices that can be used for backup.

a. Tape – The tape is one of the earliest storage devices and is still around because it is cheap, simple and small, and gigabytes of data can be stored on a single tape. Tapes are good, but their biggest disadvantage is that the data is stored sequentially, which makes searching for and storing files a lot slower because each data bit is stored one after the other in a single row.
b. Optical storage – Here, optical storage is a term used for CD-R/RW, DVD-R/RW and BD-R/RE. CD-ROM drives are standard and can store up to 700MB, and the CD-RW format allows unlimited rewrites to the same CD. But the biggest disadvantage with them is that the maximum storage size isn't enough. Then comes the DVD, which has almost replaced CD-ROMs because this storage medium allows a capacity of at least 4.7GB (single layer) to 8.5GB (dual layer). The Blu-ray Disc may replace the DVD in the near future because it allows for a storage of 25GB (single layer) to 50GB (dual layer), which is approximately six times more storage than on a DVD.
c. Hard drives – An external hard drive is a fast method of backing up and restoring. The disks in these drives spin at high speed and file retrieval by the backup program is as quick as it could be. The only problem with them is that they are only good for a small company or for individual users who back up their own data to a backup server. Hard drives in a large network utilize too much precious network bandwidth and they are difficult to handle, especially because of their price.
d. Removable disks – Over the years, the floppy has quickly become obsolete for backup purposes, as its storage capacity is as small as 1.4MB, and it has been replaced by USB flash drives, both in terms of size and capacity. Flash drives are much smaller than a floppy disk, weigh less than 30g and can store as much as 256GB. Some flash drives allow 1 million write or erase cycles and can be easily used to back up various types of data.
e. Network Attached Storage (NAS) – NAS is basically a combination of hardware and software that is designed to serve only a single purpose, i.e., file sharing. The working of NAS is simple: data sharing software first connects the LAN with the NAS server, and then the NAS server, which is the brain of the NAS system, passes the data from the LAN onto the storage devices. NAS is a simple device and is good for file availability, but the only problem with NAS is that it consumes network bandwidth and thus presents itself as only a partial solution for disaster recovery.
f. Storage Area Networking (SAN) – The only difference between SAN and NAS is that a SAN is an entire network that is dedicated to file storage and not just a single file storage system. A SAN allows quick file access and it greatly reduces the network bandwidth requirement, but there are some things that should be kept in mind before using a SAN:
– SANs are complex,
– SANs are costly,
– a SAN requires you to purchase all components (hubs and fibre channel switches) from the same vendor for everything to work.

BACKUP STRATEGIES

a. Server based backup – In a server based backup, the backup software is stored on the backup server and the backup storage devices are connected to it. The file server is the place where all the data files are stored. The backup software on the backup server helps with the transfer of the files from the file server to the physical storage device.
b. Client based backup – In a client based backup, each network user is allowed to have a certain amount of control over the backup of their files. In this case the user decides which files to back up and which not to, and to do so backup configuration software is installed on each user's machine. Just like in a server based backup, backup software is installed on the backup server and this software controls the backup on the backup server side.
c. Frozen image backup – It often happens that when a file is backed up by a client or server based backup, access to the file is blocked. Access to the file is not given to the user until the backup of that file completes. But in a frozen image backup, the backup software creates an image of the data and then backs up that image rather than the actual files. The main disadvantages of this backup are that it uses up storage and network resources and that, during a restore, the whole image is restored and not just a single file. Server and client based backups, by contrast, allow you to restore a single file from your backup.
d. SAN based backup – In a SAN based backup, the SAN, file server and backup server are all connected together. The backup server has the backup software installed on it and it initiates the backup process. The backup software initiates the backup process in such a way that the file server sends the data files directly to the SAN, where the physical storage device is located.

SUMMARY
It is always important to have a disaster recovery plan for recovering computers, networks and data. The disaster recovery plan should be made such that it suits the needs and budget of the company. A full backup means the backup of all data, an incremental backup means the backup of all files that were modified after the last backup, and a differential backup means a backup of all files that were modified after the last full backup. Most administrators use a combination of two out of the three backup types.

There are numerous hardware technologies available on the market, but before choosing one you should examine factors like the size of your company, the frequency of data modification, the amount of data to be backed up and, most importantly, the budget of your company. These factors will help you decide which hardware technology to go for.

After deciding which hardware technology to go with, you can decide which backup strategy should be used by your backup software. Large companies can go with SAN based backups, although this requires a significant economic investment. Smaller companies with a single LAN usually go with client based and / or server based backups.

DHANANJAY D. GARG
The author is an information security enthusiast based in India. He loves working on projects related to information security.
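To make the distinction between the three backup types described in this article concrete, here is a minimal GNU tar sketch; it is not part of the original article, and the paths /srv/data and /backup are hypothetical:

# Full backup: archive everything and record the file state in a snapshot file
$ tar --listed-incremental=/backup/full.snar -czf /backup/full.tar.gz /srv/data

# Incremental backup: only files changed since the last backup run
# (the incremental snapshot file is updated on every run)
$ cp -n /backup/full.snar /backup/incr.snar
$ tar --listed-incremental=/backup/incr.snar -czf /backup/incr-$(date +%F).tar.gz /srv/data

# Differential backup: everything changed since the last FULL backup
# (starting from a fresh copy of the full snapshot each time gives differential behaviour)
$ cp /backup/full.snar /tmp/diff.snar
$ tar --listed-incremental=/tmp/diff.snar -czf /backup/diff-$(date +%F).tar.gz /srv/data

To restore, extract the full archive first, then either every incremental archive in order, or just the latest differential archive.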



Basics

Examine your Network With Nmap
What is network scanning?
Network scanning is an important task that any system administrator must perform. Network scanning consists of port scanning and vulnerability scanning.

A port scanner is software designed to probe a server or host for open ports. It is often used by administrators to verify the security policies of their networks, and it can be used by an attacker to identify running services on a host with a view to compromising it. A port scan sends client requests to a range of server port addresses on a host in order to find an active port. The design and operation of the Internet is based on TCP/IP. A port can show one of the following behaviors:

1. Open or Accepted: The host sent a reply indicating that a service is listening on the port.
2. Closed or Denied or Not Listening: The host sent a reply indicating that connections will be denied to the port.
3. Filtered, Dropped or Blocked: There was no reply from the host.

Port scanning has several types, such as: TCP scanning, SYN scanning, UDP scanning, ACK scanning, Window scanning, FIN scanning, Xmas scanning, Protocol scan, Proxy scan, Idle scan, CatSCAN and ICMP scan. Below we explain a number of these scans.

TCP scanning
The simplest port scanners use the operating system's network functions; this is generally the next option to go to when a SYN scan is not feasible.

SYN scanning
SYN scan is another form of TCP scanning. Rather than use the operating system's network functions, the port scanner generates raw IP packets itself and monitors for responses. This scan type is also known as "half-open scanning", because it never actually opens a full TCP connection.

UDP scanning
UDP is a connectionless protocol, so there is no equivalent to a TCP SYN packet. If a UDP packet is sent to a port that is not open, the system will respond with an ICMP port unreachable message. If a port is blocked by a firewall, this method will falsely report that the port is open. If the port unreachable message is blocked, all ports will appear open.

ACK scanning
This kind of scan does not exactly determine whether the port is open or closed, but whether the port is filtered or unfiltered. This kind of scan can be good when attempting to probe for the existence of a firewall and its rule sets.

FIN scanning
Firewalls usually block packets in the form of SYN packets. FIN packets are able to pass through firewalls unmodified. Closed ports reply to a FIN packet with the appropriate RST packet, whereas open ports ignore the packet at hand.
Nmap supports a large number of these scan types.

A vulnerability scanner is a computer program designed to assess computers, computer systems, networks or applications for weaknesses. It is important that the network administrator is familiar with these methods.

There is a lot of software for scanning networks; some of it is free and some is not. At http://sectools.org/vuln-scanners.html you can find a list of such software.

The significant point about Nmap (Network Mapper) is that it is free and open source. Nmap is a security scanner originally written by Gordon Lyon (also known by his pseudonym Fyodor Vaskovich) to discover hosts and services on a computer network. Nmap runs on Linux, Microsoft Windows, Solaris, HP-UX and BSD variants (including Mac OS X), and also on AmigaOS and SGI IRIX.
Nmap includes the following features:

• Host Discovery,
• Port Scanning,
• Version Detection,
• OS Detection,
• Scriptable interaction with the target.

Figure 1. Zenmap is easy to work with and provides a good working environment.
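The scan types just described map directly onto Nmap options (they are also summarized in Table 2 later in this article); a brief sketch, reusing an illustrative target address from the examples in this article:

$ nmap -sT 192.168.10.1     (TCP connect scan: uses the OS connect() call)
# nmap -sS 192.168.10.1     (SYN "half-open" scan: raw packets, requires root)
# nmap -sU 192.168.10.1     (UDP scan: closed ports answer with ICMP port unreachable)
# nmap -sA 192.168.10.1     (ACK scan: reports filtered/unfiltered rather than open/closed)
# nmap -sF 192.168.10.1     (FIN scan: closed ports reply with RST, open ports stay silent)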



Upcoming Storage Events

SNW Spring 2011: Driving Innovation Through the Information Infrastructure


Produced by Computerworld and co-owned by Computerworld and SNIA (The Storage Networking Industry Association), SNW features more
than 120 educational sessions and presentations by top IT management experts covering today’s hottest IT topics — from cloud computing to
energy efficient data centers to virtualization to storage -- and so much more.
To learn more, visit: https://www.eiseverywhere.com/ehome/index.php?eventid=16204&

When? April 4-7, 2011


Where? Hyatt Regency Silicon Valley/Santa Clara, Convention Center, Santa Clara, California

Data Center Insights: Cloud Computing


DCI 2011 is an invitation-only, hosted summit designed for senior IT and business executives who want to understand how current and future
technology trends will impact their datacenter vision and strategies.
Visit: http://www.datacenterinsights.com/

When? May 23-25, 2011


Where? Arizona Grand Resort, Phoenix, AZ

SYSTOR 2011: The 4th Annual International Systems and Storage Conference
SYSTOR 2011, the 4th Annual International Systems and Storage Conference, promotes computer systems and storage research, and will take
place in Haifa, Israel. SYSTOR fosters close ties between the Israeli and worldwide systems research communities, and brings together academia
and industry.
To learn more, visit: http://www.research.ibm.com/haifa/conferences/systor2011/

When? May 30 – June 1, 2011


Where? Haifa, Israel

Tape Summit 2011


The Tape Summit is designed to bring a strong, cohesive message to the whole market about the state of tape. The industry is under siege from disk, Big Data and EMC. Press and analysts understand what individual speeds and feeds each tape company offers in media, drives and libraries. What they don't understand, or are not clear on, is what the use cases are, what the future is, or how tape plays a role in markets such as archive. They think of it simply as a backup target.
Visit: http://theexecevent.com/2011_tape_summit/

When? April 13-15, 2011


Where? Las Vegas Event Center Hotel, Las Vegas

Nmap works in two modes: command-line mode and GUI mode. The graphical version of Nmap is known as Zenmap. The official GUI for Nmap versions 2.2 to 4.22 was NmapFE, originally written by Zach Smith. For Nmap 4.50, NmapFE was replaced with Zenmap, a new graphical user interface based on UMIT, developed by Adriano Monteiro Marques.
There are more features in Nmap than we can cover in this article, so we describe only some of the most important ones.

Scan a Single Target
To scan a single target, specify it as an IP address or host name.

Usage syntax: nmap [target]
$ nmap 192.168.10.1
Starting Nmap 5.00 ( http://nmap.org ) at 2009-08-07 19:38 CDT
Interesting ports on 192.168.10.1:
Not shown: 997 filtered ports
PORT STATE SERVICE
20/tcp closed ftp-data
21/tcp closed ftp
80/tcp open http
Nmap done: 1 IP address (1 host up) scanned in 7.21 seconds

In the above example, PORT shows the port number/protocol, STATE shows the state of the port, and SERVICE shows the type of service running on that port.
You can scan multiple targets with the following syntax:

Usage syntax: nmap [target1 target2 etc]
$ nmap 192.168.10.1 192.168.10.100 192.168.10.101

Scan a Range of IP Addresses
A range of IP addresses can be used for target specification, as in the example below.

Usage syntax: nmap [Range of IP addresses]
$ nmap 192.168.10.1-100

Scan an Entire Subnet
Nmap can be used to scan an entire subnet using CIDR notation.

Usage syntax: nmap [Network/CIDR]
$ nmap 192.168.10.1/24

You can also create a text file that contains your targets and give this file to Nmap to scan; see the example below:

Usage syntax: nmap -iL [list.txt]
$ nmap -iL list.txt

Exclude Targets from a Scan
To exclude a target from a scan, you can use this syntax:

Usage syntax: nmap [targets] --exclude [target(s)]
$ nmap 192.168.10.0/24 --exclude 192.168.10.100

Table 1.
Feature    Option
Don't Ping    -PN
Perform a Ping Only Scan    -sP
TCP SYN Ping    -PS
TCP ACK Ping    -PA
UDP Ping    -PU
SCTP INIT Ping    -PY
ICMP Echo Ping    -PE
ICMP Timestamp Ping    -PP
ICMP Address Mask Ping    -PM
IP Protocol Ping    -PO
ARP Ping    -PR
Traceroute    --traceroute

Table 2.
Feature    Option
TCP SYN Scan    -sS
TCP Connect Scan    -sT
UDP Scan    -sU
TCP NULL Scan    -sN
TCP FIN Scan    -sF
Xmas Scan    -sX
TCP ACK Scan    -sA
Custom TCP Scan    --scanflags
IP Protocol Scan    -sO
Send Raw Ethernet Packets    --send-eth
Send IP Packets    --send-ip

Table 3.
Flag    Usage
SYN    Synchronize
ACK    Acknowledgment
PSH    Push
URG    Urgent
RST    Reset
FIN    Finished

Table 4.
Feature    Option
Do a quick scan    -F
Scan a specific port    -p [port]
Scan a port by name    -p [name]
Scan ports by protocol    -p U:[UDP ports],T:[TCP ports]
Scan all ports    -p "*"
Scan top ports    --top-ports [number]
Perform a sequential port scan    -r

Table 5.
Feature    Option
Operating System Detection    -O
Try to guess an unknown operating system    --osscan-guess
Service Version Detection    -sV
Perform an RPC Scan    -sR
Troubleshoot Version Scans    --version-trace
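As a quick illustration of the host-discovery options summarized in Table 1, a ping-only sweep of a subnet (reusing the example network from the article, so the address is a placeholder) would look like this:

Usage syntax: nmap -sP [Network/CIDR]
$ nmap -sP 192.168.10.0/24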




Scan an IPv6 Target
In addition to IPv4, Nmap can scan IPv6 targets. The -6 parameter is used to perform an IPv6 scan.

Usage syntax: nmap -6 [target]
# nmap -6 fe80::29aa:9db9:4164:d80e

A summary of the discovery options, for quick reference, can be found in Table 1. In this article we refrain from explaining every detail and, for most of them, only show the general form of use.

Don't Ping

Usage syntax: nmap -PN [target]
$ nmap -PN 10.10.5.11

The other discovery features are used similarly.
Let's examine the Advanced Scanning Options.
Nmap supports a number of user-selectable scan types. By default, Nmap will perform a basic TCP scan on each target system. In some situations, it may be necessary to perform more complex TCP (or even UDP) scans to find uncommon services or to evade a firewall. Table 2 shows some of the options you need to perform advanced scanning. We only show the general form of each scan and explain the options that require special settings. Advanced scans are used like the other scans; the examples below show how to use these options against a target.

Note:
You must log in with root/administrator privileges (or use the sudo command) to execute many of these scans.

TCP SYN Scan
To perform a TCP SYN scan you must use the -sS option.

Usage syntax: nmap -sS [target]
# nmap -sS 10.10.1.48

The other options are used in the above form; only some of them require special settings.

Custom TCP Scan
The --scanflags option is used to perform a custom TCP scan.

Usage syntax: nmap --scanflags [flag(s)] [target]
# nmap --scanflags SYNURG 10.10.1.127

The --scanflags option allows users to define a custom scan using one or more TCP header flags (Table 3).

Port Scanning Options
There are a total of 131,070 TCP/IP ports (65,535 TCP and 65,535 UDP). Nmap, by default, only scans 1,000 of the most commonly used ports. Table 4 shows some of the options you need to perform port scanning; below we demonstrate the ones that require special settings.

Do a quick scan
The -F option instructs Nmap to perform a scan of only the 100 most commonly used ports:

Usage syntax: nmap -F [target]
$ nmap -F 10.10.1.44

Nmap scans the top 1000 commonly used ports by default. The -F option reduces that number to 100.

Scanning ports by name
The -p option can be used to scan ports by name:

Usage syntax: nmap -p [port name(s)] [target]
$ nmap -p smtp,http 10.10.1.44

Scanning Ports by Protocol
Specifying a T: or U: prefix with the -p option allows you to search for a specific port and protocol combination:

Usage syntax: nmap -p U:[UDP ports],T:[TCP ports] [target]
# nmap -sU -sT -p U:53,T:25 10.10.1.44
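Table 4 also lists an option for scanning every port instead of the default 1,000. For completeness, here is what that looks like against the same example host used above (this particular command is our illustration, not one of the original examples):

Usage syntax: nmap -p "*" [target]
$ nmap -p "*" 10.10.1.44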

Scan Top Ports


The --top-ports option is used to scan the specified number of
top ranked ports:

Usage syntax: nmap --top-ports [number] [target]
# nmap --top-ports 10 10.10.1.41
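The port-selection options can be freely combined with the scan types covered earlier. As an illustrative sketch (the scan type, port count and target here are arbitrary picks for the example, not taken from the original text):

Usage syntax: nmap [scan type] --top-ports [number] [target]
# nmap -sS --top-ports 50 10.10.1.44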

Operating System and Service Detection


One of Nmap's features is its ability to detect operating systems and services on remote systems. This feature analyzes responses from scanned targets and attempts to identify the host's operating system and installed services.
Table 5 shows some of the options you need to perform operating system and service detection.




Use these options in the same way as the other options; for example:

Operating System Detection
The -O parameter enables Nmap's operating system detection feature.

Usage syntax: nmap -O [target]
# nmap -O 10.10.1.48

Attempt to Guess an Unknown Operating System
If Nmap is unable to accurately identify the OS, you can force it to guess by using the --osscan-guess option.

Usage syntax: nmap -O --osscan-guess [target]
# nmap -O --osscan-guess 10.10.1.11

Evading Firewalls
Firewalls and IDS are designed to prevent tools like Nmap from gathering information. Nmap includes a number of features designed to circumvent these defenses (Table 6). We quickly show how to use these options.

Fragment Packets
The -f option is used to fragment probes into 8-byte packets.

Usage syntax: nmap -f [target]
# nmap -f 10.10.1.48

Specify a Specific MTU

Usage syntax: nmap --mtu [number] [target]
# nmap --mtu 16 10.10.1.48

In the above example, the --mtu 16 argument instructs Nmap to use tiny 16-byte packets for the scan.

Use a Decoy

Usage syntax: nmap -D [decoy1,decoy2,etc|RND:number] [target]
# nmap -D RND:10 10.10.1.48

In the above example nmap -D RND:10 instructs Nmap to generate 10 random decoys.

Idle Zombie Scan

Usage syntax: nmap -sI [zombie host] [target]
# nmap -sI 10.10.1.41 10.10.1.252

In this example 10.10.1.41 is the zombie and 10.10.1.252 is the target system.

To specify the source port manually:

Usage syntax: nmap --source-port [port] [target]
# nmap --source-port 53 scanme.insecure.org

Append Random Data

Usage syntax: nmap --data-length [number] [target]
# nmap --data-length 25 10.10.1.252

In the above example 25 additional bytes are added to all packets sent to the target.

Randomize Target Scan Order

Usage syntax: nmap --randomize-hosts [targets]
$ nmap --randomize-hosts 10.10.1.100-254

Spoof MAC Address

Usage syntax: nmap --spoof-mac [vendor|MAC|0] [target]
# nmap -sT -PN --spoof-mac 0 192.168.1.1

The --spoof-mac option has the parameters listed in Table 7.

Send Bad Checksums

Usage syntax: nmap --badsum [target]
# nmap --badsum 10.10.1.41

Table 6.
Feature    Option
Fragment Packets    -f
Specify a Specific MTU    --mtu
Use a Decoy    -D
Idle Zombie Scan    -sI
Specify the source port manually    --source-port
Append Random Data    --data-length
Randomize Target Scan Order    --randomize-hosts
Spoof MAC Address    --spoof-mac
Send Bad Checksums    --badsum

Table 7.
Argument    Function
0 (zero)    Generates a random MAC address
Specific MAC Address    Uses the specified MAC address
Vendor Name    Generates a MAC address from the specified vendor (such as Apple, Dell, 3Com, etc.)

Table 8.
Feature    Option
Save Output to a Text File    -oN
Save Output to an XML File    -oX
Grepable Output    -oG
Output All Supported File Types    -oA
133t Output    -oS
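The firewall-evasion options in Table 6 can also be combined in a single command. The following is only an illustration put together for this article; the decoy count, source port and target address are placeholder values:

Usage syntax: nmap [scan type] [evasion options] [target]
# nmap -sS -f -D RND:5 --source-port 53 10.10.1.48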




Only a poorly configured system would respond to a packet with a bad checksum.

Output Options
Nmap offers some options for generating formatted output. You can save your output as a text file, an XML file or a single-line grepable file. Table 8 shows some of the options you need to generate your desired output. All of these features are used in the same way, so we just give one example.

Save Output to a Text File
To save output as a text file we use the -oN option.

Usage syntax: nmap -oN [scan.txt] [target]
$ nmap -oN scan.txt 10.10.1.1

The other output features work alike, but with the Output All Supported File Types option you don't need to specify extensions. This option is used as follows:

Usage syntax: nmap -oA [filename] [target]
$ nmap -oA scans 10.10.1.1

Another option is 133t Output; its output is just for fun. Below you can see an example of this option.

Usage syntax: nmap -oS [scan.txt] [target]
$ nmap -oS scan.txt 10.10.1.1

$ cat scan.txt
StaRtING NMap 5.00
( htTp://nmap.oRg )
aT 2009-08-13 15:45 CDT
!nt3r3St|ng pOrts 0n 10.10.1.1:
n0t $h0wn: 998 cl0$3d p0rt$
P0RT $TATE seRV!CE
80/tcp Open hTtp
443/tcp 0pen https
Nmap DOnE: 1 Ip addresz (1 host up)
$canned iN 0.48 $3c0nds

Remote scanning
Nmap has a version that runs online, so you can scan your target remotely. Visit http://nmap-online.com/, enter the IP address to scan, select your scan type and then click the Scan Now button; the scanning results will be displayed shortly afterwards.

References:
• http://en.wikipedia.org/wiki/Port_scanner
• http://en.wikipedia.org/wiki/Vulnerability_scanner
• http://nmap.org
• Nmap® Cookbook – The fat-free guide to network scanning

MOHSEN MOSTAFA JOKAR



Basics

Data Centers
– Security First!
Data centers must be secure in order to provide a safe environment for the running enterprise to achieve maximum productivity, protecting profitability, productivity and reputation. What would happen if a data center had an outage or security breach that disrupted operations, access, and services? We expect data centers to deliver timely, secure and trusted information throughout the consuming organizations.

Their complex physical and virtual structures face ever-evolving security demands, and large companies need to build secure, dynamic information infrastructures to accelerate their innovations. Administration and access control considerations are the main issues in secure data centers.

Administration – Do the right users have access to the right information? This is the main question that administrators deal with. Various solutions in this area are proposed by different companies. These solutions provide comprehensive identity management, access management, and user compliance auditing capabilities. Security solutions target business agility and reliability, operational efficiency and productivity for dynamic infrastructures. Comprehensive security solutions can provide these features:

• Facilitate compliance with security requirements and policies;
• Effectively administer mainframe security and improve productivity while reducing administration time, complexity, implementation efforts, and costs;
• Monitor and audit for threat incidents, audit configurations and resource usage to help detect and prevent security exposures and to report compliance;
• Leverage seamless log integration with a comprehensive enterprise-wide view and sophisticated analysis of audit and compliance efforts across other operating systems, applications and databases.

Access management solutions reduce costs, strengthen security, improve productivity and address compliance requirements:

• Reduce password-related help-desk costs by lowering the number of password reset calls;
• Strengthen security and meet regulations through stronger passwords and an open authentication device interface with a wide choice of strong authentication factors supported out of the box;
• Facilitate compliance with privacy and security regulations by leveraging centralized auditing and reporting capabilities;
• Improve productivity and simplify the end-user experience by automating sign-on and using a single password to access all applications;
• Enable comprehensive session management of kiosk or shared workstations to improve security and user productivity;
• Enhance security by reducing poor end-user password behavior.

Reference: IBM Data Center Security



Upcoming Backup & Data Recovery Events

Cloud Meets Big Data at EMC World 2011


2011 Conference Topics:
Backup, Recovery and Archiving
Big Data
Business Continuity and Disaster Recovery
Cloud Computing
Enterprise Applications
Federation
Mainframe Platforms
Momentum/Information Intelligence
Partner Solutions
Security and Compliance
Storage and IT Management
Technology Directions and Innovation
Tiered Storage and Automation
Virtualization

To learn more, visit: http://www.emcworld.com/

When? May 9-12, 2011


Where? The Venetian, Las Vegas

IT Roadmap Conference & Expo


This year, IT Roadmap’s agenda is more streamlined, with expanded topics and more of what you need to maximize your learning in just one
day, covering such high-priority IT topics as:

* Cloud Computing: Public, Private & Hybrid


* Modern Network & Infrastructure Management
* Unified Communications & Connected Enterprises
* Data Center Trends & Enterprise Storage Strategies
* Application Optimization & Management
* Evolving Security Threats, Data Protection & Identity Management
To learn more, visit: http://www.eiseverywhere.com/ehome/index.php?eventid=15913&tabid=20056

When? March 15, 2011


Where? Donald E. Stephens Convention Center, Chicago, Illinois

Cisco Live!
Cisco Live is Cisco’s annual marquee event and Cisco Australia and New Zealand’s largest annual forum that will offer three distinct event
programs (Networkers, IT Management and Service Provider) with the perfect mix of high-level visionary insights and deep-dive technical
education.
Learn more: http://www.ciscolive.com

When? March 29 – April 1, 2011


Where? Melbourne, Australia
Storage

Building a Flexible
and Reliable Storage
on Linux
There are many storage solutions which can meet current storage needs, the main requirements used in choosing one being the level of reliability, complexity, cost and flexibility of the solution.

In a dynamic world, surrounded by all kinds of technologies, the persistence of information is seen as a critical requirement, since it is on top of it that knowledge and business are modeled. With that, its availability and reliability become important too, since the major asset of modern institutions is information and access to it.
These requirements are met by keeping the information on persistent storage devices, with enough reading and writing performance and the possibility of fast and efficient recovery in case of failures. Technologies like RAID and LVM are useful tools to get a redundant, flexible and reliable storage scenario.

RAID
RAID (Redundant Array of Inexpensive Disks) is a storage model that allows you to achieve high storage capacity with reliability through redundancy, achieved by using multiple storage devices.
For the purposes of this paper, the use of RAID-5, for which the minimum number of storage devices required is three, is shown to be the most appropriate, because with it one gets a balance between:

• I/O performance;
• reliability and redundancy;
• recovery capacity in case of failures.

There are different types of RAID, each of which stores the information in a different way and has different modes to keep and read/write the data.
The performance improvement in the read and write operations is achieved because these procedures are distributed among the members of the array of storage devices.
Reliability and redundancy are achieved through the use of parity bits (see Figure 1), which are stored in a distributed way among all the devices of the array.
The parity is responsible for the data fault tolerance. These bits ensure that, in the case of the failure of a single disk, the information isn't lost. The information is lost only in the case where more than one disk fails.
The possibility of data recovery in the case of a disk failure is related to the fact that, without one array member, the information that was stored on it can be rebuilt by reading all the sectors in the same regions of the remaining disks and the parity bits. With that, the contents of the original sector can be computed based on this information.
Take Figure 2 as an example.
Suppose the C drive fails. No information is lost; however, the data which was stored on the failed drive must be computed on the fly.

Figure 1. Figure 2.




This is the reason the performance is degraded when a drive fails.
Considering the first sector, the data in A1 and B1 will be compared with Dp. This is a logical comparison called XOR (exclusive OR) on the binary values of the data blocks. The attraction of this type of comparison is that if a value is missing (a drive fails), it can be recovered by doing an XOR comparison of the remaining values.
Suppose that A1=0110, B1=1010 and C1=1100, and consider Figure 3, the truth table of XOR. The Dp value, the parity block of sector 1, is calculated as follows:

Dp == D1 == (A1 XOR B1) XOR C1 ==
== (0110 XOR 1010) XOR 1100 == 1100 XOR 1100 == 0000

When the C drive fails, its value can be recovered as follows:

C1 = (A1 XOR B1) XOR Dp == (0110 XOR 1010) XOR 0000 ==
== 1100 XOR 0000 == 1100

Figure 3.

This procedure is used both for reading the information when a drive fails and for the recovery process, when a replacement is allocated.
It's noteworthy that, using RAID-5 to build a block device, there is a loss of storage capacity: 1/n of the total capacity is reserved to store parity bits, with 'n' being the total number of disks.

Creating a MD – Multiple Devices
There is the possibility to build a block device in RAID-5 by hardware or by software.
Here it will be done by software, using the mdadm tool [1], assuming a scenario with 3 SATA hard disks of 2 TB capacity each (/dev/sda, /dev/sdb and /dev/sdc). If done by hardware, the performance would be better, but the server motherboard would need to support this kind of operation.
To build the MD block device it is necessary to first create a partition of the type "Linux raid autodetect", type "fd", on the disks involved in the scheme, which can be done using the fdisk command [2].
Using mdadm, the creation of the RAID-5 array can be done with the following command:

# mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

That way, the block device /dev/md0 will be created in RAID-5, with 4 TB of usable capacity, using 3 devices, with the equivalent of one device reserved for parity bits which, as said earlier, won't be stored on only one drive, but across all member drives of the array.
To check the newly created MD status, one can use mdstat in the proc filesystem:

# cat /proc/mdstat

Or even with mdadm, with the option --detail, passing as parameter the path of the block device (/dev/md0):

# mdadm --detail /dev/md0

If the RAID-5 setup succeeded, it should be active and with clean status, with 3 associated devices.

Simulating a fault
It's possible to simulate a failure in one of the drives, in order to get confirmation that the information is still available even after a failure. This scenario can be obtained through hardware, turning off the server and unplugging the disk, or through software, using mdadm:

# mdadm /dev/md0 -f /dev/sda1

Next, just use one of the commands mentioned before to check the status of the block device, which should now show a fault indication on one of the devices and the array status changed to clean and degraded.

State : clean, degraded

After that fault simulation, the array must be adjusted, which goes from removing the failed disk:

# mdadm /dev/md0 -r /dev/sda1

to the addition of its replacement:

# mdadm /dev/md0 -a /dev/sda1

With that, it is necessary to wait for the array reconstruction, since its status at the moment indicates clean, still degraded, but recovering, distributing the parity bits again among the array members.

State : clean, degraded, recovering

LVM
LVM (Logical Volume Management) is a way to manage server storage devices, allowing you to create, remove and resize partitions according to the needs of the applications.
Logically it is organized as shown in Figure 4.
Volume Groups consist of Physical Volumes, where the latter can be partitions, disks or files. And it is in the Volume Groups that the Logical Volumes which will be used by the system and/or applications are created. They necessarily need to be mounted on the system to be available.
Some situations where LVM is useful:

• Appropriate resizing of partitions, when necessary;
• Creation of backups through snapshots, that is, an exact copy of a Logical Volume at a certain point in time (a short example follows this list);
• Management of large amounts of storage, since disks can easily be added or removed.
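A minimal sketch of the snapshot use case mentioned in the list above, assuming a volume group and logical volume that already exist; the names and the 10 GB snapshot size are placeholders, not part of the setup described in this article:

# lvcreate -s -n mydata-snap -L 10g /dev/myvg/mydata
# mount -o ro /dev/myvg/mydata-snap /mnt/snap
(back up the mounted, frozen copy, then)
# umount /mnt/snap
# lvremove /dev/myvg/mydata-snap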




This flexibility in operations that would be hard to achieve when dealing with the physical device directly is the great advantage of LVM.

Creating Physical Volumes and Volume Groups
With the block device built earlier using the disk array, it is now possible to use it as a physical volume. For that, there is the lvm tool [3], and the creation can be done with the following command:

# pvcreate /dev/md0

To check the attributes of the physical volume, just use pvdisplay, which will show some information about it, like the device name, the physical volume size, the space allocated and the space free.
Now, as represented in Figure 4, one can create a Volume Group:

# vgcreate volume-files /dev/md0

The success of the operation above can be checked through the output of the command below:

# vgscan

Reading all physical volumes. This may take a while...
Found volume group "volume-files" using metadata type lvm2

Now, with the Volume Group created, it is possible to create and work with Logical Volumes.
If you want to create a storage pool to keep the e-mail files of the organization:

# lvcreate -n mailserver --size 800g volume-files

# lvdisplay

--- Logical volume ---
LV Name /dev/volume-files/mailserver
VG Name volume-files
LV Status available
LV Size 800 GB

After that, a logical volume with 800 GB capacity was created, and it is ready to be formatted with one of the available file system formats [4] and made available through a mount point [5].

Extending a Logical Volume
Depending on how the use of the storage server by the applications spread across the local network evolves, it's possible to adjust the size of the partitions (logical volumes) according to the needs of the context in question.
Besides the common questions which are taken into account when choosing the file system to use (performance, space allocated, read latency, write latency, latency of creating and removing large and small files, etc.), which must be guided by the type of application that will use it, attention must also be paid to the possibility of extending it when using LVM, checking whether the file system has support for this operation.
A Logical Volume can have its size reduced so that another can have its size extended or, if there is still some free space in the physical volume, the logical volume can simply extend its size. For that, it is necessary to unmount the file system first.
Then, the expansion operation must be done, passing as parameter how much space you want to add to the logical volume in question:

# lvextend -L +100g /dev/volume-files/mailserver

A check on the status of the logical volume (lvdisplay) is advised, as is a check of the file system consistency after the resize.
In the case where reiserfs was chosen as the file system for the logical volume in question, because it supports expansion, the check and resizing of the file system are performed with:

# reiserfsck --check /dev/volume-files/mailserver
# resize_reiserfs /dev/volume-files/mailserver

Finally, just remount the file system on the chosen path and check with the df command [6] whether the resizing worked in fact (as confirmed earlier with lvdisplay).

Overview
Thinking about network infrastructure, the future of storage is open to new ways of organizing it, always trying to get greater storage capacity and more efficient reads and writes. The combination of more than one type of RAID, called Nested RAID or Hybrid RAID, and Cloud Computing are some possibilities to be explored and show up as promising trends in this area.

Website:
• [1] – http://linux.die.net/man/8/mdadm
• [2] – http://linux.die.net/man/8/fdisk
• [3] – http://linux.die.net/man/8/lvm
• [4] – http://linux.die.net/man/8/mkfs
• [5] – http://linux.die.net/man/8/mount
• [6] – http://unixhelp.ed.ac.uk/CGI/man-cgi?df

Author
Eduardo Ramos dos Santos Júnior

Figure 4.



STORAGE

Understanding
Linux Filesystem
for Better Storage Management
If you had a large warehouse, how would you organize it? Would you
want everything inside it to be scattered around? Or would you have
an arrangement pattern in mind? If so, what kind of arrangement
pattern? Would you group things based on size or similar? Of course,
the way a room is organized will determine the room’s first and final
impression to clients or customers for example. The benefit to having
an organized space is that it becomes much easier to locate items.

The same idea goes for storage, or to be precise, physical discs. What I am referring to here are the regular IDE/ATA/SATA hard disk types, SSD, flash and so on (a note about SSD or flash based discs: most filesystems are designed with the idea of rotating discs in mind and consequently do not take their limited rewrite capacity into consideration).
Today, we are seeing disks with bigger capacities and faster data transfer rates; at the same time, we are seeing smaller dimensions and lighter disks (usually).
Most importantly, disks are becoming less expensive which we are always thankful for! The question becomes, however: How do we manage our storage effectively?
Here is where the file system comes in. A File System (filesystem) is a layout structure that, through its properties, tells us how the data will be stored on a disk. It utilizes a "driver" of sorts, that performs management tasks.
One of the properties usually assigned are the Linux access permissions which tell us, for example, which users may access certain data and which users cannot.
What choices do we have for filesystem types today? With any major Linux distribution serving as a reference, here are some of the options given to us:

ext2 / ext3
ext4
reiserfs (version 3 and 4)
XFS
btrfs

Basic understanding
To really understand filesystems, we need to define a few terms.

What is a super-block?
A superblock is the meta-data describing the properties of an entire file system. Some of the properties you can find here are:

block size
total inodes
total blocks
number of free blocks
number of free inodes
last mount and last write time

To get this information, you can use tools like dumpe2fs (for ext2/3/4) or debugreiserfs (for reiserfs). For example:

$ sudo dumpe2fs /dev/sda1
...
Filesystem features: has_journal ext_attr resize_inode
dir_index filetype needs_recovery
sparse_super large_file
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 6094848
Block count: 12177262
Reserved block count: 608863
Free blocks: 2790554
Free inodes: 6090656
First block: 0
Block size: 4096
Fragment size: 4096 ...

A superblock in one part of a filesystem has multiple copies of itself propagating throughout the entire filesystem. This is intended as a backup in case file corruption occurs. In the case of ext2 or ext3, you can use dumpe2fs to find their locations:

$ sudo dumpe2fs /dev/sda2 | grep -i superblock
dumpe2fs 1.39 (29-May-2006)



Primary superblock at 0, Group descriptors at 1-3
Backup superblock at 32768, Group descriptors at 32769-32771
Backup superblock at 98304, Group descriptors at 98305-98307
Backup superblock at 163840, Group descriptors at 163841-163843
Backup superblock at 229376, Group descriptors at 229377-229379
Backup superblock at 294912, Group descriptors at 294913-294915
Backup superblock at 819200, Group descriptors at 819201-819203
Backup superblock at 884736, Group descriptors at 884737-884739
Backup superblock at 1605632, Group descriptors at 1605633-1605635
Backup superblock at 2654208, Group descriptors at 2654209-2654211
Backup superblock at 4096000, Group descriptors at 4096001-4096003
Backup superblock at 7962624, Group descriptors at 7962625-7962627
Backup superblock at 11239424, Group descriptors at 11239425-11239427

As you can see, we have twelve backups. Please note that there is absolutely no guarantee that they are synchronized with the latest update of the main superblock; therefore we need to use the information gleaned above as an argument to fsck.ext3:

# fsck.ext3 -b 32768 /dev/sda2

This is helpful if your filesystem refuses to mount, for example. In this situation, the fsck utility would at least restore the main superblock, thus making the whole filesystem somewhat readable. Of course, expect a few missing pieces of data here and there.

What is an inode?
Similar to the super-block describing the whole filesystem, an inode describes a file's properties (and this includes directory information as well, as a directory is just another form of a file). You can find out partially what's inside an inode by running the "stat" command:

$ stat /bin/ls
File: `/bin/ls'
Size: 95116 Blocks: 208 IO Block: 4096 regular file
Device: 803h/2051d Inode: 2117545 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-02-13 00:02:31.000000000 +0700
Modify: 2010-03-01 05:33:21.000000000 +0700
Change: 2010-10-29 04:09:52.000000000 +0700

These days, inodes not only contain metadata but sometimes the file's content itself, so they do not become separated into exclusive data blocks. With the advent of the 256-byte inode as the new standard, we shall see such consolidations more and more. After all, isn't allocating a single 512-byte block simply to store 100 bytes of characters a complete waste?
With inodes, the three most important descriptors present in them are as follows:

• access time (atime): the last time the file's content or its metadata has been accessed. This means the more times that you read a file, the more times this atime flag will be updated.
• Change time (ctime): timestamp of the last change to a file's metadata.
• Modification time (mtime): timestamp of the last change to a file's content.
(Since change and modification are quite synonymous, it's not surprising if you don't understand the difference just by looking at the names).

What is a block?
A block is the smallest storage unit of a file system. You can think of it like the smallest brick that makes up a large building. Don't confuse it with the sector, which is the smallest storage unit of a physical device (e.g. your hard drive). This means that physically, your disc addresses the data in terms of sector size, while the filesystem uses block size.
Today, most storage utilizes a 512-byte logical sector size, while file systems usually use a 4 kb (kilobyte) block size. What do we mean when we say "logical" sector size? Physically, the sector size is 4 kilobytes. It used to be 512 bytes, but such a size proved inadequate to reach the larger and larger capacities that we are seeing today. Recall that our definition of a filesystem included a driver to perform management tasks. This internal controller performs a 512 byte-to-4 kilobyte conversion, so it is therefore transparent to the block device driver which still might utilize a 512 byte unit size, for example.
Storage devices, as we mentioned, have started to adopt this new 4 kilobyte sector size and operating systems have started adapting in turn [2]. There are still problems, however, due to issues with addressing alignment. At the time of this writing, kernel developers are still working on addressing this issue. In the near future, we should be able to fully utilize the 4 kilobyte-size devices to their fullest capabilities.
Block size indirectly affects the percentage of storage utilization. Why? Because the bigger the block size is, the bigger the chance is that you will introduce less occupied space. If you store 5 KiB of data on a file system that uses 4 KiB blocks, then it will occupy 2 blocks. You can't cut the 2nd block in half. It's all or nothing. This wasted area is called slack space. So theoretically, if the average of your file size is 3 KiB while the block size is 4 KiB, then roughly 25% of your disk space is wasted.
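If you want to observe slack space for yourself, one simple way is to create a small file and compare its logical size with the space allocated to it; the file name below is just an example:

$ dd if=/dev/zero of=slack-demo bs=1K count=5
$ stat -c "%s bytes, %b blocks of %B bytes" slack-demo

stat reports the allocation in %B-byte units (usually 512), so on a filesystem with 4 KiB blocks a 5 KiB file will typically show 16 such units, that is, 8 KiB allocated for 5 KiB of data; the difference is the slack space.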




Again, the stat command is useful to find out how many blocks a file will occupy. For example:

$ stat -c "%b %B" /bin/ls
208 512

This tells us that /bin/ls takes 208 blocks. If we multiply that by 512 bytes (the block size), we get 106496 bytes = 104 kilobytes, while in fact the file size is 95116 bytes (equal to 93 kilobytes). Please read "man stat" for further information on the valid format sequences you can use to get various information regarding a file or filesystem.
Fortunately, a technique called "tail packing" was recently introduced to overcome this situation. The kernel can insert data of another file (or files) into these slack spaces. In other words, a single block could host the content of more than one file. A few filesystems already support this: btrfs and reiserfs (version 3 and 4). Touching on the BSD systems, the UFS2 filesystem also does tail packing.

What is a journal?
A journal is a log of filesystem operations which change filesystem content and/or structure. It's a critical part of a journalling file system. Ext3 (and 4) and reiserfs are some examples of widely known journalling types and are the primary choices of many system administrators. With a journalling file system, you have a way to track down which operation has been completed and which one has not (or which one is broken). One of the benefits becomes evident when a machine crashes and needs to be restarted. Without journalling, you're left with one option: check the entire file system for possibly invalid or inconsistent inodes or superblocks, which could take hours when you are dealing with data sizes on the order of hundreds of gigabytes or more. With journalling, the filesystem checker simply checks the log and quickly spots any incorrect operations. So for example, if the journal said "let's write 100 bytes of data" but there is no "write complete", it knows that the write operation was somehow interrupted or broken. The check operation will redo this operation, complete it, and will fix the fault.
Also, a journal does not need to be saved in the same place as the filesystem itself. Take the ext3 filesystem for example. In the following example, we convert an ext2 filesystem into ext3 and at the same time define where to put the journal:

# tune2fs -j -J device=/dev/sdb1 /dev/sda1
/dev/sdb1 will effectively be the host of the journal of /dev/sda1.

The journal device itself must be prepared beforehand using this command:

# mke2fs -O journal_dev /dev/sdb1
note: please read "man tune2fs" for further notes regarding the journal device.

Journalling itself does two things:

• metadata journalling,
• data journalling.

As the names imply, you could guess that they log the file's metadata and its content respectively. In a journalling file system, a metadata change is always logged before being committed to disc. As for the data, it could be logged or not logged at all.
The benefit of logging both is that reliability is increased. The consequence is that throughput is reduced due to the extra I/O operations to the journal (data is written twice: to the log and to the disc blocks). Journalling metadata only cuts the I/O fairly significantly, but in the case of a disc catastrophe, you might end up with garbage or lost data.
The compromise is to do ordering, like the ext3 ordered method does. Quoting from the mount manual page: "ordered. This is the default mode. All data is forced directly out to the main file system prior to its metadata being committed to the journal."
Thus logged metadata in a journal acts like a barrier. If you see the metadata update in the log, then we are assured that the data is already written. If you don't see one, then either nothing happened or the update does not reflect the current metadata state. All in all, at least the file is still consistent.

What is a sparse file?
Ever saw a file which has the properties listed below?

$ ls -lsh test.bin
12K -rw-rw-r-- 1 mulyadi mulyadi 1.1M Feb 18 01:52 test.bin

The above shows 1.1 megabytes in size, only taking up 12K physically! Why is this? To avoid confusion, that file was created with this command:

$ dd if=/dev/zero of=./test.bin bs=1K seek=1K count=1

Technically, this means when I created the file, I jumped straight to as much as one megabyte (1 kilo (bs=block size) times 1 kilo (seek length)). At that offset, I wrote one kilobyte (1 kilo (bs) times one (count)). During the jump, I wrote nothing. Only after I "landed" does it write something to disc.
The result? A sparse file. A file that has holes inside. Just like a net the fisherman uses: from a distance you see it as a true solid wide piece, while in fact it has quite tiny holes here and there.
Why do most filesystems provide this ability? The answer is to save space. Instead of having a 1 megabyte file with 999 kilobytes of blanks (zero content, not space or similar characters), we could just mark them as a hole and just use the last 1 kilobyte of space to store meaningful data. Logically, the application still has to seek to the edge to get the data. It is a nice way to represent a big file without having to allocate all the blocks at once initially.

UUID and Labels
A device name such as /dev/sda1 is the de facto way to refer to a partition or certain pseudo device. But this is not the only way. We can also use a label or a UUID. A label can be thought of as a room being named "meeting room" or "guest room", while the UUID is like a social security number for a citizen. A label could be any string meaningful to you, while the UUID is a unique number generated by the filesystem formatter. A UUID is guaranteed to be unique across the system's existence, similar to how MAC addresses of Ethernet cards are unique. Bear in mind that both are properties of a filesystem, so they only exist after a formatting session.
Why use a label or a UUID? Device names are not really persistent. For example, if you swap a primary master IDE disk to secondary master, it would be known as hdc now instead of hda (before the swap).




In this case, every operation that points to it must also be adjusted. Truly a headache for large system management.
To solve that, label the filesystem and point to the label. First name the filesystem of hda1 as "/data":

# e2label /dev/hda1 "/data"

Then mount it:

LABEL=/data /data ext2 defaults 0 0

Now try to reattach the disc to another channel. Linux can still mount it because all it cares about now is that there is a partition labelled "/data".
For a UUID, you simply grab it from a filesystem specific tool such as dumpe2fs:

$ sudo dumpe2fs /dev/sda7 | grep -i uuid
dumpe2fs 1.39 (29-May-2006)
Filesystem UUID: 38bf953a-3e12-429b-867a-833966567793

Below is a snippet of grub.cfg that refers to a root filesystem by using a UUID:

linux /boot/vmlinuz-2.6.32-28-generic root=UUID=38bf953a-3e12-429b-867a-833966567793 ro quiet splash

Discussing various filesystems
Now let's discuss some characteristics of different filesystems:

ext2 (second extended file system)
This was previously the most popular filesystem. According to Wikipedia (http://en.wikipedia.org/wiki/Ext2), the ext2 development was done to address the weaknesses of the original ext file system, which lacked separate access controls.
Today, and compared to other filesystems, ext2 is the simplest one. Even so, it might be the fastest one due to its simplicity. The downside of this filesystem is that a power failure, kernel bug, or other disturbance can lead to inconsistent or even damaged data. Thus it is something to carefully consider as, these days, reliability is just as important as throughput.

ext3
A journalling version of ext2. It was introduced to enhance ext2 and has the ability to do online growth. When it was introduced in 2001, online growth was still a long-awaited feature in extended filesystem types.
Not long after its introduction, many Linux distributions adopted it as the default filesystem. This pushed ext3 further as one of the most widely used filesystems. Most people like it due to the fact that you can easily migrate an ext2 partition to ext3 in-place without losing any data. And ext3 too can be mounted as ext2 and everything will still work seamlessly. In this mode, it is only the journalling features that are not used.

ext4
Still unsatisfied with ext3, Linux kernel developers moved forward and created ext4. Unlike ext3, which is backward compatible with ext2, ext4 is not compatible with ext3/2 due to several new internal designs. This means that ext4 cannot be mounted as ext3. On the other hand, ext2 and ext3 filesystems can be mounted as ext4 (http://en.wikipedia.org/wiki/Ext4).
The most notable features of ext4 are:

extent-based allocation
persistent pre-allocation
delayed allocation

By using extents, ext4 will allocate a fair amount of physically contiguous blocks. The old method, block mapping, does that block by block. When dealing with large file sizes, extents are superior since blocks are already contiguous, so writing can be done in one sweep.
The information kept about the file's data is also simpler, since an extent represents more than one block. Where it used to take 128 records to represent the whole file mapping, it might now be reduced down to 2 (two) or 4 (four) records only.
Persistent pre-allocation is a way to reserve blocks on a disc before data is actually stored. This was once done by writing many zeros to disc blocks. Now this is fully managed by the Linux kernel and this behaviour cannot normally be changed by a user. In ext4, via the fallocate() system call, a developer can decide how many blocks to reserve at the time it is really wanted. In general, pre-allocation helps a user get contiguous blocks, which means faster data access. Extents and pre-allocation are good allies here.
Delayed allocation works by delaying write access until the very last moment possible. By doing so, merge operations are expected to happen more often. The final effect is fewer physical accesses and reduced head seeks.

XFS
Created by SGI, it was an advanced file system at the time it was invented around 1993. It was originally the filesystem for the Irix systems but was then ported to Linux circa 2001.
Besides journalling and many of the other features I already covered in the extended file systems, it has additional features such as:

• Direct I/O
• Guaranteed rate I/O




• Snapshots
• Online defragmentation

Direct I/O is a feature where the I/O will bypass most of the filesystem code path and go straight to the block device. Some applications need this to enhance speed.

Mount flags
You can alter the way a filesystem behaves by passing certain flags. You do this by passing the -o parameter to the mount command:

# mount -o remount,noatime /dev/sda2 /opt

Please notice the remount flag. It is a way to change a filesystem property without doing an unmount, so in-flight I/O operations aren't interrupted.
Some flags that you can pass during mount are:

• noatime: don't update access time. Every time you access a file or directory, its atime field is updated. You can imagine they are being "time-stamped" every time you read or write them.

On one hand, this can be valuable as you know exactly when the last time was that you or somebody interacted with this piece of data. The downside is that this adds a fair amount of I/O overhead. Under certain circumstances, however, the additional I/O overhead might not be good for your server.
Noatime comes to the rescue. It simply turns it off, period. Mind you that this is done on a per file system basis: you can't choose which file is noatime-ed and which one is not. Therefore, the partition layout becomes quite important.

• relatime: if the access time is updated after the last modification time, then it will not be updated again until the next modification.

It's a moderate choice if you still need an access time record but don't want to be burdened with too many inode updates. You might need this field for occasions where a mail client relies on access time to determine whether new e-mail has arrived or not.
Fortunately, the latest distro versions already use this flag when mounting all of the filesystems (except swap, of course).

• nosuid: suid (executables marked to be run on behalf of their owner) is effectively disabled. A quick way to lessen system compromise due to improper permission assignment.

As we all know, a root-owned suid binary means it runs as root without the need for the usual privilege escalation first (such as sudo or su). As an example, the passwd binary has to allow some administrative delegation to let a user change his/her own password.
However, if the binary is not properly audited and has a security hole (such as a buffer overflow), a well-crafted attack will "promote" one to root very easily.

• noexec: forbid program execution. It's probably a good idea to put this flag on a data-only partition, such as the one that acts as a cache area for the caching proxy application Squid, or on log folders. Normally we would not expect to find any executables in there, would we?

• sync: do write operations synchronously. In other words, suppose there are three (3) write-operations queued to three (3) different disc sectors. In this situation, the subsequent write-operation will only be allowed to start if the previous operation has already been completed and the data has already been written. It's important to remember that a completed operation doesn't necessarily imply the data has already been written, since by default a partition is mounted in async mode (the opposite of sync).

Best practice examples
Now it's time to combine all of the things we have talked about and design a partition structure with a corresponding filesystem. First, a couple of basic rules:

• the /boot partition should be isolated into a separate partition. If possible, make it the primary partition to deal with any potential legacy boot loaders.

As for the filesystem, a simple one like ext2 is enough. We don't really need a journalling option as most of the files are read-only once the boot process begins. What if corruption occurs? Well, restoring from a backup or reinstalling from the package manager should fix the situation quickly. Another benefit of using ext2: almost all known bootloaders support it very reliably.

• the /tmp partition should be of a tmpfs type. The contents are very likely unimportant and safely wiped out upon reboot. This enhances overall system performance since the files are stored in volatile RAM and it is much faster than accessing a physical disc. Most Linux distributions already do that for you. Check if the content of your /etc/fstab contains the following line:

tmpfs /dev/shm tmpfs defaults 0 0

• Whenever possible, use an extra disc and put the swap partition on it. By doing this, paging and file I/O operations can happen concurrently. Of course, paging will still slow down system performance, but this way, paging and I/O operations will go into their own data channels.

Here is the filesystem suggestion for a webserver:
Prepare a separate partition for /var/www as this is the location most distros choose to place their web content. Mark it as noexec if you are absolutely sure you don't need to have anything other than static HTML in there. For scripts like PHP, I suggest creating another partition and setting it as nosuid so that no scripts are accidentally run as root.
A filesystem such as ext3 is a good choice here. It's a compromise between simplicity and data safety.

• Database server:

Choose a journalling filesystem whenever possible. Specifically, pick one that uses delayed write (such as XFS) and/or extent based (such as ext4) operations. Delayed write, aside from the benefits listed before, would route subsequent read operations during a high hit rate to the page cache. My personal experience shows that recently written data is also likely data that is soon to be read. Therefore, holding it for a while in the memory cache is a good idea.




As for extent-based filesystems, they might help reduce fragmentation. Tables that hold big entries will really benefit from extent-based filesystems. Using both delayed-write and extent-based filesystems would enhance read and write capabilities.
Note that some databases such as Oracle perform direct I/O. This means any virtual file system and filesystem specific functions would be bypassed and data will go directly to the underlying block device. In this case, no matter which filesystem you choose, the performance would be fairly the same. In this scenario, using a striping RAID setup will enhance speed.

• Web proxy/cache

What I am referring to here is a machine that does any kind of caching, be it http proxying, ftp proxying and so on. Also, this applies to forward or reverse proxies.
One important characteristic of these workloads is that the content is safely thrashable at any time. To elaborate more, you don't really need any kind of data safeguard, since you are simply dealing with a cache.
In this situation, what you need to focus on is I/O throughput, while at the same time lowering I/O latency. Reiserfs was once a good choice. When dealing with small-to-medium sized files (roughly under 1 MiB) it was quite good. Scenarios where files are largely distributed over a stack of directories are mostly the area where reiserfs shines (it's too bad reiserfs is no longer really under development these days).
As an alternative though, you can try and test XFS. Although it's still in the journalling filesystem category, its capability to handle massive I/O operations is impressive.

• File Server

A setup that falls under this category is anything that serves files using any kind of protocol: ftp, rsync, torrent, http, revision control (for example git, subversion, mercurial, CVS), SMB/CIFS and so on. Usually file servers are used as private servers but they could also be public servers for the entire Internet, for example.
Whatever the case may be, be prepared for non-stop and massive concurrent read access. Or, as is the case for revision control systems, massive write or upload access. You would also require a filesystem that could help you perform backups.
To meet all these needs, your choice might be the ZFS or Btrfs filesystems. ZFS is a filesystem ported from Solaris which has proven to be reliable. It has a nice feature called deduplication (or "dedup" for short). Dedup-ing means blocks which have the exact same content are merged as one. As a best-case scenario, suppose you have a file "B" which is an exact copy of file "A". With ZFS, both are stored as one file and two separate metadata entries are created pointing to the occupied blocks. This equals a 50% savings on storage space!
Dedup is appealing because these days we are looking at more and more data which is actually duplicated over and over in some places.
Outside dedup, you can consider on-the-fly compression. Btrfs provides this feature along with snapshotting, online defragmentation and many others. Think of it as gzipping every file in the filesystem whenever you finish writing a file and gunzipping it before reading. Now it's the filesystem that does this for you. The current compression methods supported are lzo and zlib. Zlib is the one that is used by gzip.

About the Author
Mulyadi Santosa, 32 years old, Indonesian. As a non-permanent lecturer, he teaches computer science topics at several universities. He is also a freelance IT writer and a start-up owner that trains people to get to know more about Linux in various topics. He can be contacted at [email protected]. Visit his blog http://the-hydra.blogspot.com to discover his opinions on various topics.

Credits:
The author owes a lot to Greg Freemyer for his thorough review and feedback. This article would not have become any better had it not been for his voluntary assistance. Thanks Greg!

Reference:
• http://www.unix.com/tips-tutorials/20526-mtime-ctime-atime.html
• http://lwn.net/Articles/322777/
• Ext2 – http://en.wikipedia.org/wiki/Ext2
• Getting to know the Solaris file system, Part 1 – Sunworld – May 1999, http://www.solarisinternals.com/si/reading/sunworldonline/swol-05-1999/swol-05-filesystem.html



Storage

Companies have their


sights set on storage
With the high volumes of data companies generate, collect and need to store nowadays, storage is a priority focus area – and increasingly so – even for small businesses.

Companies are also starting to see storage differently. Rather than just simply being a place to copy files, storage has become the cornerstone of effective high availability and disaster recovery.
"All manner of companies are now moving towards centralised storage. Without centralised storage, you just don't get high availability or effective disaster recovery," says Herman van Heerden.
Van Heerden says there are other factors that contribute to the increasing focus and spend on storage – the recession being one of them.
"Focusing limited IT spend on storage just seems to make good business sense. Compared with planning for new servers to carry additional load, investing in NAS/SAN units, which simplify storage use, allocation and management, is a much more cost-effective way of keeping existing business servers running and allowing them to grow.
"Purchasing storage also gives companies a greater sense of value for money because it's physical hardware. Companies can see the value in something that's tangible. Software on the other hand, in the face of shrinking budgets, now has a question mark over whether or not it will really make a difference if it is implemented.
"Furthermore, with the advances in technology, even smaller businesses can have enterprise-size storage. The development of fast SATA and cheaper SAS technologies and the rapid growth of disk sizes allows for super fast and very cost effective storage units."
Another catalyst for increased storage spending, according to van Heerden, is server virtualisation.
"Virtual environments are simply not viable without a sound networked storage solution to allocate space for services across multiple virtual machines. As the adoption of virtualisation technologies is widening, so the need for network storage solutions is growing," he says.
He classifies network storage units into two categories: enterprise storage units that allow for high throughput, making them ideal for server virtualisation or carrying the heavy load of large databases or mail boxes, and desktop storage units, which run a little slower and are suited for data backup.
Typically, the higher the price tag, the more redundancy is built into the unit, allowing it to cope with hardware failures, network failures and power outages. He points out, though, that with smart planning, companies can overcome the limitations of smaller, non-redundant storage units.
"Desktop storage units can be, and are, used in the enterprise. With smart planning, for instance by adding more network cards, isolating network traffic to a single switch, and adding a number of units in an array, companies get the effect of enterprise storage but at a lower cost," says van Heerden.
"What you need is a smart storage solution that is ideal for adding storage to networks or servers in both physical and virtual environments," he concludes.

Herman van Heerden
Herman van Heerden is from NEWORDER INDUSTRIES (Pty) Limited, a South Africa-based technology company that specialises in enterprise risk management, virtualisation and storage solutions.



Security

Powering Data Centers


– Lessons Learned
Any decent security book will have a few chapters on physical
security and power considerations. When large-scale modern
data centers are being built, the security and power are properly
designed and built-in from the beginning. But often, small to mid-
size centers are retro-fitted into a pre-existing facility, and things are
sometimes overlooked.

AC Power is literally the life blood of any data center. Bad power, such as sags, spikes, noise, and surges, can cause systems to fail. And with no power, all of our servers are just very heavy, expensive chunks of metal. All critical systems should be on uninterruptible power supplies (UPS), and it would probably be a good idea for you to look up the difference between an online and a standby power UPS. You should also review voltage, amperes, and watts while you are at it. The need for clean power cannot be over emphasized. In December 2010, a Toshiba chip fabrication plant experienced a 0.07 second power outage, and it caused a 20% drop in shipments in the first few months of 2011 (http://online.wsj.com/article/SB10001424052748703766704576009071694055878.html).
While I could just rattle off a bunch of rules for data center security and power (install alarms, put all equipment on UPSs, etc.), I thought it would be more interesting to just share with you some war stories from my past 30 years of experience working, building, and managing data centers in universities, banks, and military installations. No matter how carefully one plans, mistakes, accidents and oversights happen. Names, dates, locations, and other identifying information have been changed to protect the innocent, the guilty, and the oblivious.
Experience is a tough teacher, it gives the test first and the lesson afterwards. – Vernon Law

Scenario: I sat in the newly finished data center, with the rack PC servers and Sun Blades performing their modeling and simulation tasks. For a variety of reasons, we had just finished moving the data center about 150 feet to the other side of the building. As I sat behind my desk, I suddenly realized how quiet the room was. There was no sound from the servers or the air conditioning unit. And it was pitch dark. Not a single LED could be seen in the very dark room. It turns out there was a local-area power outage, but since this facility was running on a very large UPS, with a generator, I was left literally in the dark as to what had gone wrong.

Lessons learned: It was simple – the UPS had been tested a few days earlier, during the final phases of the move, and was left in a "bypass mode" – handy if the UPS is the cause of the problem, unfortunate if the local power goes out. We would develop a checklist for the testing of the UPS, and would include the final step of "Switch UPS to ONLINE." We also needed to install power failure lights, and have flashlights handy. Since I had neither, I had to stumble in the dark to find my way out of the center, and get into the UPS room to get back online. It was only after I got to the locked UPS room that I realized that the keys for it were still in the darkened data center. We would also invest in a glow-in-the-dark keychain.
Good judgment comes from experience, and experience comes from bad judgment. – Barry LePatner

Scenario: Shortly after having the main UPS at a secure data center repaired, a wide-area power outage was scheduled for the installation of a new transformer from 3AM to 5AM. I decided to keep the facility open and make sure that the UPS would work, and that all our systems were properly protected. At 3AM, the power was cut, and with the exception of a wall clock (purposely plugged into a non-protected outlet), all the systems were still online. I moved the clock to a protected outlet, and started the process to secure the facility until the morning. When I called the security monitoring center, I was told that because of the wide-area power outage, the alarms wouldn't work, so I could not secure the facility, and would have to remain there until the power came back on.

Lessons learned: It is not enough to make sure that all your equipment is protected in a power outage situation; any external/remote services you are using must also be protected by redundant power systems. Also make sure that any communication links are powered as well.
Experience is that marvelous thing that enables you to recognize a mistake when you make it again. – Franklin P. Jones


Scenario: We were having problems with a server randomly rebooting. We tested the UPS, replaced the power supplies, and checked the motherboard and RAM, but usually every few days the server would simply reboot. One day, when returning from lunch, we saw one of our supervisors exiting the server room. When we walked in we saw the server booting up. We approached the supervisor and asked him about it. He told us that he had been having problems with his PC, and if we weren't around, he would simply reboot the server to "solve" his problem.

Lessons learned: We installed additional physical barriers between the supervisors and the power button. We normally ran the server racks without the glass doors, so we re-installed them, and locked them. We also re-ran the AC power cords to secure them. The keys for the cabinets were then properly controlled and the supervisor was not given access to them.
You cannot create experience. You must undergo it. – Albert Camus (1913–1960)

Scenario: We had an emergency power off switch (a big red button) on a wall of the data center. When it was designed, there was a nice clear space around the button, but as the data center grew, the space squeeze began. The area around the kill switch became the location for printer paper storage. Late one evening, as paper was being stacked, a box of paper was slid across the stack, and came in contact with the emergency power off button. The lights in the facility stayed on, but all the power to the servers went down. The night shift didn't know how to reset the power (it had never happened before), but was directed over the phone to the power room in the basement, which was locked. The data center director had to come in, contact building maintenance, sign out the keys, and get an electrician to come over and reset the UPS and bring the power back on.

Lessons learned: Be careful of big red buttons. More importantly, make sure that your crew is trained on emergency procedures, has access to the required locations, and knows who to contact. Simulated drills should be conducted.
Experience teaches slowly and at the cost of mistakes. – James A. Froude

Scenario: Back in the days of the huge glass CRTs, the manager of the center didn't want to plug the CRTs into the UPS since it would reduce the UPS run time. The center suffered a power outage, and while the servers remained running, we had no way to log into them and do a clean shutdown. We tried to plug the monitors into the UPS, only to find out there were no free outlets on them, and because of the wiring mess, it was impossible to determine what could be unplugged to plug in the monitors. Unplugging random cords would be like playing "Russian roulette" with the servers.

Lessons learned: The entire system, including monitors, should be on UPSs. Install UPS monitoring software for controlled shutdowns on low battery conditions. Don't use all of the outlets on a UPS; leave some empty for expansion. Label the power cords.
Experience is the name everyone gives to their mistakes. – Oscar Wilde
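To make that UPS-monitoring lesson a little more concrete, here is a minimal sketch using the open source Network UPS Tools (NUT) client. The UPS name "myups" and the one-minute grace period are illustrative assumptions, not details from the scenarios above, and in practice NUT's own upsmon daemon would do this job for you; the script only shows the logic of a controlled shutdown on low battery.

    #!/bin/sh
    # Illustrative low-battery check with NUT's upsc client.
    # "myups" must match a UPS defined in the local ups.conf.
    STATUS=$(upsc myups@localhost ups.status 2>/dev/null)

    case "$STATUS" in
      *OB*LB*)   # on battery AND battery low: shut down cleanly
        logger "UPS battery low - starting clean shutdown"
        /sbin/shutdown -h +1 "UPS battery low, shutting down"
        ;;
      *OB*)      # on battery only: log it and keep watching
        logger "UPS on battery - utility power failure in progress"
        ;;
    esac

Run from cron every minute or two, this is the kind of controlled shutdown that would have spared the CRT-era center described above the scramble for free outlets.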
Scenario: One of the administrators was a smoker, and didn't like having to leave the secured facility to smoke, so he would often just open a security door (it was not alarmed during the day) and stand outside, often walking around the corner of the building for a smoke break. He would prop open the door (since it could not be opened from the outside). One day, while he was out smoking, a construction worker doing some building maintenance needed an electrical outlet, saw the open door, walked in and plugged his high-powered device into a UPS outlet. When he started the equipment, it overloaded the UPS and knocked one of the servers offline.

Lessons learned: You can't prop secured doors open, not even long enough to have a smoke (or two). Aside from the obvious power problems here, it only takes seconds to install keystroke recorders, plug other devices into the network, or do other damage. Keep the doors closed.
Power corrupts. Absolute power is kind of neat. – John Lehman, Secretary of the Navy, 1981–1987

But the best kind of power is nicely conditioned and protected AC power flowing into your servers. So test your UPS, make sure devices are plugged into the proper outlets, make sure your power infrastructure is properly protected, and perform routine maintenance on it. Make sure your people are properly trained on basic security and know how to handle power-related emergencies.

About the Author


BJ Gleason has been teaching computer science and security classes for 25 years, and has been working in data centers since he first knew what they were. He first came to Korea in 1995 to teach with the University of Maryland, and is now also working for Group W Incorporated as a Systems and Security administrator. He has served as a consultant to Exxon, Unisys, Atari, and the Federal Reserve Board, among others. He holds an Educational Specialist (Ed.S) in Computers in Education, as well as degrees in Computer Science, Asian Studies, and Criminal Justice. Mr. Gleason has numerous computer certifications, including Microsoft MCSE, Cisco CCNA, SANS GCFA and CISSP. He is also a Certified Computer Examiner from the International Society of Forensic Computer Examiners.



Cloud Computing

Facebook Cloud Computing
SaaS, PaaS, and IaaS: Standards & Practices
It's not easy supporting 500+ million friends.

WHAT YOU SHOULD KNOW:
• Cloud computing concepts
• Application development
• LAMP architecture
• Unix
• Database concepts (RDBMS)

WHAT YOU WILL LEARN:
• Updates to the LAMP stack
• Advantages of MemCacheD
• Apache Hadoop clusters
• Apache Hive
• RDBMS redistribution

DEFINITIONS
What is LAMP? LAMP is an acronym for a software stack that consists of the following: Linux, Apache, MySQL, and PHP.

INTRODUCTION
We are going to examine Facebook's cloud computing environment. It's not easy supporting 500+ million friends, so how exactly do they do it? First we need to take a look at the stack that is now embraced by most of the world's largest websites (including Facebook). The term LAMP was first used in 1998 in a German computing magazine in the hopes of promoting competition against Microsoft Windows NT.

LAMP
LAMP maps to the following functions:
• Linux (Operating System)
• Apache (Web Server)
• MySQL (Database Server)
• PHP (Programming Language)
When all elements of this stack are working together, you can build truly scalable websites.

INSTALL
LAMP is pervasive and easy to install. In some environments, like Ubuntu or Debian, you can get started with LAMP with as little as a single command line such as:

    apt-get install apache2 php5-mysql libapache2-mod-php5 mysql-server

LUSTER
For the most part LAMP is considered 'old' since it has not been updated to reflect the latest thinking. LAMP has been static for over 10 years, and today's demanding environments and even more demanding younger developers want more from the stack.

CACHE
MemCacheD sort of sneaked into the LAMP stack. Here is where MemCacheD fits into the stack:

    Operating System
    Webserver
    [CACHE: MemCacheD]
    Database
    Programming Language
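As a rough sketch of how that cache layer gets bolted onto a stock Debian/Ubuntu LAMP box of this era (the package names are assumptions and vary by distribution and release), you could install the daemon plus a PHP client extension and smoke-test it over its plain-text protocol:

    apt-get install memcached php5-memcache          # cache daemon plus the PHP client extension
    /etc/init.d/memcached start                      # listens on TCP port 11211 by default
    printf 'stats\r\nquit\r\n' | nc localhost 11211  # quick sanity check of the running cache

The application then checks the cache first and falls back to the database only on a miss, which is exactly the role MemCacheD plays in the diagram above.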
LANGUAGE
These days the P in LAMP (which used to exclusively mean PHP) can mean anything: Perl, Python, and even Ruby. Even Apache is not immune from radical reinterpretation. Some are swapping it out for Nginx ("Engine X") or Lighttpd (pronounced "Lighty").

FACT
The new thinking is to use open source components to build scalable and successful websites.

SCRIPTING LANGUAGES
The thing to remember when using scripting languages is that they perform poorly in production environments.

FACEBOOK & PHP
Facebook uses PHP extensively. The entire front end of Facebook is written in PHP. Let's take a closer look at what Facebook has done around PHP from a performance and open source perspective to really help it scale. After all, PHP is a scripting language, and that means you get some instant engineering and productivity benefits in terms of being able to create something, refresh it, and see it in your browser; but again, be prepared for some unacceptable performance hits when you are running it in production.


HIPHOP FOR PHP
Facebook's solution to this problem was to go custom and create HipHop for PHP.

FACEBOOK CUSTOM SOLUTION
Facebook created HipHop for PHP. Development took about three-plus years and it was open sourced in 2010. HipHop for PHP takes:

    PHP source code -> transforms it into C++ -> compiles it using g++ -> produces an executable binary.

FACT
Facebook has achieved a 50% reduction across the web tier using HipHop for PHP.

TIP
You should evaluate HipHop for PHP for use in your enterprise.
The folks at Facebook are running the innovation engine at 100% and have implemented two key changes around the LAMP stack. The areas of key innovation are:
• Databases
• Large data analysis
Data analysis is at the heart of understanding what a friend ("user") is doing on the Facebook website. People are trying to find areas of efficiency wherever they can, even if that means at the database level. Take for example the NoSQL movement. These days you have so many databases to choose from, and the recommendation is that you choose the one that best suits your needs.

ARCHITECTING THE WORLD'S LARGEST WEBSITE
Facebook stores all of the user data in MySQL (except the news feed), and Facebook uses thousands of nodes in its MySQL clusters. The people whose responsibility it is to manage the database portion of the website don't care that MySQL is a relational database. They don't use joins, complex queries, or pull multiple tables together using views or anything like that. One would believe that the database teams don't care about the relational aspect of the database. This is not true. They care a great deal, but they use the relational database capabilities in a nontraditional way.

FACT
At Facebook the fundamental ideas of relational databases have not gone away. They are just implemented in nonconventional ways.

ARCHITECTURE
When you look at the Facebook database architecture, it consists of the following layers:
1) Web servers
2) MemCacheD servers (distributed secondary indexes)
3) Database servers (MySQL – primary data store)

JOINS
When it comes to using joins, they use the web server to combine the data: the join action takes place at the web server layer. This is where HipHop for PHP becomes so important. Facebook web server code is CPU intensive because the code runs and does all these important things with data that would traditionally be handled by the relational database.

HISTORY
These are database issues that people talked about 30 or 40 years ago, except they are now being discussed and implemented at different layers in the stack. Whether you're using MySQL or NoSQL, you're not escaping the fact that you need to combine all that data together – you'll still need a way to go and look it up quickly.
If we look at the NoSQL technology stack, there are a number of NoSQL families of databases:
• Document stores
• Column family stores
• Graph databases
• Key-value pair databases
Then you ask yourself: what problem am I trying to solve? What family of NoSQL databases am I going to use? Look closer – there are a number of differences inside. Take one database category: Cassandra and HBase. They have a lot of tradeoffs:
• from a consistency perspective
• from a replication perspective
So at the end of the day you need to return to your original questions:
• What is the problem I am trying to solve?
• What is the best database for me to use?
Facebook stores most of its data inside MySQL. Facebook stores about 150 terabytes of data inside Cassandra, which is used for inbox search on the site.


About 36 petabytes of uncompressed data is stored within an Apache Hadoop cluster overall.

HADOOP & BIG DATA
Facebook runs a Hadoop cluster with 2,200+ servers and about 23,000 CPU cores inside of it, and Facebook is seeing that the amount of data it needs to store is growing rapidly.

FACT
Facebook's storage requirements grew by 70x between 2009 and 2010.
By the time you read this article, Facebook will be storing over 50 petabytes of uncompressed information, which is more than all the works of mankind combined. All of this is a direct result of increased user activity on Facebook. Currently over 500M people use the site every month and over half use it every day.
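You will not be managing 36 PB, but the same stock Hadoop shell commands used on clusters of any size give a quick feel for how much data a warehouse holds and whether its blocks are healthy; the warehouse path below is purely an example:

    hadoop fs -du /user/hive/warehouse    # per-directory byte counts for warehouse tables (example path)
    hadoop fsck / | tail -n 20            # summary of blocks, replication and any corrupt files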
DATA ANALYSIS
Facebook has learned a lot about how important effective data analysis is to the running of a large, successful website.

KEY
You need to look at things from a product perspective. Take into consideration a common problem of corporate communications: what is the most effective email that you could send to users who have not logged on to the site for over a week? In other words, what email type would be the most compelling?
• A: an email that is a combination of text, status updates and other data.
• B: an email that is simple text saying "hello, there are 92 new photos."
The data shows that the 'B' email performs 3x better.

ANALYSIS
Analysis is the key. Data helps the people at Facebook make better product decisions. You can see clear examples of this in the emails that they send, the ranking of the news feed, and the addition of the 'like' button. Analysis of large data helps product managers understand how the different features affect how people use the site.

SCRIBE
Facebook uses an open source technology that they also created, called Scribe, that takes the data from tens of thousands of web servers and funnels it into the Hadoop warehouse.

PLATINUM CLUSTER
The initial problem they encountered was that too many web servers were trying to funnel data into one place, so Hadoop tries to break it out into a series of funnels and collect the data over time. The data is then pushed into the platinum Hadoop cluster about every 5 to 15 minutes. They also pull in data from the MySQL clusters on a daily schedule.
The Apache Hadoop cluster is at the center of the Facebook architecture and is vital to the business. It's the cluster that, if it went down, would directly affect the business, so Facebook will need to focus redundancy efforts in this area. Currently it is highly maintained and monitored, and everyone takes a lot of time and care considering every query before they run it against the cluster.

SILVER CLUSTER
They then replicate the data to a second cluster, which they call the silver cluster. The silver cluster is where people can run ad hoc queries.

FACT
Facebook leadership, product management, and sales (a group of 300 to 400 people) run Hadoop and Apache Hive jobs every single month in an attempt to make better business decisions. Facebook created this level of data access to help people across the organization make better product decisions because the data is readily available.

APACHE HIVE
Apache Hive is one of the other technologies that Facebook uses; it provides a SQL interface that sits on top of Hadoop and gives people a way to do data analysis. The great thing is that all of these components are open source – things which you can use today!
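To give a feel for the kind of ad hoc question a product manager might run against a cluster like the silver one, here is a hedged example using Hive's command-line client; the table and column names are invented for illustration and are not Facebook's actual schema:

    hive -e "
      SELECT action, COUNT(*) AS events
      FROM   user_activity                -- hypothetical log table in the warehouse
      WHERE  dt = '2011-02-01'            -- Hive tables are typically partitioned by date
      GROUP  BY action
      ORDER  BY events DESC
      LIMIT  20;"

Hive compiles the SQL-like statement into MapReduce jobs, which is what lets non-engineers ask questions of petabyte-scale data without writing Java.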
SUMMARY
The LAMP stack is evolving. You will now see the introduction of a cache layer. The number of choices you have at the database layer is also increasing, and the separation of relational database functions has also occurred. Facebook, like most companies, has decided to implement technologies from a number of different providers to avoid lock-in to any one solution. Data analysis will continue to grow in importance (something global enterprises have known for a very long time, even though they may have failed to act or may not have had enough time to act on that knowledge).
Great tools are now available – consider evaluating Apache Hadoop, Apache Hive, and Scribe. Use these tools to really understand what's happening at your site. Follow Facebook's lead and evaluate developments in the open source space – Facebook is committed to this and has even created an open source website, which can be found at facebook.com/opensource.

RESOURCES
• Facebook Open Source: http://facebook.com/opensource
• The Delta Cloud Project: http://www.deltacloud.org
• GoGrid: http://www.gogrid.com
• Amazon Elastic Compute Cloud (Amazon EC2): http://aws.amazon.com/ec2
• Microsoft Cloud Services: http://www.microsoft.com/Cloud
• NIST - National Institute of Standards and Technology: http://www.nist.gov/index.html
• Book: The Challenge of the Computer Utility by Douglas Parkhill
• Book: The Mythical Man-Month: Essays on Software Engineering by Fred Brooks
• InterviewTomorrow.Net - Helping America get to work. Free access to the 2011 executive recruiter database.

ABOUT THE AUTHOR
Richard C. Batka - Business & Technology Executive. Author. Mr. Batka is based in New York and provides advisory services to a select group of clients. Mr. Batka has worked for global leaders Microsoft, PricewaterhouseCoopers, Symantec, Verizon, Thomson Reuters and JPMorgan Chase. A graduate of New York University with honors, he can be reached at [email protected] or followed on Twitter at http://twitter.com/RichardBatka.



Backup

Data backups just don't cut it anymore

Few companies can afford to lose a day's worth of data, let alone a month or more. Losing salary information, strategy documents, project information, confidential client information and even email communication can seriously disrupt business and cost companies money.

Although most companies recognise the necessity of disaster recovery plans and data restoration procedures, some are still relying on old backup and recovery methods that can affect the prognosis for a full recovery.
"The method, frequency and medium used for data backups can make all the difference between a complete or no recovery from a disaster. Companies with one foot in the dark ages, when it comes to backups, and the other in 'the now', where there are lots of data systems at play and business moves at a frenetic pace, will struggle and take longer to get up and running again in the event of a disaster or major systems failure.
"Disaster recovery is more than just doing backups so that data can be restored should disaster strike. Disaster recovery is also about restoring entire systems quickly to ensure business continuity. The fact is that data backups just don't cut it anymore," says Herman van Heerden.
He says that although incremental backups, a method on which many companies rely, may sound good enough, they're not.
"Companies need to question how long it will take them to restore a system, never mind a whole host of systems, if they only do incremental backups. A few hours might be digestible, but a day is too long, and in reality it can take much longer than that. This is a waste of time, productivity and money," he says.
The frequency of backups is another cause for concern. He says that some companies conduct backups once a day while others dare to walk on the wild side and only back up once a month, or quarterly. Obviously, the greater the gap between backups, the greater the risk of losing bigger amounts of data.
Then there's the medium for backups that companies use. Tape backups were once king and some companies still use them. Inherently, tape backup software only caters for file backups and not full systems, and van Heerden says companies that have actually successfully restored from tape are few and far between. "Companies still using tapes for backups should switch to disk," he advises.
Failure to have off-site backups is another shortfall in some companies' disaster recovery plans. "Having copies of your system on the same site that your systems run on is good for a quick restore if hardware crashes or software malfunctions. However, if an event like a fire or theft takes out your whole business and your backups are destroyed or stolen along with your systems, you're in a bit of trouble. If you're serious about disaster recovery, you should keep off-site copies of data and full systems at a secure data centre. If data is not backed up off-site, it is not backed up.
"It is important to have stringent security measures in place to ensure the protection of your data at the off-site data centre as well. While no data centre will allow just anyone into the server room, you are not exempt from hackers gaining access from outside. Your storage and servers must be secured with your own security systems. A shared firewall is like a public toilet; you can never tell when someone will leave the seat up. So own your own security," he stresses.
For the best prognosis for disaster recovery, van Heerden recommends the following:
• Clones of full systems instead of file backups. Cloning is the new tape. Ideally, companies should clone first, and then do incremental data backups on disk to complement the system clones.
• Working backups housed on-site as well as off-site in a secured data centre.
• Secure and encrypted electronic communications between on-site and off-site storage. Industrial espionage is real, and not just a thing for late-night B-rate Hollywood movies anymore.
• Data backups of differences (updates) rather than full data replicates over the wire.
• Virtualisation of systems.
• Smart planning of the use of storage systems like SANs and NASs to allow 50% of the capacity for snapshots; the off-site backups being fed from these should be almost real-time snapshots.
• Enough technology resources on standby to enable systems to get up and running again even in the event of a complete infrastructure failure or loss. Hardware for this purpose can be hired.
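Several of those recommendations (encrypted transfer, difference-only copies, clones complemented by incrementals) can be sketched with nothing more exotic than standard Unix tools. The host names, paths and key file below are invented for the example, and a dedicated imaging or replication product would be more practical at scale:

    # Nightly difference-only, SSH-encrypted sync to the off-site data centre
    rsync -az --delete \
          --link-dest=/backup/yesterday \
          -e "ssh -i /root/.ssh/backup_key" \
          /srv/data/ backup@offsite.example.com:/backup/today/

    # Crude full-system clone pushed over the same encrypted channel
    dd if=/dev/sda bs=1M | gzip | \
          ssh -i /root/.ssh/backup_key backup@offsite.example.com \
          "cat > /backup/clones/server1-sda.img.gz"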
Herman van Heerden
Herman van Heerden is from NEWORDER INDUSTRIES (Pty) Limited, a South Africa-based technology company that specialises in enterprise risk management, virtualisation and storage solutions.



Storage Virtualization

Virtual Storage Security

Doom's day is coming – are you ready?

Are you prepared for the business doom's day? How long can your business survive if it stops its operations? Most importantly, if your data is lost, will you be able to continue operating or fulfill your legal obligations, such as filing your tax forms? A Five and Dime located in a small strip mall was set on fire, and the fire spread and burned down two other small business stores. A dental clinic located next to a newspaper office burned to the ground because an irate reader set the newspaper office on fire. Had you asked these establishments about their business continuity plans before the fires, they would not have taken you too seriously. The Boston Computing Network's article "Data Loss Statistics" provides the following interesting statistics:
• 30% of all businesses that have a major fire go out of business within a year and 70% fail within five years (Home Office Computing Magazine);
• 60% of companies that lose their data go out of business within 6 months of the disaster; and
• 93% of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year, and 50% of businesses that found themselves without data management for 10 days filed for bankruptcy immediately (National Archives and Records Administration in Washington).
One of the possible solutions to help lessen the impact of a business doom's day is the use of virtual storage. This article discusses some of the virtual storage issues that the security professional may face. Then it takes a closer look at the different factors in choosing a virtual storage provider, because this area is usually and easily overlooked. Lastly, the article aims to bring attention to the risks outside of the security principles realm.

Virtual Storage Concerns
Russell Kay describes storage virtualization as a new layer of software and/or hardware between storage systems and servers, so that applications no longer need to know on which specific drives, partitions or storage subsystems their data resides. Servers see the virtualization layer as a single storage device, and all the individual storage devices see the virtualization layer as their only server. There are basically two general types: block virtualization and file virtualization. Block virtualization technology consists of the Storage Area Network (SAN) and Network Attached Storage (NAS); SAN devices typically implement RAID (Murphy, n.d., p.2). File virtualization conceals the static virtual pointer of the file from the physical location, which allows the back-end network to remain dynamic. From a security perspective, this virtualization provides another layer of security, but it also adds complexity. One school of security professionals would argue that complexity itself is a vulnerability. Littlejohn Shinder (2009, March 25) argued that virtualization makes security more complicated because it introduces another layer that must be secured. Crump (n.d.) argued that managing one large storage system is always easier than managing five or six storage systems, which is especially true in the "shared everything" world of virtualization (p.3). On the other hand, Dan Simard (as cited in Gruman, 2008, March 13), the commercial solutions director at the U.S. National Security Agency, the electronic intelligence and cryptographic agency, believes that virtualization needs protection with a new layering approach: an independent layer handles security, so that even if an OS has security flaws, a separate layer that the OS cannot compromise handles security threats such as viruses and worms or implements the firewall. The security benefits in using virtual storage are:


• Availability;
• Separation/Isolation;
• Containment;
• Recoverability.

Availability
Kay (2008, October 6) mentioned that availability increases with storage virtualization because applications are not restricted to specific resources and thus are insulated from most interruptions. Lindstrom (2008, January 3) pointed out that this flexibility is a boon for the disaster recovery specialist, because maintaining replicated environments that are physically separate and creating images that can be quickly recovered contribute to the overall availability of the resources.

Separation/Isolation
Separation of resources and content allows stronger protection and reduces the overall impact of a compromise (Lindstrom, 2008, January 3, par. Separation/Isolation). Applications can be separated from other applications by being run in a Virtual Machine (VM) guest. In addition, virtualization enhances separation of duties and role-based access controls. Butler and Vandenbrink (2009, September) pointed out that, using a vCenter interface, some of the following separation of duties options can be achieved:
• Server administrators can be given only the rights to power on/off;
• Network administrators can be granted patching rights;
• VMware administrators can be granted only the rights to install new VMs but cannot modify them; and
• Auditors can be granted view-only rights (p.6).

Containment and Recoverability
Lindstrom (2008, January) described two virtual environment techniques:
1. One technique is creating a "sandbox", which is used for risky activities, ignoring or at least allowing any changes or corresponding damage that may occur through attack and compromise; and
2. The second technique of virtual machines is the number of configurations available to create checkpoints, which has the potential for a much quicker recovery than traditional systems. This addresses the drawback of the sandbox approach, which is the amount of time and effort needed to recover from a compromise.

Do you know your Data Centers?
Remember the days when backups were placed on tapes and an employee would then bring them offsite, leaving the tape in the car to be fried in the hot sun? How about when you needed to restore the data off the backup tape but failed? How much trust can you place in the person who is doing the backup?
Data Loss Statistics (n.d.) reported that 34% of companies fail to test their tape backups, and of those that do, 77% have found tape back-up failures. These statistics manifest the following concerns:
• Data corruption or loss;
• Hardware failure, like the backup tape; and
• Unauthorized access.
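The only way out of those statistics is to actually exercise the restore path. As a hedged illustration (the device names and file paths are assumptions), a periodic spot-check of a tape backup can be as simple as:

    mt -f /dev/nst0 rewind                              # position the tape at the start
    tar -tvf /dev/nst0 > /tmp/tape-catalog.txt          # list the archive contents without restoring
    mt -f /dev/nst0 rewind
    tar -xvf /dev/nst0 -C /tmp/restore-test etc/fstab   # restore one known file and diff it against the original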
Schofield (n.d.) looked at three essential aspects of data and how they relate to virtualization such as the cloud: data integrity, data resilience and data security.

Data Integrity
Schofield (n.d.) made an excellent point that if a company migrates an application to a virtual storage provider, it is important to consider how to back up the data, what backup schedule to use, how long backups are kept, and the trade-off between the criticality of the data and any legislative requirements against the associated costs. He also pointed out the need to run a disaster recovery test on a regular basis; the aim should be to restore a fully working copy of the system from a recent backup, and in a virtualized environment such as the cloud this can happen a lot quicker than it would in a traditional data center (Schofield, n.d.).

Data Resilience
Virtualization such as the cloud provides data resilience because it is completely unaffected by the loss of a physical device or location (Schofield, n.d.).

Data Security
Schofield's article also pointed out three security concerns, namely:
1. One of the worst mistakes a company can make is to relax access to the company data and allow too many people in;
2. You need to check with the hosting company whether they have a regular patch schedule; and
3. The principle behind data security in a virtualization environment is no different from any other kind of hosting. So make sure the company's procedures are correct from the outset and perform regular audits of these procedures, and you will have covered yourself as much as you possibly can.

One quagmire is that the data is only as safe as where it is being stored. The problem with an isolated data center is that all the "eggs are in one basket", which may have a catastrophic effect if the building burns down, is flooded or experiences a long-term power outage. In addition, if unauthorized persons are able to access the backup data, they would have the "keys to the kingdom". However, using cloud technology, the eggs are in different baskets.


Biswas (2011, January 20) pointed out that "being on the cloud" means cloud computing is based out of multiple data centers, which are quite similar to the ones that support companies' internal resources. Biswas (2011, January 20) also made the point that, like all data centers, they are susceptible to security issues such as data theft, natural disasters, and man-made threats. This is where the trustworthiness of service providers comes in. The GoGrid (2010) white paper made the point that a provider's industrial background and how long they have been around are important factors in security (p. 3). Anne Roach researched virtual storage providers to determine their trustworthiness and to provide a framework for determining the background of service providers. In Roach's paper, she made the point that the ultimate responsibility for virtual decisions rests with the person who decides to trust critical information to the cloud, and that it is the virtual storage providers' responsibility to inform end users about their physical storage facilities and to help users make educated decisions about where the data or data backups are stored (p.1). In her paper, she asked thirty-one (31) virtual storage providers the following eight questions:

1. Do you own or lease your servers?
2. Are your servers redundant?
3. Where are your servers/server clusters located?
4. Which vendors do you use for your servers?
5. Do you use a tape library to back up your servers?
6. What is your total storage capacity?
7. How are your servers monitored?
8. How quickly would a damaged server be replaced? (p.4)

These questions can determine the trustworthiness of the virtual storage provider. Here are some possible vulnerabilities that were uncovered by Roach's survey.

1. Do you own or lease your servers?
If the virtual storage provider leases their servers and is unable to pay for the lease, or if the lease cannot be renewed, then the data on the servers must be migrated to a different location en masse, which equals downtime for the virtual storage user or possible loss. This question is also critical to determine that, if data needs to be removed from the server, no data remains on those servers.

2. Are your servers redundant?
Roach (n.d.) explains that this question was designed to determine whether virtual storage providers were duplicating the information. Her findings revealed that some of the providers felt that RAID in a single location compensated for the need to create redundancy, and at least two providers stated that their systems provide for two or more copies of data within their server (Roach, n.d., p. 4).

3. Where are your servers/server clusters located?
Servers that are located in an area of earthquakes, typhoons, hurricanes etc. should have the proper physical controls to properly handle these events. In addition, the further the servers are away from the company, the less bandwidth is available. Transport of data over a larger distance causes more overhead (Roach, n.d., p.4).

4. Which vendors do you use for your servers?
This question helps decision makers decide whether to "put all the eggs in one basket" and, if so, what happens if that vendor goes out of business, or whether to use different vendors.

5. Do you use a tape library to back up your servers?
This question addresses the legal issue, similar to the story of the dentist office, newspaper office and Five and Dime burning down. Harris (2010) made the argument that most of the time, computer documents are considered hearsay and are not normally admissible in court unless there is firsthand evidence that can be used to prove the evidence's accuracy, trustworthiness, and reliability (p. 898). Roach pointed out that only digital linear tape is admissible in a court of law because of the linear mode of recording data (p. 4). Regardless of the expense, tape libraries also provide a storage format redundancy that allows data to be recovered even in a major data disaster when the tapes are stored in a different location from the servers (Roach, n.d., p.4).

6. What is your total storage capacity?
This question is a security factor that addresses availability. A very interesting finding was that many companies refuse to admit that their virtual storage has a limit.

7. How are your servers monitored?
Since human monitoring of and attention to the health of a virtual storage system matter, it is important to determine whether the virtual storage provider offers 24/7 assistance. Roach (n.d.) noted in her study that at least one provider has a redundant-redundancy system to consistently keep information online, which is a backup to the provider's backup software system (p. 4).

8. How quickly would a damaged server be replaced?
Roach (n.d.) did not discuss all of the aspects of physical vulnerabilities, such as smoke or fire damage, cooling systems, power redundancy and physical security. This question will help decision makers find a reliable service (p.4).

Roach drew some very enlightening conclusions about what can create a lack of responsibility and a vulnerability for end users:
1. Some companies claim that they manage and maintain their servers when the servers are actually located far from the physical location of the end user.
2. Power and data redundancy and physical security elements may not be clearly communicated.
3. Some providers offer guarantees such as refunding the client's money if data is lost. This may be hollow when the company has little involvement in the protection and maintenance of information and is not the actual data guardian (p. 13-14).
Readers are encouraged to read Anne Roach's paper because she provided a synopsis of the thirty-one (31) virtual storage providers she studied.

The Risks of Virtual Storage
Lindstrom's (2008, January) paper presented the following two risks:
1. Anytime content is shared across systems, a VM's level of isolation is reduced, which creates another attack vector for systems that are running sandboxes;
2. The virtual disk resides as a file on the system at the hypervisor or host OS level and may be manipulated using normal file operations. This means that the disk file may be opened in an editor and its contents viewed external to the VM itself, or it may be mounted as an external drive on some other system. (p. 18)
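Risk 2 is largely a file-permission problem on the host, so one hedged, minimal mitigation on a Linux/KVM host might look like the following; the image path, group name and use of qcow2 files are assumptions that will differ on other hypervisors:

    # Keep VM disk images readable only by root and the hypervisor's own group
    chown root:kvm /var/lib/libvirt/images/*.qcow2
    chmod 0660     /var/lib/libvirt/images/*.qcow2
    # And audit who can reach them at all
    ls -l /var/lib/libvirt/images/

Encrypting the images or the underlying volume adds a second layer for the case where the files are copied off the host entirely.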


Simard (as cited in Gruman, 2008, March 13) said that "graphics cards and network cards today are really miniature computers that see everything in all the VMs". In other words, they could be used as spies across all the VMs, letting a single PC spy on multiple networks.

Conclusion
Even though virtualization decreases server costs, companies are realizing that virtualization simultaneously increases management and storage costs, and without a plan to protect these environments, they may not realize the full return on investment (Symantec, 2011). As a result, an improper plan to protect these environments causes fragmented implementation and a lack of standardization of virtual infrastructures, and this will continue to expose gaps in the security, backup and high availability of virtual environments (Symantec, 2011).

References:
• Biswas, S. (2011, January 20). Is Cloud Computing Secure? Yes, Another Perspective. In CloudTweaks Plugging In the Cloud. Retrieved February 10, 2011, from http://www.cloudtweaks.com/2011/01/the-question-should-be-is-anything-truly-secure
• Butler, J. M., & Vandenbrink, R. (2009, September). IT Audit for the Virtual Environment. In SANS. Retrieved February 13, 2011, from http://www.sans.org/reading_room/analysts_program/VMware_ITAudit_Sep09.pdf
• Data Loss Statistics (n.d.). In Boston Computing Network. Retrieved February 11, 2011, from http://www.bostoncomputing.net/consultation/databackup/statistics/
• Crump, G. (n.d.). Storage Switzerland Report: The Complexity of VMware Storage Management. In Storage Switzerland LLC. Retrieved February 12, 2011, from http://img.en25.com/Web/IsilonSystemsInc/The Complexity of VMware Storage.pdf
• GoGrid (2010). Cloud Infrastructure Security and Compliance [White paper]. Retrieved February 11, 2011, from http://storage.pardot.com/3442/12401/Cloud_Infrastructure_Security.pdf
• Gruman, G. (2008, March 13). Virtualization's secret security threats. In InfoWorld. Retrieved February 10, 2011, from http://www.infoworld.com/d/security-central/virtualizations-secret-security-threats-159?page=0,0
• Harris, S. (2010). All in One CISSP Exam Guide (5th ed.). New York, NY: McGraw Hill.
• Kay, R. (2008, October 6). QuickStudy: Storage virtualization. In Computerworld. Retrieved February 11, 2011, from http://www.computerworld.com/s/article/325633/Storage_Virtualization?taxonomyId=19&pageNumber=1
• Lindstrom, P. (2008, January 3). Attacking and Defending Virtual Environments. In Burton Group. Retrieved February 11, 2011, from http://www.burtongroup.com/Download/Media/AttackingAnd.pdf
• Littlejohn Shinder, D. (2009, March 25). 10 security threats to watch out for in 2009. In TechRepublic. Retrieved February 10, 2011, from http://www.techrepublic.com/blog/10things/10-security-threats-to-watch-out-for-in-2009/602
• Murphy, A. (n.d.). Virtualization Defined - Eight Different Ways [White paper]. Retrieved February 12, 2011, from http://www.f5.com/pdf/white-papers/virtualization-defined-wp.pdf
• Roach, A. (n.d.). Keeping it Spinning: A Background on Virtual Storage Providers. Retrieved February 11, 2011, from http://fht.byu.edu/prev_workshops/workshop08/papers/2/2-3.pdf
• Schofield, D. (n.d.). Data Integrity, Data Resilience and Data Security. In CloudTweaks Plugging In the Cloud. Retrieved February 13, 2011, from http://www.cloudtweaks.com/2011/02/security-for-the-cloud-data-integrity-data-resilience-and-data-security/
• Symantec (2011). Symantec Reveals Top Security and Storage Predictions for 2011 [Press release]. In Symantec. Retrieved February 11, 2011, from http://www.symantec.com/about/news/release/article.jsp?prid=20101209_02&om_ext_cid=biz_socmed_twitter_facebook_marketwire_linkedin_2010Dec_worldwide_2011predictio

Stephen Breen
Stephen Breen obtained two master's degrees: a Master of Science in Information Technology, Information Security Specialization, from Capella University and a Master of Science in Computer and Information Sciences from Nova Southeastern University. He has worked in systems analysis, programming and quality assurance for a software company. He is currently an information security consultant. His email address is [email protected]



DataCENTER
FOR IT PROFESSIONALS

MAGAZINE

3/2011

Our April issue will cover the Critical Power & Cooling topic!

Have some knowledge and experience in this field? Are you up to date with all the hottest news that could be interesting for our readers? Help us create the upcoming issue of our magazine.

To know more, contact:


[email protected]