GRID DATA POOLING

Exploitation of the Grid System for Environmental Applications

Panagiotidi V., Georgatos F.

National Technical University of Athens

Abstract: This paper gives an introduction to Grid technology. The Grid is
a unified source of distributed resources connected through a fast network. It
supplies its users with a network of computational and storage elements.

Our main goal is to create a single pool of reference for data that have been
collected from the Internet, are of use to geoscientists, and are made available to
them in a grid-like way. The catalogue includes items such as the GTOPO and ETOPO
databases and even bigger datasets such as freely available ozone data. When data are
registered to the Grid, a unique file name, an LFN (logical file name), is
assigned to them. The catalogue contains these file names, with which Grid users can
access the data.

Key words: Grid, data pooling, LCG, EGEE

Introduction

The basic idea behind a Computing Grid architecture is that of the electric power
grid: a variety of resources contribute power into a shared "pool" for many consumers
to access on an as-needed basis [2]. The Grid concept goes well beyond simple
communication between computers and aims ultimately to turn the global network of
computers into one vast computational resource. Grid computing is thus a
form of distributed computing that involves coordinating and sharing computing,
application, data, storage, or network resources across dynamic and geographically
dispersed organizations. The LCG/EGEE Grid is a service for sharing computer
power and data storage capacity over the Internet [3].

The Grid was designed for the LHC experiment, the largest scientific
instrument in the world, which will begin operations at CERN in 2007 and
produce data at about 10 Petabytes per year [4]. Its use, however, branches out into
every field of science that needs great computing power. In theory, the Grid can be
used in any application that requires a computer, since it is essentially a very large
computer, and for that reason it is gradually being adopted by many more scientific
groups. In practice, the first Grid users are those with very demanding applications
that cannot be run on simple computer systems. Today the Grid is used in High
Energy Physics, Astronomy, Biomedicine, Chemistry, Environmental Sciences and so
on. Scientists form collaborating communities called Virtual Organizations
(VOs). For the LHC these include ATLAS, CMS, LHCb and ALICE, and
more exist for other sciences, such as MAGIC, BIOMED, COMPCHEM and ESR.

Grid and Environmental Sciences

The ESR (Earth Science Research) VO is aimed at users interested in environmental
or geological applications. So far, members of the ESR-VO have shown most interest
in the fields of Climate Data Analysis, Earth Observation Experiments and Solid
Earth Physics. There are five ideas behind the Grid that make it appealing for ESR
members:

1. Sharing resources on a global scale.

2. Security. The Grid can ensure a high level of trust between resource users
and providers.

3. Balancing the load on the shared resources allows computers everywhere to
be used more efficiently and shortens the queues for access to advanced computing
resources.

4. Distance no longer matters. The Grid provides access to all resources without
any significant reduction in speed, no matter where they are.

5. Open standards exist to make sure that researchers and developers
worldwide contribute in a constructive way to the development of the Grid,
while leaving room for commercial development of Grid services and
infrastructure.

Typical data analysis, e.g. in climate applications, involves the collection, cross
correlation (intercomparison) and adaptation of model data, as well as validation
against observational data. Until recently, model data were produced in dedicated
HPC centers and stored in dispersed data center facilities, with no logical
links between the different copies (replicas). Individual dataset sizes for ESR
normally range from hundreds of Megabytes to tens of Gigabytes, even for
single files, and the overall data managed by individual data centers is often
on the order of Petabytes. The most common data formats are self-describing:
NetCDF, HDF and GRIB. Typical data analysis software used in pre- and
post-processing includes PINGO, CDO and AFTERBURNER [5].

Thus the major challenge in ESR-related data analysis is access to, and transfer
of, huge amounts of data, which the Grid is designed to handle. Such transfers can
take a great deal of time if a personal computer is used for data
analysis. We can do better if the data are already in place when processing takes place
on the Grid.
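
As a rough illustration of why such transfers matter, the Python sketch below estimates the idealised time needed to move a dataset over a link with a given sustained throughput. The 10 Gigabyte size is taken from the upper end of the range quoted above; the two throughput values are assumptions chosen only for illustration, and real transfers will be slower because of protocol overhead and shared links.

```python
def transfer_time_seconds(size_bytes: float, throughput_bits_per_s: float) -> float:
    """Idealised transfer time: payload size divided by sustained throughput."""
    if throughput_bits_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes * 8 / throughput_bits_per_s


GIGABYTE = 10**9  # decimal Gigabyte

dataset_size = 10 * GIGABYTE   # a single large ESR file (size range from the text)
personal_link = 8 * 10**6      # assumption: 8 Mbit/s personal connection
site_link = 10**9              # assumption: 1 Gbit/s link between Grid sites

for label, link in [("personal computer", personal_link), ("Grid site", site_link)]:
    hours = transfer_time_seconds(dataset_size, link) / 3600
    print(f"{label}: {hours:.2f} hours")
```

Under these assumed figures the same file needs hours on the personal connection and about a minute between sites, which is the motivation for keeping replicas close to where the processing runs.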

The Grid provides the storage and computing resources that reduce
the time consumed to a fraction of the time needed using one or two personal
computers. Replication of data helps here.

Many Earth science applications use large datasets such as the ones above.
The first ESR application ported to the Grid was the comparison of ozone profiles
obtained by different means: the GOME satellite and ground-based lidar stations.
As GOME produces many more ozone profiles than the ground lidars, the key factor
of the comparison was to be able to accurately find satellite data that can be compared
with the existing lidar data. Data coincidence is determined by two criteria: location
and date. The Grid helps the scientific community analyze the ozone retrieval by
providing a single computing environment for the different steps, allowing the data to
be easily shared between "producer" and "consumer". The sharing of computing
resources allows time-consuming calculations to be carried out faster for the benefit of
everyone. In this case, the OPERA algorithm is the one that stresses the Grid
environment the most, through the computing resources it requires and the large
number of files it generates [6].

This is why metadata availability and security are crucial for the ozone experiment.
The community is heavily involved in the testing of Grid solutions for relational
database management.
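
The coincidence test described above (location and date) can be sketched in Python as follows. This is a minimal illustration and not the code used in the GOME application: the Profile record, the 500 km and 12 hour thresholds, and the sample coordinates and times are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Profile:
    """Hypothetical record for one ozone profile (satellite or lidar)."""
    name: str
    lat: float       # degrees north
    lon: float       # degrees east
    time: datetime


def distance_km(a: Profile, b: Profile) -> float:
    """Great-circle distance between two profiles (haversine formula)."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def find_coincidences(satellite, lidar, max_km=500.0, max_dt=timedelta(hours=12)):
    """Return (satellite, lidar) pairs that are close in both space and time.

    The default thresholds are assumptions, not values from the experiment.
    """
    return [
        (s, l)
        for l in lidar
        for s in satellite
        if distance_km(s, l) <= max_km and abs(s.time - l.time) <= max_dt
    ]


# Hypothetical sample data
lidar = [Profile("station_1", 44.0, 5.7, datetime(1997, 6, 10, 20, 0))]
satellite = [
    Profile("gome_a", 44.5, 6.0, datetime(1997, 6, 10, 10, 0)),    # near, same day
    Profile("gome_b", 10.0, 100.0, datetime(1997, 6, 10, 10, 0)),  # far away
    Profile("gome_c", 44.1, 5.8, datetime(1997, 6, 15, 10, 0)),    # near, five days later
]

matches = find_coincidences(satellite, lidar)
for s, l in matches:
    print(s.name, "matches", l.name)
```

Because every satellite profile is compared with every lidar profile, the work grows with the product of the two catalogue sizes, which is one reason such a search benefits from shared computing resources.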

Implementation

To address this issue, our main goal is to create a single pool of reference
for data that are already available on the Internet, in a grid-like way that is of use
to geoscientists. The proposed catalogue includes items such as the
GTOPO and ETOPO databases and similar or even bigger datasets, such as freely
available ozone data.

The method of working involves the following steps. The datasets are first retrieved
and moved to a specific file repository in a Unix account on a User Interface,
which acts as the entrance door to the LCG/EGEE Grid. Most of this work has been
done using the tool wget [7]. Then, each file is copied with the command lcg-cr to a
Storage Element [8], which is provided by the ESR-VO for its users. When
“registering” the file to the Grid, a logical file name (LFN) must be assigned for
further reference and use of the file. An example follows.

European Centre for Medium-Range Weather Forecasts (ECMWF), www.ecmwf.int

ECMWF has a large database of weather measurements for the last 40 years. We
retrieved a part of these data and registered it to the Grid as described above.

First, the file is retrieved.

Next, the file is registered to the Grid.

The file has been registered as:

ECWMF_40year_analysis_ozone_temp_1984_06h.grib. We used a name that hints at
the origin of the data (ECWMF_40year_analysis), the quantities it
contains (ozone and temperature), the year (1984), the time at which the data were
collected (06:00), and the format (grib).

Other examples are ETOPO files, e.g. ETOPO30_d10g.gz, and ozone files from the
Co-operative Programme for Monitoring and Evaluation of the Long-range
Transmission of Air Pollutants in Europe (EMEP),
e.g. emep_ozone_allsites_o3_1990.zip, and so on.
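
The naming convention described above can be expressed as a small helper. The Python sketch below is only an illustration: the function name and its parameters are assumptions, and the component order simply mirrors the example name quoted in the text (origin, quantities, year, hour, format).

```python
def descriptive_name(origin, variables, year, hour, file_format):
    """Build a flat, self-explanatory file name from its descriptive parts.

    Hypothetical helper: joins origin, quantities, year and hour with
    underscores and appends the format as the extension.
    """
    parts = [origin, *variables, str(year), f"{hour:02d}h"]
    for part in parts:
        # Flat names must not contain whitespace or path separators.
        if not part or any(ch.isspace() or ch == "/" for ch in part):
            raise ValueError(f"invalid name component: {part!r}")
    return "_".join(parts) + "." + file_format


# Origin string spelled exactly as in the registered name quoted in the text.
name = descriptive_name("ECWMF_40year_analysis", ["ozone", "temp"], 1984, 6, "grib")
print(name)
```

Generating names from their parts, rather than typing them by hand, keeps the components in a fixed order and makes it harder to register two files under inconsistent names.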

The complete list of files moved to the Grid, and specifically to the storage element
se01.isabella.grnet.gr, can be found in [9].

The files that were moved and registered to the Grid are only a small part of the
Earth science data that are available on the Internet and needed by users. The point
was to demonstrate the method of registering data to the Grid and, of course, to apply
it to a specific range of data.
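
The two steps of the method (retrieve with wget, then copy and register with lcg-cr) can be sketched as follows. The Python code below only builds and prints the command lines; it does not run them. The download URL, the local path and the VO name "esr" are hypothetical, and the lcg-cr options shown (--vo, -d, -l) follow the usage pattern documented in the LCG-2 User Guide [8] as we recall it, so they should be checked against the installed version before use.

```python
import shlex


def retrieve_command(url, local_path):
    """Command line that downloads one dataset to the User Interface."""
    return ["wget", "-O", local_path, url]


def register_command(vo, storage_element, lfn, local_path):
    """Command line that copies a local file to a Storage Element and
    registers it under a logical file name (options are an assumption
    based on the LCG-2 User Guide usage pattern)."""
    return [
        "lcg-cr",
        "--vo", vo,
        "-d", storage_element,
        "-l", f"lfn:{lfn}",
        f"file:{local_path}",
    ]


url = "http://example.org/data/ozone_1984.grib"      # hypothetical source URL
local_path = "/home/user/griddata/ozone_1984.grib"   # hypothetical repository path

steps = [
    retrieve_command(url, local_path),
    register_command(
        "esr",                       # assumed VO name
        "se01.isabella.grnet.gr",    # storage element named in the text
        "ECWMF_40year_analysis_ozone_temp_1984_06h.grib",  # name as quoted in the text
        local_path,
    ),
]

for step in steps:
    print(shlex.join(step))
```

Keeping the commands as argument lists means they can later be passed unchanged to a process runner once the options have been verified on a real User Interface.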

Future Improvements

At the time of our initial work the ESR-VO followed the RLS (Replica Location
Service) architecture, which had two drawbacks:
first, only flat file names could be registered to the Grid, and
second, a number of security issues did not allow for confidentiality.
The newer LFC (LCG File Catalog) allows files to be organized in a branched,
directory-like structure and lets a user grant secure access to selected users
(through passwords etc.).
Today the ESR-VO has already migrated to LFC. Registering files to the Grid for the
ESR-VO must now be done using the corresponding commands for the LFC,
which are mostly the same [10].

Apart from the drawbacks of the RLS protocol, there were other obstacles in
the process of registering files to the Grid. The procedure was quite time-consuming
because each file was migrated and registered to the Grid individually.
Before the files were registered, a new name had to be chosen for each one, a name
that would give future users a good idea of the contents of the file. For
a few files this would not be an issue. In our
case, though, from the ECMWF data alone we migrated and registered 180 files to
the Grid. In the future we hope to work around such obstacles with scripts wherever
possible.
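
A script of the kind envisaged here could start from a registration plan: a list that pairs every local file with the logical name it will receive, checked for clashes before anything is sent to the Grid. The Python sketch below is one possible shape for such a script, not an existing tool; the function names, the renaming rule and the file paths are hypothetical.

```python
from pathlib import PurePosixPath


def plan_registrations(local_files, rename):
    """Pair each local file with its logical file name.

    `rename` maps a bare file name to the descriptive name to register.
    Raises ValueError if two local files would receive the same name.
    """
    plan = []
    used = {}
    for path in local_files:
        lfn = rename(PurePosixPath(path).name)
        if lfn in used:
            raise ValueError(f"{path} and {used[lfn]} both map to {lfn}")
        used[lfn] = path
        plan.append((path, lfn))
    return plan


def ecmwf_rule(filename):
    """Hypothetical renaming rule: prefix the data origin."""
    return "ECMWF_40year_analysis_" + filename


# Hypothetical local files, one per analysis time
files = [
    f"/home/user/griddata/ozone_temp_1984_{hour:02d}h.grib"
    for hour in (0, 6, 12, 18)
]

plan = plan_registrations(files, ecmwf_rule)
for local_path, lfn in plan:
    print(local_path, "->", lfn)
```

Each pair in the plan can then be handed to the registration command in a loop, so that a batch of 180 files needs one rule and one run instead of 180 manual steps.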

The point of our work is to show the importance of data pooling in the Grid, not
only for environmental data but for other sciences as well. Future work would be to
register databases to the Grid in cooperation with scientists and other users seeking to
run their applications on the Grid. Ultimately, this work can be done for all sciences
and should prove very useful in the implementation of such
applications.

References
[1] I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid

[2] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure,
Morgan Kaufmann Publishers, 1998, p. 3.

[3] Terena, EGEE Enabling Grids for E-sciencE,
http://public.eu-egee.org/intro/

[4] Rosy Mondardini, Grid Café,
http://gridcafe.web.cern.ch/gridcafe/GridatCERN/gridatcern.html

[5] David Weissenbach, Web Home ESR TWIKI,
http://esr.knmi.nl/twiki/bin/view/ESR/ClimateReseach

[6] Julian Linford, The GOME Application Deployment on EGEE, Technical Report,
March 2005

[7] Linux / Unix Command: wget, http://linux.about.com/od/commands/l/blcmdl1_wget.htm

[8] Antonio Delgado Peris, Patricia Mendez Lorenzo, Flavia Donno, Andrea Sciabà, Simone
Campana, Roberto Santinelli, LCG-2 User Guide, 2004, p. 64.

[9] Vayia Panagiotidi, Exploitation of the Grid Systems for Environmental Applications,
Diploma Dissertation, NTUA, October 2005, http://www.hep.ntua.gr/files/vayia.pdf

[10] Tony Calanducci, LFC: The LCG File Catalog, User Training and Induction, June 2005,
www.phenogrid.dur.ac.uk/howto/LFC.pdf
