The Future of Distributed Computing
Conference Paper · February 2015
1 author:
Davide Salomoni
INFN - Istituto Nazionale di Fisica Nucleare
All content following this page was uploaded by Davide Salomoni on 08 June 2015.
(The Future of) Scientific
Distributed Computing
Davide Salomoni
Director of Technology
Italian National Institute for Nuclear Physics (INFN)
AAAS Annual Meeting – Feb 14, 2015
This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License
“The horse is here to stay, but the automobile
is only a novelty - a fad.”
(President of the Michigan Savings Bank, 1903)
Progress is impossible without change,
and those who cannot change their minds cannot change anything.
(attributed to G.B. Shaw)
Davide Salomoni AAAS Annual Meeting - Feb 14, 2015 2
Scientific Grid Computing
• Distributed and
federated resource
provisioning has been
working for several
years.
• Grid Computing is a
major success story in
the scientific world.
• See http://goo.gl/i5Rkt:
§ Real-time status of the
computational jobs
and of data transfer
connected to the
Large Hadron Collider
(LHC) experiments
taking place at CERN.
The Next Steps
1. Ease of access, for small and
big collaborations alike.
2. Software and economic
sustainability.
3. Robustness (no single points
of failure).
4. Modular, scalable architecture.
5. Open source software, vendor
independence.
Cloud Computing
Caution, Buzzword!
• The classical definition comes
from the US National Institute
of Standards and Technology
(NIST) (http://goo.gl/eBGBk)
• In summary, Cloud computing
deals with:
1. Supplying
2. information and communication technologies
3. as a service
[Figure: Search trends of “Grid Computing” vs. “Cloud Computing” (from Google Trends)]
The 5 Cloud Postulates
1. Self-service, on-demand
2. Access through the network
3. Resource pooling
4. Elasticity (with infinite
resources)
5. Pay as you go
In the end, Applications Rule.
Looking for Solutions…
• Biological and medical science
§ Biological, molecular and medical imaging, life science
research applied to medicine, agriculture, bio-industries
and society, structural biology.
• Social sciences, arts and humanities
§ Georeferencing (e.g. of current or historical maps), cultural
heritage, smart sensors.
• Environmental and earth science
§ Biodiversity and ecosystem research, interactions between
geosphere, biosphere and hydrosphere, earth system
modeling.
• Physical sciences
§ Astrophysics, theoretical and experimental research in
physics.
A Typical Scientific Problem
1. What type of analysis/problem am I working on?
2. I need to develop my code – can I use macros or ready-to-
go software components?
3. Which data do I need? Is it in files? Objects? Where are
they? Are there catalogs? How do I access them?
4. “Something” should now locate the resources I need and
eventually run my code / analysis.
5. I need to fetch back results and present them, to myself and/
or others.
Note the strong interaction between my work, my data and
the (distributed) resources I need.
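The five steps above can be read as a small pipeline. What follows is a minimal sketch under stated assumptions: the in-memory catalog, the resource list and every name in it (`CATALOG`, `RESOURCES`, `locate_data`, `pick_resource`, `run_analysis`) are hypothetical and do not come from the talk or from any real middleware.

```python
from dataclasses import dataclass

# Hypothetical catalog: dataset name -> (site holding it, data values).
CATALOG = {
    "run-001": ("site-a", [1.0, 2.0, 3.0]),
    "run-002": ("site-b", [4.0, 5.0]),
}


@dataclass
class Resource:
    name: str
    site: str
    free_cores: int


# Hypothetical distributed resources, each attached to a site.
RESOURCES = [
    Resource("cluster-a", "site-a", 8),
    Resource("cloud-b", "site-b", 0),
    Resource("cloud-b2", "site-b", 16),
]


def locate_data(dataset):
    """Step 3: find where the data lives by asking the catalog."""
    site, values = CATALOG[dataset]
    return site, values


def pick_resource(site, cores_needed):
    """Step 4: 'something' locates a resource close to the data."""
    for resource in RESOURCES:
        if resource.site == site and resource.free_cores >= cores_needed:
            return resource
    raise LookupError(f"no resource with {cores_needed} free cores at {site}")


def run_analysis(dataset, analysis, cores_needed=1):
    """Steps 1-5: pick the analysis, find data and resources, run, report."""
    site, values = locate_data(dataset)
    resource = pick_resource(site, cores_needed)
    return {
        "dataset": dataset,
        "ran_on": resource.name,
        "result": analysis(values),
    }


def mean(values):
    """Step 2: the user's own analysis code (here, a plain average)."""
    return sum(values) / len(values)


report = run_analysis("run-002", mean, cores_needed=4)
print(report)
```

The sketch makes the coupling noted above visible: the choice of resource depends on where the data is, and the result only makes sense together with the dataset and the resource that produced it.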
What is Missing?
Or, How to…
• … orchestrate and federate Cloud, Grid and HPC
[public or private] resources?
• … avoid software and vendor lock-in?
• … exploit specialized hardware, such as GPUs or low-
latency interconnections?
• … manage dynamic and complex workflows for
scientific data analysis?
• … combine data that come from multiple sources and are stored
in multiple locations through incompatible technologies?
• … support federated identities and provide privacy and
distributed authorization in open Cloud platforms?
• … provide APIs to exploit the above and write
applications, customizable portals and mobile
views?
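As an illustration of the first and third questions, here is a minimal sketch of matching one request against a federated pool of Cloud, Grid and HPC offers, including specialized hardware. It assumes a hypothetical, provider-neutral description of offers; the `Offer` and `Request` types, the feature labels and the provider names are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    kind: str                      # "cloud", "grid" or "hpc"
    public: bool                   # public or private resource
    cores: int
    features: frozenset = frozenset()


@dataclass(frozen=True)
class Request:
    cores: int
    features: frozenset = frozenset()
    allow_public: bool = True


def match(request, offers):
    """Return the offers that satisfy the request, smallest adequate first."""
    suitable = [
        offer for offer in offers
        if offer.cores >= request.cores
        and request.features <= offer.features      # all required features present
        and (request.allow_public or not offer.public)
    ]
    return sorted(suitable, key=lambda offer: (offer.cores, offer.provider))


# Hypothetical federated pool; names and sizes are illustrative only.
OFFERS = [
    Offer("grid-site-1", "grid", False, 64),
    Offer("hpc-center-1", "hpc", False, 128, frozenset({"gpu", "low-latency"})),
    Offer("public-cloud-1", "cloud", True, 32, frozenset({"gpu"})),
    Offer("private-cloud-1", "cloud", False, 16, frozenset({"gpu"})),
]

gpu_job = Request(cores=24, features=frozenset({"gpu"}))
private_gpu_job = Request(cores=24, features=frozenset({"gpu"}), allow_public=False)

print([offer.provider for offer in match(gpu_job, OFFERS)])
print([offer.provider for offer in match(private_gpu_job, OFFERS)])
```

Because the request never names a provider, the same description can be served by any member of the federation, which is one simple way to keep vendor lock-in out of the application.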
Some Gaps Need to Be Filled…
Source: http://goo.gl/wT8XEq
Toward a Cloud-based Scientific Data
& Computing Platform
• The INDIGO-DataCloud
project has recently been
approved under the Horizon
2020 EU Framework
Programme for Research and
Innovation.
• INDIGO will build an open
source data & computing
platform targeted at scientific
communities, deployable on
multiple hardware, provisioned
over private or public
e-infrastructures.
• The INDIGO Consortium,
made of 26 academic
and industrial partners
from 11 European
countries, will be funded
with 11.1 million Euros in
30 months (expected
starting date May 2015).
INtegrating Distributed data
Infrastructures for Global ExplOitation
In the past 10 years, much attention was paid to infrastructural
problems and to low-level abstractions (such as “jobs” or, more
recently, “virtual machines”). We now need to move higher in the
abstraction chain.
→ Resource Virtualization – Of [heterogeneous] computing, storage,
networks. Also, the postulate of infinite resources of the Cloud needs to
come to terms with reality.
→ Platform as a Service – Let’s assume that IaaS offerings are
pervasive, simple enough, and based on well-known standards. How
do we abstract from what the market offers us, so that we are able to
exploit all resources that might be available?
→ Workflows, Portals and User Interfaces – APIs that can support
big data workflows for science, with possible integration into portals,
desktops and mobile applications, and that can interconnect IaaS,
PaaS and SaaS.
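One way to picture the move up the abstraction chain is a platform layer that hides which IaaS provider runs a workload, while still respecting finite capacity. The following is only a sketch under that reading: `IaaSProvider`, `FakeProvider` and `Platform` are hypothetical names, not INDIGO components or any real Cloud API.

```python
from abc import ABC, abstractmethod


class IaaSProvider(ABC):
    """Hypothetical common interface that every IaaS back end must offer."""

    @abstractmethod
    def capacity(self):
        """Number of instances that can still be started."""

    @abstractmethod
    def start(self, n):
        """Start n instances and return their identifiers."""


class FakeProvider(IaaSProvider):
    """In-memory stand-in for a real provider, with finite resources."""

    def __init__(self, name, total):
        self.name = name
        self._free = total
        self._started = 0

    def capacity(self):
        return self._free

    def start(self, n):
        if n > self._free:
            raise ValueError(f"{self.name} has only {self._free} instances left")
        ids = [f"{self.name}/vm-{self._started + i}" for i in range(n)]
        self._free -= n
        self._started += n
        return ids


class Platform:
    """PaaS-level view: the user asks for instances, not for a provider."""

    def __init__(self, providers):
        self.providers = list(providers)

    def deploy(self, instances):
        total = sum(p.capacity() for p in self.providers)
        if instances > total:
            # The "infinite resources" postulate meets reality here.
            raise RuntimeError(f"requested {instances}, only {total} available")
        placed, remaining = [], instances
        for provider in self.providers:
            take = min(remaining, provider.capacity())
            if take:
                placed.extend(provider.start(take))
                remaining -= take
            if remaining == 0:
                break
        return placed


platform = Platform([FakeProvider("private", 2), FakeProvider("public", 3)])
placed = platform.deploy(4)
print(placed)
```

In this sketch the request spills from the private provider to the public one without the caller noticing, and a request that exceeds the combined capacity fails explicitly instead of pretending that resources are unlimited.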
Global Collaborations
Conclusions
• Grid Computing has been and still is a great example of
distributed exploitation of federated resources.
• However, we need to tackle issues such as software
sustainability and economic viability, and to avoid
software or infrastructure lock-in.
• Cloud Computing technologies can be exploited to
achieve this, provided that:
§ Abstractions are implemented at higher (PaaS, SaaS) layers and
not only at the infrastructural level.
§ Cloud infrastructures are widely available and provide standard
interfaces.
§ Solutions can be multi-disciplinary.
§ Heterogeneous resources (from different e-Infrastructures,
possibly from both public and private providers) can be used.
§ Discovering and using resources, and writing scientific
applications, are greatly simplified compared to the current
(Grid) distributed models.