
Article
Title Modelling High-Energy Physics Data Transfers
Author(s) Bogado, Joaquin (UNLP, La Plata (main)) ; Monticelli, Fernando (UNLP, La Plata (main)) ; Diaz, Javier (UNLP, La Plata (main)) ; Lassnig, Mario (CERN) ; Vukotic, Ilija (U. Chicago (main))
Publication 2018
Number of pages 2
In: 14th eScience IEEE International Conference, Amsterdam, Netherlands, 29 Oct - 1 Nov 2018, pp.334-335
DOI 10.1109/eScience.2018.00081
Subject category Computing and Computers ; Detectors and Experimental Techniques
Abstract In scientific data management systems like Rucio [1], knowing at submission time when a file transfer will finish opens a wide range of opportunities to improve the scheduling techniques currently in use, and therefore to optimize the use of available resources. We developed a model that predicts the number of pending transfers in a file transfer system [2] queue at a given time and therefore, with some level of confidence, the estimated time to completion of each transfer. Using data analytics methods on historical data, we also managed to predict the average rate of transfers based only on their sizes. The models use the submission timestamp, i.e., the moment the transfer enters the data management system, and the size of the transfer in bytes to calculate the starting timestamp, i.e., the moment the transfer begins to use the network, and the finishing timestamp. The rate of each transfer needs to be known or approximated, and the limits on concurrent active transfers also need to be known. We obtained the rate approximation by fitting the function described in Equation (1) to 500 random transfers from the first dataset, using ordinary least squares regression from the SciPy optimize package [4].
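The inputs and outputs named in the abstract (submission timestamp, size, rate, and a limit on concurrent active transfers going in; starting and finishing timestamps coming out) can be illustrated with a minimal sketch. This is not the authors' model. It assumes a first-in, first-out queue, a constant rate per transfer, and a single fixed concurrency limit, and the names `simulate_queue` and `pending_at` are hypothetical.

```python
import heapq
from typing import List, Tuple

# (submitted_s, size_bytes, rate_bytes_per_s); the tuple layout is an assumption
Transfer = Tuple[float, float, float]


def simulate_queue(transfers: List[Transfer], max_active: int) -> List[Tuple[float, float]]:
    """Return (start, finish) timestamps for each transfer, in input order.

    Assumes FIFO service by submission time, a constant rate per transfer,
    and at most `max_active` transfers using the network at once.
    """
    if max_active < 1:
        raise ValueError("max_active must be at least 1")

    # Process transfers in submission order; sorted() is stable, so ties
    # keep their input order.
    order = sorted(range(len(transfers)), key=lambda i: transfers[i][0])
    finish_heap: List[float] = []  # finishing times of occupied slots
    result: List[Tuple[float, float]] = [(0.0, 0.0)] * len(transfers)

    for i in order:
        submitted, size, rate = transfers[i]
        if len(finish_heap) < max_active:
            # A slot has never been used yet: start immediately.
            start = submitted
        else:
            # Reuse the slot that frees up first; wait for it if necessary.
            start = max(submitted, heapq.heappop(finish_heap))
        finish = start + size / rate
        heapq.heappush(finish_heap, finish)
        result[i] = (start, finish)
    return result


def pending_at(transfers: List[Transfer], times: List[Tuple[float, float]], t: float) -> int:
    """Count transfers that are submitted but not yet started at time t."""
    return sum(
        1
        for (submitted, _, _), (start, _) in zip(transfers, times)
        if submitted <= t < start
    )


# Three transfers, two concurrent slots.
example = [(0, 100, 10), (0, 50, 10), (1, 20, 10)]
schedule = simulate_queue(example, max_active=2)
```

In this example the third transfer is submitted at t = 1 s but has to wait until the second one finishes at t = 5 s, so `pending_at(example, schedule, 2)` counts one pending transfer. A real system would replace the constant per-transfer rate with the fitted rate approximation that the abstract describes.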
Copyright/License © 2018-2025 IEEE

Corresponding record in: Inspire


Record created 2022-02-03, last modified 2022-02-03