T3. Optimization of data access and transfer in LCG
Task coordinator: V. Korenkov (JINR); deputy: V. Mitsyn (JINR)
Other participants: E. Tickhonenko (JINR), V. Motyakov and V. Kotlyar (both IHEP), L. Guy (CERN-IT), M. Kunze (FZK)
A cluster of Tier2 centers will be created in Russia as an operational part of the LCG infrastructure. The corresponding distributed facilities will consist of several computing farms located at different sites (Moscow institutes, Protvino, Dubna, St. Petersburg and Novosibirsk). These institutes have, and will continue to have, access with differing characteristics to Tier1 centers (at CERN and/or FZK). Permanent data storage facilities will be created in the Russian Tier2 cluster, but they will differ from site to site. Moreover, the overall facilities will most probably not fully meet the requirements of Russian physicists. Together these conditions call for the development of algorithms and software to optimize data migration paths between end-users and the Russian Tier2 centers, and between those centers and the Tier1 sites. In particular, one should determine where data are to be kept at Russian institutes and find methods for quick access to data inside the Russian Tier2 centers and, if necessary, at the Tier1 centers. This must take into account the current rates of the communication channels, the topology of the Tier2 cluster, and the time and place of data processing.
The main goals are: 1) optimization of the use of the Russia-LCG (GEANT) and regional communication links; 2) effective use of the permanent storage facilities and CPU resources of the Russian Tier2 cluster; 3) stable and effective access of end-users (physicists) to data stored in the Tier2-Tier1 system.
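The need to account for channel rates and cluster topology when choosing a migration path can be illustrated with a small sketch. It is not the task's algorithm, which is still to be developed. It assumes a store-and-forward model in which each hop costs file size divided by the measured link rate, and it applies Dijkstra's shortest-path search to that cost. The function name `fastest_path`, the site labels and the rates are hypothetical.

```python
import heapq


def fastest_path(links, source, target, size_mb):
    """Return (path, seconds) for the quickest route from source to target.

    links maps an undirected pair (site_a, site_b) to a measured rate in MB/s.
    Assumption: each hop costs size_mb / rate (store-and-forward transfer).
    """
    graph = {}
    for (a, b), rate in links.items():
        if rate <= 0:
            continue  # a link with no measured throughput is unusable
        cost = size_mb / rate
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {source: 0.0}   # best known transfer time to each site
    prev = {}              # predecessor of each site on the best route
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == target:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in graph.get(node, []):
            total = elapsed + cost
            if total < best.get(nxt, float("inf")):
                best[nxt] = total
                prev[nxt] = node
                heapq.heappush(heap, (total, nxt))

    if target not in best:
        return None, float("inf")

    # Walk the predecessor chain back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical rates in MB/s; real values would come from channel monitoring.
example_links = {
    ("CERN", "Moscow"): 10.0,
    ("Moscow", "Dubna"): 100.0,
    ("CERN", "Dubna"): 5.0,
}
```

With these illustrative rates, a 1000 MB file reaches Dubna faster through Moscow (two fast hops) than over the slow direct link, which is the kind of decision the optimization is meant to automate.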
The work covers analysis of current data transfer rates and prediction of the free space available for data storage, together with an accounting system for the actual data transfer rates over the different communication channels. A toolkit is to be developed that optimizes data migration paths and provides quick access to data, with optional prefetching of data to a required location both on user demand and semi-automatically. Development will rely on the existing GRID/LCG software; in addition, it is proposed to develop LCG tools in the form of low-level sensor sets for monitoring data channels and free storage space. All of this will be considered in the context of the complex hierarchical access of end-users at Russian Tier2 clusters, both within the cloud of distributed clusters in Russia and to the Tier1 center(s) in Europe.
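As an illustration of what a low-level free-space sensor and a free-space prediction might look like, the following minimal sketch samples the free space of a local file system and extrapolates a series of such samples with an ordinary least-squares line. The linear trend is an assumption made for the example only, and the names `sample_free_bytes` and `predict_free` are hypothetical; the text does not prescribe a prediction method.

```python
import shutil


def sample_free_bytes(path="."):
    """Sensor reading: free bytes on the file system that holds path."""
    return shutil.disk_usage(path).free


def predict_free(samples, t_future):
    """Extrapolate free space to time t_future with a least-squares line.

    samples is a list of (time, free_space) pairs in any consistent units.
    Assumption: free space changes roughly linearly over the sampled period.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    slope = sum((t - mean_t) * (f - mean_f) for t, f in samples) / var_t
    return mean_f + slope * (t_future - mean_t)


# Hypothetical history: free space in GB measured on three consecutive days.
history = [(0, 100.0), (1, 90.0), (2, 80.0)]
```

A monitoring daemon could call `sample_free_bytes` periodically at each site and pass the collected history to `predict_free` to estimate whether a planned transfer will still fit when it arrives.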
The task depends on:
- deployment of LCG to create the Russian GRID segment; the current infrastructure of the Russian LCG-1 segment (and further releases) will be used;
- statistics on the Russia-GEANT channel;
- statistics on the free space available for data storage;
- statistics on data access at different nodes of the Russian LCG segment;
- Replica Location Service, Replica Location Index and Local Replica Catalog for determining data locations.
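How replica locations and channel statistics could be combined is sketched below. The plain dictionary `catalog` is a hypothetical stand-in for the answer of a replica catalog lookup and does not reproduce the interface of the Replica Location Service; the function `choose_replica`, the file name and the rates are likewise assumptions made for illustration.

```python
def choose_replica(catalog, rates, lfn, destination):
    """Pick the site from which to fetch the logical file lfn.

    catalog: {logical_file_name: [site, ...]} (stand-in for a catalog lookup)
    rates:   {(source_site, destination_site): measured rate in MB/s}
    Returns the destination itself if it already holds a replica, otherwise
    the replica site with the highest measured rate, or None if none is usable.
    """
    sites = catalog.get(lfn, [])
    if destination in sites:
        return destination  # data is already local, no transfer needed
    candidates = []
    for site in sites:
        rate = rates.get((site, destination), 0.0)
        if rate > 0:
            candidates.append((rate, site))
    if not candidates:
        return None
    return max(candidates)[1]


# Hypothetical catalog content and channel statistics.
example_catalog = {"lfn:run1.dat": ["CERN", "FZK"]}
example_rates = {("CERN", "JINR"): 4.0, ("FZK", "JINR"): 6.0}
```

In this example a request from JINR would be served from FZK, the replica site with the higher measured rate.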
Results, milestones:
- development of algorithms for monitoring communication channels and free storage space, followed by development of algorithms for optimizing data migration paths and the use of available storage space, Feb. 2004 - July 2004;
- development of monitoring software, Aug. 2004 - Dec. 2004;
- development of software for data migration handling, including data prefetching, Jan. 2005 - Dec. 2005.
The main results of this task are expected to be:
- optimization of the use of communication channels between the Russian LCG segment and the Tier1 centers at CERN and FZK,
- optimization of the use of the space available for data storage, and
- reduction of data access delays inside the Russian Tier2 cluster.
Toolkit and sensor sets for monitoring communication channels and storage space. Tools for optimizing data migration paths. Detailed report on activities.