NEXPReS Project: recent developments in Italy
Mauro Nanni, Franco Mantovani
Istituto di Radioastronomia - INAF, Bologna, Italy


TRANSCRIPT

Slide 1
NEXPReS Project: recent developments in Italy
Mauro Nanni, Franco Mantovani
Istituto di Radioastronomia - INAF, Bologna, Italy

Slide 2
NEXPReS is an Integrated Infrastructure Initiative (I3), funded under the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n. RI-261525.
Design document of the storage element allocation method. This report is the deliverable D8.3 of WP8.
Participants to D8: JIVE, ASTRON, INAF, UMAN, OSO, PSNC and AALTO.

Slide 3
14 institutes, 21 radio telescopes.
Yellow: current operational EVN stations
Cyan/Red: existing telescopes soon to be EVN stations
Cyan/Blue: new EVN stations under construction
Pink: non-EVN stations that have participated in EVN observations
Green: non-EVN stations with which initial EVN tests have been carried out

Slide 4
Data acquisition and storage
EVN observations are:
> concentrated in three periods along the year
> lasting about twenty days each
> a burst in data production (and more time to plan the data storage and data processing)
IVS observations are scheduled more frequently and have a rather shorter duration: there are much less data to handle and less time to process them.
The present analysis is based on the amount of data acquired for both astronomical and geodetic observations during a period of about six years.

Slide 5
Astronomical observations
Total data acquired by EVN antennas in all sessions scheduled during the period 2005-2011.

Slide 6
Data acquired by a single antenna in all sessions scheduled during the period 2005-2011.

Slide 7
Terabyte of data acquired by each EVN antenna in the period 2009-2011.
[Table: data volume in Terabyte per session (2009-1 to 2011-3) for each EVN antenna (Cm, Eb, Wb, Jb, On, Mc, Nt, Tr, Ys, Sv, Zc, Bd, Ur, Sh, Hh, Ar, Ro, Mh, Mer), with the per-antenna maximum over the period.]

Slide 8
Size distribution of the data acquired. Distribution by observing bandwidth.
Information kindly provided by Alessandra Bertarini and Richard Porcas.

Slide 9
Conclusions (from astronomical observations)
Up to 100 Terabyte of data per session per station; some antennas record a lower amount of data.
In a 20-day session, 30 datasets are at present collected on average. We can estimate 50 datasets per session as a realistic upper limit (with 16 antennas we need to manage a maximum of 800 datasets per session).
In the near future, new back-ends will allow 2 Gbps and 4 Gbps bandwidth (at 5 GHz); the capacity to store 150-200 Terabyte of data per station will be required.

Slide 10
Geodetic observations
They do not require too much space on disks; the bottleneck is the data transfer speed from the stations to the correlators.
An antenna needs to store locally up to 5 Terabyte for a few weeks (there is a plan to upgrade the bandwidth to 512 Mbit/s, with up to 12 Terabyte per antenna).
[Table: IVS sessions (R1/R4, Euro, Ohig, T2) with scheduling frequency in days, duration in hours, bandwidth in Mbit/s, data size per antenna in Terabyte, number of antennas and total size in Terabyte; for example, the weekly R1/R4 sessions last 24 hours at 256 Mbit/s and produce about 3 Terabyte per antenna, 24 Terabyte in total from 8 antennas.]

Slide 11
The storage units in the antenna network
Near real-time e-VLBI is a possibility:
a) store the data at the stations
b) transfer the data sets to the correlator via the fibre optic network
c) start the correlation process
This strategy is used in geodetic VLBI observations.

Slide 12
Copy the data of low-bandwidth stations onto the correlator disks.
Correlation of local and remote data.

Slide 13
Let's suppose:
> EU stations can transfer data at 10 Gbps
> extra-EU stations have poorer connectivity
Requirement:
> 300 Terabyte of disk space at the correlator
The time required to transfer all the data depends on the transfer speed (to transfer 40 Terabyte at 512 Mbps requires 5 to 10 days).
The next figure illustrates a possible configuration.
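A quick cross-check of the transfer-time figure quoted on Slide 13, as a minimal Python sketch: the 40 Terabyte volume and the 512 Mbps link speed come from the slide, while the efficiency factor (the fraction of the nominal rate actually achieved end to end) is an assumption added here only for illustration.

```python
# Rough estimate of how long a session's data takes to reach the correlator.
# Volume and nominal link speed are from the slide; the efficiency factor is
# an assumed fraction of the nominal rate actually achieved end to end.

def transfer_days(volume_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move volume_tb over a link of link_mbps at the given efficiency."""
    bits = volume_tb * 1e12 * 8                      # decimal Terabyte to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400.0

if __name__ == "__main__":
    for eff in (1.0, 0.5):                           # ideal link vs. 50% effective throughput
        print(f"40 TB at 512 Mbps, efficiency {eff:.0%}: "
              f"{transfer_days(40, 512, eff):.1f} days")
    # Prints about 7.2 and 14.5 days: the nominal rate sits inside the 5-10 day
    # range quoted on the slide, and halving the effective throughput doubles it.
```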

Slide 14
1. Attended antennas with a local storage system (B)
2. Groups of antennas with a central storage system and a local correlator (C)
3. Unattended antennas with a local storage system (D)
4. Antennas with a good network connection using remote storage (E or A)
5. Antennas with a poor network connection using legacy disk-packs

Slide 15
A possible situation in the future:
> heterogeneous systems at the stations
> more than one correlator in operation
> distributed correlation (data read simultaneously by different correlators)
Requirement: a specialized data centre is needed, located on a primary node of the fibre optic network and able to provide the needed data throughput.

Slide 16
Deliverable D8.3: Hardware design document for simultaneous I/O storage elements.
Search for a possible model. The storage system should be:
> cheap
> a high-performance solution
System: NAS-24D SuperMicro with motherboard X8DTL-IF, CPU Intel Xeon E5620 at 2.4 GHz, RAID board 3Ware 9650SE, 24 disks of 2 Tbyte SATA II.

Slide 17
Tests on 3 different RAID configurations with the standard ext3 Linux file-system:
Single disk: 123 Mbyte/s
24 disks RAID_5: 655 Mbyte/s
24 disks RAID_6: 580 Mbyte/s
24 disks RAID_0: 680 Mbyte/s
Recording speed: 4 Gbit/s.
RAID_5 prevents the loss of all data in case one disk crashes; it can continue to work at a lower recording speed.
RAID_0: a disk crash implies the loss of all the recorded data.
RAID_6 performs like RAID_5 but is more reliable.

Slide 18
Three storage unit prototypes at IRA-INAF:
Motherboard: SuperMicro X8DTH-IF (7 PCI-Express x8 slots)
CPU: 2 x Intel Xeon E5620
RAID board: 3Ware SAS 9750-24i4e (SATA 3 support)
Disks: 12 x 2 Tbyte SATA-3 (12/24 for tests)
Network: Intel 82598EB 10 Gbit/s
Cost per unit: 7,500 Euros

Slide 19
The storage system is managed as a collection of "tanks" of radio data connected to a Cisco 4900 router.

Slide 20
Evaluation of:
> various file-systems and related parameters
> several systems and transmission protocols
Results: excellent performance with the ext4 file-system; increased writing speed of 1 Gbyte/s in RAID_5; transfer speed between tanks of 500 Mbyte/s (using Grid-FTP).
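The throughput numbers on Slides 17 and 20 come from the project's own benchmarks on the prototype hardware. Purely as an illustration, the sketch below shows one way such a sequential-write test could be scripted in Python; the mount point, file size and block size are hypothetical placeholders, not the parameters used in the NEXPReS tests.

```python
# Minimal sequential-write throughput test: stream large blocks to a file on
# the RAID volume and report MiB/s. Placeholder path and sizes; a real test
# should also drop caches and use files much larger than the machine's RAM.
import os, time

def write_throughput(path: str, total_gib: int = 64, block_mib: int = 64) -> float:
    block = os.urandom(block_mib * 1024 * 1024)
    blocks = (total_gib * 1024) // block_mib
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())             # make sure the data really hit the disks
    elapsed = time.time() - start
    return (blocks * block_mib) / elapsed   # MiB/s

if __name__ == "__main__":
    # Hypothetical mount point of a RAID_5/RAID_6 volume on a storage tank.
    rate = write_throughput("/data/raid5/bench.tmp")
    print(f"sequential write: {rate:.0f} MiB/s")
```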
Slide 21
Space allocation method
Stations should have enough disk space available to store the data recorded along a full session (about 100 Tbyte).
Experiments: from 20 to 50 in a session; the typical size of an experiment is 2-4 Terabyte, the maximum size is 20 Terabyte.
An experiment has a file for each scan: hundreds of files of tens of Gbyte, with a maximum file size of more than 1 Terabyte.
Antennas have different storage systems. Example: an economic COTS tank holds from 20 to 80 Terabyte of data, so many tanks are needed; some file systems provide only 16 Terabyte per partition.

Slide 22
Two possible solutions:
a) optimize the disk usage, filling up the partitions regardless of the files produced by different experiments
b) organize each individual experiment in a directory tree
Both solutions require a table at each antenna describing the structure of the storage system and the amount of space available; these tables need to be updated at the beginning of a new session.
The storage allocation method is simple if the files are saved sequentially: an experiment is easily found by its first scan file and by the number of files belonging to that experiment (see the sketch below).
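To make the allocation scheme of Slides 21-22 concrete, here is a minimal Python sketch of the kind of per-antenna table described above: partitions with their free space, sequential placement of an experiment's scan files, and lookup of an experiment from its first scan file and file count. The class names, the example experiment code and the sizes are illustrative assumptions, not part of the NEXPReS design documents.

```python
# Illustrative per-antenna storage table: partitions with their free space,
# sequential allocation of scan files, and lookup of an experiment by its
# first scan file and number of files.
from dataclasses import dataclass, field

@dataclass
class Partition:
    name: str
    free_tb: float                       # space still available, in Terabyte
    files: list = field(default_factory=list)

@dataclass
class StorageTable:
    partitions: list
    index: dict = field(default_factory=dict)   # experiment -> (first file, count)

    def allocate(self, experiment: str, scan_sizes_tb: list) -> None:
        """Store scan files sequentially, filling the partitions in order."""
        first, count = None, 0
        for i, size in enumerate(scan_sizes_tb):
            part = next(p for p in self.partitions if p.free_tb >= size)
            fname = f"{experiment}_scan{i:03d}"
            part.files.append(fname)
            part.free_tb -= size
            first = first or fname
            count += 1
        self.index[experiment] = (first, count)

    def locate(self, experiment: str):
        """Return (first scan file, number of files) for an experiment."""
        return self.index[experiment]

if __name__ == "__main__":
    # A hypothetical tank with two 16 TB partitions (some file systems cap
    # partitions at 16 Terabyte, as noted above).
    table = StorageTable([Partition("p0", 16.0), Partition("p1", 16.0)])
    table.allocate("n14c1", [0.05] * 40)         # 40 scans of about 50 GB each
    print(table.locate("n14c1"))                 # ('n14c1_scan000', 40)
```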
Slide 23
Management of storage units
A common access policy to the network storage system should be established.
Many network authentication/authorization systems can run under Linux: LDAP, Radius, SSH keys, and certificates. A Certification Authority can be established at the correlation centres.
The Grid File Transfer Protocol is under test to evaluate whether it fits our needs (it allows parallel and striped transfers, fault tolerance and restart, third-party transfers, and can also use the TCP, UDP and UDT protocols). Comparison with Tsunami.

Slide 24
Institute of Radio Astronomy Observatories: Medicina and Noto.

Slide 25
Sardinia Radio Telescope

Slide 26
Thanks for your attention

Slide 27

Slide 28