DOT Research Projects
DOT enables crucial research that has, until now, been restricted by the unavailability of persistent distributed systems that combine highly distributed computational and data facilities, advanced optical networks interlinking them, and sophisticated supporting middleware. The DOT testbed has been established to facilitate a wide range of research projects that can benefit from this type of infrastructure. Although the work on DOT will initially focus on a select number of specific applications, the techniques developed on this testbed will be made available to the high-performance computing community for general use.
The preliminary research activities on DOT are being driven by three applications: (1) ENZO, an adaptive cosmological application, (2) Cactus, an open framework used to solve Einstein’s equations, and (3) AudioVoice, a virtualized distributed audio application with physical simulations that have real-time deadlines and varying computational demands. All three of these applications have been parallelized using MPI. Each presents a unique challenge: adaptivity, a flexible framework, and simulations with real-time deadlines, respectively. These applications therefore have the potential to benefit from DOT research activities in the areas of dynamic load balancing, performance monitoring and prediction, and data management. Consequently, these techniques are the focus of the initial DOT research. They are also required by many other high-performance computing applications and are representative of the critical areas that often hinder the performance of such applications.
Current research activities and investigators are:
- Dynamic Load Balancing (V. Taylor): the development of techniques that take into consideration the heterogeneity of the processors and networks of distributed systems to dynamically balance the load during execution; the techniques utilize network performance predictions.
- Performance Monitoring and Prediction (P. Dinda, X. Sun, V. Taylor): the extension of performance monitoring, modeling and prediction techniques, which have so far focused on parallel systems and broadband networks, to distributed systems with optical networks and different topologies.
- Data Management (A. Choudhary, I. Foster): the development of techniques that manage the distributed data such that the actual data location is transparent and the data is accessed efficiently.
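As a rough illustration of the dynamic load balancing item above, the following minimal sketch splits a fixed number of work units across heterogeneous processors in proportion to an effective processing rate that combines compute speed with a predicted per-unit communication cost. It is not the DOT project's method; the function name `partition_work`, its parameters, and the rate model are assumptions made purely for illustration.

```python
def partition_work(total_units, speeds, comm_costs):
    """Split total_units across processors in proportion to effective rate.

    speeds[i]     -- hypothetical compute rate of processor i (units/second)
    comm_costs[i] -- hypothetical predicted network time per unit (seconds)
                     for shipping work to processor i
    """
    if len(speeds) != len(comm_costs):
        raise ValueError("speeds and comm_costs must have the same length")

    # Assumed model: one unit costs (1 / speed) seconds of compute plus the
    # predicted communication time, so the effective rate is the inverse.
    rates = [1.0 / (1.0 / s + c) for s, c in zip(speeds, comm_costs)]
    total_rate = sum(rates)

    # Proportional share, rounded down to whole work units.
    shares = [int(total_units * r / total_rate) for r in rates]

    # Give the units lost to rounding to the processors with the highest rates.
    leftover = total_units - sum(shares)
    fastest_first = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Two equally fast processors; the second sits behind a slower link.
    print(partition_work(100, [1.0, 1.0], [0.0, 1.0]))
```

In a dynamic setting, a function like this would be called again during execution whenever the measured speeds or predicted network costs change, and work would be migrated to match the new shares.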