Wiki:Projects
Swift: Fast, Reliable, Loosely Coupled Parallel Computation
Scientific computation often involves thousands or even millions of tasks operating on large quantities of data. Such data is often diversely structured and stored in heterogeneous physical formats, and scientists must specify and run such computations over extended periods on collections of compute, storage and network resources that change frequently. Swift is a parallel programming tool designed to address these challenges through concise specification and fast, reliable execution of large-scale scientific computations.
Falkon: a Fast and Light-weight tasK executiON framework
Falkon aims to enable the rapid and efficient execution of many independent jobs on large compute clusters. Falkon combines three techniques to achieve this goal: (1) multi-level scheduling, which separates resource provisioning from the dispatch of user tasks to those resources; (2) a streamlined task dispatcher able to achieve order-of-magnitude higher task dispatch rates than conventional schedulers; and (3) data caching with a data-aware scheduler, which leverages co-located computational and storage resources to minimize the use of shared storage infrastructure. Falkon has been a Globus incubator project since November 2007; the incubator site has information about source code downloads, installation and usage instructions, and mailing lists. The Falkon Globus Incubator Site can be found here; the general Falkon project page can be found here.
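The data-aware scheduling idea in technique (3) can be illustrated with a small sketch. This is not Falkon's implementation or API; the function names (`pick_worker`, `dispatch`), the worker and file names, and the policy (prefer a worker that already caches the needed file, otherwise the least-loaded worker) are all assumptions made for illustration.

```python
def pick_worker(needed_file, caches, queued):
    """Hypothetical policy: prefer workers that already cache the file."""
    holders = [w for w in caches if needed_file in caches[w]]
    # Fall back to every worker when no cache holds the file.
    candidates = holders if holders else list(caches)
    # Break ties by queue length, then by name for determinism.
    return min(candidates, key=lambda w: (queued[w], w))


def dispatch(tasks, workers):
    """Assign (task_id, needed_file) pairs and count shared-storage reads."""
    caches = {w: set() for w in workers}   # files cached on each worker
    queued = {w: 0 for w in workers}       # tasks assigned to each worker
    shared_reads = 0
    placement = []
    for task_id, needed_file in tasks:
        worker = pick_worker(needed_file, caches, queued)
        if needed_file not in caches[worker]:
            # Cache miss: the worker must fetch from shared storage once.
            shared_reads += 1
            caches[worker].add(needed_file)
        queued[worker] += 1
        placement.append((task_id, worker))
    return placement, shared_reads


if __name__ == "__main__":
    tasks = [("t1", "a.fits"), ("t2", "b.fits"),
             ("t3", "a.fits"), ("t4", "b.fits")]
    placement, shared_reads = dispatch(tasks, ["w1", "w2"])
    print(placement)
    print("shared-storage reads:", shared_reads)
```

In this toy run, four tasks touching two files cause only two reads from shared storage, because the repeated tasks are routed to the worker that already holds the file.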
AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis
The AstroPortal is an astronomy gateway to grid resources on the TeraGrid using the Falkon framework. The astronomy dataset used is the Sloan Digital Sky Survey (SDSS), DR4/DR5, which comprises over 300 million objects dispersed over 1.3 million files, adding up to 3 terabytes of compressed data or over 9 terabytes of uncompressed imaging data. The analysis currently supported by the AstroPortal prototype is stacking, the summation of multiple observations of the same part of the sky; stacking will help both to identify variable sources and to detect faint objects. The AstroPortal gives the astronomy community a new tool to advance their research and opens doors to opportunities never before possible on such a large scale.
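Stacking, as described above, is a pixel-wise summation of several observations of the same sky region. The sketch below shows only that arithmetic on tiny in-memory grids; the `stack` function and the sample values are hypothetical, and it does not reflect how the AstroPortal reads SDSS files or aligns images.

```python
def stack(cutouts):
    """Sum equally sized 2-D cutouts pixel by pixel."""
    if not cutouts:
        raise ValueError("need at least one cutout")
    rows, cols = len(cutouts[0]), len(cutouts[0][0])
    stacked = [[0.0] * cols for _ in range(rows)]
    for image in cutouts:
        if len(image) != rows or any(len(row) != cols for row in image):
            raise ValueError("all cutouts must have the same shape")
        for y in range(rows):
            for x in range(cols):
                stacked[y][x] += image[y][x]
    return stacked


if __name__ == "__main__":
    # Three made-up 2x2 observations of the same patch of sky.
    observations = [
        [[0.5, 1.0], [0.0, 0.5]],
        [[0.5, 1.5], [0.5, 0.0]],
        [[0.0, 1.0], [0.5, 0.5]],
    ]
    for row in stack(observations):
        print(row)
```

A source that is barely visible in any single cutout accumulates signal in the sum, which is why stacking helps with faint objects.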
DiPerF: an automated DIstributed PERformance testing Framework
DiPerF aims to simplify and automate service performance evaluation. DiPerF coordinates a pool of machines that test a target service, collects and aggregates performance metrics, and generates performance statistics. The aggregate data collected provide information on service throughput, on service ‘fairness’ when serving multiple clients concurrently, and on the impact of network latency on service performance. Furthermore, using this data, it is possible to build predictive models that estimate a service's performance given the service load.
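As a rough illustration of the kind of aggregation described above, the sketch below combines per-client samples into throughput, mean response time, and a fairness score. The `aggregate` function and the sample format are hypothetical, and the use of Jain's fairness index is an assumption; the text does not say how DiPerF defines fairness.

```python
def aggregate(samples, window_seconds):
    """Summarize (client_id, response_time_seconds) samples from one window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    per_client = {}
    for client, response_time in samples:
        per_client.setdefault(client, []).append(response_time)
    counts = [len(times) for times in per_client.values()]
    total = len(samples)
    # Jain's index (assumed metric): 1.0 when every client completes
    # the same number of requests, lower when service is uneven.
    fairness = (
        sum(counts) ** 2 / (len(counts) * sum(c * c for c in counts))
        if counts else 0.0
    )
    mean_response = sum(rt for _, rt in samples) / total if total else 0.0
    return {
        "throughput": total / window_seconds,  # completed requests per second
        "mean_response": mean_response,
        "fairness": fairness,
    }


if __name__ == "__main__":
    samples = [("a", 0.1), ("a", 0.3), ("a", 0.2), ("b", 0.4)]
    print(aggregate(samples, window_seconds=2.0))
```

Repeating this per time window while varying the number of concurrent clients would give the load-versus-performance data points that a predictive model could be fitted to.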
ServMark: an Architecture for Testing Grid Services
ServMark is the result of the integration of the GrenchMark project and DiPerF. ServMark addresses two orthogonal research questions: (1) How to test a large-scale, distributed, and (grid-)service-based environment? and (2) How to generate realistic testing traces for a wide range of testing scenarios?
DI-GRUBER: A Distributed Grid Resource Broker
DI-GRUBER, an extension to the GRUBER brokering framework, was developed as a distributed grid usage SLA-based (uSLA-based) resource broker that allows multiple decision points to coexist and cooperate in real time. DI-GRUBER ultimately addresses issues regarding how uSLAs can be stored, retrieved, and disseminated efficiently in a large distributed environment.
GangSim Simulator
GangSim is a tool developed for Grid scheduling studies, capable of supporting studies for controlled resource sharing based on uSLAs. The new name, GangSim, reflects both the origins of the implementation (Ganglia Monitoring Toolkit) and the fact that it can be used to simulate "gangs" of consumers and resources.
GriPhyN: Grid Physics Network
The Grid Physics Network collaboration is a team of experimental physicists and information technology (IT) researchers who plan to implement the first Petabyte-scale computational environments for data intensive science in the 21st century. Driving the project are unprecedented requirements for geographically dispersed extraction of complex scientific information from very large collections of measured data. To meet these requirements, which arise initially from the four physics experiments involved in this project but will also be fundamental to science and commerce in the 21st century, GriPhyN will deploy computational environments called Petascale Virtual Data Grids (PVDGs) that meet the data-intensive computational needs of a diverse community of thousands of scientists spread across the globe.
beta-Grid
The beta-Grid project aims to:
- define the behavior of a standard "Grid-enabled cluster" in terms of the protocols it must speak, the scheduling disciplines it must implement, its performance characteristics, etc.
- develop a standard software suite, comprising protocol implementations, schedulers, configuration scripts, etc., that implements the standard behaviors.
GrADS: Grid Application Development Software Project
The goal of the Grid Application Development Software Project is to simplify distributed heterogeneous computing in the same way that the World Wide Web simplified information sharing over the Internet. The GrADS project will explore the scientific and technical problems that must be solved to make grid application development and performance tuning for real applications an everyday practice.
