Title: Pairwise Computation in Cloud Infrastructures and the Application
Presenter: Dr. Jin Soung Yoo
Abstract
Computing a function on all pairs of elements in a given set (often called the N-body problem) is a familiar task and a basic building block of many algorithms across a wide range of applications, such as covariance matrix computation, clustering, cross-document reference, graph layout, gene regulatory networks, and spatial statistics.
In an N-body simulation, for example, computing all pairwise forces directly requires a quadratic number of operations at each time step. The challenge grows with the size of the dataset and the complexity of the evaluation function. Execution on a single machine can become infeasible because of memory or computation limits. Consequently, algorithms that leverage parallel infrastructures are needed for pairwise element computation.
Cloud computing frameworks that facilitate the distributed execution of massive tasks have become popular, notably the MapReduce programming model and Spark, running on cloud infrastructures such as those provided by Amazon Web Services (AWS).
In this seminar, we discuss various methods for parallelizing this task, in which each element must be processed with every other element, on a shared-nothing architecture.
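As a rough illustration of what such a parallelization can look like, the sketch below uses a simple block decomposition: the elements are split into blocks, and each unordered pair of blocks becomes one independent task that needs only its two blocks as input. This is one common scheme, not necessarily one of the methods covered in the seminar, and all function names (split_into_blocks, block_tasks, run_task, all_pairs) are hypothetical. The tasks run sequentially here; in a shared-nothing setting each one could be shipped to a different worker.

```python
from itertools import combinations


def split_into_blocks(items, num_blocks):
    """Split items into num_blocks contiguous, near-equal blocks."""
    size, extra = divmod(len(items), num_blocks)
    blocks, start = [], 0
    for b in range(num_blocks):
        end = start + size + (1 if b < extra else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


def block_tasks(num_blocks):
    """Yield one task per unordered pair of block indices (i <= j)."""
    for i in range(num_blocks):
        for j in range(i, num_blocks):
            yield (i, j)


def run_task(blocks, task, func):
    """Evaluate func on every element pair owned by one block-pair task."""
    i, j = task
    if i == j:
        # Pairs inside a single block.
        pairs = combinations(blocks[i], 2)
    else:
        # Pairs that span two different blocks.
        pairs = ((a, b) for a in blocks[i] for b in blocks[j])
    return [(a, b, func(a, b)) for a, b in pairs]


def all_pairs(items, num_blocks, func):
    """Run every task in turn; a cluster would distribute the tasks instead."""
    blocks = split_into_blocks(items, num_blocks)
    results = []
    for task in block_tasks(num_blocks):
        results.extend(run_task(blocks, task, func))
    return results
```

With b blocks there are b(b+1)/2 tasks, and every unordered element pair is evaluated by exactly one of them, so no coordination between tasks is required. The price is data replication: each block is sent to b different tasks.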