The Ultimate Guide to Distributed Computing (including Xgrid)

October 10th, 2007 by Brandon

Distributed (grid) computing is one of the most practical ways to build your own supercomputer. Whether you’re running Folding@Home to understand protein folding and cure disease, or you’re on the lookout (with 3+ million others) for space invaders with SETI@Home, distributed computing can change the way you compute. You no longer need a single supercomputer that takes up 2000 sq. ft. in your basement; you can harness comparable power by pooling individual computers.

SETI@Home and Folding@Home are distributed computing applications that run on a worldwide scale. With Apple’s Xgrid you can distribute your computing work across a lab of computers or an entire network. This could come in handy when you are rendering large Final Cut Pro files or applying Adobe Photoshop filters to large files. That is how supercomputing can be scaled down from a global scale to your small network or campus.

You could also use U.C. Berkeley’s BOINC client and set up your own distributed computing project on a world scale.

In this post you’ll find info about Apple’s Xgrid and distributed computing with U.C. Berkeley’s BOINC client. If you have any questions, post them in the comments.

Folding@Home - The goal of Folding@Home is to understand protein folding, misfolding, and related diseases.

SETI@Home - SETI@home is a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI). You can participate by running a free program that downloads and analyzes radio telescope data.

ClimatePrediction.net - CP’s goal is to produce a forecast of the climate in the 21st century. ClimatePrediction will try to gauge what will happen next with our climate. From the website, “There is a broad scientific consensus that the Earth will probably warm over the coming century; climateprediction.net should, for the first time, tell us what is most likely to happen.”

Chess960@Home - Chess960@Home, apart from seemingly being dead, is a distributed computing task that is more for pleasure than science. Here is a description, “In Chess960, just before the start of every game, the initial configuration of the chess pieces is determined randomly, that means that the king, the queen, the rook, the bishop and the knight are not necessarily placed on the same home squares as in classical chess.”

Einstein@Home - Einstein@Home is a program that searches for spinning neutron stars (also called pulsars) using data from the LIGO and GEO gravitational wave detectors. Why they do this and what they hope to find is beyond me. The website didn’t have much info about what pulsars are.

Leiden Classical - Leiden Classical is building a grid dedicated to general Classical Dynamics for any scientist or science student. More info on Classical Dynamics can be found here and here.

LHC@Home - LHC@Home supports accelerator physicists simulating the proton beam stability of the future Large Hadron Collider (LHC). As of autumn 2006, there are plans to distribute a second software package, Garfield, which does simulations of gases in high fields, to simulate the behaviour of particle detectors used at the LHC.

NanoHive@Home - The goal of NanoHive@Home is to perform large-scale nanosystems simulation and analysis that is otherwise too intensive to be calculated via normal means, and thereby enable further scientific study in the field of nanotechnology.

Orbit@Home - Orbit@Home is a project which uses the Orbit Reconstruction, Simulation and Analysis framework to monitor the impact hazard posed by Near-Earth objects. This seems to be a small project with an expected donation in the next few months. Currently there are no work units available.

PlanetQuest - PlanetQuest’s scientific mission is the discovery, by PlanetQuesters, of thousands of new planets in our galaxy within the next five years. Over 200 planets around other stars have been discovered since 1995. The difficulty is that planets around other stars are too small and faint to be seen directly. Their presence must be determined indirectly through a process that requires careful analysis of astronomical amounts of astronomical data.

Predictor@Home - Predictor@home is a distributed computing project that aims to predict protein structure from protein sequence in the context of the Critical Assessment of Techniques for Protein Structure Prediction. A major goal of the project is the testing and evaluating of new algorithms to predict both known and unknown protein structures.

PrimeGrid - PrimeGrid is a distributed computing project for searching prime numbers and finding twin primes of world-record size. PrimeGrid worked with the Twin Prime Search to find record-size twin primes that are approximately 58,700 digits long. The project ended when a new twin of that size was discovered on January 15, 2007 (sieved by Twin Prime Search and tested by PrimeGrid). A project to search for twin primes that are just above 100,000 digits long is currently in progress.
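To make the idea of a distributed twin prime search a bit more concrete, here is a toy-scale Python sketch. It is not PrimeGrid's actual code: the function names, the work unit size, and the use of plain trial division are all assumptions chosen for illustration, and real searches work with far larger numbers and much faster tests. The point is only to show how one big search range can be cut into independent work units.

```python
def is_prime(n):
    """Trial division; fine for toy-sized numbers only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def twin_primes_in(start, stop):
    """Return pairs (p, p + 2), with start <= p < stop, where both are prime."""
    return [(p, p + 2) for p in range(start, stop)
            if is_prime(p) and is_prime(p + 2)]


def make_work_units(limit, unit_size):
    """Cut the range [0, limit) into independent (start, stop) work units."""
    return [(lo, min(lo + unit_size, limit))
            for lo in range(0, limit, unit_size)]


# Each work unit could be handed to a different volunteer computer.
# Here they are simply processed one after another.
units = make_work_units(100, 25)
found = []
for lo, hi in units:
    found.extend(twin_primes_in(lo, hi))

print(found)
```

Because no work unit depends on another, it does not matter which computer handles which unit or in what order the results come back.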

QMC@Home - QMC@Home is a project designed to further develop the Quantum Monte Carlo (QMC) method for general use in Quantum Chemistry. With the help of volunteers all over the world we want to acquire the computing power that is needed to test and further develop the opportunities of the promising new approach of Quantum Monte Carlo.

Rosetta@Home - Rosetta@Home’s goal is to develop computational methods that accurately predict and design protein structure and protein complexes. This computational endeavor may ultimately help researchers develop cures for human diseases such as HIV/AIDS, cancer, Alzheimer’s disease, malaria and many other diseases.

Riesel Sieve - Riesel Sieve is a distributed effort to prove the Riesel conjecture by removing prime candidates for the remaining 101 68 K from over 11 million k/n pairs. Individual sieving efforts per single K can take months to reach a sufficient level. This coordinated effort will allow us to sieve 100 times deeper and much quicker. No more sieving to 3T and then stopping in frustration as the hours per factor mount, now we can go to 300T and beyond.
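For readers wondering what "sieving" means here: the Riesel problem concerns numbers of the form k·2^n − 1, and sieving throws away every (k, n) pair whose number has a small prime factor, so that only the survivors need an expensive primality test. The Python sketch below is a tiny illustration of that idea, not the project's software. The function names and the limits are made up for the example, and real sieves go to factor depths in the trillions (the "3T" and "300T" above) rather than 50.

```python
def small_primes(limit):
    """Sieve of Eratosthenes: all primes up to and including limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return [i for i, is_p in enumerate(flags) if is_p]


def sieve_candidates(k, n_max, prime_limit):
    """Return the n in 1..n_max for which k*2^n - 1 survives the sieve,
    i.e. has no proper prime factor up to prime_limit."""
    remaining = set(range(1, n_max + 1))
    for p in small_primes(prime_limit):
        for n in sorted(remaining):
            # Modular arithmetic avoids building the huge number first.
            divisible = (k * pow(2, n, p) - 1) % p == 0
            # Do not discard the case where the number is the prime p itself.
            if divisible and k * 2 ** n - 1 != p:
                remaining.discard(n)
    return sorted(remaining)


survivors = sieve_candidates(k=3, n_max=10, prime_limit=50)
print(survivors)
```

Sieving deeper (a larger `prime_limit`) removes more pairs up front, which is why a coordinated effort that can sieve 100 times deeper saves so much testing time later.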

Spinhenge@Home - Spinhenge@Home belongs to the nanotechnology research area “Molecular Magnets: Controlled Nanoscale Magnetism”, an interdisciplinary research project funded by the U.S. Department of Energy (DOE). In it, physicists, chemists, mathematicians and engineers work on making molecular magnetic materials technologically usable. The mathematical calculations this requires are very time-consuming, so running them in parallel on numerous computers is the obvious approach.

SZTAKI Desktop Grid - Szdg is an online architecture, run by the Laboratory of Parallel and Distributed Systems. The staff of the laboratory maintains the system, which is open to any scientific research seeking immense computing power. Szdg currently hosts one mathematical project. The SZTAKI Desktop Grid is different from other projects in that it accepts applications proposing which scientific problems should be looked at.

Closed Distributed Computing Projects

AFRICA@Home - Currently closed to new members. AFRICA@Home will host a variety of projects (applications). The first application being developed for AFRICA@Home is called MalariaControl.net. This application models the way malaria spreads in Africa and the potential impact that new anti-malarial drugs may have on the region.

Proteins@Home - Proteins@Home is a large-scale protein structure prediction project that will help to advance an important area of science. By increasing our knowledge of proteins, we can contribute to a better understanding of many diseases and pathologies, and to progress in both medicine and technology. Proteins@Home is not for profit.

SIMAP - Similarity Matrix of Proteins, or SIMAP, is a database of protein similarities created using distributed computing, which is freely accessible for scientific purposes. SIMAP uses the FASTA algorithm to precalculate protein similarity, while another application uses Hidden Markov Models to search for protein domains. The database has now been completed, but will be updated for newly discovered proteins. Look for future work in June 2007.

Installing Xgrid

Xgrid is Apple’s solution for distributed computing. Although mainly designed for single-lab and single-campus setups, Xgrid has also been deployed across multiple sites for specialized needs.

To install Xgrid you’ll need to be able to follow simple directions.

What is an Xgrid Agent

The Agent is the worker ant in the Xgrid operation. The agent tells the controller that it is available for tasks. It then receives a task, performs the necessary computations and returns the results to the controller. The agent can be just about any OS X Mac including desktops, laptops, servers and even RAID setups.

What is an Xgrid Controller

The Controller is the queen ant in the Xgrid setup. The controller takes the job you (the client) send it and breaks it into tasks for the agents to handle. This spreads the load and gets the job finished quicker than a single computer could. Once the controller receives the results back from the agents, it passes them back to the client. Each agent can only be connected to one controller, so the controller is the hub of the Xgrid setup.

What is an Xgrid Client

You are the client in the Xgrid setup. The Xgrid client sends jobs to the controller. Once the controller has distributed the tasks and received the results from the agents, it sends the combined results back to the client.
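If the client, controller and agent roles are easier to follow in code, here is a minimal Python sketch of the same flow. It does not use Xgrid or any Apple API. The "agents" are just threads on one machine, and the function names and the sum-of-squares job are assumptions invented for the example. It only mirrors the pattern: the client submits a job, the controller splits it into tasks, the agents compute, and the controller hands the combined result back.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(numbers, task_count):
    """Controller step: break one job into roughly equal tasks."""
    size = max(1, -(-len(numbers) // task_count))  # ceiling division
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def agent_compute(task):
    """Agent step: do the work for one task and return a partial result."""
    return sum(n * n for n in task)


def controller_run(job, agent_count=4):
    """Controller: split the job, hand tasks to agents, gather the results."""
    tasks = split_job(job, agent_count)
    with ThreadPoolExecutor(max_workers=agent_count) as agents:
        partial_results = list(agents.map(agent_compute, tasks))
    # Combine the partial results before returning them to the client.
    return sum(partial_results)


# Client: submit a job (sum of the squares of 1..1000) and wait for the answer.
job = list(range(1, 1001))
result = controller_run(job)
print(result)
```

In a real grid the agents are separate machines, so the controller also has to deal with agents that go offline or return late. The sketch leaves all of that out.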

More Xgrid information

Apple official web page
Apple - Mac OS X - Xgrid
The simple solution for distributed computing. Features (pdf)
The Xgrid Tutorials (Part I): Xgrid Basics
The Xgrid Tutorials (Part II): GridStuffer Basics
The Xgrid Tutorials (Part III): Running Batch Jobs
The Xgrid Tutorials (Part IV): Submit Jobs with Ruby
Server Admin 10.4 Help: About Xgrid
Mac OS X Server (pdf, Xgrid overview)
Xgrid@Stanford Widget - Dashboard widget
Xgrid Fuse
How to enable your AppleTV as an Xgrid node
Public Xgrid projects

Public Xgrid and Distributed Computing Examples

Xgrid at University of Utah

Private Xgrid and Distributed Computing Projects

Xgrid@Stanford - Xgrid@Stanford is currently trying to model the conformational changes of the beta 2 adrenergic receptor in order to better understand its pharmacology.

Posted in Distributed Computing, Resource Guides, Supercomputers, Xgrid
