Parallelisation and Java

So, let's build our example model. First, how would we build it without parallelisation? Here's some code: Model.java, Agent.java, Landscape.java.

Look through it and try to understand what it is doing. We've documented the code to help, though we've skipped the doc comments on later versions to make it easier to see the code. Start with the class-level comments for Model.java.

The Landscape only really keeps and calculates a map of the agent densities, but it does also give us a framework for a more sophisticated landscape than just a set of arrays, should we wish to expand the model. At the moment the model runs for a fixed number of iterations. As the agents move away from high-density areas, with the idea that they end up evenly distributed, you could replace the run and report sections in Model.java with something that calculates the statistical difference between an even population and the current result stored in Landscape, and uses it in a stopping rule. Note that the density is in a 1D array, mainly because it was eventually going to be converted to an image, though, as it happens, MPI send/recv commands demand 1D arrays.
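As a rough illustration of the stopping-rule idea, here is a minimal sketch. It is not taken from Model.java or Landscape.java: the class name DensityCheck, the methods index and unevenness, the use of an int array, and the row-major layout of the 1D array are all assumptions made for the example. It measures how far the densities are from an even spread as the mean absolute difference from the average density.

```java
// Hypothetical helper, not part of the tutorial's code.
class DensityCheck {

    // Position of cell (x, y) in a 1D array, ASSUMING row-major order.
    static int index(int x, int y, int width) {
        return (y * width) + x;
    }

    // Mean absolute difference between each cell's density and the
    // density every cell would have if the agents were spread evenly.
    // Assumes the array is not empty.
    static double unevenness(int[] densities) {
        int total = 0;
        for (int d : densities) {
            total += d;
        }
        double even = (double) total / densities.length;
        double sum = 0.0;
        for (int d : densities) {
            sum += Math.abs(d - even);
        }
        return sum / densities.length;
    }

    public static void main(String[] args) {
        int width = 2;
        int[] densities = new int[4];      // a 2 x 2 landscape
        densities[index(0, 0, width)] = 4; // all four agents in one cell
        System.out.println("Clustered: " + unevenness(densities));

        int[] spread = {1, 1, 1, 1};       // one agent per cell
        System.out.println("Even: " + unevenness(spread));
    }
}
```

A stopping rule along these lines could end the run once unevenness drops below a threshold of your choosing, rather than after a fixed number of iterations.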

NB: Notice also that most of the code is in "main" -- MPI is a bit funny about running code outside of the main method, but inside the main class, so we're starting from a point in which everything is done in main.

Run the code and see what it does. We'll now parallelise it. There's no need to edit the code - we'll give you a full copy at the end - just spend the time trying to understand the changes.

To parallelise the code, we need to think about two types of node (processors, PCs, virtual machines, etc.): the node we start the code running from, which is where we want to see the results (we'll call this node zero), and all the other nodes (which we'll call "worker nodes", and which are numbered one upwards). We want to get different things to happen on different nodes. The key thing to understand is that each node gets its own copy of the code, but that it is all of the code: an identical copy, not just the bit that runs on that node. We control which bits of the code run on which node by finding out the number of the node the code is actually running on and, for example, using "if" statements to say that some things should only happen if the code is running on node zero.
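The idea that every node runs the same code and branches on its own number can be sketched without MPI at all. In the sketch below a simple loop stands in for the separate copies of the program; the class name NodeBranching, the method roleOf, and the value of numberOfNodes are made up for illustration, and under real MPI each copy would only ever see its own node number.

```java
// Simulation only: real MPI starts one copy of the program per node.
class NodeBranching {

    // Identical code for every node; the node number picks the branch.
    static String roleOf(int node) {
        if (node == 0) {
            return "node 0: collect and report results";
        }
        return "node " + node + ": worker";
    }

    public static void main(String[] args) {
        int numberOfNodes = 3; // assumed value, for illustration
        // Each pass of the loop plays the part of one node's copy.
        for (int node = 0; node < numberOfNodes; node++) {
            System.out.println(roleOf(node));
        }
    }
}
```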

MPI supplies a way of finding out the number of nodes in total and the current node number. We need to add the following to the Model class (new code in pink):

    import mpi.*;

    public class Model {

        public static void main (String args[]) {

            int node = 0;
            int numberOfNodes = 0;

            try {
                MPI.Init(args);
                node = MPI.COMM_WORLD.Rank();
                numberOfNodes = MPI.COMM_WORLD.Size();
            } catch (MPIException mpiE) {
                mpiE.printStackTrace();
            }

This imports the MPI package, starts the MPI communication framework, and finds out the number of the current node and the total number of nodes.

We can now make our code node specific. For example, telling it to report only on node zero:

    // Report
    if (node == 0) {
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                System.out.print(landscape.getDensity(x,y) + " ");
            }
            System.out.println("");
        }
    }

From now on, the code above the Report section runs on all nodes, but the Report section itself only runs on node zero. Neat, huh?

Next we'll divide up the agents between the nodes.