William: February 2011

Wednesday, February 23, 2011

MURPA-Lin Wei-Week 7

1. User Interface of OpalNimrod actor
In week 7, I implemented the user interface of the OpalNimrod actor. The screenshot below shows users selecting an application name from a drop-down list. From the users' point of view, this is more convenient than manually entering a URL because they don't need to know the details of any service URL. The actor obtains the service URLs and host metadata from the Opal registry.

A dummy XML file representing the Opal registry was uploaded to the NBCR Wiki. The OpalNimrod actor accesses this file to retrieve the service URLs and the number of free CPUs. The real Opal registry may be implemented in a database in the future, but for the purposes of this project, and for simplicity, a dummy XML file is sufficient.
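The registry lookup described above could be sketched roughly as follows, using the standard Java DOM parser. The element and attribute names (`service`, `app`, `url`, `freeCpus`), the example URLs and the `RegistryReader` class are all assumptions made for illustration; the actual dummy file on the wiki may be laid out differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RegistryReader {
    // Hypothetical registry layout; the real dummy file may look different.
    static final String SAMPLE =
        "<registry>"
        + "<service app=\"PDB2PQR\" url=\"http://host-a.example/pdb2pqr\" freeCpus=\"4\"/>"
        + "<service app=\"PDB2PQR\" url=\"http://host-b.example/pdb2pqr\" freeCpus=\"12\"/>"
        + "<service app=\"Autodock\" url=\"http://host-a.example/autodock\" freeCpus=\"0\"/>"
        + "</registry>";

    /** Maps each service URL offering the given application to its free-CPU count. */
    static Map<String, Integer> freeCpusFor(String xml, String app) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList services = doc.getElementsByTagName("service");
            Map<String, Integer> result = new LinkedHashMap<>();
            for (int i = 0; i < services.getLength(); i++) {
                Element service = (Element) services.item(i);
                if (app.equals(service.getAttribute("app"))) {
                    result.put(service.getAttribute("url"),
                               Integer.parseInt(service.getAttribute("freeCpus")));
                }
            }
            return result;
        } catch (Exception e) {
            // Parsing or number-format problems are reported as one unchecked error.
            throw new IllegalStateException("Could not read registry XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(freeCpusFor(SAMPLE, "PDB2PQR"));
    }
}
```

The same set of `app` values could also be used to fill the drop-down list, so the interface and the lookup would stay in step with the file.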

2. Using Nimrod/K Scheduler in OpalNimrod Actor
I'm thinking about creating an instance of the NimrodResource class in the OpalNimrod actor to interact with the Opal registry and select a service URL. Unfortunately, the NimrodResource class assigns an actor to a resource with available slots rather than a URL. So an OpalResource class needs to be created to return a service URL to the job submission side of the OpalNimrod actor. Since the OpalResource class is only useful to the OpalNimrod actor, I plan to write it as a nested class in the actor.

3. Autodock workflow

For unknown reasons, the Opal actor doesn't work with the Sequence to Tag actor to execute the Autodock application concurrently in Kepler 1.0. However, the Autodock workflow works well in Kepler 2.0 without the Sequence to Tag actor. An Autodock workflow tutorial was uploaded to the NBCR Wiki Kepler page.

Tuesday, February 15, 2011

MURPA-Lin Wei-Week 6

This week, Wilfred agreed that we should build an OpalNimrod actor. The steps required to implement a prototype of the OpalNimrod actor are:

1. Create a dummy XML file that stores application metadata.

2. Create an OpalNimrod actor based on the existing Opal actor.

3. The OpalNimrod actor uses the Java Document Object Model to access the XML file.

4. Create a new user interface for the OpalNimrod actor that allows users to select an application name from a drop-down list instead of defining the service URL. The drop-down list is dynamically generated from the dummy XML file. After users select an application and click the commit button, the interface should display the input parameters for the selected application.

5. After users specify the parameter values and run the workflow, all the service URLs that run the application are stored in an array along with their metadata. The array is passed to the Nimrod/K scheduler. The scheduler chooses a service URL based on the number of free CPUs.

6. Finally, the input parameters are submitted to the chosen service URL.

The actor can be tested by changing the number of free CPUs in the XML file and running a PDB2PQR workflow. ws.nbcr.net and kryptonite are the stable hosts that provide the PDB2PQR web service, so the Nimrod scheduler should choose between these two hosts.
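The expected behaviour of that test can be sketched with a small stand-in for the selection step. This is not the Nimrod/K scheduler or its API; it is a plain "most free CPUs wins" rule written only to show what result the test should produce. The `FreeCpuChooser` and `ServiceInfo` names and the free-CPU values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class FreeCpuChooser {
    /** One registry entry: a host label and its advertised free-CPU count. */
    static final class ServiceInfo {
        final String host;
        final int freeCpus;

        ServiceInfo(String host, int freeCpus) {
            this.host = host;
            this.freeCpus = freeCpus;
        }
    }

    /** Returns the host with the most free CPUs, or null if no host has any. */
    static String choose(List<ServiceInfo> candidates) {
        ServiceInfo best = null;
        for (ServiceInfo s : candidates) {
            if (s.freeCpus > 0 && (best == null || s.freeCpus > best.freeCpus)) {
                best = s;
            }
        }
        return best == null ? null : best.host;
    }

    public static void main(String[] args) {
        // Free-CPU values are made up, as if edited by hand in the dummy XML file.
        System.out.println(choose(Arrays.asList(
            new ServiceInfo("ws.nbcr.net", 2), new ServiceInfo("kryptonite", 8))));
        System.out.println(choose(Arrays.asList(
            new ServiceInfo("ws.nbcr.net", 6), new ServiceInfo("kryptonite", 1))));
    }
}
```

Swapping the two values in the XML file should flip the chosen host, which gives a simple pass/fail check for the prototype.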

Apart from creating an OpalNimrod actor, I will write a tutorial for the NBCR wiki page on how to run an Autodock virtual screening workflow, because the new Opal plug-in will be released soon.

Tuesday, February 8, 2011

MURPA-Lin Wei-Week 5

In week 5, I spent most of my time fixing the opal.xml workflow, working on the OpalResource class design and studying OpalClient.java. The workflow still doesn’t work: I have tried all the suggestions made by my supervisors as well as my own idea, but the error remains.

Opal.xml workflow

The actions I performed include:

Action	Detailed Step	Results
Colin’s suggestion	1. Install Nimrod/K and copy all the jar files that Colin gave me to Kepler. 2. Use this URL for the workflow: http://kryptonite.nbcr.net/app1267557427870/. Jane has double-checked for me that this is a working URL. 3. In the workflow, update the Constant actor, update all the "u" fields with this URL and update the "dpf" fields with the local copy (where Kepler is) of 2HU4_B.dpf. 4. Run the workflow.	ptolemy.kernel.util.IllegalActionException: We could not find the attribute filter_file in .opal.OpalClient_C0_TCA.OpalClient_C0 with tag colour {sequenceID=0, metadata={}, parameters={dpf="/home/linwei/kepler/2HU4_B.dpf", u="http://kryptonite.nbcr.net/app1267557427870/"}, hashcode=65311548} in .opal.OpalClient_C0_TCA.OpalClient_C0 at edu.sdsc.nbcr.opal.OpalClient._makeCmd(OpalClient.java:898) at edu.sdsc.nbcr.opal.OpalClient.fire(OpalClient.java:235) at org.monash.nimrod.NimrodDirector.NimrodProcessThread.run(NimrodProcessThread.java:394)
Wilfred’s suggestion	1. Export OpalClient.java in Kepler 2.0 as an OpalClient.jar file. 2. Replace the OpalClient.jar in Kepler 1.0 with the new OpalClient.jar. 3. Run the workflow with the URL and the dpf file. I’m not sure if these are the correct steps, but the error message indicates that OpalClient.java may be different because the OpalClient._makeCmd calls are on different lines.	ptolemy.kernel.util.IllegalActionException: We could not find the attribute filter_file in .opal.OpalClient_C0_TCA.OpalClient_C0 with tag colour {sequenceID=0, metadata={}, parameters={dpf="/home/linwei/kepler/2HU4_B.dpf", u="http://kryptonite.nbcr.net/app1267557427870/"}, hashcode=65311548} in .opal.OpalClient_C0_TCA.OpalClient_C0 at edu.sdsc.nbcr.opal.OpalClient._makeCmd(OpalClient.java:893) at edu.sdsc.nbcr.opal.OpalClient.fire(OpalClient.java:235) at org.monash.nimrod.NimrodDirector.NimrodProcessThread.run(NimrodProcessThread.java:394)
My idea	1. In Kepler 1.0, open the PDB2PQR.xml workflow that was created in Kepler 2.0. 2. Open the opal.xml workflow in Kepler 1.0. 3. Copy the OpalClient actor from the PDB2PQR workflow and paste it into the opal.xml workflow to replace the original OpalClient actor. 4. Use the same serviceURL, u, dpf file and all other parameter values for the copied OpalClient actor. 5. Run the workflow and skip the errors.	Interestingly, the workflow runs successfully, although error messages flood the entire error message dialogue window. I showed Jane the generated web pages and she confirmed that the job was actually executed.

It remains unknown whether the OpalClient actor was modified before being used in the opal.xml workflow. Everyone involved in developing the opal.xml workflow has been consulted, and no one could tell exactly what he or she did when creating the workflow. However, I suspect that the OpalClient actors are different because the interfaces of the actors in Kepler 2.0 and Kepler 1.0 are different. Here are the screenshots of the interfaces:

OpalClient interface in Kepler 1.0

OpalClient interface in Kepler 2.0

OpalResource class Design

The OpalResource class design needs to be modified. Wilfred pointed out that if we build the OpalResource class inside the OpalClient actor, the Opal actor will depend on the Nimrod/K API. Due to copyright issues, Nimrod/K does not come with the standard Kepler distribution; users have to agree to the licence, then download and install Nimrod/K manually. Wilfred suggests creating an OpalNimrod actor that does the same job as the OpalResource class, but this adds a maintenance burden since we will end up with two actors. We will discuss the design issues further with Colin in week 6.

Spirit Night

We went out to watch a basketball competition at UCSD on Spirit Night. I was amazed by the passionate students and the atmosphere of the competition. It’s a pity that Monash doesn’t have such sports events.

Tuesday, February 1, 2011

MURPA-Lin Wei-Week 4

Progress of the project

During the Skype chat with Colin, I was advised that Nimrod/K has both parameter sweep and meta-scheduling capabilities. Therefore, instead of deploying a scheduler or Nimrod/G in the Opal server to make scheduling decisions, we can use the existing scheduler inside Nimrod/K. There are two reasons for this choice. First, researchers at MeSsAGE Lab are focusing on developing Nimrod/K, so using Nimrod/K lets us leverage its new features and functions. Second, end users will have the freedom to configure the Nimrod/K actors in Kepler to fulfil their specific resource requirements. For instance, users can specify the number of CPUs and the amount of memory to be assigned to execute a job.

The initial design of how to use the Nimrod/K scheduler with Opal web services is detailed as follows:

Overview of the OpalResource Class

In order to utilize Nimrod/K’s meta-scheduling capability, a new class called OpalResource needs to be introduced to the Opal actor. The OpalResource class handles three major tasks.

1. It synchronizes the metadata about the Opal resource (total number of CPUs, number of free CPUs, free memory, etc.) with the Opal server. The Opal resource could be a cluster, grid or cloud resource that runs scientific applications. At the initial stage of the project, the scheduler makes allocation decisions based on the number of CPUs required to execute a job.

2. It provides the Opal resource metadata to the scheduler.

3. It monitors and maintains the job submission and resource allocation requests to the Opal server.

Both the scheduler and the Opal actor have a reference to the OpalResource class so that they can interact with the Opal resource.

Creating OpalResource Instances

This diagram illustrates how the OpalResource class creates instances of Opal resources. When end users click the run button in Kepler to execute the Autodock workflow, it triggers the OpalResource class. The OpalResource class requests the metadata of the resources that can run the Autodock application. Since the Opal server has records of all the Autodock resources, it can fulfil the request simply by transferring the Autodock resource metadata to the OpalResource class. Then, for each Autodock resource, the OpalResource class creates an OpalResource instance to keep track of the resource metadata. It is important to note that the entire process is synchronized to ensure that the resource metadata in the Opal server is identical to that in the OpalResource instances. For instance, if 10 CPUs become available in an Autodock resource, the change is also reflected in the corresponding OpalResource instance.
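The instance-per-resource idea described above might look something like the sketch below. The `OpalResourceSketch` class, its `sync` method and the snapshot map standing in for the server's reply are all hypothetical; only the `getFreeCPUs` name comes from this design, and how the real class would talk to the Opal server is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OpalResourceSketch {
    // One local instance per resource known to the server, keyed by resource name.
    private static final Map<String, OpalResourceSketch> INSTANCES = new LinkedHashMap<>();

    private final String name;
    private int freeCpus;

    private OpalResourceSketch(String name) {
        this.name = name;
    }

    /** Creates, updates or drops instances so they match the server snapshot. */
    static synchronized void sync(Map<String, Integer> serverSnapshot) {
        // Forget resources the server no longer reports.
        INSTANCES.keySet().retainAll(serverSnapshot.keySet());
        for (Map.Entry<String, Integer> entry : serverSnapshot.entrySet()) {
            OpalResourceSketch resource = INSTANCES.get(entry.getKey());
            if (resource == null) {
                resource = new OpalResourceSketch(entry.getKey());
                INSTANCES.put(resource.name, resource);
            }
            resource.freeCpus = entry.getValue();
        }
    }

    /** Free CPUs last reported for the named resource, or 0 if it is unknown. */
    static synchronized int getFreeCPUs(String name) {
        OpalResourceSketch resource = INSTANCES.get(name);
        return resource == null ? 0 : resource.freeCpus;
    }

    public static void main(String[] args) {
        Map<String, Integer> snapshot = new LinkedHashMap<>();
        snapshot.put("autodock-cluster", 4);   // made-up resource name and value
        sync(snapshot);
        snapshot.put("autodock-cluster", 14);  // 10 more CPUs become available
        sync(snapshot);
        System.out.println(getFreeCPUs("autodock-cluster"));
    }
}
```

Both methods hold the same lock, so a reader never sees a half-applied snapshot; that is one simple way to get the "identical metadata" property, though the real design may use a different mechanism.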

Creating Copies of Opal Actor

Nimrod/K performs a parameter sweep by generating multiple copies of the Opal actor. Each copy of the Opal actor is supplied with a set of parameters. It is up to the end user to define the number of parameter sets and the parameter values in each set. The parameter sets are prepared in GridTokenFiles. Each GridTokenFile is tagged with a different color to distinguish it from the others. Although the copies of the Opal actor are not shown in Kepler, they are created and run in the background.
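To make the idea of tagged parameter sets concrete, here is a small sketch that builds one set per input file and gives each a distinct sequence number as its tag. It does not use Nimrod/K or GridTokenFiles; `ParameterSweepSketch`, `TaggedParameters`, the URL and the file names are hypothetical. The parameter names `u` and `dpf` and the `sequenceID` tag field follow the tag printed in the week 5 error message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParameterSweepSketch {
    /** One tagged parameter set, destined for one copy of the actor. */
    static final class TaggedParameters {
        final int sequenceId;                 // the tag that tells the sets apart
        final Map<String, String> parameters; // here: "u" and "dpf"

        TaggedParameters(int sequenceId, Map<String, String> parameters) {
            this.sequenceId = sequenceId;
            this.parameters = parameters;
        }
    }

    /** Builds one tagged parameter set per dpf file, all using the same service URL. */
    static List<TaggedParameters> sweep(String serviceUrl, List<String> dpfFiles) {
        List<TaggedParameters> sets = new ArrayList<>();
        for (int i = 0; i < dpfFiles.size(); i++) {
            Map<String, String> params = new LinkedHashMap<>();
            params.put("u", serviceUrl);
            params.put("dpf", dpfFiles.get(i));
            sets.add(new TaggedParameters(i, params));
        }
        return sets;
    }

    public static void main(String[] args) {
        // Placeholder URL and file names, for illustration only.
        List<TaggedParameters> sets =
            sweep("http://opal.example/autodock", Arrays.asList("a.dpf", "b.dpf", "c.dpf"));
        for (TaggedParameters t : sets) {
            System.out.println("sequenceID=" + t.sequenceId + " parameters=" + t.parameters);
        }
    }
}
```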

Resource Allocation

Each copy of the Opal actor for an Autodock experiment may have different resource requirements. The diagram takes one copy of the Autodock Opal actor to demonstrate how resources are allocated for job execution. First, the copy of the Opal actor provides the resource requirement to the OpalResource class. Then, the OpalResource class passes the resource requirement along with references to the Autodock OpalResource instances to the scheduler. The scheduler calls getFreeCPUs to access the Autodock OpalResource metadata and invokes the getSlots(int) method to make the allocation decision. After the allocation decision has been made, it is passed back to the OpalResource class. Finally, the OpalResource class submits the Autodock job request along with the resource allocation decision to the Opal server. The Opal server will execute the job on the Autodock resources according to the allocation decision and notify the OpalResource class whether the allocation request is fulfilled. Again, the communication in the entire resource allocation process is synchronized; it is broken down into steps here only for the convenience of describing it.
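The allocation step described above can be sketched as follows. The method names getFreeCPUs and getSlots(int) come from the design, but their behaviour here is an assumption: getSlots is taken to reserve the requested CPUs and report whether it succeeded. The `AllocationSketch` and `Resource` classes and the first-fit rule are stand-ins, not the Nimrod/K scheduler.

```java
import java.util.Arrays;
import java.util.List;

public class AllocationSketch {
    /** Stand-in for an OpalResource instance; method semantics are assumed. */
    static final class Resource {
        final String name;
        private int freeCpus;

        Resource(String name, int freeCpus) {
            this.name = name;
            this.freeCpus = freeCpus;
        }

        synchronized int getFreeCPUs() {
            return freeCpus;
        }

        /** Reserves the requested CPUs if enough are free; returns true on success. */
        synchronized boolean getSlots(int cpus) {
            if (cpus <= 0 || cpus > freeCpus) {
                return false;
            }
            freeCpus -= cpus;
            return true;
        }
    }

    /** Returns the first resource that can reserve the required CPUs, or null. */
    static Resource allocate(List<Resource> resources, int requiredCpus) {
        for (Resource r : resources) {
            // The second check guards against another job taking the CPUs in between.
            if (r.getFreeCPUs() >= requiredCpus && r.getSlots(requiredCpus)) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Resource names and CPU counts are made up for illustration.
        List<Resource> resources =
            Arrays.asList(new Resource("autodock-a", 2), new Resource("autodock-b", 8));
        Resource chosen = allocate(resources, 4);
        System.out.println(chosen == null ? "no resource" : chosen.name);
    }
}
```

Because the reservation happens inside getSlots, a successful allocation immediately lowers the free-CPU count that the next copy of the actor will see.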

Some Thoughts About the Opal.xml Workflow

What is the Nimrod/K director’s dynamic parallelism?

Why do users have to type the input as an array in the Constant actor? Is there a more convenient way for users to provide input?

How are sequences converted to tags? Why are files pre-staged manually? Can we design a mechanism to handle the files automatically?

What does the Expression actor do?

Can we get rid of the Array to Sequence actor and the Expression actor so that Nimrod/K and the Opal actor can interact directly?

William