Resource Prediction Service

The Resource Prediction Service (RPS) predicts an optimal set of resources for running your application. The RPS is a limited version of the Fault Tolerance and Recovery Service (FTR) that is used in the LEAD (Linked Environments for Atmospheric Discovery) and the VGrADS (Virtual Grid Application Development Software) projects.

How to install the Resource Prediction Service (RPS)

Pre-requisites and Overview

RPS is included in the OGCE Axis Services suite. The easiest way to get started is to download the entire package or check it out of our SVN repository. In addition to the Web service, RPS includes an agent that runs as a separate process from the Web service. You will need the following software:

  1. Java 1.5 or later
  2. Maven 2.0.7 or later
  3. Apache Ant 1.7 or later
  4. MySQL 5.x or later
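
Before building, it can help to confirm that the tools listed above are on your PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of the RPS distribution, and the check only tests for presence, not for the minimum versions.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Running "missing_tools java mvn ant mysql" prints one line per missing tool. To compare versions against the list above, run "java -version", "mvn -version", "ant -version" and "mysql --version" by hand.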

Setting up a MySQL database

Before you begin, make sure you have a MySQL database and a MySQL user with privileges to create, update, and delete tables in that database. To create them, follow these steps.

  1. If you are starting with a fresh MySQL installation, you need to set up the MySQL root password. To do this, run the mysqladmin command as root.
        sudo mysqladmin password "root_password"
        
  2. Log in to MySQL as root
        mysql -u root -p
        
  3. You can list the current databases using the "show databases" command. To create a database and grant the required privileges, run the commands
        create database rps_dev;
        grant all on rps_dev.* to 'rps_dev'@'%' identified by 'rps_dev_password';
        
  4. Now try to log in to the 'rps_dev' database as the user 'rps_dev' with the password 'rps_dev_password'.
        $ mysql -u rps_dev -p -D rps_dev
        Enter password:
        
       If you can log in successfully, run some commands to check that you have the required privileges
    
        create table foo (col1 int, col2 int);
        show tables;
        insert into foo values(1, 2);
        select * from foo;
        update foo set col2=3 where col1=1;
        select * from foo;
        drop table foo;
        show tables;
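
The privilege check above can also be run non-interactively. The sketch below only generates the SQL; the function name privilege_check_sql and the scratch table name are assumptions chosen for illustration.

```shell
#!/bin/sh
# Emit SQL that exercises create, insert, update, select and drop
# on a scratch table (default name: foo).
privilege_check_sql() {
    table=${1:-foo}
    cat <<EOF
create table $table (col1 int, col2 int);
insert into $table values (1, 2);
update $table set col2 = 3 where col1 = 1;
select * from $table;
drop table $table;
EOF
}
```

Pipe the output into the client, for example "privilege_check_sql | mysql -u rps_dev -p -D rps_dev". If any statement fails, the mysql client reports the error and stops, which points to the missing privilege.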
        

Install the RPS service

Download the OGCE Axis Services suite, unpack, and run the command

   mvn clean install

in the ogce-axis-services directory. This will build and install everything.

If you want to rebuild just this service after initial installation, use

  mvn clean install -f rps/pom.xml

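Maven's -o (offline) option is also useful to speed up subsequent builds, because Maven then works from the local repository instead of contacting remote repositories.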

Start the RPS Agent

The RPS agent is a program that collects information from the QBETS and NWS services. The RPS service needs this information to predict the optimal set of resources for running your application. To run the agent and verify that it is working, follow these steps:

  1. Start the RPS Agent by running the command "./rpsAgent.sh ./rps.properties"
  2. You should see the log messages in the var/service.log file.
  3. You should see 5 tables created in your MySQL database (BQP_TABLE, COMPUTE_RESOURCES_TABLE, NWS_TABLE, PERF_TABLE and QUEUE_INFO_TABLE)
  4. After a few minutes the COMPUTE_RESOURCES_TABLE, NWS_TABLE and QUEUE_INFO_TABLE should start getting populated with some values.
  5. The BQP_TABLE and PERF_TABLE will remain empty until you insert performance models for your application(s) into the PERF_TABLE. You can add performance models either through the web service interface (see README.html) or directly using MySQL commands.
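
To check step 3 without inspecting the database by hand, the table list can be compared against the expected names. This is a sketch only; the function name missing_rps_tables is hypothetical, and it assumes one table name per line on standard input.

```shell
#!/bin/sh
# Read table names (one per line) from standard input and print the
# expected RPS tables that are not among them.
missing_rps_tables() {
    present=$(cat)
    for table in BQP_TABLE COMPUTE_RESOURCES_TABLE NWS_TABLE PERF_TABLE QUEUE_INFO_TABLE; do
        printf '%s\n' "$present" | grep -qx "$table" || echo "$table"
    done
}
```

One way to feed it is "mysql -u rps_dev -p -D rps_dev -N -e 'show tables' | missing_rps_tables", where -N suppresses the column header. No output means all five tables exist.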

The RPS agent is compiled along with everything else by the master Maven command line. You can also recompile it by running

  mvn clean install -f rps-agent/pom.xml

Alternatively, change to the rps-agent directory and run the command "ant" with no arguments.

How to use the Resource Prediction Service

Web Clients

RPS Web client examples are included in the GTLAB project. For debugging purposes, you can use the command-line tools described below. These duplicate the following functions:

  1. List resources
  2. Add resources
  3. Delete resources

Using the command-line client to access the RPS web service

The RPS source code has a sample command-line client to access the RPS web service. Run these commands in the rps-agent directory. Output and errors will go to var/client.log.

How to invoke the "listResources" operation?

The listResources operation returns the list of resources for which the RPS service has queue wait time information as well as network latency and bandwidth information. This list is the "universe" of resources for the RPS service.

To invoke the listResources operation, run the command

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation listResources --params

You should see the results of the invocation in the var/client.log file. They look like this:

INFO [main] (RpsClient.java:244) - Got list of resources from service...
INFO [main] (RpsClient.java:247) - kittyhawk
INFO [main] (RpsClient.java:247) - cobalt
INFO [main] (RpsClient.java:247) - queenbee
INFO [main] (RpsClient.java:247) - abe
INFO [main] (RpsClient.java:247) - eldorado
INFO [main] (RpsClient.java:247) - ornlteragrid
INFO [main] (RpsClient.java:247) - bigben
INFO [main] (RpsClient.java:247) - ucteragrid
INFO [main] (RpsClient.java:247) - sdscteragrid
INFO [main] (RpsClient.java:247) - bigred
INFO [main] (RpsClient.java:247) - uctg-spruce
INFO [main] (RpsClient.java:247) - mayhem
INFO [main] (RpsClient.java:247) - ucsbeuca
INFO [main] (RpsClient.java:247) - utkeuca
INFO [main] (RpsClient.java:247) - ec2
INFO [main] (RpsClient.java:247) - uheuca
INFO [main] (RpsClient.java:247) - rencieuca
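
If you want just the resource names from the log, a small filter can extract them. The sketch below assumes the log format shown above, with each name on its own line after the "Got list of resources from service..." message; the function name resources_from_log is hypothetical.

```shell
#!/bin/sh
# Print the resource names that follow a "Got list of resources" line.
# Reads the files given as arguments, or standard input if there are none.
resources_from_log() {
    awk '
        /Got list of resources from service/ { in_list = 1; next }
        # A list entry ends with " - <name>", where the name has no spaces.
        in_list && / - [^ ]+$/ { sub(/^.* - /, ""); print; next }
        # Any other line ends the current list.
        { in_list = 0 }
    ' "$@"
}
```

For example, "resources_from_log var/client.log" prints one resource name per line.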

How to invoke the "addOrUpdateAppPerfModels" operation?

A performance model for an application is the amount of time the application takes to execute on a given number of CPUs on a given resource. An application is identified by a namespace, name and version number. To add or update a performance model, run the command

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params app_namespace app_name app_version resource_name cpus wall_time

For example:

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params http://www.renci.org WRF V3.0.1 bigred 1024 3600

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params http://www.renci.org WRF V3.0.1 bigred 512 5000

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params http://www.renci.org WRF V3.0.1 bigred 256 8000

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params http://www.renci.org WRF V3.0.1 ornlteragrid 128 7200

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation addOrUpdateAppPerfModels --params http://www.renci.org WRF V3.0.1 sdscteragrid 512 5000

The above performance models will be added to the MySQL database. It will, however, take a few minutes for the service to gather enough information to find the optimal resources for running your application. After that, an invocation of the 'findOptimalResources' operation on the RPS web service should return results within seconds.
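
When you have many performance models to register, the invocations shown earlier can be driven from a small table. The following is a sketch under stated assumptions: the function name add_perf_models and the CLIENT and EPR variables are hypothetical, and by default the script only prints the commands (a dry run) rather than calling the service.

```shell
#!/bin/sh
# Endpoint used in the examples on this page; override EPR if yours differs.
EPR="${EPR:-http://localhost:8080/axis2/services/Rps}"
# Dry run by default: print each command. Set CLIENT=./client.sh to execute.
CLIENT="${CLIENT:-echo ./client.sh}"

# Usage: add_perf_models app_namespace app_name app_version
# Reads "resource_name cpus wall_time" lines from standard input.
add_perf_models() {
    while read -r resource cpus wall_time; do
        [ -n "$resource" ] || continue    # skip blank lines
        # $CLIENT is left unquoted on purpose so "echo ./client.sh" splits.
        $CLIENT --epr "$EPR" --operation addOrUpdateAppPerfModels \
            --params "$1" "$2" "$3" "$resource" "$cpus" "$wall_time" </dev/null
    done
}
```

Feed it one model per line, for example "printf 'bigred 1024 3600\nbigred 512 5000\n' | add_perf_models http://www.renci.org WRF V3.0.1". Review the printed commands first, then rerun with CLIENT=./client.sh from the rps-agent directory.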

How to invoke the "findOptimalResources" operation?

The findOptimalResources operation returns a set of optimal resources (optimized over data transfer time, queue wait time and compute time) for running your application.

First, wait a few minutes after adding performance models for any application to the database (either through the web service as described above or directly through MySQL commands). Then run the command:

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation findOptimalResources --params app_namespace app_name app_version --inputData data_source_1 data_size_1 data_source_2 data_size_2 ... data_source_n data_size_n

For example:

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation findOptimalResources --params http://www.renci.org WRF V3.0.1 --inputData bigred 1024000000 sdscteragrid 1024000000

If your application does not need to stage any input data files to the machine on which it will run, you can omit the "--inputData" option and its arguments from the above command.

For example:

./client.sh --epr http://localhost:8080/axis2/services/Rps --operation findOptimalResources --params http://www.renci.org WRF V3.0.1
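
Both forms of the command can be wrapped in one helper that adds "--inputData" only when data sources are given. This is an illustrative sketch: the function name find_optimal and the CLIENT and EPR variables are hypothetical, and the default is a dry run that prints the command instead of calling the service.

```shell
#!/bin/sh
EPR="${EPR:-http://localhost:8080/axis2/services/Rps}"
# Dry run by default: print the command. Set CLIENT=./client.sh to execute.
CLIENT="${CLIENT:-echo ./client.sh}"

# Usage: find_optimal app_namespace app_name app_version [data_source data_size]...
find_optimal() {
    if [ "$#" -lt 3 ]; then
        echo "usage: find_optimal namespace name version [source size]..." >&2
        return 2
    fi
    ns=$1; name=$2; version=$3
    shift 3
    # The remaining arguments must form data_source/data_size pairs.
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "input data must be given as source/size pairs" >&2
        return 2
    fi
    if [ "$#" -eq 0 ]; then
        $CLIENT --epr "$EPR" --operation findOptimalResources \
            --params "$ns" "$name" "$version"
    else
        $CLIENT --epr "$EPR" --operation findOptimalResources \
            --params "$ns" "$name" "$version" --inputData "$@"
    fi
}
```

For example, "find_optimal http://www.renci.org WRF V3.0.1 bigred 1024000000" prints the full client command with one input data source. The pair check catches a forgotten data size before the request is sent.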

You should see the results of the above invocation in the var/client.log file. The result is an array in which each element contains the following:

  1. The rank of the resource (a lower rank means a more optimal resource)
  2. The resource name
  3. The queue name
  4. The number of CPUs to use on the above resource and queue
  5. The expected time to transfer data from the source (bigred in the above example) to the above computational resource
  6. The expected queue wait time on the above resource and queue, based on the number of CPUs and wall time obtained from the performance model
  7. The expected wall time

INFO [main] (RpsClient.java:190) - Got list of optimal resources...
INFO [main] (RpsClient.java:194) - Rank: 0, Resource name: sdscteragrid, queue name: dque, num cpus:512, exp data trans time: 2.234362650223991E9, exp queue wait time: 25934.0, exp wall time: 5000.0
INFO [main] (RpsClient.java:194) - Rank: 1, Resource name: bigred, queue name: DEBUG, num cpus:1024, exp data trans time: 2.2444635513588905E9, exp queue wait time: 0.0, exp wall time: 3600.0
INFO [main] (RpsClient.java:194) - Rank: 2, Resource name: ornlteragrid, queue name: dque, num cpus:512, exp data trans time: 3.770845237085046E9, exp queue wait time: 1.0, exp wall time: 5000.0
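
For further processing, the ranked lines can be converted into tab-separated columns in the order listed above (rank, resource, queue, CPUs, data transfer time, queue wait time, wall time). The sketch below assumes the log format shown here; the function name optimal_resources_from_log is hypothetical.

```shell
#!/bin/sh
# Turn each "Rank: ..." log line into one tab-separated row of values.
# Reads the files given as arguments, or standard input if there are none.
optimal_resources_from_log() {
    awk '
        / - Rank: / {
            sub(/^.* - /, "")                 # drop the logger prefix
            n = split($0, parts, /, */)       # one "label: value" per part
            row = ""
            for (i = 1; i <= n; i++) {
                value = parts[i]
                sub(/^[^:]*: */, "", value)   # keep only the value
                row = row (i > 1 ? "\t" : "") value
            }
            print row
        }
    ' "$@"
}
```

For example, "optimal_resources_from_log var/client.log | head -n 1 | cut -f2" prints the name of the first resource listed.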
 