LSDCAS Jobs Monitor

Click here to view the active hosts.
Click here to view the active jobs.

The Jobs Monitor system is designed to monitor jobs and distribute them to nodes on the LSDCAS network. These jobs control how experiments are transferred from the nodes where they were originally acquired to a central data server.

Before discussing the technical workings of the system, it is important to describe the nodes involved.

LSDCAS Network Nodes

  • lsdcas1.lsdcas.z - Experiment acquisition node
  • lsdcas2.lsdcas.z - Experiment acquisition node
  • lsdcas.biomed-eng.uiowa.edu - Router for the experiment acquisition private network, temporary data server for experiment acquisition nodes.
  • homer.biomed-eng.uiowa.edu - Central data server
  • lsdcas.engineering.uiowa.edu - Database and web application server for the network.
  • lotka.ecn.uiowa.edu - Workstation
  • pisello.ecn.uiowa.edu - Workstation
  • ilya.ecn.uiowa.edu - Workstation
  • erwin.ecn.uiowa.edu - Workstation

Acquisition Steps

  • Setup
  • Initialization
  • Finalization
  • Transfer
  • Mpeg Creation
  • Preview Creation
  • Deletion

Acquisition

Acquisition is the period of time when an experiment daemon is running. This process is monitored by a cron job that searches the output of the 'ps' command for a given process and then reports the results to the database.
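
The process-table check described above could look roughly like the following. This is a minimal sketch and not the actual casMonitorDaemon.pl logic: the function names process_listed and acquisition_running and the daemon name cas_acquire are hypothetical, and reporting the result to the database is left out.

```shell
#!/bin/sh
# Sketch only: the function names and the daemon name are hypothetical.

# process_listed NAME
# Reads command names on stdin, one per line, and succeeds if any line
# is exactly NAME.
process_listed() {
    grep -q -x -F -- "$1"
}

# acquisition_running NAME
# Succeeds if 'ps' lists a process whose command name is exactly NAME.
acquisition_running() {
    # '-o comm=' prints only the command name, with no header line.
    ps -e -o comm= | process_listed "$1"
}

# 'cas_acquire' is a placeholder; the real daemon name is not given here.
if acquisition_running "cas_acquire"; then
    echo "acquisition running"
else
    echo "acquisition not running"
fi
```

Matching the whole command name, rather than searching for a substring, avoids counting the search command itself or similarly named processes as an active acquisition.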

Initialization

The first time an acquisition process is detected, the experiment metadata is inserted into the database.

Finalization

After the acquisition monitor has detected that the acquisition process has terminated, the database is updated.

Transfer

Once an experiment has been finalized, it is ready to be transferred from the temporary datastore to the central server.

Mpeg Creation / Preview Creation

Once an experiment has been fully transferred, workstation nodes can begin processing the experiment data to create mpeg movies and a single frame preview.

Deletion

Once a new experiment has been started on an acquisition node, it is safe to assume that any previous experiment on that node is no longer running. After making sure that the old experiment is in the database and has been transferred, we are free to delete it from the temporary server.
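
The three conditions above can be combined into a single guard. The sketch below is illustrative only: can_delete is a hypothetical helper that takes the three conditions as "yes" or "no" arguments, and how each condition is actually determined (for example, by querying the database) is not shown.

```shell
#!/bin/sh
# Sketch only: can_delete is a hypothetical helper, not part of LSDCAS.

# can_delete IN_DATABASE TRANSFERRED NEWER_STARTED
# Each argument is "yes" or "no". Succeeds only when all three are "yes".
can_delete() {
    [ "$1" = "yes" ] && [ "$2" = "yes" ] && [ "$3" = "yes" ]
}

# The experiment is recorded and transferred, but no newer experiment has
# started on the node, so it is kept.
if can_delete yes yes no; then
    echo "delete"
else
    echo "keep"
fi
```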

Technical Information

The monitor database acts as a queue of jobs that can be processed. Each node has cron jobs that contact the database and ask for any new jobs the node can run.

Once a job has been run, new jobs are scheduled by the job process to continue the flow of work.
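
One way to picture this chaining is as a lookup from a finished step to the steps it makes possible. The sketch below is an assumption drawn from the step descriptions earlier on this page (finalization enables transfer, transfer enables mpeg and preview creation). The step names and the next_jobs function are hypothetical, and a real job script would pass each result to casAddJob.pl, whose arguments are not documented here.

```shell
#!/bin/sh
# Sketch only: step names and next_jobs are hypothetical.

# next_jobs STEP
# Prints the follow-on step(s) for a completed STEP, one per line.
next_jobs() {
    case "$1" in
        finalize) echo "transfer" ;;
        transfer) printf '%s\n' "mpeg" "preview" ;;
        *)        ;;   # no follow-on work assumed for other steps
    esac
}

# After a transfer completes, list what would be queued next.
for step in $(next_jobs transfer); do
    echo "would queue: $step"
done
```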

Database Utility Scripts

  • casAddJob.pl - Adds jobs to the queue
  • casDoneWithJob.pl - Notifies the database that a job was completed successfully
  • casEmail.pl - Queues an email notification
  • casGetJob.pl - Requests a job from the database
  • casJobError.pl - Notifies the database that there was an error performing a job
  • casMonitorDaemon.pl - Checks the process table for an active acquisition process and notifies the database of the results
  • casSMS.pl - Queues an SMS notification

Job Scripts

  • casCreateMpegs.sh - Creates mpegs for an experiment. Automatically overlays fluorescent data when available.
  • casCreatePreviews.sh - Creates single frame previews for an experiment. Automatically overlays fluorescent data when available.
  • casDelete.sh - Tests that an experiment can be deleted, then deletes the experiment if the tests pass.
  • casFinalizeExperiment.sh - Finalizes experiment data in the database.
  • casInitializeExperiment.sh - Initializes experiment data in the database.
  • casSendMessages.pl - Sends all pending email and SMS notifications.
  • casTransfer.sh - Transfers an experiment from the temporary storage server to the central data store.

Job Scheduling

All job scripts are entered in the crontab of the appropriate host and scheduled to run every 5 minutes. The database will not give a host more than one job at a time: when a host asks for work, the database checks whether that host already has a job that has not yet completed (successfully or unsuccessfully) and, if so, reports that there are no more jobs.
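
The one-job-per-host rule can be illustrated without a database. The sketch below is an assumption, not the real casGetJob.pl query: it stands in for the job table with plain "ID HOST STATUS" rows, and the status names (queued, running, done) and the pick_job function are hypothetical.

```shell
#!/bin/sh
# Sketch only. A crontab entry of the following shape runs a script every
# 5 minutes on cron implementations that accept step values; the path is a
# placeholder:
#   */5 * * * * /path/to/casTransfer.sh

# pick_job HOST
# Reads rows of "ID HOST STATUS" on stdin (a hypothetical stand-in for the
# job table). Prints the ID of the first queued job for HOST, or nothing if
# HOST still has a job marked "running".
pick_job() {
    awk -v host="$1" '
        $2 == host && $3 == "running"               { busy = 1 }
        $2 == host && $3 == "queued" && first == "" { first = $1 }
        END { if (!busy && first != "") print first }
    '
}

queue='1 lotka done
2 lotka queued
3 ilya running
4 ilya queued'

printf '%s\n' "$queue" | pick_job lotka   # prints 2
printf '%s\n' "$queue" | pick_job ilya    # prints nothing: job 3 is unfinished
```

Because a host with an unfinished job is told there is no work, a job script that cron starts again while an earlier run is still in progress does not pick up a second job.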

Logging

All log files are placed by convention in ~/.cas/logs/$( /bin/hostname -f )/
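
A script can build this path itself before writing to it. The following is a minimal sketch under stated assumptions: the helpers cas_log_dir and cas_log are hypothetical, and naming one log file per script is an illustration rather than a documented convention.

```shell
#!/bin/sh
# Sketch only: cas_log_dir and cas_log are hypothetical helpers.

# cas_log_dir [HOST]
# Prints the log directory for HOST, defaulting to this machine's
# fully qualified hostname.
cas_log_dir() {
    host=${1:-$(/bin/hostname -f)}
    printf '%s/.cas/logs/%s\n' "$HOME" "$host"
}

# cas_log SCRIPT MESSAGE [HOST]
# Appends a timestamped MESSAGE to SCRIPT.log in the log directory,
# creating the directory first if it does not exist.
cas_log() {
    dir=$(cas_log_dir "${3:-}")
    mkdir -p "$dir" || return 1
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$2" >> "$dir/$1.log"
}

# Example call from a job script:
#   cas_log casTransfer "transfer started"
```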