[galaxy-dev] recovery: galaxy restart
Harendra chawla
chawla.harendra at gmail.com
Thu Jun 2 02:08:11 EDT 2011
Hi Nate,
I based my job submission on the local.py logic, for a grid-based
architecture. However, local.py has no recover() function, so I was
trying to borrow the recover() approach from other modules. My job submission
also uses the DRMAA API, so I thought of using the drmaa.py module.
Moreover, local.py has no monitor() or check_watched_items()
functions, which means it does not use a monitor_queue. How
can I create a recover() function in that case?
Can you suggest anything regarding this issue?
Thanks
Harendra
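
The pattern Nate describes in the quoted reply below (recover() ends by
putting a job state object on monitor_queue, and a monitor thread then polls
it) could be sketched roughly as follows for a runner that, like local.py, has
no monitor loop of its own. This is only an illustration, not Galaxy's code:
JobState, SketchRunner, is_done, run_demo and the polling interval are
hypothetical names, and a real runner would query the DRM (for example through
the DRMAA API) instead of calling a plain Python callable.

```python
import queue
import threading
import time


class JobState:
    """Hypothetical stand-in for the job state object kept per external job."""

    def __init__(self, job_id, is_done):
        self.job_id = job_id
        self.is_done = is_done  # callable: True once the external job finished


class SketchRunner:
    """Illustrative runner with a monitor queue; not Galaxy's implementation."""

    STOP = object()  # sentinel that tells the monitor thread to exit

    def __init__(self, poll_interval=0.01):
        self.monitor_queue = queue.Queue()
        self.watched = []    # job states currently being polled
        self.finished = []   # ids of jobs seen as finished
        self.poll_interval = poll_interval
        self.thread = threading.Thread(target=self.monitor, daemon=True)
        self.thread.start()

    def queue_job(self, job_state):
        # A real runner would submit the job here; the final step is the
        # same as in recover(): hand the job state to the monitor thread.
        self.monitor_queue.put(job_state)

    def recover(self, job_id, is_done):
        # After a restart: rebuild the in-memory state for a job that was
        # already submitted, then put it back on the monitor queue.
        self.monitor_queue.put(JobState(job_id, is_done))

    def monitor(self):
        while True:
            try:
                # Drain newly queued job states into the watched list.
                while True:
                    item = self.monitor_queue.get_nowait()
                    if item is self.STOP:
                        return
                    self.watched.append(item)
            except queue.Empty:
                pass
            self.check_watched_items()
            time.sleep(self.poll_interval)

    def check_watched_items(self):
        still_running = []
        for job_state in self.watched:
            if job_state.is_done():
                self.finished.append(job_state.job_id)
            else:
                still_running.append(job_state)
        self.watched = still_running

    def shutdown(self):
        self.monitor_queue.put(self.STOP)
        self.thread.join()


def run_demo():
    """Recover one already-finished job and wait until the monitor sees it."""
    runner = SketchRunner()
    runner.recover("job-1", lambda: True)
    deadline = time.time() + 2
    while "job-1" not in runner.finished and time.time() < deadline:
        time.sleep(0.01)
    runner.shutdown()
    return runner.finished
```

The point of the sketch is that recover() and queue_job() share their last
step, so a recovered job is handled by the same polling code as a newly
submitted one.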
On Wed, Jun 1, 2011 at 10:52 PM, Nate Coraor <nate at bx.psu.edu> wrote:
> Harendra chawla wrote:
> > Hi Nate,
> >
> > Yes, I have seen the function __check_jobs_at_startup(). It calls the
> > recover function after checking the state of the job in the database and
> > updating the JobWrapper. But what does it do after calling the recover
> > function? One more thing: does it use monitor() and
> > check_watched_items() in drmaa.py?
>
> Yes, notice that the last step of the recover() method is to insert
> the job state object into the DRM monitor queue with:
>
> self.monitor_queue.put( drm_job_state )
>
> Which is the same as the final step of the submission process, the
> queue_job() method. Once this happens, the monitor thread will pick up
> the job state from monitor_queue (a Python Queue instance) and monitor
> it with monitor()/check_watched_items().
>
> --nate
>
> >
> > Thanks
> > Harendra
> >
> > On Wed, Jun 1, 2011 at 9:10 PM, Nate Coraor <nate at bx.psu.edu> wrote:
> >
> > > Harendra chawla wrote:
> > > > Hi Nate,
> > > >
> > > > I got your point, but which part of the code does all of this? I
> > > > mean, how exactly is it done?
> > > > Is it using any other function apart from recover?
> > >
> > > Yes, see __check_jobs_at_startup() in lib/galaxy/jobs/__init__.py
> > >
> > > --nate
> > >
> > > >
> > > > Regards
> > > > Harendra
> > > >
> > > > On Wed, Jun 1, 2011 at 8:56 AM, Nate Coraor <nate at bx.psu.edu> wrote:
> > > >
> > > > > Harendra chawla wrote:
> > > > > > Hi everyone,
> > > > > >
> > > > > > I am trying to modify the *recover* function from drmaa.py
> > > > > > (/galaxy_central/lib/galaxy/job/runners/drmaa.py) as per my
> > > > > > requirements, but I am not able to understand the flow of that
> > > > > > function.
> > > > > >
> > > > > > The recover function is called when the Galaxy server is restarted.
> > > > > > It first looks up the running jobs in the database. My question is
> > > > > > how it then regains the state Galaxy was in (especially the GUI)
> > > > > > before the restart.
> > > > > > Can anyone explain the flow of the recover function and how the old
> > > > > > state is regained?
> > > > >
> > > > > Hi Harendra,
> > > > >
> > > > > I'm not sure I understand what you mean by old state and the GUI - all
> > > > > that's really necessary here is to determine what Galaxy considers to
> > > > > be the state of the job (new, queued, running), recreate the in-memory
> > > > > job components (the JobWrapper), and place the job back in Galaxy's
> > > > > DRM-monitoring queue, which will then proceed with the process of
> > > > > finishing the job if it's finished in the DRM or waiting for it to
> > > > > finish if it's still queued or running in the DRM.
> > > > >
> > > > > --nate
> > > > >
> > > > > >
> > > > > > Regards
> > > > > > Harendra
> > > > >