[Backgroundrb-devel] trouble stopping backgroundrb
hemant kumar
gethemant at gmail.com
Thu Sep 18 23:14:30 EDT 2008
Okay,
So, did you find out why "stop" didn't work from logrotate in the first
place? I think that's rather critical.
On Thu, 2008-09-18 at 11:24 -0700, Woody Peterson wrote:
> In my particular case I know it's not a permissions issue, as I'm
> always using the same user.
>
> I just tried restarting it, and with Hemant's patch I got:
>
> script/backgroundrb:52:in `getpgid': No such process (Errno::ESRCH)
>
> Via the above I found that in this particular case what happened is
> that my logrotate wasn't calling stop, only start (it meant to call
> stop, but was in a failing if statement checking if the pid existed).
> When you call start, it doesn't check to see if it's already running,
> so it starts backgroundrb, overwrites the pid file, then backgroundrb
> fails to start but has had its pid file changed. The original process
> is still running, but can't stop because it doesn't have the correct
> pid in the pid file.
>
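The failure described above (start overwrites the pid file while the old process keeps running) can be avoided by checking that the recorded pid is actually alive before touching the file. A minimal sketch, assuming hypothetical helper names (`process_alive?`, `running?`, `claim_pid_file`) that are not part of script/backgroundrb:

```ruby
# Hypothetical helpers for illustration; not part of script/backgroundrb.

# True if a process with this pid exists. Signal 0 only probes,
# nothing is actually delivered to the process.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# True only if the pid file exists AND names a live process.
def running?(pid_file)
  return false unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  pid > 0 && process_alive?(pid)
end

# Write our own pid only when no live instance owns the file.
# Returns false and leaves the file untouched if one does.
def claim_pid_file(pid_file)
  return false if running?(pid_file)
  File.open(pid_file, "w") { |f| f.write(Process.pid.to_s) }
  true
end
```

With a guard like this, a second "start" would refuse to run instead of silently replacing the pid of the instance that is still alive, and a stale pid file left by a crashed process would not block a fresh start.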
> Thus, I rewrote script/backgroundrb to be more LSB compliant
> (http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html)
> so I don't have to check for existing pid files myself. I made a
> patch, but it's almost as big as the script itself and Hemant's patch
> didn't apply for me (I must have changed something earlier in the
> file), so the whole thing is at the end of the email.
>
> While we're on the topic, is there a place to load all the
> requirements other than this file? backgroundrb status takes a matter
> of seconds to do a simple File.exists?(pid) 'cuz it has to load all
> the backgroundrb requirements. Not that it really matters...
>
> -Woody
>
> #!/usr/bin/env ruby
>
> RAILS_HOME = File.expand_path(File.join(File.dirname(__FILE__),".."))
> BDRB_HOME = File.join(RAILS_HOME,"vendor","plugins","backgroundrb")
> WORKER_ROOT = File.join(RAILS_HOME,"lib","workers")
> WORKER_LOAD_ENV = File.join(RAILS_HOME,"script","load_worker_env")
>
> ["server","server/lib","lib","lib/backgroundrb"].each { |x| $LOAD_PATH.unshift(BDRB_HOME + "/#{x}") }
> $LOAD_PATH.unshift(WORKER_ROOT)
>
> require "rubygems"
> require "yaml"
> require "erb"
> require "logger"
> require "packet"
> require "optparse"
>
> require "bdrb_config"
> require RAILS_HOME + "/config/boot"
> require "active_support"
>
> BackgrounDRb::Config.parse_cmd_options ARGV
> BDRB_CONFIG = BackgrounDRb::Config.read_config("#{RAILS_HOME}/config/backgroundrb.yml")
>
> require RAILS_HOME + "/config/environment"
> require "bdrb_job_queue"
> require "backgroundrb_server"
>
> PID_FILE = "#{RAILS_HOME}/tmp/pids/backgroundrb_#{BDRB_CONFIG[:backgroundrb][:port]}.pid"
> SERVER_LOGGER = "#{RAILS_HOME}/log/backgroundrb_debug_#{BDRB_CONFIG[:backgroundrb][:port]}.log"
>
> def kill_process arg_pid_file
>   pid = nil
>   File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
>   pgid = Process.getpgid(pid)
>   puts "stopping backgroundrb"
>   Process.kill('-TERM', pgid)
>   File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> end
>
> def status
>   File.exists?(PID_FILE)
> end
>
> def start
>   if fork
>     sleep(5)
>     exit
>   else
>     if status
>       puts "already running"
>       exit
>     end
>
>     puts "starting backgroundrb"
>
>     op = File.open(PID_FILE, "w")
>     op.write(Process.pid().to_s)
>     op.close
>     if BDRB_CONFIG[:backgroundrb][:log].nil? or BDRB_CONFIG[:backgroundrb][:log] != 'foreground'
>       log_file = File.open(SERVER_LOGGER,"w+")
>       [STDIN, STDOUT, STDERR].each {|desc| desc.reopen(log_file)}
>     end
>
>     BackgrounDRb::MasterProxy.new()
>   end
> end
>
> def stop
>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>   pid_files.each { |x| kill_process(x) }
> end
>
> case ARGV[0]
> when 'start'
>   start
> when 'stop'
>   stop
> when 'restart'
>   stop
>   start
> when 'status'
>   if status
>     puts "running"
>     exit
>   else
>     puts "not running"
>     exit!(3)
>   end
> else
>   BackgrounDRb::MasterProxy.new()
> end
>
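The script above reports "running" whenever the pid file exists, even if the process it names has died. The LSB page linked earlier defines separate exit codes for the status action: 0 when the program is running, 1 when it is dead but its pid file exists, and 3 when it is not running. A rough sketch of a status check that distinguishes the three cases follows; the constant and method names are hypothetical, and applying the "/var/run pid file" wording of LSB to a file under tmp/pids is an assumption.

```ruby
# LSB exit codes for the "status" action (constant names are hypothetical).
LSB_RUNNING       = 0 # program is running
LSB_DEAD_PID_FILE = 1 # program is dead but its pid file exists
LSB_NOT_RUNNING   = 3 # program is not running

# Maps the state of a pid file to an LSB status exit code.
def lsb_status_code(pid_file)
  return LSB_NOT_RUNNING unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  return LSB_DEAD_PID_FILE unless pid > 0
  Process.getpgid(pid) # raises Errno::ESRCH when the pid is gone
  LSB_RUNNING
rescue Errno::ESRCH
  LSB_DEAD_PID_FILE
rescue Errno::EPERM
  LSB_RUNNING # the process exists but we may not inspect it
end
```

A status branch could then call `exit(lsb_status_code(PID_FILE))`, which lets monit or an init wrapper tell a stale pid file apart from a live server.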
>
> On Sep 18, 2008, at 3:21 AM, John O'Shea wrote:
>
> > Slight variation that
> > - deletes pid for already-gone processes
> > - exits (with error code -1) without deleting the pid file if there
> > was a permission problem
> >
> > begin
> > - pgid = Process.getpgid(pid)
> > - Process.kill('TERM', pid)
> > - Process.kill('-TERM', pgid)
> > - Process.kill('KILL', pid)
> > - rescue Errno::ESRCH => e
> > - puts "Deleting pid file"
> > - rescue
> > + pgid = Process.getpgid(pid)
> > + Process.kill('-TERM', pgid)
> > + rescue Errno::ESRCH
> > + puts $!
> > + # No process - Do nothing.
> > + rescue Errno::EPERM
> > + # Permission denied.
> > + puts $!
> > + Process.exit!
> > ensure
> > File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> > end
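The two cases this variation separates (process already gone versus permission denied) can also be written as one method that reports what happened instead of exiting. This is only a sketch under assumptions: the method name `stop_from_pid_file` and the symbols it returns are invented for illustration and do not appear in any of the patches in this thread.

```ruby
# Hypothetical replacement for kill_process. Returns :stopped, :stale or :denied.
def stop_from_pid_file(pid_file)
  pid = File.read(pid_file).to_i
  # Guard: getpgid(0) would return OUR OWN group, which we must never signal.
  raise ArgumentError, "no valid pid in #{pid_file}" unless pid > 0

  pgid = Process.getpgid(pid)
  Process.kill("-TERM", pgid) # leading "-" signals the whole process group
  File.delete(pid_file)
  :stopped
rescue Errno::ESRCH
  # Process already gone: the pid file is stale, so removing it is safe.
  File.delete(pid_file) if File.exist?(pid_file)
  :stale
rescue Errno::EPERM
  # Not allowed to signal it: the process may still be alive, keep the file.
  warn "cannot signal pid #{pid}: permission denied"
  :denied
end
```

Keeping the pid file in the `:denied` case preserves the only record of the running process, so a later stop by the right user can still find it.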
> > hemant kumar wrote:
> >> Okay folks here is a patch to "backgroundrb" script, which should fix
> >> some issues:
> >>
> >> diff --git a/script/backgroundrb b/script/backgroundrb
> >> index dabf80b..8d4bb78 100755
> >> --- a/script/backgroundrb
> >> +++ b/script/backgroundrb
> >> @@ -49,18 +49,9 @@ when 'stop'
> >> def kill_process arg_pid_file
> >> pid = nil
> >> File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
> >> - begin
> >> - pgid = Process.getpgid(pid)
> >> - Process.kill('TERM', pid)
> >> - Process.kill('-TERM', pgid)
> >> - Process.kill('KILL', pid)
> >> - rescue Errno::ESRCH => e
> >> - puts "Deleting pid file"
> >> - rescue
> >> - puts $!
> >> - ensure
> >> - File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> >> - end
> >> + pgid = Process.getpgid(pid)
> >> + Process.kill('-TERM', pgid)
> >> + File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> >> end
> >> pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
> >> pid_files.each { |x| kill_process(x) }
> >>
> >> What it does is:
> >> 1. Killing by process group id is enough for the master process.
> >> 2. Do not delete the pid file if there was an exception while
> >>    stopping the daemon.
> >> 3. Do not handle exceptions silently.
> >>
> >> Please try this and let me know, how it goes.
> >>
> >>
> >>
> >> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
> >>
> >>> Jonathan,
> >>> Glad you raised this, I've been spending some time trying to
> >>> diagnose this exact same problem. The exception handling code
> >>> in the "when 'stop'" block (in script/backgroundrb) could
> >>> definitely be improved somewhat:
> >>> - check that the process with 'pid' exists before trying to kill it
> >>> - rescue permission exceptions (Errno::EPERM)
> >>> - only delete the pid file if the process pid does not still exist
> >>> (in ensure block)
> >>> - be a little more verbose to stdout/stderr
> >>>
> >>> While we are on the subject of shutdown: when the backgroundrb
> >>> process gets a HUP signal, does it wait for existing workers to
> >>> complete any work methods that are executing, or is the
> >>> 'Process.kill('-TERM', pgid)' call intended to make the OS handle
> >>> this?
> >>> We use capistrano to deploy our application (stopping and
> >>> restarting backgroundrb after the rails app has been updated). It
> >>> would be great if we could have more predictability regarding
> >>> shutting down backgroundrb (i.e. have backgroundrb disable the
> >>> reactor loop in idle workers and wait for all active workers to
> >>> finish methods, then shut down).
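The "wait for active workers to finish, then shut down" behaviour asked for here is commonly built from a flag that a signal handler sets and the work loop checks between jobs. The sketch below shows only that general pattern. `DrainingWorker` is a hypothetical class, and it makes no claim about how BackgrounDRb or packet actually handle signals.

```ruby
# Hypothetical illustration of a drain-then-stop loop; not BackgrounDRb code.
class DrainingWorker
  def initialize(jobs)
    @jobs = jobs        # array of callables standing in for work methods
    @stopping = false
  end

  # Safe to call from a signal handler: it only flips a flag.
  def request_stop
    @stopping = true
  end

  # Runs jobs one at a time. A stop request is honoured between jobs,
  # so the job in progress always runs to completion.
  def run
    results = []
    until @stopping || @jobs.empty?
      results << @jobs.shift.call
    end
    results
  end
end

# Wiring it to TERM could look like this:
#   worker = DrainingWorker.new(jobs)
#   Signal.trap("TERM") { worker.request_stop }
#   worker.run
```

The trade-off is that shutdown takes as long as the longest running job, so a deploy script would still want a timeout followed by a hard kill.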
> >>>
> >>> John.
> >>>
> >>> Jonathan Wallace wrote:
> >>>
> >>>> Hi Ryan,
> >>>>
> >>>> I recently ran into the same issue where the backgroundrb process
> >>>> would not respond to the ./script/backgroundrb stop command. The pid
> >>>> file was being deleted but the actual process was not being killed.
> >>>> I'm running packet 0.1.12 on gentoo.
> >>>>
> >>>> I'm not exactly sure what conditions put backgroundrb into such a
> >>>> state but I've decided to modify script/backgroundrb to behave a
> >>>> little differently.
> >>>>
> >>>> My hypothesis is that if one of the Process.kill method calls in
> >>>> script/backgroundrb raises an exception, the pid file is deleted
> >>>> even though the kill signal is never sent. At this point,
> >>>> starting and stopping backgroundrb never affects the original
> >>>> still-running backgroundrb process.
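The hypothesis rests on how Ruby's `ensure` works: the clause runs whether or not the protected code raised, so a failed kill still removes the pid file. A small sketch that mirrors the shape of the original stop code (rescue everything, always delete) makes this visible. `naive_stop` is a hypothetical name, and the returned symbols exist only for the demonstration.

```ruby
# Hypothetical reduction of the original stop logic, for demonstration only.
def naive_stop(pid_file)
  pid = File.read(pid_file).to_i
  begin
    Process.kill("TERM", pid)
    :signaled
  rescue StandardError => e
    puts "kill failed: #{e.class}"
    :failed
  ensure
    # Runs on success AND on failure, so a failed kill still loses the pid file.
    File.delete(pid_file) if File.exist?(pid_file)
  end
end
```

When the failure is Errno::ESRCH the deletion is harmless because the process is gone. When it is Errno::EPERM the process may well be alive, and after the deletion nothing on disk points at it any more.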
> >>>>
> >>>> There are a couple of reasons that I believe an exception could be
> >>>> raised. Either the Process.getpgid(pid), Process.kill('TERM', pid)
> >>>> or Process.kill('-TERM', pgid) call raises an exception, or the
> >>>> effective uid of the user running script/backgroundrb stop does not
> >>>> have permission to kill those processes.
> >>>>
> >>>> To fix this, we've removed the Process.getpgid and the two
> >>>> Process.kill's that are sending the TERM signal. Since we've
> >>>> architected our backgroundrb jobs to be persistent and idempotent
> >>>> (a db backed queue written before the feature appeared in bdrb),
> >>>> we'll just use the KILL signal.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Thanks,
> >>>> Jonathan
> >>>>
> >>>> On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case
> >>>> <mrryancase at gmail.com> wrote:
> >>>>
> >>>>> Hi folks -
> >>>>>
> >>>>> I'm having trouble getting backgroundrb to stop after one of the
> >>>>> packet_worker_r processes dies.
> >>>>>
> >>>>> If backgroundrb is running properly,
> >>>>> "/path/to/application/script/backgroundrb stop" works fine, but
> >>>>> often one of the packet_worker_r processes dies, and the stop
> >>>>> command no longer works after that (it runs, but it does not stop
> >>>>> the processes, and so then start doesn't work).
> >>>>>
> >>>>> The only thing that seems to work at that point is to manually
> >>>>> kill the processes that are still running, and then the start
> >>>>> works, but that is going to make restarting via monit a lot less
> >>>>> clean.
> >>>>>
> >>>>> Any ideas would be much appreciated!
> >>>>>
> >>>>> I'm using the github version of backgroundrb, and packet 0.1.13
> >>>>> running on ubuntu.
> >>>>>
> >>>>> Thanks!
> >>>>> Ryan
> >>>>> _______________________________________________
> >>>>> Backgroundrb-devel mailing list
> >>>>> Backgroundrb-devel at rubyforge.org
> >>>>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
> >>>>>
> >>>>>
> >
> >
> > --
> > John O'Shea, CTO at Nooked
> > www: http://www.nooked.com/
> > cell: +353 87 992 9959
> > skype: joshea