From normalperson at yhbt.net Fri Sep 4 15:23:18 2009 From: normalperson at yhbt.net (Eric Wong) Date: Fri, 4 Sep 2009 12:23:18 -0700 Subject: [ANN] unicorn 0.91.0 - finally supports HTTP/0.9! Message-ID: <20090904192318.GA30326@dcvr.yhbt.net> Unicorn is a Rack HTTP server for Unix, fast clients and nothing else. 18 years too late, Unicorn finally gets HTTP/0.9 support as HTTP was implemented in 1991. Shortlog as follows: Eric Wong (16): Documentation updates examples/echo: "Expect:" value is case-insensitive http: make strings independent before modification http: support for multi-line HTTP headers tee_input: fix rdoc unicorn_http: "fix" const warning http: extension-methods allow any tokens http: support for simple HTTP/0.9 GET requests test_http_parser_ng: fix failing HTTP/0.9 test case launcher: defer daemonized redirects until config is read test to ensure stderr goes *somewhere* when daemonized http: SERVER_PROTOCOL matches HTTP_VERSION http: add HttpParser#headers? method Support HTTP/0.9 entity-body-only responses Redirect files in binary mode unicorn 0.91.0 * site: http://unicorn.bogomips.org/ * git: git://git.bogomips.org/unicorn.git * cgit: http://git.bogomips.org/cgit/unicorn.git/ * list: mongrel-unicorn at rubyforge.org * nntp: nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general -- Eric Wong From chris at ozmm.org Fri Sep 4 23:21:19 2009 From: chris at ozmm.org (Chris Wanstrath) Date: Fri, 4 Sep 2009 20:21:19 -0700 Subject: Pidfiles and cwd? In-Reply-To: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> References: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> Message-ID: <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> Yikes. Let me try that again. Hi, Thanks for unicorn! Two questions: A) Is there a reason `unicorn` allows you to specify the pidfile's location but `unicorn_rails` does not? B) Is there a reason `unicorn_rails` must start in the app root and doesn't allow it as a config option? 
Thanks, Chris On Fri, Sep 4, 2009 at 8:20 PM, Chris Wanstrath wrote: > > Hi, > Thanks for unicorn! > Two questions: > A) Is there a reason `unicorn` allows you to specify the pidfile's location but `unicorn_rails > > -- > Chris Wanstrath > http://github.com/defunkt -- Chris Wanstrath http://github.com/defunkt From normalperson at yhbt.net Sat Sep 5 00:28:06 2009 From: normalperson at yhbt.net (Eric Wong) Date: Fri, 4 Sep 2009 21:28:06 -0700 Subject: Pidfiles and cwd? In-Reply-To: <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> References: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> Message-ID: <20090905042806.GA9507@dcvr.yhbt.net> Chris Wanstrath wrote: > Yikes. Let me try that again. > > Hi, > > Thanks for unicorn! Hi Chris, no problem! > Two questions: > > A) Is there a reason `unicorn` allows you to specify the pidfile's > location but `unicorn_rails` does not? `unicorn` was designed to mimic `rackup` in terms of command line options and `unicorn_rails` was designed to mimic `script/server` in Rails. I really didn't know what I was doing with the command-line options for this, so I decided to steal from others :) For long-running servers, I'm not a fan of command-line options in general because they're easy to forget, so `unicorn` only supports it because `rackup` does it (so the embedded CLI options in config.ru can be shared). For `unicorn_rails`, --daemonize already sets a default PID path in RAILS_ROOT/tmp/pids/unicorn.pid whereas `script/server` chooses RAILS_ROOT/tmp/pids/server.pid. Since Rails values "convention over configuration", I figured I might as well hard code it... Additionally, the "-P" parameter used by unicorn_rails and script/server is used to set RAILS_RELATIVE_URL_ROOT so it conflicts with the short option used by rackup/unicorn. > B) Is there a reason `unicorn_rails` must start in the app root and > doesn't allow it as a config option? 
Since the config file is just Ruby, you can just Dir.chdir inside it. And since the chdir is done when the config file is evaluated, the chdir can be done across restarts/reloads (you can point it to a symlinked directory) to pick up new code/releases. If you do that, I would initially start Unicorn in "/" or some other directory that won't get deleted so you'll be safe for upgrades. If you managed to forget that, you can set the following in your Unicorn config: Unicorn::HttpServer::START_CTX[:cwd] = "/" And then HUP the process before doing the USR2+QUIT to reexec. Subsequent Unicorns will always start in "/" and then you can Dir.chdir to wherever you run your app. Hopefully that makes sense, one thing I've been trying to avoid with the configuration is having too many ways to do the same thing. -- Eric Wong From normalperson at yhbt.net Sat Sep 5 13:15:14 2009 From: normalperson at yhbt.net (Eric Wong) Date: Sat, 5 Sep 2009 10:15:14 -0700 Subject: Pidfiles and cwd? In-Reply-To: <20090905042806.GA9507@dcvr.yhbt.net> References: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> <20090905042806.GA9507@dcvr.yhbt.net> Message-ID: <20090905171514.GA8761@dcvr.yhbt.net> Eric Wong wrote: > > B) Is there a reason `unicorn_rails` must start in the app root and > > doesn't allow it as a config option? > > Since the config file is just Ruby, you can just Dir.chdir inside it. > And since the chdir is done when the config file is evaluated, the > chdir can be done across restarts/reloads (you can point it to a > symlinked directory) to pick up new code/releases. > > If you do that, I would initially start Unicorn in "/" or some other > directory that won't get deleted so you'll be safe for upgrades. 
> > If you managed to forget that, you can set the following in your > Unicorn config: > > Unicorn::HttpServer::START_CTX[:cwd] = "/" Maybe having a 'working_directory "/path/to/app/root"' that does: Dir.chdir(Unicorn::HttpServer::START_CTX[:cwd] = arg) Internally would make things easier? I see nginx has a "working_directory" config option as well: http://wiki.nginx.org/NginxHttpMainModule -- Eric Wong From normalperson at yhbt.net Sat Sep 5 17:50:45 2009 From: normalperson at yhbt.net (Eric Wong) Date: Sat, 5 Sep 2009 14:50:45 -0700 Subject: merging Unicorn HTTP parser back to Mongrel Message-ID: <20090905215045.GB28829@dcvr.yhbt.net> Hello, (ok, this email got longer than expected, I now consider the most important parts the first and last paragraphs of the last footnote). The Unicorn HTTP parser is feature complete as far as I can tell and supports things the Mongrel one does not. I would very much like to see it used in places that Unicorn isn't suited for[1]. In fact, a chunk of the new features are much better suited for a server with better slow client handling like Mongrel. The big roadblock to getting this back into Mongrel is the Java/JRuby version of the parser Mongrel uses. Simply put, I don't do Java; somebody else will have to port it. But I'll have to convince you that these features are worth going into Mongrel, too :) I could provide a standalone C parser that can be wrapped with FFI, but I'm not sure if the performance would be acceptable. I'm fairly certain that a pure-Ruby version with Ragel-generated code would not provide acceptable performance anywhere; maybe a hand-coded one could, but I'm not particularly excited about doing that... The MRI-C parser should just work on Win32. Unlike the rest of Unicorn, the HTTP parser remains portable to non-UNIX platforms and thread-safe. There are no system-calls made directly through it (only memory allocations through the Ruby C API). 
New features that aren't in Mongrel are:

* HTTP/0.9 support - blame a network BOFH hell bent for hell on saving bytes with a health-checker config for this :) The HttpParser#headers? method has been added to determine if headers should be sent in the response (HTTP/0.9 didn't have response headers).

* "Transfer-Encoding: chunked" request decoding support
  I've been told mobile devices[2] do uploads like this (since they may lack the storage capacity to store large files). This will be useful to Mongrel since Mongrel can handle slow clients better (mobile devices). I also have a use case that goes like this:

    tar zc $BIG_DIRECTORY | curl -T- http://unicorn/path/to/upload

  This is designed to be slurp-resistant so clients cannot control memory usage of the server and DoS it even with huge chunk sizes.

* Trailers support (with Transfer-Encoding: chunked). I haven't run across applications that use this yet (Amazon S3 maybe?) but one use case that I can foresee is generating a Content-MD5 trailer with the above "tar | curl" command.

* Multiline continuation headers - Pound sends them, I don't care for Pound but I figured I might as well do it just in case somebody else starts doing it...

* Absolute Request URI parsing - It was done with URI.parse originally, I figured I might as well do it in Ragel since it's part of RFC 2616. I think client-side proxies use it so maybe one day somebody can turn Mongrel or a derived server into a client-side HTTP proxy...

* Repeated headers handling - they're joined with commas now since Rack doesn't accept arrays in HTTP_* entries. I posted a standalone patch for this in <20090810001022.GA17572 at dcvr.yhbt.net>

* HttpParser#keepalive? method - the parser can tell you if it's safe to handle a keepalive request. Not used with Unicorn at the moment.

Chunk extensions are one thing that the parser currently just ignores; this is because I've yet to see any use of them anywhere and Rack does not mention them.
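The chunked request decoding and trailer handling described above can be sketched in plain Ruby. This is only an illustration of the wire format, not the parser's actual C/Ragel code: `decode_chunked` is a hypothetical helper, the sample request body is made up, and unlike the real parser it reads each chunk fully into memory, so it is not slurp-resistant. Like the parser, it ignores chunk extensions.

```ruby
require "stringio"

# Hypothetical helper (not part of Unicorn): decode a
# "Transfer-Encoding: chunked" body from an IO-like object.
# Returns [body, trailers]. Chunk extensions are ignored.
def decode_chunked(io)
  body = String.new
  loop do
    size_line = io.gets("\r\n") or raise EOFError, "missing chunk size"
    # Anything after ";" on the size line is a chunk extension; drop it.
    size = Integer(size_line.chomp("\r\n").split(";", 2).first.strip, 16)
    break if size.zero? # a zero-sized chunk ends the body
    chunk = io.read(size)
    raise EOFError, "truncated chunk" if chunk.nil? || chunk.bytesize != size
    body << chunk
    io.read(2) == "\r\n" or raise ArgumentError, "chunk not terminated by CRLF"
  end

  # Optional trailers follow the last chunk, ended by an empty line.
  trailers = {}
  while (line = io.gets("\r\n")) && line != "\r\n"
    name, value = line.chomp("\r\n").split(":", 2)
    trailers[name.strip] = value.to_s.strip
  end
  [body, trailers]
end

# Made-up example input: two chunks (the first with an extension) and one trailer.
wire = "5;ext=1\r\nhello\r\n6\r\n world\r\n0\r\nContent-MD5: abc\r\n\r\n"
body, trailers = decode_chunked(StringIO.new(wire))
# body     == "hello world"
# trailers == { "Content-MD5" => "abc" }
```

A real server would additionally cap chunk and trailer sizes, which is what the limits discussed in this message are about.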
Parser Limits: Request body handling: Maximum Content-Length is the maximum value of off_t. I don't think this should be a problem for anyone as Ruby defaults to _FILE_OFFSET_BITS=64 on 32-bit arches. Mongrel does not have this limit in the parser, but since it buffers large uploads to a Tempfile, the limit always existed anyways. Maximum chunk size is also the maximum value of off_t, which is usually a 64-bit long (since Ruby defaults to _FILE_OFFSET_BITS=64 on my 32-bit boxes). I don't expect valid clients to send any values close to this limit, but that's just what it is. Headers: Mostly the same as Mongrel, all headers must fit into the same <=112K string object; which shouldn't be a problem for anything capable of running Ruby. Continuation lines can bypass the per-header size limit, but everything still stays under 112K which is a pretty large limit. Trailers: These can fit into another <=112K string, space taken up during header processing doesn't affect Trailer processing, so you could end up with 224K of combined metadata. You can get a full changelog since I branched from fauna/mongrel via: git log v0.0.0.. -- ext Finally, the new API is documented via RDoc here: http://unicorn.bogomips.org/Unicorn/HttpParser.html I don't consider the API set in stone, but I do consider the header handling part a bit simpler/less error prone than the old one. Disclaimer: Due to the large amounts of changes to the C/Ragel portions, another security audit/pair-of-eyes would be nice. All use of Unicorn so far has been on LANs with trusted clients or with nginx in front. While I'm very comfortable with C and fairly comfortable with Ragel, I'm far from infallible so close review from a second pair of eyes would be greatly appreciated. Future: I'm also planning on porting this to Rubinius, too. 
I haven't had a chance to look at it yet but the Mongrel/C one has already been ported so it shouldn't be too hard (I only know/can stomach a small amount of C++, though I suspect I won't even need it ...) Footnotes: [1] - Comet/long-polling/reverse HTTP, and sites that rely heavily on external services (including OpenID) are all badly suited for Unicorn. [2] - As a side effect, Unicorn also uses a TeeInput class that allows the request body to be read in real-time within the Rack application (while "tee-ing" to a temporary file to provide rewindability). This also allows Mongrel Upload Progress to be implemented in the future in a Rack::Lint-compliant manner. The one weird thing about TeeInput is that: env["rack.input"].read(NR_BYTES) Is not guaranteed to return NR_BYTES, only NR_BYTES at most. So every #read can provide "last block" semantics. Rack does not enforce this behavior, so it should be fine. This should not be a problem in practice since most read() and read()-like APIs provide no such guarantee even if implied when reading from "fast" devices like the filesystem. CGI apps that get a socket as stdin also got similar semantics as what apps under Unicorn get. I imagine this feature to be hugely useful for slow mobile clients that stream data slowly as it allows the server to start processing data as it is being uploaded. -- Eric Wong From chris at ozmm.org Sat Sep 5 20:50:17 2009 From: chris at ozmm.org (Chris Wanstrath) Date: Sat, 5 Sep 2009 17:50:17 -0700 Subject: Pidfiles and cwd? In-Reply-To: <20090905042806.GA9507@dcvr.yhbt.net> References: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> <20090905042806.GA9507@dcvr.yhbt.net> Message-ID: <8b73aaca0909051750h1520fa88ub42bcd4cd3944dbc@mail.gmail.com> On Fri, Sep 4, 2009 at 9:28 PM, Eric Wong wrote: >> A) Is there a reason `unicorn` allows you to specify the pidfile's >> location but `unicorn_rails` does not? 
> > `unicorn` was designed to mimic `rackup` in terms of command line > options and `unicorn_rails` was designed to mimic `script/server` in > Rails. > > I really didn't know what I was doing with the command-line options for > this, so I decided to steal from others :) > > For long-running servers, I'm not a fan of command-line options in > general because they're easy to forget, so `unicorn` only supports it > because `rackup` does it (so the embedded CLI options in config.ru can > be shared). > > For `unicorn_rails`, --daemonize already sets a default PID path in > RAILS_ROOT/tmp/pids/unicorn.pid whereas `script/server` chooses > RAILS_ROOT/tmp/pids/server.pid. Since Rails values "convention over > configuration", I figured I might as well hard code it... > > Additionally, the "-P" parameter used by unicorn_rails and script/server > is used to set RAILS_RELATIVE_URL_ROOT so it conflicts with the short > option used by rackup/unicorn. I'm 100% on board with your reasoning here, but I don't understand why we can't set the pid in the config file. Am I reading this wrong? (Very possibly.) Doesn't this line mean that any `pid` setting in the configurator is ignored: http://git.bogomips.org/cgit/unicorn.git/tree/lib/unicorn.rb#n83 If so, how would you suggest setting the pid at runtime? I don't mean to be difficult, we just want to keep everything in /var/run >> B) Is there a reason `unicorn_rails` must start in the app root and >> doesn't allow it as a config option? > > Since the config file is just Ruby, you can just Dir.chdir inside it. > And since the chdir is done when the config file is evaluated, the > chdir can be done across restarts/reloads (you can point it to a > symlinked directory) to pick up new code/releases. > > If you do that, I would initially start Unicorn in "/" or some other > directory that won't get deleted so you'll be safe for upgrades. 
> > If you managed to forget that, you can set the following in your > Unicorn config: > > Unicorn::HttpServer::START_CTX[:cwd] = "/" > > And then HUP the process before doing the USR2+QUIT to reexec. > Subsequent Unicorns will always start in "/" and then you can > Dir.chdir to wherever you run your app. > > Hopefully that makes sense, one thing I've been trying to avoid with the > configuration is having too many ways to do the same thing. Sounds good to me. Unicorn has quite completely convinced me that pure-Ruby config files are the way to go. I've already done some crazy stuff in there during our migration that a YAML file (or whatever) just wouldn't have been able to handle. Cheers, -- Chris Wanstrath http://github.com/defunkt From normalperson at yhbt.net Sat Sep 5 21:38:49 2009 From: normalperson at yhbt.net (Eric Wong) Date: Sat, 5 Sep 2009 18:38:49 -0700 Subject: Pidfiles and cwd? In-Reply-To: <8b73aaca0909051750h1520fa88ub42bcd4cd3944dbc@mail.gmail.com> References: <8b73aaca0909042020w73fb03dfpf6c77c85c1c486ad@mail.gmail.com> <8b73aaca0909042021v25d5f0eajd926250f71623042@mail.gmail.com> <20090905042806.GA9507@dcvr.yhbt.net> <8b73aaca0909051750h1520fa88ub42bcd4cd3944dbc@mail.gmail.com> Message-ID: <20090906013849.GC28829@dcvr.yhbt.net> Chris Wanstrath wrote: > On Fri, Sep 4, 2009 at 9:28 PM, Eric Wong wrote: > > >> A) Is there a reason `unicorn` allows you to specify the pidfile's > >> location but `unicorn_rails` does not? > > > > `unicorn` was designed to mimic `rackup` in terms of command line > > options and `unicorn_rails` was designed to mimic `script/server` in > > Rails. > > > > I really didn't know what I was doing with the command-line options for > > this, so I decided to steal from others :) > > > > For long-running servers, I'm not a fan of command-line options in > > general because they're easy to forget, so `unicorn` only supports it > > because `rackup` does it (so the embedded CLI options in config.ru can > > be shared). 
> > > > For `unicorn_rails`, --daemonize already sets a default PID path in > > RAILS_ROOT/tmp/pids/unicorn.pid whereas `script/server` chooses > > RAILS_ROOT/tmp/pids/server.pid. Since Rails values "convention over > > configuration", I figured I might as well hard code it... > > > > Additionally, the "-P" parameter used by unicorn_rails and script/server > > is used to set RAILS_RELATIVE_URL_ROOT so it conflicts with the short > > option used by rackup/unicorn. > > I'm 100% on board with your reasoning here, but I don't understand why > we can't set the pid in the config file. Am I reading this wrong? > (Very possibly.) You can definitely set the pid in the config file, let me know if you're having issues... > Doesn't this line mean that any `pid` setting in the configurator is ignored: > > http://git.bogomips.org/cgit/unicorn.git/tree/lib/unicorn.rb#n83 It's only ignored at that time, the pid is dropped later in the start method: http://git.bogomips.org/cgit/unicorn.git/tree/lib/unicorn.rb#n114 self.pid = config[:pid] So I don't drop the pid file until listeners are bound. I can't remember why I drop the pid late, however :) I know I don't bind the listeners right away because I try inheriting them first. It might be to avoid issues with the pid clobbering in the parent, or that some health checkers rely on both the pid and listen socket or something... > If so, how would you suggest setting the pid at runtime? I don't mean > to be difficult, we just want to keep everything in /var/run The config file really should work, but be sure to let me know if it's still not working for you. I just tested unicorn_rails with a test app and it seems to work here. Perhaps you're hitting permissions problems? As far as permissions, I'm considering adding user switching, but I'm afraid it could lead to more problems/support issues since I don't know anybody who runs Mongrel/Thin as root. 
However I've already written a huge comment/example in the Configurator RDoc for it :) > >> B) Is there a reason `unicorn_rails` must start in the app root and > >> doesn't allow it as a config option? > > > > Since the config file is just Ruby, you can just Dir.chdir inside it. > > And since the chdir is done when the config file is evaluated, the > > chdir can be done across restarts/reloads (you can point it to a > > symlinked directory) to pick up new code/releases. > > > > If you do that, I would initially start Unicorn in "/" or some other > > directory that won't get deleted so you'll be safe for upgrades. > > > > If you managed to forget that, you can set the following in your > > Unicorn config: > > > > Unicorn::HttpServer::START_CTX[:cwd] = "/" > > > > And then HUP the process before doing the USR2+QUIT to reexec. > > Subsequent Unicorns will always start in "/" and then you can > > Dir.chdir to wherever you run your app. > > > > Hopefully that makes sense, one thing I've been trying to avoid with the > > configuration is having too many ways to do the same thing. > > Sounds good to me. Unicorn has quite completely convinced me that > pure-Ruby config files are the way to go. I've already done some crazy > stuff in there during our migration that a YAML file (or whatever) > just wouldn't have been able to handle. Cool, thanks for the feedback. I've had lots of issues dealing with ugly/limited config files and seen too many ways of templating them to count. The after_fork, before_fork, and before_exec hooks sealed the deal for Ruby and Ezra / Rack::Builder helped push me in the right direction with the DSL. 
It was originally "write a small ruby script to run this thing" :) -- Eric Wong From normalperson at yhbt.net Wed Sep 9 15:53:23 2009 From: normalperson at yhbt.net (Eric Wong) Date: Wed, 9 Sep 2009 12:53:23 -0700 Subject: [PATCH] http_response: don't "rescue nil" for body.close Message-ID: <20090909195323.GA28219@dcvr.yhbt.net> This can hide bugs in Rack applications/middleware. Most other Rack handlers/servers seem to follow this route as well, so this helps ensure broken things will break loudly and more consistently across all Rack-enabled servers :) --- This will be in the next release, so fix your applications/middleware before hand... lib/unicorn/http_response.rb | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/unicorn/http_response.rb b/lib/unicorn/http_response.rb index e0ff805..0d05b2c 100644 --- a/lib/unicorn/http_response.rb +++ b/lib/unicorn/http_response.rb @@ -69,7 +69,7 @@ module Unicorn body.each { |chunk| socket.write(chunk) } socket.close # flushes and uncorks the socket immediately ensure - body.respond_to?(:close) and body.close rescue nil + body.respond_to?(:close) and body.close end end -- Eric Wong From normalperson at yhbt.net Tue Sep 15 05:06:03 2009 From: normalperson at yhbt.net (Eric Wong) Date: Tue, 15 Sep 2009 02:06:03 -0700 Subject: HTTP parser C extension should be Rubinius-compatible Message-ID: <20090915090603.GA26660@dcvr.yhbt.net> Hi all, I've just pushed out some changes to the C HTTP parser that should make it compatible with a recent Rubinius[1] using the C API. While I got the http_parser and http_parser_ng tests to pass with the new changes, most of the other tests that use pure Ruby actually failed(!). If anybody wants to pick up where I left off (even if it's to properly report bugs to the Rubinius team), please do so. 
I'm not quite motivated enough to do much more myself for a variety of reasons: 1) non-(CLI|email) bug trackers scare me 2) IRC kills my concentration 3) BDD specs are weird to me[2] 4) lack of folks in need of Rubinius support *right* *now* I'm sure if more things start using/working-with Rubinius I'd be more inclined to do more, but right now we're stuck in a Catch-22... Shortlog of relevant changes pushed out to git://git.bogomips.org/unicorn tonight: Eric Wong (6): http: define OFFT2NUM macro on Rubies without it http: no-op rb_str_modify() for Rubies without it http: compile with -fPIC http: use rb_str_{update,flush} if available http: create a new string buffer on empty values Update documentation for Rubinius support status [1] tested with db612aa62cad9e5cc41a4a4be645642362029d20 [2] to be fair, Unicorn using gmake to parallelize test/unit is weird to the rest of the world, too -- Eric Wong From normalperson at yhbt.net Wed Sep 16 17:11:50 2009 From: normalperson at yhbt.net (Eric Wong) Date: Wed, 16 Sep 2009 14:11:50 -0700 Subject: Rainbows... Message-ID: <20090916211150.GA30500@dcvr.yhbt.net> Some of you may have noticed the "rainbows" branch of the git repository. I may be changing that however and making a small group of gems (or even one gem) that can get loaded at runtime and monkey patch parts of Unicorn. Rainbows will primarily be to support things Unicorn sucks at: 1. apps that rely on out-of-datacenter network connections (CAPTCHA services, OpenID, real-time feed aggregation, etc...) 2. Comet / long-polling / reverse HTTP Since the majority of Unicorn use cases (or even requests within an application) do not need these things, I'm hesitant to make Unicorn itself more complicated and more difficult to support for the majority of apps. Instead, I'm leaning towards putting that burden on Rainbows for the applications that absolutely need it. 
For 1), the apps I've seen that rely on out-of-datacenter network connections don't use them for the large majority (>= 95%) of HTTP requests, so the overall application with a few poorly-performing application endpoints were fine and predictable even with basic Unicorn. Of course, some folks I know want to make a proxy with Unicorn and that's why I started working on it. As far as concurrency models go, forked workers sharing listener sockets will continue to be used to better exploit CPU/memory concurrency[1]. However, each worker process will also support Threads or Actors so there'll be a more flexible M:N mapping of processes:clients instead of the 1:1 mapping Unicorn uses. Rev and EventMachine are being considered, too, but mapping those programming models to TeeInput will be more work... [1] Even on Ruby implementations without a big global lock for threads, forked workers don't have to share large amounts of memory on big SMP boxes, so they'll experience less memory/cache contention and should experience better performance as a result. -- Eric Wong From normalperson at yhbt.net Thu Sep 17 02:04:37 2009 From: normalperson at yhbt.net (Eric Wong) Date: Wed, 16 Sep 2009 23:04:37 -0700 Subject: [PATCH] SIGHUP no longer drops lone, default listener Message-ID: <20090917060437.GB22961@dcvr.yhbt.net> The following patch (or something functionally equivalent) along with some sort of test case will be in the next release of Unicorn (due this week along with more documentation updates). Until then, if you're not explicitly specifying a listener anywhere, specify one, even if it's the default (preferably in the config file) if you plan on using SIGHUP to reload configs. 
Just add the following line to your config file: listen 8080 From beefdb590c4cae9671bd96bf94962634ecbc6161 Mon Sep 17 00:00:00 2001 From: Eric Wong Date: Wed, 16 Sep 2009 22:31:20 -0700 Subject: [PATCH] SIGHUP no longer drops lone, default listener When SIGHUP reloads the config, we didn't account for the case where the listen socket was completely unspecified. Thus the default listener (TCP port 8080), did not get preserved and reinjected into the config properly. Note that relying on the default listen or specifying listeners on the command-line means it's practically impossible to unbind listeners with a configuration file reload. We also need to preserve the (unspecified) default listener across upgrades that later result in SIGHUP, too; so we inject the default listener into the command-line for upgrades. --- lib/unicorn.rb | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/lib/unicorn.rb b/lib/unicorn.rb index 4cc5c2d..0e46261 100644 --- a/lib/unicorn.rb +++ b/lib/unicorn.rb @@ -110,6 +110,8 @@ module Unicorn config_listeners -= listener_names if config_listeners.empty? && LISTENERS.empty? config_listeners << Unicorn::Const::DEFAULT_LISTEN + init_listeners << Unicorn::Const::DEFAULT_LISTEN + START_CTX[:argv] << "-l#{Unicorn::Const::DEFAULT_LISTEN}" end config_listeners.each { |addr| listen(addr) } raise ArgumentError, "no listeners" if LISTENERS.empty? -- Eric Wong From tom at github.com Fri Sep 18 00:54:48 2009 From: tom at github.com (Tom Preston-Werner) Date: Thu, 17 Sep 2009 21:54:48 -0700 Subject: 502s with Nginx, Unicorn, and Unix Domain Sockets Message-ID: I'm doing some benchmarking on our new Rackspace frontend machines (8 core, 16GB) and running into some problems with the Unix domain socket setup. At high request rates (on simple pages) I'm getting a lot of HTTP 502 errors from Nginx. 
Nothing shows up in the Unicorn error log, but Nginx has the following in its error log: 2009/09/17 19:36:52 [error] 28277#0: *524824 connect() to unix:/data/github/current/tmp/sockets/unicorn.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 172.17.1.5, server: github.com, request: "GET /site/junk HTTP/1.1", upstream: "http://unix:/data/github/current/tmp/sockets/unicorn.sock:/site/junk", host: "github.com" This problem does not exist with the nginx -> haproxy -> unicorn setup. Thinking this might be a file descriptor problem, I upped the fd limit to 32768 with no luck. Then I tried upping net.core.somaxconn to 262144 which also had no effect. I thought I'd ask about the problem here to see if anyone knows a simple solution that I'm missing. Perhaps there is an Nginx configuration directive I need? Thanks. Unicorn rocks! Tom -- Tom Preston-Werner GitHub Cofounder http://tom.preston-werner.com github.com/mojombo From normalperson at yhbt.net Fri Sep 18 02:48:31 2009 From: normalperson at yhbt.net (Eric Wong) Date: Thu, 17 Sep 2009 23:48:31 -0700 Subject: 502s with Nginx, Unicorn, and Unix Domain Sockets In-Reply-To: References: Message-ID: <20090918064831.GA5285@dcvr.yhbt.net> Tom Preston-Werner wrote: > I'm doing some benchmarking on our new Rackspace frontend machines (8 > core, 16GB) and running into some problems with the Unix domain socket > setup. At high request rates (on simple pages) I'm getting a lot of > HTTP 502 errors from Nginx. Nothing shows up in the Unicorn error log, > but Nginx has the following in its error log: Hi Tom, At what request rates were you running into this? Also how large are your responses? It could be the listen() backlog overflowing if Unicorn isn't logging anything. Anything in the system/kernel logs (doubtful, actually)? Does increasing the listen :backlog parameter work? 
Default is 1024 (which is pretty high already), maybe try a higher number along with the net.core.netdev_max_backlog sysctl. Is there a large discrepancy between the times your benchmark client logs, the request time nginx logs, and whatever Rails/Rack logs for request times for any particular request? If the Rails/Rack logging times all seem consistently low but your nginx/benchmark has some weird spikes/outliers, then some are stuck in the kernel listen backlog. How much of the 8 cores are being used on those boxes when this starts happening? > 2009/09/17 19:36:52 [error] 28277#0: *524824 connect() to > unix:/data/github/current/tmp/sockets/unicorn.sock failed (11: > Resource temporarily unavailable) while connecting to upstream, > client: 172.17.1.5, server: github.com, request: "GET /site/junk > HTTP/1.1", upstream: > "http://unix:/data/github/current/tmp/sockets/unic > orn.sock:/site/junk", host: "github.com" Raising proxy_connect_timeout in nginx may be a work around, what is it set to now? On the other hand, keeping it (and :backlog in Unicorn) low would give better indications for failover to other hosts. > This problem does not exist with the nginx -> haproxy -> unicorn > setup. Thinking this might be a file descriptor problem, I upped the > fd limit to 32768 with no luck. Then I tried upping net.core.somaxconn > to 262144 which also had no effect. I thought I'd ask about the > problem here to see if anyone knows a simple solution that I'm > missing. Perhaps there is an Nginx configuration directive I need? > Thanks. Unicorn rocks! Definitely not a file descriptor problem (at least not inside Unicorn). Also, I'm not sure there's a reason to keep haproxy between nginx and Unicorn... Maybe haproxy in front of the entire cluster of servers. Are you already hitting higher request rates (and more consistent times logged by client/nginx) with: nginx -> unicorn/unix vs nginx -> unicorn/tcp(localhost) ? 
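For reference, trying a larger listen backlog as suggested above might look like the following in the Unicorn config file. This is only a sketch: the socket path is copied from the nginx error log in this thread, and 2048 is an arbitrary value to experiment with, not a recommendation.

```ruby
# Unicorn config file fragment (evaluated by Unicorn itself, not a
# standalone script). The path comes from the nginx log quoted above;
# the backlog value is an assumption chosen only for experimentation.
listen "/data/github/current/tmp/sockets/unicorn.sock", :backlog => 2048
```

On Linux the kernel may cap the effective backlog at net.core.somaxconn, so both values matter when testing this.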
Under extremely high loads, 502s may actually be wanted since they allow
failover to a less loaded box if there's uneven balancing; but we really
need to have numbers on the request rates.

--
Eric Wong

From normalperson at yhbt.net  Fri Sep 18 18:16:00 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 18 Sep 2009 15:16:00 -0700
Subject: [ANN] unicorn 0.92.0
Message-ID: <20090918221600.GA9801@dcvr.yhbt.net>

Unicorn is a Rack HTTP server for Unix and fast clients

Small fixes and documentation are the focus of this release.

James Golick reported and helped me track down a bug that caused
SIGHUP to drop the default listener (0.0.0.0:8080) if and only if
listeners were completely unspecified in both the command line and
the Unicorn config file. The Unicorn config file remains the
recommended option for specifying listeners as it allows fine-tuning
of the :backlog, :rcvbuf, :sndbuf, :tcp_nopush, and :tcp_nodelay
options.

There are some documentation (and resulting website) improvements.
setup.rb users will notice the new section 1 manpages for `unicorn`
and `unicorn_rails`, Rubygems users will have to install manpages
manually or use the website.

Edit: That's not entirely true, I screwed up the package but you can
get them from http://unicorn.bogomips.org/unicorn.1 and
http://unicorn.bogomips.org/unicorn_rails.1

The HTTP parser got a 3rd-party code review which resulted in some
cleanups and one insignificant bugfix.

Additionally, the HTTP parser compiles, runs and passes unit tests
under Rubinius. The pure-Ruby parts do not work yet and we currently
lack the resources/interest to pursue this further but help will be
gladly accepted.

The website now has an Atom feed for new release announcements.
Those unfamiliar with Atom or HTTP may finger unicorn at bogomips.org
for the latest announcements.

Eric Wong (53):
      README: update with current version
      http: cleanup and avoid potential signedness warning
      http: clarify the setting of the actual header in the hash
      http: switch to macros for bitflag handling
      http: refactor keepalive tracking to functions
      http: use explicit elses for readability
      http: remove needless goto
      http: extra assertion when advancing p manually
      http: verbose assertions
      http: NIL_P(var) instead of var == Qnil
      http: rb_gc_mark already ignores immediates
      http: ignore Host: continuation lines with absolute URIs
      doc/SIGNALS: fix the no-longer-true bit about socket options
      "encoding: binary" comments for all sources (1.9)
      http_response: don't "rescue nil" for body.close
      CONTRIBUTORS: fix capitalization for why
      http: support Rubies without the OBJ_FROZEN macro
      http: define OFFT2NUM macro on Rubies without it
      http: no-op rb_str_modify() for Rubies without it
      http: compile with -fPIC
      http: use rb_str_{update,flush} if available
      http: create a new string buffer on empty values
      Update documentation for Rubinius support status
      http: cleanup assertion for memoized header strings
      http: add #endif comment labels where appropriate
      Add .mailmap file for "git shortlog" and other tools
      Update Manifest with mailmap
      Fix comment about speculative accept()
      SIGNALS: use "Unicorn" when referring to the web server
      Add new Documentation section for manpages
      test_exec: add extra tests for HUP and preload_app
      socket_helper: (FreeBSD) don't freeze the accept filter constant
      Avoid freezing objects that don't benefit from it
      SIGHUP no longer drops lone, default listener
      doc: generate ChangeLog and NEWS file for RDoc
      Remove Echoe and roll our own packaging/release...
      unicorn_rails: close parentheses in help message
      launchers: deprecate ambiguous -P/--p* switches
      man1/unicorn: avoid unnecessary emphasis
      Add unicorn_rails(1) manpage
      Documentation: don't force --rsyncable flag with gzip(1)
      Simplify and standardize manpages build/install
      GNUmakefile: package .tgz includes all generated files
      doc: begin integration of HTML manpages into RDoc
      Update TODO
      html: add Atom feeds
      doc: latest news is available through finger
      NEWS.atom: file timestamp matches latest entry
      pandoc needs the standalone switch for manpages
      man1/unicorn: split out RACK ENVIRONMENT section
      man1/unicorn_rails: fix unescaped underscore
      NEWS.atom.xml only lists the first 10 entries
      unicorn 0.92.0

* site: http://unicorn.bogomips.org/
* git: git://git.bogomips.org/unicorn.git
* cgit: http://git.bogomips.org/cgit/unicorn.git/
* list: mongrel-unicorn at rubyforge.org
* nntp: nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general
* finger: unicorn at bogomips.org

--
Eric Wong

From normalperson at yhbt.net  Fri Sep 18 22:30:13 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 18 Sep 2009 19:30:13 -0700
Subject: 502s with Nginx, Unicorn, and Unix Domain Sockets
In-Reply-To: <20090918064831.GA5285@dcvr.yhbt.net>
References: <20090918064831.GA5285@dcvr.yhbt.net>
Message-ID: <20090919023013.GA3608@dcvr.yhbt.net>

Hi Tom, any updates on this? I'd really like to get to the bottom of
this, thanks!

--
Eric Wong

From tom at github.com  Sat Sep 19 16:23:24 2009
From: tom at github.com (Tom Preston-Werner)
Date: Sat, 19 Sep 2009 13:23:24 -0700
Subject: 502s with Nginx, Unicorn, and Unix Domain Sockets
In-Reply-To: <20090918064831.GA5285@dcvr.yhbt.net>
References: <20090918064831.GA5285@dcvr.yhbt.net>
Message-ID:

On Thu, Sep 17, 2009 at 11:48 PM, Eric Wong wrote:
> At what request rates were you running into this?  Also how large are
> your responses?  It could be the listen() backlog overflowing if Unicorn
> isn't logging anything.

I was hitting the 502s at about 1300 req/sec and 80% CPU utilization.
Response size was only a few bytes + headers. I was just testing a
very simple string response from our Rails app to make sure our setup
could tolerate very high request rates.

> Does increasing the listen :backlog parameter work?  Default is 1024
> (which is pretty high already), maybe try a higher number along with the
> net.core.netdev_max_backlog sysctl.

This was the first thing I tried after getting your response, and it
seems that upping the :backlog to 2048 solves the 502 problem! I'm now
able to get 1500 req/sec out of Unicorn/UNIX (as opposed to 1350
req/sec with the TCP/HAProxy setup). I'm quite satisfied with this
result, and I think this is how we'll end up deploying the app.

Thanks for your help, and I'll try to keep you updated on how our
installation performs and if I see any strange behavior under normal
traffic.

Tom

From normalperson at yhbt.net  Sat Sep 19 18:08:56 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 19 Sep 2009 15:08:56 -0700
Subject: 502s with Nginx, Unicorn, and Unix Domain Sockets
In-Reply-To:
References: <20090918064831.GA5285@dcvr.yhbt.net>
Message-ID: <20090919220850.GA6650@dcvr.yhbt.net>

Tom Preston-Werner wrote:
> On Thu, Sep 17, 2009 at 11:48 PM, Eric Wong wrote:
> > At what request rates were you running into this?  Also how large are
> > your responses?  It could be the listen() backlog overflowing if Unicorn
> > isn't logging anything.
>
> I was hitting the 502s at about 1300 req/sec and 80% CPU utilization.
> Response size was only a few bytes + headers. I was just testing a
> very simple string response from our Rails app to make sure our setup
> could tolerate very high request rates.

Yup, as I suspected: your UNIX socket setup was maxing out right around
where your TCP setup was maxing out. TCP is just better at
handling/recovering from errors.

> > Does increasing the listen :backlog parameter work?  Default is 1024
> > (which is pretty high already), maybe try a higher number along with the
> > net.core.netdev_max_backlog sysctl.
>
> This was the first thing I tried after getting your response, and it
> seems that upping the :backlog to 2048 solves the 502 problem! I'm now
> able to get 1500 req/sec out of Unicorn/UNIX (as opposed to 1350
> req/sec with the TCP/HAProxy setup). I'm quite satisfied with this
> result, and I think this is how we'll end up deploying the app.

Good to know it worked!

However, I do hesitate to recommend a large listen() backlog for
production. It can interfere with monitoring/failover/load-balancing in
multi-server setups even if it looks good on benchmarks.

I'll make a separate call-for-testing post to the mailing list on this
subject in a bit...

> Thanks for your help, and I'll try to keep you updated on how our
> installation performs and if I see any strange behavior under normal
> traffic.

No problem, thanks for the feedback! It's great to know people actually
use it.

--
Eric Wong

From chris at ozmm.org  Sat Sep 19 19:07:02 2009
From: chris at ozmm.org (Chris Wanstrath)
Date: Sat, 19 Sep 2009 16:07:02 -0700
Subject: GitHub on Unicorn
Message-ID: <8b73aaca0909191607r63fd3987w552728c799fb0f0a@mail.gmail.com>

Just wanted to say that GitHub has been running on Unicorn for about
two weeks now. In that time it's successfully served millions of pages
and has survived two separate DDoS attacks.

Here's the config file we currently use (complete with a fun hack to
gracefully kill the old master when a new worker pool is ready):

http://gist.github.com/189623

(Tom's thread with the `backlog` fix concerns our new servers, which
aren't yet in production.)

I plan to do a writeup on our transition from mongrel_cluster to
Unicorn in the near future, in case others are interested. I'll post
the link here when it's available.

Also: I'm keeping a mirror of the project at
http://github.com/defunkt/unicorn for any other GH users who want to
watch updates in their generalized feed. I update it semi-regularly.

Long live fork(2)! And thanks again for the project.

Cheers,

--
Chris Wanstrath
http://github.com/defunkt

From normalperson at yhbt.net  Sat Sep 19 19:23:45 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 19 Sep 2009 16:23:45 -0700
Subject: [CFT] multi server failover setup
Message-ID: <20090919232345.GA14498@dcvr.yhbt.net>

I've been meaning to test this setup somewhere for a while, but never
got the right app/(real) traffic/hardware to do it with. So maybe
somebody can try it out and let us know if this works...

It's for applications running the same code/data on a cluster of
machines, so this doesn't apply to the majority of low-traffic sites out
there.

The goal is to avoid having to run a dedicated load
balancer/proxy/virtual IP in front of the application servers by using
round-robin DNS for the general load-balancing case.

The immediate downside to this approach is that if one host goes
completely dead (or just the nginx instance), your clients will have the
burden of doing failover. I know curl will failover if DNS resolves
multiple addresses but I'm not sure about other HTTP clients...

This setup requires that nginx + unicorn run on all application boxes.

The request flow for a 3 machine cluster would look like this:

        /--> host1(nginx --> unicorn)
       /
      /
client ----> host2(nginx --> unicorn)
      \
       \
        \--> host3(nginx --> unicorn)

Now in the unicorn configs:

  # We configure unicorn to listen on both a UNIX socket and TCP:

  # First the UNIX socket
  listen "/tmp/sock", :backlog => 5 # fails quickly if overloaded

  # use an internal IP here since Unicorn should not talk to the
  # public...
  listen "#{internal_ip}:8080", :backlog => 1024 # fail slowly

  # the exact numbers for the :backlog values are flexible
  # the idea here is just to have a very low :backlog for the
  # UNIX domain socket and big one as a fallback for the
  # TCP socket

And the nginx configs:

  upstream unicorn_failover {
    # primary connection, "fail_timeout=0" is to ensure that
    # we always *try* to use the UNIX socket on every request
    # that comes in:
    server unix:/tmp/sock fail_timeout=0;

    # failover connections, "backup" ensures these will not
    # be used unless connections to unix:/tmp/sock are failing
    # it may be advisable to reorder these on a per-host basis
    # so "host1" does not connect to "host1_private" as its
    # first choice...
    server host1_private:8080 fail_timeout=0 backup;
    server host2_private:8080 fail_timeout=0 backup;
    server host3_private:8080 fail_timeout=0 backup;
  }

The idea is that the majority of requests will use the UNIX socket,
which is a bit faster than the TCP one. However, if _some_ of your
machines start getting overloaded, nginx can failover to using TCP,
likely on a different host which may be less loaded.

So under heavy load, you may end up with requests flowing like this:

                         /------>-----\
                        /              \
        /--> host1(nginx --> unicorn)<--+-<-\
       /                \___/           |    |
      /                 /   \           |    |
client ----> host2(nginx --> unicorn)   V    ^
      \                 \___/           |    |
       \                /   \           |    |
        \--> host3(nginx --> unicorn)<--/    |
                          \                  |
                           `------>----------'

All the extra lines from this diagram are the "backup" flows.

This should help address the problem of certain (rare) actions being
extremely slow while the majority of the actions still run quickly. It
would help smooth out pathological cases where all the slow actions
somehow end up clustering on a small subset of machines in the cluster
while the rest of the machines are still in the comfort zone.

This setup will not help under extreme load when the entire cluster is
at capacity, only the case where an unbalanced subset of the cluster is
maxed out.

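The "primary first, backups only on failure" flow can be illustrated with a small stand-alone Ruby sketch. This is only a simplified model of the intent described above, not how nginx is implemented: `pick_upstream` and the two lambdas are hypothetical names made up for the illustration, and each "connect" is just a callable that reports success or failure.

```ruby
# Simplified model (an assumption for illustration, not nginx's code):
# try the primary upstream first, then each backup in order.
# Every upstream is a [name, connect] pair; connect is any callable
# that returns true when the connection is accepted.
def pick_upstream(primary, backups)
  ([primary] + backups).each do |name, connect|
    return name if connect.call
  end
  nil # nothing accepted the connection; the client would see a 502
end

# Hypothetical stand-ins: a UNIX socket whose tiny backlog is full,
# and TCP listeners that either refuse or accept.
overloaded = -> { false }
healthy    = -> { true }

chosen = pick_upstream(
  ["unix:/tmp/sock", overloaded],
  [["host1_private:8080", overloaded], ["host2_private:8080", healthy]]
)
puts chosen # prints host2_private:8080
```

With a healthy primary the first pair is returned immediately, which corresponds to the common case where every request stays on the local UNIX socket.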
Let me know if you have any questions/comments and especially any
results if you're brave enough to try this :)

--
Eric Wong

From normalperson at yhbt.net  Sat Sep 19 20:00:02 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 19 Sep 2009 17:00:02 -0700
Subject: GitHub on Unicorn
In-Reply-To: <8b73aaca0909191607r63fd3987w552728c799fb0f0a@mail.gmail.com>
References: <8b73aaca0909191607r63fd3987w552728c799fb0f0a@mail.gmail.com>
Message-ID: <20090920000002.GB6650@dcvr.yhbt.net>

Chris Wanstrath wrote:
> Just wanted to say that GitHub has been running on Unicorn for about
> two weeks now. In that time it's successfully served millions of pages
> and has survived two separate DDoS attacks.

Wow, this is wonderful news!

> Here's the config file we currently use (complete with a fun hack to
> gracefully kill the old master when a new worker pool is ready):
>
> http://gist.github.com/189623

Great use of the before/after_fork hooks

One possible issue is a race condition in the before_fork hook, so I'd
put a rescue to protect the File.read in the before_fork:

old master                  new master
-----------------------------------------------------------------------
                            before_fork for worker=0
                            File.exist?(old_pid) => true
                            Process.kill :QUIT, File.read(old_pid).to_i
                            before_fork for worker=1
                            File.exist?(old_pid) => true
processes :QUIT
unlinks old_pid
                            # the File.read will raise Errno::ENOENT:
                            Process.kill :QUIT, File.read(old_pid).to_i

> (Tom's thread with the `backlog` fix concerns our new servers, which
> aren't yet in production.)
>
> I plan to do a writeup on our transition from mongrel_cluster to
> Unicorn in the near future, in case others are interested. I'll post
> the link here when it's available.

Looking forward to it!

> Also: I'm keeping a mirror of the project at
> http://github.com/defunkt/unicorn for any other GH users who want to
> watch updates in their generalized feed. I update it semi-regularly.

Cool, more exposure's always good.

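Going back to the before_fork race described earlier in this message, a guard along these lines is one way it could look. This is a minimal sketch under stated assumptions: `quit_old_master` is a hypothetical helper name (not part of Unicorn), and the pid file path in the commented usage is a placeholder for whatever the real config uses.

```ruby
# Hypothetical helper (not a Unicorn API): ask the old master to quit,
# tolerating a pid file that disappears between the check and the read.
def quit_old_master(old_pid)
  return false unless File.exist?(old_pid)
  pid = File.read(old_pid).to_i
  # Never signal pid 0 (that would target our own process group),
  # which is what an empty or garbled pid file would produce.
  return false unless pid > 0
  Process.kill(:QUIT, pid)
  true
rescue Errno::ENOENT, Errno::ESRCH
  # ENOENT: the old master unlinked its pid file after File.exist?
  # ESRCH:  the pid was read, but that process has already exited
  false
end

# Possible use inside a Unicorn config file (the path is a placeholder):
#
#   before_fork do |server, worker|
#     quit_old_master("/path/to/unicorn.pid.oldbin")
#   end
```

Rescuing both errors means every worker's before_fork can call the helper safely, no matter which worker's signal the old master ends up acting on first.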
Small nit: "Eric Wong's Unicorn" doesn't sound quite right... While I
am the benevolent dictator for now, I do welcome contributions. I could
not have built it without standing on the shoulders of Mongrel and
the existence of Rack and nginx.

Personally, I try to keep a low public profile and it's always been a
weird balancing act trying to get people to use my work at the same
time...

> Long live fork(2)! And thanks again for the project.

:)

--
Eric Wong

From normalperson at yhbt.net  Wed Sep 30 19:44:17 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Wed, 30 Sep 2009 16:44:17 -0700
Subject: rolling your own Unicorn gem prerelease
Message-ID: <20090930234417.GA26720@dcvr.yhbt.net>

Instead of relying entirely on the ridiculous test suite, it's now
more easily possible to roll your own properly-versioned prerelease
gems. You'll need RubyGems >= 1.3.5 to handle pre-release version
numbers.

Of course setup.rb users (like myself) have always had this capability,
I just lack real applications to test against...

From 9cc4f87353b84f5229d4a8bae78260c24cd02154 Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Wed, 30 Sep 2009 13:41:26 -0700
Subject: [PATCH] Add makefile targets for non-release installs

This should make it easier to test and run unreleased versions.
---
 GNUmakefile |    8 ++++++++
 HACKING     |   15 +++++++++++++++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/GNUmakefile b/GNUmakefile
index 8becc89..3087082 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -227,6 +227,11 @@ verify:
 	test `git rev-parse --verify HEAD^0` = \
 	     `git rev-parse --verify refs/tags/v$(VERSION)^{}`
 
+gem: $(pkggem)
+
+install-gem: $(pkggem)
+	gem install $(CURDIR)/$<
+
 $(pkggem): manifest
 	gem build $(rfpackage).gemspec
 	mkdir -p pkg
@@ -249,6 +254,9 @@ release: verify package $(release_notes) $(release_changes)
 	  $(rfproject) $(rfpackage) $(VERSION) $(pkggem)
 	rubyforge add_file \
 	  $(rfproject) $(rfpackage) $(VERSION) $(pkgtgz)
+else
+gem install-gem: GIT-VERSION-FILE
+	$(MAKE) $@ VERSION=$(GIT_VERSION)
 endif
 
 .PHONY: .FORCE-GIT-VERSION-FILE doc $(T) $(slow_tests) manifest man
diff --git a/HACKING b/HACKING
index 5085545..08aa76d 100644
--- a/HACKING
+++ b/HACKING
@@ -96,3 +96,18 @@ We will adhere to mostly the same conventions for patch submissions
 as git itself. See the Documentation/SubmittingPatches document
 distributed with git on on patch submission guidelines to follow. Just
 don't email the git mailing list or maintainer with Unicorn patches :)
+
+== Running Development Versions
+
+It is easy to install the contents of your git working directory:
+
+Via RubyGems (RubyGems 1.3.5+ recommended for prerelease versions):
+
+  make install-gem
+
+Without RubyGems (via setup.rb):
+
+  make install
+
+It is not at all recommended to mix a RubyGems installation with an
+installation done without RubyGems, however.
--
Eric Wong