From giovanni.lion at gmail.com Wed Nov 4 13:49:33 2009
From: giovanni.lion at gmail.com (Giovanni Lion)
Date: Wed, 4 Nov 2009 19:49:33 +0100
Subject: rainbows for 3rd party api
Message-ID: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>

Hi all,

I came across rainbows while I was looking for a smart solution for
handling 3rd party api calls for my rails app. I would like to know a
little more about how to achieve efficiency in the following context:

1 user requests a page
2 page content requires xml to be retrieved from 3rd party server
  through http call
3 page is rendered, without the 3rd party data but with an onload ajax
  request back to the app to retrieve 3rd party data
4 app generates an http call to 3rd party api
5 app waits for 3rd party response
6 app responds to ajax call rendering html out of the xml response
  from 3rd party api

Right now my current setup is apache + passenger, with no constraints on
switching to anything else. This setup is not optimal of course,
because if I receive many concurrent requests that need a 3rd party
response, the passenger app pool is full and sleepy. From what I read in
the documentation, rainbows should come in handy in this situation. I had
a look at unicorn and I think I got more or less how it works. Can
anyone suggest how to set up the app deployment in order to reduce
waste on step 5? My guess is I should create a rack app to handle
these calls using DevFdResponse and run it with rainbows. Only problem
is, can I have the rails environment in there?

Thanks in advance,
Giovanni

From normalperson at yhbt.net Wed Nov 4 16:40:18 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Wed, 4 Nov 2009 13:40:18 -0800
Subject: rainbows for 3rd party api
In-Reply-To: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
References: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
Message-ID: <20091104214018.GA25942@dcvr.yhbt.net>

Giovanni Lion wrote:
> Hi all,
>
> I came across rainbows while I was looking for a smart solution for
> handling 3rd party api calls for my rails app. I would like to know a
> little more about how to achieve efficiency in the following context:
>
> 1 user requests a page
> 2 page content requires xml to be retrieved from 3rd party server
>   through http call
> 3 page is rendered, without the 3rd party data but with an onload ajax
>   request back to the app to retrieve 3rd party data
> 4 app generates an http call to 3rd party api
> 5 app waits for 3rd party response
> 6 app responds to ajax call rendering html out of the xml response
>   from 3rd party api
>
> Right now my current setup is apache + passenger, with no constraints on
> switching to anything else. This setup is not optimal of course,
> because if I receive many concurrent requests that need a 3rd party
> response, the passenger app pool is full and sleepy. From what I read in
> the documentation, rainbows should come in handy in this situation. I had
> a look at unicorn and I think I got more or less how it works. Can
> anyone suggest how to set up the app deployment in order to reduce
> waste on step 5? My guess is I should create a rack app to handle
> these calls using DevFdResponse and run it with rainbows. Only problem
> is, can I have the rails environment in there?

Hi Giovanni,

3rd party API responses are exactly one of the uses Rainbows! was built
for.

You really only want DevFdResponse if you're doing a straight proxy
between the 3rd party and your client without modifying the data.
Since you seem to be getting XML and rendering HTML, you probably can't
use DevFdResponse efficiently. Don't despair, though, Rainbows! still
gives you plenty of options :)

You can build a Rack config.ru to use with Rails, too. In fact, you'll
have to for now since we're unsure if we want to support a
"rainbows_rails" wrapper like I do with "unicorn_rails". Using
config.ru gives you much more flexibility to route around/outside
of Rails.

Your config.ru can be something like this:

---------------- 8< ------------------
# this example is totally untested and may have syntax errors
require 'net/http' # needed for the Net::HTTP fallback below
require 'config/boot' # might not be necessary with newer Rails
require 'config/environment'

# you only need one of these:
dispatcher = if $old_rails
  require 'unicorn/app/old_rails'
  Unicorn::App::OldRails.new
else
  ActionController::Dispatcher.new
end

# send all 3rd party API requests to "/3rd_party" through this block:
map("/3rd_party") do
  use Rack::ContentLength
  run lambda { |env|
    # error-checking is left as an exercise to the reader :)
    # PATH_INFO comes from the Rack env (not the process-wide ENV) and
    # already starts with a slash
    body = if env['rainbows.model'] == :Revactor
      url = "http://api.example.com#{env['PATH_INFO']}"
      Revactor::HttpClient.request("GET", url).body
    else
      Net::HTTP.get("api.example.com", env["PATH_INFO"])
    end
    # render_to_html(body) # define your own function here
    [ 200, { "Content-Type" => "text/html" }, [ body ] ]
  }
end

# send normal Rails requests here:
map("/") do
  use Rack::Lock # only needed in case your Rails app is not thread-safe

  # you can also use Rainbows::AppPool to limit Rails concurrency here
  # independently of "/3rd_party" requests if your Rails app is
  # thread-safe but not happy with too many threads.
  run dispatcher
end
---------------------------------- 8< --------------------------------

The above example will work best with the ThreadPool or ThreadSpawn or
Revactor concurrency models.
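
For reference, picking one of those concurrency models happens in the
Rainbows! config file rather than in config.ru. What follows is a
minimal sketch only: the file name and every number in it are
illustrative assumptions, not values recommended anywhere in this
thread.

```ruby
# rainbows.conf.rb (hypothetical file name), loaded with:
#   rainbows -c rainbows.conf.rb config.ru
Rainbows! do
  use :ThreadPool          # or :ThreadSpawn / :Revactor, as named above
  worker_connections 100   # illustrative: concurrent clients per worker
end

# Unicorn-style directives still apply outside the Rainbows! block.
worker_processes 2         # illustrative value
```

With a thread-based model the Net::HTTP branch of the config.ru above is
the one that runs; the Revactor branch only applies when the model is
:Revactor.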

I hope to have time to work on hybrid concurrency models with Rev and
EventMachine to mix threads into them, so the application dispatch can
be concurrent for those too, not just client <-> server I/O.

--
Eric Wong

From normalperson at yhbt.net Thu Nov 5 05:33:42 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 5 Nov 2009 02:33:42 -0800
Subject: [ANN] Rainbows! 0.5.0 - dependencies and compatibility
Message-ID: <20091105103342.GA20666@dcvr.yhbt.net>

Rainbows! is an HTTP server for sleepy Rack applications. It is based on
Unicorn, but designed to handle applications that expect long
request/response times and/or slow clients. For Rack applications not
heavily bound by slow external network dependencies, consider Unicorn
instead as it is simpler and easier to debug.

* http://rainbows.rubyforge.org/
* rainbows-talk at rubyforge.org
* git://git.bogomips.org/rainbows.git

Changes:

We depend on the just-released Unicorn 0.94.0 for the fixed
trailer handling. As with `unicorn', the `rainbows' executable
now sets and respects ENV["RACK_ENV"]. Also small fixes and
cleanups, including better FreeBSD 7.2 compatibility and being less
likely to over-aggressively kill slow/idle workers when a very
low timeout is set.

Eric Wong (20):
      rev: split out heartbeat class
      bump Unicorn dependency to (consistently) pass tests
      tests: avoid single backquote in echo
      event_machine: avoid slurping when proxying
      tests: make timeout tests reliable under 1.9
      thread_pool: comment for potential SMP issue under 1.9
      Allow 'use "model"' as a string as well as symbol
      Rev model is the only user of deferred_bodies
      ev_core: use Tempfile instead of Unicorn::Util::tmpio
      ev_core: ensure quit is triggered on all errors
      rainbows: set and use process-wide ENV["RACK_ENV"]
      http_server: add one second to any requested timeout
      thread_pool: update fchmod heartbeat every second
      t0004: tighten up timeout test
      ev_core: remove Tempfile usage once again
      cleanup: remove unused t????.ru test files
      tests: staggered trailer upload test
      ensure RACK_ENV is inherited from the parent env
      t0100: more precise `expr` usage

Hopefully we'll have time to get more concurrency models supported soon.

--
Eric Wong

From giovanni.lion at gmail.com Thu Nov 5 08:03:12 2009
From: giovanni.lion at gmail.com (Giovanni Lion)
Date: Thu, 5 Nov 2009 14:03:12 +0100
Subject: rainbows for 3rd party api
In-Reply-To: <20091104214018.GA25942@dcvr.yhbt.net>
References: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
	<20091104214018.GA25942@dcvr.yhbt.net>
Message-ID: <2007122a0911050503x5740cf3ei4f1185b4cb895298@mail.gmail.com>

> Hi Giovanni,
>
> 3rd party API responses are exactly one of the uses Rainbows! was built
> for.
>
> You really only want DevFdResponse if you're doing a straight proxy
> between the 3rd party and your client without modifying the data. Since
> you seem to be getting XML and rendering HTML, you probably can't use
> DevFdResponse efficiently. Don't despair, though, Rainbows! still
> gives you plenty of options :)
>
> You can build a Rack config.ru to use with Rails, too. In fact, you'll
> have to for now since we're unsure if we want to support a
> "rainbows_rails" wrapper like I do with "unicorn_rails".
> Using config.ru gives you much more flexibility to route around/outside
> of Rails.

Ok, I think I got most of it. Now I was just thinking about the best
way to get this going. The issue now is that processing the xml into
the html is something I prefer keeping inside the app for consistency.
My idea was to do something like this, I use an example this time:

1) I get a request for a friend list html partial
2) I intercept it and using revactor
3) Wait for the response (It shouldn't be called waiting with revactor
   and fibers, right?)
4) I write the response to memcached
5) I call the rails app, which now fetches the friend list from cache
6) The rails app renders the partial and everybody is happy

Do you think this is a good flow? Should I create a specific method instead?

Also, another issue is not clear to me yet. I need to make api
requests with oauth, hence there is a logic layer on top of the http
request. I would probably need to break it down and replace Net::HTTP
with Revactor::HttpClient, am I correct?

Thanks for your help
Giovanni

From normalperson at yhbt.net Thu Nov 5 18:06:39 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 5 Nov 2009 15:06:39 -0800
Subject: rainbows for 3rd party api
In-Reply-To: <2007122a0911050503x5740cf3ei4f1185b4cb895298@mail.gmail.com>
References: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
	<20091104214018.GA25942@dcvr.yhbt.net>
	<2007122a0911050503x5740cf3ei4f1185b4cb895298@mail.gmail.com>
Message-ID: <20091105230638.GA7131@dcvr.yhbt.net>

Giovanni Lion wrote:
> > Hi Giovanni,
> >
> > 3rd party API responses are exactly one of the uses Rainbows! was built
> > for.
> >
> > You really only want DevFdResponse if you're doing a straight proxy
> > between the 3rd party and your client without modifying the data. Since
> > you seem to be getting XML and rendering HTML, you probably can't use
> > DevFdResponse efficiently. Don't despair, though, Rainbows!
> > still gives you plenty of options :)
> >
> > You can build a Rack config.ru to use with Rails, too. In fact, you'll
> > have to for now since we're unsure if we want to support a
> > "rainbows_rails" wrapper like I do with "unicorn_rails". Using
> > config.ru gives you much more flexibility to route around/outside
> > of Rails.
>
> Ok, I think I got most of it. Now I was just thinking about the best
> way to get this going. The issue now is that processing the xml into
> the html is something I prefer keeping inside the app for consistency.
> My idea was to do something like this, I use an example this time:
>
> 1) I get a request for a friend list html partial
> 2) I intercept it and using revactor
> 3) Wait for the response (It shouldn't be called waiting with revactor
>    and fibers, right?)

Well, from the caller's point of view, it is waiting :)

> 4) I write the response to memcached
> 5) I call the rails app, which now fetches the friend list from cache
> 6) The rails app renders the partial and everybody is happy
>
> Do you think this is a good flow? Should I create a specific method instead?

Depends on the rest of your app, I guess. Is your Rails app
reentrant? If so, definitely go for it. If you're dealing
with DB connections in there, compatibility will probably be
better with the ThreadPool or ThreadSpawn models unless somebody
writes Revactor-enabled DB libraries.

> Also, another issue is not clear to me yet. I need to make api
> requests with oauth, hence there is a logic layer on top of the http
> request. I would probably need to break it down and replace Net::HTTP
> with Revactor::HttpClient, am I correct?

Yes. To effectively use Revactor you pretty much have to change any
parts of your app/libraries that do network I/O to use Revactor's
networking API. Fortunately it's not too hard since program logic is
still linear with the Actor model.

You might even want to do it for memcached, too, but then again
memcached is pretty fast on a LAN and you might not notice it block.

--
Eric Wong

From giovanni.lion at gmail.com Fri Nov 6 05:40:51 2009
From: giovanni.lion at gmail.com (Giovanni Lion)
Date: Fri, 6 Nov 2009 11:40:51 +0100
Subject: rainbows for 3rd party api
In-Reply-To: <20091105230638.GA7131@dcvr.yhbt.net>
References: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
	<20091104214018.GA25942@dcvr.yhbt.net>
	<2007122a0911050503x5740cf3ei4f1185b4cb895298@mail.gmail.com>
	<20091105230638.GA7131@dcvr.yhbt.net>
Message-ID: <2007122a0911060240j105c1fcfgfebb2c5757cf7fd1@mail.gmail.com>

On Fri, Nov 6, 2009 at 12:06 AM, Eric Wong wrote:
> Giovanni Lion wrote:
>> > Hi Giovanni,
>> >
>> > 3rd party API responses are exactly one of the uses Rainbows! was built
>> > for.
>> >
>> > You really only want DevFdResponse if you're doing a straight proxy
>> > between the 3rd party and your client without modifying the data. Since
>> > you seem to be getting XML and rendering HTML, you probably can't use
>> > DevFdResponse efficiently. Don't despair, though, Rainbows! still
>> > gives you plenty of options :)
>> >
>> > You can build a Rack config.ru to use with Rails, too. In fact, you'll
>> > have to for now since we're unsure if we want to support a
>> > "rainbows_rails" wrapper like I do with "unicorn_rails". Using
>> > config.ru gives you much more flexibility to route around/outside
>> > of Rails.
>>
>> Ok, I think I got most of it. Now I was just thinking about the best
>> way to get this going. The issue now is that processing the xml into
>> the html is something I prefer keeping inside the app for consistency.
>> My idea was to do something like this, I use an example this time:
>>
>> 1) I get a request for a friend list html partial
>> 2) I intercept it and using revactor
>> 3) Wait for the response (It shouldn't be called waiting with revactor
>>    and fibers, right?)
>
> Well, from the caller's point of view, it is waiting :)
>
>> 4) I write the response to memcached
>> 5) I call the rails app, which now fetches the friend list from cache
>> 6) The rails app renders the partial and everybody is happy
>>
>> Do you think this is a good flow? Should I create a specific method instead?
>
> Depends on the rest of your app, I guess. Is your Rails app
> reentrant? If so, definitely go for it. If you're dealing
> with DB connections in there, compatibility will probably be
> better with the ThreadPool or ThreadSpawn models unless somebody
> writes Revactor-enabled DB libraries.

Well, I do have a db in there but it's just one users table, which I
really need only to pair the facebook id to the oauth tokens to make
the api calls. Unfortunately it's a 3-legged oauth so I'm kinda stuck
with having my own db. I'm testing now something like this:

map("/3rd_party/friends") do
  use Rack::Facebook # checking the fb signature
  run lambda { |env|
    request = Rack::Request.new(env)
    return Rack::Response.new(["Invalid Facebook signature"],
      400).finish unless request.POST['fb_sig']
    user = User.find_by_fb_id(request.POST['fb_sig_user']) # db query
    ...
    oauth_stuff_i_am_still_working_on
    ....
  }
end

Will the rails connection pool mess things up if used in an actor? Right
now I'm really much more worried about the app being blocked in the
http call as the third party is acting wonky lately. Db load is not a
worry for now.

Let me know your thoughts while I get the oauth - revactor http going.

Giovanni

From normalperson at yhbt.net Fri Nov 6 14:20:20 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 6 Nov 2009 19:20:20 +0000
Subject: rainbows for 3rd party api
In-Reply-To: <2007122a0911060240j105c1fcfgfebb2c5757cf7fd1@mail.gmail.com>
References: <2007122a0911041049u2b4376dbpd3b1f727e315ea88@mail.gmail.com>
	<20091104214018.GA25942@dcvr.yhbt.net>
	<2007122a0911050503x5740cf3ei4f1185b4cb895298@mail.gmail.com>
	<20091105230638.GA7131@dcvr.yhbt.net>
	<2007122a0911060240j105c1fcfgfebb2c5757cf7fd1@mail.gmail.com>
Message-ID: <20091106192020.GA11050@dcvr.yhbt.net>

Giovanni Lion wrote:
> On Fri, Nov 6, 2009 at 12:06 AM, Eric Wong wrote:
> > Giovanni Lion wrote:
> >> 4) I write the response to memcached
> >> 5) I call the rails app who now fetches from cache the friend list
> >> 6) The rails app renders the partial and everybody is happy
> >>
> >> Do you think this is a good flow? Should I create a specific method instead?
> >
> > Depends on the rest of your app, I guess. Is your Rails app
> > reentrant? If so, definitely go for it. If you're dealing
> > with DB connections in there, compatibility will probably be
> > better with the ThreadPool or ThreadSpawn models unless somebody
> > writes Revactor-enabled DB libraries.
>
> Well, I do have a db in there but it's just one users table, which i
> really need only to pair the facebook id to the oauth tokens to make
> the api calls. Unfortunately it's a 3-legged oauth so i'm kinda stuck
> with having my own db. I'm testing now something like this:
>
> map("/3rd_party/friends") do
>   use Rack::Facebook #checking the fb signature
>   run lambda { |env|
>     request = Rack::Request.new(env)
>     return Rack::Response.new(["Invalid Facebook signature"],
>       400).finish unless request.POST['fb_sig']
>     user = User.find_by_fb_id(request.POST['fb_sig_user']) #db query
>     ...
>     oauth_stuff_i_am_still_working_on
>     ....
>   }
> end
>
> Will rails connection pool mess things up if used in an actor?
> Right now I'm really much more worried about the app being blocked in
> the http call as the third party is acting wonky lately. Db load is not
> a worry for now.
>
> Let me know your thoughts while I get the oauth - revactor http going.

I don't think there's any need to worry if your DB queries are fast and
predictable in performance. Let us know how everything goes. 3rd-party
API calls often pose a scalability issue that Rainbows! is designed to
solve.

--
Eric Wong

From normalperson at yhbt.net Tue Nov 10 20:38:42 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Tue, 10 Nov 2009 17:38:42 -0800
Subject: [ANN] upr - Upload Progress for Rack (initial release)
Message-ID: <20091111013842.GB21125@dcvr.yhbt.net>

upr is Rack middleware that allows browser-side upload progress
monitoring. It is based on the "mongrel_upload_progress" module, but
allows any Moneta backing store in addition to DRb. There is also a
packaged example for using an ActiveRecord model for Rails.

* http://upr.bogomips.org/
* upr at librelist.com
* git://git.bogomips.org/upr.git

You can see upr in action at http://upr-demo.bogomips.org/
It will report the size and SHA1 of the file you've uploaded.
Much of the demo was stolen from mongrel_upload_progress.

== Web Server Compatibility

While upr is completely Rack::Lint-compatible, upr is only compatible
with Rack web servers that support a streaming "rack.input". Currently
this is limited to {Rainbows!}[http://rainbows.rubyforge.org/] with a
handful of concurrency models:

* ThreadSpawn
* ThreadPool
* Revactor*

For use with Revactor, the use of network-based Moneta stores or DRb is
only advised if those stores are using Revactor-aware sockets.

== JavaScript/HTML Compatibility

The current developer does not react well with GUIs. Thus all (R)HTML
and Prototype JavaScript code was stolen from mongrel_upload_progress.

Contributions to add compatibility for more modern things like jQuery
and HTML5 are very welcome.

== Backend Compatibility

We depend on {Moneta}[http://github.com/wycats/moneta], which allows the
use of a multitude of key-value stores. We also provide a
DRb+Moneta::Memory server to ease transitions from
mongrel_upload_progress. Additionally, there is an example for using
Rails ActiveRecord as a backend storage mechanism.

Cookie-based upload tracking may eventually be used, too (contributions
very welcome).

== Proxy Compatibility

No proxy is required when used with Rainbows!

The only incompatible HTTP proxy we know of is nginx. nginx will buffer
large requests to the filesystem before sending them to the backend.
nginx has its own 3rd-party module for
{upload progress}[http://wiki.nginx.org/NginxHttpUploadProgressModule]
and may be used instead of upr.

Most other HTTP-aware and all TCP-only proxies should be compatible.
Disabling Nagle's algorithm in both the Rack web server and proxy is
advised for lower latency, especially with stunnel.

== Unicorn Compatibility

While {Unicorn}[http://unicorn.bogomips.org/] provides the streaming
"rack.input" for Rainbows!, using Unicorn with upr is generally NOT
recommended. Unicorn only supports fast clients and progress reporting
is unnecessary unless clients are uploading files that are hundreds of
megabytes in size or larger.

== Getting Started

  gem install upr

For Rails, look at the Rails application
{example}[http://git.bogomips.org/cgit/upr.git/tree/examples/rails_app-2.3.4]
and RDoc.

More documentation is on the way.

--
Eric Wong

From normalperson at yhbt.net Thu Nov 12 05:04:49 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 12 Nov 2009 02:04:49 -0800
Subject: dealing with client disconnects with TeeInput
Message-ID: <20091112100449.GA1929@dcvr.yhbt.net>

Foreword: this probably doesn't affect nginx+Unicorn users, which
is the recommended configuration for the vast majority of sites.
It probably affects Rainbows!
users using Thread* or Revactor the most, and probably some Unicorn
users serving Intranet clients directly.

When clients are uploading large files, there's always a good
possibility of them disconnecting before the upload ends. For other web
app servers it's not much of a problem: they read the entire upload
before attempting to process things; so the app never sees a prematurely
disconnected client.

However, Rainbows! and Unicorn have the TeeInput class which allows
real-time processing of uploads as they occur, so the application can
be in the middle of reading "rack.input" when the client goes away and
the socket read raises EOFError.

Now, we _want_ the exception to be thrown and the application to stop
processing the dead client request immediately. I've made changes in
unicorn.git and rainbows.git to ensure no EOFError exceptions from the
socket are silenced, not just ones from reading trailers.

However, this means (many more) socket errors will be seen within the
application and any global exception trappers they use will see them as
well. For Rails (and possibly other frameworks), this can mean very
messy log files with large backtraces.

So, would making a Unicorn::Disconnect < EOFError exception class and
raising it with a short/empty backtrace on EOFErrors be the best way to
go? That way those global exception trappers can distinguish between
EOFError exceptions raised by Unicorn/Rainbows! itself and other code
that Unicorn/Rainbows! does not care about, and log appropriately...

The other option we have is catch/throw. We can avoid worrying about
the stack trace entirely, and middlewares that opt in can still capture
and log the disconnect if they want to. More maintenance overhead for
Rainbows! with all its concurrency models, but this is a situation where
I think catch/throw is appropriate given the current
middleware/application stacks these days.

Thanks for reading.
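
The two options above can be sketched in plain Ruby. This is an
illustration under stated assumptions, not code from unicorn.git or
rainbows.git: the Sketch module, read_upload, dispatch and
dispatch_with_throw are hypothetical names, and a StringIO stands in for
the client socket.

```ruby
require 'stringio'

# Hypothetical stand-in for the proposed exception class.
module Sketch
  class Disconnect < EOFError; end
end

# Option 1: re-raise the low-level EOFError as a dedicated subclass
# with an empty backtrace, so log output stays short.
def read_upload(io)
  io.readpartial(16384)
rescue EOFError
  raise Sketch::Disconnect, "client went away", []
end

# A global exception trapper that can now tell the two cases apart.
def dispatch(io, log)
  read_upload(io)
  :ok
rescue Sketch::Disconnect => e
  log << "#{e.class}: #{e.message}"          # terse, no backtrace
  :client_gone
rescue EOFError => e
  log << "#{e.class}: #{e.message}\n#{e.backtrace.join("\n")}"
  :app_error
end

# Option 2: catch/throw, which involves no exception object at all.
def dispatch_with_throw(io)
  catch(:client_disconnect) do
    begin
      io.readpartial(16384)
    rescue EOFError
      throw :client_disconnect, :client_gone
    end
    :ok
  end
end

log = []
dispatch(StringIO.new(""), log)        # empty input simulates a disconnect
dispatch_with_throw(StringIO.new(""))
```

Because the subclass still inherits from EOFError, the order of the
rescue clauses matters: the more specific class has to be listed first.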

--
Eric Wong

From normalperson at yhbt.net Fri Nov 13 20:16:58 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 13 Nov 2009 17:16:58 -0800
Subject: dealing with client disconnects with TeeInput
In-Reply-To: <20091112100449.GA1929@dcvr.yhbt.net>
References: <20091112100449.GA1929@dcvr.yhbt.net>
Message-ID: <20091114011658.GA18151@dcvr.yhbt.net>

Eric Wong wrote:
> So, would making a Unicorn::Disconnect < EOFError exception class and
> raising it with a short/empty backtrace on EOFErrors be the best way to
> go? That way those global exception trappers can distinguish between
> EOFError exceptions raised by Unicorn/Rainbows! itself and other code
> that Unicorn/Rainbows does not care about, and log appropriately...

I actually named it Unicorn::ClientShutdown since I figured the name
would be more descriptive. Here's what I've pushed out to unicorn.git:

>From e4256da292f9626d7dfca60e08f65651a0a9139a Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Sat, 14 Nov 2009 00:23:19 +0000
Subject: [PATCH] raise Unicorn::ClientShutdown if client aborts in TeeInput

Leaving the EOFError exception as-is is bad because most
applications/frameworks run an application-wide exception
handler to pretty-print and/or log the exception with a huge
backtrace.

Since there's absolutely nothing we can do in the server-side
app to deal with clients prematurely shutting down, having a
backtrace does not make sense. Having a backtrace can even be
harmful since it creates unnecessary noise for application
engineers monitoring or tracking down real bugs.
---
 lib/unicorn.rb           |    6 ++++
 lib/unicorn/tee_input.rb |   11 ++++++++
 test/unit/test_server.rb |   61 ++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn.rb b/lib/unicorn.rb
index a696402..c6c311e 100644
--- a/lib/unicorn.rb
+++ b/lib/unicorn.rb
@@ -8,6 +8,12 @@ autoload :Rack, 'rack'
 # a Unicorn web server. It contains a minimalist HTTP server with just enough
 # functionality to service web application requests fast as possible.
 module Unicorn
+
+  # raise this inside TeeInput when a client disconnects inside the
+  # application dispatch
+  class ClientShutdown < EOFError
+  end
+
   autoload :Const, 'unicorn/const'
   autoload :HttpRequest, 'unicorn/http_request'
   autoload :HttpResponse, 'unicorn/http_response'
diff --git a/lib/unicorn/tee_input.rb b/lib/unicorn/tee_input.rb
index 69397c0..50ddb5b 100644
--- a/lib/unicorn/tee_input.rb
+++ b/lib/unicorn/tee_input.rb
@@ -135,10 +135,21 @@ module Unicorn
         end
       end
       finalize_input
+      rescue EOFError
+        # in case client only did a premature shutdown(SHUT_WR)
+        # we do support clients that shutdown(SHUT_WR) after the
+        # _entire_ request has been sent, and those will not have
+        # raised EOFError on us.
+        socket.close if socket
+        raise ClientShutdown, "bytes_read=#{@tmp.size}", []
     end
 
     def finalize_input
       while parser.trailers(req, buf).nil?
+        # Don't worry about throw-ing :http_499 here on EOFError, tee()
+        # will catch EOFError when app is processing it, otherwise in
+        # initialize we never get any chance to enter the app so the
+        # EOFError will just get trapped by Unicorn and not the Rack app
         buf << socket.readpartial(Const::CHUNK_SIZE)
       end
       self.socket = nil
diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb
index bbb06da..a7f6a35 100644
--- a/test/unit/test_server.rb
+++ b/test/unit/test_server.rb
@@ -12,11 +12,12 @@ include Unicorn
 
 class TestHandler
 
-  def call(env)
-  #   response.socket.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello!\n")
+  def call(env)
     while env['rack.input'].read(4096)
     end
     [200, { 'Content-Type' => 'text/plain' }, ['hello!\n']]
+  rescue Unicorn::ClientShutdown => e
+    $stderr.syswrite("#{e.class}: #{e.message} #{e.backtrace.empty?}\n")
    end
 end
 
@@ -103,6 +104,62 @@ class WebServerTest < Test::Unit::TestCase
 
     assert_equal 'hello!\n', results[0], "Handler didn't really run"
   end
+  def test_client_shutdown_writes
+    sock = nil
+    buf = nil
+    bs = 15609315 * rand
+    assert_nothing_raised do
+      sock = TCPSocket.new('127.0.0.1', @port)
+      sock.syswrite("PUT /hello HTTP/1.1\r\n")
+      sock.syswrite("Host: example.com\r\n")
+      sock.syswrite("Transfer-Encoding: chunked\r\n")
+      sock.syswrite("Trailer: X-Foo\r\n")
+      sock.syswrite("\r\n")
+      sock.syswrite("%x\r\n" % [ bs ])
+      sock.syswrite("F" * bs)
+      sock.syswrite("\r\n0\r\nX-")
+      "Foo: bar\r\n\r\n".each_byte do |x|
+        sock.syswrite x.chr
+        sleep 0.05
+      end
+      # we wrote the entire request before shutting down, server should
+      # continue to process our request and never hit EOFError on our sock
+      sock.shutdown(Socket::SHUT_WR)
+      buf = sock.read
+    end
+    assert_equal 'hello!\n', buf.split(/\r\n\r\n/).last
+    lines = File.readlines("test_stderr.#$$.log")
+    assert lines.grep(/^Unicorn::ClientShutdown: /).empty?
+    assert_nothing_raised { sock.close }
+  end
+
+  def test_client_shutdown_write_truncates
+    sock = nil
+    buf = nil
+    bs = 15609315 * rand
+    assert_nothing_raised do
+      sock = TCPSocket.new('127.0.0.1', @port)
+      sock.syswrite("PUT /hello HTTP/1.1\r\n")
+      sock.syswrite("Host: example.com\r\n")
+      sock.syswrite("Transfer-Encoding: chunked\r\n")
+      sock.syswrite("Trailer: X-Foo\r\n")
+      sock.syswrite("\r\n")
+      sock.syswrite("%x\r\n" % [ bs ])
+      sock.syswrite("F" * (bs / 2.0))
+
+      # shutdown prematurely, this will force the server to abort
+      # processing on us even during app dispatch
+      sock.shutdown(Socket::SHUT_WR)
+      IO.select([sock], nil, nil, 60) or raise "Timed out"
+      buf = sock.read
+    end
+    assert_equal "", buf
+    lines = File.readlines("test_stderr.#$$.log")
+    lines = lines.grep(/^Unicorn::ClientShutdown: bytes_read=\d+/)
+    assert_equal 1, lines.size
+    assert_match %r{\AUnicorn::ClientShutdown: bytes_read=\d+ true$}, lines[0]
+    assert_nothing_raised { sock.close }
+  end
 
   def do_test(string, chunk, close_after=nil, shutdown_delay=0)
     # Do not use instance variables here, because it needs to be thread safe
--
Eric Wong

From normalperson at yhbt.net Sat Nov 14 21:06:10 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 14 Nov 2009 18:06:10 -0800
Subject: Unicorn FAQ section added
Message-ID: <20091115020610.GA29681@dcvr.yhbt.net>

Hi all,

If you haven't noticed, I've added a FAQ section to the site at
http://unicorn.bogomips.org/FAQ.html

Let us know if we've missed anything and send patches here if you can.

--
Eric Wong

From normalperson at yhbt.net Sun Nov 15 18:49:18 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Sun, 15 Nov 2009 15:49:18 -0800
Subject: Rainbows! 0.6.0 - bugfixes galore
Message-ID: <20091115234918.GA26776@dcvr.yhbt.net>

Rainbows! is an HTTP server for sleepy Rack applications. It is based on
Unicorn, but designed to handle applications that expect long
request/response times and/or slow clients. For Rack applications not
heavily bound by slow external network dependencies, consider Unicorn
instead as it is simpler and easier to debug.

* http://rainbows.rubyforge.org/
* rainbows-talk at rubyforge.org
* git://git.bogomips.org/rainbows.git

Changes:

Client shutdowns/errors when streaming "rack.input" into the
Rack application are quieter now. Rev and EventMachine workers
now shut down correctly when the master dies. Worker processes
now fail gracefully if log reopening fails. ThreadSpawn and
ThreadPool models now load Unicorn classes in a thread-safe way.

There's also an experimental RevThreadSpawn concurrency
model which may be heavily reworked in the future...

Eric Wong (31):
      Threaded models have trouble with late loading under 1.9
      cleanup worker heartbeat and master deathwatch
      tests: allow use of alternative sha1 implementations
      rev/event_machine: simplify keepalive checking a bit
      tests: sha1.ru now handles empty bodies
      rev: split out further into separate files for reuse
      rev: DeferredResponse is independent of parser state
      remove unnecessary class variable
      ev_core: cleanup handling of APP constant
      rev: DeferredResponse: always attach to main loop
      initial cut of the RevThreadSpawn model
      rev_thread_spawn/revactor: fix TeeInput for short reads
      rev_thread_spawn: make 1.9 TeeInput performance tolerable
      tests: add executable permissions to t0102
      tests: extra check to avoid race in reopen logs test
      rev_thread_spawn: 16K chunked reads work better
      tests: ensure proper accounting of worker_connections
      tests: heartbeat-timeout: simplify and avoid possible race
      tests: ensure we process "START" from FIFO when starting
      http_response: don't "rescue nil" for body.close
      cleanup error handling pieces
      tests: more stringent tests for error handling
      revactor/tee_input: unnecessary error handling
      gracefully exit workers if reopening logs fails
      revactor/tee_input: raise ClientDisconnect on EOFError
      bump versions since we depend on Unicorn::ClientShutdown
      FAQ: updates for Rails and SSL-using sites
      revactor/tee_input: share error handling with superclass
      RevThreadSpawn is still experimental
      Revert "Threaded models have trouble with late loading under 1.9"
      Rakefile: add raa_update task

--
Eric Wong

From normalperson at yhbt.net Mon Nov 16 16:26:41 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 16 Nov 2009 13:26:41 -0800
Subject: rainbows for sleepy/lethargic apps
In-Reply-To: <76461C1B-E671-4DA3-BED0-12F9E571125A@elctech.com>
References: <76461C1B-E671-4DA3-BED0-12F9E571125A@elctech.com>
Message-ID: <20091116212641.GA7093@dcvr.yhbt.net>

Dylan Stamat wrote:
> I'm working on a project that has very bad application performance,
specifically, tons of long running requests due to view and > (specifically) database contention. > > While I haven't tested our Unicorn setup under load yet, tweak backlog > counts, etc... I'm wondering if I'd be better off giving Rainbows! a > shot? In the Rainbows! AppPool diagram, I'm having a hard time > understanding the N:P relationship. When the P threshold is met, what > happens to N (the client)? Hi Dylan, With AppPool, request headers are fully parsed and end up getting queued up in userspace. No request bodies are read (since AppPool is only compatible with the Thread-based concurrency models at the moment). Rainbows is designed for apps that are intentionally/unavoidably sleepy. Since I imagine the database is within your control, I'd optimize that first and fall back to using Unicorn with a higher :backlog. Since you're hitting database contention, throwing more concurrency at the database with Rainbows! is probably the wrong thing to do... If your database is just on a high latency network for whatever reason, then maybe Rainbows! with Neverblock drivers can help. Otherwise if your database is bogged down because of memory/CPU/disk, then you probably want fewer things hitting it at once. Can you shard, cache or otherwise offload DB requests? With slow views, Unicorn is pretty effective at using all available cores already with multiple worker processes. Even with a Ruby VM with non-crippled native threads (Rubinius maybe, not MRI 1.9.1) I doubt it'll help with CPU/memory-bound rendering performance. > Also, with both Unicorn/Rainbows!, is there a explicit timeout number > set on the clients (and how does that differ between > Unicorn/Rainbows!)? For instance, in our Nginx config, I have > proxy_buffering turned on, with proxy_read_timeout and > proxy_send_timeout's, set pretty high. Based on your nginx config, the easiest thing would probably be to set a high :backlog for now and then work on tuning/optimizing/avoiding your database ASAP. 
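For reference, the "high :backlog" suggestion above might look roughly like the following in a Unicorn config file. This is only a sketch: the file name, port, worker count, backlog size and timeout are made-up values for illustration, not recommendations from this thread.

```ruby
# unicorn.conf.rb (hypothetical file name; all numbers are illustrative)
worker_processes 4

# :backlog is the size of the kernel listen queue for this socket, i.e.
# how many not-yet-accepted connections may wait while all workers are
# busy. The OS may cap it (on Linux, via net.core.somaxconn).
listen 8080, :backlog => 2048

# Unicorn's timeout covers the entire request, so it needs to exceed
# the slowest request the app is expected to serve.
timeout 60
```

A longer queue only lets clients wait instead of being refused; it does not make the slow requests themselves any faster.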
Views are harder to optimize; Unicorn itself always uses as many resources as the OS can give it to render. A lot of slow views I've seen hit a lot of memory allocation, so check out tcmalloc (included with REE) or use 1.9 (possibly with tcmalloc, too). 1.9 has some good optimizations for allocating shorter strings (should be more effective on 64-bit) and small hashes/arrays, too.

My personal style is also to use destructive methods as much as possible (concat/<< vs +/+=, map! vs map, gsub! vs gsub, tr! vs tr, etc...). Avoid creating lots of OpenStruct objects, too. Excessive use of define_method/metadef allocates a lot of short-lived objects and thrashes memory.

The timeout in Unicorn affects the entire request (including all I/O). The timeout in Rainbows only kicks in when the process is completely deadlocked.

Since apparently lots of "normal" HTTP clients have ridiculous keepalive times, Rainbows! will be getting a separate keepalive_timeout directive soon that only affects reading HTTP headers. I don't plan on enforcing a timeout when processing request bodies or app responses (those can probably be done in a more controlled manner in the app/middleware).

> Lastly... You rock Eric!

Thanks, but much of the credit goes to random folks all over who have helped with this (and projects leading up to this), too :>

--
Eric Wong

From normalperson at yhbt.net Tue Nov 24 14:24:30 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Tue, 24 Nov 2009 11:24:30 -0800
Subject: HTML5 WebSockets
Message-ID: <20091124192430.GA4965@dcvr.yhbt.net>

Hi all,

The Revactor/ThreadSpawn/ThreadPool concurrency models *should* already support HTML5 WebSockets out-of-the-box right now with the respective TeeInput (streaming "rack.input" support). You'll probably want to make sure the Rack::Chunked middleware is loaded for anything you run, but other than that everything should work provided you have a working client-side implementation...
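To make the streaming "rack.input" idea concrete, here is a minimal sketch of a Rack app that consumes the request body in chunks as it becomes readable, rather than assuming the whole body is already buffered. The name StreamingEcho and the 16K chunk size are assumptions made for illustration. This is not a WebSockets implementation, only the input-reading half that such an app would build on.

```ruby
require "stringio"

# Hypothetical Rack app: counts request body bytes while reading the
# body incrementally from env["rack.input"].
StreamingEcho = lambda do |env|
  input = env["rack.input"]
  received = 0
  # read(n) returns nil at end of input, which ends the loop
  while (chunk = input.read(16384))
    received += chunk.bytesize
  end
  body = "received #{received} bytes\n"
  headers = {
    "Content-Type" => "text/plain",
    "Content-Length" => body.bytesize.to_s,
  }
  [200, headers, [body]]
end

# In a config.ru this would be mounted with something like:
#   use Rack::Chunked
#   run StreamingEcho
```

Outside a server, the app can be exercised with a StringIO standing in for the socket-backed input, for example `StreamingEcho.call("rack.input" => StringIO.new("hello"))`.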
I'm terrible at doing anything interactive on web browsers[1] and I don't think any current browsers out there support WebSockets natively, but there are ways to mimic it with JS libraries it seems. If anybody can code anything up and put up a demo, that would be great.

I'll get around to adding a Fiber-based concurrency model which should work with TeeInput, too.

[1] - guess why the bug tracker for this project is a mailing list :)

--
Eric Wong

From normalperson at yhbt.net Thu Nov 26 20:55:08 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 26 Nov 2009 17:55:08 -0800
Subject: Rainbows! at a glance summary document
Message-ID: <20091127015508.GA20733@dcvr.yhbt.net>

I just pushed this document out, http://rainbows.rubyforge.org/Summary.html

You may note that it includes two Fiber-based concurrency models currently in rainbows.git which will be in the next release :)

The source is in HAML[1] for now here:
http://git.bogomips.org/cgit/rainbows.git/tree/Documentation/comparison.haml

Plain text (from w3m -dump) version here for convenience when replying/annotating. Let me know if there's anything I missed or described wrong/sub-optimally.

---------------------------------- 8< -------------------------------

Rainbows! at a glance

Confused by all the options we give you? So are we! Here's some tables to help keep your head straight. Remember, engineering is all about trade-offs.

core features and compatibility

module          rack.input streaming  Ruby 1.8  Ruby 1.9  Rubinius  slow clients
Unicorn/Base    Yes                   Yes       Yes       Yes       No
Revactor        Yes                   No        Yes       No        Yes
ThreadPool      Yes                   Yes       Yes       Yes       OK
Rev             No                    Yes       Yes       No        Yes
ThreadSpawn     Yes                   Yes       Yes       Yes       OK
EventMachine    No                    Yes       Yes       No        Yes
RevThreadSpawn  No                    Slow      Yes       No        Yes
FiberSpawn      Yes                   No        Yes       Yes       Yes
FiberPool       Yes                   No        Yes       Yes       Yes

  * waiting on Rubinius for better signal handling
  * rack.input streaming is what makes upload progress, BOSH, and Web Sockets possible
  * rack.input streaming is NOT compatible with current versions of nginx or any proxy that fully buffers request bodies before proxying. Keep in mind request body buffering in nginx is a good thing in all other cases where rack.input streaming is not needed.

application requirements

module          slow I/O (backend, not client)  thread safety  single thread reentrant
Unicorn/Base    avoid                           No             No
Revactor        Rev, Revactor, not Fiber::IO    No             Yes
ThreadPool      thread-safe Ruby                Yes            No
Rev             Rev                             No             No
ThreadSpawn     thread-safe Ruby                Yes            No
EventMachine    EventMachine                    No             No
RevThreadSpawn  thread-safe Ruby, Rev           Yes            No
FiberSpawn      Rainbows::Fiber::IO             No             Yes
FiberPool       Rainbows::Fiber::IO             No             Yes

  * Requirements for single thread reentrancy are loose in that there is no risk of race conditions, and are potentially mutually exclusive with thread-safety. In the case where a Fiber yields while holding a resource, another Fiber attempting to acquire it may raise an error or, worse, deadlock the entire process.
  * Slow I/O means anything that can block/stall on sockets including 3rd-party APIs (OpenID providers included) or slow database queries. Properly run Memcached (within the same LAN) is fast and not a blocker. Slow I/O on POSIX filesystems only includes a few operations, namely on UNIX domain sockets and named pipes. Nearly all other operations on POSIX filesystems can be considered "fast", or at least uninterruptible.

middlewares and frameworks

model           DevFdResponse  AppPool  Rack::Lock  async
Unicorn/Base    no-op          no-op    no-op       lots of RAM :P
Revactor        no-op          Yes      No!         Revactor itself
ThreadPool      no-op          Yes      Yes         standard Ruby
Rev             Yes            no-op    no-op       DevFdResponse
ThreadSpawn     no-op          Yes      Yes         standard Ruby
EventMachine    Yes            no-op    no-op       async_sinatra
RevThreadSpawn  Yes            Yes      Dumb        standard Ruby
FiberSpawn      Yes            Yes      No!         Rainbows::Fiber{::IO,.sleep}
FiberPool       Yes            Yes      No!         Rainbows::Fiber{::IO,.sleep}

  * "No!" means it's fundamentally incompatible
  * Everything that's DevFdResponse-compatible can use it for passing async responses through

---------------------------------- 8< -------------------------------

[1] - not sure if there's a better/easier way of doing tables for HTML

--
Eric Wong

From normalperson at yhbt.net Sun Nov 29 23:34:13 2009
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 30 Nov 2009 04:34:13 +0000
Subject: Rainbows! 0.7.0 - Fibers and NeverBlock
Message-ID: <20091130043413.GA10137@dcvr.yhbt.net>

Rainbows! is an HTTP server for sleepy Rack applications. It is based on Unicorn, but designed to handle applications that expect long request/response times and/or slow clients. For Rack applications not heavily bound by slow external network dependencies, consider Unicorn instead as it is simpler and easier to debug.

* http://rainbows.rubyforge.org/
* rainbows-talk at rubyforge.org
* git://git.bogomips.org/rainbows.git

Changes:

keepalive_timeout (default: 2 seconds) is now supported to disconnect idle connections. Several new concurrency models have been added, including NeverBlock, FiberSpawn and FiberPool, all of which have only been lightly tested. RevThreadSpawn loses streaming input support to become simpler and faster for the general cases. AppPool middleware is now compatible with all Fiber-based models including Revactor and NeverBlock.

A new document gives a summary of all the options we give you:
http://rainbows.rubyforge.org/Summary.html

If you're using any of the Rev-based concurrency models, the latest iobuffer (0.1.3) gem will improve performance. Also, RevThreadSpawn should become usable under MRI 1.8 with the next release of Rev (0.3.2).

--
Eric Wong
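As a rough illustration of how the options named in the 0.7.0 announcement fit together, a Rainbows! config file might look something like the sketch below. The file name and every number other than the 2-second keepalive default are assumptions chosen for the example, and the choice of FiberSpawn is arbitrary.

```ruby
# rainbows.conf.rb (hypothetical file name; numbers are illustrative)
worker_processes 2          # same directive as in a Unicorn config

Rainbows! do
  use :FiberSpawn           # one of the models new in 0.7.0; lightly tested
  worker_connections 100    # concurrent clients handled per worker process
  keepalive_timeout 2       # seconds; 2 is the default per the announcement
end
```

Which model to pass to `use` depends on the trade-offs in the Summary document, for example whether the app needs rack.input streaming or is thread-safe.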