From john at squirl.info Fri Jun 1 00:10:10 2007
From: john at squirl.info (John Mcgrath)
Date: Fri, 1 Jun 2007 06:10:10 +0200
Subject: [Ferret-talk] iterate through an entire index
In-Reply-To: <5A48FD2E-0857-400C-A6B0-4A14C85315B2@benjaminkrause.com>
References: <5A48FD2E-0857-400C-A6B0-4A14C85315B2@benjaminkrause.com>
Message-ID: <662d2a39aec68dd6b0b65e3dd03406a2@ruby-forum.com>

>> I'm trying to get all the documents in an index. I've been hunting
>> around, but I don't see a clear way to do this. I can get docs by
>> searching on a term, or by specific doc id, but having trouble getting
>> the whole pile of them. I'm using AAF and Ferret 0.11.4. Any help
>> appreciated.
>
> Ferret 0.11.4 introduced a ferret-browser (try ferret-browser on the
> shell)
>
> the code for the ferret browser is part of the gem and is entirely in
> ruby ..
> you should find an example on how to iterate through all documents in
> that code .. if you cant find it, i can take a look for you ..

Ben, thanks a million, that's exactly what I was looking for. And the
ferret browser is rad!

In case anyone's curious, the code is here:
http://ferret.davebalmain.com/trac/browser/trunk/ruby/lib/ferret/browser.rb?rev=750
(in the DocumentController#list method)

and here:
http://ferret.davebalmain.com/trac/browser/trunk/ruby/lib/ferret/browser/views/document/list.rhtml?rev=750

--
Posted via http://www.ruby-forum.com/.

From nikhil at aurigalogic.com Fri Jun 1 03:47:20 2007
From: nikhil at aurigalogic.com (Nikhil Gupte)
Date: Fri, 1 Jun 2007 09:47:20 +0200
Subject: [Ferret-talk] highlight crashes
In-Reply-To:
References: <3dfb787a21adf0ba1037f69e82ce674f@ruby-forum.com> <0ad14cd08a2af383c00b1b0d81785745@ruby-forum.com>
Message-ID: <78925649c2a86ef4502d94d1127a5f90@ruby-forum.com>

David,

I am using revision 770 from trunk and notice that the highlight problem
still occurs if a PhraseQuery like '"big house"~2000' is used.
The same works for smaller numbers (like '"big house"~10') but crashes
when used with larger numbers, leaving the following message on the
terminal:

/usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.4.1/lib/ferret/index.rb:197: [BUG] Segmentation fault
ruby 1.8.6 (2007-03-13) [i686-linux]

David Balmain wrote:
> On 4/16/07, Stephen Sykes wrote:
>> >> ..as well as segfaults sometimes.
>> > Dave
>>
>> Dave, the problem has gone away. I was using some code that put items
>> in the index as an array of strings - the output of readlines. Like
>> this:
>>
>> index.add_document :file => path, :content => file.readlines
>>
>> When you try to apply highlighting to items like that, it breaks. So
>> now I just join the output of readlines with a space, and all is well.
>
> Hi Stephen,
>
> This was in fact a bug. It is now fixed and will be released in Ferret
> 0.11.5.

--
Posted via http://www.ruby-forum.com/.

From starburger234 at yahoo.de Fri Jun 1 17:00:24 2007
From: starburger234 at yahoo.de (Starburger)
Date: Fri, 1 Jun 2007 23:00:24 +0200
Subject: [Ferret-talk] Is aaf multi_search broken?
Message-ID: <702fdbc6df8fa22d29b4952acc88258c@ruby-forum.com>

Hi all,

I want to use acts_as_ferret's multi_search to search two model classes
(Reviewable and Blog) at a time like

@results = Reviewable.multi_search("jemen", [Blog])

and I'm always getting the error

You have a nil object when you didn't expect it!
You might have expected an instance of Array.
The error occurred while evaluating nil.map

#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:131:in `id_multi_search'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:113:in `multi_search'
#{RAILS_ROOT}/app/controllers/search_controller.rb:53:in `search'

I have configured indexing like this:

acts_as_ferret :fields => [:index_text, :index_locations], :single_index => true
acts_as_ferret :fields => [:index_text, :index_locations], :single_index => true

Maybe I'm doing something wrong?
Thanks,
Starburger

--
Posted via http://www.ruby-forum.com/.

From steve at ourbigcircle.com Fri Jun 1 17:55:59 2007
From: steve at ourbigcircle.com (Luna Claire)
Date: Fri, 1 Jun 2007 23:55:59 +0200
Subject: [Ferret-talk] Ferret FileNotFound error after adding counter_cache to mode
Message-ID: <9d4e5f1c4437d4fb0815c93e02ab28f1@ruby-forum.com>

I have a model that I've been indexing and searching with ferret with no
problems. I just added a counter_cache for some voting functionality to
the same model and now when I perform the voting fxn on an object from
that model, I get the FileNotFound error as it looks for a file named
"_1c_1.del" ...which breaks my voting function.

I tried killing the server and my index directory as suggested for FnF
errs elsewhere, but, now, after restarting the server and performing a
search (to trigger rebuilding the index), I still have the same prob
when voting.

Any thoughts? (hopefully I won't have to rewind the counter_cache
change)

TIA

--
Posted via http://www.ruby-forum.com/.

From starburger234 at yahoo.de Sat Jun 2 01:35:23 2007
From: starburger234 at yahoo.de (Starburger)
Date: Sat, 2 Jun 2007 07:35:23 +0200
Subject: [Ferret-talk] Is aaf multi_search broken?
In-Reply-To: <702fdbc6df8fa22d29b4952acc88258c@ruby-forum.com>
References: <702fdbc6df8fa22d29b4952acc88258c@ruby-forum.com>
Message-ID: <80ab302618a7346a7aee64e5e0ff71b3@ruby-forum.com>

Sorry, the above error is for the bleeding edge version (wrong cut &
paste). Nevertheless for the stable version I get (in the same
scenario):

can't convert String into Array

RAILS_ROOT: C:/INSTAN~2.6-W/rails_apps/abb/config/..
Application Trace | Framework Trace | Full Trace

#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/multi_index.rb:11:in `+'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/multi_index.rb:11:in `initialize'
C:/INSTAN~2.6-W/ruby/lib/ruby/gems/1.8/gems/activesupport-1.4.1/lib/active_support/inflector.rb:250:in `inject'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/multi_index.rb:10:in `each'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/multi_index.rb:10:in `inject'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/multi_index.rb:10:in `initialize'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/local_index.rb:191:in `new'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/local_index.rb:191:in `multi_index'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/local_index.rb:112:in `id_multi_search'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:117:in `id_multi_search'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:98:in `multi_search'
#{RAILS_ROOT}/app/controllers/search_controller.rb:53:in `search'

--
Posted via http://www.ruby-forum.com/.

From cedric.brancourt at gmail.com Sat Jun 2 10:03:29 2007
From: cedric.brancourt at gmail.com (Cedric Brancourt)
Date: Sat, 2 Jun 2007 16:03:29 +0200
Subject: [Ferret-talk] Nasty looking warnings on Debian Etch AMD64 bit box
In-Reply-To:
References: <38da7ce7aa27ccb6ccbd68625cf1d8f7460bd23d@jobsgopublic.com>
Message-ID:

David Balmain wrote:
> On 3/30/07, Jeff Green wrote:
>> Running gem install ferret and selecting 0.11.3 on a Dual Xeon or Dual Opteron 64 bit box running Debian Etch gives the following list of nasty looking warnings, anyone running successfully on 64 bit linux?
>> error messages
>
> Hi Jeff,
>
> I think I've fixed this but as I don't have a 64 bit system to test
> on, I don't know for sure. Could you please let me know when I release
> the next version.
>
> Cheers,
> Dave

Hi,

On my Debian Etch 64-bit server, ferret compiled with no errors. I'm
using the latest gems.
--
Posted via http://www.ruby-forum.com/.

From alain.ravet+ferret at gmail.com Mon Jun 4 07:24:27 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Mon, 4 Jun 2007 13:24:27 +0200
Subject: [Ferret-talk] Ferret.donate(Money.aus_dollar(200))
In-Reply-To: <1180647866.18832.48.camel@localhost.localdomain>
References: <1180647866.18832.48.camel@localhost.localdomain>
Message-ID:

> We can also buy the Ferret Shortcut pdf/book from O'Reilly, also written
> by Dave Balmain. It's awesome good:
> http://www.oreilly.com/catalog/9780596527853/index.html

How funny : a book about indexing, .... without an index!

Alain

From henke at mac.se Mon Jun 4 06:29:27 2007
From: henke at mac.se (Henrik Zagerholm)
Date: Mon, 4 Jun 2007 12:29:27 +0200
Subject: [Ferret-talk] Memory concerns ferret 11.4.
Message-ID:

Hi list,

We just built our own ferret drb server (mostly because we don't do
any indexing from within rails).

The ferret drb server only handles index inserts and some deletes.
Usually we make batch inserts where we retrieve a couple of hundred or
thousands of documents from a database and then insert them into
ferret one by one.
We call flush every 50th file. We are very impressed with the insert
speeds: 56 000 documents of varying size in 32 minutes.

When started, the ferret drb server takes about 9 MB RAM, but after it's
been running for a while doing some indexing it reaches about 150 MB
RAM, and when indexing is finished it still stays around 130 MB.
We do a manual GC.start at the end of every batch indexing.

The index is now about 2.7 GB.

Any suggestions on what can be wrong?
Maybe it's natural for a ferret drb with a 2.7 GB index to use that
much memory when idle?

Please let me know if you need any more info.
Regards,
Henrik

From henke at mac.se Mon Jun 4 07:34:57 2007
From: henke at mac.se (Henrik Zagerholm)
Date: Mon, 4 Jun 2007 13:34:57 +0200
Subject: [Ferret-talk] Seg fault ferret-browser
Message-ID:

Hello,

Getting a seg fault with 0.11.4

/var/lib/gems/1.8/gems/ferret-0.11.4/lib/ferret/browser.rb:226: [BUG] Segmentation fault
ruby 1.8.5 (2006-08-25) [i486-linux]

Maybe this is a known issue, but I thought I had better report it.

Cheers,
henke

From mattias at oncotype.dk Mon Jun 4 12:25:56 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Mon, 4 Jun 2007 18:25:56 +0200
Subject: [Ferret-talk] Sorting and getting occurrences of search in hit
Message-ID:

Is there any way you could get the number of occurrences of the search
in one hit? In a result I get the ferret_rank and ferret_score but not
how many hits the search generated in the current record.

I would also like to be able to sort by this when I search.

/mattias

--
Posted via http://www.ruby-forum.com/.

From ridoutspam at gmail.com Mon Jun 4 16:36:37 2007
From: ridoutspam at gmail.com (Ben Ridout)
Date: Mon, 4 Jun 2007 22:36:37 +0200
Subject: [Ferret-talk] Ferret install on WinXP fails - procedure entry point rb_w32
Message-ID:

Hello. I'm trying to use the 'acts_as_ferret' gem with Rails.

Rails: 1.1.4 and 1.2.3
OS: WinXP

I've installed both Ferret and the plugin using Ruby Gems:

C:\>gem install ferret
Successfully installed ferret-0.11.4-mswin32
Installing ri documentation for ferret-0.11.4-mswin32...
Installing RDoc documentation for ferret-0.11.4-mswin32...

C:\>gem install acts_as_ferret
Successfully installed acts_as_ferret-0.4.0

When I try to use/reference Ferret, I get error messages like the
following:

-- Require ferret fails
irb(main):001:0> require 'ferret'
LoadError: 127: The specified procedure could not be found.
  - c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/ext/ferret_ext.so
        from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/ext/ferret_ext.so
        from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/lib/ferret.rb:25
        from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:33:in `require'
        from (irb):1

-- Require 'acts_as_ferret' fails
irb(main):001:0> require 'acts_as_ferret'
LoadError: 127: The specified procedure could not be found.
  - c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/ext/ferret_ext.so
        from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/ext/ferret_ext.so
        from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        from c:/ruby/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:147:in `require'
        from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.11.4-mswin32/lib/ferret.rb:25
        from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:33:in `require'
        from c:/ruby/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:147:in `require'
        from c:/ruby/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/acts_as_ferret.rb:24
        from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:33:in `require'
        from (irb):1

I also tried installing the plugin into my app:

ruby script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret

I added "require 'acts_as_ferret'" to my environment.rb file, and my
Mongrel server fails on startup with the following Windows dialog:

"The procedure entry point rb_w32_write could not be located in the
dynamic link library msvcrt-ruby18.dll"

Googling this error turned up just one other unlucky soul, but I could
not find a solution.

Any suggestions?

Thanks.
-B. Ridout

--
Posted via http://www.ruby-forum.com/.
From ridoutspam at gmail.com Mon Jun 4 17:24:24 2007
From: ridoutspam at gmail.com (Ben Ridout)
Date: Mon, 4 Jun 2007 23:24:24 +0200
Subject: [Ferret-talk] Ferret install on WinXP fails - procedure entry point rb
In-Reply-To:
References:
Message-ID: <714077b880b768e5cfb2323a4728ad54@ruby-forum.com>

I still have not resolved the problem with (ferret 0.11.4 (mswin32)),
but I did try going to the previous Win32 release.

>> Successfully installed ferret-0.10.9-mswin32

irb(main):001:0> require 'ferret'
=> true

When I do this, everything works fine. Is 0.11.4 not a stable release?
I've seen messages about problems on the Mac OS as well.

-Ben Ridout

--
Posted via http://www.ruby-forum.com/.

From myron.marston at gmail.com Tue Jun 5 03:18:17 2007
From: myron.marston at gmail.com (Myron Marston)
Date: Tue, 5 Jun 2007 09:18:17 +0200
Subject: [Ferret-talk] Do I need to use a DRB server?
Message-ID: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>

I'm finishing up a rails app that uses Ferret / AAF. I'm using shared
hosting provided by railsplayground.com. My app is pretty small and is
never going to get much traffic, so I'm going with the cheapest plan.
Unfortunately, the cheapest plan doesn't allow any static processes.
I've read multiple recommendations to use a DRB server when using
ferret in production, but I can't on this plan since using a DRB server
requires a static process.

So far Ferret/AAF have worked great and I haven't seen any errors, index
corruption, or any other problems. Do I really need to use the DRB
server for Ferret?

Since I haven't experienced any errors yet, I'm not sure what these will
look like. I've got the exception notifier plugin installed, but will I
get notified of ferret errors? My understanding of how this plugin works
is that it handles errors in ApplicationController, so any error that
does not occur as a result of a controller/action will not get handled.
I'm just not sure what to be looking for to tell if I need to upgrade
and use the DRB server or not.

One other useful piece of information: my site has 1 (and only 1) admin
user, and he is the only one that does any C, U or D (of
"CRUD")--visitors to my website will only ever read the models, so there
should never be a time where multiple people will be making simultaneous
changes that need to be indexed.

Thanks,
Myron

--
Posted via http://www.ruby-forum.com/.

From bk at benjaminkrause.com Tue Jun 5 09:13:18 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Tue, 5 Jun 2007 15:13:18 +0200
Subject: [Ferret-talk] Do I need to use a DRB server?
In-Reply-To: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>
References: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>
Message-ID: <2DF1038E-8C21-48A3-A1B1-EF791A4756C2@benjaminkrause.com>

> One other useful piece of information: my site has 1 (and only 1)
> admin
> user, and he is the only one that does any C, U or D (of
> "CRUD")--visitors to my website will only ever read the models, so
> there
> should never be a time where multiple people will be making
> simultaneous
> changes that need to be indexed.

Hey .. just try it :-) It sounds like you don't need the ferret
server.. keep an eye on your production.log .. you will see the ferret
errors (like specific files in the index could not be found).. i think
you will be fine without the server.. and even if you get some errors..
AAF makes it so easy for you to add the server, that it won't take
long :)

Ben

From john at digitalpulp.com Tue Jun 5 12:09:59 2007
From: john at digitalpulp.com (John Bachir)
Date: Tue, 5 Jun 2007 12:09:59 -0400
Subject: [Ferret-talk] Do I need to use a DRB server?
In-Reply-To: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>
References: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>
Message-ID:

On Jun 5, 2007, at 3:18 AM, Myron Marston wrote:

> One other useful piece of information: my site has 1 (and only 1)
> admin
> user, and he is the only one that does any C, U or D (of
> "CRUD")--visitors to my website will only ever read the models, so
> there
> should never be a time where multiple people will be making
> simultaneous
> changes that need to be indexed.

That's the most useful piece of information :) As long as the admin
user doesn't have multiple windows open doing multiple writes to the
ferret index, your app will never experience any concurrency issues.

From john at johnleach.co.uk Tue Jun 5 12:32:48 2007
From: john at johnleach.co.uk (John Leach)
Date: Tue, 05 Jun 2007 17:32:48 +0100
Subject: [Ferret-talk] Memory concerns ferret 11.4.
In-Reply-To:
References:
Message-ID: <1181061168.8155.39.camel@localhost.localdomain>

Hi Henrik,

when the IndexWriter is opened, the term dictionary is loaded into RAM.
So memory usage is certainly dependent on the number of unique terms in
the index.

The entire term dictionary isn't actually loaded, just an even spread of
terms. The :index_skip_interval parameter allows you to twiddle this
spread - the higher the skip interval, the less memory will be used, but
the slower your searches.

Play with this parameter and see if it improves things for you - if not,
at least you know it's not down to having lots of unique terms.

Tbh, probably a long shot, but worth a look.

John.

On Mon, 2007-06-04 at 12:29 +0200, Henrik Zagerholm wrote:
> Hi list,
>
> We just built our own ferret drb server (mostly because we don't do
> any indexing from within rails).
>
> The ferret drb server only handles index inserts and some deletes.
> Usually we make batch inserts where we retrieve a couple of hundred or
> thousands of documents from a database and then insert them into
> ferret one by one.
> We call flush every 50th file. We are very impressed with the insert
> speeds: 56 000 documents of varying size in 32 minutes.
>
> When started, the ferret drb server takes about 9 MB RAM, but after it's
> been running for a while doing some indexing it reaches about 150 MB
> RAM, and when indexing is finished it still stays around 130 MB.
> We do a manual GC.start at the end of every batch indexing.
>
> The index is now about 2.7 GB.
>
> Any suggestions on what can be wrong?
> Maybe it's natural for a ferret drb with a 2.7 GB index to use that
> much memory when idle?
>
> Please let me know if you need any more info.
>
> Regards,
> Henrik

--
http://johnleach.co.uk

From coolpaek at mail.com Tue Jun 5 16:19:49 2007
From: coolpaek at mail.com (J Paek)
Date: Tue, 5 Jun 2007 22:19:49 +0200
Subject: [Ferret-talk] Limit on database size?
Message-ID: <6fef874185bb81ca9fac92d84ee259a0@ruby-forum.com>

Is there a limit on the size of database that acts_as_ferret can handle?

I'm using mysql for my application, and the substring search facility of
acts_as_ferret appears to stop working when there are over 100,000 rows
in a table.

--
Posted via http://www.ruby-forum.com/.

From solaris at sundevil.de Tue Jun 5 22:31:56 2007
From: solaris at sundevil.de (Hendrik Volkmer)
Date: Wed, 6 Jun 2007 04:31:56 +0200
Subject: [Ferret-talk] Strange Problem with AAF DRB connection
In-Reply-To: <20070524093957.GB8909@cordoba.webit.de>
References: <86e407d8ae1513562c40a81bc43fb626@ruby-forum.com> <20070524093957.GB8909@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Thu, May 24, 2007 at 09:31:59AM +0200, Hendrik Volkmer wrote:
>> premature marshal format(can't read)
>> (druby:/10.0.0.10:9010) /usr/lib/ruby/1.8/drb/drb.rb:580:in `load'
>>
>> Do you have any ideas what that could be? We didn't change so much
>> regarding aaf. Maybe we put some more fields in the index, that should
>> be it.
>
> strange - the first one seems to be a request way too large, and in the
> second case the request payload has been shorter than expected - are you
> sure you don't have any network issues?
>
> Any hints on when this happens (i.e. high load, special actions) ?

Nothing like that. The strange thing is: It's working again. We didn't
really change much regarding ferret... I'll keep an eye on this
behavior.

Hendrik

--
Posted via http://www.ruby-forum.com/.

From myron.marston at gmail.com Wed Jun 6 02:27:22 2007
From: myron.marston at gmail.com (Myron Marston)
Date: Wed, 6 Jun 2007 08:27:22 +0200
Subject: [Ferret-talk] Do I need to use a DRB server?
In-Reply-To:
References: <263106c6a74b4479cb32bb70b0bae8c7@ruby-forum.com>
Message-ID:

> As long as the admin
> user doesn't have multiple windows open doing multiple writes to the
> ferret index, your app will never experience any concurrency issues.

Great, I should be fine without the DRB server. Thanks, guys!

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Wed Jun 6 04:07:05 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 6 Jun 2007 10:07:05 +0200
Subject: [Ferret-talk] Ferret.donate(Money.aus_dollar(200))
In-Reply-To: <1180647866.18832.48.camel@localhost.localdomain>
References: <1180647866.18832.48.camel@localhost.localdomain>
Message-ID: <20070606080705.GA4914@cordoba.webit.de>

On Thu, May 31, 2007 at 10:44:26PM +0100, John Leach wrote:
[..]
>
> Many of us probably use Ferret via the acts_as_ferret Rails plugin by
> Jens Kraemer. He doesn't stipulate how he'd like to be supported, so
> until he chooses to clarify otherwise, I'd recommend that if you see him
> in the street buy him lunch[1]. He looks like this:
>
> http://www.xing.com/profile/Jens_Kraemer2
>
> Keep a close eye out for him.

Thanks John :-)

In fact I made a small list of things people can do to support aaf a
while ago: http://www.jkraemer.net/projects/acts_as_ferret

Regarding the lunch - always appreciated :-)

Btw, if anybody wants to meet me in person, good opportunities to do so
are Rails-Konferenz[1] and RailsConf Europe[2].

cheers,
Jens

[1] http://www.rails-konferenz.de/
[2] http://www.railsconfeurope.com/

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From ramon at decsai.ugr.es Wed Jun 6 06:12:22 2007
From: ramon at decsai.ugr.es (Ramon)
Date: Wed, 6 Jun 2007 12:12:22 +0200
Subject: [Ferret-talk] How to search with limit by field
In-Reply-To: <42d8808f0705300945w359bf4d8ib7fddb8e69178682@mail.gmail.com>
References: <6330ef94ae27b584e409faf8031bded8@ruby-forum.com> <42d8808f0705300945w359bf4d8ib7fddb8e69178682@mail.gmail.com>
Message-ID: <0f4ca72e485f545ab9191b2eec60c577@ruby-forum.com>

Thanks Doug,

I thought that ferret could do this in a single query. I don't know a
priori how many clients I have. If I have 100 clients, I can't do it
this way. Also, I would like to use pagination.

Ramón.

Doug Smith wrote:
> Hi Ramon,
>
> I think you'd have to do three different queries:
>
> query = params[:query]
> @results1 = model.find_by_contents("client:1 content:#{query}", {:limit => 3})
> @results2 = model.find_by_contents("client:2 content:#{query}", {:limit => 3})
> @results3 = model.find_by_contents("client:3 content:#{query}", {:limit => 3})
>
> Ferret is fast enough that this shouldn't be a performance problem.
>
> Thanks,
>
> Doug

--
Posted via http://www.ruby-forum.com/.

From henke at mac.se Wed Jun 6 06:42:26 2007
From: henke at mac.se (Henrik Zagerholm)
Date: Wed, 6 Jun 2007 12:42:26 +0200
Subject: [Ferret-talk] Memory concerns ferret 11.4.
In-Reply-To: <1181061168.8155.39.camel@localhost.localdomain>
References: <1181061168.8155.39.camel@localhost.localdomain>
Message-ID: <4F0AC7E9-A20C-469E-986A-61AF41AA0269@mac.se>

On 5 Jun 2007, at 18:32, John Leach wrote:

Hi John,

> Hi Henrik,
>
> when the IndexWriter is opened, the term dictionary is loaded into
> RAM.
> So memory usage is certainly dependent on the number of unique
> terms in
> the index.

OK, interesting. I'll do some more testing, eliminating as much
non-ferret code as possible, to see what is making my ferret_server eat
up about 130-150 MB of RAM after it has been running for a while.

> The entire term dictionary isn't actually loaded, just an even
> spread of
> terms. The :index_skip_interval parameter allows you to twiddle this
> spread - the higher the skip interval, the less memory will be
> used, but
> the slower your searches.
>

Right now in my code I use

def initialize
  @index = Index::Index.new( :path => SafeCube::FERRET_INDEX_PATH )
end

But as this drb server is for writing and deleting only, should I
specifically create an IndexWriter instead? Then in my rails
application I can specify an IndexReader instead, as I only do searches
from there. Would this change anything?

> Play with this parameter and see if it improves things for you - if
> not,
> at least you know it's not down to having lots of unique terms.
>

I'll try setting some different high and low values and see if I can
control the amount of RAM taken.

Thanks again for the info John!

> Tbh, probably a long shot, but worth a look.
>
> John.
>
>
> On Mon, 2007-06-04 at 12:29 +0200, Henrik Zagerholm wrote:
>> Hi list,
>>
>> We just built our own ferret drb server (mostly because we don't do
>> any indexing from within rails).
>>
>> The ferret drb server only handles index inserts and some deletes.
>> Usually we make batch inserts where we retrieve a couple of hundred or
>> thousands of documents from a database and then insert them into
>> ferret one by one.
>> We call flush every 50th file. We are very impressed with the insert
>> speeds: 56 000 documents of varying size in 32 minutes.
>>
>> When started, the ferret drb server takes about 9 MB RAM, but after it's
>> been running for a while doing some indexing it reaches about 150 MB
>> RAM, and when indexing is finished it still stays around 130 MB.
>> We do a manual GC.start at the end of every batch indexing.
>>
>> The index is now about 2.7 GB.
>>
>> Any suggestions on what can be wrong?
>> Maybe it's natural for a ferret drb with a 2.7 GB index to use that
>> much memory when idle?
>>
>> Please let me know if you need any more info.
>>
>> Regards,
>> Henrik
> --
> http://johnleach.co.uk
>
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk

From vince71 at gmail.com Wed Jun 6 10:38:21 2007
From: vince71 at gmail.com (Vince W.)
Date: Wed, 6 Jun 2007 16:38:21 +0200
Subject: [Ferret-talk] some items not indexed properly.. how to fix?
Message-ID: <8f495cc7668510922c8159fc99958f04@ruby-forum.com>

I've got ferret enabled for :name and :description. One of the items it
should be indexing has a name of Flow Yoga

searching for 'Flow' finds it but searching for 'Yoga' does not.

I also have Elation Yoga in the database

searching for 'Elation' finds it and searching for 'Yoga' also finds it.

Anybody know why ferret might be treating these differently? I'm using
version 0.11.3

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Wed Jun 6 11:00:35 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 6 Jun 2007 17:00:35 +0200
Subject: [Ferret-talk] Is aaf multi_search broken?
In-Reply-To: <702fdbc6df8fa22d29b4952acc88258c@ruby-forum.com>
References: <702fdbc6df8fa22d29b4952acc88258c@ruby-forum.com>
Message-ID: <20070606150035.GD6638@cordoba.webit.de>

Hi!

On Fri, Jun 01, 2007 at 11:00:24PM +0200, Starburger wrote:
> Hi all,
>
> I want to use acts_as_ferret's multi_search to search two model classes
> (Reviewable and Blog) at a time like
>
> @results = Reviewable.multi_search("jemen", [Blog])
>
> and I'm always getting the error
>
> You have a nil object when you didn't expect it!
> You might have expected an instance of Array.
> The error occurred while evaluating nil.map
>
[..]
> I have configured indexing like this:
>
> acts_as_ferret :fields => [:index_text, :index_locations], :single_index => true
> acts_as_ferret :fields => [:index_text, :index_locations], :single_index => true
>
> Maybe I'm doing something wrong?

just don't use multi_search with :single_index => true. In the
single_index case, you tell find_by_contents what additional models to
search with the :models option:

@results = Reviewable.find_by_contents("jemen", :models => [Blog])

or

@results = Reviewable.find_by_contents("jemen", :models => :all)

this indeed looks a bit like an inconsistent API, however it's this way
because in the single_index case you have only one Ferret index (hence
find_by_contents is used), while multi_search runs a combined search
across multiple Ferret indexes.

Maybe in a future version I'll integrate the multi_search functionality
into find_by_contents, too.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From vince71 at gmail.com Wed Jun 6 11:47:40 2007
From: vince71 at gmail.com (Vince W.)
Date: Wed, 6 Jun 2007 17:47:40 +0200
Subject: [Ferret-talk] some items not indexed properly.. how to fix?
In-Reply-To: <8f495cc7668510922c8159fc99958f04@ruby-forum.com>
References: <8f495cc7668510922c8159fc99958f04@ruby-forum.com>
Message-ID: <1eed4972591e11ffe45c957be0669bc1@ruby-forum.com>

Vince W.
wrote:
> I've got ferret enabled for :name and :description. One of the items it
> should be indexing has a name of Flow Yoga
>
> searching for 'Flow' finds it but searching for 'Yoga' does not.
>
> I also have Elation Yoga in the database
>
> searching for 'Elation' finds it and searching for 'Yoga' also finds it.
>
>
> Anybody know why ferret might be treating these differently? I'm using
> version 0.11.3

I've got some more info. It turns out that it's related to the number
of words in the name. In my prior example I should have been more
specific:

Flow Yoga Network

doesn't show up because it's 3 words, but if I change it to just

Flow Yoga

then it *does* index. Now why would ferret care about the number of
words in the name field?

--
Posted via http://www.ruby-forum.com/.

From me at phillipoertel.com Wed Jun 6 14:20:16 2007
From: me at phillipoertel.com (Phillip Oertel)
Date: Wed, 6 Jun 2007 20:20:16 +0200
Subject: [Ferret-talk] bug when assigning new analyzer?
In-Reply-To: <20070510073405.GQ9575@cordoba.webit.de>
References: <01f86ed16fd2c08775af82d094131e8e@ruby-forum.com> <20070510073405.GQ9575@cordoba.webit.de>
Message-ID:

hi jens,

thanks for making that clear, and sorry for the long delay in replying.
we were quite busy.

cheers,
phillip

--
Posted via http://www.ruby-forum.com/.

From casey at nerdle.com Wed Jun 6 12:44:17 2007
From: casey at nerdle.com (Casey)
Date: Wed, 6 Jun 2007 12:44:17 -0400 (EDT)
Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4?
Message-ID:

Hi all,

I upgraded from Ferret 0.11.3 to Ferret 0.11.4 because I was getting
intermittent segfaults that seemed to be due to a bug which was fixed
(changeset 749).

Unfortunately, 0.11.4 + acts_as_ferret seems to be a bad combination.
I'm getting the same "fs_store/File Not Found Error occured at :117"
which was reported in the "Constant 0.11.4 Errors" thread
[http://www.mail-archive.com/ferret-talk at rubyforge.org/msg03136.html]
when aaf tries to update the index after a record is saved.

My logs show that about 1400 fields were added to the Ferret indices
before I noticed the problem, and I only got this error during
concurrent updates (about 2% of all updates).

Does anyone have this combo working? Perhaps one of the people in the
original thread resolved the problem? I apologize for not doing more
research - I just want to make sure that this isn't already solved
before I go off and dig into it.

Thanks very much,
Casey

From doug.arogos at gmail.com Wed Jun 6 14:29:28 2007
From: doug.arogos at gmail.com (Doug Smith)
Date: Wed, 6 Jun 2007 11:29:28 -0700
Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4?
In-Reply-To:
References:
Message-ID: <42d8808f0706061129i27e2c10fo9962a2723358e6e5@mail.gmail.com>

It's working great for us. We just launched this site:
http://www.taftlaw.com.

That site uses Ferret 0.11.4 on Ruby 1.8.6, Rails 1.2.3, and
acts_as_ferret with the DRb server.

Thanks,

Doug
http://www.thinkbarefoot.com

On 6/6/07, Casey wrote:
>
> Hi all,
>
> I upgraded from Ferret 0.11.3 to Ferret 0.11.4 because I was getting
> intermittent segfaults that seemed to be due to a bug which was fixed
> (changeset 749).
>
> Unfortunately, 0.11.4 + acts_as_ferret seems to be a bad combination. I'm
> getting the same "fs_store/File Not Found Error occured at :117"
> which was reported in the "Constant 0.11.4 Errors" thread
> [http://www.mail-archive.com/ferret-talk at rubyforge.org/msg03136.html] when
> aaf tries to update the index after a record is saved.
> > My logs show that about 1400 fields were added to the Ferret indices > before I noticed the problem and I only got this error during the > concurrent updates (about 2% of all updates) > > Does anyone have this combo working? Perhaps one of the people in the > original thread resolved the problem? I apologize for not doing more > research - I just want to make sure that this isn't already solved before > I go off and dig into it. > > > Thanks very much, > Casey > > > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070606/263dc5b9/attachment.html From kyle at casttv.com Wed Jun 6 14:43:27 2007 From: kyle at casttv.com (Kyle Maxwell) Date: Wed, 6 Jun 2007 11:43:27 -0700 Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4? In-Reply-To: References: Message-ID: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> On 6/6/07, Casey wrote: > Hi all, > > I upgraded from Ferret 0.11.3 to Ferret 0.11.4 because I was getting > intermittent segfaults that seemed to be due to a bug which was fixed > (changeset 749). > > Unfortunately, 0.11.4 + acts_as_ferret seems to be a bad combination. I'm > getting the some "fs_store/File Not Found Error occured at :117" > which was reported in the "Constant 0.11.4 Errors" thread > [http://www.mail-archive.com/ferret-talk at rubyforge.org/msg03136.html] when > aaf tries to update the index after a record is saved. > > My logs show that about 1400 fields were added to the Ferret indices > before I noticed the problem and I only got this error during the > concurrent updates (about 2% of all updates) > > Does anyone have this combo working? Perhaps one of the people in the > original thread resolved the problem? 
I apologize for not doing more > research - I just want to make sure that this isn't already solved before > I go off and dig into it. > > > Thanks very much, > Casey > > > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > How big is your index? -- Kyle Maxwell Software Engineer CastTV, Inc http://www.casttv.com From casey at nerdle.com Wed Jun 6 14:50:09 2007 From: casey at nerdle.com (Casey Forbes) Date: Wed, 6 Jun 2007 14:50:09 -0400 (EDT) Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4? In-Reply-To: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> References: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> Message-ID: Good to hear! Um. I'm running Ferret/aaf with a pack of Mongrels and no DRb. I'm thinking that maybe that's not the greatest idea. (I read too much crap on the web and got worried about the performance of the aaf DRb server.) Also, someone asked about the size of my index. It's small - 7 MB. I love Ferret - user contributed data is a big part of my site and I'd don't know where I'd be if I didn't have a good search engine to drive "Did you mean?..." suggestions and help prevent duplication. Casey On Wed, 6 Jun 2007, Kyle Maxwell wrote: > On 6/6/07, Casey wrote: >> Hi all, >> >> I upgraded from Ferret 0.11.3 to Ferret 0.11.4 because I was getting >> intermittent segfaults that seemed to be due to a bug which was fixed >> (changeset 749). >> >> Unfortunately, 0.11.4 + acts_as_ferret seems to be a bad combination. I'm >> getting the some "fs_store/File Not Found Error occured at :117" >> which was reported in the "Constant 0.11.4 Errors" thread >> [http://www.mail-archive.com/ferret-talk at rubyforge.org/msg03136.html] when >> aaf tries to update the index after a record is saved. 
>> >> My logs show that about 1400 fields were added to the Ferret indices >> before I noticed the problem and I only got this error during the >> concurrent updates (about 2% of all updates) >> >> Does anyone have this combo working? Perhaps one of the people in the >> original thread resolved the problem? I apologize for not doing more >> research - I just want to make sure that this isn't already solved before >> I go off and dig into it. >> >> >> Thanks very much, >> Casey >> >> >> >> >> _______________________________________________ >> Ferret-talk mailing list >> Ferret-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/ferret-talk >> > > How big is your index? > > -- > Kyle Maxwell > Software Engineer > CastTV, Inc > http://www.casttv.com > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From doug.arogos at gmail.com Wed Jun 6 14:52:40 2007 From: doug.arogos at gmail.com (Doug Smith) Date: Wed, 6 Jun 2007 11:52:40 -0700 Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4? In-Reply-To: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> References: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> Message-ID: <42d8808f0706061152n50104607n379be663696f3c83@mail.gmail.com> On 6/6/07, Kyle Maxwell wrote: > > > How big is your index? There are seven indexed models with a total size of 2.5M. There isn't a ton of concurrent updating, though we've had some with no trouble yet. Thanks, Doug -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070606/20fa5425/attachment.html

From patcito at gmail.com Wed Jun 6 16:32:26 2007
From: patcito at gmail.com (Patrick Aljord)
Date: Wed, 6 Jun 2007 15:32:26 -0500
Subject: [Ferret-talk] globalize+acts_as_ferret
Message-ID: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com>

Hey all,
I'm using acts_as_ferret and globalize. I stumbled upon that post on google:
http://osdir.com/ml/lang.ruby.ferret.general/2007-01/msg00068.html

does anybody know if it's included in the latest a_a_f or if it's planned to be? I can't seem to find anything about it.

thanx in advance

Pat

From kraemer at webit.de Thu Jun 7 03:20:16 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 7 Jun 2007 09:20:16 +0200
Subject: [Ferret-talk] globalize+acts_as_ferret
In-Reply-To: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com>
References: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com>
Message-ID: <20070607072016.GH6638@cordoba.webit.de>

Hi!

Saimon sent me a patch for this, but unfortunately I didn't find the time to integrate it back then. Afair he added an option that made aaf use per-language index directories, depending on the language currently set in globalize.

Now that I think of it, that might not always be what people want - i.e. you lose the possibility to easily search across multiple languages. Imho it would be sufficient to store the language of a record in a field, and use that for filtering by language when needed. More important is to use an analyzer that can handle the different languages.

What exactly would you want to do with globalize and aaf?

Anyway, since aaf has undergone some serious refactoring since then, the patch won't work anymore, but I uploaded the archive containing Saimon's version of aaf with the patch applied, and his patch to the Wiki. Find it at http://projects.jkraemer.net/acts_as_ferret/wiki at the bottom of the page.
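
A minimal sketch of the language-field idea above, assuming each record is indexed with an extra field holding its language code. The field name `lang` and the helper `language_filtered_query` are hypothetical and not part of aaf; the sketch only builds a query string, relying on Ferret's query syntax for required clauses (`+`) and field-scoped terms (`field:term`).

```ruby
# Hypothetical helper (not part of acts_as_ferret): restrict a user query to
# one language by requiring a match on a language field. The field name
# "lang" is an assumption made for this illustration.
def language_filtered_query(user_query, lang, field = "lang")
  # Only accept short lowercase codes such as "es" or "fr", so nothing
  # unexpected ends up in the query string.
  unless lang.to_s =~ /\A[a-z]{2,3}\z/
    raise ArgumentError, "expected a language code like 'es'"
  end

  # "+" marks both clauses as required: the language must match and the
  # user's query must match.
  "+#{field}:#{lang} +(#{user_query})"
end

puts language_filtered_query("casa", "es")
# prints: +lang:es +(casa)
```

Leaving the language clause out keeps the search across all languages, which is the advantage over per-language index directories mentioned above.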

cheers,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From kraemer at webit.de Thu Jun 7 03:39:02 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 7 Jun 2007 09:39:02 +0200
Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4?
In-Reply-To:
References: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com>
Message-ID: <20070607073902.GJ6638@cordoba.webit.de>

On Wed, Jun 06, 2007 at 02:50:09PM -0400, Casey Forbes wrote:
> Good to hear! Um. I'm running Ferret/aaf with a pack of Mongrels and no
> DRb. I'm thinking that maybe that's not the greatest idea. (I read too
> much crap on the web and got worried about the performance of the aaf DRb
> server.)

Please use the DRb server in Multi-Process-Scenarios, and start to worry about its performance when you are sure it really is the DRb server that slows your app down ;-)

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From me at phillipoertel.com Thu Jun 7 05:57:09 2007
From: me at phillipoertel.com (Phillip Oertel)
Date: Thu, 7 Jun 2007 11:57:09 +0200
Subject: [Ferret-talk] Ferret::Analysis::StemFilter documentation bug
Message-ID:

"nl" selects the dutch stemming algorithm, "no" selects the norwegian (as one would expect).

there's no inconsistency, which the documentation would suggest (using "dut" and "nld" for dutch stemming, "nl" or "no" for norwegian).

this is on ferret 0.11.4 at least, i didn't check earlier versions.

phillip

--
Posted via http://www.ruby-forum.com/.

From casey at nerdle.com Thu Jun 7 08:29:08 2007
From: casey at nerdle.com (Casey Forbes)
Date: Thu, 7 Jun 2007 08:29:08 -0400 (EDT)
Subject: [Ferret-talk] Is anyone successfully using acts_as_ferret with Ferret 0.11.4?
In-Reply-To: <20070607073902.GJ6638@cordoba.webit.de>
References: <47699a8d0706061143l46ec6878ldd24685a267c7b78@mail.gmail.com> <20070607073902.GJ6638@cordoba.webit.de>
Message-ID:

Thanks! The Solr / aaf DRb server benchmarks look good to me - I shouldn't have been worried.

Also, I just noticed the "[187] Improved DRb server index rebuild handling by keeping index versions" changeset - that looks GREAT!

Casey

From patcito at gmail.com Thu Jun 7 10:45:20 2007
From: patcito at gmail.com (Patrick Aljord)
Date: Thu, 7 Jun 2007 09:45:20 -0500
Subject: [Ferret-talk] globalize+acts_as_ferret
In-Reply-To: <20070607072016.GH6638@cordoba.webit.de>
References: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com> <20070607072016.GH6638@cordoba.webit.de>
Message-ID: <6b6419750706070745s3cac638ao7c946523947b25b4@mail.gmail.com>

Hey Jens,
Thanks for the link.

> What exactly would you want to do with globalize and aaf?

I have something like this:

class Shooting < ActiveRecord::Base
  acts_as_ferret :fields => [:media_name, :media_name_es, :media_name_fr]

  def media_name
    return self.media.name
  end

  def media_name_fr
    return self.media.name_fr
  end

  def media_name_es
    return self.media.name_es
  end
end

is there a better way to handle this?

Thanks in advance

Pat

From jesse at hogbaysoftware.com Thu Jun 7 10:54:57 2007
From: jesse at hogbaysoftware.com (Jesse Grosjean)
Date: Thu, 7 Jun 2007 16:54:57 +0200
Subject: [Ferret-talk] :store => :yes doesn't work in some cases
Message-ID: <3718a6efdb1006fc97cfc328b27f2023@ruby-forum.com>

I'm not really sure if this is a bug, but it makes my search results look a bit strange. I have an acts_as_ferret declaration that looks like:

acts_as_ferret :store_class_name => true,
               :remote => true,
               :fields => {
                 :ferret_name    => { :store => :yes, :boost => 2 },
                 :ferret_content => { :store => :yes }
               }

I store both fields so that I don't need to load each result model from the rails DB when displaying the results.
This is the code that I use to show each result:

highlighted_name = result.highlight(params[:q],
  :field => :ferret_name,
  :pre_tag => "", :post_tag => "",
  :excerpt_length => 150, :num_excerpts => 1)

highlighted_content = result.highlight(params[:q],
  :field => :ferret_content,
  :pre_tag => "", :post_tag => "",
  :excerpt_length => 150, :num_excerpts => 1)

Generally this all works fine. The problem happens when the ferret_name of my model is a word that is skipped by the ferret tokenizer. For example if I have a model with:

ferret_name: about
ferret_content: rails

Then when I do a search for 'rails' the result will be found, but the result's highlighted_name will be blank. So I don't see the name of the model in the result. This seems to be a special case, because generally words like "about" and "the" that are skipped by the tokenizer will still be stored when :store => :yes when they are in a phrase.

I hope that makes some sense. For now I can get around the problem by checking for the blank case and loading the value from the model directly, but things would be easier if :store => :yes would just always store the field value.

Thanks for any help,
Jesse

--
Posted via http://www.ruby-forum.com/.

From bk at benjaminkrause.com Thu Jun 7 11:16:41 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Thu, 7 Jun 2007 17:16:41 +0200
Subject: [Ferret-talk] globalize+acts_as_ferret
In-Reply-To: <6b6419750706070745s3cac638ao7c946523947b25b4@mail.gmail.com>
References: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com> <20070607072016.GH6638@cordoba.webit.de> <6b6419750706070745s3cac638ao7c946523947b25b4@mail.gmail.com>
Message-ID: <2407A695-DF27-436E-B7B1-1E3AB8B29452@benjaminkrause.com>

On 2007-06-07, at 4:45 PM, Patrick Aljord wrote:

> Hey Jens,
> Thanks for the link.
>
>> What exactly would you want to do with globalize and aaf?
>
> I have something like this:
> class Shooting < ActiveRecord::Base
> acts_as_ferret :fields => [:media_name,:media_name_es,:media_name_fr]

thats how i store language-specific fields as well .. i got something like
:title_de, :title_en and :content_de, :content_en

Ben

From l.a.olsson at gmail.com Thu Jun 7 12:21:44 2007
From: l.a.olsson at gmail.com (Lars Olsson)
Date: Thu, 7 Jun 2007 17:21:44 +0100
Subject: [Ferret-talk] Unique :key not maintained after add_indexes?
Message-ID: <280493c0706070921s54d1d246x9af2f6017b4d22a2@mail.gmail.com>

Hi,

When adding an index to another one using add_indexes I get duplicates even though I use the :key attribute. For example:

def test_add_indexes_uniqueness
  index1 = Ferret::Index::Index.new(:key => :id)
  index2 = Ferret::Index::Index.new(:key => :id)

  # Add two items with same id
  index1 << {:id => 23, :data => "This is the data..."}
  index1 << {:id => 23, :data => "This is the new data..."}
  assert_equal(1, index1.size)

  index2 << {:id => 23, :data => "This is the new data..."}
  assert_equal(1, index2.size)

  # Add index2 to index1
  index1.add_indexes(index2)

  # Size should still be 1 as the items in both indexes have id 23
  assert_equal(1, index1.size)
end

Here the final assertion fails because the size is 2.

What have I misunderstood? How can I maintain uniqueness when merging indexes?

Thanks,
Lars

From patcito at gmail.com Thu Jun 7 12:42:23 2007
From: patcito at gmail.com (Patrick Aljord)
Date: Thu, 7 Jun 2007 11:42:23 -0500
Subject: [Ferret-talk] globalize+acts_as_ferret
In-Reply-To: <2407A695-DF27-436E-B7B1-1E3AB8B29452@benjaminkrause.com>
References: <6b6419750706061332w7da8181bpb14d3956288f1da9@mail.gmail.com> <20070607072016.GH6638@cordoba.webit.de> <6b6419750706070745s3cac638ao7c946523947b25b4@mail.gmail.com> <2407A695-DF27-436E-B7B1-1E3AB8B29452@benjaminkrause.com>
Message-ID: <6b6419750706070942h1472c051kcfb48281e485e70f@mail.gmail.com>

On 6/7/07, Benjamin Krause wrote:
> thats how i store language-specific fields as well ..
i got something
> like
> :title_de, :title_en and :content_de, :content_en
>

ok, thanx for the info Ben.

From deinspanjer at gmail.com Thu Jun 7 13:19:26 2007
From: deinspanjer at gmail.com (Daniel Einspanjer)
Date: Thu, 7 Jun 2007 17:19:26 +0000 (UTC)
Subject: [Ferret-talk] Advise on slowness in bootstrapping?
Message-ID:

I am looking at trying to use ferret/aaf to supplement my querying against a medium and large table with lots of columns. Some facts first:

Ferret 0.11.4
AAF 0.4.0
Ruby 1.8.6
Rails 1.2.3

Medium table:
105,464 rows
168 columns (mostly varchar(20))
11 actual columns indexed in aaf plus
40 virtual columns indexed in aaf (virtual is concat of two physical columns.
e.g. cast_first_name_1 + cast_last_name_1 through cast_first_name_20 + cast_last_name_20)

Large table:
1,244,716 rows
same column/index structure

These tables are not updated via Ruby, only read. I am trying to use rebuild_index to bootstrap the medium sized table and it is taking a very long time (running for about 4 hours, indicates 50% complete with 4 hours remaining) and creating a massive number of files in the index directory (currently about 65k, was 90k earlier)

I have not done any tuning of ferret/aaf so far, and I fear what it will look like to do the big table.

Does anyone have any advise on how to speed this process up? Because the tables are updated by an external batch process, if I were to continue down this ferret/aaf path, I'd have to be looking at running this rebuild_index a couple of times per week which would be rather painful given the present time and might not be possible if the large table took more than 48 hours...

From deinspanjer at gmail.com Thu Jun 7 14:07:50 2007
From: deinspanjer at gmail.com (Daniel Einspanjer)
Date: Thu, 7 Jun 2007 14:07:50 -0400
Subject: [Ferret-talk] Advise on slowness in bootstrapping?
In-Reply-To:
References:
Message-ID:

p.s. Please forgive my lack of attention to the changes I let the spell checker make.
All instances of the verb advise should be mentally replaced with the noun advice. :)

From ruby at bharathrentals.com Thu Jun 7 21:47:59 2007
From: ruby at bharathrentals.com (Ruby Bharathrentals)
Date: Fri, 8 Jun 2007 03:47:59 +0200
Subject: [Ferret-talk] Advanced search
Message-ID: <2a0bef391a939ad5ab70e4a4c74ec15f@ruby-forum.com>

I like the simple search of ferret; I would like to take this one step further and do an advanced search; the user will type in key words and use a drop down box to select the location; now how to pass this location_id to the ferret search so it searches key words only on matching records with same location_id?

thanks

--
Posted via http://www.ruby-forum.com/.

From steve at ourbigcircle.com Fri Jun 8 02:46:25 2007
From: steve at ourbigcircle.com (Luna Claire)
Date: Fri, 8 Jun 2007 08:46:25 +0200
Subject: [Ferret-talk] getting the list of indexed words from ferret or aaf
Message-ID: <9ea9ed2c67908fe4368c3e2d3f973541@ruby-forum.com>

is the list of indexed words readily available via aaf or directly from ferret?

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Fri Jun 8 04:34:00 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 8 Jun 2007 10:34:00 +0200
Subject: [Ferret-talk] Advise on slowness in bootstrapping?
In-Reply-To:
References:
Message-ID: <20070608083400.GB19126@cordoba.webit.de>

On Thu, Jun 07, 2007 at 05:19:26PM +0000, Daniel Einspanjer wrote:
> I am looking at trying to use ferret/aaf to supplement my querying against a
> medium and large table with lots of columns. Some facts first:
>
> Ferret 0.11.4
> AAF 0.4.0
> Ruby 1.8.6
> Rails 1.2.3
>
> Medium table:
> 105,464 rows
> 168 columns (mostly varchar(20))
> 11 actual columns indexed in aaf plus
> 40 virtual columns indexed in aaf (virtual is concat of two physical columns.
> e.g.
cast_first_name_1 + cast_last_name_1 through cast_first_name_20 +
> cast_last_name_20)
>
> Large table:
> 1,244,716 rows
> same column/index structure
>
> These tables are not updated via Ruby, only read. I am trying to use
> rebuild_index to bootstrap the medium sized table and it is taking a very long
> time (running for about 4 hours, indicates 50% complete with 4 hours remaining)
> and creating a massive number of files in the index directory (currently about
> 65k, was 90k earlier)

strange. Ferret is faster than that - I have a test script that builds an index of 100000 documents with 50 fields each containing a single random word in under 10 Minutes here on standard hardware.

Maybe the problem is something else? For starters, change line 220 of local_index.rb from
  index << rec.to_doc if rec.ferret_enabled?(true)
to
  doc = rec.to_doc if rec.ferret_enabled?(true)

so nothing is added to the index. How long does that take?

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From john at digitalpulp.com Fri Jun 8 06:13:51 2007
From: john at digitalpulp.com (John Bachir)
Date: Fri, 8 Jun 2007 06:13:51 -0400
Subject: [Ferret-talk] complete index rebuild using AAF trunk
In-Reply-To: <88EF8DAF-8AA3-4DD8-961D-BAD6D091852E@digitalpulp.com>
References: <88EF8DAF-8AA3-4DD8-961D-BAD6D091852E@digitalpulp.com>
Message-ID: <4AC55D32-90E5-4D0E-BE53-DAB9629AFA77@digitalpulp.com>

Jens, any thoughts on this?

On May 31, 2007, at 2:30 PM, John Bachir wrote:

> I am using AAF trunk, and I want a way to rebuild an index on a
> production site with little or no interruption to service.
The Drb
> Server documentation* states that when an index is rebuilt, it is
> done in a separate location and then swapped into place when
> finished, and so to do a complete rebuild on a live site, one must
> take into consideration objects which have been created or modified
> in the meantime. To achieve this, I have come up with the following
> solution:
>
> http://pastie.textmate.org/66602
>
> [1] Does this look like a complete solution? I suppose it relies on
> timestamp consistency between system components... it is possible
> that between setting "start = ..." and performing the rebuild,
> another thread in the system will have created an earlier timestamp
> for an object that did not get committed until after the rebuild
> began. Is it possible to do a perfect rebuild, or would that require
> building a layer of concurrency logic into AAF?
>
> [2] Is the behavior described in the Drb Server documentation
> different from AAF when not using the Drb Server?
>
> Thanks,
> John
>
> * http://projects.jkraemer.net/acts_as_ferret/wiki/DrbServer#AAFtrunk

From john at digitalpulp.com Fri Jun 8 06:16:00 2007
From: john at digitalpulp.com (John Bachir)
Date: Fri, 8 Jun 2007 06:16:00 -0400
Subject: [Ferret-talk] getting the list of indexed words from ferret or aaf
In-Reply-To: <9ea9ed2c67908fe4368c3e2d3f973541@ruby-forum.com>
References: <9ea9ed2c67908fe4368c3e2d3f973541@ruby-forum.com>
Message-ID: <0150489F-7857-433C-A1EE-E074019E306E@digitalpulp.com>

On Jun 8, 2007, at 2:46 AM, Luna Claire wrote:

> is the list of indexed words readily available via aaf or directly
> from
> ferret?

See this thread:

http://www.ruby-forum.com/topic/110065

From mattias at oncotype.dk Fri Jun 8 07:29:37 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Fri, 8 Jun 2007 13:29:37 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
Message-ID: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com>

Hi

I use Ferret 0.11.4 and the latest stable version of the acts_as_ferret plugin.

To the issue. if I do Model.rebuild_index and after that try to update one of my objects of that Model I get:

File Not Found Error occured at :117 in xpop_context
Error occured in fs_store.c:329 - fs_open_input
tried to open "/Users/mattias/Sites/thm/photo_archive/index/development/asset/_b.cfs" but it doesn't exist:

If I delete the entire index folder and update again it works fine. it breaks only if I do Model.rebuild_index.

Any suggestions?

--
Posted via http://www.ruby-forum.com/.

From mattias at oncotype.dk Fri Jun 8 07:32:19 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Fri, 8 Jun 2007 13:32:19 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com>
Message-ID: <198629f670b1c35d27851fc7e92f33a3@ruby-forum.com>

More info:

I can also make it work by doing.

Model.rebuild_index
Model.find_by_contents

Then i can update/save a record again.

--
Posted via http://www.ruby-forum.com/.

From bk at benjaminkrause.com Fri Jun 8 07:39:02 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Fri, 8 Jun 2007 13:39:02 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com>
Message-ID: <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com>

On 2007-06-08, at 1:29 PM, Mattias Bud wrote:

> To the issue.
if I do Model.rebuild_index and after that try to update
> one of my objects of that Model I get:
>
> File Not Found Error occured at :117 in xpop_context
> Error occured in fs_store.c:329 - fs_open_input
> tried to open
> "/Users/mattias/Sites/thm/photo_archive/index/development/asset/
> _b.cfs"
> but it doesn't exist:
>
> If I delete the entire index folder and update again it works fine. it
> breaks only if I do Model.rebuild_index.

hey .. sounds like you had another process access/modify the index.. did you try it with the ferret drb server?

Ben

From mattias at oncotype.dk Fri Jun 8 07:41:34 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Fri, 8 Jun 2007 13:41:34 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com>
Message-ID: <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com>

Benjamin Krause wrote:
> On 2007-06-08, at 1:29 PM, Mattias Bud wrote:
>
>> If I delete the entire index folder and update again it works fine. it
>> breaks only if I do Model.rebuild_index.
>
> hey .. sounds like you had another process access/modify the
> index.. did you try it with the ferret drb server?
>
> Ben

No - local index.

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Fri Jun 8 07:54:52 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 8 Jun 2007 13:54:52 +0200
Subject: [Ferret-talk] complete index rebuild using AAF trunk
In-Reply-To: <4AC55D32-90E5-4D0E-BE53-DAB9629AFA77@digitalpulp.com>
References: <88EF8DAF-8AA3-4DD8-961D-BAD6D091852E@digitalpulp.com> <4AC55D32-90E5-4D0E-BE53-DAB9629AFA77@digitalpulp.com>
Message-ID: <20070608115452.GB23116@cordoba.webit.de>

On Fri, Jun 08, 2007 at 06:13:51AM -0400, John Bachir wrote:
>
yeah, that's ok, I still didn't catch up with the list ;-)

> Jens, any thoughts on this?

see below.
>
>
> On May 31, 2007, at 2:30 PM, John Bachir wrote:
>
> > I am using AAF trunk, and I want a way to rebuild an index on a
> > production site with little or no interruption to service. The Drb
> > Server documentation* states that when an index is rebuilt, it is
> > done in a separate location and then swapped into place when
> > finished, and so to do a complete rebuild on a live site, one must
> > take into consideration objects which have been created or modified
> > in the meantime. To achieve this, I have come up with the following
> > solution:
> >
> > http://pastie.textmate.org/66602
> >
> > [1] Does this look like a complete solution? I suppose it relies on
> > timestamp consistency between system components... it is possible
> > that between setting "start = ..." and performing the rebuild,
> > another thread in the system will have created an earlier timestamp
> > for an object that did not get committed until after the rebuild
> > began. Is it possible to do a perfect rebuild, or would that require
> > building a layer of concurrency logic into AAF?

The scenario you describe might happen and cause a record not to be indexed, but I'd implement it just like you did. To be safe you can subtract a minute or so from your recorded start time ;-)

If it is really critical for you to have all records indexed and relying on the timestamps is a no-go you'll have to implement your own synchronisation mechanism, maybe with checking for a running rebuild on each index update, and recording the corresponding records somewhere for later indexing.

> > [2] Is the behavior described in the Drb Server documentation
> > different from AAF when not using the Drb Server?

Without the DRb server aaf won't use index versions but will re-build the index in place. I didn't introduce the versioning there because the usual non-DRb-scenarios (test cases and development system) don't require it. With non-DRb-Multi-Process-Scenarios it would be hard to implement anyway.
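
The timestamp-based catch-up discussed above can be sketched roughly as follows. This is an illustration under assumptions, not aaf code: `Record`, `records_to_reindex` and the 60-second `SAFETY_MARGIN` are hypothetical names, and plain Structs stand in for models with an `updated_at` column.

```ruby
# Hypothetical stand-in for a model with an updated_at timestamp.
Record = Struct.new(:id, :updated_at)

# Assumed safety margin ("a minute or so"), in seconds.
SAFETY_MARGIN = 60

# Returns the records that should be indexed again after a rebuild:
# everything touched since shortly before the rebuild started.
def records_to_reindex(records, rebuild_started_at, margin = SAFETY_MARGIN)
  cutoff = rebuild_started_at - margin
  records.select { |record| record.updated_at >= cutoff }
end

start = Time.at(1_000_000)
records = [
  Record.new(1, start - 3600), # untouched for an hour: the rebuild covers it
  Record.new(2, start - 30),   # inside the safety margin: index it again
  Record.new(3, start + 10)    # changed while the rebuild was running
]

puts records_to_reindex(records, start).map(&:id).inspect
# prints: [2, 3]
```

Indexing a record twice is harmless here, so a generous margin trades a little extra work for a smaller chance of missing a late commit. It does not remove the race entirely, which is why the stricter alternative above is a dedicated synchronisation mechanism.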

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From kraemer at webit.de Fri Jun 8 07:58:03 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 8 Jun 2007 13:58:03 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com>
Message-ID: <20070608115803.GC23116@cordoba.webit.de>

On Fri, Jun 08, 2007 at 01:41:34PM +0200, Mattias Bud wrote:
> Benjamin Krause wrote:
> > On 2007-06-08, at 1:29 PM, Mattias Bud wrote:
> >
> >> If I delete the entire index folder and update again it works fine. it
> >> breaks only if I do Model.rebuild_index.
> >
> > hey .. sounds like you had another process access/modify the
> > index.. did you try it with the ferret drb server?
> >
> > Ben
>
> No - local index.

is this in development mode with 1 mongrel instance? if yes, make sure your Mongrel doesn't do index updates while you run rebuild_index from the console. Even better, run these tests without mongrel running, or in another RAILS_ENV.

With multiple mongrels, use the DRb server.

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From mattias at oncotype.dk Fri Jun 8 08:03:47 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Fri, 8 Jun 2007 14:03:47 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <20070608115803.GC23116@cordoba.webit.de>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de>
Message-ID: <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com>

Jens Kraemer wrote:
> On Fri, Jun 08, 2007 at 01:41:34PM +0200, Mattias Bud wrote:
>>
>> No - local index.
>
> is this in development mode with 1 mongrel instance? if yes, make sure
> your Mongrel doesn't do index updates while you run rebuild_index from
> the console. Even better, run these tests without mongrel running, or in
> another RAILS_ENV.
>
> With multiple mongrels, use the DRb server.
>
> Jens

I turned off mongrel and rebuilt - and it worked.

Thanks for the help

--
Posted via http://www.ruby-forum.com/.

From deinspanjer at gmail.com Fri Jun 8 10:25:07 2007
From: deinspanjer at gmail.com (Daniel Einspanjer)
Date: Fri, 8 Jun 2007 10:25:07 -0400
Subject: [Ferret-talk] Advise on slowness in bootstrapping?
In-Reply-To: <20070608083400.GB19126@cordoba.webit.de>
References: <20070608083400.GB19126@cordoba.webit.de>
Message-ID:

The bootstrap indexing actually ended up taking twice the amount of time listed below.

When there was no index directory and I made the call to rebuild_index, the ferret_index.log file had these lines in it:

# Logfile created on Thu Jun 07 08:46:34 -0400 2007 by logger.rb/1.5.2.9
rebuild index: []
reindexing model CurrentProgram
reindex model CurrentProgram : 0.00% complete : 25658.57 secs to finish
...
when it hit 100%, the following lines appeared: reindex model CurrentProgram : 99.56% complete : 219.29 secs to finish Created Ferret index in: ./script/../config/../config/../index/production/current_program rebuild index: [["CurrentProgram"]] reindexing model CurrentProgram reindex model CurrentProgram : 0.00% complete : 25740.65 secs to finish reindex model CurrentProgram : 0.95% complete : 26065.95 secs to finish So it looks like for some reason, it performed the rebuild twice. :( When I looked at it this morning, it had over 116k files in the current_program directory. Not the most healthy thing. I ran CurrentProgram.aaf_index.ferret_index.optimize and it took a few minutes and fully optimized down to three files. I made the testing patch suggested and am running now. I did not delete the index directory. The ferret_index.log started out with these lines: rebuild index: [["CurrentProgram"]] reindexing model CurrentProgram reindex model CurrentProgram : 0.00% complete : 3540.78 secs to finish reindex model CurrentProgram : 0.95% complete : 3510.69 secs to finish So it is a significantly shorter time when it isn't actually adding the doc to the index. If you have any further ideas on things to try or any other information you'd like to collect, please let me know. In the meantime, I'm going to try out the acts_as_solr plugin since I've had a bit more experience with tuning solr and see what the indexing performance on that looks like. Daniel On 6/8/07, Jens Kraemer wrote: > On Thu, Jun 07, 2007 at 05:19:26PM +0000, Daniel Einspanjer wrote: > > I am looking at trying to use ferret/aaf to supplement my querying against a > > medium and large table with lots of columns. Some facts first: > > > > Ferret 0.11.4 > > AAF 0.4.0 > > Ruby 1.8.6 > > Rails 1.2.3 > > > > Medium table: > > 105,464 rows > > 168 columns (mostly varchar(20)) > > 11 actual columns indexed in aaf plus > > 40 virtual columns indexed in aaf (virtual is concat of two physical columns. > > e.g. 
cast_first_name_1 + cast_last_name_1 through cast_first_name_20 + > > cast_last_name_20) > > > > Large table: > > 1,244,716 rows > > same column/index structure > > > > These tables are not updated via Ruby, only read. I am trying to use > > rebuild_index to bootstrap the medium sized table and it is taking a very long > > time (running for about 4 hours, indicates 50% complete with 4 hours remaining) > > and creating a massive number of files in the index directory (currently about > > 65k, was 90k earlier) > > strange. Ferret is faster than that - I have a test script that builds > an index of 100000 documents with 50 fields each containing a single random > word in under 10 Minutes here on standard hardware. > > Maybe the problem is something else? For starters, change line 220 > of local_index.rb from > index << rec.to_doc if rec.ferret_enabled?(true) > to > doc = rec.to_doc if rec.ferret_enabled?(true) > > so nothing is added to the index. How long does that take? > > Jens > > -- > Jens Kr?mer > webit! Gesellschaft f?r neue Medien mbH > Schnorrstra?e 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From kraemer at webit.de Fri Jun 8 10:53:22 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 8 Jun 2007 16:53:22 +0200 Subject: [Ferret-talk] Advise on slowness in bootstrapping? In-Reply-To: References: <20070608083400.GB19126@cordoba.webit.de> Message-ID: <20070608145322.GF23116@cordoba.webit.de> On Fri, Jun 08, 2007 at 10:25:07AM -0400, Daniel Einspanjer wrote: > The bootstrap indexing actually ended up taking twice the amount of > time listed below. 
When there was no index directory and I made the > call to rebuild_index, the ferret_index.log file had these lines in > it: > # Logfile created on Thu Jun 07 08:46:34 -0400 2007 by logger.rb/1.5.2.9 > rebuild index: [] > reindexing model CurrentProgram > reindex model CurrentProgram : 0.00% complete : 25658.57 secs to finish > ... > > when it hit 100%, the following lines appeared: > reindex model CurrentProgram : 99.56% complete : 219.29 secs to finish > Created Ferret index in: > ./script/../config/../config/../index/production/current_program > rebuild index: [["CurrentProgram"]] > reindexing model CurrentProgram > reindex model CurrentProgram : 0.00% complete : 25740.65 secs to finish > reindex model CurrentProgram : 0.95% complete : 26065.95 secs to finish > > > So it looks like for some reason, it performed the rebuild twice. :( damn, that bug seems to come back from time to time, I'll try to fix this over the weekend. > When I looked at it this morning, it had over 116k files in the > current_program directory. Not the most healthy thing. I ran > CurrentProgram.aaf_index.ferret_index.optimize and it took a few > minutes and fully optimized down to three files. It should optimize the index automatically after re-indexing. > I made the testing patch suggested and am running now. I did not > delete the index directory. The ferret_index.log started out with > these lines: > rebuild index: [["CurrentProgram"]] > reindexing model CurrentProgram > reindex model CurrentProgram : 0.00% complete : 3540.78 secs to finish > reindex model CurrentProgram : 0.95% complete : 3510.69 secs to finish > > So it is a significantly shorter time when it isn't actually adding > the doc to the index. Yeah, looks like it's really the indexing that takes the time. Can you make sure for your testing that nothing else accesses the index while the rebuild runs (i.e. shutdown any mongrels running? 
Or try aaf trunk and the DRb server, which ensures that by design and
is the more realistic scenario for performance measurements anyway.

> If you have any further ideas on things to try or any other
> information you'd like to collect, please let me know. In the
> meantime, I'm going to try out the acts_as_solr plugin since I've had
> a bit more experience with tuning solr and see what the indexing
> performance on that looks like.

From what I've heard it should be on par with aaf when things are
working normally (I guess they don't for some reason in your case).

btw, what platform do you run on?

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From deinspanjer at gmail.com Fri Jun 8 12:44:06 2007
From: deinspanjer at gmail.com (Daniel Einspanjer)
Date: Fri, 8 Jun 2007 12:44:06 -0400
Subject: [Ferret-talk] Advise on slowness in bootstrapping?
In-Reply-To: <20070608145322.GF23116@cordoba.webit.de>
References: <20070608083400.GB19126@cordoba.webit.de> <20070608145322.GF23116@cordoba.webit.de>
Message-ID:

On 6/8/07, Jens Kraemer wrote:
> On Fri, Jun 08, 2007 at 10:25:07AM -0400, Daniel Einspanjer wrote:
> damn, that bug seems to come back from time to time, I'll try to fix
> this over the weekend.

I saw a couple of other threads mentioning something similar to this
so I figured it either wasn't fixed in the version I was working with
or it might have been a regression.

> > When I looked at it this morning, it had over 116k files in the
> > current_program directory. Not the most healthy thing. I ran
> > CurrentProgram.aaf_index.ferret_index.optimize and it took a few
> > minutes and fully optimized down to three files.
>
> It should optimize the index automatically after re-indexing.
I see in the rebuild_index method where it calls optimize, but it certainly didn't seem to fully optimize it at that time. Maybe there was something specific to the case of a newly created index instead of opening an existing one? > > I made the testing patch suggested and am running now. I did not > > delete the index directory. The ferret_index.log started out with > > these lines: > > rebuild index: [["CurrentProgram"]] > > reindexing model CurrentProgram > > reindex model CurrentProgram : 0.00% complete : 3540.78 secs to finish > > reindex model CurrentProgram : 0.95% complete : 3510.69 secs to finish > > > > So it is a significantly shorter time when it isn't actually adding > > the doc to the index. > > Yeah, looks like it's really the indexing that takes the time. Can you > make sure for your testing that nothing else accesses the index while > the rebuild runs (i.e. shutdown any mongrels running? Since this was a bootstrapping test, I had no processes running other than the script\console production from which I issued the rebuild_index command. > Or try aaf trunk and the DRb server which will ensure that by design and > for performance measurements is the more realistical scenario anyway. I'm currently planning on running this as a single instance application because the index will be read only at run time and only used by one or two people at a time. > >From what I've heard it [aas] should be on par with aaf when things are > working normal (I guess they don't for some reason in your case). I've heard the same. The only reason I thought to try it out was because of my prior experience with Solr. > btw, what platform do you run on? This is a windows box connecting to a MSSQL server. (I know.. ick. ;) I did some preliminary testing to make sure that the pagination was working properly since I saw in the list that other people had some difficulties with it. 
Daniel

From steve at ourbigcircle.com Fri Jun 8 14:00:49 2007
From: steve at ourbigcircle.com (Luna Claire)
Date: Fri, 8 Jun 2007 20:00:49 +0200
Subject: [Ferret-talk] Ferret FileNotFound error after adding counter_cache to
In-Reply-To: <9d4e5f1c4437d4fb0815c93e02ab28f1@ruby-forum.com>
References: <9d4e5f1c4437d4fb0815c93e02ab28f1@ruby-forum.com>
Message-ID:

Luna Claire wrote:
> I have a model that I've been indexing and searching with ferret with no
> problems.
>
> I just added a counter_cache for some voting functionality to the same
> model and now when I perform the voting fxn on an object from that
> model, I get the FileNotFound error as it looks for a file named
> "_1c_1.del" ...which breaks my voting function.
>
> I tried killing the server and my index directory as suggested for FnF
> errs elsewhere, but, now, after restarting the server and performing a
> search (to trigger rebuilding the index), I still have the same prob
> when voting.
>
> Any thoughts? (hopefully I won't have to rewind the counter_cache
> change)
>
> TIA

Haven't gotten a response to the above, but I'm still hoping to...

I removed the counter_cache on the non-indexed field that was causing
the FnF error and things are back to working well, but I'll still need
that at some point.

Has anyone else experienced ferret not finding files like this after
changing a model that was working fine before?

Is there something about that specific file named "_1c_1.del" perhaps?

Thanks for any help.

--
Posted via http://www.ruby-forum.com/.
From steve at ourbigcircle.com Fri Jun 8 14:15:09 2007
From: steve at ourbigcircle.com (Luna Claire)
Date: Fri, 8 Jun 2007 20:15:09 +0200
Subject: [Ferret-talk] getting the list of indexed words from ferret or aaf
In-Reply-To: <0150489F-7857-433C-A1EE-E074019E306E@digitalpulp.com>
References: <9ea9ed2c67908fe4368c3e2d3f973541@ruby-forum.com> <0150489F-7857-433C-A1EE-E074019E306E@digitalpulp.com>
Message-ID: <4440be7bb4d84db7471aa10fe1110f1c@ruby-forum.com>

John Bachir wrote:
> See this thread:
>
> http://www.ruby-forum.com/topic/110065

thanks... that ref is to this snippet:

"term_hash = {}
Resource.aaf_index.ferret_index.reader.terms(:body).each {|t, f|
  term_hash[t] = f }
th_sorted = term_hash.sort {|a,b| a[1]<=>b[1]}.reverse

where Resource is the model being indexed"

Thanks, John. That puts me on the right path, but there are 2 gotchas
for me here:

to be more specific,

1) I'd like to get the index for just a specific object in my model, but
it seems the call above would return an index built on all of the
objects of that model... any way to just build an index and return the
terms for one object?

2) I'd also like to index text from a text_field or text_area *before
it's saved* as part of an object, so for this I'd like to be able to
pass a block of text to aaf and get the indexed terms for just that body
of text... any way to do that with aaf?

TIA

--
Posted via http://www.ruby-forum.com/.
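
The sorting step in the snippet quoted above can be tried without an index. What follows is only a minimal sketch in plain Ruby: the `term_counts` pairs are invented stand-ins for the (term, frequency) pairs the quoted snippet collects from `reader.terms(:body)`, the helper name `sort_terms_by_frequency` is hypothetical, and no Ferret or aaf call is made.

```ruby
# Stand-in for the (term, frequency) pairs the quoted snippet collects
# from reader.terms(:body); these values are invented for illustration.
term_counts = [["ferret", 12], ["index", 30], ["search", 30], ["ruby", 7]]

# Build the term => frequency hash, as in the quoted snippet.
term_hash = {}
term_counts.each { |term, freq| term_hash[term] = freq }

# Hypothetical helper: sort by frequency, highest first. Ties are broken
# alphabetically so the order is deterministic (the quoted sort/reverse
# does not define an order for terms with equal frequency).
def sort_terms_by_frequency(term_hash)
  term_hash.sort_by { |term, freq| [-freq, term] }
end

th_sorted = sort_terms_by_frequency(term_hash)
th_sorted.each { |term, freq| puts "#{freq}\t#{term}" }
```

With a real index, the hash would be filled from the reader exactly as in the quoted snippet; only the sorting part is shown here.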
From steven at housecafemusic.com Sun Jun 10 21:22:44 2007
From: steven at housecafemusic.com (Steven Garcia)
Date: Sun, 10 Jun 2007 18:22:44 -0700
Subject: [Ferret-talk] Rewarding exact matches
In-Reply-To: <20070508121627.GK25267@cordoba.webit.de>
References: <20070502115003.GB4687@cordoba.webit.de> <42149D18-5DEF-47E2-85DA-E1EF669106B2@housecafemusic.com> <20070508115246.GJ25267@cordoba.webit.de> <20070508121627.GK25267@cordoba.webit.de>
Message-ID:

I tried using the do_search method you suggested and get this error:

undefined method `SearchResults' for ActsAsFerret:Module

On May 8, 2007, at 5:16 AM, Jens Kraemer wrote:

> def do_search
>   title_results = Term.multi_search( "title:(#{query})",
>                                      [Article, Term] )
>   body_results = Term.multi_search( "#{query} -title:(#{query})",
>                                     [Article, Term] )
>   new ActsAsFerret::SearchResults( title_results + body_results,
>                                    title_results.total_hits +
>                                    body_results.total_hits )
> end

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070610/ca469f81/attachment.html

From kraemer at webit.de Mon Jun 11 02:56:29 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 11 Jun 2007 08:56:29 +0200
Subject: [Ferret-talk] Rewarding exact matches
In-Reply-To:
References: <20070502115003.GB4687@cordoba.webit.de> <42149D18-5DEF-47E2-85DA-E1EF669106B2@housecafemusic.com> <20070508115246.GJ25267@cordoba.webit.de> <20070508121627.GK25267@cordoba.webit.de>
Message-ID: <20070611065629.GG23116@cordoba.webit.de>

On Sun, Jun 10, 2007 at 06:22:44PM -0700, Steven Garcia wrote:
> I tried using the do_search method you suggested and get this error:
>
> undefined method `SearchResults' for ActsAsFerret:Module

my fault, correct syntax of course is
ActsAsFerret::SearchResults.new(...)

Jens

--
Jens Krämer
webit!
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Mon Jun 11 03:02:51 2007 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 11 Jun 2007 09:02:51 +0200 Subject: [Ferret-talk] Ferret FileNotFound error after adding counter_cache to In-Reply-To: References: <9d4e5f1c4437d4fb0815c93e02ab28f1@ruby-forum.com> Message-ID: <20070611070251.GH23116@cordoba.webit.de> On Fri, Jun 08, 2007 at 08:00:49PM +0200, Luna Claire wrote: > Luna Claire wrote: > > I have a model that I've been indexing and searching with ferret with no > > problems. > > > > I just added a counter_cache for some voting functionality to the same > > model and now when I perform the voting fxn on an object from that > > model, I get the FileNotFound error as it looks for a file named > > "_1c_1.del" ...which breaks my voting function. > > > > I tried killing the server and my index directory as suggested for FnF > > errs elsewhere, but, now, after restarting the server and performing a > > search (to trigger rebuilding the index), I still have the same prob > > when voting. > > > > Any thoughts? (hopefully I won't have to rewind the counter_cache > > change) that is really interesting, could you provide me with your corresponding code (model declaration and your voting function) so I can try to reproduce this? Does this happen in development or in only in production environment? > Is there something about that specific file named "_1c_1.del" perhaps? Afaik it's a file where Ferret remembers the ids of deleted documents (they will 'really' get removed from the index only when optimizing it). Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From plynchnlm at gmail.com Mon Jun 11 18:08:13 2007 From: plynchnlm at gmail.com (Paul Lynch) Date: Tue, 12 Jun 2007 00:08:13 +0200 Subject: [Ferret-talk] Highlight slowness Message-ID: Has anyone else found that using ferret's highlighting slows searches down significantly? I am seeing that it more than doubles the search time on my system. I am returning up to 500 results at once, so the slow down is quite noticeable (probably adding about .7 seconds for searches with large result sets.) --Paul -- Posted via http://www.ruby-forum.com/. From joergd at pobox.com Tue Jun 12 09:48:06 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Tue, 12 Jun 2007 15:48:06 +0200 Subject: [Ferret-talk] Starting script/ferret_start problems Message-ID: Hi - I have a fresh install of OSX, Rails, Ruby, Ferret, AAF etc ... Everything is working fine, I can start mongrel, I can start backgroundrb, but Ferret Drb ... doesn't want to work. RAILS_ENV=development script/ferret_start Gives me: env: script/runner: Permission denied Is there a way I can debug this to figure out what is going on - or is there another way to start Ferret Drb? Thanks Joerg -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Jun 12 09:55:24 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 12 Jun 2007 15:55:24 +0200 Subject: [Ferret-talk] Starting script/ferret_start problems In-Reply-To: References: Message-ID: <20070612135524.GC6418@cordoba.webit.de> On Tue, Jun 12, 2007 at 03:48:06PM +0200, Joerg Diekmann wrote: > Hi - I have a fresh install of OSX, Rails, Ruby, Ferret, AAF etc ... > Everything is working fine, I can start mongrel, I can start > backgroundrb, but Ferret Drb ... doesn't want to work. 
> > RAILS_ENV=development script/ferret_start > > Gives me: > > env: script/runner: Permission denied that probably means that the runner script isn't marked as executable. You can correct this by doing `chmod +x script/runner` from your RAILS_HOME. Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From joergd at pobox.com Tue Jun 12 10:00:36 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Tue, 12 Jun 2007 16:00:36 +0200 Subject: [Ferret-talk] Starting script/ferret_start problems In-Reply-To: <20070612135524.GC6418@cordoba.webit.de> References: <20070612135524.GC6418@cordoba.webit.de> Message-ID: <9c1a4022b85e3513cf83954ac1c52458@ruby-forum.com> And it was that simple .... Doh. Thanks Jens! > that probably means that the runner script isn't marked as executable. > > You can correct this by doing `chmod +x script/runner` from your > RAILS_HOME. -- Posted via http://www.ruby-forum.com/. From dickjr at gmail.com Tue Jun 12 11:32:48 2007 From: dickjr at gmail.com (Richard Jones) Date: Tue, 12 Jun 2007 11:32:48 -0400 Subject: [Ferret-talk] index browser inconsistent with IndexReader Message-ID: <381eb1660706120832p56442euf85b5ab24aeaaaf5@mail.gmail.com> Hi, We have an index of around 1M web pages as part of our web app. The app uses ferret by way of RDig to perform searches. We have noticed anecdotally that some searches don't work the way we thought they should, as if documents were missing from the index. Yesterday we came upon a concrete instance of this. Our documents have several fields, one of which is called :keywords and another called :data, both of which are used for searching. We isolated a single document that is not found on the web app by terms in the :data field, but which can be found by the terms in its :keywords field. 
We assumed first that a problem occurred in the indexing which resulted in the :data field being lost. However, the index browser that's included with version 0.11.4 showed the document with all its fields intact, including the :data field. All the :data field terms that failed to retrieve the document on the web app were indeed present, according to the browser. We then built a short script with the API that instantiated an IndexReader and called IndexReader.term_vectors() with the id of our subject doc. The term_vectors returned included a vector for :keywords, but not for :data. Somehow the core API funcs are not finding this document's :data field when the 0.11.4 browser is. Are there differences between the two that would explain this? Does this problem description ring a bell with anyone out there? Many thanks. -- Richard Jones dickjr at gmail.com From dickjr at gmail.com Tue Jun 12 11:46:02 2007 From: dickjr at gmail.com (Richard Jones) Date: Tue, 12 Jun 2007 11:46:02 -0400 Subject: [Ferret-talk] index browser inconsistent with IndexReader Message-ID: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com> Follow-up to my recent post with the same subject: It seems that within the API scripting world I can view the suspect document by instantiating and then loading the LazyDoc returned by Ferret::Search::Searcher.get_document(doc_id). It contains the :data field data and is perhaps what is being used by the browser. So my question is then this: what would cause a document in an index to have a non-empty field when looked at through a LazyDoc, but for which no non-empty term_vector is available for the same field on the same document? 
--
Richard Jones
dickjr at gmail.com

From kraemer at webit.de Tue Jun 12 11:58:29 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 12 Jun 2007 17:58:29 +0200
Subject: [Ferret-talk] index browser inconsistent with IndexReader
In-Reply-To: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com>
References: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com>
Message-ID: <20070612155829.GD6418@cordoba.webit.de>

On Tue, Jun 12, 2007 at 11:46:02AM -0400, Richard Jones wrote:
> Follow-up to my recent post with the same subject:
>
> It seems that within the API scripting world I can view the suspect
> document by instantiating and then loading the LazyDoc returned by
> Ferret::Search::Searcher.get_document(doc_id). It contains the :data
> field data and is perhaps what is being used by the browser.
>
> So my question is then this: what would cause a document in an index
> to have a non-empty field when looked at through a LazyDoc, but for
> which no non-empty term_vector is available for the same field on the
> same document?

having the field data stored in the index does not imply that this field
is searchable. It all depends on what options are set for the field (see
the FieldInfos api docs for the available options)

So it's perfectly possible to create an index with fields f1 and f2,
where only f1 can be searched, but the contents of f2 can be shown for
search results:

  fi = Ferret::Index::FieldInfos.new
  fi.add_field :f1, :store => :yes, :index => :yes
  fi.add_field :f2, :store => :yes, :index => :no, :term_vector => :no
  i = Ferret::I.new :field_infos => fi
  i << { :f1 => 'field one' , :f2 => 'field two' }

  i.search 'one' # finds the document
  i.search 'two' # won't find anything

  i[0][:f1] # outputs 'field one'
  i[0][:f2] # outputs 'field two'

However that does not explain why some documents seem to have other
indexing options than the rest - maybe you changed them some time
without doing a rebuild?

Jens

--
Jens Krämer
webit!
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From ismaelct at gmail.com Tue Jun 12 19:23:55 2007 From: ismaelct at gmail.com (IsmaSan) Date: Wed, 13 Jun 2007 01:23:55 +0200 Subject: [Ferret-talk] Automatically Indexing Associated Models In-Reply-To: <20070201143045.GJ21355@cordoba.webit.de> References: <20070201083009.GH21355@cordoba.webit.de> <3d0d9a26211961a3e92e9b7d8dad6446@ruby-forum.com> <20070201143045.GJ21355@cordoba.webit.de> Message-ID: <0165640b06f779b3668c01e3d9040c26@ruby-forum.com> Jens Kraemer wrote: > On Thu, Feb 01, 2007 at 02:54:41PM +0100, Mark wrote: >> >> do_something >> end > yeah, this should work, too. Even prettier from an architectural point > of view :-) > > Jens > > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 Hi. i have the same problem, but my_instance.ferret_update doesn't index associated model fields on the console. Doesn't ferret_update get called after_save anyway? -- Posted via http://www.ruby-forum.com/. From dickjr at gmail.com Wed Jun 13 08:58:36 2007 From: dickjr at gmail.com (Richard Jones) Date: Wed, 13 Jun 2007 08:58:36 -0400 Subject: [Ferret-talk] index browser inconsistent with IndexReader In-Reply-To: <20070612155829.GD6418@cordoba.webit.de> References: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com> <20070612155829.GD6418@cordoba.webit.de> Message-ID: <381eb1660706130558l7af3c26fy5b66783498aeffb@mail.gmail.com> According to my IndexReader's field_infos, all the fields are stored and indexed, with :with_positions_offsets for the term_vectors. A look at a term vector for one of these :data fields gives: # Is this what they look like when you index with :index=>no? 
On 6/12/07, Jens Kraemer wrote: > On Tue, Jun 12, 2007 at 11:46:02AM -0400, Richard Jones wrote: > > Follow-up to my recent post with the same subject: > > > > It seems that within the API scripting world I can view the suspect > > document by instantiating and then loading the LazyDoc returned by > > Ferret::Search::Searcher.get_document(doc_id). It contains the :data > > field data and is perhaps what is being used by the browser. > > > > So my question is then this: what would cause a document in an index > > to have a non-empty field when looked at through a LazyDoc, but for > > which no non-empty term_vector is available for the same field on the > > same document? > > having the field data stored in the index does not imply that this field > is searchable. It all depends what options are set for the field (see > the FieldInfos api docs for the available options) > > So it's perfectly possible to create an index with fields f1 and f2, > where only f1 can be searched, but the contents of f2 can be shown for > search results: > > fi = Ferret::Index::FieldInfos.new > fi.add_field :f1, :store => :yes, :index => :yes > fi.add_field :f2, :store => :yes, :index => :no, :term_vector => :no > i = Ferret::I.new :field_infos => fi > i << { :f1 => 'field one' , :f2 => 'field two' } > > i.search 'one' # finds the document > i.search 'two' # won't find anything > > > i[0][:f1] # outputs 'field one' > i[0][:f2] # outputs 'field two' > > > However that does not explain why some documents seem to have other > indexing options than the rest - maybe yo uchanged them some time > without doing a rebuild? > > > Jens > > > > -- > Jens Kr?mer > webit! 
Gesellschaft f?r neue Medien mbH > Schnorrstra?e 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Richard Jones dickjr at gmail.com From joergd at pobox.com Wed Jun 13 09:08:54 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Wed, 13 Jun 2007 15:08:54 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? Message-ID: Hi folks, I have several models that index well in Drb mode. However, I have one scenario where it works in normal mode, but not in Drb mode. model A field :one end model B belongs_to :a field :two delegate :one, :to => :a acts_as_ferret :fields => { :one => {}, :two => {} }, :remote => true end If I leave off the :remote parameter, it works. Or, if I don't index field :one it works in remote mode. But I can't use Drb to index the delegate. Am I doing something wrong? Thanks Joerg -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Jun 13 09:54:53 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 13 Jun 2007 15:54:53 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: References: Message-ID: <20070613135453.GG6418@cordoba.webit.de> On Wed, Jun 13, 2007 at 03:08:54PM +0200, Joerg Diekmann wrote: > Hi folks, > > I have several models that index well in Drb mode. > > However, I have one scenario where it works in normal mode, but not in > Drb mode. > > model A > field :one > end > > > model B > belongs_to :a > > field :two > delegate :one, :to => :a > > acts_as_ferret :fields => { :one => {}, :two => {} }, :remote => true > end > > > If I leave off the :remote parameter, it works. Or, if I don't index > field :one it works in remote mode. But I can't use Drb to index the > delegate. 
> > Am I doing something wrong? I'm not sure, but I don't think so :-) What exactly is your problem? Field :one just not being indexed or an exception? Could you try to replace the degate with a normal method call in a method named 'one' and see what happens then? A possible cause for this might be that the DRb server doesn't see your :a relationship when b gets indexed. To check this, you could i.e. override aaf's to_doc instance method in class B and have a look at the :a relationship or try to call the delegated method yourself. Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Wed Jun 13 10:00:54 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 13 Jun 2007 16:00:54 +0200 Subject: [Ferret-talk] index browser inconsistent with IndexReader In-Reply-To: <381eb1660706130558l7af3c26fy5b66783498aeffb@mail.gmail.com> References: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com> <20070612155829.GD6418@cordoba.webit.de> <381eb1660706130558l7af3c26fy5b66783498aeffb@mail.gmail.com> Message-ID: <20070613140054.GH6418@cordoba.webit.de> On Wed, Jun 13, 2007 at 08:58:36AM -0400, Richard Jones wrote: > According to my IndexReader's field_infos, all the fields are stored > and indexed, with :with_positions_offsets for the term_vectors. > > A look at a term vector for one of these :data fields gives: > > # > > Is this what they look like when you index with :index=>no? no, with index => no no term vectors can be stored and then term_vector returns nil, not an empty tv. The scenario you have could happen if your analyzer choked at indexing time and returned not a single term for your document (just like if you had a doc full of stop words). 
Since you have the stored contents, could you try to index that data again and see if the problem can be reproduced? Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From joergd at pobox.com Wed Jun 13 10:30:59 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Wed, 13 Jun 2007 16:30:59 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <20070613135453.GG6418@cordoba.webit.de> References: <20070613135453.GG6418@cordoba.webit.de> Message-ID: <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> Hi Jens - ok - that's a lot of meat to work with for now. Thanks. I'll post back here once I've figured it out or got more info. (The error message I receive is that Ferret cannot find the "add" method on model b.) > > A possible cause for this might be that the DRb server doesn't see your > :a relationship when b gets indexed. To check this, you could i.e. > override aaf's to_doc instance method in class B and have a look at the > :a relationship or try to call the delegated method yourself. > > > Jens > -- Posted via http://www.ruby-forum.com/. From dickjr at gmail.com Wed Jun 13 10:31:38 2007 From: dickjr at gmail.com (Richard Jones) Date: Wed, 13 Jun 2007 10:31:38 -0400 Subject: [Ferret-talk] index browser inconsistent with IndexReader In-Reply-To: <20070613140054.GH6418@cordoba.webit.de> References: <381eb1660706120846h2c17b05fpa83fc6f6a95dd2ec@mail.gmail.com> <20070612155829.GD6418@cordoba.webit.de> <381eb1660706130558l7af3c26fy5b66783498aeffb@mail.gmail.com> <20070613140054.GH6418@cordoba.webit.de> Message-ID: <381eb1660706130731w12a7b518ibc7a1fc04970d6e8@mail.gmail.com> I ran one of the :data fields through the StandardAnalyzer - the only one we have used - and it tokenized it with no complaints. 
Interestingly, the last batch of 1700 sites that we added incrementally to our index does not seem to suffer from this problem. On 6/13/07, Jens Kraemer wrote: > On Wed, Jun 13, 2007 at 08:58:36AM -0400, Richard Jones wrote: > > According to my IndexReader's field_infos, all the fields are stored > > and indexed, with :with_positions_offsets for the term_vectors. > > > > A look at a term vector for one of these :data fields gives: > > > > # > > > > Is this what they look like when you index with :index=>no? > > no, with index => no no term vectors can be stored and then term_vector > returns nil, not an empty tv. > > The scenario you have could happen if your analyzer choked at indexing > time and returned not a single term for your document (just like if you > had a doc full of stop words). > > Since you have the stored contents, could you try to index that data > again and see if the problem can be reproduced? > > Jens > > > > -- > Jens Kr?mer > webit! Gesellschaft f?r neue Medien mbH > Schnorrstra?e 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Richard Jones dickjr at gmail.com From kraemer at webit.de Wed Jun 13 10:42:51 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 13 Jun 2007 16:42:51 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> Message-ID: <20070613144250.GJ6418@cordoba.webit.de> On Wed, Jun 13, 2007 at 04:30:59PM +0200, Joerg Diekmann wrote: > Hi Jens - ok - that's a lot of meat to work with for now. Thanks. 
I'll > post back here once I've figured it out or got more info. > > (The error message I receive is that Ferret cannot find the "add" method > on model b.) Strange, do you have a stack trace of this error? Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From jason at greenhell.com Wed Jun 13 10:53:12 2007 From: jason at greenhell.com (jason) Date: Wed, 13 Jun 2007 16:53:12 +0200 Subject: [Ferret-talk] Last indexed date for a model? Message-ID: <193e115db2ff994712d859bd7a7b2095@ruby-forum.com> Is it possible to retrieve the date when a particular model was last indexed? I would like to display this date in the search results. Thanks. -- Posted via http://www.ruby-forum.com/. From joergd at pobox.com Wed Jun 13 10:58:44 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Wed, 13 Jun 2007 16:58:44 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> Message-ID: <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> One other thing that is WEIRD is that I have a unit test, with a single line: assert B.create!(:a => a, :two => "Test") 1. This line fails every time I run it (the Ferret error about there not being a method called "add" on model B). 2. When I then comment out the delegate field in the acts_as_ferret declaration, and rerun the test, it works. 3. But now this is the INTERESTING bit: If I re-include (uncomment) the delegate in the acts_as_ferret declaration, and run the test again ... then the test passes! 4. And when I delete the index folder and rerun the test, it still works. 5. Then I stop and start Ferret and the tests FAIL. 6. Now I do step 2. again, and the tests pass. 7. Step 3.
and the test FAILS. 8. Run the test again, and it PASSES. So .... yeah. Sometimes it works - other times it doesn't. -- Posted via http://www.ruby-forum.com/. From joergd at pobox.com Wed Jun 13 11:04:56 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Wed, 13 Jun 2007 17:04:56 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> Message-ID: <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> class Employee belongs_to :person delegate :name, :to => :person acts_as_ferret( :fields => { :employee_no => {}, :name => {} }, :remote => true) end test_create(EmployeeTest): NoMethodError: undefined method `add' for Employee:Class at top level in localhost at line 9009 at top level in localhost at line 9009 at top level in localhost at line 9009 method << in remote_index.rb at line 31 method ferret_create in instance_methods.rb at line 73 method send in callbacks.rb at line 333 method callback in callbacks.rb at line 333 method each in callbacks.rb at line 330 method callback in callbacks.rb at line 330 method create_without_timestamps in callbacks.rb at line 255 method create_without_user in timestamp.rb at line 30 method create in user_monitor.rb at line 22 method create_or_update_without_callbacks in base.rb at line 1792 method create_or_update in callbacks.rb at line 242 method save_without_validation! in base.rb at line 1554 method save_without_transactions! in validations.rb at line 762 method save! in transactions.rb at line 133 method transaction in database_statements.rb at line 59 method transaction in transactions.rb at line 95 method transaction in transactions.rb at line 121 method save! in transactions.rb at line 133 method create! 
in validations.rb at line 727 method test_create in employee_test.rb at line 165 -- Posted via http://www.ruby-forum.com/. From seggy.umboh at gmail.com Wed Jun 13 21:12:34 2007 From: seggy.umboh at gmail.com (Seggy Umboh) Date: Thu, 14 Jun 2007 03:12:34 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> Message-ID: <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> I have the same problem using the Drb Server but mine is consistent, I get this everytime once I switch to remote mode: NoMethodError: undefined method `add' for User:Class from (druby://localhost:9010) /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.15.3/lib/active_record/base.rb:1236:in `method_missing' from (druby://localhost:9010) ./script/../config/../vendor/plugins/acts_as_ferret/lib/ferret_server.rb:70:in `send' from (druby://localhost:9010) ./script/../config/../vendor/plugins/acts_as_ferret/lib/ferret_server.rb:70:in `method_missing' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/remote_index.rb:31:in `<<' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/instance_methods.rb:73:in `ferret_update' The acts_as_ferret line for the User model is very basic: acts_as_ferret :fields => ['lastname','firstname'], :remote => true -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Jun 14 02:38:38 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 14 Jun 2007 08:38:38 +0200 Subject: [Ferret-talk] Last indexed date for a model? 
In-Reply-To: <193e115db2ff994712d859bd7a7b2095@ruby-forum.com> References: <193e115db2ff994712d859bd7a7b2095@ruby-forum.com> Message-ID: <20070614063838.GK6418@cordoba.webit.de> On Wed, Jun 13, 2007 at 04:53:12PM +0200, jason wrote: > Is it possible to retrieve the date of when a particular model was last > indexed? > I would like to display this date in the search results. Thanks. Ferret doesn't store such data unless you tell it to do so by creating a field holding the indexing date. With aaf you can easily create a method that returns the current time in a reasonable format and mention its name in the list of fields. Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From emanuele.vicentini at mayaidee.it Thu Jun 14 03:34:16 2007 From: emanuele.vicentini at mayaidee.it (Emanuele) Date: Thu, 14 Jun 2007 09:34:16 +0200 Subject: [Ferret-talk] Is DRb server *really* required in production mode? Message-ID: <2406bdec4bd9e6a11c4b2925318a6813@ruby-forum.com> Greetings, I'm certainly :-) too thick, but after reading Ferret and AAF's documentation I'm still wondering if I really need a DRb server in production. At the moment, the planned structure for production mode is a single server running the app with Apache and fastcgi; as far as I know, Ferret is multithread-safe, isn't it? So, is a DRb server required to avoid any concurrent access problem in this scenario? Let's say we're going to switch to a bunch of mongrels, still keeping everything on a single server: now the DRb server would be really necessary, right? I mean, there would be multiple processes trying to access the index to read and possibly update it, so here the DRb server would save the day, right? Thanks for your patience :-) -- Posted via http://www.ruby-forum.com/.
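[Editor's sketch for the "Last indexed date for a model?" thread above.] Jens's suggestion of a method that returns the current time and is listed among the indexed fields might look roughly like the following. This is a minimal sketch under stated assumptions: the Article class, the indexed_at method name and the timestamp format are all hypothetical, and the acts_as_ferret line is shown only as a comment because it needs a Rails app to run. Whether the value can be read back from a search hit presumably also depends on how the field is stored.

```ruby
# Hypothetical model. In a Rails app this would be an ActiveRecord class
# declaring something like (assumption, not tested here):
#   acts_as_ferret :fields => [:title, :indexed_at]
class Article
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Evaluated whenever the document is built for indexing, so the value
  # reflects the time of the last (re)index. The fixed-width format keeps
  # the values sortable as plain strings.
  def indexed_at
    Time.now.utc.strftime("%Y%m%d%H%M%S")
  end
end

puts Article.new("Programming Ruby").indexed_at
```

The fixed-width year-to-second layout is just one choice of "reasonable format"; any format that sorts lexicographically would do.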
From seggy.umboh at gmail.com Thu Jun 14 04:46:50 2007 From: seggy.umboh at gmail.com (Seggy Umboh) Date: Thu, 14 Jun 2007 10:46:50 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> Message-ID: Ahh, after some investigation, I found that starting the ferret server puts the following in the Rails log: Asked for a remote server ? true, ENV["FERRET_USE_LOCAL_INDEX"] is nil, server is not_running Will use remote index server which should be available at druby://127.0.0.1:9010 Which was only vaguely suspicious, but then the code revealed: # Usually the automatic detection of server mode works fine, however if you # require your model classes in environment.rb they will get loaded before the # DRb server is started, so this code is executed too early and detection won't # work. In this case you'll get endless loops resulting in "stack level too deep" # errors. # To get around this, start the server with the environment variable # FERRET_USE_LOCAL_INDEX set to '1'. Now the comments claim that we would get "stack level too deep" but because RemoteIndex::add modifies the argument with a to_doc, we actually get a NoMethodError from the second call to to_doc: undefined method `to_doc' for {"firstname"=>"Seggy", "lastname"=>"Umboh", :id=>1}:Ferret::Document This effectively cuts the infinite loop, and avoids the "stack level too deep" error. So, the solution, at least in my case, is to run the server with: FERRET_USE_LOCAL_INDEX=1 script/ferret_start Hope that helps... -- Posted via http://www.ruby-forum.com/. 
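[Editor's sketch.] The decision Seggy describes, where an explicit FERRET_USE_LOCAL_INDEX setting overrides the :remote => true request, can be pictured with a few lines of Ruby. This is not acts_as_ferret's actual code: the helper name use_remote_index? and its arguments are hypothetical, and the real detection logic (which also checks whether the server is running) may differ.

```ruby
# Hypothetical helper, not part of acts_as_ferret: decide whether this
# process should talk to the DRb server or open the index directly.
#
# remote_requested - true if the model was declared with :remote => true
# env              - environment hash, defaults to the process environment
def use_remote_index?(remote_requested, env = ENV)
  # An explicit override wins. Any value counts as "set" in this sketch.
  return false if env["FERRET_USE_LOCAL_INDEX"]
  remote_requested
end

# The Rails app (variable unset) would go through the remote index ...
puts use_remote_index?(true, {})
# ... while the server process (variable set) works on the local index.
puts use_remote_index?(true, { "FERRET_USE_LOCAL_INDEX" => "1" })
```

Passing the environment in as an argument is only there to make the sketch easy to try out without touching the real process environment.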
From joergd at pobox.com Thu Jun 14 05:19:45 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Thu, 14 Jun 2007 11:19:45 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> Message-ID: <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> Hmm ... that means though that I am not using the remote server? > So, the solution, at least in my case, is to run the server with: > > FERRET_USE_LOCAL_INDEX=1 script/ferret_start > > Hope that helps... -- Posted via http://www.ruby-forum.com/. From seggy.umboh at gmail.com Thu Jun 14 05:38:19 2007 From: seggy.umboh at gmail.com (Seggy Umboh) Date: Thu, 14 Jun 2007 11:38:19 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> Message-ID: By setting the environment variable this way, it only takes effect for that command, which is the ferret server. The server needs to use the LocalIndex internally. Your regular Rails app will not have FERRET_USE_LOCAL_INDEX set, and thus use RemoteIndex. Try it out... Joerg Diekmann wrote: > Hmm ... that means though that I am not using the remote server? > >> So, the solution, at least in my case, is to run the server with: >> >> FERRET_USE_LOCAL_INDEX=1 script/ferret_start >> >> Hope that helps... -- Posted via http://www.ruby-forum.com/. 
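[Editor's sketch.] Seggy's point that the variable "only takes effect for that command" can be checked from Ruby itself. The sketch below starts a child Ruby process with FERRET_USE_LOCAL_INDEX set only in the child's environment, standing in for the ferret server, and then looks at the parent's environment, standing in for the Rails app. It assumes a Ruby version where IO.popen accepts an environment hash (1.9 or later); nothing here is specific to acts_as_ferret.

```ruby
require "rbconfig"

# Start a child Ruby with FERRET_USE_LOCAL_INDEX=1 in its environment only,
# much like `FERRET_USE_LOCAL_INDEX=1 script/ferret_start` does for the server.
child_value = IO.popen({ "FERRET_USE_LOCAL_INDEX" => "1" },
                       [RbConfig.ruby, "-e", 'print ENV["FERRET_USE_LOCAL_INDEX"]'],
                       &:read)

# The parent process (standing in for the regular Rails app) is not changed
# by what was passed to the child.
parent_value = ENV["FERRET_USE_LOCAL_INDEX"]

puts "child sees:  #{child_value.inspect}"
puts "parent sees: #{parent_value.inspect}"
```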
From seggy.umboh at gmail.com Thu Jun 14 05:45:35 2007 From: seggy.umboh at gmail.com (Seggy Umboh) Date: Thu, 14 Jun 2007 11:45:35 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> Message-ID: I should also add that this environment variable is only being read after http://projects.jkraemer.net/acts_as_ferret/changeset/180 So if you are using the gem, you will either need to backport this fix as an extension or use the plugin in trunk. Seggy Umboh wrote: > By setting the environment variable this way, it only takes effect for > that command, which is the ferret server. The server needs to use the > LocalIndex internally. > > Your regular Rails app will not have FERRET_USE_LOCAL_INDEX set, and > thus use RemoteIndex. > > Try it out... > > > Joerg Diekmann wrote: >> Hmm ... that means though that I am not using the remote server? >> >>> So, the solution, at least in my case, is to run the server with: >>> >>> FERRET_USE_LOCAL_INDEX=1 script/ferret_start >>> >>> Hope that helps... -- Posted via http://www.ruby-forum.com/. From joergd at pobox.com Thu Jun 14 06:07:03 2007 From: joergd at pobox.com (Joerg Diekmann) Date: Thu, 14 Jun 2007 12:07:03 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? 
In-Reply-To: References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> Message-ID: <942aeb6f627bd5901869ef92a9917a81@ruby-forum.com> Did it all, but the trunk version complains that it cannot find my index/test/my_model directory when I run the test ... :-( Seggy Umboh wrote: > I should also add that this environment variable is only being read > after http://projects.jkraemer.net/acts_as_ferret/changeset/180 > > So if you are using the gem, you will either need to backport this fix > as an extension or use the plugin in trunk. > > > Seggy Umboh wrote: >> By setting the environment variable this way, it only takes effect for >> that command, which is the ferret server. The server needs to use the >> LocalIndex internally. >> >> Your regular Rails app will not have FERRET_USE_LOCAL_INDEX set, and >> thus use RemoteIndex. >> >> Try it out... >> >> >> Joerg Diekmann wrote: >>> Hmm ... that means though that I am not using the remote server? >>> >>>> So, the solution, at least in my case, is to run the server with: >>>> >>>> FERRET_USE_LOCAL_INDEX=1 script/ferret_start >>>> >>>> Hope that helps... -- Posted via http://www.ruby-forum.com/. From john at johnleach.co.uk Thu Jun 14 07:07:31 2007 From: john at johnleach.co.uk (John Leach) Date: Thu, 14 Jun 2007 12:07:31 +0100 Subject: [Ferret-talk] Is DRb server *really* required in production mode? In-Reply-To: <2406bdec4bd9e6a11c4b2925318a6813@ruby-forum.com> References: <2406bdec4bd9e6a11c4b2925318a6813@ruby-forum.com> Message-ID: <1181819251.27461.12.camel@localhost.localdomain> Hi Emanuele, Sorry, I would like to help in Italian, but my Italian is not good.
So, yes, Ferret is thread-safe, but Rails is not: each Rails fastcgi instance runs in a separate process (with its own Ferret et al.). So, if your initial plan is for only one fastcgi process, you'll be fine. If you will have multiple fastcgi processes, you may encounter problems without the DRb server. Whether it's a cluster of mongrels or a cluster of fastcgis, they are separate processes either way. John. On Thu, 2007-06-14 at 09:34 +0200, Emanuele wrote: > Greetings, > > I'm certainly :-) too thick, but after reading Ferret and AAF's > documentation I'm still wondering if I really need a DRb server in > production. > > At the moment, the planned structure for production mode is a single > server running the app with Apache and fastcgi; as far as I know, Ferret > is multithread-safe, isn't it? So, is a DRb server required to avoid any > concurrent access problem in this scenario? > > Let's say we're going to switch to a bunch of mongrels, still keeping > everything on a single server: now the DRb server would be really > necessary, right? I mean, there would be multiple processes trying to > access the index to read and possibly update it, so here the DRb server > would save the day, right? > > Thanks for your patience :-) > -- http://johnleach.co.uk From reverri at gmail.com Thu Jun 14 10:36:55 2007 From: reverri at gmail.com (Daniel Reverri) Date: Thu, 14 Jun 2007 16:36:55 +0200 Subject: [Ferret-talk] Undo deletion? Message-ID: <68f5de52068cfe3c1f312a5fa9903782@ruby-forum.com> Does anyone know of a method for undoing a deletion before an IndexWriter has been committed? The deletion would be done through the IndexWriter. -- Posted via http://www.ruby-forum.com/.
From mattias at oncotype.dk Thu Jun 14 11:59:57 2007 From: mattias at oncotype.dk (Mattias Bud) Date: Thu, 14 Jun 2007 17:59:57 +0200 Subject: [Ferret-talk] Sorting and getting occurrences of search in hit In-Reply-To: References: Message-ID: Doesn't anyone have any input on this? Mattias -- Posted via http://www.ruby-forum.com/. From mattias at oncotype.dk Thu Jun 14 12:22:53 2007 From: mattias at oncotype.dk (Mattias Bud) Date: Thu, 14 Jun 2007 18:22:53 +0200 Subject: [Ferret-talk] Sorting and getting occurrences of search in hit In-Reply-To: References: Message-ID: <670f67f110bfd258f061b6c29464902e@ruby-forum.com> Mattias Bud wrote: > Is there any way you could get the number of occurrences of the search > in one hit? > > In a result I get the ferret_rank and ferret_score but not how many hits > the search generated in the current record. > > I would also like to be able to sort after this when I search. > > /mattias Forgot the original message. Doesn't anyone have any input on this? mattias -- Posted via http://www.ruby-forum.com/. From mattias at oncotype.dk Thu Jun 14 12:35:12 2007 From: mattias at oncotype.dk (Mattias Bud) Date: Thu, 14 Jun 2007 18:35:12 +0200 Subject: [Ferret-talk] Errror on update after Model.rebuild_index In-Reply-To: <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> Message-ID: <07de1688845a663d14145de071fb0733@ruby-forum.com> Mattias Bud wrote: > Jens Kraemer wrote: >> On Fri, Jun 08, 2007 at 01:41:34PM +0200, Mattias Bud wrote: >>> >>> No - local index. >> >> is this in development mode with 1 mongrel instance? if yes, make sure >> your Mongrel doesn't do index updates while you run rebuild_index from >> the console. 
Even better, run these tests without mongrel running, or in >> another RAILS_ENV. >> >> With multiple mongrels, use the DRb server. >> >> Jens > > I turned off mongrel and rebuilt - and it worked. > > Thanks for the help When moving this to the production server it fails again. Here I have the same version of the gem and plug-in. The server runs lighttpd and fast-cgi. I can't re-index this class if it has :index => :no. If I remove this from the :fields it works again. It looks like this: acts_as_ferret({:store_class_name => true, :fields => { :title => {}, :date_comment_index => {:index => :no} } }, { :analyzer => Ferret::Analysis::StandardAnalyzer.new([]) }) As I said, if I remove :index => :no it works. Is the solution to change to DRb? mattias -- Posted via http://www.ruby-forum.com/. From henke at mac.se Thu Jun 14 16:58:01 2007 From: henke at mac.se (Henrik Zagerholm) Date: Thu, 14 Jun 2007 22:58:01 +0200 Subject: [Ferret-talk] API towards cFerret Message-ID: <14B1EE78-CE8D-449C-9CAD-61DF6F1A46D7@mac.se> Hello list, I know this is a little OT but I wonder how I would go about using cFerret directly through C/C++ for indexing. Right now I do index = Index::Index.new(:path => '/path/to/index') index << {:title => "Programming Ruby", :content => "blah blah blah"} How would I do this in C/C++ against cFerret? Thanks! //Henrik From seggy.umboh at gmail.com Thu Jun 14 19:00:46 2007 From: seggy.umboh at gmail.com (Seggy Umboh) Date: Fri, 15 Jun 2007 01:00:46 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode?
In-Reply-To: <942aeb6f627bd5901869ef92a9917a81@ruby-forum.com> References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> <942aeb6f627bd5901869ef92a9917a81@ruby-forum.com> Message-ID: <08db7a574fb6ff6230c02a67bcc61dc4@ruby-forum.com> Oh yeah I had that too, I just added another extension to create the directory if it does not exist. It might be more correct to rebuild the index for the model instead of just creating an empty directory, I'm not sure. Right now, I have moved back to the gem for production use so unfortunately I don't have the code that I used to do the above. Joerg Diekmann wrote: > Did it all, but the trunk version complains that it cannot find my > index/test/my_model directory when I run the test ... :-( > > > > > Seggy Umboh wrote: >> I should also add that this environment variable is only being read >> after http://projects.jkraemer.net/acts_as_ferret/changeset/180 >> >> So if you are using the gem, you will either need to backport this fix >> as an extension or use the plugin in trunk. >> >> >> Seggy Umboh wrote: >>> By setting the environment variable this way, it only takes effect for >>> that command, which is the ferret server. The server needs to use the >>> LocalIndex internally. >>> >>> Your regular Rails app will not have FERRET_USE_LOCAL_INDEX set, and >>> thus use RemoteIndex. >>> >>> Try it out... >>> >>> >>> Joerg Diekmann wrote: >>>> Hmm ... that means though that I am not using the remote server? >>>> >>>>> So, the solution, at least in my case, is to run the server with: >>>>> >>>>> FERRET_USE_LOCAL_INDEX=1 script/ferret_start >>>>> >>>>> Hope that helps... -- Posted via http://www.ruby-forum.com/. 
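[Editor's sketch.] Seggy no longer has the code for the "create the directory if it does not exist" workaround, so the following is only a guess at its general shape, not a reconstruction. The helper name ensure_index_dir is hypothetical, and the index/test/my_model layout is taken from Joerg's error message; how acts_as_ferret itself derives the index path is not shown in this thread.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: make sure the per-model index directory exists
# before anything tries to open it. Returns the directory path.
def ensure_index_dir(root, env, model_name)
  dir = File.join(root, "index", env, model_name)
  FileUtils.mkdir_p(dir) # does nothing if the directory is already there
  dir
end

# Try it out in a throwaway directory instead of a real RAILS_ROOT.
Dir.mktmpdir do |root|
  dir = ensure_index_dir(root, "test", "my_model")
  puts File.directory?(dir)
end
```

As Seggy notes, an empty directory is not the same thing as a rebuilt index, so this only addresses the missing-directory complaint.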
From kraemer at webit.de Fri Jun 15 03:35:06 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 15 Jun 2007 09:35:06 +0200 Subject: [Ferret-talk] Sorting and getting occurrences of search in hit In-Reply-To: <670f67f110bfd258f061b6c29464902e@ruby-forum.com> References: <670f67f110bfd258f061b6c29464902e@ruby-forum.com> Message-ID: <20070615073506.GC4661@cordoba.webit.de> On Thu, Jun 14, 2007 at 06:22:53PM +0200, Mattias Bud wrote: > Mattias Bud wrote: > > Is there any way you could get the number of occurrences of the search > > in one hit? > > > > In a result I get the ferret_rank and ferret_score but not how many hits > > the search generated in the current record. > > > > I would also like to be able to sort after this when I search. Do you mean the number of times the query occurs in a hit? There's no such function, and for non-trivial queries consisting of more than one term - what number should it return? Here's an example how you can retrieve how often a term occurs in a specific field of a document: http://ferret.davebalmain.com/api/classes/Ferret/Index/TermVector.html hth, Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Fri Jun 15 03:38:22 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 15 Jun 2007 09:38:22 +0200 Subject: [Ferret-talk] Errror on update after Model.rebuild_index In-Reply-To: <07de1688845a663d14145de071fb0733@ruby-forum.com> References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> Message-ID: <20070615073822.GD4661@cordoba.webit.de> On Thu, Jun 14, 2007 at 06:35:12PM +0200, Mattias Bud wrote: > Mattias Bud wrote: > > Jens Kraemer wrote: > >> On Fri, Jun 08, 2007 at 01:41:34PM +0200, Mattias Bud wrote: > >>> > >>> No - local index. > >> > >> is this in development mode with 1 mongrel instance? if yes, make sure > >> your Mongrel doesn't do index updates while you run rebuild_index from > >> the console. Even better, run these tests without mongrel running, or in > >> another RAILS_ENV. > >> > >> With multiple mongrels, use the DRb server. > >> > >> Jens > > > > I turned of mongrel and rebuilt - and it worked. > > > > Thanks for the help > > When moving this to the production server it fails again. Here i have > the same vesion of the gem and plug-in. The server runs lighttpd and > fast-cgi. i can't re-index this class if it has :index => :no. if I > remove this from the :fields it works again. > > Looks like this > acts_as_ferret({:store_class_name => true, > :fields => { :title => {}, > :date_comment_index => {:index => :no} > } > }, { :analyzer => > Ferret::Analysis::StandardAnalyzer.new([]) }) > > As I said, if i remove :index => :no it works. 
> > Is the solution to change to DRb? At least you should give it a try. However I cannot imaging how having a field indexed or not should influence rebuilding. Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Fri Jun 15 03:49:53 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 15 Jun 2007 09:49:53 +0200 Subject: [Ferret-talk] Do delegates work properly in Drb mode? In-Reply-To: <08db7a574fb6ff6230c02a67bcc61dc4@ruby-forum.com> References: <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> <942aeb6f627bd5901869ef92a9917a81@ruby-forum.com> <08db7a574fb6ff6230c02a67bcc61dc4@ruby-forum.com> Message-ID: <20070615074953.GE4661@cordoba.webit.de> On Fri, Jun 15, 2007 at 01:00:46AM +0200, Seggy Umboh wrote: > Oh yeah I had that too, I just added another extension to create the > directory if it does not exist. It might be more correct to rebuild the > index for the model instead of just creating an empty directory, I'm not > sure. > > Right now, I have moved back to the gem for production use so > unfortunately I don't have the code that I used to do the above. > > > Joerg Diekmann wrote: > > Did it all, but the trunk version complains that it cannot find my > > index/test/my_model directory when I run the test ... :-( I usually run Model.rebuild_index once per test class, or even in set_up before every test method. That will create the index dir. However the directory should be created by aaf automatically, I just committed a fix for this. Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Fri Jun 15 03:52:57 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 15 Jun 2007 09:52:57 +0200 Subject: [Ferret-talk] API towards cFerret In-Reply-To: <14B1EE78-CE8D-449C-9CAD-61DF6F1A46D7@mac.se> References: <14B1EE78-CE8D-449C-9CAD-61DF6F1A46D7@mac.se> Message-ID: <20070615075257.GF4661@cordoba.webit.de> On Thu, Jun 14, 2007 at 10:58:01PM +0200, Henrik Zagerholm wrote: > Hello list, > > I know this is a little OT but I wonder how I would go about using > cFerret directly through C/C++ for indexing. > > Right now I do > > index = Index::Index.new(:path => '/path/to/index') > > index << {:title => "Programming Ruby", :content => "blah blah blah"} > > How would I do this in C/C++ against cFerret? Have you had a look at Ferret's source tree? There are c unit tests, maybe these help to get you started? I think you're the first to try this, so good luck :-) Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From emanuele.vicentini at mayaidee.it Fri Jun 15 04:44:29 2007 From: emanuele.vicentini at mayaidee.it (Emanuele) Date: Fri, 15 Jun 2007 10:44:29 +0200 Subject: [Ferret-talk] =?utf-8?q?Is_DRb_server_*really*_required_in_produc?= =?utf-8?q?tion=09mode=3F?= In-Reply-To: <1181819251.27461.12.camel@localhost.localdomain> References: <2406bdec4bd9e6a11c4b2925318a6813@ruby-forum.com> <1181819251.27461.12.camel@localhost.localdomain> Message-ID: <50bb4fde09256e07ed6a13f3f3c84d35@ruby-forum.com> John Leach wrote: > So, Ferret is thread-safe yes, but Rails is not - each instance of Rails > fastcgi will be running in a separate process (with a separate Ferret et > al.) Doh! You're right, putting it in this way I understand the situation more clearly. > Whether it's a cluster of mongrels of a cluster of fastcgis, they are > separate processes either way. Thanks for your help. I'll give the DRb server a try, hoping for the best :-) -- Posted via http://www.ruby-forum.com/. From mattias at oncotype.dk Fri Jun 15 05:17:27 2007 From: mattias at oncotype.dk (Mattias Bud) Date: Fri, 15 Jun 2007 11:17:27 +0200 Subject: [Ferret-talk] Errror on update after Model.rebuild_index In-Reply-To: <20070615073822.GD4661@cordoba.webit.de> References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> Message-ID: <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com> Jens Kraemer wrote: > On Thu, Jun 14, 2007 at 06:35:12PM +0200, Mattias Bud wrote: >> >> >> fast-cgi. 
i can't re-index this class if it has :index => :no. if I >> As I said, if i remove :index => :no it works. >> >> Is the solution to change to DRb? > > At least you should give it a try. However I cannot imaging how having a > field indexed or not should influence rebuilding. > > > Jens Yes - it's very strange. This is what I get: ArgumentError (Argument Error occured at :93 in xraise Error occured in index.c:270 - fi_check_params You can't store the term vectors of an unindexed field ): /vendor/plugins/acts_as_ferret/lib/local_index.rb:217:in `add_field' Does this help? mattias -- Posted via http://www.ruby-forum.com/. From mattias at oncotype.dk Fri Jun 15 05:03:19 2007 From: mattias at oncotype.dk (Mattias Bodlund) Date: Fri, 15 Jun 2007 11:03:19 +0200 Subject: [Ferret-talk] Sorting and getting occurrences of search in hit In-Reply-To: <20070615073506.GC4661@cordoba.webit.de> References: <670f67f110bfd258f061b6c29464902e@ruby-forum.com> <20070615073506.GC4661@cordoba.webit.de> Message-ID: <31B5AD2B-04D3-487B-BD68-259D0CC04667@oncotype.dk> On 15/06/2007, at 9.35, Jens Kraemer wrote: > On Thu, Jun 14, 2007 at 06:22:53PM +0200, Mattias Bud wrote: >> Mattias Bud wrote: >>> Is there any way you could get the number of occurrences of the >>> search >>> in one hit? >>> >>> In a result I get the ferret_rank and ferret_score but not how >>> many hits >>> the search generated in the current record. >>> >>> I would also like to be able to sort after this when I search. > > Do you mean the number of times the query occurs in a hit? There's no > such function, and for non-trivial queries consisting of more than one > term - what number should it return? 
>
> Here's an example how you can retrieve how often a term occurs in a
> specific field of a document:
> http://ferret.davebalmain.com/api/classes/Ferret/Index/TermVector.html
>
>
> hth,
> Jens
>
>
> --

I know that this may be a bit off but the feature I'm looking for is
to be able to determine how often, let's say, a name occurs in the
complete index of an object. By default we sort results
chronologically - this is due to the context of the project.

Thanks - I'll look into the above.

mattias

From kraemer at webit.de Fri Jun 15 07:59:51 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 15 Jun 2007 13:59:51 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com>
Message-ID: <20070615115951.GG4661@cordoba.webit.de>

On Fri, Jun 15, 2007 at 11:17:27AM +0200, Mattias Bud wrote:
> Jens Kraemer wrote:
> > On Thu, Jun 14, 2007 at 06:35:12PM +0200, Mattias Bud wrote:
> >> >>
> >> fast-cgi. i can't re-index this class if it has :index => :no. if I
> >> As I said, if I remove :index => :no it works.
> >>
> >> Is the solution to change to DRb?
> >
> > At least you should give it a try. However I cannot imagine how having a
> > field indexed or not should influence rebuilding.
> >
> >
> > Jens
>
> Yes - it's very strange. This is what I get:
>
> ArgumentError (Argument Error occured at :93 in xraise
> Error occured in index.c:270 - fi_check_params
>     You can't store the term vectors of an unindexed field
>
> ):
> /vendor/plugins/acts_as_ferret/lib/local_index.rb:217:in `add_field'
>
> Does this help?

yeah, just do what it says - disable term vector storage by adding
:term_vector => :no to the field options for the :index => :no field.

AAF trunk now should handle this correctly :-)

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From andreas.korth at gmx.net Fri Jun 15 08:42:59 2007
From: andreas.korth at gmx.net (Andreas Korth)
Date: Fri, 15 Jun 2007 14:42:59 +0200
Subject: [Ferret-talk] [BUG] StandardTokenizer, multibyte and word boundaries
In-Reply-To: <20070615115951.GG4661@cordoba.webit.de>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com> <20070615115951.GG4661@cordoba.webit.de>
Message-ID: <14BA3CE8-FFFF-4D9F-B34F-F36EF031B796@gmx.net>

Hi,

I'm not sure if this has been brought up before:

I found a bug in StandardTokenizer which misinterprets non-ASCII
characters as word boundaries. This happens only with words that
contain non-alphanumeric characters. Consider this example:

The text 'Gerd Schröder Straße' is properly tokenized to:

    ["Gerd", "Schröder", "Straße"]

as well as 'Gerd-Schroeder-Strasse':

    ["Gerd-Schroeder-Strasse"]

but 'Gerd-Schröder-Straße' yields:

    ["Gerd-Schr", "öder-Stra", "ße"]

So apparently, multibyte and non-word characters don't mix...

Cheers,
Andy

From joergd at pobox.com Fri Jun 15 08:50:54 2007
From: joergd at pobox.com (Joerg Diekmann)
Date: Fri, 15 Jun 2007 14:50:54 +0200
Subject: [Ferret-talk] Do delegates work properly in Drb mode?
In-Reply-To: <20070615074953.GE4661@cordoba.webit.de>
References: <20070613135453.GG6418@cordoba.webit.de> <98aa3b5159b197b39cf5553b151b2f41@ruby-forum.com> <39d04cf9facca97f86d5e402f68832c5@ruby-forum.com> <963070a85ffbe2d2f825485fda31e3bb@ruby-forum.com> <1deaceeb47ebdce2ee45d717934f0c45@ruby-forum.com> <4b7d013eb0ca1f9e8a077635f6d7a9c1@ruby-forum.com> <942aeb6f627bd5901869ef92a9917a81@ruby-forum.com> <08db7a574fb6ff6230c02a67bcc61dc4@ruby-forum.com> <20070615074953.GE4661@cordoba.webit.de>
Message-ID: <7318df67d5e87a4167ed70035a98ad83@ruby-forum.com>

Thanks Jens.

> However the directory should be created by aaf automatically, I just
> committed a fix for this.
>
> Jens
>

--
Posted via http://www.ruby-forum.com/.

From alain.ravet+ferret at gmail.com Fri Jun 15 09:05:28 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Fri, 15 Jun 2007 15:05:28 +0200
Subject: [Ferret-talk] indexed 'text' (not string) column not used when searching, unless explicitely specified!!
Message-ID:

Hi all,

I've included a mysql 'text' column in my index, and aaf/Ferret doesn't
use it when I search, UNLESS I specify it as a restrictor, and then it
only searches in that field!

For example:
=> 0 results are returned by a standard (__all__ fields ?) search

    >> Entity.find_by_contents "zixi"
    Query: zixi
    total hits: 0, results delivered: 0

BUT if I search __only__ in 1 field, I find something

    >> Entity.find_by_contents "extra:zixi"
    Entity Load (0.001286) SELECT * FROM entities WHERE (entities.id in ('1723')) AND ( (entities.`type` = 'Person' ) )
    Query: extra:zixi
    total hits: 1, results delivered: 1
    => # true do |t|
    t.string "type"
    t.text "extra"
    t.datetime "created_at"
    ....

and my aaf setting :

file: Entity.rb

    acts_as_ferret :remote => true,
                   :fields => {
                     :name          => {:store => :yes, :boost => 1000},
                     :last_name     => {:store => :yes, :boost => 100},
                     :tags_names    => {:store => :yes, :boost => 30 },
                     :first_name    => {:store => :yes, :boost => 10 },
                     :type          => {:store => :yes},
                     :extra         => {:store => :yes},
                     :slaves_list   => {:store => :yes, :boost => 10 },
                     :masters_list  => {:store => :yes, :boost => 10 },
                     :function      => {:store => :yes},
                     :name_for_sort => {:index => :untokenized}
                   }

TIA

Alain

From my at mail.com Fri Jun 15 09:21:46 2007
From: my at mail.com (mike)
Date: Fri, 15 Jun 2007 15:21:46 +0200
Subject: [Ferret-talk] Ferret and capistrano, how to keep the indexes?
Message-ID:

Hi, I'm using capistrano to deploy the application, but every time I
deploy it replaces the whole directory, so I also lose the Ferret indexes.
Is it possible to keep them in order to prevent the reindex on each
deploy?

--
Posted via http://www.ruby-forum.com/.

From john at johnleach.co.uk Fri Jun 15 09:31:05 2007
From: john at johnleach.co.uk (John Leach)
Date: Fri, 15 Jun 2007 14:31:05 +0100
Subject: [Ferret-talk] Ferret and capistrano, how to keep the indexes?
In-Reply-To:
References:
Message-ID: <1181914265.7002.8.camel@localhost.localdomain>

sure, put the ferret index directory in the capistrano shared directory,
and have capistrano symlink it in. The same way the log/ and tmp/
directories are handled.

John.

On Fri, 2007-06-15 at 15:21 +0200, mike wrote:
> Hi, I'm using capistrano to deploy the application, but every time I
> deploy it replaces the whole directory, so I also lose the Ferret indexes.
> Is it possible to keep them in order to prevent the reindex on each
> deploy?
>
--
http://johnleach.co.uk

From alain.ravet+ferret at gmail.com Fri Jun 15 10:30:19 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Fri, 15 Jun 2007 16:30:19 +0200
Subject: [Ferret-talk] indexed 'text' (not string) column not used when searching, unless explicitely specified!!
In-Reply-To:
References:
Message-ID:

... additionally, I noticed that prefixing the query term with *:
'solves' the problem.

So,

    Person.find_by_contents "zixi"        -> NO RESULT : WRONG

but

    Person.find_by_contents "*:zixi"      -> 1 RESULT : CORRECT
    Person.find_by_contents "extra:zixi"  -> 1 RESULT : CORRECT

This hack would get trickier to implement on queries like :

    (a OR B) c name:d

so I'd rather do it the right way.

Why does this happen : bug, or feature?

Alain

From alain.ravet+ferret at gmail.com Fri Jun 15 10:35:26 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Fri, 15 Jun 2007 16:35:26 +0200
Subject: [Ferret-talk] Ferret and capistrano, how to keep the indexes?
In-Reply-To: <1181914265.7002.8.camel@localhost.localdomain>
References: <1181914265.7002.8.camel@localhost.localdomain>
Message-ID:

> sure, put the ferret index directory in the capistrano shared directory,
> and have capistrano symlink it in. The same way the log/ and tmp/
> directories are handled.

Example:

In my deploy.rb file, I added (edited) :

    desc "Set up the shared index"
    task :after_setup, :roles => [:app, :web] do
      run "mkdir -p -m 777 #{shared_path}/index"
    end

    desc "symlink the index"
    task :after_update, :roles => [:app, :web] do
      run "ln -nfs #{shared_path}/index #{current_release}/index"
    end

Alain

From mattias at oncotype.dk Fri Jun 15 11:16:49 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Fri, 15 Jun 2007 17:16:49 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <20070615115951.GG4661@cordoba.webit.de>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com> <20070615115951.GG4661@cordoba.webit.de>
Message-ID: <0cf96ec0495ef762d449b629d7182c1d@ruby-forum.com>

Jens Kraemer wrote:
> On Fri, Jun 15, 2007 at 11:17:27AM +0200, Mattias Bud wrote:
>> >
>> /vendor/plugins/acts_as_ferret/lib/local_index.rb:217:in `add_field'
>>
>> Does this help?
>
> yeah, just do what it says - disable term vector storage by adding
> :term_vector => :no to the field options for the :index => :no field.
>
> AAF trunk now should handle this correctly :-)
>
> Jens

I have the trunk version. This removes the error but the search doesn't
match what I was looking for. I have a large number of dates in
different formats.

Like this one 6.9.-8.9. 1796

When I search for 6.9.* - the result is blank. Maybe I'm missing
something here.

mattias

--
Posted via http://www.ruby-forum.com/.

From henke at mac.se Sat Jun 16 10:39:19 2007
From: henke at mac.se (Henrik Zagerholm)
Date: Sat, 16 Jun 2007 16:39:19 +0200
Subject: [Ferret-talk] API towards cFerret
In-Reply-To: <20070615075257.GF4661@cordoba.webit.de>
References: <14B1EE78-CE8D-449C-9CAD-61DF6F1A46D7@mac.se> <20070615075257.GF4661@cordoba.webit.de>
Message-ID: <01EBD150-4E80-4C69-92C5-1447E72AEDEF@mac.se>

On 15 Jun 2007, at 09:52, Jens Kraemer wrote:

> On Thu, Jun 14, 2007 at 10:58:01PM +0200, Henrik Zagerholm wrote:
>> Hello list,
>>
>> I know this is a little OT but I wonder how I would go about using
>> cFerret directly through C/C++ for indexing.
>>
>> Right now I do
>>
>> index = Index::Index.new(:path => '/path/to/index')
>>
>> index << {:title => "Programming Ruby", :content => "blah blah blah"}
>>
>> How would I do this in C/C++ against cFerret?
>
> Have you had a look at Ferret's source tree? There are c unit tests,
> maybe these help to get you started?
>
Yes, I looked at these tests, but instead of interfacing directly to the
files used by the tests I would like to interface directly through
ferret.h. I'll do some more research and start to do some coding to see
what I come up with. Just wanted to see if anyone had already done it.
:)

> I think you're the first to try this, so good luck :-)
>
Thanks! :)
> Jens
>
>
> --
> Jens Krämer
> webit! Gesellschaft für neue Medien mbH
> Schnorrstraße 76 | 01069 Dresden
> Telefon +49 351 46766-0 | Telefax +49 351 46766-66
> kraemer at webit.de | www.webit.de
>
> Amtsgericht Dresden | HRB 15422
> GF Sven Haubold, Hagen Malessa
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk

From john at digitalpulp.com Sat Jun 16 14:09:36 2007
From: john at digitalpulp.com (John Bachir)
Date: Sat, 16 Jun 2007 14:09:36 -0400
Subject: [Ferret-talk] Sorting and getting occurrences of search in hit
In-Reply-To: <31B5AD2B-04D3-487B-BD68-259D0CC04667@oncotype.dk>
References: <670f67f110bfd258f061b6c29464902e@ruby-forum.com> <20070615073506.GC4661@cordoba.webit.de> <31B5AD2B-04D3-487B-BD68-259D0CC04667@oncotype.dk>
Message-ID:

On Jun 15, 2007, at 5:03 AM, Mattias Bodlund wrote:
>
> I know that this may be a bit off but the feature I'm looking for is
> to be able to determine how often, let's say, a name occurs in the
> complete index of an object. By default we sort results
> chronologically - this is due to the context of the project.

look into things like...

    YourModel.aaf_index.ferret_index.reader.terms(:your_field_name)

and other methods on the index reader:

http://ferret.davebalmain.com/api/classes/Ferret/Index/IndexReader.html

From john at digitalpulp.com Sat Jun 16 14:59:14 2007
From: john at digitalpulp.com (John Bachir)
Date: Sat, 16 Jun 2007 14:59:14 -0400
Subject: [Ferret-talk] more specific queries via IndexReader
Message-ID: <6DD74FFB-446D-4BA2-846B-AD83444A1927@digitalpulp.com>

We would like to show a list of "most recently added terms", meaning,
the results of this query:

    Resource.aaf_index.ferret_index.reader.terms(:summary)

BUT, only returning terms from a certain set of documents (in our
case, we are going to filter by creation date).

Is this possible?

Thanks,
John

From john at digitalpulp.com Sat Jun 16 15:20:27 2007
From: john at digitalpulp.com (John Bachir)
Date: Sat, 16 Jun 2007 15:20:27 -0400
Subject: [Ferret-talk] more specific queries via IndexReader
In-Reply-To: <6DD74FFB-446D-4BA2-846B-AD83444A1927@digitalpulp.com>
References: <6DD74FFB-446D-4BA2-846B-AD83444A1927@digitalpulp.com>
Message-ID: <7C99D0CB-4289-49F2-AC2B-4D9F4D5474E8@digitalpulp.com>

On Jun 16, 2007, at 2:59 PM, John Bachir wrote:

> We would like to show a list of "most recently added terms", meaning,
> the results of this query:
>
> Resource.aaf_index.ferret_index.reader.terms(:summary)
>
> BUT, only returning terms from a certain set of documents (in our
> case, we are going to filter by creation date).
>
> Is this possible?
>

Actually I just discovered that when using the drb server, the
ferret_index cannot be accessed. So I imagine it is impossible to
access the index's terms at all?

John

From syrius.ml at no-log.org Sun Jun 17 11:40:48 2007
From: syrius.ml at no-log.org (syrius.ml at no-log.org)
Date: Sun, 17 Jun 2007 17:40:48 +0200
Subject: [Ferret-talk] highlighting and range queries
Message-ID: <87lkeilhi8.87k5u2lhi8@87ir9mlhi8.message.id>

Hi there,

Is highlighting for range queries supposed to work?
It doesn't work here.

here is a non-working example:
(highlighting works when q="test:2007*")

require 'ferret'
include Ferret
index = Index::Index.new()
#index.field_infos.add_field(:test, :store => :yes, :index => :untokenized)
i=1
for a in [ "20070505", "20071230", "20060920", "20081111" ]
  index << {:id => i, :test => a}
  i+=1
end

for q in [ 'test:( >= 20070101)', 'test:2007*', Ferret::Search::RangeQuery.new(:test, :>= => "20070101") ]
  index.search_each(q) do |id, score|
    puts "Document #{index[id][:test]} found with a score of #{score}"
    highlights = index.highlight(q, id, :field => :test, :pre_tag => "\033[36m", :post_tag => "\033[m")
    puts highlights
  end
  puts "------"
end

--

From james.bebbington at gmail.com Mon Jun 18 04:57:51 2007
From: james.bebbington at gmail.com (James Bebbington)
Date: Mon, 18 Jun 2007 10:57:51 +0200
Subject: [Ferret-talk] "No such file or directory - script" Error on Model.rebuild
Message-ID: <0fac9ae0747a8195536b2308b71cd2d3@ruby-forum.com>

Hi,

Having recently converted to using ferret_server on my staging site my
deployment is now failing due to the following error when attempting to
rebuild the indexes on my models:

        from (irb):1>> Post.rebuild_index
Errno::ENOENT: No such file or directory - script
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:243:in `mkdir'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:243:in `fu_mkdir'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:217:in `mkdir_p'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:215:in `reverse_each'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:215:in `mkdir_p'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:201:in `each'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/1.8/fileutils.rb:201:in `mkdir_p'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.3/lib/ferret/index.rb:120:in `new'
        from (druby://127.2.0.1:9100)
/usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.3/lib/ferret/index.rb:120:in `initialize'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/local_index.rb:43:in `new'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/local_index.rb:43:in `rebuild_index'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/ferret_server.rb:68:in `send'
        from (druby://127.2.0.1:9100) /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/ferret_server.rb:68:in `method_missing'
        from /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/remote_index.rb:16:in `send'
        from /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/remote_index.rb:16:in `method_missing'
        from /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.0/lib/class_methods.rb:15:in `rebuild_index'

The rebuild did work for a while (week or so) after converting to using
DRb but now it's complaining about this 'script' directory, anyone have
ideas?

Thanks,
fractious.

--
Posted via http://www.ruby-forum.com/.

From alainravet-spam2004 at yahoo.com Mon Jun 18 06:08:40 2007
From: alainravet-spam2004 at yahoo.com (Alain Ravet)
Date: Mon, 18 Jun 2007 12:08:40 +0200
Subject: [Ferret-talk] disabling automatic indexing in acts_as_ferret
In-Reply-To: <1156504226.6728.4.camel@localhost.localdomain>
References: <1156504226.6728.4.camel@localhost.localdomain>
Message-ID: <27f321359e87ddaafeab81876a145987@ruby-forum.com>

> I'd like to be able to enable/disable the automatic indexing of

I miss this feature badly: I need to disable aaf at the model level,
before bulk-modifying the 150K rows of a table, and rebuilding the index
afterwards.

I wish I could write code like :

    Account.disable_ferret
    ... modify all the accounts in bulk
    Account.enable_ferret
    Account.rebuild_index

Is there a best practice/semi-official hack to achieve this goal?

Alain

--
Posted via http://www.ruby-forum.com/.
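
[Editor's sketch, not part of the original post.] The class-level switch asked for above could look roughly like the following in plain Ruby. Nothing here calls acts_as_ferret: `IndexingSwitch`, `FakeAccount`, `disable_indexing`, `enable_indexing` and `without_indexing` are hypothetical names, and the only assumption about the plugin is the one made in a later reply in this thread, namely that a record is asked through a `ferret_enabled?` instance method whether it should be indexed (the real method's signature may differ).

```ruby
# Hypothetical mixin: a class-level on/off flag that an instance-level
# ferret_enabled? check consults. Not part of acts_as_ferret.
module IndexingSwitch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def disable_indexing
      @indexing_disabled = true
    end

    def enable_indexing
      @indexing_disabled = false
    end

    def indexing_enabled?
      !@indexing_disabled
    end

    # Run a block with indexing switched off, then restore the old state.
    def without_indexing
      previous = @indexing_disabled
      @indexing_disabled = true
      yield
    ensure
      @indexing_disabled = previous
    end
  end

  # Assumption: the indexing callback asks the record this question.
  # Extra arguments are accepted and ignored.
  def ferret_enabled?(*_ignored)
    self.class.indexing_enabled?
  end
end

# Stand-in for a model; "indexing" is simulated by appending to a list.
class FakeAccount
  include IndexingSwitch

  INDEXED = []

  def initialize(name)
    @name = name
  end

  def save
    INDEXED << @name if ferret_enabled?
    true
  end
end
```

With something like this in place, a bulk change would be wrapped as `FakeAccount.without_indexing { ... }`, followed by a single index rebuild once the block has finished.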

From boyleme at verizon.net Mon Jun 18 07:51:48 2007
From: boyleme at verizon.net (Mike Boyle)
Date: Mon, 18 Jun 2007 13:51:48 +0200
Subject: [Ferret-talk] Solaris - make failure
Message-ID:

Getting the following error when installing the ferret-0.11.4.gem on a
Solaris box. Any ideas on how to fix this or what is going wrong?

In file included from q_multi_term.c:2:
search.h:716: field 'comparables' has incomplete type
q_multi_term.c: In function 'multi-tq_new_conf':
q_multi_term.c:634: '__func__' undeclared (first use in this function)
q_multi_term.c:634: (Each undeclared identifier is reported only once
q_mulit_term.c:634: for each function it appears in.)
make: Fatal error: Command failed for target 'q_multi_term.o'

rub extconf.rb install ferret-0.11.4.gem
creating Makefile

make
gcc -I. -I/xxx/yyy/ruby/lib/ruby/1.8/sparc-solaris2.10 -I/xxx/yyy/ruby
/lib/ruby/1.8/sparc-solaris2.10 -I. -fPIC -g -02 -D_File_OFFSET_BITS=64 -c q_mu
lit_term.c
*** Error code 1

make install
gcc -I. -I/xxx/yyy/ruby/lib/ruby/1.8/sparc-solaris2.10 -I/xxx/yyy/ruby
/lib/ruby/1.8/sparc-solaris2.10 -I. -fPIC -g -02 -D_File_OFFSET_BITS=64 -c q_mu
lit_term.c
*** Error code 1

--
Posted via http://www.ruby-forum.com/.

From andreas.korth at gmx.net Mon Jun 18 08:14:07 2007
From: andreas.korth at gmx.net (Andreas Korth)
Date: Mon, 18 Jun 2007 14:14:07 +0200
Subject: [Ferret-talk] disabling automatic indexing in acts_as_ferret
In-Reply-To: <27f321359e87ddaafeab81876a145987@ruby-forum.com>
References: <1156504226.6728.4.camel@localhost.localdomain> <27f321359e87ddaafeab81876a145987@ruby-forum.com>
Message-ID: <0B262144-E954-4743-A5C5-B0DC5F2AFA00@gmx.net>

On 18.06.2007, at 12:08, Alain Ravet wrote:

>
>> I'd like to be able to enable/disable the automatic indexing of
>
> I miss this feature badly: I need to disable aaf at the model level,
> before bulk-modifying the 150K rows of a table, and rebuilding the
> index afterwards.

You can enable/disable AAF on the object level:

http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage#Disablingautomaticindexing

I don't understand why it's done that way. Disabling indexing on the
class level would make a whole lot more sense.

There you go.

Cheers,
Andy

From kraemer at webit.de Mon Jun 18 14:30:05 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 18 Jun 2007 20:30:05 +0200
Subject: [Ferret-talk] "No such file or directory - script" Error on Model.rebuild
In-Reply-To: <0fac9ae0747a8195536b2308b71cd2d3@ruby-forum.com>
References: <0fac9ae0747a8195536b2308b71cd2d3@ruby-forum.com>
Message-ID: <20070618183004.GB24322@cordoba.webit.de>

On Mon, Jun 18, 2007 at 10:57:51AM +0200, James Bebbington wrote:
> Hi,
>
> Having recently converted to using ferret_server on my staging site my
> deployment is now failing due to the following error when attempting to
> rebuild the indexes on my models:
>
> from (irb):1>> Post.rebuild_index
> Errno::ENOENT: No such file or directory - script

looks like you launch the rebuild from some directory other than
RAILS_ROOT, could you check this somehow?

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From kraemer at webit.de Mon Jun 18 14:34:56 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 18 Jun 2007 20:34:56 +0200
Subject: [Ferret-talk] disabling automatic indexing in acts_as_ferret
In-Reply-To: <0B262144-E954-4743-A5C5-B0DC5F2AFA00@gmx.net>
References: <1156504226.6728.4.camel@localhost.localdomain> <27f321359e87ddaafeab81876a145987@ruby-forum.com> <0B262144-E954-4743-A5C5-B0DC5F2AFA00@gmx.net>
Message-ID: <20070618183456.GC24322@cordoba.webit.de>

On Mon, Jun 18, 2007 at 02:14:07PM +0200, Andreas Korth wrote:
>
> On 18.06.2007, at 12:08, Alain Ravet wrote:
>
> >
> >> I'd like to be able to enable/disable the automatic indexing of
> >
> > I miss this feature badly: I need to disable aaf at the model level,
> > before bulk-modifying the 150K rows of a table, and rebuilding the
> > index afterwards.
>
> You can enable/disable AAF on the object level:
>
> http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage#Disablingautomaticindexing
>
> I don't understand why it's done that way. Disabling indexing on the
> class level would make a whole lot more sense.

just override the ferret_enabled?(bool) instance method to check for a
class variable holding your class level disable-flag...

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From kraemer at webit.de Mon Jun 18 15:54:15 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 18 Jun 2007 21:54:15 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <0cf96ec0495ef762d449b629d7182c1d@ruby-forum.com>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com> <20070615115951.GG4661@cordoba.webit.de> <0cf96ec0495ef762d449b629d7182c1d@ruby-forum.com>
Message-ID: <20070618195415.GD4052@cordoba.webit.de>

On Fri, Jun 15, 2007 at 05:16:49PM +0200, Mattias Bud wrote:
> Jens Kraemer wrote:
> > On Fri, Jun 15, 2007 at 11:17:27AM +0200, Mattias Bud wrote:
> >> >
> >> /vendor/plugins/acts_as_ferret/lib/local_index.rb:217:in `add_field'
> >>
> >> Does this help?
> >
> > yeah, just do what it says - disable term vector storage by adding
> > :term_vector => :no to the field options for the :index => :no field.
> >
> > AAF trunk now should handle this correctly :-)
> >
> > Jens
>
> I have the trunk version. This removes the error but the search doesn't
> match what I was looking for. I have a large number of dates in
> different formats.
>
> Like this one 6.9.-8.9. 1796
>
> When I search for 6.9.* - the result is blank. Maybe I'm missing
> something here.

if this date string is in the date_comment_index field, you won't find
it as long as you say :index => :no. The correct choice for fields you
want to search, but not to be tokenized is :index => :untokenized.

:index => :no is only useful for fields you never want to search, but
which should hold data to be retrieved later on.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From mattias at oncotype.dk Mon Jun 18 16:03:19 2007
From: mattias at oncotype.dk (Mattias Bud)
Date: Mon, 18 Jun 2007 22:03:19 +0200
Subject: [Ferret-talk] Errror on update after Model.rebuild_index
In-Reply-To: <20070618195415.GD4052@cordoba.webit.de>
References: <69c947c05439b56163d3aa7bc715de7e@ruby-forum.com> <929D5B35-83FC-42C8-AB01-3C97329195F5@benjaminkrause.com> <4b7e279af2ca289855f23a2741d544c3@ruby-forum.com> <20070608115803.GC23116@cordoba.webit.de> <9f348078fae49f6b0bcc534fed11a207@ruby-forum.com> <07de1688845a663d14145de071fb0733@ruby-forum.com> <20070615073822.GD4661@cordoba.webit.de> <0d964b1cd78578ae84289eeb8bd06921@ruby-forum.com> <20070615115951.GG4661@cordoba.webit.de> <0cf96ec0495ef762d449b629d7182c1d@ruby-forum.com> <20070618195415.GD4052@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Fri, Jun 15, 2007 at 05:16:49PM +0200, Mattias Bud wrote:
>> > AAF trunk now should handle this correctly :-)
>> something here.
> if this date string is in the date_comment_index field, you won't find
> it as long as you say :index => :no. The correct choice for fields you
> want to search, but not to be tokenized is :index => :untokenized.
>
> :index => :no is only useful for fields you never want to search, but
> which should hold data to be retrieved later on.
>
> Jens
>

Yes - I ran into that. After a lot of tests I now use
untokenized_omit_norms, which seems to give the correct result. Now I
only have to index the date_comment twice. It also holds uncertain dates
like Around 1820.
These should be searchable with "Around" (find all uncertain dates).

So now I index

    :date_comment_index => {},
    :date_comment_index_untokenized => {:index => :untokenized_omit_norms}

mattias

--
Posted via http://www.ruby-forum.com/.

From alain.ravet+ferret at gmail.com Mon Jun 18 17:22:33 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Mon, 18 Jun 2007 23:22:33 +0200
Subject: [Ferret-talk] disabling automatic indexing in acts_as_ferret
In-Reply-To: <20070618183456.GC24322@cordoba.webit.de>
References: <1156504226.6728.4.camel@localhost.localdomain> <27f321359e87ddaafeab81876a145987@ruby-forum.com> <0B262144-E954-4743-A5C5-B0DC5F2AFA00@gmx.net> <20070618183456.GC24322@cordoba.webit.de>
Message-ID:

> just override the ferret_enabled?(bool) instance method to check for a
> class variable holding your class level disable-flag...

I know, but it's ugly, and indirect, and really not the way to do it
for this kind of use case: bulk change, bulk import, ...
A class method is needed :

    Model.disable_indexing
    # bulk change
    Model.enable_indexing(true) # true => trigger a rebuild_index

Alain

From james.bebbington at gmail.com Tue Jun 19 04:43:28 2007
From: james.bebbington at gmail.com (James Bebbington)
Date: Tue, 19 Jun 2007 10:43:28 +0200
Subject: [Ferret-talk] "No such file or directory - script" Error on Model.rebu
In-Reply-To: <20070618183004.GB24322@cordoba.webit.de>
References: <0fac9ae0747a8195536b2308b71cd2d3@ruby-forum.com> <20070618183004.GB24322@cordoba.webit.de>
Message-ID: <9395bee041d71e9365e526fdb74216e0@ruby-forum.com>

Jens Kraemer wrote:
> looks like you launch the rebuild from some directory other than
> RAILS_ROOT, could you check this somehow?

Thanks for getting back to me Jens,

That output was from ./script/console. The same error was being thrown
in the application too when attempting to do anything on the site that
involved ferret. I restarted the ferret_server (which I'm sure I tried
before) and everything's working fine now.
Odd, but I'm happy as long as it doesn't happen again :)

Cheers,
fractious.

--
Posted via http://www.ruby-forum.com/.

From bagam_venkat at hotmail.com Tue Jun 19 07:12:01 2007
From: bagam_venkat at hotmail.com (Venkat Bagam)
Date: Tue, 19 Jun 2007 13:12:01 +0200
Subject: [Ferret-talk] offline installation of acts_as_ferret gem
Message-ID: <5bb8a2938cdba5cad72d4aa9d6e4eae5@ruby-forum.com>

hi folks,

I have been trying to use acts_as_ferret in my rails app. I have
installed the ferret gem. I would like to download the acts_as_ferret
gem and then install it offline, but I was unable to find any link for
download. Please let me know if anyone has done it before...

any help appreciated... thanks in advance....

regards,
venkat

--
Posted via http://www.ruby-forum.com/.

From syrius.ml at no-log.org Tue Jun 19 18:30:03 2007
From: syrius.ml at no-log.org (syrius.ml at no-log.org)
Date: Wed, 20 Jun 2007 00:30:03 +0200
Subject: [Ferret-talk] another issue with highlighting
Message-ID: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id>

Hi,

I'm encountering another highlighting issue.
(About the first one, "range search and highlighting", I received no
response. I don't even know if somebody tried to reproduce it and/or if
it's normal behavior.)

About the new issue, an example will be easier for you to reproduce:
I'm filling an index with random data, I try to match for "*1*" and
then highlight the matched tokens. If it's matched and not highlighted
I put it in z.

It works as expected when there are 100 entries (replace 500.times by
100.times); in that case z contains empty arrays.
With 500 entries it doesn't highlight every match!

This example has been tested with 0.11.4
(r770 has been tested with the application I first discovered this
issue with)

I would appreciate it if you could test and tell me if I'm the only one
having this problem.

TIA

require 'ferret'
include Ferret

# filling
index=Index::Index.new(:path => '/tmp/test')
chars1 = chars2 = chars3 = chars4 = ("a".."z").to_a + ("0".."9").to_a
chars2.concat(["-", "_", " "])
chars3 << " "
chars4 << "-"
chars5 = chars6 = ("0".."9").to_a
chars6 << "."
500.times do
  z={}
  t=""
  1.upto(15+rand(10)) { |i| t << chars4[rand(chars4.size-1)] }
  z[:un] = t
  t=""
  1.upto(40+rand(40)) { |i| t << chars2[rand(chars2.size-1)] }
  z[:deux] = t
  t=""
  1.upto(30+rand(10)) { |i| t << chars4[rand(chars4.size-1)] }
  z[:trois] = t
  t=""
  1.upto(30+rand(10)) { |i| t << chars1[rand(chars1.size-1)] }
  z[:quatre] = t
  t=""
  1.upto(30+rand(10)) { |i| t << chars2[rand(chars2.size-1)] }
  z[:cinq] = t
  t=""
  1.upto(12) { |i| t << chars5[rand(chars5.size-1)] }
  z[:six] = t
  t=""
  1.upto(12) { |i| t << chars6[rand(chars6.size-1)] }
  z[:sept] = t
  t=""
  1.upto(12) { |i| t << chars6[rand(chars6.size-1)] }
  z[:huit] = t
  t=""
  1.upto(24+rand(24)) { |i| t << chars3[rand(chars3.size-1)] }
  z[:neuf] = t
  t=""
  1.upto(100+rand(100)) { |i| t << chars2[rand(chars2.size-1)] }
  z[:dix] = t
  index << z
end

#testing
q="*1*"
z={}
index.search_each(q,:limit => :all) do |id,score|
  for b in [:un, :deux, :trois, :quatre, :cinq, :six, :sept, :huit, :neuf, :dix]
    z[b]=[] if not z[b]
    z[b] << id.to_s + " : " + index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all).join(" | ") if index[id][b].match(/1/) and index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all) and not index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all).join(" | ").match(//)
  end
end
z
index.search("*",:limit => :all).total_hits

--

From kraemer at webit.de Wed Jun 20 04:44:50 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 20 Jun 2007 10:44:50 +0200
Subject: [Ferret-talk] offline installation of acts_as_ferret gem
In-Reply-To: <5bb8a2938cdba5cad72d4aa9d6e4eae5@ruby-forum.com>
References: <5bb8a2938cdba5cad72d4aa9d6e4eae5@ruby-forum.com> Message-ID: <20070620084450.GB22469@cordoba.webit.de> On Tue, Jun 19, 2007 at 01:12:01PM +0200, Venkat Bagam wrote: > hi folks, > > I have been trying to use acts_as_ferret in my rails app.I > have installed the ferret gem.I would like to download acts_as_ferret > gem and then install it offline.But i was unable to find any link for > download..please let me know if any one has did it before... > > any help appreciated...thanks in advance.... there's a rubyforge project at http://rubyforge.org/projects/actsasferret where you can download the gem. cheers, Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Wed Jun 20 05:26:23 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 20 Jun 2007 11:26:23 +0200 Subject: [Ferret-talk] highlighting and range queries In-Reply-To: <87lkeilhi8.87k5u2lhi8@87ir9mlhi8.message.id> References: <87lkeilhi8.87k5u2lhi8@87ir9mlhi8.message.id> Message-ID: <20070620092622.GE22469@cordoba.webit.de> On Sun, Jun 17, 2007 at 05:40:48PM +0200, syrius.ml at no-log.org wrote: > > Hi there, > > Is highlighting for range queries supposed to work ? > It doesn't work here. Your test doesn't work for me, too - looks like highlighting is just not yet implemented for RangeQueries... Jens -- Jens Krämer webit!
Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Wed Jun 20 05:49:17 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 20 Jun 2007 11:49:17 +0200 Subject: [Ferret-talk] :store => :yes doesn't work in some cases In-Reply-To: <3718a6efdb1006fc97cfc328b27f2023@ruby-forum.com> References: <3718a6efdb1006fc97cfc328b27f2023@ruby-forum.com> Message-ID: <20070620094917.GF22469@cordoba.webit.de> Hi! Imho the highlight method is supposed to return nil when nothing to highlight is there. In this case just retrieve the content of the field with result.name or doc[:name] if working with Ferret directly. I just checked with a plain Ferret script and it had no problems retrieving field contents that were just a stop word. If we don't get this to work there might be an aaf bug, though ;-) On Thu, Jun 07, 2007 at 04:54:57PM +0200, Jesse Grosjean wrote: > I'm not really sure if this is a bug, but it makes my search results > look a bit strange. I have an acts_as_ferret declaration that looks > like: > > acts_as_ferret :store_class_name => true, :remote => true, :fields => > { > :ferret_name => { :store => :yes, :boost => 2 }, > :ferret_content => { :store => :yes } } > [..] > Then when I do a search for 'rails' the result will be found, but the > results highlighted_name will be blank. So I don't see the name of the > model in the result. This seems to be a special case, because generally > words like "about" and "the" that are skipped by the tokenizer will > still be stored when :store => :yes when they are in a phrase. > > I hope that makes some sense. For now I can get around the problem by > checking for the blank case and loading the value from the model > directly, but things would be easier if :store => :yes would just always > store the field value.
What do you mean with 'loading the value directly'? If you used aaf's :lazy => true option when searching, aaf would by default not query your db in the first place, and only do so if you ask for a non-stored field. You can read more about this feature there: http://www.jkraemer.net/2007/3/26/lazy-loading-with-acts_as_ferret > > Thanks for any help, > Jesse > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Wed Jun 20 05:53:15 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 20 Jun 2007 11:53:15 +0200 Subject: [Ferret-talk] indexed 'text' (not string) column not used when searching, unless explicitely specified!! In-Reply-To: References: Message-ID: <20070620095315.GG22469@cordoba.webit.de> Hi Alain, could you please look in your application's log for a line reading 'default field list: ...' and see what's in there? cheers, Jens On Fri, Jun 15, 2007 at 04:30:19PM +0200, Alain Ravet wrote: > ... additionally, I noticed that prefixing the query term with *: > 'solves' the problem. > > So, > Person.find_by_contents "zixi" -> NO RESULT : WRONG > > but > Person.find_by_contents "*:zixi" -> 1 RESULT : CORRECT > Person.find_by_contents "extra:zixi" -> 1 RESULT : CORRECT > > > This hack would get trickier to implement on queries like : > (a OR B) c name:d > so I'd rather do it the right way. > > Why does this happen : bug, or feature? > > > Alain > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Krämer webit!
Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From bagam_venkat at hotmail.com Wed Jun 20 07:40:47 2007 From: bagam_venkat at hotmail.com (venkat) Date: Wed, 20 Jun 2007 13:40:47 +0200 Subject: [Ferret-talk] offline installation of acts_as_ferret gem In-Reply-To: <20070620084450.GB22469@cordoba.webit.de> References: <5bb8a2938cdba5cad72d4aa9d6e4eae5@ruby-forum.com> <20070620084450.GB22469@cordoba.webit.de> Message-ID: <2c2b9930a74d868f0d26eeeaf732e440@ruby-forum.com> Jens Kraemer wrote: > On Tue, Jun 19, 2007 at 01:12:01PM +0200, Venkat Bagam wrote: >> hi folks, >> >> I have been trying to use acts_as_ferret in my rails app.I >> have installed the ferret gem.I would like to download acts_as_ferret >> gem and then install it offline.But i was unable to find any link for >> download..please let me know if any one has did it before... >> >> any help appreciated...thanks in advance.... > > there's a rubyforge project at > http://rubyforge.org/projects/actsasferret where you can download the > gem. > > cheers, > Jens > > > -- > Jens Krämer > webit! Gesellschaft für neue Medien mbH > Schnorrstraße 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa Hi Jens Cheers... I found it.. thanks a lot.. havva nice time.. bye -- Posted via http://www.ruby-forum.com/. From novaprospekt at gmail.com Wed Jun 20 12:19:42 2007 From: novaprospekt at gmail.com (Richard) Date: Wed, 20 Jun 2007 18:19:42 +0200 Subject: [Ferret-talk] Count_by_content ?? Message-ID: <85ebf7ab8205116579b07d54898fd630@ruby-forum.com> Is there a count_by_content alternative to the find_by_content action?
This is because I'm wanting to do the following in my pagination method: def list # step 1: set the variables you'll need page = (params[:page] ||= 1).to_i items_per_page = 20 offset = (page - 1) * items_per_page # step 2: instead of performing a find, just get a count item_count = Item.count_with_some_custom_method() # step 3: create a Paginator, the second argument has to be the number of ALL items on all pages @item_pages = Paginator.new(self, item_count, items_per_page, page) # step 4: only find the requested subset of @items @items = Item.find_with_some_custom_method(items_per_page, offset) end -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Jun 20 15:44:54 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 20 Jun 2007 21:44:54 +0200 Subject: [Ferret-talk] Count_by_content ?? In-Reply-To: <85ebf7ab8205116579b07d54898fd630@ruby-forum.com> References: <85ebf7ab8205116579b07d54898fd630@ruby-forum.com> Message-ID: <20070620194454.GA22907@cordoba.webit.de> On Wed, Jun 20, 2007 at 06:19:42PM +0200, Richard wrote: > Is there a count_by_content alternative to the find_by_content action? there is a method total_hits you can call, but this is not really necessary. just use find_by_contents with the :limit and :offset options and call total_hits on the result returned to find out the total number of results. 
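A minimal sketch of the approach described in the reply above, assuming the search result exposes total_hits as stated there. FakeSearchResults and fake_find_by_contents are hypothetical stand-ins, not part of acts_as_ferret, so that the snippet runs without Ferret or Rails; in a real controller the call would presumably be Item.find_by_contents(query, :limit => items_per_page, :offset => offset).

```ruby
# Hypothetical stand-in for the array-like result of find_by_contents;
# only total_hits (mentioned in the reply above) is modelled.
class FakeSearchResults < Array
  attr_reader :total_hits

  def initialize(records, total_hits)
    super(records)
    @total_hits = total_hits
  end
end

# Hypothetical stand-in for Item.find_by_contents(query, :limit => n, :offset => m).
# It "searches" an in-memory list of strings by substring match.
def fake_find_by_contents(all_items, query, options = {})
  matches = all_items.select { |item| item.include?(query) }
  offset  = options[:offset] || 0
  limit   = options[:limit] || matches.size
  FakeSearchResults.new(matches[offset, limit] || [], matches.size)
end

# One search call yields both the requested page and the overall hit count,
# so no separate count query is needed for the paginator.
def paginate_search(all_items, query, page, items_per_page)
  offset  = (page - 1) * items_per_page
  results = fake_find_by_contents(all_items, query,
                                  :limit => items_per_page, :offset => offset)
  page_count = (results.total_hits + items_per_page - 1) / items_per_page
  { :items => results.to_a,
    :total_hits => results.total_hits,
    :page_count => page_count }
end

docs = (1..45).map { |i| "ferret doc #{i}" } + ["unrelated"]
page = paginate_search(docs, "ferret", 3, 20)
puts "hits: #{page[:total_hits]}, pages: #{page[:page_count]}, on this page: #{page[:items].size}"
```

The total_hits value would then be passed to the Paginator as the item count, replacing the separate count step in the original list method.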
Jens > > This is because I'm wanting to do the following in my pagination method: > > def list > # step 1: set the variables you'll need > page = (params[:page] ||= 1).to_i > items_per_page = 20 > offset = (page - 1) * items_per_page > > # step 2: instead of performing a find, just get a count > item_count = Item.count_with_some_custom_method() > > # step 3: create a Paginator, the second argument has to be the > number of ALL items on all pages > @item_pages = Paginator.new(self, item_count, items_per_page, page) > > # step 4: only find the requested subset of @items > @items = Item.find_with_some_custom_method(items_per_page, offset) > end > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa
From drhenner at yahoo.com Fri Jun 22 04:50:07 2007 From: drhenner at yahoo.com (David Henner) Date: Fri, 22 Jun 2007 10:50:07 +0200 Subject: [Ferret-talk] clean uninstall for ferret Message-ID: <0ee5b6c41e6c61a0fab2a31c5d9bc894@ruby-forum.com> I installed ferret and it cleaned me out!!! I do the following: __________________________ # gem install ferret Need to update 4 gems from http://gems.rubyforge.org .... complete Select which gem to install for your platform (i386-linux) 1. ferret 0.11.4 (ruby) 2. ferret 0.11.4 (mswin32) 3. ferret 0.11.3 (ruby) 4. ferret 0.11.2 (ruby) 5. Skip this gem 6. Cancel installation > 3 Building native extensions. This could take a while... ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError) ERROR: Failed to build gem native extension. ruby extconf.rb install ferret can't find header files for ruby. _________________________________________ Then I try to uninstall and i get the following: __________________________________________ gem uninstall ferret ERROR: While executing gem ... (Gem::InstallError) Unknown gem ferret-> 0 ___________________________________________ Then I try to rake my db and i get errors ____________________________________________ $ rake db:migrate (in /home/drhenner/eggpad) rake aborted! no such file to load -- ferret_ext ____________________________________________ I have no idea what i can do to just uninstall ferret from my system. I would prefer to have it work but at the very least i don't want it to stop all future work. I installed 0.11.4 first and i got the same problem. I manually removed all the files and directories in ruby/gems/1.8/gems/ferret-0.11.4/ it didnt help at all. This has to be my worst experience with a rubygem EVER. Everything always runs smooth up until now. What are the ruby header files by the way??? -- Posted via http://www.ruby-forum.com/. 
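On the closing question: the "header files for ruby" are the C headers (ruby.h and friends) that extconf.rb needs in order to compile a native extension such as ferret_ext. They are often shipped in a separate development package whose name depends on the distribution (something like ruby-dev or ruby-devel; treat the exact name as an assumption). A minimal sketch for checking whether they are present, assuming a Ruby that provides RbConfig; the helper names are made up for illustration:

```ruby
require 'rbconfig'

# Directory where this Ruby expects its C headers.
# Newer Rubies expose 'rubyhdrdir'; Ruby 1.8 kept ruby.h in 'archdir'.
def ruby_header_dir
  RbConfig::CONFIG['rubyhdrdir'] || RbConfig::CONFIG['archdir']
end

# True if ruby.h is present, i.e. building native extensions has a chance to work.
def ruby_headers_installed?
  dir = ruby_header_dir
  !dir.nil? && File.exist?(File.join(dir, 'ruby.h'))
end

puts "header dir:   #{ruby_header_dir}"
puts "ruby.h found: #{ruby_headers_installed?}"
```

If the check reports false, installing the distribution's Ruby development package and re-running gem install is the usual next step.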
From JanPrill at blauton.de Fri Jun 22 05:12:37 2007 From: JanPrill at blauton.de (Jan Prill) Date: Fri, 22 Jun 2007 11:12:37 +0200 Subject: [Ferret-talk] clean uninstall for ferret In-Reply-To: <0ee5b6c41e6c61a0fab2a31c5d9bc894@ruby-forum.com> References: <0ee5b6c41e6c61a0fab2a31c5d9bc894@ruby-forum.com> Message-ID: <562a35c10706220212h303f411egc00a4546cbe9f2e6@mail.gmail.com> Hi David, check out http://www.ruby-forum.com/topic/65550#76725 as well as http://www.ruby-forum.com/search?query=%22no+such+file+to+load+--+ferret_ext%22&forums%5B%5D=5 please. Cheers, Jan Prill 2007/6/22, David Henner : > > I installed ferret and it cleaned me out!!! > I do the following: > > __________________________ > # gem install ferret > Need to update 4 gems from http://gems.rubyforge.org > .... > complete > Select which gem to install for your platform (i386-linux) > 1. ferret 0.11.4 (ruby) > 2. ferret 0.11.4 (mswin32) > 3. ferret 0.11.3 (ruby) > 4. ferret 0.11.2 (ruby) > 5. Skip this gem > 6. Cancel installation > > 3 > Building native extensions. This could take a while... > ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError) > ERROR: Failed to build gem native extension. > > ruby extconf.rb install ferret > can't find header files for ruby. > _________________________________________ > > Then I try to uninstall and i get the following: > __________________________________________ > gem uninstall ferret > ERROR: While executing gem ... (Gem::InstallError) > Unknown gem ferret-> 0 > ___________________________________________ > > Then I try to rake my db and i get errors > ____________________________________________ > > $ rake db:migrate > (in /home/drhenner/eggpad) > rake aborted! > no such file to load -- ferret_ext > ____________________________________________ > > I have no idea what i can do to just uninstall ferret from my system. I > would prefer to have it work but at the very least i don't want it to > stop all future work. 
I installed 0.11.4 first and i got the same > problem. I manually removed all the files and directories in > ruby/gems/1.8/gems/ferret-0.11.4/ > > it didnt help at all. > > This has to be my worst experience with a rubygem EVER. Everything > always runs smooth up until now. > > What are the ruby header files by the way??? > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jan Prill Rechtsanwalt Grünebergstraße 38 22763 Hamburg Tel +49 (0)40 41265809 Fax +49 (0)40 380178-73022 Mobil +49 (0)171 3516667 http://www.inviado.de -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070622/650029ce/attachment.html From drhenner at yahoo.com Fri Jun 22 10:34:41 2007 From: drhenner at yahoo.com (David Henner) Date: Fri, 22 Jun 2007 16:34:41 +0200 Subject: [Ferret-talk] clean uninstall for ferret In-Reply-To: <562a35c10706220212h303f411egc00a4546cbe9f2e6@mail.gmail.com> References: <0ee5b6c41e6c61a0fab2a31c5d9bc894@ruby-forum.com> <562a35c10706220212h303f411egc00a4546cbe9f2e6@mail.gmail.com> Message-ID: <6226e58514f532adff35ed5540c273b9@ruby-forum.com> THANK YOU SO MUCH that was too easy.. I guess most people have these file installed by default because i don't hear many complaints... -- Posted via http://www.ruby-forum.com/. From eimorton at gmail.com Sat Jun 23 08:45:27 2007 From: eimorton at gmail.com (Erik Morton) Date: Sat, 23 Jun 2007 08:45:27 -0400 Subject: [Ferret-talk] End of File Error on index optmize Message-ID: I was optimizing a 650MB using ferret (0.11.3) and I received the following error. I've seen some people have similar issues but I haven't seen any resolutions. The contents of the index directory follow the error. Has anyone seen anything like this and found a resolution? Many thanks.
/mnt/apps/search/releases/20070622175637/script/../config/../vendor/ gems/rdig-0.3.4/lib/rdig/index.rb:48:in `optimize': End-of-File Error occured at :93 in xraise (EOFError) Error occured in store.c:216 - is_refill current pos = 0, file length = 0 from /mnt/apps/search/releases/20070622175637/script/../ config/../vendor/gems/rdig-0.3.4/lib/rdig/index.rb:48:in `close' from /mnt/apps/search/releases/20070622175637/script/../ config/../vendor/gems/rdig-0.3.4/lib/rdig/crawler.rb:36:in `run' from /mnt/apps/search/releases/20070622175637/script/../ config/../vendor/gems/rdig-0.3.4/lib/rdig.rb:274:in `run' from /mnt/apps/search/releases/20070622175637/script/rdig:95 from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/daemons/ application.rb:152:in `start_load' from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/daemons/ application.rb:229:in `start' from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/daemons/ controller.rb:69:in `run' from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/ daemons.rb:133:in `run' from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/daemons/ cmdline.rb:105:in `catch_exceptions' from /usr/lib/ruby/gems/1.8/gems/daemons-1.0.5/lib/ daemons.rb:132:in `run' ls -Alh /index_dir/ total 649M -rw------- 1 my_user my_user 3.2M Jun 22 15:40 _0.cfs -rw------- 1 my_user my_user 22M Jun 22 17:29 _1.cfs -rw------- 1 my_user my_user 22M Jun 22 18:53 _2.cfs -rw------- 1 my_user my_user 23M Jun 22 20:14 _3.cfs -rw------- 1 my_user my_user 22M Jun 22 22:01 _4.cfs -rw------- 1 my_user my_user 20M Jun 22 21:19 _5.cfs -rw------- 1 my_user my_user 41M Jun 22 22:02 _5.fdt -rw------- 1 my_user my_user 113K Jun 22 22:02 _5.fdx -rw------- 1 my_user my_user 0 Jun 22 22:02 _5.frq -rw------- 1 my_user my_user 0 Jun 22 22:02 _5.prx -rw------- 1 my_user my_user 0 Jun 22 22:02 _5.tfx -rw------- 1 my_user my_user 0 Jun 22 22:02 _5.tis -rw------- 1 my_user my_user 0 Jun 22 22:02 _5.tix -rw------- 1 my_user my_user 57M Jun 22 21:19 _6.fdt -rw------- 1 my_user my_user 161K Jun 22 
21:19 _6.fdx -rw------- 1 my_user my_user 0 Jun 22 21:19 _6.frq -rw------- 1 my_user my_user 0 Jun 22 21:19 _6.prx -rw------- 1 my_user my_user 0 Jun 22 21:19 _6.tfx -rw------- 1 my_user my_user 0 Jun 22 21:19 _6.tis -rw------- 1 my_user my_user 0 Jun 22 21:19 _6.tix -rw------- 1 my_user my_user 22M Jun 22 22:50 _7.cfs -rw------- 1 my_user my_user 135M Jun 22 22:50 _8.fdt -rw------- 1 my_user my_user 376K Jun 22 22:50 _8.fdx -rw------- 1 my_user my_user 0 Jun 22 22:50 _8.frq -rw------- 1 my_user my_user 0 Jun 22 22:50 _8.prx -rw------- 1 my_user my_user 0 Jun 22 22:50 _8.tfx -rw------- 1 my_user my_user 0 Jun 22 22:50 _8.tis -rw------- 1 my_user my_user 0 Jun 22 22:50 _8.tix -rw------- 1 my_user my_user 7.9M Jun 22 23:11 _9.cfs -rw------- 1 my_user my_user 276M Jun 22 23:11 _a.fdt -rw------- 1 my_user my_user 767K Jun 22 23:11 _a.fdx -rw------- 1 my_user my_user 0 Jun 22 23:11 _a.frq -rw------- 1 my_user my_user 0 Jun 22 23:11 _a.prx -rw------- 1 my_user my_user 0 Jun 22 23:11 _a.tfx -rw------- 1 my_user my_user 0 Jun 22 23:11 _a.tis -rw------- 1 my_user my_user 0 Jun 22 23:11 _a.tix -rw------- 1 my_user my_user 16 Jun 22 23:11 segments -rw------- 1 my_user my_user 350 Jun 22 23:11 segments_5 From eimorton at gmail.com Sat Jun 23 08:52:21 2007 From: eimorton at gmail.com (Erik Morton) Date: Sat, 23 Jun 2007 08:52:21 -0400 Subject: [Ferret-talk] End of File Error on index optmize References: Message-ID: <26FEA5E1-8271-4C67-9A3A-DDE70B87AB78@gmail.com> Quick follow up. I'm running a cluster of servers to do indexing in parallel and I noticed a similar error on another server. ruby 1.8.4 (2005-12-24) [i386-linux] Linux 2.6.16-xenU #1 SMP Thu Nov 30 13:48:50 SAST 2006 i686 athlon i386 GNU/Linux Over all 16 out of 19 servers have died for as yet unknown reasons. Here is the other error: in `optimize': End-of-File Error occured at :93 in xraise (EOFError) Error occured in compound_io.c:137 - cmpdi_read_i Tried to read past end of file. 
File length is <4360> and tried to read to <5120> Help is very much appreciated. Thanks.
From kyle at casttv.com Sat Jun 23 09:20:49 2007 From: kyle at casttv.com (Kyle Maxwell) Date: Sat, 23 Jun 2007
06:20:49 -0700 Subject: [Ferret-talk] End of File Error on index optmize In-Reply-To: <26FEA5E1-8271-4C67-9A3A-DDE70B87AB78@gmail.com> References: <26FEA5E1-8271-4C67-9A3A-DDE70B87AB78@gmail.com> Message-ID: <47699a8d0706230620t49d536d6na6c6d431f07e5473@mail.gmail.com> > in `optimize': End-of-File Error occured at :93 in xraise > (EOFError) > Error occured in compound_io.c:137 - cmpdi_read_i > Tried to read past end of file. File length is <4360> and > tried to read to <5120> > > I was optimizing a 650MB using ferret (0.11.3) and I received the > > following error. I've seen some people have similar issues but I > > haven't seen any resolutions. The contents of the index directory > > follow the error. Has anyone seen anything like this and found a > > resolution? Many thanks. Go to ferret 0.11.4 -- Kyle Maxwell Software Engineer CastTV, Inc http://www.casttv.com From rajmamadgi at gmail.com Sun Jun 24 00:29:56 2007 From: rajmamadgi at gmail.com (Ravindraraj Mamadgi) Date: Sun, 24 Jun 2007 06:29:56 +0200 Subject: [Ferret-talk] Example for using ferret search engine Message-ID: <4640043d554f37595ab9dda11b0a5d11@ruby-forum.com> Hi, Is there any application where I can see the usage of Ferret engine(like example implementation). I have some difficulties in using it, sending query and getting the results. Thank you, Raj. -- Posted via http://www.ruby-forum.com/. From eimorton at gmail.com Sun Jun 24 12:04:47 2007 From: eimorton at gmail.com (Erik Morton) Date: Sun, 24 Jun 2007 12:04:47 -0400 Subject: [Ferret-talk] IO Error when querying on a single field Message-ID: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> I'm getting an exception when I query a large index (4GB, ~700K docs) on a particular field. For example: supplier_id:77490 IO Error occured at :93 in xraise (IOError) Error occured in fs_store.c:293 - fsi_seek_i seeking pos -1515676213: Anecdotally it seems to happen for supplier_ids > 76900. 
Here is the field info for that field: supplier_id: index: yes term_vector: :with_position_offsets store: :yes I'm running 0.11.4 with Ruby 1.8.4 on Linux. Has anyone seen anything like this? Erik From kyle at casttv.com Sun Jun 24 13:39:04 2007 From: kyle at casttv.com (Kyle Maxwell) Date: Sun, 24 Jun 2007 10:39:04 -0700 Subject: [Ferret-talk] IO Error when querying on a single field In-Reply-To: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> References: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> Message-ID: <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> > I'm getting an exception when I query a large index (4GB, ~700K docs) > on a particular field. Try this: [PATCH] Large file issues http://ferret.davebalmain.com/trac/ticket/215 -- Kyle Maxwell Software Engineer CastTV, Inc http://www.casttv.com From eimorton at gmail.com Sun Jun 24 14:09:05 2007 From: eimorton at gmail.com (Erik Morton) Date: Sun, 24 Jun 2007 14:09:05 -0400 Subject: [Ferret-talk] IO Error when querying on a single field In-Reply-To: <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> References: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> Message-ID: <680D83AB-11A4-4F78-8205-C08370A7A7D2@gmail.com> Kyle, Thanks for the input. I appreciate it. Stupid question: what's the best way to apply this patch? I'm getting the following error. lionheart:/usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.4 erik$ sudo patch -p0 > I'm getting an exception when I query a large index (4GB, ~700K docs) >> on a particular field. 
> > Try this: > > [PATCH] Large file issues > http://ferret.davebalmain.com/trac/ticket/215 > > -- > Kyle Maxwell > Software Engineer > CastTV, Inc > http://www.casttv.com > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From dj at collectiveinsight.net Sun Jun 24 14:35:46 2007 From: dj at collectiveinsight.net (David James) Date: Sun, 24 Jun 2007 13:35:46 -0500 Subject: [Ferret-talk] Resetting ferret index before test runs Message-ID: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> I need to reset the ferret index between test runs. It seems like there are a few ways to reset the ferret index. * Deleting the index directory -- is this really bad form? * calling rebuild_index * (any others?) What would y'all recommend? (Sorry, I'm from Texas) Preferably, I'd like a way to reset the index that I can integrate into a selenium test. -David From eimorton at gmail.com Sun Jun 24 14:58:47 2007 From: eimorton at gmail.com (Erik Morton) Date: Sun, 24 Jun 2007 14:58:47 -0400 Subject: [Ferret-talk] IO Error when querying on a single field In-Reply-To: <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> References: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> Message-ID: <7D09A6A5-F4B1-415A-9F0F-27B85DEBC47D@gmail.com> I realized that I was applying the patch against the gem instead of checking out the source. Stupid mistake. It looks like the ferret SVN repository is down? I can't seem to check out the current trunk. Anyway, I grabbed the source gem and applied the patch manually. I had some trouble using the rake install task on Linux--rake ext didn't work, but rake -v did. The test cases passed, so I'm running the reindex that failed with the trunk version of index.h and index.c. Thanks for the help. Did you ever hear from Dave about getting this fix included in the trunk? 
Thanks again Kyle. Erik On Jun 24, 2007, at 1:39 PM, Kyle Maxwell wrote: >> I'm getting an exception when I query a large index (4GB, ~700K docs) >> on a particular field. > > Try this: > > [PATCH] Large file issues > http://ferret.davebalmain.com/trac/ticket/215 > > -- > Kyle Maxwell > Software Engineer > CastTV, Inc > http://www.casttv.com > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From kyle at casttv.com Sun Jun 24 15:10:04 2007 From: kyle at casttv.com (Kyle Maxwell) Date: Sun, 24 Jun 2007 12:10:04 -0700 Subject: [Ferret-talk] Resetting ferret index before test runs In-Reply-To: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> References: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> Message-ID: <47699a8d0706241210p61f371abvdf1b086d09a8b9ec@mail.gmail.com> > I need to reset the ferret index between test runs. > > It seems like there are a few ways to reset the ferret index. > * Deleting the index directory -- is this really bad form? This is ok, if it's fast enough for you. > What would y'all recommend? (Sorry, I'm from Texas) I'd personally keep an index somewhere outside of the normal path, and "cp -lr" it into the path -- Kyle Maxwell Software Engineer CastTV, Inc http://www.casttv.com From dj at collectiveinsight.net Sun Jun 24 16:35:25 2007 From: dj at collectiveinsight.net (David James) Date: Sun, 24 Jun 2007 15:35:25 -0500 Subject: [Ferret-talk] Resetting ferret index before test runs In-Reply-To: <47699a8d0706241210p61f371abvdf1b086d09a8b9ec@mail.gmail.com> References: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> <47699a8d0706241210p61f371abvdf1b086d09a8b9ec@mail.gmail.com> Message-ID: <3B1A93C4-CD2C-4389-8EEA-90BE406E2296@collectiveinsight.net> Comments... On Jun 24, 2007, at 2:10 PM, Kyle Maxwell wrote: >> I need to reset the ferret index between test runs. 
>> >> It seems like there are a few ways to reset the ferret index. >> * Deleting the index directory -- is this really bad form? > > This is ok, if it's fast enough for you. > >> What would y'all recommend? (Sorry, I'm from Texas) > > I'd personally keep an index somewhere outside of the normal path, and > "cp -lr" it into the path I don't understand how this solves my problem... Are you saying that I should remove the symbolic links before each test run? Also, not sure if this is a good idea for development, since BSD (Mac OS X) doesn't have the -l option on cp. My fc6 box (deployment) has it though. -David > -- > Kyle Maxwell > Software Engineer > CastTV, Inc > http://www.casttv.com > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From sheuer at int42.org Sun Jun 24 18:42:54 2007 From: sheuer at int42.org (Stephen Heuer) Date: Mon, 25 Jun 2007 00:42:54 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_search Message-ID: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> I am trying to use acts_as_ferret's multi_search to search across multiple models, but i only want it to return one type of model. for example i have a page that lists out people. on this page it shows email addresses and phone numbers. I want to be able to search by any fields directly from the person model and search the fields from the email_address and phone_number models, but I only want to get back people. person has_many email_addresses has_many phone_numbers acts_as_ferret :fields => [:firstname, :lastname, :birth_date] email_address has_one person acts_as_ferret :fields => [:email_address] phone_number has_one person acts_as_ferret :fields => [:phone_number, :phone_type] multi_search(options[:query], ["EmailAddress", "PhoneNumber"], {:limit => :all}) Is this the right way of doing this... or is there a better way? 
--
Posted via http://www.ruby-forum.com/.

From ahfeel-nospam- at gmail.com Sun Jun 24 18:46:22 2007
From: ahfeel-nospam- at gmail.com (ahFeel)
Date: Mon, 25 Jun 2007 00:46:22 +0200
Subject: [Ferret-talk] Resetting ferret index before test runs
In-Reply-To: <3B1A93C4-CD2C-4389-8EEA-90BE406E2296@collectiveinsight.net>
References: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> <47699a8d0706241210p61f371abvdf1b086d09a8b9ec@mail.gmail.com> <3B1A93C4-CD2C-4389-8EEA-90BE406E2296@collectiveinsight.net>
Message-ID: <8bb9b20e67c3e81392d405d2aafeebf0@ruby-forum.com>

You should provide a RAMDirectory to your Index during your tests.
That way, each new Index instance creates a new, empty, RAM-stored index,
which is faster for tests and doesn't create any files :-)

Have a look at the RAMDirectory class :)

Jérémie.

--
Jérémie 'ahFeel' BORDIER
Rift Technologies - http://www.installclick.com
Blog - http://www.unixaumonde.com
--
Posted via http://www.ruby-forum.com/.

From none at gmail.com Sun Jun 24 22:35:43 2007
From: none at gmail.com (sarah)
Date: Mon, 25 Jun 2007 04:35:43 +0200
Subject: [Ferret-talk] hello, is there a way to exclude duplicates of a field?
Message-ID:

hi,
this is what i am trying to accomplish

  Post.find_by_contents("artist:#{session[:srchstring]}*")

this returns to me all artists with the first letter of say 'n'. is there
a way to not repeat values of the same artist?

thank you so much for the help.
--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Mon Jun 25 03:11:37 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 25 Jun 2007 09:11:37 +0200
Subject: [Ferret-talk] hello, is there a way to exclude duplicates of a field?
In-Reply-To: References: Message-ID: <20070625071137.GB11906@cordoba.webit.de> On Mon, Jun 25, 2007 at 04:35:43AM +0200, sarah wrote: > > hi, > this is what i am trying to accomplish > > Post.find_by_contents("artist:#{session[:srchstring]}*") > > > this returns to me all artist with the first letter of say 'n' is there > a way to not repeat valuse of the same artist? Not with Ferret alone. But if you combined Ferret with a custom active record query this should be doable. Check out find_id_by_contents which only gets the id values from ferret, and then use these to query your unique artists. find_id_by_contents is used internally by find_by_contents, too, so you can have a look at the implementation to see how it is used. Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Mon Jun 25 03:41:41 2007 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 25 Jun 2007 09:41:41 +0200 Subject: [Ferret-talk] Resetting ferret index before test runs In-Reply-To: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> References: <8C7BF5B4-B887-4767-BB69-E8DF4F6D8A4B@collectiveinsight.net> Message-ID: <20070625074141.GC11906@cordoba.webit.de> Hi! On Sun, Jun 24, 2007 at 01:35:46PM -0500, David James wrote: > I need to reset the ferret index between test runs. > > It seems like there are a few ways to reset the ferret index. > * Deleting the index directory -- is this really bad form? Not really, but it can cause problems if you have an index instance open in this directory. Closing any open indexes and them removing the directory is the best way to ensure the index is cleared out. 
> * calling rebuild_index

You should do this after removing the old index, of course ;-)
In theory, removing the old index before calling rebuild_index should
not be needed, but it won't hurt either:

  Model.aaf_index.close
  FileUtils.rm_rf Model.aaf_configuration[:index_dir]
  Model.rebuild_index

> What would y'all recommend? (Sorry, I'm from Texas)

In general you should think about how often you need the index to be
rebuilt. With larger fixture volumes, reindexing in your setup method can
really slow down your tests. In that case, think about rebuilding the
index less frequently, i.e. only before the tests that really need the
index are run. You could also split your tests into index-changing and
non-index-changing tests, and choose the appropriate rebuild frequency
case by case to reduce the slowdown caused by frequent rebuilds.

If you keep an index to be used for testing ready somewhere, replacing
RAILS_ROOT/index/test/model with this one in setup should work, too.
Just be sure to call Model.aaf_index.close before replacing the index
directory to avoid any hiccups.

Replacing the persistent index directory implementation used in tests
with a RAMDirectory based on a persistent test index that is never
changed is a good idea, too, but won't work with acts_as_ferret out of
the box.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From kraemer at webit.de Mon Jun 25 07:48:33 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 25 Jun 2007 13:48:33 +0200
Subject: [Ferret-talk] I only want one type of model returned from a multi_search
In-Reply-To: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com>
References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com>
Message-ID: <20070625114833.GE11906@cordoba.webit.de>

Hi!
On Mon, Jun 25, 2007 at 12:42:54AM +0200, Stephen Heuer wrote: > I am trying to use acts_as_ferret's multi_search to search across > multiple models, but i only want it to return one type of model. > > for example i have a page that lists out people. on this page it shows > email addresses and phone numbers. I want to be able to search by any > fields directly from the person model and search the fields from the > email_address and phone_number models, but I only want to get back > people. > > person > has_many email_addresses > has_many phone_numbers > acts_as_ferret :fields => [:firstname, :lastname, :birth_date] > > email_address > has_one person > acts_as_ferret :fields => [:email_address] > > phone_number > has_one person > acts_as_ferret :fields => [:phone_number, :phone_type] > > > > multi_search(options[:query], ["EmailAddress", "PhoneNumber"], {:limit > => :all}) > > Is this the right way of doing this... or is there a better way? Be sure to apply the :store_classname => true option to all your acts_as_ferret calls. Otherwise aaf cannot filter results by class. Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Mon Jun 25 07:57:53 2007 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 25 Jun 2007 13:57:53 +0200 Subject: [Ferret-talk] more specific queries via IndexReader In-Reply-To: <7C99D0CB-4289-49F2-AC2B-4D9F4D5474E8@digitalpulp.com> References: <6DD74FFB-446D-4BA2-846B-AD83444A1927@digitalpulp.com> <7C99D0CB-4289-49F2-AC2B-4D9F4D5474E8@digitalpulp.com> Message-ID: <20070625115753.GF11906@cordoba.webit.de> On Sat, Jun 16, 2007 at 03:20:27PM -0400, John Bachir wrote: > > On Jun 16, 2007, at 2:59 PM, John Bachir wrote: > > > We would like to show a list of "most recently added terms", meaning, > > the results of this query: > > > > Resource.aaf_index.ferret_index.reader.terms(:summary) > > > > BUT, only returning terms from a certain set of documents (in our > > case, we are going to filter by creation data). > > > > Is this possible? > > > > Actually I just discovered that when using the drb server, the > ferret_index cannot be accessed. So I imagine it is impossible to > access the index's terms at all? Yeah, the DRb server is not exposing the ferret index to the outside. You can however extend aaf's LocalIndex class (where ferret_index is available) with a custom terms method and call that via DRb (aaf_index is an instance of LocalIndex in local and of RemoteIndex in DRb mode, but RemoteIndex routes all calls via method_missing through to the remote LocalIndex instance). Just be sure to not pull complex Ferret objects (like IndexReaders) across the DRb connection. regarding the original problem - I'm not sure but would guess you can't filter the terms result to only include terms from some documents. Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From sheuer at int42.org Mon Jun 25 09:39:57 2007 From: sheuer at int42.org (Stephen Heuer) Date: Mon, 25 Jun 2007 15:39:57 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_sear In-Reply-To: <20070625114833.GE11906@cordoba.webit.de> References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> <20070625114833.GE11906@cordoba.webit.de> Message-ID: <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> Hey, > Be sure to apply the :store_classname => true option to all your > acts_as_ferret calls. Otherwise aaf cannot filter results by class. Yep, I got :store_class_name => true in my models. The problem is occurring when i do a search and i get back Person, EmailAddress, and PhoneNumber objects that all match the query, but i only want back Person objects. So, when i do a search for someone via their email i don't want to get back an EmailAddress object and a Person Object, Just the Person Object that the EmailAddress is Associated. Stephen Heuer -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Jun 25 10:22:08 2007 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 25 Jun 2007 16:22:08 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_sear In-Reply-To: <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> <20070625114833.GE11906@cordoba.webit.de> <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> Message-ID: <20070625142208.GE7326@cordoba.webit.de> On Mon, Jun 25, 2007 at 03:39:57PM +0200, Stephen Heuer wrote: > Hey, > > > Be sure to apply the :store_classname => true option to all your > > acts_as_ferret calls. Otherwise aaf cannot filter results by class. > > Yep, I got :store_class_name => true in my models. 
The problem is
> occurring when i do a search and i get back Person, EmailAddress, and
> PhoneNumber objects that all match the query, but i only want back
> Person objects. So, when i do a search for someone via their email i
> don't want to get back an EmailAddress object and a Person Object, Just
> the Person Object that the EmailAddress is Associated.

Well, if you only want to retrieve Person Objects from your index, then
just index Person objects in the first place :-)

Why don't you just index the email address right along with the Person?
Do you ever need to find a single EmailAddress object? If not, just
don't index them in their own index. Instead add a custom field to
Person's acts_as_ferret statement for the email address value.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From brickley at gmail.com Mon Jun 25 11:02:54 2007
From: brickley at gmail.com (Chris Brickley)
Date: Mon, 25 Jun 2007 17:02:54 +0200
Subject: [Ferret-talk] Ignore apostrophes in words
Message-ID: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com>

Hi, I just started using ferret and the aaf plugin and it seems to work
quite nicely. However, my fields are very short (titles of music) and I
don't think many users will be typing in apostrophes when they are
looking for something. Right now, for a simple document such as "what
i've done" I'd like it to be indexed as "what ive done" instead. Right
now I'm using this for my aaf line (I don't want any stop words either,
as in smaller docs each word, even articles, can have some significance):

  acts_as_ferret( { :fields => [ :name ] }, { :analyzer => Ferret::Analysis::StandardAnalyzer.new([]) } )

How should I go about removing the apostrophes when docs are added to
the index?

Thanks,
Chris

--
Posted via http://www.ruby-forum.com/.
From sheuer at int42.org Mon Jun 25 11:06:39 2007 From: sheuer at int42.org (Stephen Heuer) Date: Mon, 25 Jun 2007 17:06:39 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_sear In-Reply-To: <20070625142208.GE7326@cordoba.webit.de> References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> <20070625114833.GE11906@cordoba.webit.de> <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> <20070625142208.GE7326@cordoba.webit.de> Message-ID: > Why don't you just index the email address right along with the Person? > Do you ever need to find a single EmailAddress object? If not, just > don't index them in their own index. Instead add a custom field to > Person's acts_as_ferred statement for the email address value. I would index the email address right along with the person, but there is a multiple association there. I would have to index a whole lot of other data that I would like to be able to search through: has_many :phone_numbers has_many :addresses has_many :email_addresses has_many :enrollments has_many :facilities_applications has_many :course_invoices has_many :refunds has_many :medications has_many :instructor_bios has_one :immunization has_one :administrator put that with the fact that I have over 50 models ( all with many associations ) in my application of which at least half of need to be searchable, and that turns into a large task. I created a module that helps me generate functions for list views ( with sorting and searching ) that used mysql fulltext searching, but when searching through ~100,000 records, it would take the app upwards of 7 seconds to finish finding results. So I was rewriting it to work with acts_as_ferret. So I would assume that acts_as_ferret multi_search doesn't have the ability to be told that I only want one type of model even though i want to search through multiple models (to get associations). -- Posted via http://www.ruby-forum.com/. 
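The custom-field suggestion quoted at the top of the message above could be sketched roughly as follows. This is a plain-Ruby illustration only: the Structs stand in for the ActiveRecord models named in the thread, the helper names email_address_text and phone_number_text are hypothetical, and the sample data is made up. Nothing here calls acts_as_ferret itself.

```ruby
# Plain-Ruby stand-ins for the models discussed in the thread.
# These are Structs, not the real ActiveRecord classes.
EmailAddress = Struct.new(:email_address)
PhoneNumber  = Struct.new(:phone_number, :phone_type)

Person = Struct.new(:firstname, :lastname, :email_addresses, :phone_numbers) do
  # Hypothetical helper: every email address of this person in one string,
  # so the value could be indexed as a single field of the Person document.
  def email_address_text
    email_addresses.map(&:email_address).join(" ")
  end

  # Hypothetical helper: same idea for phone numbers and their types.
  def phone_number_text
    phone_numbers.map { |p| "#{p.phone_number} #{p.phone_type}" }.join(" ")
  end
end

# Made-up sample record for illustration.
person = Person.new(
  "Ada", "Lovelace",
  [EmailAddress.new("ada@example.com"), EmailAddress.new("al@example.org")],
  [PhoneNumber.new("555-0100", "home")]
)

puts person.email_address_text
puts person.phone_number_text
```

If the plugin accepts method names in its :fields list (an assumption worth checking against the acts_as_ferret documentation), helpers like these would let a search on an email address return the Person directly, at the cost of writing one such helper per association.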
From kraemer at webit.de Tue Jun 26 04:04:52 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 26 Jun 2007 10:04:52 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_sear In-Reply-To: References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> <20070625114833.GE11906@cordoba.webit.de> <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> <20070625142208.GE7326@cordoba.webit.de> Message-ID: <20070626080452.GG7326@cordoba.webit.de> On Mon, Jun 25, 2007 at 05:06:39PM +0200, Stephen Heuer wrote: > > > Why don't you just index the email address right along with the Person? > > Do you ever need to find a single EmailAddress object? If not, just > > don't index them in their own index. Instead add a custom field to > > Person's acts_as_ferred statement for the email address value. > > I would index the email address right along with the person, but there > is a multiple association there. I would have to index a whole lot of > other data that I would like to be able to search through: > > has_many :phone_numbers > has_many :addresses > has_many :email_addresses > has_many :enrollments > has_many :facilities_applications > has_many :course_invoices > has_many :refunds > has_many :medications > has_many :instructor_bios > has_one :immunization > has_one :administrator > > put that with the fact that I have over 50 models ( all with many > associations ) in my application of which at least half of need to be > searchable, and that turns into a large task. > > I created a module that helps me generate functions for list views ( > with sorting and searching ) that used mysql fulltext searching, but > when searching through ~100,000 records, it would take the app upwards > of 7 seconds to finish finding results. So I was rewriting it to work > with acts_as_ferret. 
> > So I would assume that acts_as_ferret multi_search doesn't have the > ability to be told that I only want one type of model even though i want > to search through multiple models (to get associations). no, as aaf doesn't store relationships between records there's no way to do this. However, given the fact your models all have a :person relationship, you could easily filter your results after running the search. However I wouldn't suggest this, I'd really go for a single Person index having all the information in it. For the has_many relationships - just join the contents of all elements together and put them into a single field. What might ease your work with indexing all the related objects along with the Person is a patch residing in aaf's Trac. Unfortunately I didn't find the time to apply this to trunk yet, but it does exactly what you want - just name the relationships as field names in your :fields list. the corresponding ticket is there: http://projects.jkraemer.net/acts_as_ferret/ticket/96 Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Tue Jun 26 04:45:40 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 26 Jun 2007 10:45:40 +0200 Subject: [Ferret-talk] Ignore apostrophes in words In-Reply-To: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com> References: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com> Message-ID: <20070626084540.GI7326@cordoba.webit.de> On Mon, Jun 25, 2007 at 05:02:54PM +0200, Chris Brickley wrote: > Hi, I just started using ferret and the aaf plugin and it seems to work > quite nicely. However, my fields are very short (titles of music) and I > don't think may users will be typing in apostrophes when they are > looking for something. 
Right now, for a simple document such as "what > i've done" I'd like it to be indexed as "what ive done" instead. Right > now I'm using this for my aaf line (I don't want any stop words either > as smaller docs, each word even articles can have some significance): > > acts_as_ferret( { :fields => [ :name ] }, { :analyzer => > Ferret::Analysis::StandardAnalyzer.new([]) } ) > > How should I go about removing the apostrophes when docs are added to > the index? I'd implement a custom analyzer that does what StandardAnalyzer does, plus filtering out the apostrophes from the tokens (which should be possible with a custom filter added to the chain). For a starting point, see http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardAnalyzer.html Jens -- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From syrius.ml at no-log.org Tue Jun 26 05:46:29 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Tue, 26 Jun 2007 11:46:29 +0200 Subject: [Ferret-talk] another issue with highlighting In-Reply-To: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id> (syrius ml's message of "Wed, 20 Jun 2007 00:30:03 +0200") References: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id> Message-ID: <87hcov127e.87fy4f127e@87ejjz127e.message.id> syrius.ml at no-log.org writes: > Hi, > > I'm encountering another highlighting issue. > (about the first one "range search and highlighting", i received no > response. I don't even know if somebody tried to reproduce and/or if > it's normal behavior) > > about the new issue, an example will be easier for you to reproduce: > I'm filling an index with random data, i try to match for "*1*" and > then highlight the matched tokens. 
If it's matched and not highlighted > i put it in z > > It works as expected when there're 100 entries (replace 500.times by > 100.times), in that case z contains empty arrays. > When having 500 entries it doesn't highlight every matches ! > > This example has been tested with 0.11.4 > (r770 has been tested with the application i first discovered this > issue with) > > I would appreciate if you could test and tell me if I'm the only one > having this problem. > TIA > > > > require 'ferret' > include Ferret > > # filling > index=Index::Index.new(:path => '/tmp/test') > chars1 = chars2 = chars3 = chars4 = ("a".."z").to_a + ("0".."9").to_a > chars2.concat(["-", "_", " "]) > chars3 << " " > chars4 << "-" > chars5 = chars6 = ("0".."9").to_a > chars6 << "." > 500.times do > z={} > t="" > 1.upto(15+rand(10)) { |i| t << chars4[rand(chars4.size-1)] } > z[:un] = t > t="" > 1.upto(40+rand(40)) { |i| t << chars2[rand(chars2.size-1)] } > z[:deux] = t > t="" > 1.upto(30+rand(10)) { |i| t << chars4[rand(chars4.size-1)] } > z[:trois] = t > t="" > 1.upto(30+rand(10)) { |i| t << chars1[rand(chars1.size-1)] } > z[:quatre] = t > t="" > 1.upto(30+rand(10)) { |i| t << chars2[rand(chars2.size-1)] } > z[:cinq] = t > t="" > 1.upto(12) { |i| t << chars5[rand(chars5.size-1)] } > z[:six] = t > t="" > 1.upto(12) { |i| t << chars6[rand(chars6.size-1)] } > z[:sept] = t > t="" > 1.upto(12) { |i| t << chars6[rand(chars6.size-1)] } > z[:huit] = t > t="" > 1.upto(24+rand(24)) { |i| t << chars3[rand(chars3.size-1)] } > z[:neuf] = t > t="" > 1.upto(100+rand(100)) { |i| t << chars2[rand(chars2.size-1)] } > z[:dix] = t > index << z > end > > #testing > q="*1*" > z={} > index.search_each(q,:limit => :all) do |id,score| > for b in [:un, :deux, :trois, :quatre, :cinq, :six, :sept, :huit, :neuf, :dix] > z[b]=[] if not z[b] > z[b] << id.to_s + " : " + index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all).join(" | ") if index[id][b].match(/1/) and 
index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all) and not index.highlight(q,id,:field => b, :pre_tag => "", :post_tag => "", :num_excerpts => :all, :excerpt_length => :all).join(" | ").match(//) > end > end > z > index.search("*",:limit => :all).total_hits > Hi, Could somebody mind trying to reproduce this please ? Highlighting is a very important feature for me, I need to know if I'm doing something wrong or if it's a dirty bug. Thanks in advance -- From mmangino at elevatedrails.com Tue Jun 26 09:17:45 2007 From: mmangino at elevatedrails.com (Mike Mangino) Date: Tue, 26 Jun 2007 15:17:45 +0200 Subject: [Ferret-talk] Reverse Sorting with array Message-ID: <2f28765e5787422ec268120da105e2cb@ruby-forum.com> I run several websites that receive a lot of concurrent access and use the ferret DRB server. I am adding sort functionality to my search pages and have run into problems with the marshalling of Sort and SortField objects. Using the :sort=>"name" style works, but I can't figure out how to reverse the sort. Is it possible to reverse the sort by passing a string? Thanks, Mike http://www.elevatedrails.com -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Jun 26 09:43:55 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 26 Jun 2007 15:43:55 +0200 Subject: [Ferret-talk] Reverse Sorting with array In-Reply-To: <2f28765e5787422ec268120da105e2cb@ruby-forum.com> References: <2f28765e5787422ec268120da105e2cb@ruby-forum.com> Message-ID: <20070626134355.GJ7326@cordoba.webit.de> On Tue, Jun 26, 2007 at 03:17:45PM +0200, Mike Mangino wrote: > I run several websites that receive a lot of concurrent access and use > the ferret DRB server. I am adding sort functionality to my search pages > and have run into problems with the marshalling of Sort and SortField > objects. Using the :sort=>"name" style works, but I can't figure out how > to reverse the sort. 
Is it possible to reverse the sort by passing a
> string?

of course - just use :sort => 'name DESC'.

Regarding the sort object/marshalling problems - this is supposed to
work, do you have a stack trace and some code leading to it for me?

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From brickley at gmail.com Tue Jun 26 10:25:27 2007
From: brickley at gmail.com (Chris Brickley)
Date: Tue, 26 Jun 2007 16:25:27 +0200
Subject: [Ferret-talk] Ignore apostrophes in words
In-Reply-To: <20070626084540.GI7326@cordoba.webit.de>
References: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com> <20070626084540.GI7326@cordoba.webit.de>
Message-ID:

Ok thanks for that link. However, I am a bit lost as to where I would
put my analyzer code? In my model itself or somewhere else?

This is what I came up with:


  class MyAnalyzer < Analyzer
    def initialize(stop_words = FULL_ENGLISH_STOP_WORDS, lower = true)
      @lower = lower
      @stop_words = stop_words
    end

    def token_stream(field, str)
      ts = StandardTokenizer.new(str)
      ts = LowerCaseFilter.new(ts) if @lower
      ts = StopFilter.new(ts, @stop_words)
      ts = HyphenFilter.new(ts)
      ts = ApostropheFilter.new(ts)
    end
  end

  class ApostropheFilter
    def next()
      t = @input.next()

      if (t == nil)
        return nil
      end

      t.term_text = t.term_text.tr("'","")

      return t
    end
  end

I tried putting it below my aaf declaration in my model file but I just
get:
"NameError: uninitialized constant Ferret::Analysis::MyAnalyzer" when
trying to do Model.rebuild_index.

Thanks.

--
Posted via http://www.ruby-forum.com/.
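A minimal sketch of the filter idea from the code above, written against plain-Ruby stand-ins rather than Ferret's own classes. Token and ArrayTokenStream are hypothetical and only mimic a stream whose next method returns a token or nil, and the token attribute is called text here purely for illustration. The sketch shows one detail the posted ApostropheFilter does not have: a constructor that stores the wrapped stream in @input.

```ruby
# Hypothetical stand-ins so the sketch runs without Ferret installed.
Token = Struct.new(:text)

class ArrayTokenStream
  def initialize(words)
    @tokens = words.map { |w| Token.new(w) }
  end

  # Returns the next token, or nil once the stream is exhausted.
  def next
    @tokens.shift
  end
end

# Wraps another stream and strips apostrophes from each token's text.
class ApostropheFilter
  def initialize(input)
    @input = input # the wrapped stream; without this, @input would be nil
  end

  def next
    token = @input.next
    return nil if token.nil?

    token.text = token.text.delete("'")
    token
  end
end

stream = ApostropheFilter.new(ArrayTokenStream.new(%w[what i've done]))

terms = []
while (token = stream.next)
  terms << token.text
end

puts terms.join(" ")
```

With the real library, the base class to inherit from and the name of the token's text accessor should be taken from the Ferret analysis documentation, not from this sketch.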
From kraemer at webit.de Tue Jun 26 10:43:03 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 26 Jun 2007 16:43:03 +0200
Subject: [Ferret-talk] Ignore apostrophes in words
In-Reply-To:
References: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com> <20070626084540.GI7326@cordoba.webit.de>
Message-ID: <20070626144303.GK7326@cordoba.webit.de>

I'd just put this into lib/, if you call the file my_analyzer.rb it
should be found and loaded by Rails automatically when you use the
class.

if not, require it explicitly in environment.rb.

Jens

On Tue, Jun 26, 2007 at 04:25:27PM +0200, Chris Brickley wrote:
> Ok thanks for that link. However, I am a bit lost as to where I would
> put my analyzer code? In my model itself or somewhere else?
>
> This is what I came up with:
>
>
> class MyAnalyzer < Analyzer
>   def initialize(stop_words = FULL_ENGLISH_STOP_WORDS, lower = true)
>     @lower = lower
>     @stop_words = stop_words
>   end
>
>   def token_stream(field, str)
>     ts = StandardTokenizer.new(str)
>     ts = LowerCaseFilter.new(ts) if @lower
>     ts = StopFilter.new(ts, @stop_words)
>     ts = HyphenFilter.new(ts)
>     ts = ApostropheFilter.new(ts)
>   end
> end
>
> class ApostropheFilter
>   def next()
>     t = @input.next()
>
>     if (t == nil)
>       return nil
>     end
>
>     t.term_text = t.term_text.tr("'","")
>
>     return t
>   end
> end
>
> I tried putting it below my aaf declaration in my model file but I just
> get:
> "NameError: uninitialized constant Ferret::Analysis::MyAnalyzer" when
> trying to do Model.rebuild_index.
>
> Thanks.
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

--
Jens Krämer
webit!
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From sheuer at int42.org Tue Jun 26 15:02:00 2007 From: sheuer at int42.org (Stephen Heuer) Date: Tue, 26 Jun 2007 21:02:00 +0200 Subject: [Ferret-talk] I only want one type of model returned from a multi_sear In-Reply-To: <20070626080452.GG7326@cordoba.webit.de> References: <82ac39558cbeacd8a2c9260801c5fd65@ruby-forum.com> <20070625114833.GE11906@cordoba.webit.de> <55b21eff3e3bfc76330d3a27bcf815b5@ruby-forum.com> <20070625142208.GE7326@cordoba.webit.de> <20070626080452.GG7326@cordoba.webit.de> Message-ID: <8752378871e82231be91a0edd29203aa@ruby-forum.com> > the corresponding ticket is there: > http://projects.jkraemer.net/acts_as_ferret/ticket/96 Excellent, this works perfectly. Thanks, Stephen Heuer http://www.int42.org -- Posted via http://www.ruby-forum.com/. From brickley at gmail.com Tue Jun 26 17:23:29 2007 From: brickley at gmail.com (Chris Brickley) Date: Tue, 26 Jun 2007 23:23:29 +0200 Subject: [Ferret-talk] Ignore apostrophes in words In-Reply-To: <20070626144303.GK7326@cordoba.webit.de> References: <1dd3f374bdfd502c392e6d42e6a43056@ruby-forum.com> <20070626084540.GI7326@cordoba.webit.de> <20070626144303.GK7326@cordoba.webit.de> Message-ID: <966dafa2d2f45beb3dd865165e4efc78@ruby-forum.com> Jens Kraemer wrote: > I'd just put this into lib/, if you call the file my_analyzer.rb it > should be found and loaded by Rails automatically when you use the > class. > > if not, require it explicitly in environment.rb. > > Jens Awesome! Thanks Jens :) Adding the require to environment.rb did the trick (as well as putting it in the lib dir). Thanks for all your help! -- Posted via http://www.ruby-forum.com/. 
From ferret-talk at stuartsierra.com Tue Jun 26 17:26:17 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Tue, 26 Jun 2007 17:26:17 -0400 Subject: [Ferret-talk] Ferret Subversion repository unavailable? In-Reply-To: <7D09A6A5-F4B1-415A-9F0F-27B85DEBC47D@gmail.com> References: <87719223-639D-4CD7-BC7E-C14FC55AE910@gmail.com> <47699a8d0706241039u12e556ccufc89722689d2c25d@mail.gmail.com> <7D09A6A5-F4B1-415A-9F0F-27B85DEBC47D@gmail.com> Message-ID: <46818479.1000805@stuartsierra.com> Erik Morton wrote: > It looks like the ferret SVN repository is down? I can't seem to > check out the current trunk. I can't get at the SVN repository directly: > $ svn checkout svn://davebalmain.com/ferret/trunk ferret > svn: Can't connect to host 'davebalmain.com': Connection refused But you can download the latest trunk via the subversion web interface: http://ferret.davebalmain.com/trac/browser/trunk Click on the "Zip archive" link at the bottom of the page. -Stuart Sierra From jonathan.viney at gmail.com Wed Jun 27 05:52:09 2007 From: jonathan.viney at gmail.com (Jonathan Viney) Date: Wed, 27 Jun 2007 11:52:09 +0200 Subject: [Ferret-talk] acts_as_ferret, DRb, and filter_proc Message-ID: <77c443cf139603fd04116872fe10b05c@ruby-forum.com> I was just trying to use acts_as_ferret with DRb and filter_proc, without much success. My guess is that this isn't possible because there's no way to send the proc to the DRb server, correct? Person.find_by_contents("jonathan", :filter_proc => proc {}) causes an exception when using DRb (DRb::DRbConnError: DRb::DRbServerNotFound). I can work around this by applying the filter after searching as the exact number of results isn't too important in this case. Cheers, -Jonathan -- Posted via http://www.ruby-forum.com/. 
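The post-search workaround mentioned in the message above could look roughly like this. It is a sketch only: SearchHit and the results array are hypothetical stand-ins for whatever records the search returns, and nothing here touches DRb or acts_as_ferret.

```ruby
# Hypothetical stand-in for the records a search call would return.
SearchHit = Struct.new(:name, :visible)

# Made-up results for illustration.
results = [
  SearchHit.new("jonathan a", true),
  SearchHit.new("jonathan b", false),
  SearchHit.new("jonathan c", true)
]

# Apply the filter locally, after the search has already returned,
# so no proc has to cross the DRb connection.
visible_hits = results.select { |hit| hit.visible }

puts "#{visible_hits.size} of #{results.size} hits kept"
```

The trade-off is the one the message already names: the hit count reported by the search no longer matches the number of records shown, which is acceptable only when the exact total does not matter.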
From kraemer at webit.de Wed Jun 27 06:13:30 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 27 Jun 2007 12:13:30 +0200 Subject: [Ferret-talk] another issue with highlighting In-Reply-To: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id> References: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id> Message-ID: <20070627101330.GN7326@cordoba.webit.de> On Wed, Jun 20, 2007 at 12:30:03AM +0200, syrius.ml at no-log.org wrote: > > Hi, > > I'm encountering another highlighting issue. > (about the first one "range search and highlighting", i received no > response. I don't even know if somebody tried to reproduce and/or if > it's normal behavior) > > about the new issue, an example will be easier for you to reproduce: > I'm filling an index with random data, i try to match for "*1*" and > then highlight the matched tokens. If it's matched and not highlighted > i put it in z > > It works as expected when there're 100 entries (replace 500.times by > 100.times), in that case z contains empty arrays. > When having 500 entries it doesn't highlight every matches ! > > This example has been tested with 0.11.4 > (r770 has been tested with the application i first discovered this > issue with) > > I would appreciate if you could test and tell me if I'm the only one > having this problem. Here z is not empty in the 500 case, too. Strange behaviour indeed, imho highlight should either return strings with some highlighted content, or nothing at all when there's nothing to highlight... Jens -- Jens Kr?mer webit! 
Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Wed Jun 27 06:18:18 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 27 Jun 2007 12:18:18 +0200 Subject: [Ferret-talk] acts_as_ferret, DRb, and filter_proc In-Reply-To: <77c443cf139603fd04116872fe10b05c@ruby-forum.com> References: <77c443cf139603fd04116872fe10b05c@ruby-forum.com> Message-ID: <20070627101818.GO7326@cordoba.webit.de> On Wed, Jun 27, 2007 at 11:52:09AM +0200, Jonathan Viney wrote: > I was just trying to use acts_as_ferret with DRb and filter_proc, > without much success. My guess is that this isn't possible because > there's no way to send the proc to the DRb server, correct? > > Person.find_by_contents("jonathan", :filter_proc => proc {}) causes an > exception when using DRb (DRb::DRbConnError: DRb::DRbServerNotFound). afaik acts_as_ferret has no such option. But you're right, Procs across DRb is not that easy, if possible at all. I don't know what you want to do in that proc, maybe some simple active record conditions can do what you want? Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From jonathan.viney at gmail.com Wed Jun 27 06:26:01 2007 From: jonathan.viney at gmail.com (Jonathan Viney) Date: Wed, 27 Jun 2007 12:26:01 +0200 Subject: [Ferret-talk] acts_as_ferret, DRb, and filter_proc In-Reply-To: <20070627101818.GO7326@cordoba.webit.de> References: <77c443cf139603fd04116872fe10b05c@ruby-forum.com> <20070627101818.GO7326@cordoba.webit.de> Message-ID: <667ddcc3828c0d5301793137266458e7@ruby-forum.com> Yes, it's just a filter to remove people who shouldn't be visible so I'll use AR conditions. 
I guess it should be possible to use a filter with DRb, perhaps by just sending a code string that gets evaled by the DRb server. -Jonathan. Jens Kraemer wrote: > afaik acts_as_ferret has no such option. But you're right, Procs across > DRb is not that easy, if possible at all. I don't know what you want to > do in that proc, maybe some simple active record conditions can do what > you want? > -- Posted via http://www.ruby-forum.com/. From syrius.ml at no-log.org Wed Jun 27 06:33:26 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Wed, 27 Jun 2007 12:33:26 +0200 Subject: [Ferret-talk] another issue with highlighting In-Reply-To: <20070627101330.GN7326@cordoba.webit.de> (Jens Kraemer's message of "Wed, 27 Jun 2007 12:13:30 +0200") References: <874pl38v6n.873b0n8v6n@871wg78v6n.message.id> <20070627101330.GN7326@cordoba.webit.de> Message-ID: <87odj1g053.87myylg053@87lke5g053.message.id> Jens Kraemer writes: > Here z is not empty in the 500 case, too. Strange behaviour indeed, > imho highlight should either return strings with some highlighted > content, or nothing at all when there's nothing to highlight... Hi, Thanks Jens for your answer. hmm since content has to be stored the default behavior doesn't surprise me. But the issue does ! :) I'm looking forward to read Dave about this :) -- From jesse at hogbaysoftware.com Wed Jun 27 10:41:31 2007 From: jesse at hogbaysoftware.com (Jesse Grosjean) Date: Wed, 27 Jun 2007 16:41:31 +0200 Subject: [Ferret-talk] :store => :yes doesn't work in some cases In-Reply-To: <20070620094917.GF22469@cordoba.webit.de> References: <3718a6efdb1006fc97cfc328b27f2023@ruby-forum.com> <20070620094917.GF22469@cordoba.webit.de> Message-ID: <8b21e2d4ed474feec60ccfa9636f1165@ruby-forum.com> Thanks for responding, I didn't know about the :lazy option. Or at least I didn't understand how it worked, thanks. > Imho the highlight method is supposed to return nil when nothing to > highlight is there. 
In this case just retrieve the content of the field > with result.name or doc[:name] if working with Ferret directly. Ok that's easy enough to do, thanks. > I just checked with a plain Ferret script and it had no problems > retrieving field contents that were just a stop word. If we don't > get this to work there might be an aaf bug, though ;-) No that's working correctly, I just didn't realize that that wasn't loading my models (if the fields were stored). > What do you mean with 'loading the value directly'? > > If you used aaf's :lazy => true option when searching, aaf would by > default not query your db in the first place, and only do so if you ask > for a non-stored field. > > You can read more about this feature there: > http://www.jkraemer.net/2007/3/26/lazy-loading-with-acts_as_ferret Thanks I was missing that part. Now it's working, or almost. The on trouble spot is that when I highlight my results the models do get loaded. I guess is this because FerretResult doesn't have a highlight method, and that causes it to load the underlying model. To get around this I've changed from: result.highlight(...) to: result.model.aaf_index.highlight(result.id, result.model.name...) After doing that the database is no longer hit when displaying highlighted ferret results, but to do that I needed to add "attr_accessor :model" to ActsAsFerret > ResultAttributes. Is there a better way to do that? If not could you add that attr_accessor to ferret proper? It might also make sense to also add the highlight method to FerretResult to avoid the model load, but I'd still like some way to access the model class without requiring a model load since I need that to generate the right link the the original page for each result. I hope at least some of that makes sense :) Thanks again for the great toolkit. Jesse -- Posted via http://www.ruby-forum.com/. 
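The lazy-loading behaviour discussed in this message can be pictured with a small proxy object: stored fields are answered straight from the index data, and any other method call falls through to the real record, which is what triggers the database load. This is only a sketch of the idea. `LazyResult`, `Topic` and `load_count` are hypothetical names, not the actual acts_as_ferret `FerretResult` implementation.

```ruby
# Hypothetical stand-in for an ActiveRecord model with its own highlight method.
Topic = Struct.new(:id, :name, :body) do
  def highlight(query)
    body.gsub(query) { |match| "<em>#{match}</em>" }
  end
end

# Hypothetical lazy search result: serves stored fields without touching the
# database and loads the real record only when something else is asked for.
class LazyResult
  attr_reader :model_name, :id, :load_count

  def initialize(model_name, id, stored_fields, &loader)
    @model_name    = model_name
    @id            = id
    @stored_fields = stored_fields
    @loader        = loader
    @load_count    = 0
  end

  def method_missing(name, *args, &block)
    if args.empty? && @stored_fields.key?(name)
      @stored_fields[name]               # answered from the index, no load
    else
      record.send(name, *args, &block)   # anything else loads the model
    end
  end

  def respond_to_missing?(name, include_private = false)
    @stored_fields.key?(name) || super
  end

  private

  # Loads the underlying record at most once and counts how often that happens.
  def record
    @record ||= begin
      @load_count += 1
      @loader.call(@id)
    end
  end
end

result = LazyResult.new("Topic", 13, { :name => "WR 1 document model" }) do |id|
  Topic.new(id, "WR 1 document model", "jesse wrote this")
end

puts result.name                 # stored field, record not loaded
puts result.load_count
puts result.highlight("jesse")   # not a stored field, so the record is loaded
puts result.load_count
```

In this picture, exposing the model name on the result (as `model_name` does here) is what lets a caller reach the index directly for highlighting without going through the loaded record.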
From jesse at hogbaysoftware.com Wed Jun 27 10:53:07 2007 From: jesse at hogbaysoftware.com (Jesse Grosjean) Date: Wed, 27 Jun 2007 16:53:07 +0200 Subject: [Ferret-talk] Highlight slowness In-Reply-To: References: Message-ID: Paul Lynch wrote: > Has anyone else found that using ferret's highlighting slows searches > down significantly? I am seeing that it more than doubles the search > time on my system. I am returning up to 500 results at once, so the > slow down is quite noticeable (probably adding about .7 seconds for > searches with large result sets.) > --Paul Are you using lazy loading? If so the problem might be that you are not loading anything from your database when doing non-highlight searches (good! fast), but you are when you do a highlight search (bad! slow) Or that might not have anything to do with it, I'm still learning ferret and rails. See this post: http://www.ruby-forum.com/topic/111027#264368 Jesse -- Posted via http://www.ruby-forum.com/. From ferdouse at precisionrecruitment.co.uk Wed Jun 27 11:27:28 2007 From: ferdouse at precisionrecruitment.co.uk (Ferdouse Ara) Date: Wed, 27 Jun 2007 17:27:28 +0200 Subject: [Ferret-talk] Ruby on Rails Developer In-Reply-To: References: Message-ID: Hi Please can you help me with the following: Looking for a Ruby on Rails Developer, Salary £30K-£60K, Leicester, City Centre Use the latest web technology Ruby on Rails. To develop professional web applications. Must have written previous Rails applications. Need the talents of a smart, experienced programmer. Need to be attentive to detail, a team-player and responsible. Require previous commercial experience with Ruby on Rails and dealing with Database Schema's to Front End Applications. Involved with creative projects which are making them a big name in the gaming and video industries (sony and microsoft). Would you be interested or do you know anyone who would be interested as there are several positions available. I appreciate your help in this. 
Ferdouse Ara Technical Recruitment Consultant Precision Recruitment UK Ltd Tel +44 (0) 116 2545488 Fax +44 (0) 871 2778928 http://www.precisionrecruitment.co.uk -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Jun 28 05:55:43 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 28 Jun 2007 11:55:43 +0200 Subject: [Ferret-talk] Example for using ferret search engine In-Reply-To: <4640043d554f37595ab9dda11b0a5d11@ruby-forum.com> References: <4640043d554f37595ab9dda11b0a5d11@ruby-forum.com> Message-ID: <20070628095543.GR7326@cordoba.webit.de> On Sun, Jun 24, 2007 at 06:29:56AM +0200, Ravindraraj Mamadgi wrote: > Hi, > > Is there any application where I can see the usage of Ferret engine(like > example implementation). I have some difficulties in using it, sending > query and getting the results. For the usage of plain Ferret in a non-Web-Application, you might want to check out RDig (http://rubyforge.org/projects/rdig). For a really large Web application using Ferret have a look at the omdb.org source code at http://bugs.omdb.org/ . acts_as_ferret comes with a small demo application, you can have a look at it in Trac: http://projects.jkraemer.net/acts_as_ferret/browser/trunk/demo Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From jesse at hogbaysoftware.com Thu Jun 28 07:39:27 2007 From: jesse at hogbaysoftware.com (Jesse Grosjean) Date: Thu, 28 Jun 2007 13:39:27 +0200 Subject: [Ferret-talk] DRb server crashing Message-ID: I'm having a problem where the DRb server seems to be disappearing (crashing?) and I don't know how to track down the cause or work around the problem. Unfortunately I haven't found a way to reproduce the problem, but it seems to happen fairly often (maybe once a day). Other than that ferret seems to be working well. 
I can't seem to find any trace of the crash, neither of the ferret_server.out (blank) and ferret_server.log (tail -n 100 after a crash attached below) seem to hold any hints to the problem. So my questions are. 1. Has anyone seen this behavior before, how did you work around it? 2. If no one has seen it, any ideas of where I should look to track it down? 3. Last if I can't fix it how much work would it be to just turn off ferret when the server was down. That would mean indexing and searching wouldn't work, but as it now stands no one can post to my site when the problem occurs. Thanks, Jesse ----------- typical crash that I see in my rails app -------------- A DRb::DRbConnError occurred in comments#create: druby://127.0.0.1:40869 - # /usr/local/lib/ruby/1.8/drb/drb.rb:736:in `open' ------------------------------- Request: ------------------------------- * URL: http://127.0.0.1:40860/forums/writeroom/topics/18_WR_1_document_model/comments * Parameters: {"topic_id"=>"18_WR_1_document_model", "commit"=>"Create and Save", "action"=>"create", "controller"=>"comments", "forum_id"=>"writeroom", "comment"=>{"comment"=>"Perhaps there is a way to let the two document models coexist. What if you could open, save and close documents the standard way, but let WriteRoom handle all open documents internally until they are manually closed? Even between sessions? I imagine this could be achieved by using an internal database, or perhaps a folder with \"working\" files under Application Support. This would also serve as a simple versioning system -- you open your file, manipulate it in WriteRoom (perhaps during the course of many sessions) and then save the changes when your ready.\r\nThis would accomplish the same thing as WR 1.0 did, but would be invisible to those who want the familiar old file-system. 
I also believe this would bring back the \"room\" in WriteRoom.\r\n\r\nJust a thought.", "parent_id"=>""}} * Rails root: /home/jessegr/apps/blocks/releases/20070627174655 ------------------------------- Session: ------------------------------- * session id: "6c6a8092f55f2b5160197f33ec616e74" * data: {"flash"=>{}, :user=>15} ------------------------------- Environment: ------------------------------- * CONTENT_LENGTH : 872 * CONTENT_TYPE : application/x-www-form-urlencoded * GATEWAY_INTERFACE : CGI/1.2 * HTTP_ACCEPT : */* * HTTP_ACCEPT_ENCODING : gzip, deflate * HTTP_ACCEPT_LANGUAGE : sv-se * HTTP_CONNECTION : Keep-Alive * HTTP_CONTENT_LENGTH : 872 * HTTP_CONTENT_TYPE : application/x-www-form-urlencoded * HTTP_COOKIE : _blocks_session_id=6c6a8092f55f2b5160197f33ec616e74; __utmz=26552955.1182412857.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); __utmc=26552955; __utmb=26552955; auth_token=4618c42bf7a93f8d74e9bef8eedb1c9fbfb37690; __utma=26552955.823564440.1182412857.1183021598.1183026292.14 * HTTP_HOST : 127.0.0.1:40860 * HTTP_MAX_FORWARDS : 10 * HTTP_REFERER : http://75.126.217.82/forums/writeroom/topics/18_WR_1_document_model/comments/new * HTTP_USER_AGENT : Mozilla/5.0 (Macintosh; U; PPC Mac OS X; sv-se) AppleWebKit/419.2.1 (KHTML, like Gecko) Safari/419.3 * HTTP_VERSION : HTTP/1.1 * HTTP_X_FORWARDED_FOR : 213.185.4.64 * HTTP_X_FORWARDED_HOST : 75.126.217.82 * HTTP_X_FORWARDED_SERVER: www.jesse-grosjean-temp.com * PATH_INFO : /forums/writeroom/topics/18_WR_1_document_model/comments * RAW_POST_DATA : [FILTERED] * REMOTE_ADDR : 213.185.4.64 * REQUEST_METHOD : POST * REQUEST_PATH : /forums/writeroom/topics/18_WR_1_document_model/comments * REQUEST_URI : /forums/writeroom/topics/18_WR_1_document_model/comments * SCRIPT_NAME : / * SERVER_NAME : 127.0.0.1 * SERVER_PORT : 40860 * SERVER_PROTOCOL : HTTP/1.1 * SERVER_SOFTWARE : Mongrel 1.0.1 * Process: 4894 * Server : spurgeon ------------------------------- Backtrace: ------------------------------- 
/usr/local/lib/ruby/1.8/drb/drb.rb:736:in `open' /usr/local/lib/ruby/1.8/drb/drb.rb:729:in `each' /usr/local/lib/ruby/1.8/drb/drb.rb:729:in `open' /usr/local/lib/ruby/1.8/drb/drb.rb:1189:in `initialize' /usr/local/lib/ruby/1.8/drb/drb.rb:1169:in `new' /usr/local/lib/ruby/1.8/drb/drb.rb:1169:in `open' /usr/local/lib/ruby/1.8/drb/drb.rb:1085:in `method_missing' /usr/local/lib/ruby/1.8/drb/drb.rb:1103:in `with_friend' /usr/local/lib/ruby/1.8/drb/drb.rb:1084:in `method_missing' [RAILS_ROOT]/vendor/plugins/acts_as_ferret/lib/remote_index.rb:31:in `<<' [RAILS_ROOT]/vendor/plugins/acts_as_ferret/lib/instance_methods.rb:73:in `ferret_update' [RAILS_ROOT]/app/models/topic.rb:61:in `update_cached_fields' [RAILS_ROOT]/app/models/topic.rb:45:in `update_cached_comment_fields' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/associations/association_proxy.rb:128:in `send' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/associations/association_proxy.rb:128:in `method_missing' [RAILS_ROOT]/app/models/comment.rb:62:in `update_cached_fields' [RAILS_ROOT]/app/models/comment.rb:43:in `after_update' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:352:in `send' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:352:in `callback' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:271:in `update_without_timestamps' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/timestamp.rb:38:in `update' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/base.rb:1959:in `create_or_update_without_callbacks' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:243:in `create_or_update' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/base.rb:1693:in `save_without_validation' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/validations.rb:848:in `save_without_transactions' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' 
[RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:59:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/query_cache.rb:66:in `send' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/query_cache.rb:66:in `method_missing' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:77:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:97:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:117:in `rollback_active_record_state!' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' [RAILS_ROOT]/app/models/comment.rb:35:in `after_create' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:352:in `send' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:352:in `callback' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:257:in `create_without_timestamps' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/timestamp.rb:29:in `create' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/base.rb:1959:in `create_or_update_without_callbacks' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/callbacks.rb:243:in `create_or_update' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/base.rb:1693:in `save_without_validation' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/validations.rb:848:in `save_without_transactions' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:59:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/query_cache.rb:66:in `send' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/query_cache.rb:66:in 
`method_missing' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:77:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:97:in `transaction' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:117:in `rollback_active_record_state!' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/transactions.rb:105:in `save' [RAILS_ROOT]/app/controllers/comments_controller.rb:52:in `create' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/mime_responds.rb:104:in `call' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/mime_responds.rb:104:in `respond_to' [RAILS_ROOT]/app/controllers/comments_controller.rb:51:in `create' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/base.rb:1136:in `send' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/base.rb:1136:in `perform_action_without_filters' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/filters.rb:713:in `call_filters' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/filters.rb:752:in `perform_action_without_benchmark' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/benchmarking.rb:68:in `perform_action_without_rescue' /usr/local/lib/ruby/1.8/benchmark.rb:293:in `measure' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/benchmarking.rb:68:in `perform_action_without_rescue' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/rescue.rb:133:in `perform_action_without_caching' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/caching.rb:668:in `perform_action' [RAILS_ROOT]/vendor/rails/activerecord/lib/active_record/query_cache.rb:99:in `cache' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/caching.rb:667:in `perform_action' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/base.rb:494:in `send' 
[RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/base.rb:494:in `process_without_filters' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/filters.rb:747:in `process_without_session_management_support' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/session_management.rb:122:in `process' [RAILS_ROOT]/vendor/rails/actionpack/lib/action_controller/base.rb:346:in `process' [RAILS_ROOT]/vendor/rails/railties/lib/dispatcher.rb:39:in `dispatch' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:78:in `process' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76:in `synchronize' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76:in `process' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:618:in `process_client' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:617:in `each' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:617:in `process_client' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:736:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:736:in `initialize' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:736:in `new' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:736:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:720:in `initialize' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:720:in `new' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:720:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:271:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:270:in `each' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:270:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/bin/mongrel_rails:127:in `run' /usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/command.rb:211:in `run' 
/usr/local/lib/ruby/gems/1.8/gems/mongrel-1.0.1/bin/mongrel_rails:243 /usr/local/bin/mongrel_rails:16:in `load' /usr/local/bin/mongrel_rails:16 ----------- ferret_server.out ----------------- EMPTY ----------- ferret_server.log ----------------- call index method: highlight with [13, "Topic", "jesse", {:field=>:ferret_name, :excerpt_length=>150, :num_excerpts=>1, :pre_tag=>"", :post_tag=>""}] call index method: highlight with [13, "Topic", "jesse", {:field=>:ferret_content, :excerpt_length=>150, :num_excerpts=>1, :pre_tag=>"", :post_tag=>""}] call index method: highlight with [17, "Topic", "jesse", {:field=>:ferret_name, :excerpt_length=>150, :num_excerpts=>1, :pre_tag=>"", :post_tag=>""}] call index method: highlight with [17, "Topic", "jesse", {:field=>:ferret_content, :excerpt_length=>150, :num_excerpts=>1, :pre_tag=>"", :post_tag=>""}] call index method: add with [{:ferret_name=>"test", :class_name=>"Topic", :ferret_content=>"test", :id=>19}] call index method: add with [{:ferret_name=>"test", :class_name=>"Topic", :ferret_content=>"test", :id=>19}] call index method: remove with [19, "Topic"] call index method: add with [{:ferret_name=>"test", :class_name=>"Topic", :ferret_content=>"test", :id=>19}] jessegr at spurgeon [~/apps/blocks/current/log]# -- Posted via http://www.ruby-forum.com/. 
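On question 3 above (keeping the site usable while the ferret server is down), one possible stop-gap is to rescue the connection error around index calls in the application's own code, so that a save still succeeds and only indexing is skipped. The sketch below assumes that approach. `index_safely` is a made-up helper, not part of acts_as_ferret, and a document skipped this way stays out of the index until the model is reindexed.

```ruby
require 'drb'

# Hypothetical helper: run an index operation, but treat an unreachable DRb
# server as "skip it" instead of letting the error abort the whole request.
def index_safely(fallback = nil)
  yield
rescue DRb::DRbConnError => e
  warn "ferret server unreachable, skipping index call: #{e.message}"
  fallback
end

# Simulated failure, standing in for a call against a dead ferret server.
outcome = index_safely(:skipped) do
  raise DRb::DRbConnError, "druby://127.0.0.1:40869 - connection refused"
end

puts outcome.inspect
```

A wrapper like this would go around whatever call reaches the remote index, for example the `ferret_update` call that appears in the backtrace above.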
From jonathan.viney at gmail.com Thu Jun 28 08:25:57 2007 From: jonathan.viney at gmail.com (Jonathan Viney) Date: Thu, 28 Jun 2007 14:25:57 +0200 Subject: [Ferret-talk] ThreadError from DRb server Message-ID: <6547a7275ac6ecfd9c9d8d95f2339549@ruby-forum.com> Hi there, I've just started using aaf for searching and it's running well apart from one error I received: A ThreadError occurred in person#search: current thread not owner (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:274:in `mon_check_owner' (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:274:in `mon_check_owner' (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:220:in `mon_exit' (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:240:in `synchronize' (druby:/localhost:9010) /var/lib/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:8:in `synchrolock' (druby:/localhost:9010) /var/lib/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:267:in `<<' (druby:/localhost:9010) /var/www/rails/balrog/releases/20070628045633/vendor/plugins/acts_as_ferret/lib/local_index.rb:263:in `reindex_model' ..... Has anybody else had this before? Any ideas what could have caused it? It's only happened once. I'm using ferret 0.11.4 and aaf trunk. Cheers, -Jonathan. -- Posted via http://www.ruby-forum.com/. From jonathan.viney at gmail.com Thu Jun 28 08:36:51 2007 From: jonathan.viney at gmail.com (Jonathan Viney) Date: Thu, 28 Jun 2007 14:36:51 +0200 Subject: [Ferret-talk] acts_as_ferret and capistrano Message-ID: Hi, I'd like to share the ferret indexes between deployments of a Rails app. At the moment the index is stored in #{RAILS_ROOT}/index meaning that it gets moved and must be rebuilt after every deploy. I guess the simplest solution would be to put it under log/ which is already shared by capistrano, any other ideas? Also, script/ferret_stop doesn't work when run on a fresh deploy without a index/ dir. 
It gives an error like: /home/jviney/Workspace/balrog/vendor/plugins/acts_as_ferret/lib/act_methods.rb:188:in `open':Errno::ENOENT: No such file or directory - /home/jviney/Workspace/balrog/index/production/person Even though it is set up to use DRb, it wrongly expects the index to be present. Aaf shouldn't do anything with a local index when it is supposed to be using DRb. The "unless options[:remote]" block in act_methods.rb (line 136) should extend down to logger.debug five lines later to avoid the call to find_last_index_version. Cheers, -Jonathan. -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Jun 28 09:29:42 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 28 Jun 2007 15:29:42 +0200 Subject: [Ferret-talk] acts_as_ferret and capistrano In-Reply-To: References: Message-ID: <20070628132941.GS7326@cordoba.webit.de> On Thu, Jun 28, 2007 at 02:36:51PM +0200, Jonathan Viney wrote: > Hi, > > I'd like to share the ferret indexes between deployments of a Rails app. > At the moment the index is stored in #{RAILS_ROOT}/index meaning that it > gets moved and must be rebuilt after every deploy. I guess the simplest > solution would be to put it under log/ which is already shared by > capistrano, any other ideas? I usually create my index directory once in shared/index and symlink this to current/index in an after_update_code task. > Also, script/ferret_stop doesn't work when run on a fresh deploy without > a index/ dir. It gives an error like: > > /home/jviney/Workspace/balrog/vendor/plugins/acts_as_ferret/lib/act_methods.rb:188:in `open':Errno::ENOENT: No such file or directory - > /home/jviney/Workspace/balrog/index/production/person > > Even though it is set up to use DRb, it wrongly expects the index to be > present. Aaf shouldn't do anything with a local index when it is > supposed to be using DRb. 
The "unless options[:remote]" block in > act_methods.rb (line 136) should extend down to logger.debug five lines > later to avoid the call to find_last_index_version. good point, I just committed this. Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Thu Jun 28 09:34:13 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 28 Jun 2007 15:34:13 +0200 Subject: [Ferret-talk] ThreadError from DRb server In-Reply-To: <6547a7275ac6ecfd9c9d8d95f2339549@ruby-forum.com> References: <6547a7275ac6ecfd9c9d8d95f2339549@ruby-forum.com> Message-ID: <20070628133413.GT7326@cordoba.webit.de> On Thu, Jun 28, 2007 at 02:25:57PM +0200, Jonathan Viney wrote: > Hi there, > > I've just started using aaf for searching and it's running well apart > from one error I received: > > A ThreadError occurred in person#search: > > current thread not owner > (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:274:in > `mon_check_owner' > (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:274:in > `mon_check_owner' > (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:220:in > `mon_exit' > (druby:/localhost:9010) /usr/lib/ruby/1.8/monitor.rb:240:in > `synchronize' > (druby:/localhost:9010) > /var/lib/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:8:in > `synchrolock' > (druby:/localhost:9010) > /var/lib/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:267:in `<<' > (druby:/localhost:9010) > /var/www/rails/balrog/releases/20070628045633/vendor/plugins/acts_as_ferret/lib/local_index.rb:263:in > `reindex_model' > > ..... > > Has anybody else had this before? Any ideas what could have caused it? > It's only happened once. I'm using ferret 0.11.4 and aaf trunk. Interesting. It might be because you ran your first search without having an index, so aaf triggered an automatic rebuild. 
Possible that this scenario hasn't occurred in my testing with the DRb server yet. I'll try to reproduce this. Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From kraemer at webit.de Thu Jun 28 09:41:27 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 28 Jun 2007 15:41:27 +0200 Subject: [Ferret-talk] DRb server crashing In-Reply-To: References: Message-ID: <20070628134127.GU7326@cordoba.webit.de> On Thu, Jun 28, 2007 at 01:39:27PM +0200, Jesse Grosjean wrote: > I'm having a problem where the DRb server seems to be disappearing > (crashing?) and I don't know how to track down the cause or work around > the problem. > > Unfortunately I haven't found a way to reproduce the problem, but it > seems to happen fairly often (maybe once a day). Other then that ferret > seems to be working well. > > I can't seem to find any trace of the crash, neither of the > ferret_server.out (blank) and ferret_server.log (tail -n 100 after a > crash attached below) seem to hold any hints to the problem. > > So my questions are. > > 1. Has anyone seen this behavior before, how did you work around it? > 2. If no one has seen it, any ideas of where I should look to track it > down? Difficult given that there are no log entries at all. Try to find out if the crash is related to special queries or indexing requests, or happens during idle time. You might use monit or something like this to track down when exactly the DRb server went away, and automatically launch a new one. > 3. Last if I can't fix it how much work would it be to just turn off > ferret when the server was down. That would mean indexing and searching > wouldn't work, but as it now stands no one can post to my site when the > problem occurs. I'll implement handling of these errors in aaf soon (hopefully this weekend). 
Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From ferret-talk at stuartsierra.com Thu Jun 28 12:33:17 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Thu, 28 Jun 2007 12:33:17 -0400 Subject: [Ferret-talk] Warning encoding error after Ferret changesets 765-769 Message-ID: <4683E2CD.50906@stuartsierra.com> Hello all, I recently upgraded from the Ferret gem 0.11.4 to the Subversion trunk, changeset #770. I did this because I needed the large file patch from http://ferret.davebalmain.com/trac/ticket/215 When adding documents to the index, I get the following message printed to STDERR many times: Warning encoding error. Please check that you are using the correct locale for your input Google found me this: http://ferret.davebalmain.com/trac/changeset/768 This appears to be part of a series of changes starting at #765 and ending at #769. #769 has the message "Reverted fix for OpenSolaris as it was affecting other operating systems. New fix in place." But apparently not everything was reverted. This does not seem to affect indexing or retrieval, just the warning message. Has anyone else experienced this? I'll file a ticket if I can get confirmation. This is on Ubuntu Feisty, Linux 2.6.20, i686, Ruby version 1.8.5. -Stuart From tarrall at gmail.com Thu Jun 28 15:48:32 2007 From: tarrall at gmail.com (Robert Tarrall) Date: Thu, 28 Jun 2007 13:48:32 -0600 Subject: [Ferret-talk] Is anyone using ferret on Solaris/SPARC? Message-ID: Ferret throws a bus error in the unit tests under Solaris, sun4u architecture. http://ferret.davebalmain.com/trac/ticket/272 Bug reporter appears to be on Solaris 8 with Ruby 1.8.4. 
I've tried on Solaris 10 (first release and 10/06, first without and
then with the most recent patch set), both with Ruby 1.8.6, and get
exactly the same error in the same spot. Tried with 0.11.4 and also a
couple of older versions (0.10.14 and 0.10.7), again get the same error.
Would try HEAD but svn://davebalmain.com/ seems to be down. Have tried a
couple of different versions of gcc as well. I'm fairly sure the same
compiler was used to compile both Ruby and Ferret.

Just curious if anyone's gotten Ferret to pass the unit tests on
Solaris. (I've also tried just using it but it throws bus errors in
production as well when I try to build the index.)

In case anyone's familiar enough with the code to suggest a reason why
this might be happening - bus errors always seem to occur on MP_ALLOC
calls. Here's an example (from 0.11.4):

#0  0xff1c0f90 in _lwp_kill () from /lib/libc.so.1
#1  0xff15fd80 in raise () from /lib/libc.so.1
#2  0xff13ffa0 in abort () from /lib/libc.so.1
#3  0x0009ef44 in rb_bug (fmt=0xb7908 "Bus Error") at error.c:214
#4  0x0007e0b8 in sigbus (sig=73) at signal.c:605
#5  0xff1bfed0 in __sighndlr () from /lib/libc.so.1
#6  0xff1b4ffc in call_user_handler () from /lib/libc.so.1
#7  0xfdf43668 in dw_add_posting (mp=0x13e7590, curr_plists=0x0, fld_plists=0xa, doc_num=0, text=0x1e64d2c "policies", len=8, pos=1) at index.c:4832
#8  0xfdf437b0 in dw_invert_field (dw=0x1dfab28, fld_inv=0x206c810, df=0x13e74f0) at index.c:5218
#9  0xfdf43a8c in dw_add_doc (dw=0x1dfab28, doc=0x99f5e0) at index.c:5288
#10 0xfdf44e54 in iw_add_doc (iw=0x1dfaa00, doc=0x99f5e0) at index.c:5968
#11 0xfdf34990 in frt_iw_add_doc (self=27175152, rdoc=27184872) at r_index.c:1541

Kinda wonder if this is a 64-bit issue though I haven't made any attempt
to compile these in 64-bit mode...

--
-Robert Tarrall.-
Unix System/Network Admin
E.Central/Neighborhood Link

From jonathan.viney at gmail.com Thu Jun 28 20:05:07 2007
From: jonathan.viney at gmail.com (Jonathan Viney)
Date: Fri, 29 Jun 2007 02:05:07 +0200
Subject: [Ferret-talk] acts_as_ferret and capistrano
In-Reply-To: <20070628132941.GS7326@cordoba.webit.de>
References: <20070628132941.GS7326@cordoba.webit.de>
Message-ID: <636b87ad7f3c5c7b8f7cb3e3f9d20220@ruby-forum.com>

> I usually create my index directory once in shared/index and symlink
> this to current/index in an after_update_code task.

Sounds pretty good. It would be nice to include all the necessary
capistrano tasks in the plugin.

>> supposed to be using DRb. The "unless options[:remote]" block in
>> act_methods.rb (line 136) should extend down to logger.debug five lines
>> later to avoid the call to find_last_index_version.
>
> good point, I just committed this.
>

Cheers,
-Jonathan.

--
Posted via http://www.ruby-forum.com/.
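
For readers following the thread, the shared/index symlink approach
quoted above might look roughly like this. It is a sketch, not the task
Jens actually uses: index_symlink_command is a hypothetical helper
introduced here for illustration, and the commented-out task assumes
Capistrano's usual shared_path and release_path variables and its run
command.

```ruby
# Hypothetical helper: builds the shell command that points a release's
# index directory at the index kept under the shared directory. The paths
# are passed in, so the helper itself has no Capistrano dependency.
# Note: paths are interpolated unquoted, so they must not contain spaces.
def index_symlink_command(shared_path, release_path)
  "ln -nfs #{shared_path}/index #{release_path}/index"
end

# Inside a Capistrano recipe (deploy.rb) it might be used like this:
#
#   task :after_update_code, :roles => :app do
#     run index_symlink_command(shared_path, release_path)
#   end
```

The index directory itself still has to be created once under the shared
directory before the first deploy, as described in the quoted message.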