From wintonius at gmail.com Sun Oct 1 20:08:35 2006
From: wintonius at gmail.com (Winton)
Date: Mon, 2 Oct 2006 02:08:35 +0200
Subject: [Ferret-talk] Another web app using Ferret
Message-ID: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com>

I am a part of a team that runs a student site called Studicious
(http://stu.dicio.us). We have been using Ferret from the beginning, and
recently added acts_as_ferret and sorting to the system.

As you can see if you try the search, sorting is not working as
expected. I am using this code (w/ find_by_content):

:sort => Ferret::Search::SortField.new(:school_sort, :reverse => false)

:school_sort is an untokenized field with otherwise default settings.
Any ideas?

- Winton

--
Posted via http://www.ruby-forum.com/.

From mleung at projectrideme.com Mon Oct 2 00:12:02 2006
From: mleung at projectrideme.com (Michael Leung)
Date: Mon, 2 Oct 2006 06:12:02 +0200
Subject: [Ferret-talk] Strange Sorting Issues
Message-ID:

Hi there,

I'm having some strange sorting stuff going on. Here's my search method:


sort_fields = []
sort_fields << Ferret::Search::SortField.new("name",
:reverse => :false)
@results = Listing.find_by_contents @search_criteria, :limit => :all,
:sort => sort_fields

page = (params[:page] ||= 1).to_i
items_per_page = 9
offset = (page - 1) * items_per_page
@pages = Paginator.new(self, @results.length, items_per_page, page)
@results = @results[offset..(offset + items_per_page - 1)]

For some queries, the sorting is correct. Other times, it's not. I'm not
sure what's causing this. Any help would be greatly appreciated!

We're using ferret 10.3, I believe.

Thanks.

--
Posted via http://www.ruby-forum.com/.

From peter at ioffer.com Mon Oct 2 01:42:02 2006
From: peter at ioffer.com (peter)
Date: Sun, 01 Oct 2006 22:42:02 -0700
Subject: [Ferret-talk] Strange Sorting Issues
In-Reply-To:
Message-ID:

Are you sorting by the same "name" field each time, or is it by different
fields?

> From: Michael Leung
> Reply-To: ferret-talk at rubyforge.org
> Date: Mon, 2 Oct 2006 06:12:02 +0200
> To: ferret-talk at rubyforge.org
> Subject: [Ferret-talk] Strange Sorting Issues
>
> Hi there,
>
> I'm having some strange sorting stuff going on. Here's my search method:
>
>
> sort_fields = []
> sort_fields << Ferret::Search::SortField.new("name",
> :reverse => :false)
> @results = Listing.find_by_contents @search_criteria, :limit => :all,
> :sort => sort_fields
>
> page = (params[:page] ||= 1).to_i
> items_per_page = 9
> offset = (page - 1) * items_per_page
> @pages = Paginator.new(self, @results.length, items_per_page, page)
> @results = @results[offset..(offset + items_per_page - 1)]
>
> For some queries, the sorting is correct. Other times, it's not. I'm not
> sure what's causing this. Any help would be greatly appreciated!
>
> We're using ferret 10.3, I believe.
>
> Thanks.
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

From david.wennergren at gmail.com Mon Oct 2 03:15:43 2006
From: david.wennergren at gmail.com (David Wennergren)
Date: Mon, 2 Oct 2006 09:15:43 +0200
Subject: [Ferret-talk] Strange Sorting Issues
In-Reply-To:
References:
Message-ID:

> For some queries, the sorting is correct. Other times, it's not. I'm not
> sure what's causing this. Any help would be greatly appreciated!

I've had the same problem. I solved it by using the find_options in the
find_by_contents method. Like this:

find_by_contents(q, options, find_options)

find_options is a hash passed on to active_record's find when retrieving
the data from the db, useful to e.g. prefetch relationships.

So for your query:

@results = Listing.find_by_contents @search_criteria, {:limit => :all,
:sort => sort_fields},{:order => "name ASC"}

I'm not sure this is the best way but it worked for me.

/David

--
Posted via http://www.ruby-forum.com/.
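
A side note on the search method quoted in this thread: it passes
:reverse => :false, and :false is a Symbol, not the boolean false. In Ruby
every Symbol is truthy, so code that tests the option with a plain
truthiness check would read it as a request to reverse. The sketch below is
plain Ruby with a hypothetical helper (reverse_requested?); whether Ferret
inspects the option this way, and whether this explains the inconsistent
ordering reported above, is an assumption that the thread does not confirm.

```ruby
# Hypothetical helper, for illustration only: it stands in for any code
# that decides on reverse ordering with a simple truthiness test.
def reverse_requested?(options)
  options[:reverse] ? true : false
end

with_symbol  = reverse_requested?(:reverse => :false) # Symbol, truthy
with_boolean = reverse_requested?(:reverse => false)  # real boolean

puts "with :false -> #{with_symbol}"
puts "with false  -> #{with_boolean}"
```

If the symbol turns out to matter, writing :reverse => false (no leading
colon) removes the ambiguity either way.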
From kraemer at webit.de Mon Oct 2 04:42:12 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 2 Oct 2006 10:42:12 +0200
Subject: [Ferret-talk] Strange Sorting Issues
In-Reply-To:
References:
Message-ID: <20061002084212.GZ11602@cordoba.webit.de>

On Mon, Oct 02, 2006 at 09:15:43AM +0200, David Wennergren wrote:
> > For some queries, the sorting is correct. Other times, it's not. I'm not
> > sure what's causing this. Any help would be greatly appreciated!
>
> I've had the same problem. I solved it by using the find_options in
> find_by_contents method. Like this:
>
> find_by_contens(q, options, find_options)
>
> find_options is a hash passed on to active_record's find when retrieving
> the data from db, useful to i.e. prefetch relationships.
>
> So for your query:
>
> @results = Listing.find_by_contents @search_criteria, {:limit => :all,
> :sort => sort_fields},{:order => "name ASC"}
>
> I'm not sure this is the best way but it worked for me.

please note that this will only work with :limit => :all, otherwise
you'll only sort the subset of records retrieved from ferret, not the
whole result set.

As :limit => :all can be *very* expensive (with Ferret returning all
results, and aaf fetching them all from the db), making the Ferret
sorting work correctly would be the better way.

Is the correctness of sorting related to a special kind of queries?

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From josh.nug at gmail.com Mon Oct 2 05:04:29 2006
From: josh.nug at gmail.com (Josh D.)
Date: Mon, 2 Oct 2006 11:04:29 +0200
Subject: [Ferret-talk] concurrency / #search_each problem / segfault
In-Reply-To:
References: <36115846c5a17377b2c332f69ffd37be@ruby-forum.com>
Message-ID: <1f019d069c0f38d0def668fcf1fc2d76@ruby-forum.com>

Hi Dave,

> Just to you this time. What is the rest of the code in this loop
> (above).
ie, what is the "..". It should help me sort out the problem. here are the complete stack traces: ArgumentError (:12250 is out of range [0..12243] for IndexWriter#[]): /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:382:in `[]' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:382:in `[]' /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:375:in `[]' /app/models/article.rb:150:in `fulltext_search' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:364:in `search_each' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:363:in `search_each' /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:359:in `search_each' /app/models/article.rb:149:in `fulltext_search' /app/controllers/public/article_controller.rb:91:in `search' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/base.rb:941:in `perform_action_without_filters' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/filters.rb:368:in `perform_action_without_benchmark' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue' /usr/local/lib/site_ruby/benchmark.rb:300:in `measure' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/rescue.rb:82:in `perform_action' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/base.rb:408:in `process_without_filters' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/filters.rb:377:in . 
`process_without_session_management_support' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/session_management.rb:117:in `process' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/dispatcher.rb:38:in `dispatch' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:150:in `process_request' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:54:in `process!' /usr/lib/ruby/1.8/fcgi.rb:600:in `each_cgi' /usr/lib/ruby/1.8/fcgi.rb:597:in `each_cgi' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:53:in `process!' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:23:in `process!' Ferret::StateError (State Error occured at :79 in xraise Error occured in index.c:3404 - sr_get_lazy_doc Document 0 has already been deleted ): /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:382:in `[]' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:382:in `[]' /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:375:in `[]' /app/models/article.rb:150:in `fulltext_search' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:364:in `search_each' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:363:in `search_each' /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.8/lib/ferret/index.rb:359:in `search_each' /app/models/article.rb:149:in `fulltext_search' /app/controllers/public/article_controller.rb:91:in `search' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/base.rb:941:in `perform_action_without_filters' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/filters.rb:368:in `perform_action_without_benchmark' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue' /usr/local/lib/site_ruby/benchmark.rb:300:in `measure' 
/usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/rescue.rb:82:in `perform_action' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/base.rb:408:in `process_without_filters' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/filters.rb:377:in . `process_without_session_management_support' /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/session_management.rb:117:in `process' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/dispatcher.rb:38:in `dispatch' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:150:in `process_request' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:54:in `process!' /usr/lib/ruby/1.8/fcgi.rb:600:in `each_cgi' /usr/lib/ruby/1.8/fcgi.rb:597:in `each_cgi' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:53:in `process!' /usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/fcgi_handler.rb:23:in `process!' Hope that helps. > By the way, you should upgrade to Ferret-0.10.9. Ok, will do that. Thank you josh -- Posted via http://www.ruby-forum.com/. From david.wennergren at gmail.com Mon Oct 2 06:03:31 2006 From: david.wennergren at gmail.com (David Wennergren) Date: Mon, 2 Oct 2006 12:03:31 +0200 Subject: [Ferret-talk] Strange Sorting Issues In-Reply-To: <20061002084212.GZ11602@cordoba.webit.de> References: <20061002084212.GZ11602@cordoba.webit.de> Message-ID: <4e059dfa1888768738430ae664a2e7c1@ruby-forum.com> > please note that this will only work with :limit => :all, otherwise > you'll only sort the subset of records retrieved from ferret, not the > whole result set. Just to make sure that I don't misunderstand something. If I skip the find_options but use a Ferret sort field I get the correct result (for exampel, 20 hits ordered by name). 
My problem was that if I didn't provide the find_options, the records,
when loaded with SQL like this (in the find_by_contents method)
"items.id in (1,12,13,45,23)", were still in the wrong order unless I
passed a find_options hash ordering them by "name".

/David

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Mon Oct 2 07:45:12 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 2 Oct 2006 13:45:12 +0200
Subject: [Ferret-talk] Strange Sorting Issues
In-Reply-To: <4e059dfa1888768738430ae664a2e7c1@ruby-forum.com>
References: <20061002084212.GZ11602@cordoba.webit.de>
 <4e059dfa1888768738430ae664a2e7c1@ruby-forum.com>
Message-ID: <20061002114512.GC11602@cordoba.webit.de>

On Mon, Oct 02, 2006 at 12:03:31PM +0200, David Wennergren wrote:
> > please note that this will only work with :limit => :all, otherwise
> > you'll only sort the subset of records retrieved from ferret, not the
> > whole result set.
>
> Just to make sure that I don't misunderstand something. If I skip the
> find_options but use a Ferret sort field I get the correct result (for
> example, 20 hits ordered by name).
>
> My problem was that if I didn't provide the find_options, the records
> when loaded with an sql like this (in the find_by_contents method)
> "items.id in (1,12,13,45,23)" was still in the wrong order unless I
> passed a find_options ordering them by "name".

aaf is supposed to retain the sorting of results delivered by Ferret.
The records retrieved with the sql 'in' clause are sorted afterwards so
they are in the same order as the original Ferret result set.

At least that is how it is supposed to be.
Could you please post your acts_as_ferret declaration, and the snippet
where you call find_by_contents, so I can check if this is a bug in aaf?

cheers,
Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From charlie.hubbard at gmail.com Mon Oct 2 09:30:59 2006
From: charlie.hubbard at gmail.com (Charlie Hubbard)
Date: Mon, 2 Oct 2006 15:30:59 +0200
Subject: [Ferret-talk] Adding dependant objects to an Index?
Message-ID:

I have a design question and I'm wondering what's the best way to solve
it. I'm trying to index HTML content where I have a single model object,
call it Article, that is an acts_as_ferret model, and an article consists
of many HTML files. I would like to index all of the content of the
article with ferret and search across it. However, since the article's
content is spread over several files, how would I do that if I don't have
an object in the database for each page? Is there a way from within my
Article object to add more than one Document to the index? These pages
would obviously be attached to the life cycle of the Article. In other
words, if I remove the article I want to remove all the pages that went
along with that article. How would I do that?

Another question I have is I would like to search the elements of the
article like author, title, etc, and search the contents of those
Articles within one search field. Can I place all of this data inside a
single index? Or do I have to use the multi_search method?

Thanks
Charlie

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Mon Oct 2 11:48:58 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 2 Oct 2006 17:48:58 +0200
Subject: [Ferret-talk] Adding dependant objects to an Index?
In-Reply-To:
References:
Message-ID: <20061002154858.GA25863@cordoba.webit.de>

On Mon, Oct 02, 2006 at 03:30:59PM +0200, Charlie Hubbard wrote:
>
> I have design question and I'm wondering what's the best way to solve
> it.
I'm trying to index HTML content where I have a single model object > call it Article that is an acts_as_ferret model, and an article consists > of many HTML files. I would like to index all of the content of the > article with ferret and search across it. However, since the article's > content is spread over several files how would I do that if I don't have > an object in the database for each page? Is there a way from within my > Article object to add more than one Document to the index? These pages > would obviously be attached to the life cycle of the Article. In other > words if I remove the article I want to remove all the pages that went > along with that article. How would I do that? Do you want to be able to find single html files in search results, or is it ok to only find the whole article, without knowing which file the hit was in ? In the first case, you can either create a Page model representing a single page and index that, or don't use acts_as_ferret at all and do the indexing yourself. The easier way is the second case, just create a method named html_content returning the concatenated contents from all the files belonging to your article, and add :html_content to the fields list in your call to acts_as_ferret. This will index all files belonging to your article in a single Ferret document. > Another question I have is I would like to search the elements of the > article like author, title, etc, and search the contents of those > Articles within one search field. Can I place all of this data inside a > single index? Or do I have to use the multi_search method? you'll only need multi_search if you have several indexes (that is, several Model classes where you called acts_as_ferret). In your case, if you choose the second way, just index your meta data together with the content, aaf will by default search in all fields. cheers, Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From mleung at projectrideme.com Mon Oct 2 12:51:15 2006
From: mleung at projectrideme.com (Michael Leung)
Date: Mon, 2 Oct 2006 18:51:15 +0200
Subject: [Ferret-talk] Strange Sorting Issues
In-Reply-To: <20061002114512.GC11602@cordoba.webit.de>
References: <20061002084212.GZ11602@cordoba.webit.de>
 <4e059dfa1888768738430ae664a2e7c1@ruby-forum.com>
 <20061002114512.GC11602@cordoba.webit.de>
Message-ID:

Hey guys. Thanks for all the great feedback. Actually, using David
Wennergren's suggestions did the trick! Thanks David!

M.

Jens Kraemer wrote:
> On Mon, Oct 02, 2006 at 12:03:31PM +0200, David Wennergren wrote:
>> "items.id in (1,12,13,45,23)" was still in the wrong order unless I
>> passed a find_options ordering them by "name".
>
> aaf is supposed to retain the sorting of results delivered by Ferret.
> The records retrieved with the sql 'in' clause are sorted afterwards so
> they are in the same order as the original Ferret result set.
>
> At least that is how it is supposed to be.
> Could you please post your acts_as_ferret declaration, and the snippet
> where you call find_by_contents, so I can check if this is a bug in aaf?
>
> cheers,
> Jens
>
>
> --
> webit! Gesellschaft für neue Medien mbH www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
> Schnorrstraße 76 Tel +49 351 46766 0
> D-01069 Dresden Fax +49 351 46766 66

--
Posted via http://www.ruby-forum.com/.

From charlie.hubbard at gmail.com Mon Oct 2 14:47:04 2006
From: charlie.hubbard at gmail.com (Charlie Hubbard)
Date: Mon, 2 Oct 2006 20:47:04 +0200
Subject: [Ferret-talk] Adding dependant objects to an Index?
In-Reply-To: <20061002154858.GA25863@cordoba.webit.de>
References: <20061002154858.GA25863@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Mon, Oct 02, 2006 at 03:30:59PM +0200, Charlie Hubbard wrote:
>> words if I remove the article I want to remove all the pages that went
>> along with that article. How would I do that?
>
> Do you want to be able to find single html files in search results, or
> is it ok to only find the whole article, without knowing which file the
> hit was in ?
>
> In the first case, you can either create a Page model representing a
> single page and index that, or don't use acts_as_ferret at all and do
> the indexing yourself.

This is actually more the scenario. I want the user to be able to jump
right to the relevant portions of the article and see their search
results. Possibly with highlights etc. Mainly because these articles can
be quite large.

>> Another question I have is I would like to search the elements of the
>> article like author, title, etc, and search the contents of those
>> Articles within one search field. Can I place all of this data inside a
>> single index? Or do I have to use the multi_search method?
>
> you'll only need multi_search if you have several indexes (that is,
> several Model classes where you called acts_as_ferret).
> In your case, if you choose the second way, just index your meta data
> together with the content, aaf will by default search in all fields.

So the bottom line is: create a Page object for each page of the article,
put that stuff in the DB, and use the acts_as_ferret options to find it.
Use multi_search across the two models.

Thanks
Charlie

--
Posted via http://www.ruby-forum.com/.
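
For comparison, the simpler single-document option Jens described earlier
in this thread (an html_content method that returns the concatenated
contents of all files belonging to an article) might look roughly like the
sketch below. It is plain Ruby so it can run on its own: the Article class
here is a stand-in, not an ActiveRecord model, and the page_files
attribute and the field names in the comment are assumptions made for
illustration.

```ruby
# Stand-in for the model. In a Rails app this would be an ActiveRecord
# class with a declaration along the lines of
#   acts_as_ferret :fields => [:title, :author, :html_content]
# (the field names are assumptions, not taken from the thread).
class Article
  attr_reader :title, :author, :page_files

  def initialize(title, author, page_files)
    @title = title
    @author = author
    @page_files = page_files # paths of the HTML files, in page order
  end

  # Join the contents of every file so the whole article is indexed as a
  # single field of a single document.
  def html_content
    page_files.map { |path| File.read(path) }.join("\n")
  end
end
```

With this layout a hit identifies the article only, not the page it came
from, which is the trade-off discussed above.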
From chrisc at amphora-research.com Tue Oct 3 03:22:36 2006
From: chrisc at amphora-research.com (Chris Catton)
Date: Tue, 3 Oct 2006 09:22:36 +0200
Subject: [Ferret-talk] newbie question
Message-ID: <046a473a0dfad2b030ecd1083564a203@ruby-forum.com>

Hi,
I'm new to using ferret (and fairly new to ruby/rails) and I'm having a
problem I can't fathom. Sorry for the long post ...

I have a test which passes

require 'rubygems'
require 'ferret'
include Ferret
require 'test/unit'

class CompanyTest < Test::Unit::TestCase
  def test_index
    puts 'running test'
    @index = Index::Index.new(:path => '../tmp/search-index')
    @index << {:title => "prospecting", :content => "blah blah blah"}
    @index << {:title => "prospecting", :content => "yada yada yada"}
    @index.search_each('content:"blah"') do |id, score|
      # just assert true if we didn't get an error .. ferret
      # seems to be working
      assert true
    end
  end
end

which I think means that ferret is properly installed

When I search in my app I get this error

undefined method `exists?' for {:term_vector=>:no, :store=>:no,
:boost=>1.0, :index=>:yes}:Hash
RAILS_ROOT: /Users/chrisc/Documents/checkouts/PROS/config/..
Application Trace | Framework Trace | Full Trace /opt/local/lib/ruby/site_ruby/1.8/ferret/index/field_infos.rb:20:in `initialize' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:166:in `rebuild_index' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:230:in `create_index_instance' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:223:in `ferret_index' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:389:in `find_id_by_contents' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:253:in `find_by_contents' #{RAILS_ROOT}/app/controllers/companies_controller.rb:517:in `createConditionsFromParameters' #{RAILS_ROOT}/app/controllers/companies_controller.rb:694:in `generate_list_from_filter' #{RAILS_ROOT}/app/controllers/companies_controller.rb:290:in `list_profile_information' /opt/local/bin/mongrel_rails:18 which looks to me as if acts_as_ferret is not finding the ferret index In my model I have require 'rubygems' require 'ferret' include Ferret acts_as_ferret :fields => [ 'name', 'comments'] and in the controller companies=Company.find_by_contents(params[:company_search_term]) I don't see that aaf is building the index, which I think is where the app is blowing up. My reading of the docs is that it should do. Did I do something dumb, or is there a known issue with these versions that I missed? Many thanks for any help ... this is driving me crazy ... Rails 1.1.6 Ruby 1.8.4 acts_as_ferret from svn Ferret 0.10.9 on mac OS X -- Posted via http://www.ruby-forum.com/. From shammond at patientslikeme.com Tue Oct 3 11:36:33 2006 From: shammond at patientslikeme.com (Steven Hammond) Date: Tue, 3 Oct 2006 11:36:33 -0400 Subject: [Ferret-talk] Ferret on Windows 32 In-Reply-To: References: Message-ID: <68DC7502-006D-479E-8ADD-3E4CA1224F9B@patientslikeme.com> We're having some troubles with ferret on Win32, the illegal character issue that has been discussed here before. 
Am I correct that the heart of the issue is that the version of ruby we are using is different from the one used to compile the win32 gem of ferret? If so, what version of ruby is the latest gem compiled for? What was the last win32 gem to be compiled with Ruby 1.8.4-20? Thanks, Steve From dougal.s at gmail.com Tue Oct 3 12:43:25 2006 From: dougal.s at gmail.com (Douglas Shearer) Date: Tue, 3 Oct 2006 18:43:25 +0200 Subject: [Ferret-talk] Ferret install, rake failing on make Message-ID: I'm currently trying to install the latest version of Ferret (0.10.9) on my Ubuntu Dapper (6.06) system. I have tried the gem, but it does not generate the ferret_ext.so file. Ideally I would prefer to install from the gem, but if source works, I'm fine with that too. I am now trying an install from source, but when I run the command '$ rake ext' I get the following error when it reaches the 'make' command: make rake aborted! Command failed with status (127): [make] /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:722:in `sh' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:729:in `sh' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:812:in `sh' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:807:in `sh' /home/dougal/ferret-0.10.9/Rakefile:137 /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:387:in `execute' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:387:in `execute' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:357:in `invoke' /usr/lib/ruby/1.8/thread.rb:135:in `synchronize' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:350:in `invoke' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:364:in `invoke_prerequisites' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:999:in `each' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:363:in `invoke_prerequisites' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:356:in `invoke' /usr/lib/ruby/1.8/thread.rb:135:in `synchronize' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:350:in `invoke' 
/usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:1906:in `run' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb:1906:in `run' /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/bin/rake:7 /usr/bin/rake:18 This is slightly lost on me as I am a Ruby newbie, hopefully someone will be able to tell me what is going on. Thanks. Dougal. (I posted this on comp.lang.ruby, and was referred here by Chris Lowis, thanks Chris). -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Oct 3 17:51:05 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 3 Oct 2006 23:51:05 +0200 Subject: [Ferret-talk] newbie question In-Reply-To: <046a473a0dfad2b030ecd1083564a203@ruby-forum.com> References: <046a473a0dfad2b030ecd1083564a203@ruby-forum.com> Message-ID: <20061003215105.GA29863@cordoba.webit.de> On Tue, Oct 03, 2006 at 09:22:36AM +0200, Chris Catton wrote: > Hi, > I'm new to using ferret (and fairly new to ruby/rails) and I'm having a > problem I can't fathom. Sorry for the long post ... [..] > When I search in my app I get this error > > undefined method `exists?' for {:term_vector=>:no, :store=>:no, > :boost=>1.0, :index=>:yes}:Hash > RAILS_ROOT: /Users/chrisc/Documents/checkouts/PROS/config/.. > > Application Trace | Framework Trace | Full Trace > /opt/local/lib/ruby/site_ruby/1.8/ferret/index/field_infos.rb:20:in > `initialize' > #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:166:in > `rebuild_index' [..] > Many thanks for any help ... this is driving me crazy ... > Rails 1.1.6 > Ruby 1.8.4 > acts_as_ferret from svn > Ferret 0.10.9 > on mac OS X > strange, on my system line 20 in field_infos.rb is inside a comment, and no use of exists? is made inside the file. Are you sure that field_infos.rb belongs to Ferrewt 0.10.9 ? Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Wed Oct 4 04:09:59 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 4 Oct 2006 10:09:59 +0200
Subject: [Ferret-talk] Ferret install, rake failing on make
In-Reply-To:
References:
Message-ID: <20061004080959.GB25863@cordoba.webit.de>

On Tue, Oct 03, 2006 at 06:43:25PM +0200, Douglas Shearer wrote:
> I'm currently trying to install the latest version of Ferret (0.10.9)
> on my Ubuntu Dapper (6.06) system. I have tried the gem, but it does
> not generate the ferret_ext.so file. Ideally I would prefer to install
> from the gem, but if source works, I'm fine with that too.
>
> I am now trying an install from source, but when I run the command '$
> rake ext' I get the following error when it reaches the 'make' command:

short guess: do you have the ruby1.8-dev package installed?

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Wed Oct 4 04:16:10 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 4 Oct 2006 10:16:10 +0200
Subject: [Ferret-talk] Adding dependant objects to an Index?
In-Reply-To:
References: <20061002154858.GA25863@cordoba.webit.de>
Message-ID: <20061004081610.GA31096@cordoba.webit.de>

On Mon, Oct 02, 2006 at 08:47:04PM +0200, Charlie Hubbard wrote:
> Jens Kraemer wrote:
[..]
> > you'll only need multi_search if you have several indexes (that is,
> > several Model classes where you called acts_as_ferret).
> > In your case, if you choose the second way, just index your meta data
> > together with the content, aaf will by default search in all fields.
>
> So bottom line is create a Page object for each page of the article and
> put that stuff in the DB, and use the acts_as_ferret options to find it.
> Use the multi-search across the two models.

Right. To further simplify things, you could index the article's meta
data with each page, via an indexed method you mention in your field
list. That method should retrieve the meta data from the parent article
object so that it gets indexed together with each page.

This might actually be faster than using multi_search (unless your
article meta data is really large, so that the overhead of indexing it
with each page weighs in). In addition, it would save you from having to
handle different kinds of objects (Articles and Pages) in your result
set.

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From dougal.s at gmail.com Wed Oct 4 06:50:05 2006
From: dougal.s at gmail.com (Douglas Shearer)
Date: Wed, 4 Oct 2006 12:50:05 +0200
Subject: [Ferret-talk] Ferret install, rake failing on make
In-Reply-To: <20061004080959.GB25863@cordoba.webit.de>
References: <20061004080959.GB25863@cordoba.webit.de>
Message-ID: <1a24e576566ebdf817a98291d4711d16@ruby-forum.com>

Jens Kraemer wrote:
> short guess: do you have the ruby1.8-dev package installed ?

Yes, ruby1.8-dev package is installed. I'm currently setting up an
identical machine in Parallels to see if I can replicate my problem.

--
Posted via http://www.ruby-forum.com/.
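
Jens's suggestion of indexing the parent article's meta data with each
page could be sketched as follows. This is plain Ruby so that it runs
without Rails: Article is a simple Struct, Page is a stand-in for the
model, and the method names (article_title, article_author,
to_index_fields) are hypothetical names chosen for illustration. In an
acts_as_ferret model the assumption is that the same methods would simply
be listed in the field list, as noted in the comment.

```ruby
# Stand-ins for the two models discussed in the thread.
Article = Struct.new(:title, :author)

class Page
  attr_reader :article, :body

  def initialize(article, body)
    @article = article
    @body = body
  end

  # Methods that pull the parent's meta data into each page. With
  # acts_as_ferret these would be named in the field list, e.g.
  #   acts_as_ferret :fields => [:body, :article_title, :article_author]
  # (an assumed declaration, not one quoted from the thread).
  def article_title
    article.title
  end

  def article_author
    article.author
  end

  # Hypothetical helper showing what one page document would carry.
  def to_index_fields
    { :body => body,
      :article_title => article_title,
      :article_author => article_author }
  end
end

article = Article.new("Searching with Ferret", "Charlie")
page = Page.new(article, "<p>First page</p>")
puts page.to_index_fields.inspect
```

Every hit is then a Page that can be matched on its own text or on its
article's title and author, so a single index is enough.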
From chrisc at amphora-research.com Wed Oct 4 07:45:27 2006
From: chrisc at amphora-research.com (Chris Catton)
Date: Wed, 4 Oct 2006 13:45:27 +0200
Subject: [Ferret-talk] newbie question
In-Reply-To: <20061003215105.GA29863@cordoba.webit.de>
References: <046a473a0dfad2b030ecd1083564a203@ruby-forum.com>
 <20061003215105.GA29863@cordoba.webit.de>
Message-ID: <3be87d03ad482324e6ba9ef435bfe36c@ruby-forum.com>

Jens

Thanks very much - this helped me find the problem, which was an old
version of ferret in the path and a failure on my part to check that the
same version of ruby was being called in the console and by mongrel.

chris

Jens Kraemer wrote:
> On Tue, Oct 03, 2006 at 09:22:36AM +0200, Chris Catton wrote:
>> Hi,
>> I'm new to using ferret (and fairly new to ruby/rails) and I'm having a
>> problem I can't fathom. Sorry for the long post ...
> [..]
>> `rebuild_index'
> [..]
>> Many thanks for any help ... this is driving me crazy ...
>> Rails 1.1.6
>> Ruby 1.8.4
>> acts_as_ferret from svn
>> Ferret 0.10.9
>> on mac OS X
>>
>
> strange, on my system line 20 in field_infos.rb is inside a comment, and
> no use of exists? is made inside the file. Are you sure that
> field_infos.rb belongs to Ferret 0.10.9 ?
>
> Jens
>
> --
> webit! Gesellschaft für neue Medien mbH www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
> Schnorrstraße 76 Tel +49 351 46766 0
> D-01069 Dresden Fax +49 351 46766 66

--
Posted via http://www.ruby-forum.com/.

From dougal.s at gmail.com Wed Oct 4 07:54:38 2006
From: dougal.s at gmail.com (Douglas Shearer)
Date: Wed, 4 Oct 2006 13:54:38 +0200
Subject: [Ferret-talk] Ferret install, rake failing on make
In-Reply-To: <1a24e576566ebdf817a98291d4711d16@ruby-forum.com>
References: <20061004080959.GB25863@cordoba.webit.de>
 <1a24e576566ebdf817a98291d4711d16@ruby-forum.com>
Message-ID: <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com>

Douglas Shearer wrote:
> Yes, ruby1.8-dev package is installed.
I'm currently setting up an > identical machine in Parallels to see if I can replicate my problem. Ok, I got the same result with my virtual machine in parallels. Anyone have a solution, or even a copy of ferret_ext from an i386 Ubuntu Dapper install? Thanks. -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 4 12:19:54 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 5 Oct 2006 01:19:54 +0900 Subject: [Ferret-talk] Ferret just got faster. Message-ID: Hey guys, Sorry I haven't been around for the last few days. I've just finished a coding marathon fixing up some of the performance problems in Ferret. If you don't know what I'm talking about there has been a problem with Filters and Sorts on large indexes. Well, I think I've fixed the problem. Before: dbalmain at ubuntu:~/workspace/exp_old/c $ slow_bench sort_test Took: 10410000 clocks in 10.410 seconds rangeq_test Took: 8110000 clocks in 8.110 seconds After: dbalmain at ubuntu:~/workspace/exp/c $ slow_bench sort_test Took: 120000 clocks in 0.120 seconds rangeq_test Took: 80000 clocks in 0.080 seconds So as you can see, there is a significant difference (two orders of magnitude). I'll get to answering my email now and I'll have a new gem out soon after that. Cheers, Dave From peter at ioffer.com Wed Oct 4 12:43:29 2006 From: peter at ioffer.com (peter) Date: Wed, 04 Oct 2006 09:43:29 -0700 Subject: [Ferret-talk] Error while optimizing my index In-Reply-To: <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com> Message-ID: Hello. We are indexing a large amount of data (about 850,000 records). During the indexing, every 100,000 items I flush and optimize the index, so the disk space doesn't get too large during indexing. 

About 700,000 items in, I got this error:

Error occured in fs_store.c:226 - fso_flush_i
flushing src of length 1024

/usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:517:in `close'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:517:in `flush'
/usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:514:in `flush'

Is there a way to find out what happened? It wasn't a space error, as
there's plenty of space on the machine.

Also, if I optimize during an index run such as this, do I need to close
and then re-open the index, or will that not affect things?

Thanks for any info you have.

From dbalmain.ml at gmail.com Wed Oct 4 14:45:46 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 03:45:46 +0900
Subject: [Ferret-talk] Another web app using Ferret
In-Reply-To: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com>
References: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com>
Message-ID:

On 10/2/06, Winton wrote:
> I am a part of a team that runs a student site called Studicious
> (http://stu.dicio.us). We have been using Ferret from the beginning, and
> recently added acts_as_ferret and sorting to the system.
>
> As you can see if you try the search, sorting is not working as
> expected. I am using this code (w/ find_by_content):
>
> :sort => Ferret::Search::SortField.new(:school_sort, :reverse => false)
>
> :school_sort is an untokenized field with otherwise default settings.
> Any ideas?
>
> - Winton
>

Hi Winton,

Sorry for the slow reply. Is it possible that one of your schools
begins with a number? Numbers appear in the database before letters, so
when Ferret detects the field's type it may think the field is an
integer field. For example, imagine you are indexing TV shows. "24"
would probably be the first entry in the index, so when you try to sort
by that field it will think it is an integer. So you need to specify
that the field is in fact a string field.

    :sort => Ferret::Search::SortField.new(:school_sort, :type => :string,
                                           :reverse => false)

I hope that helps, otherwise it really looks to me like the field is
tokenized. I'd check that again just in case.

Cheers,
Dave

PS: love the site design

From dbalmain.ml at gmail.com Wed Oct 4 14:52:42 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 03:52:42 +0900
Subject: [Ferret-talk] Ferret on Windows 32
In-Reply-To: <68DC7502-006D-479E-8ADD-3E4CA1224F9B@patientslikeme.com>
References: <68DC7502-006D-479E-8ADD-3E4CA1224F9B@patientslikeme.com>
Message-ID:

On 10/4/06, Steven Hammond wrote:
>
>
> We're having some troubles with ferret on Win32, the illegal
> character issue that has been discussed here before. Am I correct
> that the heart of the issue is that the version of ruby we are using
> is different from the one used to compile the win32 gem of ferret?
>
> If so, what version of ruby is the latest gem compiled for? What was
> the last win32 gem to be compiled with Ruby 1.8.4-20?
>
> Thanks,
> Steve
>

0.10.9. I will continue compiling against that version until another
version of Ruby marked "stable" comes out. I have no idea why this
problem is occurring. Unfortunately I can't reproduce the problem here,
which makes it even more difficult for me to solve.

Actually, if anyone has built the gem themselves, please send it to
me. I'd like to try a build from a different system.

Cheers,
Dave

From dbalmain.ml at gmail.com Wed Oct 4 15:09:24 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 04:09:24 +0900
Subject: [Ferret-talk] Ferret install, rake failing on make
In-Reply-To: <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com>
References: <20061004080959.GB25863@cordoba.webit.de> <1a24e576566ebdf817a98291d4711d16@ruby-forum.com> <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com>
Message-ID:

Hi Dougal,

Have you tried cd'ing into the ext directory and calling `ruby
extconf.rb` and `make` directly?
That may solve your problems or at least give you a better idea of what
is causing the error.

If this doesn't work I can send you a ferret_ext.so, but there will be
a new version coming out very soon so I'll wait until then.

Cheers,
Dave

From bk at benjaminkrause.com Wed Oct 4 15:25:44 2006
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Wed, 04 Oct 2006 21:25:44 +0200
Subject: [Ferret-talk] Ferret just got faster.
In-Reply-To:
References:
Message-ID: <45240AB8.3030400@benjaminkrause.com>

Hey David,

> So as you can see, there is a significant difference (two orders of
> magnitude). I'll get to answering my email now and I'll have a new gem
> out soon after that.
>

Wow.. it looks great on paper! I can't wait to see the difference
on my domain :-) Thanks again for your efforts.

Ben

p.s. @anyone_else_out_there, did you know that you can donate to Ferret
by visiting this link: http://ferret.davebalmain.com/trac ;-)

From wintonius at gmail.com Wed Oct 4 17:33:29 2006
From: wintonius at gmail.com (Winton)
Date: Wed, 4 Oct 2006 23:33:29 +0200
Subject: [Ferret-talk] Another web app using Ferret
In-Reply-To:
References: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com>
Message-ID: <805e374fea85d79e68190a7761e9b770@ruby-forum.com>

Would there be any way for it to still be tokenized when defining it as

    'school_sort' => { :index => :untokenized }

in the model (a_a_f)? I just saw the other thread on sorting strangeness
and it seems like I am having similar problems. I will try that solution
and get back.

Thanks again for a great library.

- Winton

--
Posted via http://www.ruby-forum.com/.

From wintonius at gmail.com Wed Oct 4 17:52:57 2006
From: wintonius at gmail.com (Winton)
Date: Wed, 4 Oct 2006 23:52:57 +0200
Subject: [Ferret-talk] Another web app using Ferret
In-Reply-To: <805e374fea85d79e68190a7761e9b770@ruby-forum.com>
References: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com> <805e374fea85d79e68190a7761e9b770@ruby-forum.com>
Message-ID: <56cd71c5a58342421e03592bb2ac9df2@ruby-forum.com>

(PS - Sorry, I didn't address your solution. Even with :type => :string
I am getting odd sorting.)

--
Posted via http://www.ruby-forum.com/.

From wintonius at gmail.com Wed Oct 4 17:57:15 2006
From: wintonius at gmail.com (Winton)
Date: Wed, 4 Oct 2006 23:57:15 +0200
Subject: [Ferret-talk] Ferret just got faster.
In-Reply-To: <45240AB8.3030400@benjaminkrause.com>
References: <45240AB8.3030400@benjaminkrause.com>
Message-ID: <16cf2c912ac1441b735044a30196d463@ruby-forum.com>

Very cool!
I am donating $10 (hey, I'm a college student) and I suggest
everyone else do the same.

Thanks again, Dave.

- Winton

--
Posted via http://www.ruby-forum.com/.

From joshuabates at gmail.com Wed Oct 4 20:10:28 2006
From: joshuabates at gmail.com (Joshua Bates)
Date: Wed, 4 Oct 2006 18:10:28 -0600
Subject: [Ferret-talk] acts_as_ferret limit on multi_search not working?
In-Reply-To: <97efd6d34b6fe3a8f444c287d835331f@ruby-forum.com>
References: <20060920184901.GB11593@cordoba.webit.de> <97efd6d34b6fe3a8f444c287d835331f@ruby-forum.com>
Message-ID:

I'm getting the exact same behavior
ferret 0.10.9
aaf 0.3.0

joshua

On 9/21/06, David Wennergren wrote:
>
> Jens Kraemer wrote:
> > On Wed, Sep 20, 2006 at 04:22:18PM +0200, David Wennergren wrote:
> >> I'm using acts_as_ferret to do a query like this:
> >>
> >> Model1.multi_search("my query",[Model2,Model3], :limit => 2)
> >>
> >> No matter what number I set limit to I get 10 items in the resultset.
> >> Am I doing something wrong?
> >
> > nothing, this is supposed to work. what version of Ferret/aaf do you
> > use ?
> >
> > Jens
> >
> > --
> > webit! Gesellschaft für neue Medien mbH www.webit.de
> > Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
> > Schnorrstraße 76 Tel +49 351 46766 0
> > D-01069 Dresden Fax +49 351 46766 66
>
> I'm using ferret 0.10.4 and aaf 0.3.0. I'll try to make it into a
> testcase so it's easier to reproduce.
>
> This is my actual query:
>
> >> Pressrelease.multi_search("con*",[Event,Image],:limit => 2).size
> => 10
>
> /david
>
> --
> Posted via http://www.ruby-forum.com/.

From joshuabates at gmail.com Wed Oct 4 20:16:36 2006
From: joshuabates at gmail.com (Joshua Bates)
Date: Wed, 4 Oct 2006 18:16:36 -0600
Subject: [Ferret-talk] acts_as_ferret limit on multi_search not working?
In-Reply-To:
References: <20060920184901.GB11593@cordoba.webit.de> <97efd6d34b6fe3a8f444c287d835331f@ruby-forum.com>
Message-ID:

and I just found the culprit...
Line 27 of multi_index.rb

    searcher.search_each(query, options={}, &block)

should be

    searcher.search_each(query, options, &block)

joshua

From dbalmain.ml at gmail.com Wed Oct 4 20:57:22 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 09:57:22 +0900
Subject: [Ferret-talk] Error while optimizing my index
In-Reply-To:
References: <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com>
Message-ID:

On 10/5/06, peter wrote:
> Hello.
>
> We are indexing a large amount of data (about 850,000 records). During the
> indexing, every 100,000 items I flush and optimize the index, so the disk
> space doesn't get too large during indexing.
>
> About 700,000 items in, I got this error:
> Error occured in fs_store.c:226 - fso_flush_i
> flushing src of length 1024
>
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:517:in `close'
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:517:in `flush'
> /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:514:in `flush'
>
> Is there a way to find out what happened? It wasn't a space error, as
> there's plenty of space on the machine.
>
> Also, if I optimize during an index run such as this, do I need to close and
> then re-open the index, or will that not affect things?
>
> Thanks for any info you have.

Hi Peter,

How much space is the indexing taking up by this stage? You may be
hitting the 2 GB limit, in which case you will need to compile Ferret
with large-file support. Otherwise, I'm not sure why you'd be getting
this error. I'm going to have to improve that error message.

Cheers,
Dave

From dbalmain.ml at gmail.com Wed Oct 4 21:03:05 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 10:03:05 +0900
Subject: [Ferret-talk] Another web app using Ferret
In-Reply-To: <805e374fea85d79e68190a7761e9b770@ruby-forum.com>
References: <997855c7aa1c09e0c12f32b52700cec2@ruby-forum.com> <805e374fea85d79e68190a7761e9b770@ruby-forum.com>
Message-ID:

On 10/5/06, Winton wrote:
> Would there be any way for it to still be tokenized when defining it as
>
> 'school_sort' => { :index => :untokenized }
>
> in the model (a_a_f)? I just saw the other thread on sorting strangeness
> and it seems like I am having similar problems. I will try that solution
> and get back.
>
> Thanks again for a great library.
>
> - Winton

Hi Winton,

If the index was already created before you defined the field as
untokenized, then it would remain tokenized. You can check like this:

    puts index.field_infos

That will print a YAML output of your field infos.

Cheers,
Dave

From dbalmain.ml at gmail.com Wed Oct 4 22:00:32 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 11:00:32 +0900
Subject: [Ferret-talk] Ferret just got faster.
In-Reply-To: <16cf2c912ac1441b735044a30196d463@ruby-forum.com>
References: <45240AB8.3030400@benjaminkrause.com> <16cf2c912ac1441b735044a30196d463@ruby-forum.com>
Message-ID:

On 10/5/06, Winton wrote:
> Very cool! I am donating $10 (hey, I'm a college student) and I suggest
> everyone else do the same.
>
> Thanks again, Dave.
>
> - Winton

Thanks for the support Winton. I do appreciate it.

From jduflost at ben.vub.ac.be Thu Oct 5 01:43:31 2006
From: jduflost at ben.vub.ac.be (johan duflost)
Date: Thu, 5 Oct 2006 07:43:31 +0200
Subject: [Ferret-talk] sort
Message-ID: <000b01c6e841$30ec7940$0700000a@ORION>

Dear all,

It seems there's a sort bug with ferret 0.10.9 on Debian. I sort the
search results by a field which can contain null values. The string sort
type doesn't work. If I test the values and replace null by empty
strings when indexing, it works.

Johan

Analyst Programmer
Belgian Biodiversity Platform (http://www.biodiversity.be)
Belgian Federal Science Policy Office (http://www.belspo.be)
Tel: +32 2 650 5751
Fax: +32 2 650 5124

From jduflost at ben.vub.ac.be Thu Oct 5 01:58:40 2006
From: jduflost at ben.vub.ac.be (johan duflost)
Date: Thu, 5 Oct 2006 07:58:40 +0200
Subject: [Ferret-talk] search results autocompletion
Message-ID: <002d01c6e843$4e868750$0700000a@ORION>

Dear list,

I'm using a text input field with autocompletion. The suggestions come
from a ferret index which is created by getting all the terms belonging
to other indices. Here is the code:

class Suggestion

  attr_accessor :term

  def self.index(create)
    [Person, Project, Orgunit].each{|kl|
      terms = self.all_terms(kl)
      terms.each{|term|
        suggestion = Suggestion.new
        suggestion.term = term
        SUGGESTION_INDEX << suggestion.to_doc
      }
    }
    SUGGESTION_INDEX.optimize
  end

  def self.all_terms(klass)
    reader = Index::IndexReader.new(Object.const_get(klass.name.upcase +
                                                     "_INDEX_DIR"))
    terms = []
    begin
      reader.field_names.each {|field_name|
        term_enum = reader.terms(field_name)
        begin
          term = term_enum.term()
          if !term.nil?
            if klass::SUGGESTIONABLE_FIELDS.include?(field_name)
              terms << term
            end
          end
        end while term_enum.next?
      }
    ensure
      reader.close
    end
    return terms
  end

  def to_doc
    doc = {}
    doc[:term] = self.term
    return doc
  end

end

It works very well except that the indexing process takes a long time.
Does anybody know if there's a better way to do this?
Is there another way to get all the terms of an index?

Thank you.

Johan

Analyst Programmer
Belgian Biodiversity Platform (http://www.biodiversity.be)
Belgian Federal Science Policy Office (http://www.belspo.be)
Tel: +32 2 650 5751
Fax: +32 2 650 5124

From peter at ioffer.com Thu Oct 5 02:34:41 2006
From: peter at ioffer.com (peter)
Date: Wed, 04 Oct 2006 23:34:41 -0700
Subject: [Ferret-talk] Error while optimizing my index
In-Reply-To:
Message-ID:

Hey Dave, thanks for the reply.

At this stage, the total index size is about 20 GB, so it's a lot. Each
file in the index (not optimized) is about 2 GB, so it does seem we have
hit the limit you are talking about. The box has over 140 GB of storage,
so it's not out of space, but it's consistently stopping at the same
point each time.

We've used gem install ferret to grab the latest (0.10.9) release.
What's the way to compile it with this "large-file" support flag?

Thanks.

From kraemer at webit.de Thu Oct 5 03:42:18 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 5 Oct 2006 09:42:18 +0200
Subject: [Ferret-talk] acts_as_ferret limit on multi_search not working?
In-Reply-To:
References: <20060920184901.GB11593@cordoba.webit.de> <97efd6d34b6fe3a8f444c287d835331f@ruby-forum.com>
Message-ID: <20061005074218.GA5467@cordoba.webit.de>

On Wed, Oct 04, 2006 at 06:16:36PM -0600, Joshua Bates wrote:
> and I just found the culprit...
> Line 27 of multi_index.rb
>
> searcher.search_each(query, options={}, &block)
>
> should be
>
> searcher.search_each(query, options, &block)

yeah, this has been fixed in trunk a while ago, I'll try to push out a
new release soon.

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From dougal.s at gmail.com Thu Oct 5 03:55:47 2006
From: dougal.s at gmail.com (Douglas Shearer)
Date: Thu, 5 Oct 2006 09:55:47 +0200
Subject: [Ferret-talk] Ferret install, rake failing on make
In-Reply-To:
References: <20061004080959.GB25863@cordoba.webit.de> <1a24e576566ebdf817a98291d4711d16@ruby-forum.com> <05ec3eec739d2ad23fff17ff4ff1f1bc@ruby-forum.com>
Message-ID:

David Balmain wrote:
> Have you tried cd'ing into the ext directory and calling `ruby
> extconf.rb` and `make` directly? That may solve your problems or at
> least give you a better idea of what is causing the error.
>
> If this doesn't work I can send you a ferret_ext.so but there will be
> a new version coming out very soon so I'll wait until then.

I'll try doing that later on today to see if I can trace the error.

Thanks very much to Jens for sending me a working ferret_ext.so, that
has things running for the moment.

The only thing I can think of that was strange about my installs was
that it was an Ubuntu Breezy install from CD, upgraded to Dapper by way
of modification of the sources.list file. I did this since Parallels
won't support Breezy booting for some reason, and I had run out of blank
CDs.

I have a new server coming next week, so I'll load that with Dapper and
try again.

Thanks again guys.

Dougal.

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Thu Oct 5 04:30:59 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 5 Oct 2006 10:30:59 +0200
Subject: [Ferret-talk] search results autocompletion
In-Reply-To: <002d01c6e843$4e868750$0700000a@ORION>
References: <002d01c6e843$4e868750$0700000a@ORION>
Message-ID: <20061005083059.GB5467@cordoba.webit.de>

On Thu, Oct 05, 2006 at 07:58:40AM +0200, johan duflost wrote:
>
> Dear list,
>
> I'm using a text input field with autocompletion. The suggestions come
> from a ferret index which is created by getting all the terms belonging to
> other indices. Here is the code:
>
> class Suggestion
>
>   attr_accessor :term
>
>   def self.index(create)
>     [Person, Project, Orgunit].each{|kl|
>       terms = self.all_terms(kl)
>       terms.each{|term|
>         suggestion = Suggestion.new
>         suggestion.term = term
>         SUGGESTION_INDEX << suggestion.to_doc
>       }
>     }
>     SUGGESTION_INDEX.optimize
>   end
>
>   def self.all_terms(klass)
>     reader = Index::IndexReader.new(Object.const_get(klass.name.upcase +
>                                                      "_INDEX_DIR"))
>     terms = []
>     begin
>       reader.field_names.each {|field_name|
>         term_enum = reader.terms(field_name)
>         begin
>           term = term_enum.term()
>           if !term.nil?
>             if klass::SUGGESTIONABLE_FIELDS.include?(field_name)
>               terms << term
>             end
>           end
>         end while term_enum.next?
>       }
>     ensure
>       reader.close
>     end
>     return terms
>   end
>
>   def to_doc
>     doc = {}
>     doc[:term] = self.term
>     return doc
>   end
>
> end
>
>
> It works very well except that the indexing process takes a long time. Does
> anybody know if there's a better way to do this?
> Is there another way to get all the terms of an index?

Nothing ferret-related, but from the first look at it your code seems a
bit inefficient: you check the SUGGESTIONABLE_FIELDS array for each
term, instead of checking once and then going ahead. You even could just
iterate over the SUGGESTIONABLE_FIELDS array and use the field names
from there:

def self.all_terms(klass)
  reader = Index::IndexReader.new(Object.const_get(klass.name.upcase +
                                                   "_INDEX_DIR"))
  terms = []
  begin
    klass::SUGGESTIONABLE_FIELDS.map { |field|
      reader.terms(field)
    }.each do |term_enum|
      # term_enum.term should not be nil, so no need to check this.
      terms << term_enum.term while term_enum.next?
    end
  ensure
    reader.close
  end
  return terms
end

if your SUGGESTIONABLE_FIELDS contains fields not in the index (yet), the
reader.terms call might fail, in that case
  reader.terms(field) rescue nil
and compacting the result of map before calling each should work.

You further could save one iteration across all terms by yielding the
addition of the term to the index like this:

all_terms(klass) do |term|
  INDEX << { :term => term }
end

all_terms should do
  yield term_enum.term while term_enum.next?
in the inner loop then. For extra style points rename all_terms to
each_term :-)

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From dbalmain.ml at gmail.com Thu Oct 5 08:11:18 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 5 Oct 2006 21:11:18 +0900
Subject: [Ferret-talk] sort
In-Reply-To: <000b01c6e841$30ec7940$0700000a@ORION>
References: <000b01c6e841$30ec7940$0700000a@ORION>
Message-ID:

On 10/5/06, johan duflost wrote:
>
> Dear all,
>
> It seems there's a sort bug with ferret 0.10.9 on Debian. I sort the search
> results by a field which can contain null values. The string sort type
> doesn't work. If I test the values and replace null by empty strings when
> indexing, it works.
>
> Johan
>

Hi Johan,

I'm seeing the same thing here. I'll fix that right away. There will
be a fix in the next release.

Cheers,
Dave

From charlie.hubbard at gmail.com Thu Oct 5 09:11:07 2006
From: charlie.hubbard at gmail.com (Charlie Hubbard)
Date: Thu, 5 Oct 2006 15:11:07 +0200
Subject: [Ferret-talk] Submitting patches for acts_as_ferret
Message-ID:

Hi,

I have a small path I'd like to submit to acts_as_ferret how do I do
that?

Charlie

--
Posted via http://www.ruby-forum.com/.
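
[Editor's sketch for the "search results autocompletion" thread.]
Jens's last suggestion, turning all_terms into a block-yielding
each_term, could look roughly like the following. This is a minimal
sketch, not the code anyone on the list actually ran: FakeReader and
FakeTermEnum are hypothetical stand-ins that only mimic the calls used
in the thread (terms, next?, term, close), so the example runs without
the ferret gem, and the each_term signature (a reader plus a list of
field names) is an assumption made for illustration.

```ruby
# Hypothetical stand-in for a term enumerator: next? advances to the
# next term and returns false once the list is exhausted.
class FakeTermEnum
  def initialize(terms)
    @terms = terms
    @pos = -1
  end

  def next?
    @pos += 1
    @pos < @terms.size
  end

  def term
    @terms[@pos]
  end
end

# Hypothetical stand-in for an index reader, backed by a plain Hash.
class FakeReader
  attr_reader :closed

  def initialize(terms_by_field)
    @terms_by_field = terms_by_field
    @closed = false
  end

  # Raises KeyError for a field that is not in the "index".
  def terms(field)
    FakeTermEnum.new(@terms_by_field.fetch(field))
  end

  def close
    @closed = true
  end
end

# Yield each term of the given fields to the caller's block instead of
# collecting them in an intermediate array. Fields the reader does not
# know are skipped (the "rescue nil" plus compact idea from the thread).
def each_term(reader, fields)
  enums = fields.map { |field| reader.terms(field) rescue nil }.compact
  enums.each do |term_enum|
    yield term_enum.term while term_enum.next?
  end
ensure
  reader.close
end

reader = FakeReader.new(:name => %w[ferret lucene], :title => %w[search])
suggestions = []
each_term(reader, [:name, :title, :missing]) do |term|
  suggestions << { :term => term }
end
```

With a real index the block body would add the document to the
suggestion index (INDEX << { :term => term } in Jens's example) rather
than append to an array; the point of the block form is that each term
is handled as it is read, so the terms are walked only once.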
From jduflost at ben.vub.ac.be Thu Oct 5 09:56:43 2006 From: jduflost at ben.vub.ac.be (johan duflost) Date: Thu, 5 Oct 2006 15:56:43 +0200 Subject: [Ferret-talk] search results autocompletion - Checked by AntiVir DE References: <002d01c6e843$4e868750$0700000a@ORION> <20061005083059.GB5467@cordoba.webit.de> Message-ID: <004a01c6e886$16dc35f0$0700000a@ORION> Jens, You are right, my code was not efficient I agree with you. The indices from which I create the suggestions index are not very big: 80kb, 300kb and 2 Mb. After 20 minutes, I get a suggestions index of 1400 kb approximately. Thank you for your help, Johan ----- Original Message ----- From: "Jens Kraemer" To: Sent: Thursday, October 05, 2006 10:30 AM Subject: Re: [Ferret-talk] search results autocompletion - Checked by AntiVir DE > On Thu, Oct 05, 2006 at 07:58:40AM +0200, johan duflost wrote: >> >> Dear list, >> >> I 'm using a text input field with autocompletion . The suggestions come >> from a ferret index which is created by getting all the terms belonging >> to >> other indices. Here is the code: >> >> class Suggestion >> >> attr_accessor :term >> >> def self.index(create) >> [Person, Project, Orgunit].each{|kl| >> terms = self.all_terms(kl) >> terms.each{|term| >> suggestion = Suggestion.new >> suggestion.term = term >> SUGGESTION_INDEX << suggestion.to_doc >> } >> } >> SUGGESTION_INDEX.optimize >> end >> >> def self.all_terms(klass) >> reader = Index::IndexReader.new(Object.const_get(klass.name.upcase + >> "_INDEX_DIR")) >> terms = [] >> begin >> reader.field_names.each {|field_name| >> term_enum = reader.terms(field_name) >> begin >> term = term_enum.term() >> if !term.nil? >> if klass::SUGGESTIONABLE_FIELDS.include?(field_name) >> terms << term >> end >> end >> end while term_enum.next? 
>> } >> ensure >> reader.close >> end >> return terms >> end >> >> def to_doc >> doc = {} >> doc[:term] = self.term >> return doc >> end >> >> end >> >> >> It works very well except that the indexing process takes a long time. >> Does >> anybody knows if there's a better way to do this? >> Is there another way to get all the terms of an index? > > Nothing ferret-related, but from the first look at it your code seems a > bit inefficient: you check the SUGGESTIONABLE_FIELDS array for each > term, instead of checking once and then going ahead. You even could just > iterate over the SUGGESTIONABLE_FIELDS array and use the field names > from there: > > def self.all_terms(klass) > reader = Index::IndexReader.new(Object.const_get(klass.name.upcase + > "_INDEX_DIR")) > terms = [] > begin > klass::SUGGESTIONABLE_FIELDS.map { |field| > reader.terms(field) > }.each do |term_enum| > # term_enum.term should not be nil, so no need to check this. > terms << term_enum.term while term_enum.next? > end > ensure > reader.close > end > return terms > end > > if your SUGGESTIONABLE_FIELDS contains fields not in the index (yet), the > reader.terms call might fail, in that case > reader.terms(field) rescue nil > and compacting the result of map before calling each should work. > > You further could save one iteration across all terms by yielding the > addition of the term to the index like this: > > all_terms(klass) do |term| > INDEX << { :term => term } > end > > all_terms should do > yield term_enum.term while term_enum.next? > in the inner loop then. For extra style points rename all_terms to > each_term :-) > > > > cheers, > Jens > > -- > webit! 
Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From chrisc at amphora-research.com Thu Oct 5 11:04:54 2006 From: chrisc at amphora-research.com (Chris Catton) Date: Thu, 5 Oct 2006 17:04:54 +0200 Subject: [Ferret-talk] indexing and migrations Message-ID: <3fcdee1ca37794ad8d26de331494cd0b@ruby-forum.com> Hi there Is it necessary to make sure that indexes are rebuilt when the db is migrated - columns added or removed? If so, can anyone tell me the best way of automating this? Chris -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Oct 5 12:02:45 2006 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 5 Oct 2006 18:02:45 +0200 Subject: [Ferret-talk] Submitting patches for acts_as_ferret In-Reply-To: References: Message-ID: <20061005160244.GF5467@cordoba.webit.de> Hi Charlie, please open a ticket at http://projects.jkraemer.net/acts_as_ferret/ and attach the patch to it. thanks, Jens On Thu, Oct 05, 2006 at 03:11:07PM +0200, Charlie Hubbard wrote: > Hi, > > I have a small path I'd like to submit to acts_as_ferret how do I do > that? > > Charlie > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From charlie.hubbard at gmail.com Thu Oct 5 12:52:22 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Thu, 5 Oct 2006 18:52:22 +0200 Subject: [Ferret-talk] Submitting patches for acts_as_ferret In-Reply-To: References: Message-ID: <8773bbf2063fed2a707308cb94f7d624@ruby-forum.com> Charlie Hubbard wrote: > Hi, > > I have a small path I'd like to submit to acts_as_ferret how do I do > that? > > Charlie Sorry that's patch. -- Posted via http://www.ruby-forum.com/. From bk at benjaminkrause.com Thu Oct 5 13:33:22 2006 From: bk at benjaminkrause.com (Benjamin Krause) Date: Thu, 05 Oct 2006 19:33:22 +0200 Subject: [Ferret-talk] indexing and migrations In-Reply-To: <3fcdee1ca37794ad8d26de331494cd0b@ruby-forum.com> References: <3fcdee1ca37794ad8d26de331494cd0b@ruby-forum.com> Message-ID: <452541E2.9010406@benjaminkrause.com> Chris Catton wrote: > Is it necessary to make sure that indexes are rebuilt when the db is > migrated - columns added or removed? If so, can anyone tell me the best > way of automating this? > Hey Chris, http://ferret.davebalmain.com/trac/wiki/FAQ%3AIndexing#WhathappensifIaddnewFieldstotheIndexlateron so you shouldn't worry about adding fields later on .. no need to rebuild. Ben From ryansking at gmail.com Thu Oct 5 17:00:15 2006 From: ryansking at gmail.com (Ryan King) Date: Thu, 5 Oct 2006 14:00:15 -0700 Subject: [Ferret-talk] indexing and migrations In-Reply-To: <3fcdee1ca37794ad8d26de331494cd0b@ruby-forum.com> References: <3fcdee1ca37794ad8d26de331494cd0b@ruby-forum.com> Message-ID: <846f30c70610051400i67978f54p7313729210689578@mail.gmail.com> On 10/5/06, Chris Catton wrote: > Hi there > Is it necessary to make sure that indexes are rebuilt when the db is > migrated - columns added or removed? 
Well, it will only really matter when you put data into those columns. > If so, can anyone tell me the best way of automating this? There's a rake task in acts_as_ferret for reindexing entire tables. -ryan From kraemer at webit.de Fri Oct 6 04:23:38 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 6 Oct 2006 10:23:38 +0200 Subject: [Ferret-talk] search results autocompletion In-Reply-To: <004a01c6e886$16dc35f0$0700000a@ORION> References: <002d01c6e843$4e868750$0700000a@ORION> <20061005083059.GB5467@cordoba.webit.de> <004a01c6e886$16dc35f0$0700000a@ORION> Message-ID: <20061006082338.GH5467@cordoba.webit.de> On Thu, Oct 05, 2006 at 03:56:43PM +0200, johan duflost wrote: [..] > > The indices from which I create the suggestions index are not very big: > 80kb, 300kb and 2 Mb. > > After 20 minutes, I get a suggestions index of 1400 kb approximately. still looks somewhat slow to me... Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From f at andreas-s.net Fri Oct 6 14:54:47 2006 From: f at andreas-s.net (Andreas Schwarz) Date: Fri, 6 Oct 2006 20:54:47 +0200 Subject: [Ferret-talk] Ferret just got faster. In-Reply-To: References: Message-ID: <6081e4a35027a6de14dac5aabbef559a@ruby-forum.com> David Balmain wrote: > So as you can see, there is a significant difference (two orders of > magnitude). I'll get to answering my email now and I'll have a new gem > out soon after that. That's great news. Looking forward to the release. -- Posted via http://www.ruby-forum.com/. From john at fivesquaresoftware.com Fri Oct 6 16:32:56 2006 From: john at fivesquaresoftware.com (John Clayton) Date: Fri, 6 Oct 2006 13:32:56 -0700 Subject: [Ferret-talk] Luke does not work with Ferret indexes? Message-ID: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> Hey, Luke doesn't seem to be able to open a Ferret index I've created. 
Is this expected? If yes, can someone recommend another index inspection tool? Thanks, John From rubystuff at tut0r.com Fri Oct 6 19:52:05 2006 From: rubystuff at tut0r.com (Jerome) Date: Sat, 7 Oct 2006 01:52:05 +0200 Subject: [Ferret-talk] Updating to the bleeding edge version of Ferret In-Reply-To: References: Message-ID: David Balmain wrote: > > build the gem. REL should be the current release and then append 0.1. > If you do this a second time between release append 0.2 and so on. The > current version is 0.10.5 so we'll build 0.10.5.1: > > $ rake package REL=0.10.5.1 How do we know what the version is when we download using: svn co svn://www.davebalmain.com/exp ferret Do we look at ruby/lib/ferret_version.rb each time? Thanks, Jerome -- Posted via http://www.ruby-forum.com/. From rubystuff at tut0r.com Fri Oct 6 19:56:15 2006 From: rubystuff at tut0r.com (Jerome) Date: Sat, 7 Oct 2006 01:56:15 +0200 Subject: [Ferret-talk] Updating to the bleeding edge version of Ferret In-Reply-To: References: Message-ID: <684c6ba5310448761d5a7ebf18e9345e@ruby-forum.com> How do we uninstall a previous version of Ferret? I don't see an option in the setup.rb file. Thanks for any info on this, Jerome -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Fri Oct 6 23:07:21 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 7 Oct 2006 12:07:21 +0900 Subject: [Ferret-talk] Updating to the bleeding edge version of Ferret In-Reply-To: References: Message-ID: On 10/7/06, Jerome wrote: > David Balmain wrote: > > > > > build the gem. REL should be the current release and then append 0.1. > > If you do this a second time between release append 0.2 and so on. The > > current version is 0.10.5 so we'll build 0.10.5.1: > > > > $ rake package REL=0.10.5.1 > > How do we know what the version is when we download using: > svn co svn://www.davebalmain.com/exp ferret > > Do we look at ruby/lib/ferret_version.rb each time? 
> Yep From samuelgiffney at gmail.com Fri Oct 6 23:09:37 2006 From: samuelgiffney at gmail.com (Sam) Date: Sat, 7 Oct 2006 05:09:37 +0200 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> Message-ID: <461dd2192b05668c6b3df05939702d85@ruby-forum.com> Ferret indexes from version 0.10 are not compatible with Luke. I don't know of any compatible tool right now. John Clayton wrote: > Hey, > > Luke doesn't seem to be able to open a Ferret index I've created. Is > this expected? > > If yes, can someone recommend another index inspection tool? > > Thanks, > John -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Fri Oct 6 23:12:58 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 7 Oct 2006 12:12:58 +0900 Subject: [Ferret-talk] Updating to the bleeding edge version of Ferret In-Reply-To: <684c6ba5310448761d5a7ebf18e9345e@ruby-forum.com> References: <684c6ba5310448761d5a7ebf18e9345e@ruby-forum.com> Message-ID: On 10/7/06, Jerome wrote: > How do we uninstall a previous version of Ferret? > > I don't see an option in the setup.rb file. > > Thanks for any info on this, > > Jerome > That's why you build a gem. You can use `gem uninstall`. It isn't really necessary to delete previous versions, though; when you use gems, RubyGems will take care of using the latest version. If you've installed using setup.rb you'll need to delete the files by hand. You should be able to find out where it is installed using the 'gemwhich' command. Cheers, Dave From dbalmain.ml at gmail.com Sat Oct 7 00:01:20 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 7 Oct 2006 13:01:20 +0900 Subject: [Ferret-talk] Luke does not work with Ferret indexes? 
In-Reply-To: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> Message-ID: On 10/7/06, John Clayton wrote: > Hey, > > Luke doesn't seem to be able to open a Ferret index I've created. Is > this expected? > > If yes, can someone recommend another index inspection tool? > > Thanks, > John Yes, that is expected. Jens Kraemer implemented an inspection tool a while back, but it doesn't work with the latest version of Ferret. It is based on GTK. I played with it a bit earlier this week and got it to work with the latest version of Ferret's indexes but it still needs a little more testing. I'm also concerned about the portability of this solution. It won't work on OS X without an X server installed. I looked at a few of the other GUI options briefly but I still haven't decided on the best option. These were my first impressions but please feel free to correct me. I won't be offended. * Tk: very portable but some of the widgets I really wanted were missing, like tabbed panes. A lot of people seem to complain about it being ugly, but I think you can fix this by installing the Tile theming engine for Tk. * QtRuby: We'd need to use Qt4 because of licencing issues on Windows. I had trouble even installing Qt4, which doesn't bode well for it, and there is currently no binary gem for Windows. I don't want to deal with that problem again. Other than that, QtRuby looks really nice. * WxRuby: Again no binary gem but it has the widgets I need and it looks pretty good because it uses native widgets. I couldn't really find my way around the documentation though. I wanted a quick tutorial to show me how everything works. * FLTK: There are bindings for this in Ruby but I actually found it easy to use FLTK straight from C. I quite liked this solution because FLTK is light enough that I could possibly bundle FLTK with the package. Also, FLUID is a great GUI builder. 
Anyway, I only looked at each of these GUIs briefly. What do other people think an inspection tool should be built with? Cheers, Dave From peter at ioffer.com Sat Oct 7 00:58:47 2006 From: peter at ioffer.com (peter) Date: Fri, 06 Oct 2006 21:58:47 -0700 Subject: [Ferret-talk] Updating to the bleeding edge version of Ferret In-Reply-To: <684c6ba5310448761d5a7ebf18e9345e@ruby-forum.com> Message-ID: If you installed using the rake REL=0.10.9.1 Which creates a gem package, simply removing that gem package puts you back at the previous version you had..it's pretty self-contained. For example, I compiled a version of 0.10.9.1, but removed it for API change reasons, so I simply removed my 0.10.9.1 folder (but still kept my 0.10.9 folder (under the ruby/1.8/gems folder), and everything was back in sync. Hope this helps. > From: Jerome > Reply-To: ferret-talk at rubyforge.org > Date: Sat, 7 Oct 2006 01:56:15 +0200 > To: ferret-talk at rubyforge.org > Subject: Re: [Ferret-talk] Updating to the bleeding edge version of Ferret > > How do we uninstall a previous version of Ferret? > > I don't see an option in the setup.rb file. > > Thanks for any info on this, > > Jerome > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From erik at ehatchersolutions.com Sat Oct 7 04:25:10 2006 From: erik at ehatchersolutions.com (Erik Hatcher) Date: Sat, 7 Oct 2006 04:25:10 -0400 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> Message-ID: <98055D3A-ACE9-4586-BF8A-DF6ADD12C20F@ehatchersolutions.com> What about the even more obvious idea of creating a Rails front-end to Ferret indexes? That'd be portable, and easy enough for someone to install (and could come in handy remotely against an index on a server too). 
Erik On Oct 7, 2006, at 12:01 AM, David Balmain wrote: > On 10/7/06, John Clayton wrote: >> Hey, >> >> Luke doesn't seem to be able to open a Ferret index I've created. Is >> this expected? >> >> If yes, can someone recommend another index inspection tool? >> >> Thanks, >> John > > Yes, that is expected. Jens Kraemer implemented an inspection tool a > while back except it doesn't work with the latest version of Ferret. > It is based on GTK. I played with it a bit earlier this week and got > it to work with the latest version of Ferret's indexes but it still > needs a little more testing. I'm also concerned about the portability > of this solution. It won't work on OS X without X server installed. I > looked at a few of the other gui options briefly but I still haven't > decided on the best option. These were my first impressions but please > feel free to correct me. I won't be offended. > > * Tk: very portable but some of the widgets I really wanted where > missing like tabbed panes. A lot of people seem to complain about it > being ugly I think you can fix this by installing the Tile theming > engine for Tk. > > * QtRuby: We'd need to use Qt4 because of licencing issues on Windows. > I had trouble even installing Qt4 which doesn't bode well for it and > there is currently no binary gem for windows. I don't want to deal > with that problem again. Other than that, QtRuby looks really nice. > > * WxRuby: Again no binary gem but it has the widgets I need and it > looks pretty good because it uses native widgets. I couldn't really > find my way around the documentation though. I wanted to quick > tutorial to show me how everything works. > > * FLTK: There are bindings for this in Ruby but I actually found it > easy to use FLTK straight from C. I quite liked this solution because > FLTK is light enough that I could possibly package FLTK up in the > package. Also, FLUID is a great GUI builder. > > Anyway, I only looked at each of these GUIs briefly. 
What do other > people think an inspection tool should be built with? > > Cheers, > Dave > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From dbalmain.ml at gmail.com Sat Oct 7 07:58:33 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 7 Oct 2006 20:58:33 +0900 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: <98055D3A-ACE9-4586-BF8A-DF6ADD12C20F@ehatchersolutions.com> References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> <98055D3A-ACE9-4586-BF8A-DF6ADD12C20F@ehatchersolutions.com> Message-ID: On 10/7/06, Erik Hatcher wrote: > What about the even more obvious idea of creating a Rails front-end > to Ferret indexes? That'd be portable, and easy enough for someone > to install (and could come in handy remotely against an index on a > server too). > > Erik > Hey Erik, it's been a while. I've actually been considering this idea since I sent my earlier email and your reply has convinced me that it's a good one. I was actually thinking Rails might be overkill for this and just a simple webrick servlet might do the job fine. Maybe I'm wrong. I've just got back into using Rails again this week. It's been a while since I worked on a GUI of any kind. :P Dave From hutch at recursive.ca Sat Oct 7 08:57:46 2006 From: hutch at recursive.ca (Bob Hutchison) Date: Sat, 7 Oct 2006 08:57:46 -0400 Subject: [Ferret-talk] How to proceed with incorporating Ferret? Message-ID: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> Hi, I've listened in to this mail list for quite a while now but not doing anything with Ferret until I was ready to incorporate it. I've used Lucene for years, but not Ferret. I downloaded and installed the 'bleeding edge' version (lets call it 0.10.9.1). There appears to be a significant re-working of the API happening. It all looks good. But there might be a couple of gaps still there. 
The first question: should I even consider using the 0.10.9.1 version of Ferret? What I intend to use it for will not be a critical component, at least for the time being. I'm also used to working with shifting software. The advantage that I see is the new API. Performance is a BIG issue with my project. The second question: are there any opinions regarding ease-of-upgrade from the current stable version to what is being worked on now. I don't have anything to upgrade at the moment, but if I go with the stable version then I will have. The third question: it looks to me that in the 0.10.9.1 version the content of the fields is being stored in the index. For my application this is worse than a waste of time. Am I missing something. The fourth question: in a message from August 23 there was a hint of a write-up discussing the new API. Did this ever get published? I think there is some *very* nice work here. I'm looking forward to using Ferret. Cheers, Bob ---- Bob Hutchison -- blogs at Recursive Design Inc. -- Raconteur -- xampl for Ruby -- From marvin at rectangular.com Sat Oct 7 09:16:24 2006 From: marvin at rectangular.com (Marvin Humphrey) Date: Sat, 7 Oct 2006 06:16:24 -0700 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> Message-ID: <49F78D9A-D7BB-4913-9F18-A88F6CE9AB4A@rectangular.com> On Oct 6, 2006, at 9:01 PM, David Balmain wrote: > Anyway, I only looked at each of these GUIs briefly. What do other > people think an inspection tool should be built with? AJAX. This is on my todo list. But behind KinoSearch 0.20 and Lucy. An AJAX version of Luke would be a major contribution to the Lucene community, because it could be adapted easily to work with any of the different ports regardless of file format discrepancies. 
Marvin Humphrey Rectangular Research http://www.rectangular.com/ From jan.prill at gmail.com Sat Oct 7 14:29:06 2006 From: jan.prill at gmail.com (Jan Prill) Date: Sat, 7 Oct 2006 20:29:06 +0200 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: <49F78D9A-D7BB-4913-9F18-A88F6CE9AB4A@rectangular.com> References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> <49F78D9A-D7BB-4913-9F18-A88F6CE9AB4A@rectangular.com> Message-ID: <562a35c10610071129t2769ca02h12d82f0533ecaa38@mail.gmail.com> > > AJAX. > > This is on my todo list. But behind KinoSearch 0.20 and Lucy. > > An AJAX version of Luke would be a major contribution to the Lucene > community, because it could be adapted easily to work with any of the > different ports regardless of file format discrepancies. Rails with its great integration of prototype and scriptaculous looks like a perfect fit in this regard too. Cheers, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061007/b3b2a1d8/attachment-0001.html From ork at orkland.de Sat Oct 7 15:11:12 2006 From: ork at orkland.de (Benjamin Krause) Date: Sat, 07 Oct 2006 21:11:12 +0200 Subject: [Ferret-talk] Luke does not work with Ferret indexes? In-Reply-To: <49F78D9A-D7BB-4913-9F18-A88F6CE9AB4A@rectangular.com> References: <35F01A1E-03B4-4BA6-9D6A-921B25D95DA7@fivesquaresoftware.com> <49F78D9A-D7BB-4913-9F18-A88F6CE9AB4A@rectangular.com> Message-ID: <4527FBD0.4030005@orkland.de> Marvin Humphrey schrieb: >>Anyway, I only looked at each of these GUIs briefly. What do other >>people think an inspection tool should be built with? >> >> > >AJAX. > >This is on my todo list. But behind KinoSearch 0.20 and Lucy. > > An Ajax-Frontend to the Ferret-Index would be great.. yes. And I think this wouldn't take that much time, if ferret allows me to get all the information i need to be displayed on the frontend. 
David, maybe we can work something out here? Ben From dbalmain.ml at gmail.com Sun Oct 8 00:24:56 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 8 Oct 2006 13:24:56 +0900 Subject: [Ferret-talk] How to proceed with incorporating Ferret? In-Reply-To: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> References: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> Message-ID: On 10/7/06, Bob Hutchison wrote: > Hi, > > I've listened in to this mail list for quite a while now but not > doing anything with Ferret until I was ready to incorporate it. I've > used Lucene for years, but not Ferret. > > I downloaded and installed the 'bleeding edge' version (lets call it > 0.10.9.1). There appears to be a significant re-working of the API > happening. It all looks good. But there might be a couple of gaps > still there. I'm all ears. What do you think needs improvement? > The first question: should I even consider using the 0.10.9.1 version > of Ferret? What I intend to use it for will not be a critical > component, at least for the time being. I'm also used to working with > shifting software. The advantage that I see is the new API. > Performance is a BIG issue with my project. I've just released 0.10.10. Version 0.10.9 is probably the most stable version to date. 0.10.10 has some significant changes to improve performance of sorting and filtering of large unoptimized indexes (putting Ferret up to orders of magnitude ahead of Lucene for these tasks). In a few days we should know if I broke anything. There are currently only 3 outstanding tickets on Trac and they are only on Windows and OS X so if you are on Linux you should be fine. > The second question: are there any opinions regarding ease-of-upgrade > from the current stable version to what is being worked on now. I > don't have anything to upgrade at the moment, but if I go with the > stable version then I will have. 
Well, 0.10.9 is the most stable version since the pure ruby version so that would be the version I go with. Also, I can usually fix most problems within a day or two if I can reproduce the problem or you are willing to give me ssh access to your server. > The third question: it looks to me that in the 0.10.9.1 version the > content of the fields is being stored in the index. For my > application this is worse than a waste of time. Am I missing something. > It depends how you set your index up. You specify which fields you want stored/indexed or term-vectorized (I know, it's not a word). # set to not store fields by default field_infos = FieldInfos.new(:store => :no) # must store id field however field_infos.add_field(:id, :store => :yes, :index => :untokenized) > The fourth question: in a message from August 23 there was a hint of > a write-up discussing the new API. Did this ever get published? No. But I did update the documentation here: http://ferret.davebalmain.com/api/files/TUTORIAL.html You may even find the Ferret FAQ even better. http://ferret.davebalmain.com/trac/wiki/FAQ And there may be an O'Reilly "shortcut" coming out soon. > I think there is some *very* nice work here. I'm looking forward to > using Ferret. Great. Thanks, Dave From hutch at recursive.ca Sun Oct 8 11:11:20 2006 From: hutch at recursive.ca (Bob Hutchison) Date: Sun, 8 Oct 2006 11:11:20 -0400 Subject: [Ferret-talk] How to proceed with incorporating Ferret? In-Reply-To: References: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> Message-ID: <56AF114C-3903-45BA-AE16-A0FBA8BB9E2E@recursive.ca> On 8-Oct-06, at 12:24 AM, David Balmain wrote: > On 10/7/06, Bob Hutchison wrote: >> Hi, >> >> I've listened in to this mail list for quite a while now but not >> doing anything with Ferret until I was ready to incorporate it. I've >> used Lucene for years, but not Ferret. >> >> I downloaded and installed the 'bleeding edge' version (lets call it >> 0.10.9.1). 
There appears to be a significant re-working of the API >> happening. It all looks good. But there might be a couple of gaps >> still there. > > I'm all ears. What do you think needs improvement? It may simply be a misunderstanding on my part, read on. I also can't figure out how to redefine the field used as an id (again, read on, the documented way isn't working for me and probably because of what comes up below). > >> The first question: should I even consider using the 0.10.9.1 version >> of Ferret? What I intend to use it for will not be a critical >> component, at least for the time being. I'm also used to working with >> shifting software. The advantage that I see is the new API. >> Performance is a BIG issue with my project. > > I've just release 0.10.10. Version 0.10.9 is probably the most stable > version to date. 0.10.10 has some significant changes to improve > performance of sorting and filtering of large unoptimized indexes > (putting Ferret orders up to orders of magnitude ahead of Lucene for > these tasks). In a few days we should know if I broke anything. There > are currently only 3 outstanding tickets on Trac and they are only on > Windows and OS X so if you are on Linux you should be fine. Of course I'm running OS X... this couldn't be easy :-) I'm also seeing issues 127 and 136 (like everyone else on OS X will be). Another thing for OS X, until Apple fixes their gcc4 compiler either use the gcc3 compiler or use -O1 rather than -O2. I changed the ext_conf file to do this, but the two OS X issue remain. If you don't do this you will eventually get a corrupted heap (usually takes a while). I've had to recompile ruby to this optimisation level for it to work reliably. > >> The second question: are there any opinions regarding ease-of-upgrade >> from the current stable version to what is being worked on now. I >> don't have anything to upgrade at the moment, but if I go with the >> stable version then I will have. 
> > Well, 0.10.9 is the most stable version since the pure ruby version so > that would be the version I go with. Also, I can usually fix most > problems within a day or two if I can reproduce the problem or you are > willing to give me ssh access to your server. Okay, I'm convinced. The most recent is the way to go. > >> The third question: it looks to me that in the 0.10.9.1 version the >> content of the fields is being stored in the index. For my >> application this is worse than a waste of time. Am I missing >> something. >> > > It depends how you set your index up. You specify which fields you > want stored/indexed or term-vectorized (I know, it's not a word). > > # set to not store fields by default > field_infos = FieldInfos.new(:store => :no) > # must store id field however > field_infos.add_field(:id, :store => :yes, :index => :untokenized) So, I tried requiring ferret. It simply won't admit to knowing anything about the FieldInfos class. How bad are those two remaining OS X bugs? So, I tried requiring rferret. That worked better. I tried your example (actually I tried this before posting and this is why I said I thought I saw a few gaps). It doesn't work for me. The initialize method for FieldInfos is defined as: def initialize(dir = nil, name = nil) @fi_array = [] @fi_hash = {} if dir and dir.exists?(name) The options in your example are assigned to the dir and an exists? method is undefined on a hash and so a method missing exception is thrown. I've happily forgotten most of my C code, but it looks as though the C version is doing something similar (not that it matters in my case because FieldInfos is invisible) > >> The fourth question: in a message from August 23 there was a hint of >> a write-up discussing the new API. Did this ever get published? > > No. But I did update the documentation here: > > http://ferret.davebalmain.com/api/files/TUTORIAL.html I thought that was the old way since I couldn't get it to work (see above). 
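The FieldInfos failure described above (an options hash passed to a constructor whose first parameter is a directory) can be reproduced without Ferret installed. What follows is only a rough sketch under stated assumptions: OldStyleFieldInfos and old_style_error are hypothetical stand-ins, the constructor lines are copied from the excerpt quoted above, and none of this is Ferret's actual code.

```ruby
# Hypothetical stand-in for the older, pure-Ruby FieldInfos quoted above.
# Only the constructor signature and its first lines are mirrored here.
class OldStyleFieldInfos
  def initialize(dir = nil, name = nil)
    @fi_array = []
    @fi_hash = {}
    # With FieldInfos.new(:store => :no) the options hash is bound to dir,
    # and Hash does not respond to exists?.
    if dir and dir.exists?(name)
      # (loading existing field infos from the directory would happen here)
    end
  end
end

# Hypothetical helper: returns the NoMethodError raised by the constructor,
# or nil when construction succeeds.
def old_style_error(*args)
  OldStyleFieldInfos.new(*args)
  nil
rescue NoMethodError => e
  e
end

error = old_style_error(:store => :no)
puts error.class              # the hash was treated as a directory
puts old_style_error.inspect  # no arguments: the exists? check is skipped
```

The stand-in only shows why a (dir, name) signature rejects an options hash; it says nothing about how the 0.10.x FieldInfos is implemented.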
> > You may even find the Ferret FAQ even better. > > http://ferret.davebalmain.com/trac/wiki/FAQ I don't know how I missed that. Thanks. > > And there may be an O'Reilly "shortcut" coming out soon. That's great! Cheers, Bob > >> I think there is some *very* nice work here. I'm looking forward to >> using Ferret. > > Great. Thanks, > Dave > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk ---- Bob Hutchison -- blogs at Recursive Design Inc. -- Raconteur -- xampl for Ruby -- From hutch at recursive.ca Sun Oct 8 11:32:39 2006 From: hutch at recursive.ca (Bob Hutchison) Date: Sun, 8 Oct 2006 11:32:39 -0400 Subject: [Ferret-talk] How to proceed with incorporating Ferret? In-Reply-To: <56AF114C-3903-45BA-AE16-A0FBA8BB9E2E@recursive.ca> References: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> <56AF114C-3903-45BA-AE16-A0FBA8BB9E2E@recursive.ca> Message-ID: <097D5156-9E65-4804-8E76-D2DCA48C792C@recursive.ca> It looks as though I somehow got the wrong version out of subversion. Hold on while I do this again. Sorry about that. Bob On 8-Oct-06, at 11:11 AM, Bob Hutchison wrote: > > On 8-Oct-06, at 12:24 AM, David Balmain wrote: > >> On 10/7/06, Bob Hutchison wrote: >>> Hi, >>> >>> I've listened in to this mail list for quite a while now but not >>> doing anything with Ferret until I was ready to incorporate it. I've >>> used Lucene for years, but not Ferret. >>> >>> I downloaded and installed the 'bleeding edge' version (lets call it >>> 0.10.9.1). There appears to be a significant re-working of the API >>> happening. It all looks good. But there might be a couple of gaps >>> still there. >> >> I'm all ears. What do you think needs improvement? > > It may simply be a misunderstanding on my part, read on. 
I also > can't figure out how to redefine the field used as an id (again, > read on, the documented way isn't working for me and probably > because of what comes up below). > >> >>> The first question: should I even consider using the 0.10.9.1 >>> version >>> of Ferret? What I intend to use it for will not be a critical >>> component, at least for the time being. I'm also used to working >>> with >>> shifting software. The advantage that I see is the new API. >>> Performance is a BIG issue with my project. >> >> I've just release 0.10.10. Version 0.10.9 is probably the most stable >> version to date. 0.10.10 has some significant changes to improve >> performance of sorting and filtering of large unoptimized indexes >> (putting Ferret orders up to orders of magnitude ahead of Lucene for >> these tasks). In a few days we should know if I broke anything. There >> are currently only 3 outstanding tickets on Trac and they are only on >> Windows and OS X so if you are on Linux you should be fine. > > Of course I'm running OS X... this couldn't be easy :-) I'm also > seeing issues 127 and 136 (like everyone else on OS X will be). > Another thing for OS X, until Apple fixes their gcc4 compiler > either use the gcc3 compiler or use -O1 rather than -O2. I changed > the ext_conf file to do this, but the two OS X issue remain. If you > don't do this you will eventually get a corrupted heap (usually > takes a while). I've had to recompile ruby to this optimisation > level for it to work reliably. > >> >>> The second question: are there any opinions regarding ease-of- >>> upgrade >>> from the current stable version to what is being worked on now. I >>> don't have anything to upgrade at the moment, but if I go with the >>> stable version then I will have. >> >> Well, 0.10.9 is the most stable version since the pure ruby >> version so >> that would be the version I go with. 
Also, I can usually fix most >> problems within a day or two if I can reproduce the problem or you >> are >> willing to give me ssh access to your server. > > Okay, I'm convinced. The most recent is the way to go. > >> >>> The third question: it looks to me that in the 0.10.9.1 version the >>> content of the fields is being stored in the index. For my >>> application this is worse than a waste of time. Am I missing >>> something. >>> >> >> It depends how you set your index up. You specify which fields you >> want stored/indexed or term-vectorized (I know, it's not a word). >> >> # set to not store fields by default >> field_infos = FieldInfos.new(:store => :no) >> # must store id field however >> field_infos.add_field(:id, :store => :yes, :index >> => :untokenized) > > So, I tried requiring ferret. It simply won't admit to knowing > anything about the FieldInfos class. How bad are those two > remaining OS X bugs? > > So, I tried requiring rferret. That worked better. > > I tried your example (actually I tried this before posting and this > is why I said I thought I saw a few gaps). It doesn't work for me. > The initialize method for FieldInfos is defined as: > > def initialize(dir = nil, name = nil) > @fi_array = [] > @fi_hash = {} > if dir and dir.exists?(name) > > The options in your example are assigned to the dir and an exists? > method is undefined on a hash and so a method missing exception is > thrown. > > I've happily forgotten most of my C code, but it looks as though > the C version is doing something similar (not that it matters in my > case because FieldInfos is invisible) > >> >>> The fourth question: in a message from August 23 there was a hint of >>> a write-up discussing the new API. Did this ever get published? >> >> No. But I did update the documentation here: >> >> http://ferret.davebalmain.com/api/files/TUTORIAL.html > > I thought that was the old way since I couldn't get it to work (see > above). 
> >> >> You may even find the Ferret FAQ even better. >> >> http://ferret.davebalmain.com/trac/wiki/FAQ > > I don't know how I missed that. Thanks. > >> >> And there may be an O'Reilly "shortcut" coming out soon. > > That's great! > > Cheers, > Bob > >> >>> I think there is some *very* nice work here. I'm looking forward to >>> using Ferret. >> >> Great. Thanks, >> Dave >> _______________________________________________ >> Ferret-talk mailing list >> Ferret-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/ferret-talk > > ---- > Bob Hutchison -- blogs at hutch/> > Recursive Design Inc. -- > Raconteur -- > xampl for Ruby -- xampl/> > > > ---- Bob Hutchison -- blogs at Recursive Design Inc. -- Raconteur -- xampl for Ruby -- From hutch at recursive.ca Sun Oct 8 11:55:43 2006 From: hutch at recursive.ca (Bob Hutchison) Date: Sun, 8 Oct 2006 11:55:43 -0400 Subject: [Ferret-talk] How to proceed with incorporating Ferret? In-Reply-To: <097D5156-9E65-4804-8E76-D2DCA48C792C@recursive.ca> References: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> <56AF114C-3903-45BA-AE16-A0FBA8BB9E2E@recursive.ca> <097D5156-9E65-4804-8E76-D2DCA48C792C@recursive.ca> Message-ID: <9DC7758D-EA51-41F4-A0FA-9F8B67A70211@recursive.ca> On 8-Oct-06, at 11:32 AM, Bob Hutchison wrote: > It looks as though I somehow got the wrong version out of > subversion. Hold on while I do this again. Sorry about that. That is what happened, sorry for the noise. The 0.10.10 version is running at least 225 times faster. And the tutorial works. Sigh. (I got the version I was working from with this command: svn checkout svn://davebalmain.com/ferret/trunk ferret and I don't remember where I got that from) Well, I'm comfortably set. Cheers, Bob ---- Bob Hutchison -- blogs at Recursive Design Inc. 
-- Raconteur -- xampl for Ruby -- From anrake at gmail.com Sun Oct 8 21:38:52 2006 From: anrake at gmail.com (anrake) Date: Mon, 9 Oct 2006 03:38:52 +0200 Subject: [Ferret-talk] AaF not indexing models when no index present in multi_searc Message-ID: <98078283c417114f1e2a5563f5552426@ruby-forum.com> The first time I tried to run a multi_search, I got an error and discovered there were no indexes for any of the models. index > model folders were there but no actual indexes. I couldn't figure out how to get it to create indexes so I edited a record for each model which created all the indexes. Then when I ran the search, everything worked well. Is there a bug related to this somewhere or anyway to better force index creation when none exists. I'm running the newest version (but not trunk) of each Ferret and acts_as_ferret -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Sun Oct 8 22:44:58 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 9 Oct 2006 11:44:58 +0900 Subject: [Ferret-talk] How to proceed with incorporating Ferret? In-Reply-To: <9DC7758D-EA51-41F4-A0FA-9F8B67A70211@recursive.ca> References: <3F9529F1-AED9-4F94-9B23-6691568C688D@recursive.ca> <56AF114C-3903-45BA-AE16-A0FBA8BB9E2E@recursive.ca> <097D5156-9E65-4804-8E76-D2DCA48C792C@recursive.ca> <9DC7758D-EA51-41F4-A0FA-9F8B67A70211@recursive.ca> Message-ID: On 10/9/06, Bob Hutchison wrote: > > On 8-Oct-06, at 11:32 AM, Bob Hutchison wrote: > > > It looks as though I somehow got the wrong version out of > > subversion. Hold on while I do this again. Sorry about that. > > That is what happened, sorry for the noise. The 0.10.10 version is > running at least 225 times faster. And the tutorial works. Sigh. > > (I got the version I was working from with this command: > > svn checkout svn://davebalmain.com/ferret/trunk ferret > > and I don't remember where I got that from) > > Well, I'm comfortably set. > > Cheers, > Bob > Sorry that was my fault. 
The current version of Ferret is in a different repository: svn co svn://www.davebalmain.com/exp ferret The reason for this is that the current version started out as an experimental version where I was trying a few things out and ended up being a complete rewrite with a different file format and all. I still have to roll it into the original ferret repository. Dave From khaosduke at gmail.com Mon Oct 9 04:35:00 2006 From: khaosduke at gmail.com (wc) Date: Mon, 9 Oct 2006 10:35:00 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! Message-ID: Hello, I am trying to use the multi_search method, but I keep getting type error on nil objects, I send it [Model1,Model2] and it seems as though the class names keep getting clobbered and turn to nil, somewhere along the multi_index area. I tried to trace what was going on, but I got nothing, also, this only happens when there are actually hits(thank god, most of the time). Perhaps some insight? Thank you! -- Posted via http://www.ruby-forum.com/. From anrake at gmail.com Mon Oct 9 05:19:20 2006 From: anrake at gmail.com (anrake) Date: Mon, 9 Oct 2006 11:19:20 +0200 Subject: [Ferret-talk] ferret finds 'tests' but not 'test' In-Reply-To: References: <94cbc17ff76e8950daeea9a13b10afd6@ruby-forum.com> Message-ID: Hi, if I use this stemming analyzer, where do I put it? In /lib/, and require it in each model? -Anrake David Balmain wrote: > On 9/6/06, Alastair Moore wrote: >> Alastair > The default analyzer doesn't perform any stemming. You need to create > your own analyzer with a stemmer.
Something like this; > > require 'rubygems' > require 'ferret' > > module Ferret::Analysis > class MyAnalyzer > def token_stream(field, text) > StemFilter.new(StandardTokenizer.new(text)) > end > end > end > > index = Ferret::I.new(:analyzer => Ferret::Analysis::MyAnalyzer.new) > > index << "test" > index << "tests debate debater debating the for," > puts index.search("test").total_hits > > Hope that helps, > Dave -- Posted via http://www.ruby-forum.com/. From none at none.com Mon Oct 9 09:27:34 2006 From: none at none.com (bbqTree) Date: Mon, 9 Oct 2006 15:27:34 +0200 Subject: [Ferret-talk] hello, acts_as_ferret questions, any help greatly appreciate Message-ID: <00999a3a6e339623990e55492113e273@ruby-forum.com> hi, ive been reading up on ferret, acts_as_ferret, and other search plugins for rails. after reading about ferret, i found out about the acts_as_ferrt plugin. my first question about acts_as_ferret: 1. from reading about ferret, do i still need to manually save the IDX and add a IDX column field to my model table for acts_as_ferret to work? they say that acts_as_ferret handles everything, but i wasnt sure what exactly does it handle when compared to the ferret tutorials that i read. 2. is there a complete tutorial online for acts_as_ferret plugin? so far all the blogs that i came across pretty much say the same thing. 3. does acts_as_ferret do this? if i specify acts_as_ferret to index just title column in my 'recipe' table, and if someone types 'bbq plate' in the search box, will these recipes with the following title match? "island bbq style plate" "bbq ribs" "potato plate salad" "my bbq plate" if so, will the one with the most matching text like "my bbq plate" be listed as #1? Thanks for the help! -- Posted via http://www.ruby-forum.com/. From Glsio.vip at gmail.com Mon Oct 9 11:07:14 2006 From: Glsio.vip at gmail.com (Glsio) Date: Mon, 9 Oct 2006 17:07:14 +0200 Subject: [Ferret-talk] How could acts_as_taggable work with ferret? 
Message-ID: Suppose that acts_as_taggable needs three models: Tag, Tagging, Article. Could tag search be realized using acts_as_ferret? Suppose the tags are "a", "ab", "abc". If I wanted to get all the articles with a tag name including "a", could Ferret satisfy this requirement? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Oct 9 11:50:28 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 9 Oct 2006 17:50:28 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! In-Reply-To: References: Message-ID: <20061009155028.GA6332@cordoba.webit.de> Hi! On Mon, Oct 09, 2006 at 10:35:00AM +0200, wc wrote: > Hello, I am trying to use the multi_search method, but I keep getting > type error on nil objects, I send it [Model1,Model2] and it seems as > though the class names keep getting clobbered and turn to nil, somewhere > along the multi_index area. I tried to trace what was going on, but I > got nothing, also, this only happens when there are actually hits(thank > god, most of the time). Perhaps some insight? Thank you! a stack trace and / or some code would help in tracking this down... Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From dbalmain.ml at gmail.com Mon Oct 9 12:22:31 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 10 Oct 2006 01:22:31 +0900 Subject: [Ferret-talk] hello, acts_as_ferret questions, any help greatly appreciate In-Reply-To: <00999a3a6e339623990e55492113e273@ruby-forum.com> References: <00999a3a6e339623990e55492113e273@ruby-forum.com> Message-ID: On 10/9/06, bbqTree wrote: > hi, ive been reading up on ferret, acts_as_ferret, and other search > plugins for rails. > > after reading about ferret, i found out about the acts_as_ferrt plugin. > > my first question about acts_as_ferret: > > 1.
from reading about ferret, do i still need to manually save the IDX > and add a IDX column field to my model table for acts_as_ferret to work? > they say that acts_as_ferret handles everything, but i wasnt sure what > exactly does it handle when compared to the ferret tutorials that i > read. Personally I don't use acts_as_ferret but I'll try and answer these questions as best I can. The following will automatically index the fields of the Foo model: class Foo < ActiveRecord::Base acts_as_ferret end See http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage for more advanced usage. > 2. is there a complete tutorial online for acts_as_ferret plugin? so far > all the blogs that i came across pretty much say the same thing. That above link is the most complete tutorial I know of. Maybe someone else will know of other resources. If you need to do anything more advanced then you should probably learn more about Ferret itself. > 3. does acts_as_ferret do this? if i specify acts_as_ferret to index > just title column in my 'recipe' table, and if someone types 'bbq plate' > in the search box, will these recipes with the following title match? > > "island bbq style plate" > "bbq ribs" > "potato plate salad" > "my bbq plate" > > if so, will the one with the most matching text like "my bbq plate" be > listed as #1? Ferret is pretty easy to experiment with. Here is an example: require 'rubygems' require 'ferret' index = Ferret::I.new [ "island bbq style plate", "bbq ribs", "potato plate salad", "my bbq plate" ].each {|text| index << text} puts index.search('bbq plate') puts index.search('"bbq plate"') And the output: TopDocs: total_hits = 4, max_score = 0.883883 [ 3 "my bbq plate": 0.883883 0 "island bbq style plate": 0.707107 1 "bbq ribs": 0.220971 2 "potato plate salad": 0.176777 ] TopDocs: total_hits = 1, max_score = 1.250000 [ 3 "my bbq plate": 1.250000 ] So to answer your question, yes, the results are ordered by relevency. > Thanks for the help! no problem. 
Dave From kraemer at webit.de Mon Oct 9 12:22:38 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 9 Oct 2006 18:22:38 +0200 Subject: [Ferret-talk] hello, acts_as_ferret questions, any help greatly appreciate In-Reply-To: <00999a3a6e339623990e55492113e273@ruby-forum.com> References: <00999a3a6e339623990e55492113e273@ruby-forum.com> Message-ID: <20061009162238.GB6332@cordoba.webit.de> On Mon, Oct 09, 2006 at 03:27:34PM +0200, bbqTree wrote: > hi, ive been reading up on ferret, acts_as_ferret, and other search > plugins for rails. > > after reading about ferret, i found out about the acts_as_ferrt plugin. > > my first question about acts_as_ferret: > > 1. from reading about ferret, do i still need to manually save the IDX > and add a IDX column field to my model table for acts_as_ferret to work? > they say that acts_as_ferret handles everything, but i wasnt sure what > exactly does it handle when compared to the ferret tutorials that i > read. I don't know what that IDX column should be good for, where did you read about that ? basically acts_as_ferret can handle everything for you. just call acts_as_ferret in your model class definition, and you'll get (besides other goodies and methods for special use cases): - automatic index creation for existing records - new records are automatically added to the index - automatic index updates, when you update your model data - automatic removal of documents from the index when the corresponding model object is destroyed. - Model.find_by_contents('querystring') to retrieve model instances matching the querystring. the acts_as_ferret method offers several options to customize the way aaf works, i.e. which fields of your model should get indexed, and with which indexing options (refer to the api docs at http://projects.jkraemer.net/acts_as_ferret/rdoc) > 2. is there a complete tutorial online for acts_as_ferret plugin? so far > all the blogs that i came across pretty much say the same thing. 
the API docs for the acts_as_ferret and find_by_contents methods should get you started. You might also want to check out the demo project from the acts_as_ferret svn (see the wiki at http://projects.jkraemer.net/acts_as_ferret/ for the repository url) > 3. does acts_as_ferret do this? if i specify acts_as_ferret to index > just title column in my 'recipe' table, and if someone types 'bbq plate' > in the search box, will these recipes with the following title match? > > "island bbq style plate" > "bbq ribs" > "potato plate salad" > "my bbq plate" since acts_as_ferret defaults to ANDed queries, only records 1 and 4 would match. you can however make aaf use ORed queries by specifying :or_default => true in your call to acts_as_ferret. > if so, will the one with the most matching text like "my bbq plate" be > listed as #1? I suppose it would be #1, not because the exact phrase matches, but because it's the most relevant hit (the overall size of the text is smaller than that of the other document matching the query, and a match in a short text generally scores higher than one in a large text). Anybody please correct me if I'm wrong here. cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From lutz.horn at gmx.de Mon Oct 9 12:27:11 2006 From: lutz.horn at gmx.de (Lutz Horn) Date: Mon, 9 Oct 2006 18:27:11 +0200 Subject: [Ferret-talk] acts_as_ferret: case insensitive search Message-ID: How can I index and search RoR model objects in a case insensitive manner? In Ferret there is the LowerCaseFilter (http://ferret.davebalmain.com/api/classes/Ferret/Analysis/LowerCaseFilter.html). How can I utilize it and other filters with acts_as_ferret? -- Posted via http://www.ruby-forum.com/.
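Two points from this part of the thread, ANDed versus ORed query terms and case-insensitive matching, can be illustrated with a toy matcher in plain Ruby. This is only a sketch of the idea and not how Ferret or acts_as_ferret implement it: the names tokenize, matches? and search are hypothetical, there is no scoring, and the only assumption carried over from the discussion is that text is lowercased both when it is indexed and when it is queried.

```ruby
# The four recipe titles from the question above.
TITLES = [
  'island bbq style plate',
  'bbq ribs',
  'potato plate salad',
  'my bbq plate'
].freeze

# Split on non-word characters and lowercase, so "BBQ" and "bbq" compare equal.
def tokenize(text)
  text.downcase.scan(/\w+/)
end

# mode :and requires every query term; any other mode requires at least one.
def matches?(title, query, mode)
  terms  = tokenize(title)
  wanted = tokenize(query)
  if mode == :and
    wanted.all? { |t| terms.include?(t) }
  else
    wanted.any? { |t| terms.include?(t) }
  end
end

# Return the titles that match the query under the given mode.
def search(query, mode)
  TITLES.select { |title| matches?(title, query, mode) }
end

p search('BBQ Plate', :and)  # => ["island bbq style plate", "my bbq plate"]
p search('BBQ Plate', :or).length  # => 4
```

With ANDed terms only records 1 and 4 come back, and with ORed terms all four do, regardless of the case used in the query.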
From kraemer at webit.de Mon Oct 9 12:38:11 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 9 Oct 2006 18:38:11 +0200 Subject: [Ferret-talk] AaF not indexing models when no index present in multi_searc In-Reply-To: <98078283c417114f1e2a5563f5552426@ruby-forum.com> References: <98078283c417114f1e2a5563f5552426@ruby-forum.com> Message-ID: <20061009163810.GE6332@cordoba.webit.de> On Mon, Oct 09, 2006 at 03:38:52AM +0200, anrake wrote: > The first time I tried to run a multi_search, I got an error and > discovered there were no indexes for any of the models. index > model > folders were there but no actual indexes. I couldn't figure out how to > get it to create indexes so I edited a record for each model which > created all the indexes. Then when I ran the search, everything worked > well. calling Model.find_by_contents('some query') on each model would have been sufficient, this method does rebuild the index if it doesn't exist yet. > Is there a bug related to this somewhere or anyway to better force index > creation when none exists. I'm running the newest version (but not > trunk) of each Ferret and acts_as_ferret This is a known issue (there's a #todo in the code somewhere...). I opened up a ticket for this one. Hope to get it fixed soon ;-) cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From khaosduke at gmail.com Mon Oct 9 17:31:47 2006 From: khaosduke at gmail.com (wc) Date: Mon, 9 Oct 2006 23:31:47 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! In-Reply-To: <20061009155028.GA6332@cordoba.webit.de> References: <20061009155028.GA6332@cordoba.webit.de> Message-ID: Jens Kraemer wrote: > Hi!
> > On Mon, Oct 09, 2006 at 10:35:00AM +0200, wc wrote: >> Hello, I am trying to use the multi_search method, but I keep getting >> type error on nil objects, I send it [Model1,Model2] and it seems as >> though the class names keep getting clobbered and turn to nil, somewhere >> along the multi_index area. I tried to trace what was going on, but I >> got nothing, also, this only happens when there are actually hits(thank >> god, most of the time). Perhaps some insight? Thank you! > > a stack trace and / or some code would help in tracking this down... > > Jens > > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 Here is the trace User.multi_search("foo",[Model1,Model2]) TypeError: nil is not a symbol from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:412:in `const_get' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:412:in `multi_search' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:431:in `id_multi_search' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/multi_index.rb:27:in `search_each' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:428:in `id_multi_search' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:411:in `multi_search' from (irb):1 both my models call: acts_as_ferret :store_class_name => true and that is basically it, find_by_contents works great, just not multi_search Hope this helps ~wil -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Mon Oct 9 18:58:13 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 10 Oct 2006 07:58:13 +0900 Subject: [Ferret-talk] How could acts_as_taggable work with ferret? 
In-Reply-To: References: Message-ID: On 10/10/06, Glsio wrote: > Suppose that acts_as_taggable need three models: Tag, Tagging, Article > > Could tag search be realized using acts_as_ferret? > Suppose tags:"a","ab","abc",If i wanted to get all the articles with tag > name including "a", Could ferret satifisy this requirement? > Yes, no problem at all. You just need to make sure that the tag field is included in the index for the Article model. Dave From dbalmain.ml at gmail.com Mon Oct 9 19:45:55 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 10 Oct 2006 08:45:55 +0900 Subject: [Ferret-talk] acts_as_ferret: case insensitive search In-Reply-To: References: Message-ID: On 10/10/06, Lutz Horn wrote: > How can I index and search RoR model objects in a case insensitive > manner? In Ferret there is the LowerCaseFilter > (http://ferret.davebalmain.com/api/classes/Ferret/Analysis/LowerCaseFilter.html). > How can I utilize it and other filters with acts_as_ferret? > The LowerCaseFilter is used by default so you don't need to worry about it. If you need to build your own analyzer you can specify it as one of the ferret_options to acts_as_ferret. See the documentation here: http://projects.jkraemer.net/acts_as_ferret/rdoc/classes/FerretMixin/Acts/ARFerret/ClassMethods.html#M000006 To build your own analyzer see the documentation for Ferret::Analysis::Analyzer. Look here: http://ferret.davebalmain.com/api/ cheers, Dave From christopher.kilmer at biego.com Mon Oct 9 21:32:57 2006 From: christopher.kilmer at biego.com (Chris Kilmer) Date: Tue, 10 Oct 2006 03:32:57 +0200 Subject: [Ferret-talk] oddness when adding to index - Message-ID: <5198caf199792dc916294a8414d425e6@ruby-forum.com> I was having some odd results when working with acts_as_ferret (current trunk), so I decided to test with the current version of ferret to see if I encountered the same problem. I did.
Here are the details: installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret' (btw, same results on OSX) opened up an irb session: irb(main):001:0> require 'rubygems' => true irb(main):002:0> require 'ferret' => true irb(main):003:0> include Ferret => Object irb(main):004:0> i = Ferret::I.new => #:*, :dir=>#, :analyzer=>#, :lock_retry_time=>2}, @mon_entering_queue=[], @qp=nil, @searcher=nil, @mon_count=0, @default_field=:*, @close_dir=true, @auto_flush=false, @open=true, @mon_owner=nil, @id_field=:id, @reader=nil, @mon_waiting_queue=[], @writer=nil, @default_input_field=:id, @dir=#> *** Now let's add 3 strings to the index *** irb(main):005:0> ["While you were out pet care", "Eastside dog walker", "Top daw g dog walker"].each {|text| i << text } => ["While you were out pet care", "Eastside dog walker", "Top dawg dog walker"] *** Now let's do some searches *** irb(main):006:0> puts i.search('pet') TopDocs: total_hits = 1, max_score = 0.878416 [ 0 "While you were out pet care": 0.878416 ] => nil irb(main):007:0> puts i.search('dog') TopDocs: total_hits = 2, max_score = 0.500000 [ 1 "Eastside dog walker": 0.500000 2 "Top dawg dog walker": 0.500000 ] => nil irb(main):008:0> puts i.search('dawg') TopDocs: total_hits = 1, max_score = 0.702733 [ 2 "Top dawg dog walker": 0.702733 ] => nil irb(main):010:0> puts i.search('cat') TopDocs: total_hits = 0, max_score = 0.000000 [ ] => nil *** The previous 4 searches gave expected results. Notice that search for 'cat' returned nothing (as it should) *** *** Let's add some more strings to the index. They are the same, but does it matter? *** irb(main):010:0> ["While you were out pet care", "Eastside dog walker", "Top dawg g dog walker"].each { |text| i << text } => ["While you were out pet care", "Eastside dog walker", "Top dawg dog walker"] irb(main):011:0> ["While you were out pet care", "Eastside dog walker", "Top dawg g dog walker"].each { |text| i << text } *** Once again, do a search for 'cat'. 
*** puts i.search('cat') TopDocs: total_hits = 2, max_score = 1.395880 [ 2 "Top dawg dog walker": 1.395880 5 "Top dawg dog walker": 1.395880 ] => nil *** The last search returned two results for 'cat', which is incorrect *** It seems I can add any number of items to an index once without a problem. However, once I add more items to the index, I start getting incorrect resuts. Can anybody shed some light on the issue? Any help would be appreciated. -- Posted via http://www.ruby-forum.com/. From Glsio.vip at gmail.com Mon Oct 9 23:39:11 2006 From: Glsio.vip at gmail.com (Glsio) Date: Tue, 10 Oct 2006 05:39:11 +0200 Subject: [Ferret-talk] How could acts_as_taggable work with ferret? In-Reply-To: References: Message-ID: <29be76e457cee85cc1b7580b66a75a2b@ruby-forum.com> David Balmain wrote: > On 10/10/06, Glsio wrote: >> Suppose that acts_as_taggable need three models: Tag, Tagging, Article >> >> Could tag search be realized using acts_as_ferret? >> Suppose tags:"a","ab","abc",If i wanted to get all the articles with tag >> name including "a", Could ferret satifisy this requirement? >> > > Yes, no problem at all. You just need to make sure that the tag field > is included in the index for the Article model. > > Dave Hi,Dave Can you show me example? very grateful. Glsio -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Tue Oct 10 01:24:26 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 10 Oct 2006 14:24:26 +0900 Subject: [Ferret-talk] oddness when adding to index - In-Reply-To: <5198caf199792dc916294a8414d425e6@ruby-forum.com> References: <5198caf199792dc916294a8414d425e6@ruby-forum.com> Message-ID: On 10/10/06, Chris Kilmer wrote: > I was having some odd results when working with acts_as_ferret (current > trunk), so I decided to test with the current version of ferret to see > if I encountered the same problem. I did. 
Here are the details: > > installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret' > (btw, same results on OSX) > > opened up an irb session: > > > irb(main):001:0> require 'rubygems' > => true > irb(main):002:0> require 'ferret' > => true > irb(main):003:0> include Ferret > => Object > > irb(main):004:0> i = Ferret::I.new > => #:*, > :dir=>#, > :analyzer=>#, > :lock_retry_time=>2}, @mon_entering_queue=[], @qp=nil, @searcher=nil, > @mon_count=0, @default_field=:*, @close_dir=true, @auto_flush=false, > @open=true, @mon_owner=nil, @id_field=:id, @reader=nil, > @mon_waiting_queue=[], @writer=nil, @default_input_field=:id, > @dir=#> > > *** Now let's add 3 strings to the index *** > > irb(main):005:0> ["While you were out pet care", "Eastside dog walker", > "Top daw > g dog walker"].each {|text| i << text } > => ["While you were out pet care", "Eastside dog walker", "Top dawg dog > walker"] > > *** Now let's do some searches *** > > irb(main):006:0> puts i.search('pet') > TopDocs: total_hits = 1, max_score = 0.878416 [ > 0 "While you were out pet care": 0.878416 > ] > => nil > > irb(main):007:0> puts i.search('dog') > TopDocs: total_hits = 2, max_score = 0.500000 [ > 1 "Eastside dog walker": 0.500000 > 2 "Top dawg dog walker": 0.500000 > ] > => nil > > irb(main):008:0> puts i.search('dawg') > TopDocs: total_hits = 1, max_score = 0.702733 [ > 2 "Top dawg dog walker": 0.702733 > ] > => nil > > irb(main):010:0> puts i.search('cat') > TopDocs: total_hits = 0, max_score = 0.000000 [ > ] > => nil > > *** The previous 4 searches gave expected results. Notice that search > for 'cat' returned nothing (as it should) *** > > *** Let's add some more strings to the index. They are the same, but > does it matter? 
*** > > > irb(main):010:0> ["While you were out pet care", "Eastside dog walker", > "Top dawg > g dog walker"].each { |text| i << text } > => ["While you were out pet care", "Eastside dog walker", "Top dawg dog > walker"] > irb(main):011:0> ["While you were out pet care", "Eastside dog walker", > "Top dawg > g dog walker"].each { |text| i << text } > > *** Once again, do a search for 'cat'. *** > > puts i.search('cat') > TopDocs: total_hits = 2, max_score = 1.395880 [ > 2 "Top dawg dog walker": 1.395880 > 5 "Top dawg dog walker": 1.395880 > ] > => nil > > > *** The last search returned two results for 'cat', which is incorrect > *** > > It seems I can add any number of items to an index once without a > problem. However, once I add more items to the index, I start getting > incorrect resuts. Can anybody shed some light on the issue? Any help > would be appreciated. > Hi Chris, That is definitely a bug. I'll look into it. Cheers, Dave From dbalmain.ml at gmail.com Tue Oct 10 01:39:35 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 10 Oct 2006 14:39:35 +0900 Subject: [Ferret-talk] oddness when adding to index - In-Reply-To: References: <5198caf199792dc916294a8414d425e6@ruby-forum.com> Message-ID: On 10/10/06, David Balmain wrote: > On 10/10/06, Chris Kilmer wrote: > > I was having some odd results when working with acts_as_ferret (current > > trunk), so I decided to test with the current version of ferret to see > > if I encountered the same problem. I did. Here are the details: > > > > installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret' > > (btw, same results on OSX) > > > >recipe to reproduce error > > > > It seems I can add any number of items to an index once without a > > problem. However, once I add more items to the index, I start getting > > incorrect resuts. Can anybody shed some light on the issue? Any help > > would be appreciated. > > > > Hi Chris, > That is definitely a bug. I'll look into it. 
> > Cheers, > Dave > Yep, that was a bug introduced in verion 0.10.10. I'll get a new gem up ASAP. You can get the fixed code from subversion right now if you are in a hurry. From aditya_nalla at yahoo.co.in Tue Oct 10 02:28:55 2006 From: aditya_nalla at yahoo.co.in (Aditya) Date: Tue, 10 Oct 2006 08:28:55 +0200 Subject: [Ferret-talk] How could acts_as_taggable work with ferret? In-Reply-To: <29be76e457cee85cc1b7580b66a75a2b@ruby-forum.com> References: <29be76e457cee85cc1b7580b66a75a2b@ruby-forum.com> Message-ID: Glsio wrote: > David Balmain wrote: >> On 10/10/06, Glsio wrote: >>> Suppose that acts_as_taggable need three models: Tag, Tagging, Article >>> >>> Could tag search be realized using acts_as_ferret? >>> Suppose tags:"a","ab","abc",If i wanted to get all the articles with tag >>> name including "a", Could ferret satifisy this requirement? >>> >> >> Yes, no problem at all. You just need to make sure that the tag field >> is included in the index for the Article model. >> >> Dave > > Hi,Dave > Can you show me example? very grateful. > > Glsio Hi... All these to ur model acts_as_taggable acts_as_ferret :fields => ["field1", "field2", :tag_list] Take care of one thing ur page should not have any tabs..replace all tabs by spaces -- Posted via http://www.ruby-forum.com/. From jan.prill at gmail.com Tue Oct 10 02:52:39 2006 From: jan.prill at gmail.com (Jan Prill) Date: Tue, 10 Oct 2006 08:52:39 +0200 Subject: [Ferret-talk] oddness when adding to index - In-Reply-To: <5198caf199792dc916294a8414d425e6@ruby-forum.com> References: <5198caf199792dc916294a8414d425e6@ruby-forum.com> Message-ID: <562a35c10610092352y268696fcs649aad56e50ec7ab@mail.gmail.com> Hi Chris, can't reproduce this on windows and ferret 0.10.9. 
The following snippet gives these results: TopDocs: total_hits = 1, max_score = 0.878416 [ 0: 0.878416 ] TopDocs: total_hits = 2, max_score = 0.500000 [ 1: 0.500000 2: 0.500000 ] TopDocs: total_hits = 1, max_score = 0.702733 [ 2: 0.702733 ] TopDocs: total_hits = 0, max_score = 0.000000 [ ] TopDocs: total_hits = 0, max_score = 0.000000 [ ] Snippet: require 'rubygems' require 'ferret' include Ferret i = Ferret::I.new ["While you were out pet care", "Eastside dog walker", "Top dawg dog walker"].each {|text| i << text } puts i.search('pet') puts i.search('dog') puts i.search('dawg') puts i.search('cat') # Let's add some more strings to the index. ["While you were out pet care", "Eastside dog walker", "Top dawg dog walker"].each {|text| i << text } ["While you were out pet care", "Eastside dog walker", "Top dawg dog walker"].each {|text| i << text } puts i.search('cat') --------------------- So things seem to work as expected. You might try two things before further investigation: 1. run the script outside of irb: Does it give you the same (wrong) results? 2. gem uninstall ferret and gem install ferret to make sure you are using solely the latest version of ferret Cheers, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061010/725d92d0/attachment.html From kraemer at webit.de Tue Oct 10 04:34:34 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 10 Oct 2006 10:34:34 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! In-Reply-To: References: <20061009155028.GA6332@cordoba.webit.de> Message-ID: <20061010083434.GG6332@cordoba.webit.de> On Mon, Oct 09, 2006 at 11:31:47PM +0200, wc wrote: [..] 
> Here is the trace > User.multi_search("foo",[Model1,Model2]) > TypeError: nil is not a symbol > from > ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:412:in > `const_get' it seems your indexes don't contain the class_name field for all your indexed objects. > both my models call: > > acts_as_ferret :store_class_name => true you should have all *three* models (User, Model1 and Model2) using the :store_class_name option. Then, after rebuilding your indexes, multi_search should work fine. Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From anrake at gmail.com Tue Oct 10 07:26:45 2006 From: anrake at gmail.com (anrake) Date: Tue, 10 Oct 2006 13:26:45 +0200 Subject: [Ferret-talk] AaF not indexing models when no index present in multi_s In-Reply-To: <20061009163810.GE6332@cordoba.webit.de> References: <98078283c417114f1e2a5563f5552426@ruby-forum.com> <20061009163810.GE6332@cordoba.webit.de> Message-ID: <6f67d974b180c75238fae7a6c3fd9ac9@ruby-forum.com> Thanks for that. It's nice to know it's not because I messed up something. Everything works fine for now. Thanks for the great plugin and for taking the time to answer so many questions on this forum. They've been a big help ironing out other issues too. Jens Kraemer wrote: > On Mon, Oct 09, 2006 at 03:38:52AM +0200, anrake wrote: >> The first time I tried to run a multi_search, I got an error and >> discovered there were no indexes for any of the models. index > model >> folders were there but no actual indexes. I couldn't figure out how to >> get it to create indexes so I edited a record for each model which >> created all the indexes. Then when I ran the search, everything worked >> well. 
>
> calling Model.find_by_contents('some query') on each model would have
> been sufficient, this method does rebuild the index if it doesn't exist
> yet.
>
>> Is there a bug related to this somewhere or anyway to better force index
>> creation when none exists. I'm running the newest version (but not
>> trunk) of each Ferret and acts_as_ferret
>
> This is a known issue (there's a #todo in the code somewhere...). I
> opened up a ticket for this one. Hope to get it fixed soon ;-)
>
>
> cheers,
> Jens
>
>
> --
> webit! Gesellschaft für neue Medien mbH          www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
> Schnorrstraße 76                         Tel +49 351 46766  0
> D-01069 Dresden                          Fax +49 351 46766 66

--
Posted via http://www.ruby-forum.com/.

From f at andreas-s.net Tue Oct 10 09:13:37 2006
From: f at andreas-s.net (Andreas Schwarz)
Date: Tue, 10 Oct 2006 15:13:37 +0200
Subject: [Ferret-talk] Test failure: Rev. 638, Ruby 1.8.5, Debian Linux
Message-ID:

1) Failure:
test_reg_exp_analyzer(RegExpAnalyzerTest)
[./test/unit/../unit/index/../../unit/store/../../unit/analysis/tc_analyzer.rb:439]:
expected but was
.

158 tests, 11954 assertions, 1 failures, 0 errors

--
Posted via http://www.ruby-forum.com/.

From christopher.kilmer at biego.com Tue Oct 10 09:55:43 2006
From: christopher.kilmer at biego.com (Chris Kilmer)
Date: Tue, 10 Oct 2006 15:55:43 +0200
Subject: [Ferret-talk] oddness when adding to index -
In-Reply-To:
References: <5198caf199792dc916294a8414d425e6@ruby-forum.com>
Message-ID: <9936d4ec9c062344ff09803392b8325b@ruby-forum.com>

David Balmain wrote:
> On 10/10/06, David Balmain wrote:
>> > It seems I can add any number of items to an index once without a
>> >
> Yep, that was a bug introduced in version 0.10.10. I'll get a new gem
> up ASAP. You can get the fixed code from subversion right now if you
> are in a hurry.

David,
Thanks for the feedback. Will the gem that includes the fix be 0.10.11?

--
Posted via http://www.ruby-forum.com/.
From ahfeel at rift.fr Tue Oct 10 11:06:37 2006
From: ahfeel at rift.fr (ahFeel)
Date: Tue, 10 Oct 2006 17:06:37 +0200
Subject: [Ferret-talk] Need help for coding an extension to ferret
Message-ID:

Hi,

i'm working on a project using Ferret for indexing its data. I'm very
happy with it but i need to code an extension to implement a .to_json
method to TopDocs class, because ruby's json implementation is really
really slow...

It's my second (the first was the tutorial :/ ) ruby C extension, so i'm
not really at ease with ruby C bindings, even with the C experience...

Here is my problem :

I would like to load each document from ids in my TopDoc object, to make
the json string myself from that, but i don't know how to load my
documents from this class... it's really weirdo to me actually :\ hope
somebody can help !

Here is my code, situated in r_search.c :

static VALUE
frt_td_to_json(VALUE self)
{
  int i;
  VALUE rhits = rb_funcall(self, id_hits, 0);
  VALUE rhit;
  const int len = RARRAY(rhits)->len;
  long pos;

  for (i = 0; i < len; i++)
  {
    rhit = RARRAY(rhits)->ptr[i];
    pos = FIX2INT(rb_funcall(rhit, id_doc, 0));
    //
    // HERE I WOULD LIKE TO LOAD THE DOCUMENTS, ID IS THE GOOD DOC_ID..
    // I WOULD LIKE TO GET THIS IN FACT (ruby):
    // doc_id = INDEX.search('query').hits.first
    // INDEX[doc_id].load <== THAT'S WHAT I WOULD LIKE TO GET !
    //
  }
  rstr = rb_str_new2(str);
  free(str);
  return (argv[0]);
}

of course, i've bound the method to the good object etc...

Hope somebody'll help !
Thank you by advance,
Jeremie 'ahFeel' BORDIER

--
Posted via http://www.ruby-forum.com/.
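
The loop described in the message above (walk the hits, load each
document, emit JSON) can be pinned down in plain Ruby before porting it
to C. This is only a sketch under stated assumptions: `Hit`,
`StubTopDocs` and the `stored_docs` hash are hypothetical stand-ins,
not Ferret classes, and the standard `json` library does the
serialisation here, which is precisely the part a C version would
replace.

```ruby
require 'json'

# Stand-ins for illustration only; these are NOT Ferret classes.
Hit = Struct.new(:doc, :score)
StubTopDocs = Struct.new(:total_hits, :hits)

# stored_docs maps a document id to its stored fields. It stands in for
# whatever loading a document by id from the real index would return.
def top_docs_to_json(top_docs, stored_docs)
  rows = top_docs.hits.map do |hit|
    {
      'id'     => hit.doc,
      'score'  => hit.score,
      'fields' => stored_docs.fetch(hit.doc) # raises if the id is unknown
    }
  end
  JSON.generate({ 'total_hits' => top_docs.total_hits, 'hits' => rows })
end

stored_docs = {
  0 => { 'title' => 'While you were out pet care' },
  2 => { 'title' => 'Top dawg dog walker' }
}
top_docs = StubTopDocs.new(2, [Hit.new(2, 0.7), Hit.new(0, 0.5)])

json = top_docs_to_json(top_docs, stored_docs)
puts json
```

Hits are emitted in the order they arrive, so the ranking from the
search is preserved in the JSON array.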
From ahfeel at rift.fr Tue Oct 10 11:08:56 2006
From: ahfeel at rift.fr (ahFeel)
Date: Tue, 10 Oct 2006 17:08:56 +0200
Subject: [Ferret-talk] Need help for coding an extension to ferret
In-Reply-To:
References:
Message-ID:

Damn, i've forgotten some debug / test code in the paste :P
of course, you don't have to care about these lines :

> rstr = rb_str_new2(str);
> free(str);
> return (argv[0]);
> }

> Thank you by advance,
> Jeremie 'ahFeel' BORDIER

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Tue Oct 10 11:10:07 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 00:10:07 +0900
Subject: [Ferret-talk] Test failure: Rev. 638, Ruby 1.8.5, Debian Linux
In-Reply-To:
References:
Message-ID:

On 10/10/06, Andreas Schwarz wrote:
> 1) Failure:
> test_reg_exp_analyzer(RegExpAnalyzerTest)
> [./test/unit/../unit/index/../../unit/store/../../unit/analysis/tc_analyzer.rb:439]:
> expected but was
> .
>
> 158 tests, 11954 assertions, 1 failures, 0 errors
>

Thanks Andreas, I'll look into this.

From dbalmain.ml at gmail.com Tue Oct 10 11:51:06 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 00:51:06 +0900
Subject: [Ferret-talk] oddness when adding to index -
In-Reply-To: <9936d4ec9c062344ff09803392b8325b@ruby-forum.com>
References: <5198caf199792dc916294a8414d425e6@ruby-forum.com> <9936d4ec9c062344ff09803392b8325b@ruby-forum.com>
Message-ID:

On 10/10/06, Chris Kilmer wrote:
> David Balmain wrote:
> > On 10/10/06, David Balmain wrote:
> >> > It seems I can add any number of items to an index once without a
> >> >
> > Yep, that was a bug introduced in version 0.10.10. I'll get a new gem
> > up ASAP. You can get the fixed code from subversion right now if you
> > are in a hurry.
>
> David,
> Thanks for the feedback. Will the gem that includes the fix be 0.10.11?

Yep. I wanted to get it out today but I still have some testing to do
so it'll have to wait until tomorrow. It'll definitely be out in the
next 24 hours.
Dave

From aditya_nalla at yahoo.co.in Tue Oct 10 11:59:23 2006
From: aditya_nalla at yahoo.co.in (Aditya)
Date: Tue, 10 Oct 2006 17:59:23 +0200
Subject: [Ferret-talk] performance
Message-ID: <6e1d9307645b76971cb5f234451f7e96@ruby-forum.com>

Hi

As you know, ferret can index the results of methods as well as columns.
This feature can be used to do full text search on columns +
tags (implemented using acts_as_taggable).
I have a db where there are three tables, say A, B, C.
The relationships are: A has many B and B has many C.
My tags go against table C. But I have to do full text search on columns
of A + tags on C.
What I can do is have an instance method for A which finds all
grandchildren of A in C and returns their tags as a string which can be
indexed. But won't that affect the performance a lot?

Any suggestions?

--
Posted via http://www.ruby-forum.com/.

From benlee at ece.ucsb.edu Tue Oct 10 16:52:59 2006
From: benlee at ece.ucsb.edu (Ben Lee)
Date: Tue, 10 Oct 2006 22:52:59 +0200
Subject: [Ferret-talk] Indexing problems 10.9/10.10
Message-ID:

I've been having trouble with indexing a large amount of documents
(2.4M). Essentially, I have one process that is following the tutorial
dumping documents to an index stored on the file system. If I open the
index with another process, and run the size() method it is stuck at
90,000 documents. Additionally, if I search, I don't get results past an
even smaller number of docs. I've tried the two latest ferret releases.

Any ideas?

Thanks,
Ben

--
Posted via http://www.ruby-forum.com/.

From john at squirl.info Tue Oct 10 16:58:38 2006
From: john at squirl.info (John)
Date: Tue, 10 Oct 2006 22:58:38 +0200
Subject: [Ferret-talk] Ferret returning too many results
Message-ID: <8ba122d0e7e88687cc9f236ab96f8069@ruby-forum.com>

Hi,

I just upgraded from acts_as_ferret 0.2/Ferret 0.9.x to acts_as_ferret
0.3/ferret 0.10.9, and i'm getting a strange behavior: searches are
returning far too many results, most of them superfluous.
i'll paste in my code below, it's pretty simple. i'm googling like mad,
and going through both the source and the forums, and having trouble
finding what could be causing this. any help greatly appreciated.

relevant code from our rails app:

# search_array holds an array of the models we want to search
search_array.each do |asset|
  a = Object.const_get(asset)
  assets << (a.find_id_by_contents q, :limit => num_docs, :offset => first_doc)
end
assets.flatten!
return assets

--
Posted via http://www.ruby-forum.com/.

From bill at ilovett.com Tue Oct 10 17:37:05 2006
From: bill at ilovett.com (Bill Lovett)
Date: Tue, 10 Oct 2006 23:37:05 +0200
Subject: [Ferret-talk] sorting results with aaf multi_search
Message-ID: <661efed4da4f0f3668c0f765bfa31fb1@ruby-forum.com>

Is it possible to sort the result of acts_as_ferret multi_search the way
you can with find_by_contents? I'm using the latest ferret and aaf.

I have an interface that offers multiple search options which search
different fields of a single model. In addition to these, I also have an
"all" search type which is meant to pull in one additional model and
consider all indexed fields.

find_by_contents is working fine. But when I switch to the "all" search
type that uses multi_search, the sort parameter seems to be ignored.
Should I be doing something differently?

--
Posted via http://www.ruby-forum.com/.

From epetrie at tribune.com Tue Oct 10 19:11:32 2006
From: epetrie at tribune.com (Evan)
Date: Wed, 11 Oct 2006 01:11:32 +0200
Subject: [Ferret-talk] Dynamic fields and inheritance
Message-ID:

I have a model that allows subclasses to dynamically define fields.
The following code is a short test case that illustrates the problem
I'm having:

class Product < ActiveRecord::Base
  acts_as_ferret :fields => [:name]
  serialize :data

  def self.data_properties (*properties)
    properties.each do |property|
      define_method(property) {self.get_property(property)}
      define_method((property.to_s + '=').to_sym) \
        {|value| self.set_property(property, value)}
    end
  end

  def get_property (property)
    self[:data][property] if self[:data]
  end

  def set_property (property, value)
    self[:data] = Hash.new unless self[:data]
    self[:data][property] = value
  end
end

Here's the migration for said model:

class CreateProducts < ActiveRecord::Migration
  def self.up
    create_table :products do |t|
      t.column :name, :string
      t.column :type, :string
      t.column :data, :string
    end
  end

  def self.down
    drop_table :products
  end
end

An example of a couple subclasses are:

class Book < Product
  acts_as_ferret :fields => [:name, :author, :pages]
  data_properties :author, :pages
end

and:

class Music < Product
  acts_as_ferret :fields => [:name, :artist, :label]
  data_properties :artist, :label
end

Creating a couple of records works fine:

>> (Book.new(:name => 'Moby Dick', :author => 'Herman Melville', :pages => 704)).save
=> true
>> (Music.new(:name => 'Abbey Road', :artist => 'The Beatles', :label => 'Apple')).save
=> true

But things start to get strange when I search for content:

>> Book.find_by_contents('*')
=> #
>> Music.find_by_contents('*')
=> #"Abbey Road", "type"=>"Music", "id"=>"2", "data"=>"--- \n:artist: The Beatles\n:label: Apple\n"}>]>

I can provide debug output for the above commands if you'd like.

My ultimate goal here is to be able to find Products by fields only
defined for Products (i.e. name) as well as to be able to find
subclasses of Products by their respective fields (i.e. find Music by
artist).

Any help, insight, or even telling me that I'm smoking crack would be
greatly appreciated.

Regards,
Evan

--
Posted via http://www.ruby-forum.com/.
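
The accessor-generation part of the model above can be exercised on its
own, without Rails or Ferret. What follows is a minimal plain-Ruby
sketch, assuming an in-memory hash in place of the serialized `data`
column; `PropertyBag`, `BookBag`, `MusicBag` and `property_names` are
hypothetical names introduced only for this illustration. It checks the
`define_method` pattern in isolation and does not attempt to reproduce
or explain the search behaviour reported in the message.

```ruby
# Plain-Ruby stand-in for the Product model: no ActiveRecord, no Ferret.
# All class names here are hypothetical and exist only for this sketch.
class PropertyBag
  attr_reader :data

  def initialize(attrs = {})
    @data = {}
    # Route every attribute through its generated writer.
    attrs.each { |key, value| send("#{key}=", value) }
  end

  # Defines a reader and a writer per property, both backed by @data,
  # and records the names on the class that made the call.
  def self.data_properties(*properties)
    properties.each do |property|
      define_method(property) { @data[property] }
      define_method("#{property}=") { |value| @data[property] = value }
    end
    (@property_names ||= []).concat(properties)
  end

  # Class-level instance variable, so each subclass keeps its own list.
  def self.property_names
    @property_names || []
  end
end

class BookBag < PropertyBag
  data_properties :author, :pages
end

class MusicBag < PropertyBag
  data_properties :artist, :label
end

book = BookBag.new(:author => 'Herman Melville', :pages => 704)
puts book.author
puts BookBag.property_names.inspect
puts MusicBag.property_names.inspect
puts book.respond_to?(:artist)
```

Because `define_method` runs with the subclass as receiver, the
generated accessors live only on the subclass that declared them, which
is the property any per-subclass field list has to rely on.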
From anrake at gmail.com Tue Oct 10 20:14:46 2006
From: anrake at gmail.com (anrake)
Date: Wed, 11 Oct 2006 02:14:46 +0200
Subject: [Ferret-talk] Experience with ferret on Dreamhost ?
In-Reply-To:
References:
Message-ID:

Hi, I am experiencing exactly the same phenomenon. Everything works fine
on my powerbook, but not on DH. I changed bash_profile to add my local
.gems directory and installed ferret with no apparent problems. I added
a line to environment.rb as instructed in the wiki but still get the
same problems when I try to deploy my new site (via Capistrano).
Likewise I think dispatch.fcgi is not starting.

Any ideas?

Chris Lowis wrote:
> Does anybody have experience with running ferret on dreamhost ?
>
> My app is running ok until I install the acts_as_ferret plugin, at which
> point I get "Rails application failed to start properly" errors. I've
> used script/console to confirm that I can require 'ferret' and make a
> new Index object . Everything appears to be ok in that respect.
> Unfortunately there is nothing logged in these circumstances, except :
>
> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: comm
> with (dynamic) server
> "/home/c_lowis/residence-review.com/public/dispatch.fcgi" aborted:
> (first read) idle timeout (120 sec)
> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI:
> incomplete headers (0 bytes) received from server
> "/home/c_lowis/residence-review.com/public/dispatch.fcgi"
>
> in the "apache" type logs that dreamhost gives me . Through trial and
> error I am fairly sure it is ferret that is causing this, as when I
> remove the plugin the site works ok.
>
> I am using ferret 0.9.5 . As far as I can see dispatch.fcgi is not
> starting.
>
> Would appreciate any comments,
>
> Chris

--
Posted via http://www.ruby-forum.com/.
From miguel.wong at gmail.com Tue Oct 10 21:14:30 2006
From: miguel.wong at gmail.com (MIguel)
Date: Wed, 11 Oct 2006 03:14:30 +0200
Subject: [Ferret-talk] EOF Error with Unit Tests
In-Reply-To:
References: <3f35515adc44e16be64c2d3b5854ba92@ruby-forum.com> <0aadb024eeeafcfac48b26984fb76590@ruby-forum.com>
Message-ID:

Hello Dave,

Thanks for your reply. I am indeed using acts_as_ferret on ferret 0.9.6
on Windows. When i manually delete the test index, it sometimes works.
But only sometimes. I do not see a pattern. Please help! Thanks in
advance!

Miguel

p.s. I will also try the closing the index trick. If that works, i will
report here for documentation purposes. thanks.

David Balmain wrote:
> On 9/22/06, Miguel wrote:
>> >
>> > When running unit tests one by one (test file by test file), this error
>> > does not pop up. Does anyone know what is happening?
>> >
>> > Thanks!
>> >
> Hi Miguel,
>
> A couple of questions will help us answer this. Are you on Windows? Is
> your application a Rails app? Are you using acts_as_ferret?
>
> The first thing I'd check is that you are closing your Index,
> IndexReader or IndexWriter when you are finished with it (ie in your
> test methods). Not doing this can possibly cause an EOFError. Also,
> on Windows, I had a lot of trouble making sure files get deleted
> correctly, but I may have made a mistake somewhere.
>
> I hope we can help you out,
> Dave

--
Posted via http://www.ruby-forum.com/.

From benlee at ece.ucsb.edu Tue Oct 10 21:35:35 2006
From: benlee at ece.ucsb.edu (Ben Lee)
Date: Tue, 10 Oct 2006 18:35:35 -0700
Subject: [Ferret-talk] Indexing problem 10.9/10.10
In-Reply-To: <3bcced9f0610101704s2a56d31fhd669bf437adc7f49@mail.gmail.com>
References: <3bcced9f0610101704s2a56d31fhd669bf437adc7f49@mail.gmail.com>
Message-ID: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>

Sorry if this is a repost- I wasn't sure if the www.ruby-forum.com
list works for postings.
I've been having trouble with indexing a large amount of documents(2.4M).

Essentially, I have one process that is following the tutorial
dumping documents to an index stored on the file system. If I open the
index with another process, and run the size() method it is stuck at
a number of documents much smaller than the number I've added to the index.

Eg. 290k -- when the indexer process has already gone through 1 M.

Additionally, if I search, I don't get results past an
even smaller number of docs (22k) . I've tried the two latest ferret releases.

Does this listing of the index directory look right?

-rw------- 1 blee blee 3.8M Oct 10 17:06 _v.fdt
-rw------- 1 blee blee  51K Oct 10 17:06 _v.fdx
-rw------- 1 blee blee  12M Oct 10 16:49 _u.cfs
-rw------- 1 blee blee   97 Oct 10 16:49 fields
-rw------- 1 blee blee   78 Oct 10 16:49 segments
-rw------- 1 blee blee  11M Oct 10 16:23 _t.cfs
-rw------- 1 blee blee  11M Oct 10 15:56 _s.cfs
-rw------- 1 blee blee  15M Oct 10 15:11 _r.cfs
-rw------- 1 blee blee  13M Oct 10 14:48 _q.cfs
-rw------- 1 blee blee  14M Oct 10 14:37 _p.cfs
-rw------- 1 blee blee  13M Oct 10 14:28 _o.cfs
-rw------- 1 blee blee  12M Oct 10 14:19 _n.cfs
-rw------- 1 blee blee  12M Oct 10 14:16 _m.cfs
-rw------- 1 blee blee 118M Oct 10 14:10 _l.cfs
-rw------- 1 blee blee 129M Oct 10 13:24 _a.cfs
-rw------- 1 blee blee    0 Oct 10 13:00 ferret-write.lck

Thanks,
Ben

From dbalmain.ml at gmail.com Tue Oct 10 22:00:31 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 11:00:31 +0900
Subject: [Ferret-talk] Indexing problems 10.9/10.10
In-Reply-To:
References:
Message-ID:

On 10/11/06, Ben Lee wrote:
> I've been having trouble with indexing a large amount of documents
> (2.4M). Essentially, I have one process that is following the tutorial
> dumping documents to an index stored on the file system. If I open the
> index with another process, and run the size() method it is stuck at
> 90,000 documents.
> Additionally, if I search, I don't get results past an
> even smaller number of docs. I've tried the two latest ferret releases.
>
> Any ideas?
>
> Thanks,
> Ben

You need to compile Ferret with large-file support. I'll put a section
in the FAQ later.

From dbalmain.ml at gmail.com Tue Oct 10 22:17:26 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 11:17:26 +0900
Subject: [Ferret-talk] Need help for coding an extension to ferret
In-Reply-To:
References:
Message-ID:

On 10/11/06, ahFeel wrote:
> Hi,
>
> i'm working on a project using Ferret for indexing it's datas. I'm very
> happy with it but i need to code an extension to implement a .to_json
> method to TopDocs class, because ruby's json implementation is really
> really slow...
>
> It's my second (the first was the tutorial :/ ) ruby C extension, so i'm
> not really at ease with ruby C bindings, even with the C experience...
>
> Here is my problem :
>
> I would like to load each document from ids in my TopDoc object, to make
> the json string myself from that, but i don't know how to load my
> documents from this class... it's really weirdo to me actually :\ hope
> somebody can help !
>
> Here is my code, situated in r_search.c :
>
> static VALUE
> frt_td_to_json(VALUE self)
> {
>   int i;
>   VALUE rhits = rb_funcall(self, id_hits, 0);
>   VALUE rhit;
>   const int len = RARRAY(rhits)->len;
>   long pos;
>
>   for (i = 0; i < len; i++)
>   {
>     rhit = RARRAY(rhits)->ptr[i];
>     pos = FIX2INT(rb_funcall(rhit, id_doc, 0));
>     //
>     // HERE I WOUlD LIKE TO LOAD THE DOCUMENTS, ID IS THE GOOD DOC_ID..
>     // I WOULD LIKE TO GET THIS IN FACT (ruby):
>     // doc_id = INDEX.search('query').hits.first
>     // INDEX[doc_id].load <== THAT'S WHAT I WOULD LIKE TO GET !
>     //
>   }
>   rstr = rb_str_new2(str);
>   free(str);
>   return (argv[0]);
> }
>
> of course, i've bound the method to the good object etc...
>
> Hope somebody'll help !
> Thank you by advance,
> Jeremie 'ahFeel' BORDIER
>

The frt_td_to_s method is doing almost exactly this. Make sure you
have version 0.10.9 or later. Each TopDocs object has a reference
to the searcher that created it. You can get a LazyDoc object from the
searcher with the following call:

    LazyDoc *lazy_doc = sea->get_lazy_doc(searcher, id);

By the way, id should be an int, not a long. To turn this into the
lazy loading Hash object that you see in Ruby you use this method:

    VALUE frt_get_lazy_doc(LazyDoc *lazy_doc)

The code for cLazyDoc is in r_index.c.

Can I ask why you need to do this in the C code? It would seem to me
to make a lot more sense to code this in Ruby but you probably have a
good reason.

Cheers,
Dave

From dbalmain.ml at gmail.com Tue Oct 10 22:21:06 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 11:21:06 +0900
Subject: [Ferret-talk] EOF Error with Unit Tests
In-Reply-To:
References: <3f35515adc44e16be64c2d3b5854ba92@ruby-forum.com> <0aadb024eeeafcfac48b26984fb76590@ruby-forum.com>
Message-ID:

On 10/11/06, MIguel wrote:
> Hello Dave,
>
> Thanks for your reply. I am indeed using acts_as_ferret on ferret 0.9.6
> on Windows. When i manually delete the test index, it sometimes work.
> But only sometimes. I do not see a pattern. Please help! Thanks in
> advance!
>
> Miguel
>
> p.s. I will also try the closing the index trick. If that work, i will
> report here for documentation purposes. thanks.

I'd say your error is almost certainly caused by the fact that you
have left an IndexWriter or Index open somewhere. By the way, why are
you still using 0.9.6? It is definitely worth upgrading as you'd be
using the much slower pure ruby version on Windows.
Cheers,
Dave

From peter at ioffer.com Wed Oct 11 00:44:56 2006
From: peter at ioffer.com (peter)
Date: Tue, 10 Oct 2006 21:44:56 -0700
Subject: [Ferret-talk] Indexing problem 10.9/10.10
In-Reply-To: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>
Message-ID:

We've had somewhat of a similar situation ourselves, where we are indexing
about a million records to an index, and each record can be somewhat large.

Now..what happened on our side was that the index files (very similar in
structure to what you have below) came up to a 2 gig limit and stopped
there..and the indexer started crashing each time it hit this limit.

On your side, I don't see your index file sizes really that large. I think
the compiling with large file support only really kicks in when you hit this
2 gig size limit.

Couple of thoughts that might help:

1. On our side, to keep size down, I would optimize the index at every
   100,000 documents. The optimize call also flushes the index.

2. Make sure you close the index once you index your data. Small
   thing..but just making sure.

3. With the index being this large, we actually have two copies, one for
   searching against an already optimized index, and the other copy doing the
   indexing. This way, no items are being searched on while the indexing is
   taking place.

4. One neat thing that I learned with indexing large items, was that I
   don't have to actually store everything. I can have a field set to
   tokenize, but not store, so that it can be searched..but I don't need it to
   be displayed in the search results per se..I don't actually store it, so I
   was able to keep my index size down.

> From: "Ben Lee"
> Reply-To: ferret-talk at rubyforge.org
> Date: Tue, 10 Oct 2006 18:35:35 -0700
> To: ferret-talk at rubyforge.org
> Subject: [Ferret-talk] Indexing problem 10.9/10.10
>
> Sorry if this is a repost- I wasn't sure if the www.ruby-forum.com
> list works for postings.
> I've been having trouble with indexing a large amount of documents(2.4M).
>
>
> Essentially, I have one process that is following the tutorial
> dumping documents to an index stored on the file system. If I open the
> index with another process, and run the size() method it is stuck at
> a number of documents much smaller than the number I've added to the index.
>
> Eg. 290k -- when the indexer process has already gone through 1 M.
>
> Additionally, if I search, I don't get results past an
> even smaller number of docs (22k) . I've tried the two latest ferret releases.
>
>
> Does this listing of the index directory look right?
>
> -rw------- 1 blee blee 3.8M Oct 10 17:06 _v.fdt
> -rw------- 1 blee blee 51K Oct 10 17:06 _v.fdx
> -rw------- 1 blee blee 12M Oct 10 16:49 _u.cfs
> -rw------- 1 blee blee 97 Oct 10 16:49 fields
>
> -rw------- 1 blee blee 78 Oct 10 16:49 segments
> -rw------- 1 blee blee 11M Oct 10 16:23 _t.cfs
> -rw------- 1 blee blee 11M Oct 10 15:56 _s.cfs
> -rw------- 1 blee blee 15M Oct 10 15:11 _r.cfs
> -rw------- 1 blee blee 13M Oct 10 14:48 _q.cfs
>
> -rw------- 1 blee blee 14M Oct 10 14:37 _p.cfs
> -rw------- 1 blee blee 13M Oct 10 14:28 _o.cfs
> -rw------- 1 blee blee 12M Oct 10 14:19 _n.cfs
> -rw------- 1 blee blee 12M Oct 10 14:16 _m.cfs
> -rw------- 1 blee blee 118M Oct 10 14:10 _l.cfs
>
> -rw------- 1 blee blee 129M Oct 10 13:24 _a.cfs
> -rw------- 1 blee blee 0 Oct 10 13:00 ferret-write.lck
>
> Thanks,
> Ben
>

From dbalmain.ml at gmail.com Wed Oct 11 01:24:51 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 14:24:51 +0900
Subject: [Ferret-talk] Indexing problem 10.9/10.10
In-Reply-To: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>
References: <3bcced9f0610101704s2a56d31fhd669bf437adc7f49@mail.gmail.com> <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>
Message-ID:

On 10/11/06, Ben Lee wrote:
> Sorry if
> this is a repost- I wasn't sure if the www.ruby-forum.com
> list works for postings.
> I've been having trouble with indexing a large amount of documents(2.4M).
>
>
> Essentially, I have one process that is following the tutorial
> dumping documents to an index stored on the file system. If I open the
> index with another process, and run the size() method it is stuck at
> a number of documents much smaller than the number I've added to the index.
>
> Eg. 290k -- when the indexer process has already gone through 1 M.
>
> Additionally, if I search, I don't get results past an
> even smaller number of docs (22k) . I've tried the two latest ferret releases.
>
>
> Does this listing of the index directory look right?
>
> -rw------- 1 blee blee 3.8M Oct 10 17:06 _v.fdt
> -rw------- 1 blee blee 51K Oct 10 17:06 _v.fdx
> -rw------- 1 blee blee 12M Oct 10 16:49 _u.cfs
> -rw------- 1 blee blee 97 Oct 10 16:49 fields
>
> -rw------- 1 blee blee 78 Oct 10 16:49 segments
> -rw------- 1 blee blee 11M Oct 10 16:23 _t.cfs
> -rw------- 1 blee blee 11M Oct 10 15:56 _s.cfs
> -rw------- 1 blee blee 15M Oct 10 15:11 _r.cfs
> -rw------- 1 blee blee 13M Oct 10 14:48 _q.cfs
>
> -rw------- 1 blee blee 14M Oct 10 14:37 _p.cfs
> -rw------- 1 blee blee 13M Oct 10 14:28 _o.cfs
> -rw------- 1 blee blee 12M Oct 10 14:19 _n.cfs
> -rw------- 1 blee blee 12M Oct 10 14:16 _m.cfs
> -rw------- 1 blee blee 118M Oct 10 14:10 _l.cfs
>
> -rw------- 1 blee blee 129M Oct 10 13:24 _a.cfs
> -rw------- 1 blee blee 0 Oct 10 13:00 ferret-write.lck
>
> Thanks,
> Ben

I thought this was possibly due to the fact that you didn't have
Ferret compiled with large-file support but by the looks of it you
aren't getting near that limit yet. In the directory listing you have
here there is no way you could have added more than 290K documents
unless you set :max_buffered_docs to a different value (> 10,000).
Perhaps the index is getting over-written at some stage. Could you
show us the code you are using for indexing?
As for search results only showing for the top 22k documents, I'm not
sure what the problem might be. You need to make sure you open the
index reader or searcher after committing the index writer, otherwise
the latest results won't show up. I don't think this is your problem
though as I'm sure you would have opened the index-reader much later
than after indexing 22k documents.

Cheers,
Dave

From dbalmain.ml at gmail.com Wed Oct 11 02:16:58 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 15:16:58 +0900
Subject: [Ferret-talk] Indexing problem 10.9/10.10
In-Reply-To:
References: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>
Message-ID:

On 10/11/06, peter wrote:
> We've had somewhat of a similar situation ourselves, where we are indexing
> about a million records to an index, and each record can be somewhat large.
>
> Now..what happened on our side was that the index files (very similar in
> structure to what you have below) came up to a 2 gig limit and stopped
> there..and the indexer started crashing each time it hit this limit.
>
> On your side, I don't see your index file sizes really that large. I think
> the compiling with large file support only really kicks in when you hit this
> 2 gig size limit.

Hi Peter,
Did you manage to compile Ferret successfully with large-file support
yourself?

> Couple of thoughts that might help:
> 1. On our side, to keep size down, I would optimize the index at every
> 100,000 documents. The optimize call also flushes the index.

You can also just call Index#flush to flush the index without having
to optimize. Or IndexWriter#commit. Actually they should both be
commit so I'm going to alias commit to flush in the Index class in the
next version.

> 2. Make sure you close the index once you index your data. Small
> thing..but just making sure.
>
> 3.
> With the index being this large, we actually have two copies, one for
> searching against an already optimized index, and the other copy doing the
> indexing. This way, no items are being searched on while the indexing is
> taking place.

This shouldn't be necessary. Whatever version of the index you open
the IndexReader on will be the version of the index that you are
searching. Even when its files are deleted it will hold on to the file
handles, so the data should still be available. The operating system
won't be able to use that disk space until you close the IndexReader
(or Searcher).

> 4. One neat thing that I learned with indexing large items, was that I
> don't have to actually store everything. I can have a field set to
> tokenize, but not store, so that it can be searched..but I don't need it to
> be displayed in the search results per say..I don't actually store it, so I
> was able to keep my index size down.

Very good tip. You should also set :term_vector to :no unless you are
using term-vectors.

Cheers,
Dave

From benlee at ece.ucsb.edu Wed Oct 11 02:51:49 2006
From: benlee at ece.ucsb.edu (Ben Lee)
Date: Tue, 10 Oct 2006 23:51:49 -0700
Subject: [Ferret-talk] Indexing problem 10.9/10.10
In-Reply-To:
References: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com>
Message-ID: <3bcced9f0610102351u5267c05ateaaa9fe1e5c3aa20@mail.gmail.com>

Thanks for the tips, things seem happier now. Yeah, the size of each
document (number of tokens) is actually quite small in my case - I
think this is just a case of me messing up the flush/optimize/close
tactics.

On 10/10/06, peter wrote:
> We've had somewhat of a similar situation ourselves, where we are indexing
> about a million records to an index, and each record can be somewhat large.
>
> Now..what happened on our side was that the index files (very similar in
> structure to what you have below) came up to a 2 gig limit and stopped
> there..and the indexer started crashing each time it hit this limit.
>
> On your side, I don't see your index file sizes really that large. I think
> the compiling with large file support only really kicks in when you hit this
> 2 gig size limit.
>
> Couple of thoughts that might help:
> 1. On our side, to keep size down, I would optimize the index at every
> 100,000 documents. The optimize call also flushes the index.
>
> 2. Make sure you close the index once you index your data. Small
> thing..but just making sure.
>
> 3. With the index being this large, we actually have two copies, one for
> searching against an already optimized index, and the other copy doing the
> indexing. This way, no items are being searched on while the indexing is
> taking place.
>
> 4. One neat thing that I learned with indexing large items, was that I
> don't have to actually store everything. I can have a field set to
> tokenize, but not store, so that it can be searched..but I don't need it to
> be displayed in the search results per say..I don't actually store it, so I
> was able to keep my index size down.
>
>
>
> > From: "Ben Lee"
> > Reply-To: ferret-talk at rubyforge.org
> > Date: Tue, 10 Oct 2006 18:35:35 -0700
> > To: ferret-talk at rubyforge.org
> > Subject: [Ferret-talk] Indexing problem 10.9/10.10
> >
> > Sorry if this is a repost- I wasn't sure if the www.ruby-forum.com
> > list works for postings.
> > I've been having trouble with indexing a large amount of documents(2.4M).
> >
> >
> > Essentially, I have one process that is following the tutorial
> > dumping documents to an index stored on the file system. If I open the
> > index with another process, and run the size() method it is stuck at
> > a number of documents much smaller than the number I've added to the index.
> >
> > Eg. 290k -- when the indexer process has already gone through 1 M.
> >
> > Additionally, if I search, I don't get results past an
> > even smaller number of docs (22k) . I've tried the two latest ferret releases.
> > > > > > Does this listing of the index directory look right? > > > > -rw------- 1 blee blee 3.8M Oct 10 17:06 _v.fdt > > -rw------- 1 blee blee 51K Oct 10 17:06 _v.fdx > > -rw------- 1 blee blee 12M Oct 10 16:49 _u.cfs > > -rw------- 1 blee blee 97 Oct 10 16:49 fields > > > > -rw------- 1 blee blee 78 Oct 10 16:49 segments > > -rw------- 1 blee blee 11M Oct 10 16:23 _t.cfs > > -rw------- 1 blee blee 11M Oct 10 15:56 _s.cfs > > -rw------- 1 blee blee 15M Oct 10 15:11 _r.cfs > > -rw------- 1 blee blee 13M Oct 10 14:48 _q.cfs > > > > -rw------- 1 blee blee 14M Oct 10 14:37 _p.cfs > > -rw------- 1 blee blee 13M Oct 10 14:28 _o.cfs > > -rw------- 1 blee blee 12M Oct 10 14:19 _n.cfs > > -rw------- 1 blee blee 12M Oct 10 14:16 _m.cfs > > -rw------- 1 blee blee 118M Oct 10 14:10 _l.cfs > > > > -rw------- 1 blee blee 129M Oct 10 13:24 _a.cfs > > -rw------- 1 blee blee 0 Oct 10 13:00 ferret-write.lck > > > > Thanks, > > Ben > > _______________________________________________ > > Ferret-talk mailing list > > Ferret-talk at rubyforge.org > > http://rubyforge.org/mailman/listinfo/ferret-talk > > > > From miguel.wong at gmail.com Wed Oct 11 03:38:00 2006 From: miguel.wong at gmail.com (MIguel) Date: Wed, 11 Oct 2006 09:38:00 +0200 Subject: [Ferret-talk] EOF Error with Unit Tests In-Reply-To: References: <3f35515adc44e16be64c2d3b5854ba92@ruby-forum.com> <0aadb024eeeafcfac48b26984fb76590@ruby-forum.com> Message-ID: <7070dd6b87880936af8ef407ad391b29@ruby-forum.com> Hi David, I tried calling index.close at the teardown method. But since i am using acts_as_ferret, it looks like AAF closes the index after it's done updating it, so i am getting errors of "closing an already closed index" - am i doing anything wrong? Or should i just switch over the linux :-) I tried upgrading to 0.10.x, the gems installed OK, but there is this name error for the constant FIELD when i load up script/console. Thanks again for your answers!! 
Miguel David Balmain wrote: > On 10/11/06, MIguel wrote: >> report here for documentation purposes. thanks. > I'd say your error is almost certainly caused by the fact that you > have left an IndexWriter or Index open somewhere. By the way, why are > you still using 0.9.6? It is definitely worth upgrading as you'd be > using the much slower pure ruby version on Windows. > > Cheers, > Dave -- Posted via http://www.ruby-forum.com/. From ahfeel at rift.fr Wed Oct 11 03:47:36 2006 From: ahfeel at rift.fr (ahFeel) Date: Wed, 11 Oct 2006 09:47:36 +0200 Subject: [Ferret-talk] Need help for coding an extension to ferret In-Reply-To: References: Message-ID: <6695be8102f11d66dc4b0f4919ff102d@ruby-forum.com> David Balmain wrote: > On 10/11/06, ahFeel wrote: >> Here is my problem : >> { >> // >> >> of course, i've bound the method to the good object etc... >> >> Hope somebody'll help ! >> Thank you by advance, >> Jeremie 'ahFeel' BORDIER >> > > The frt_td_to_s method is doing almost exactly this. Make sure you > have have version 0.10.9 or later. Each TopDocs object has a reference > to the searcher that created it. You can get a LazyDoc object for the > searcher with the following call: > > LazyDoc *lazy_doc = sea->get_lazy_doc(searcher, id); > > By the way, id should be an int, not a long. To turn this into the > lazy loading Hash object that you see in Ruby you use this method: > > VALUE frt_get_lazy_doc(LazyDoc *lazy_doc) > > The code for cLazyDoc is in r_index.c. > > Can I ask why you need to do this in the C code? It would seem to me > to make a lot more sense to code this in Ruby but you probably have a > good reason. > > Cheers, > Dave Hi dave ! First, thank you a lot for this anwser ! I've been looking and trying a lot of things such as sea = (Searcher *)DATA_PTR(self) to use the sea->get_lazy_doc function, but it returns a null pointer... 
maybe i did something wrong, i'll retry, and be sure to have the last ferret release :) Concerning the reasons i'm doing this directly in C, it's just because we need a really fast implementation.. I'm working with Florent Solt who you already knows, works on a big project using Ruby / Ferret. These datas are supposed to transit over a RPC protocol, and there will be a lot of queries, so we really need a fast to_json method :) Thank you again, Jeremie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. From ahfeel at rift.fr Wed Oct 11 03:55:50 2006 From: ahfeel at rift.fr (ahFeel) Date: Wed, 11 Oct 2006 09:55:50 +0200 Subject: [Ferret-talk] Need help for coding an extension to ferret In-Reply-To: <6695be8102f11d66dc4b0f4919ff102d@ruby-forum.com> References: <6695be8102f11d66dc4b0f4919ff102d@ruby-forum.com> Message-ID: Ok, i've updated the gem, and the last to_s method is helping a LOT ! thank you again Dave, ahf :) -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 11 04:10:25 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 11 Oct 2006 17:10:25 +0900 Subject: [Ferret-talk] Indexing problem 10.9/10.10 In-Reply-To: <3bcced9f0610102351u5267c05ateaaa9fe1e5c3aa20@mail.gmail.com> References: <3bcced9f0610101835i3a536fc3g48339ef6b771307c@mail.gmail.com> <3bcced9f0610102351u5267c05ateaaa9fe1e5c3aa20@mail.gmail.com> Message-ID: On 10/11/06, Ben Lee wrote: > Thanks for the tips, things seem happier now. Yeah, the size of each > document (number of tokens) is actually quite small in my case - I > think this is just case of me messing up the flush/optimize/close > tactics. > That's great to hear Ben. 
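A minimal sketch of the flush/optimize/close routine discussed in this thread. It assumes only an index object that responds to <<, flush, optimize and close, which is how Ferret's Index is described above. RecordingIndex is a hypothetical stand-in so the sketch runs without Ferret installed, and the batch size is arbitrary.

```ruby
# Add documents to an index-like object, flushing every batch_size
# documents, optimizing once at the end, and always closing the index,
# even if indexing raises part-way through.
def index_in_batches(index, docs, batch_size = 100_000)
  count = 0
  docs.each do |doc|
    index << doc
    count += 1
    index.flush if (count % batch_size).zero?
  end
  index.optimize
  count
ensure
  index.close
end

# Hypothetical stand-in for a real index; it only records the calls made.
class RecordingIndex
  attr_reader :calls

  def initialize
    @calls = []
  end

  def <<(doc)
    @calls << :add
    self
  end

  def flush
    @calls << :flush
  end

  def optimize
    @calls << :optimize
  end

  def close
    @calls << :close
  end
end

index = RecordingIndex.new
indexed = index_in_batches(index, %w[a b c d e], 2)
puts "indexed #{indexed} docs, calls: #{index.calls.inspect}"
```

With a real index the same method could be handed the object returned by Ferret::Index::Index.new; whether to optimize at every batch (as Peter does) or only flush (as Dave suggests) is a tuning choice.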
From kraemer at webit.de Wed Oct 11 04:38:44 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 11 Oct 2006 10:38:44 +0200
Subject: [Ferret-talk] sorting results with aaf multi_search
In-Reply-To: <661efed4da4f0f3668c0f765bfa31fb1@ruby-forum.com>
References: <661efed4da4f0f3668c0f765bfa31fb1@ruby-forum.com>
Message-ID: <20061011083844.GA9323@cordoba.webit.de>

On Tue, Oct 10, 2006 at 11:37:05PM +0200, Bill Lovett wrote:
> Is it possible to sort the result of acts_as_ferret multi_search the
> way you can with find_contents?
>
> I'm using the latest ferret and aaf. I have an interface that offers
> multiple search options which search different fields of a single model.
> In addition to these, I also have an "all" search type which is meant to
> pull in one additional model and consider all indexed fields.
>
> find_by_contents is working fine. But when I switch to the "all" search
> type that uses multi_search, the sort parameter seems to be ignored.
> Should I be doing something differently?

please try aaf from svn trunk, there is a known bug in this area that is
fixed there.

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From kraemer at webit.de Wed Oct 11 04:45:12 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 11 Oct 2006 10:45:12 +0200
Subject: [Ferret-talk] performance
In-Reply-To: <6e1d9307645b76971cb5f234451f7e96@ruby-forum.com>
References: <6e1d9307645b76971cb5f234451f7e96@ruby-forum.com>
Message-ID: <20061011084511.GB9323@cordoba.webit.de>

On Tue, Oct 10, 2006 at 05:59:23PM +0200, Aditya wrote:
> Hi
>
> AS you know ferret does indexing on the methods also. This feature can
> be used to
> do full text search on columns + tags(implemented using
> acts_as_taggable).
>
> I have a db where there are three tables say A,B,C
>
> The relationships are like - A has many B and B has many C. My tags go
> against table C. But I have to do full text search on columns of A +
> tags on C.
>
> What I can do is have an instance method for A which finds all grand
> child of A in C and returns their tags as a string which can be indexed.
> But wont that effect the performance a lot?

Well, you can't tell without trying this out ;-)
If it turns out to be a bottle neck, I'd rethink about optimizing this.

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From john at squirl.info Wed Oct 11 05:53:58 2006
From: john at squirl.info (John)
Date: Wed, 11 Oct 2006 11:53:58 +0200
Subject: [Ferret-talk] Ferret returning too many results
In-Reply-To: <8ba122d0e7e88687cc9f236ab96f8069@ruby-forum.com>
References: <8ba122d0e7e88687cc9f236ab96f8069@ruby-forum.com>
Message-ID: <8f22be488be10c19b4f7a52d49649e48@ruby-forum.com>

answering myself here, but i "fixed" the problem by reverting to 0.10.1.
would still love to know what was causing the problem with 0.10.9.

i get weird errors when trying to use the AAf 'id_multi_search' or
'multi_search' methods, too, with either 0.10.1 or 0.10.9, fwiw.

am including acts_as_ferret like so:

  acts_as_ferret :fields => [ :name, :description, :condition ],
                 :store_class_name => true

ferret and aaf are the bomb, though, generally. thanks for them both.

--
Posted via http://www.ruby-forum.com/.

From jduflost at ben.vub.ac.be Wed Oct 11 06:03:33 2006
From: jduflost at ben.vub.ac.be (johan duflost)
Date: Wed, 11 Oct 2006 12:03:33 +0200
Subject: [Ferret-talk] search results autocompletion - Checked by AntiVir DE
References: <002d01c6e843$4e868750$0700000a@ORION><20061005083059.GB5467@cordoba.webit.de><004a01c6e886$16dc35f0$0700000a@ORION>
	<20061006082338.GH5467@cordoba.webit.de>
Message-ID: <000901c6ed1c$82d5e180$0700000a@ORION>

You're right. In fact, I remove the terms's accents before indexing them.
Without this piece of code, it takes 'only' 6 minutes.

----- Original Message -----
From: "Jens Kraemer"
To:
Sent: Friday, October 06, 2006 10:23 AM
Subject: Re: [Ferret-talk] search results autocompletion - Checked by AntiVir DE

> On Thu, Oct 05, 2006 at 03:56:43PM +0200, johan duflost wrote:
> [..]
>>
>> The indices from which I create the suggestions index are not very big:
>> 80kb, 300kb and 2 Mb.
>>
>> After 20 minutes, I get a suggestions index of 1400 kb approximately.
>
> still looks somewhat slow to me...
>
> Jens
>
>
> --
> webit! Gesellschaft für neue Medien mbH          www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
> Schnorrstraße 76                         Tel +49 351 46766  0
> D-01069 Dresden                          Fax +49 351 46766 66
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

From kraemer at webit.de Wed Oct 11 07:37:38 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 11 Oct 2006 13:37:38 +0200
Subject: [Ferret-talk] Dynamic fields and inheritance
In-Reply-To:
References:
Message-ID: <20061011113738.GC9323@cordoba.webit.de>

On Wed, Oct 11, 2006 at 01:11:32AM +0200, Evan wrote:
> I have a model that allows subclasses to dynamically define fields.
>
> The following code is a short test case that illustrates the problem I'm
> having:
>
> class Product < ActiveRecord::Base
>   acts_as_ferret :fields => [:name]

Don't call acts_as_ferret in your base class, instead add the :name
field to the acts_as_ferret calls in Music and Book. That should fix
your problems.

acts_as_ferret isn't supposed to be called twice in a model (which is
the case if you call it in your superclass, and in classes inheriting
from that).

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From kraemer at webit.de Wed Oct 11 07:41:08 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 11 Oct 2006 13:41:08 +0200
Subject: [Ferret-talk] Ferret returning too many results
In-Reply-To: <8ba122d0e7e88687cc9f236ab96f8069@ruby-forum.com>
References: <8ba122d0e7e88687cc9f236ab96f8069@ruby-forum.com>
Message-ID: <20061011114108.GD9323@cordoba.webit.de>

On Tue, Oct 10, 2006 at 10:58:38PM +0200, John wrote:
> Hi, I just upgraded from acts_as_ferret 0.2/Ferret 0.9.x to
> acts_as_ferret 0.3/ferret .10.9, and i'm getting a strange behavior:
>
> searches are returning far too many results, most of them superflous.
> i'll paste in my code below, it's pretty simple. i'm googling like mad,
> and going through both the source and the forums, and having trouble
> finding what could be causing this. any help greatly apreciated.
>
> relevant code from our rails app:
>
> # search_array hold an array of the model we want to search
> search_array.each do |asset|
>   a = Object.const_get(asset)
>   assets << (a.find_id_by_contents q, :limit => num_docs, :offset =>
> first_doc)
> end
> assets.flatten!
> return assets

hm, we'll need some more information to help you out. What do your
models look like, i.e. do you use single table inheritance or other
'special' things ?

providing some real data, a query, and the results your search returned
would really help, too.

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From dbalmain.ml at gmail.com Wed Oct 11 08:36:11 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 11 Oct 2006 21:36:11 +0900
Subject: [Ferret-talk] EOF Error with Unit Tests
In-Reply-To: <7070dd6b87880936af8ef407ad391b29@ruby-forum.com>
References: <3f35515adc44e16be64c2d3b5854ba92@ruby-forum.com>
	<0aadb024eeeafcfac48b26984fb76590@ruby-forum.com>
	<7070dd6b87880936af8ef407ad391b29@ruby-forum.com>
Message-ID:

On 10/11/06, MIguel wrote:
> Hi David,
>
> I tried calling index.close at the teardown method. But since i am
> using acts_as_ferret, it looks like AAF closes the index after it's done
> updating it, so i am getting errors of "closing an already closed index"
> - am i doing anything wrong? Or should i just switch over the linux :-)
>
> I tried upgrading to 0.10.x, the gems installed OK, but there is this
> name error for the constant FIELD when i load up script/console.
>
> Thanks again for your answers!!
>
> Miguel
>

The API changed between 0.9 and 0.10 so you'll have to make some
changes. Check the tutorial at http://ferret.davebalmain.com/api, in
particular the section on adding Documents. Hopefully that will help
you fix the problem.

Cheers,
Dave

From ahfeel at rift.fr Wed Oct 11 08:56:20 2006
From: ahfeel at rift.fr (ahFeel)
Date: Wed, 11 Oct 2006 14:56:20 +0200
Subject: [Ferret-talk] Memory allocation bug with index.search
Message-ID:

Hi Dave ( again ! )

I've been searching for a while into my extension code to understand
what was the matter, and Florent Solt who hasn't got my extension has
the same problem, so it figured to be a ferret problem..
Unfortunately, we're unable at the moment to reproduce it in a little
code so you could debug easilier... I'm working on it.

Here's the issue:

In the index, there are a lot of stuff with a type field, and we're
experiencing the same bug with just type => hardware or mixed datas.
We've got around 12400 docs with this type, here are the queries (THEY
ARE ALL LAUNCHED IN A _NEW_ INSTANCE OF FERRET ! (with the same index of
course)):

this one works fine :

>> INDEX.search('type:hardware').to_s
=> "TopDocs: total_hits = 12490, max_score = 1.751220 [\n\t13997 \"61426\": 1.751220\n\t13998 \"61427\": 1.751220\n\t13999 \"61428\": 1.751220\n\t14000 \"61429\": 1.751220\n\t14001 \"61430\": 1.751220\n\t14002 \"61431\": 1.751220\n\t14003 \"61432\": 1.751220\n\t14004 \"61433\": 1.751220\n\t14005 \"61434\": 1.751220\n\t14006 \"61435\": 1.751220\n]\n"

and this one doesn't.. :

>> INDEX.search('type:hardware', :limit => :all).to_s
/usr/local/lib/site_ruby/1.8/ferret/index.rb:718: [BUG] rb_gc_mark(): unknown data type 0x18(0x89b4268) non object
ruby 1.8.4 (2005-12-24) [i486-linux]

Aborted

i tried to get to the limit point and got it :

works :
>> INDEX.search('type:hardware', :limit => 9719).to_s

doesn't work :
>> INDEX.search('type:hardware', :limit => 9720).to_s
/usr/local/lib/site_ruby/1.8/ferret/index.rb:718: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i486-linux]

BUT ! if, IN THE SAME INSTANCE OF RUBY, i do the first and then the
second one, it works !

>> INDEX.search('type:hardware', :limit => 9719).to_s
...works
>> INDEX.search('type:hardware', :limit => 9720).to_s
...works
>> INDEX.search('type:hardware', :limit => :all).to_s
WORKS.

So i guess there's a problem in result memory allocation, i don't know
what i could add to help you, i'm trying to reproduce it in a little
code...

Ah, just another detail, this is how our index is created :

INDEX_OPTIONS = {
  :path => path,
  :auto_flush => false,
  :max_buffer_memory => 0x4000000,
  :max_buffered_docs => 100000,
  :use_compound_file => false}
INDEX = Ferret::Index::Index.new(INDEX_OPTIONS)

but i tried without any max_* and it's the same...
Tell me what i can do to help if i don't manage to reproduce it in a pastable code. Thanks by advance, Regards, Jeremie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 11 09:41:03 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 11 Oct 2006 22:41:03 +0900 Subject: [Ferret-talk] [ANN] Ferret 0.10.11 Bug fix release Message-ID: Hey folks, I've just release version 0.10.11. It's a major bug fix release for version 0.10.10. If you are still at version 0.10.9 however you may as well wait for version 0.10.12. The last two releases are just performance enhancement releases. Cheers, Dave From peter at ioffer.com Wed Oct 11 12:18:40 2006 From: peter at ioffer.com (peter) Date: Wed, 11 Oct 2006 09:18:40 -0700 Subject: [Ferret-talk] Indexing problem 10.9/10.10 In-Reply-To: Message-ID: Hey Dave! Yes..we actually compiled with large-file support, and things seem to be working just fine. And in the end, once I figured out that I can tokenize a large bit of text, and not have to actually store it, we were able to have the optimized index only be about 1 gig at the end, so large-file support never became an issue, even though we did compile it that way, just in case. With the two copies thing, we actually have two boxes in our cluster, each with a copy of the index used for searching, but only one copy used for indexing. That way, each box we have in the cluster can search locally, while the "indexing" box can index away, and update the copies when it's done. Oh..and I do turn off :term_vector for most of my fields..thanks for the tip. By the way, thanks for all the hard work you do in getting this product the best it can be. 
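
The tokenize-but-don't-store tip, combined with Dave's :term_vector advice, might look roughly like the sketch below. The field names :title and :body are hypothetical, and the FieldInfos calls are an assumption based on the 0.10-series API, so check them against the API docs for your version. The options are kept in a plain hash so the sketch runs without Ferret; build_field_infos is only meant to be called once Ferret has been required.

```ruby
# Per-field options: :body is tokenized and searchable but not stored,
# and term vectors are switched off everywhere to keep the index small.
# (:title and :body are illustrative names, not taken from the thread.)
FIELD_OPTIONS = {
  :title => { :store => :yes, :index => :yes, :term_vector => :no },
  :body  => { :store => :no,  :index => :yes, :term_vector => :no }
}

# Assumed usage of Ferret::Index::FieldInfos#add_field; call this only
# after `require 'ferret'`, otherwise the constant lookup fails.
def build_field_infos(field_options)
  infos = Ferret::Index::FieldInfos.new
  field_options.each { |name, options| infos.add_field(name, options) }
  infos
end

FIELD_OPTIONS.each do |name, options|
  puts "#{name}: store=#{options[:store]} term_vector=#{options[:term_vector]}"
end
```

A field that is not stored cannot be shown in search results, so anything needed for display still has to be stored or fetched from the database.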
> From: "David Balmain" > Reply-To: ferret-talk at rubyforge.org > Date: Wed, 11 Oct 2006 15:16:58 +0900 > To: ferret-talk at rubyforge.org > Subject: Re: [Ferret-talk] Indexing problem 10.9/10.10 > > On 10/11/06, peter wrote: >> We've had somewhat of a similar situation ourselves, where we are indexing >> about a million records to an index, and each record can be somewhat large. >> >> Now..what happened on our side was that the index files (very similar in >> structure to what you have below) came up to a 2 gig limit and stopped >> there..and the indexer started crashing each time it hit this limit. >> >> On your side, I don't see your index file sizes really that large. I think >> the compiling with large file support only really kicks in when you hit this >> 2 gig size limit. > > Hi Peter, > Did you manage to compile Ferret successfully with large-file support > yourself? > >> Couple of thoughts that might help: >> 1. On our side, to keep size down, I would optimize the index at every >> 100,000 documents. The optimize call also flushes the index. > > You can also just call Index#flush to flush the index without having > to optimize. Or IndexWriter#commit. Actually they should both be > commit so I'm going to alias commit to flush in the Index class in the > next version. > >> 2. Make sure you close the index once you index your data. Small >> thing..but just making sure. >> >> 3. With the index being this large, we actually have two copies, one for >> searching against an already optimized index, and the other copy doing the >> indexing. This way, no items are being searched on while the indexing is >> taking place. > > This shouldn't be necessary. Whatever version of the index you open > the IndexReader on will be the version of the index that you are > searching, even when it's files are deleted it will hold on to the > file handles so the data should still be available. 
The operating > system won't be able to use that disk space until you close the > IndexReader (or Searcher). > >> 4. One neat thing that I learned with indexing large items, was that I >> don't have to actually store everything. I can have a field set to >> tokenize, but not store, so that it can be searched..but I don't need it to >> be displayed in the search results per say..I don't actually store it, so I >> was able to keep my index size down. > > Very good tip. You should also set :term_vector to :no unless you are > using term-vectors. > > Cheers, > Dave > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From ryansking at gmail.com Wed Oct 11 16:46:43 2006 From: ryansking at gmail.com (Ryan King) Date: Wed, 11 Oct 2006 13:46:43 -0700 Subject: [Ferret-talk] Ferret just got faster. In-Reply-To: References: Message-ID: <846f30c70610111346o5f973dfcvffb9a33ba149ce0e@mail.gmail.com> On 10/4/06, David Balmain wrote: > Hey guys, > > Sorry I haven't been around for the last few days. I've just finished > a coding marathon fixing up some of the performance problems in > Ferret. If you don't know what I'm talking about there has been a > problem with Filters and Sorts on large indexes. Well, I think I've > fixed the problem. I just installed 0.10.11 and I can confirm that this has greatly helped the performance of my app, which requires some custom sorting across several indexes approaching 1M items each. Just upgrading gave me about an order of magnitude increase in performance. David, you rock! -ryan From none at none.com Wed Oct 11 18:20:00 2006 From: none at none.com (koloa) Date: Thu, 12 Oct 2006 00:20:00 +0200 Subject: [Ferret-talk] Ferret just got faster. In-Reply-To: <846f30c70610111346o5f973dfcvffb9a33ba149ce0e@mail.gmail.com> References: <846f30c70610111346o5f973dfcvffb9a33ba149ce0e@mail.gmail.com> Message-ID: hey, awesome work. 
that is am amazing improvement! i hope to try out ferret asap. -- Posted via http://www.ruby-forum.com/. From cussen at gmail.com Wed Oct 11 19:10:22 2006 From: cussen at gmail.com (Johnny Cussen) Date: Thu, 12 Oct 2006 09:10:22 +1000 Subject: [Ferret-talk] Document Boost in acts_as_ferret? Message-ID: Is it possible to set a boost value for a document using acts_as_ferret (not field boost - document boost)? From khaosduke at gmail.com Wed Oct 11 21:53:15 2006 From: khaosduke at gmail.com (wc) Date: Thu, 12 Oct 2006 03:53:15 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! In-Reply-To: <20061010083434.GG6332@cordoba.webit.de> References: <20061009155028.GA6332@cordoba.webit.de> <20061010083434.GG6332@cordoba.webit.de> Message-ID: <7b31d132a219521e648c4eb8bbd46a7a@ruby-forum.com> Jens Kraemer wrote: > On Mon, Oct 09, 2006 at 11:31:47PM +0200, wc wrote: > [..] >> Here is the trace >> User.multi_search("foo",[Model1,Model2]) >> TypeError: nil is not a symbol >> from >> ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:412:in >> `const_get' > > it seems your indexes don't contain the class_name field for all your > indexed objects. > >> both my models call: >> >> acts_as_ferret :store_class_name => true > > you should have all *three* models (User, Model1 and Model2) using the > :store_class_name option. > > Then, after rebuilding your indexes, multi_search should work fine. > > Jens > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 I have :store_class_name => true in all the models, and I search using model1.multi_search("foo",[Model1,Model2]) it works so long as there are no matches, but if there are matches I get the same error I previously posted about the nil object. 
I checked my indexes and they are create as well, for all the models I need to search -- Posted via http://www.ruby-forum.com/. From khaosduke at gmail.com Wed Oct 11 23:08:56 2006 From: khaosduke at gmail.com (wc) Date: Thu, 12 Oct 2006 05:08:56 +0200 Subject: [Ferret-talk] multi_search problems, Never go away! In-Reply-To: <7b31d132a219521e648c4eb8bbd46a7a@ruby-forum.com> References: <20061009155028.GA6332@cordoba.webit.de> <20061010083434.GG6332@cordoba.webit.de> <7b31d132a219521e648c4eb8bbd46a7a@ruby-forum.com> Message-ID: <75d8da812a1f0408bb0b97784e65bf94@ruby-forum.com> wc wrote: > Jens Kraemer wrote: >> On Mon, Oct 09, 2006 at 11:31:47PM +0200, wc wrote: >> [..] >>> Here is the trace >>> User.multi_search("foo",[Model1,Model2]) >>> TypeError: nil is not a symbol >>> from >>> ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:412:in >>> `const_get' >> >> it seems your indexes don't contain the class_name field for all your >> indexed objects. >> >>> both my models call: >>> >>> acts_as_ferret :store_class_name => true >> >> you should have all *three* models (User, Model1 and Model2) using the >> :store_class_name option. >> >> Then, after rebuilding your indexes, multi_search should work fine. >> >> Jens >> >> -- >> webit! Gesellschaft f?r neue Medien mbH www.webit.de >> Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de >> Schnorrstra?e 76 Tel +49 351 46766 0 >> D-01069 Dresden Fax +49 351 46766 66 > > I have :store_class_name => true in all the models, and I search using > model1.multi_search("foo",[Model1,Model2]) it works so long as there are > no matches, but if there are matches I get the same error I previously > posted about the nil object. I checked my indexes and they are create as > well, for all the models I need to search I also just upgraded to Ferret 0.10.11 and I am using acts_as_ferret latest stable thanks ~wil -- Posted via http://www.ruby-forum.com/. 

From khaosduke at gmail.com Thu Oct 12 03:47:01 2006
From: khaosduke at gmail.com (wc)
Date: Thu, 12 Oct 2006 09:47:01 +0200
Subject: [Ferret-talk] multi_search problems, Never go away!
In-Reply-To: <75d8da812a1f0408bb0b97784e65bf94@ruby-forum.com>
References: <20061009155028.GA6332@cordoba.webit.de>
	<20061010083434.GG6332@cordoba.webit.de>
	<7b31d132a219521e648c4eb8bbd46a7a@ruby-forum.com>
	<75d8da812a1f0408bb0b97784e65bf94@ruby-forum.com>
Message-ID: <42f20e15eb394b032862ba3aa64e260e@ruby-forum.com>

So it seems as though :model => doc[:class_name] line 428 of
class_methods.rb isn't doing what it's supposed to, I checked the
indexes and there is in fact a class_name field being created from the
acts_as_ferret :store_class_name => true, I just dont know why it isn't
being used and turns to nil in this case, again only if there is a hit
is it nil

--
Posted via http://www.ruby-forum.com/.

From khaosduke at gmail.com Thu Oct 12 03:56:14 2006
From: khaosduke at gmail.com (wc)
Date: Thu, 12 Oct 2006 09:56:14 +0200
Subject: [Ferret-talk] multi_search problems, Never go away!
In-Reply-To: <42f20e15eb394b032862ba3aa64e260e@ruby-forum.com>
References: <20061009155028.GA6332@cordoba.webit.de>
	<20061010083434.GG6332@cordoba.webit.de>
	<7b31d132a219521e648c4eb8bbd46a7a@ruby-forum.com>
	<75d8da812a1f0408bb0b97784e65bf94@ruby-forum.com>
	<42f20e15eb394b032862ba3aa64e260e@ruby-forum.com>
Message-ID:

wc wrote:
>
>
> So it seems as though :model => doc[:class_name] line 428 of
> class_methods.rb isn't doing what it's supposed to, I checked the
> indexes and there is in fact a class_name field being created from the
> acts_as_ferret :store_class_name => true, I just dont know why it isn't
> being used and turns to nil in this case, again only if there is a hit
> is it nil

I'm starting to think it might have something to do with one of my
models being a HABTM relationship setup

--
Posted via http://www.ruby-forum.com/.
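
The "nil is not a symbol" error in this thread comes from const_get being handed a nil class name. A small diagnostic sketch, assuming each hit can be read like a hash with a :class_name key: model_for and hits_missing_class_name are hypothetical helpers, not part of acts_as_ferret, and the hashes below stand in for real index documents. This only helps locate the affected hits; it does not explain why the field comes back empty.

```ruby
# Resolve the model class recorded in a stored document. Returns nil
# when the class_name field is missing or empty instead of letting
# const_get raise TypeError.
def model_for(doc)
  name = doc[:class_name]
  return nil if name.nil? || name.to_s.empty?
  Object.const_get(name.to_s)
end

# Collect the hits whose class_name cannot be resolved.
def hits_missing_class_name(docs)
  docs.select { |doc| model_for(doc).nil? }
end

# Stand-in documents; with a real index these would be the loaded hits.
docs = [
  { :id => "1", :class_name => "String" },
  { :id => "2" }
]
missing_ids = hits_missing_class_name(docs).map { |doc| doc[:id] }
puts "hits without class_name: #{missing_ids.inspect}"
```

Running something like this over the hits of a failing query would show whether every hit lacks the field or only the documents of one model.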

From kraemer at webit.de Thu Oct 12 04:22:23 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 12 Oct 2006 10:22:23 +0200
Subject: [Ferret-talk] Document Boost in acts_as_ferret?
In-Reply-To:
References:
Message-ID: <20061012082223.GE9323@cordoba.webit.de>

On Thu, Oct 12, 2006 at 09:10:22AM +1000, Johnny Cussen wrote:
> Is it possible to set a boost value for a document using
> acts_as_ferret (not field boost - document boost)?

This should be possible by overriding the to_doc instance method in
your model:

  def to_doc
    doc = super
    doc.boost = 2 # fit to taste, i.e. get from instance variable
    doc
  end

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From f at andreas-s.net Thu Oct 12 04:45:03 2006
From: f at andreas-s.net (Andreas Schwarz)
Date: Thu, 12 Oct 2006 10:45:03 +0200
Subject: [Ferret-talk] IO Error occured at :79 in xraise (IOError)
Message-ID:

Hi,

after a long indexing run I got the following error. I have 149 MB space
left on the disk, the index is 311 MB large; could Ferret have tried to
use more than that for the optimizing? Or would that have resulted in a
different error message?

/usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/commands/runner.rb:27:
/usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:536:in `optimize': IO Error occured at :79 in xraise (IOError)
Error occured in fs_store.c:226 - fso_flush_i
        flushing src of length 1024

        from /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:536:in `optimize'
        from /usr/local/ruby-1.8.5/lib/ruby/1.8/monitor.rb:238:in `synchronize'
        from /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:534:in `optimize'
        from ./vendor/plugins/acts_as_ferret/lib/class_methods.rb:235:in `manual_index_update'
        from (eval):1
        from /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `eval'
        from /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/commands/runner.rb:27
        from /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
        from /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        from script/runner:3

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Thu Oct 12 04:55:40 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 12 Oct 2006 17:55:40 +0900
Subject: [Ferret-talk] IO Error occured at :79 in xraise (IOError)
In-Reply-To:
References:
Message-ID:

On 10/12/06, Andreas Schwarz wrote:
> Hi,
>
> after a long indexing run I got the following error. I have 149 MB space
> left on the disk, the index is 311 MB large; could Ferret have tried to
> use more than that for the optimizing? Or would that have resulted in a
> different error message?
> > /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/commands/runner.rb:27: > /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:536:in > `optimize': IO Error occured at :79 in xraise (IOError) > Error occured in fs_store.c:226 - fso_flush_i > flushing src of length 1024 > > from > /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:536:in > `optimize' > from /usr/local/ruby-1.8.5/lib/ruby/1.8/monitor.rb:238:in > `synchronize' > from > /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:534:in > `optimize' > from ./vendor/plugins/acts_as_ferret/lib/class_methods.rb:235:in > `manual_index_update' > from (eval):1 > from > /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in > `eval' > from > /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/commands/runner.rb:27 > from > /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in > `gem_original_require' > from > /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in > `require' > from script/runner:3 Hi Andreas, I think it is most probably because it ran out of space during indexing as you guessed. I'll have to change that error message to a friendlier one though. I'll fix that in the next version. Let me know if you get this problem even when you have more memory. 
Cheers, Dave From f at andreas-s.net Thu Oct 12 05:32:54 2006 From: f at andreas-s.net (Andreas Schwarz) Date: Thu, 12 Oct 2006 11:32:54 +0200 Subject: [Ferret-talk] IO Error occured at :79 in xraise (IOError) In-Reply-To: References: Message-ID: <09a60268a089d7adbda33c198ebbdd14@ruby-forum.com> David Balmain wrote: > On 10/12/06, Andreas Schwarz wrote: >> Error occured in fs_store.c:226 - fso_flush_i >> from ./vendor/plugins/acts_as_ferret/lib/class_methods.rb:235:in >> from >> /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in >> `require' >> from script/runner:3 > > Hi Andreas, > > I think it is most probably because it ran out of space during > indexing as you guessed. But there's still 149 MB left, wouldn't it have filled it up first before giving up? > I'll have to change that error message to a > friendlier one though. I'll fix that in the next version. Let me know > if you get this problem even when you have more memory. I will. Thanks for the fast reply Andreas -- Posted via http://www.ruby-forum.com/. From ahfeel at rift.fr Thu Oct 12 05:50:10 2006 From: ahfeel at rift.fr (ahFeel) Date: Thu, 12 Oct 2006 11:50:10 +0200 Subject: [Ferret-talk] Patching ferret problems... Message-ID: <2ffdaaa411b710ff5736e36f47317176@ruby-forum.com> Hi dave, and everyone ;) I've finished my extension to export Index.search(..) loaded results into json. Here's a little benchmark of what i've done : >> Benchmark.realtime { INDEX.search(query, :limit => 1000).hits.each { |x| INDEX[x.doc].load.to_json } } => 7.38711595535278 >> Benchmark.realtime { INDEX.search('type:hardware', :limit => 1000).to_json } => 0.471335172653198 I patched that directly in /usr/lib/ruby/gems/1.8/gems/ferret-0.10.10/ext/r_search.c so i wanted to submit my modifications today, so i did : svn checkout svn://davebalmain.com/ferret/tags/REL-0.10.11 and tried to make my modifications into REL-0.10.11/ext/r_search.c where i discovered a T O T A L L Y different file...
I tried to find the good one, but nothing.. ? And when my mate installed 0.10.11, he had the good file in /usr/lib.../ext/ :-? Could you tell me what to check out, or the way to patch this ? :/ Thanks in advance, Jeremie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Oct 12 06:00:30 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 12 Oct 2006 19:00:30 +0900 Subject: [Ferret-talk] IO Error occured at :79 in xraise (IOError) In-Reply-To: <09a60268a089d7adbda33c198ebbdd14@ruby-forum.com> References: <09a60268a089d7adbda33c198ebbdd14@ruby-forum.com> Message-ID: On 10/12/06, Andreas Schwarz wrote: > David Balmain wrote: > > On 10/12/06, Andreas Schwarz wrote: > >> Error occured in fs_store.c:226 - fso_flush_i > >> from ./vendor/plugins/acts_as_ferret/lib/class_methods.rb:235:in > >> from > >> /usr/local/ruby-1.8.5/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in > >> `require' > >> from script/runner:3 > > > > Hi Andreas, > > > > I think it is most probably because it ran out of space during > > indexing as you guessed. > > But there's still 149 MB left, wouldn't it have filled it up first > before giving up? Perhaps I'm wrong but the single optimized file might have been larger than 149 MB so it might never have been committed to the file system, hence the amount of space left would stay at 149 MB. I don't know enough about file-systems to know if that is a plausible explanation. The error is actually happening in the write system call. That was the only place in FSDirectory where I didn't print out the strerror value so I've fixed that now. If it happens again it'll tell you why. Dave From dbalmain.ml at gmail.com Thu Oct 12 06:06:02 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 12 Oct 2006 19:06:02 +0900 Subject: [Ferret-talk] Patching ferret problems...
In-Reply-To: <2ffdaaa411b710ff5736e36f47317176@ruby-forum.com> References: <2ffdaaa411b710ff5736e36f47317176@ruby-forum.com> Message-ID: On 10/12/06, ahFeel wrote: > Hi dave, and everyone ;) > > I've finished my extension to export Index.search(..) loaded results > into json. > Here's a little benchmark of what i've done : > > >> Benchmark.realtime { INDEX.search(query, :limit => 1000).hits.each { |x| INDEX[x.doc].load.to_json } } > => 7.38711595535278 > >> Benchmark.realtime { INDEX.search('type:hardware', :limit => 1000).to_json } > => 0.471335172653198 > > I patched that directly in > /usr/lib/ruby/gems/1.8/gems/ferret-0.10.10/ext/r_search.c > so i wanted to submit my modifications today, so i did : > svn checkout svn://davebalmain.com/ferret/tags/REL-0.10.11 > and tried to make my modifications into REL-0.10.11/ext/r_search.c where > i discovered a T O T A L L Y different file... > I tried to find the good one, but nothing.. ? > > And when my mate installed 0.10.11, he had the good file in > /usr/lib.../ext/ :-? > > Could you tell me what to checkout or dunno, the way to patch this ? :/ > > Thanks by advance, > Jeremie 'ahFeel' BORDIER Check out like this: svn co svn://www.davebalmain.com/exp/ ferret This will give you the working copy which should be fine. I'm not sure why checking out the tagged version didn't work although I haven't tried it myself. Dave From ahfeel at rift.fr Thu Oct 12 07:18:59 2006 From: ahfeel at rift.fr (ahFeel) Date: Thu, 12 Oct 2006 13:18:59 +0200 Subject: [Ferret-talk] Patching ferret problems... In-Reply-To: References: <2ffdaaa411b710ff5736e36f47317176@ruby-forum.com> Message-ID: David Balmain wrote: > svn co svn://www.davebalmain.com/exp/ ferret > Great ! Looks more familliar to me =) I've patched this code but i need to make just some little modifications, and i'll trac you the diff ! > This will give you the working copy which should be fine. 
I'm not sure > why checking out the tagged version didn't work although I haven't > tried it myself. You should have a look at that, it's really strange.. in your Trac the directory is marked as modified 20 hours ago but when you go into it, the last modifications are 3 weeks ago :\ > > Dave Thank you again for what you do and your fast useful answers ! Jeremie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. From indanapt at yahoo.com Thu Oct 12 11:46:48 2006 From: indanapt at yahoo.com (Jeff Gortatowsky) Date: Thu, 12 Oct 2006 17:46:48 +0200 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? Message-ID: Hi Obviously my question is, is that normal? To have so many files? I was indexing 6 string fields from 25000+ model records (all of the same model). The index appears to be working. I guess I was expecting a few hundred files after optimizing, not more files than records indexed. Please understand I am brand spanking new to Lucene, Ferret, and AaF. I was using acts_as_ferret with :fields => ["user_id", "answer1", "answer2", "answer3", "answer4", "answer5", "answer6"], :merge_factor => 1000, :max_merge_document => 10000, :max_memory_buffer => 0x4000000 The fields are from 15 to 500 characters long. Also, is there any way to stop AaF from trying to create a new index with all the existing model data? I was surprised when after creating and updating one model object in the Rails Console, AaF took off trying to index all 8 million rows of the underlying table! I did search here on 'too many files' and "large number of files" but came up empty. I am sure my lack of domain knowledge is most likely what is hurting me. Thanks Jeff -- Posted via http://www.ruby-forum.com/.
From fastjames at gmail.com Thu Oct 12 14:07:31 2006 From: fastjames at gmail.com (Jim Kane) Date: Thu, 12 Oct 2006 20:07:31 +0200 Subject: [Ferret-talk] Ferret::StateError while using acts_as_ferret Message-ID: <72fcb47db245987eb017036d002e51c6@ruby-forum.com> I'm fairly new to ferret / aaf and finding it much easier to use than HyperEstraier (which I migrated from). However, I am getting a few errors and I need to figure out if they're problems with my usage of ferret or a bug I should report. I'm currently running Ferret 0.10.11 with acts_as_ferret (latest via svn external) and 3 times today I've seen the following error in production: A Ferret::StateError occurred in directory#search: State Error occurred at :79 in xraise Error occurred in index.c:2098 - stde_doc_num Illegal state of TermDocEnum. You must call #next before you call #doc_num /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:370:in `search.each' The index in question is on a single model and contains about 330K items. I'm not doing anything unusual as far as I know -- just a call to find_by_contents sorted by a timestamp (stored nontokenized). Can anyone offer some advice on what I might be doing to cause this? I'm also getting occasional segfaults (already submitted as a ticket on the trac) but they don't _appear_ to be tied to this error. Jim -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Oct 12 20:40:23 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 13 Oct 2006 09:40:23 +0900 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? In-Reply-To: References: Message-ID: On 10/13/06, Jeff Gortatowsky wrote: > Hi > > Obviously my question is, is that normal? To have so many files? I was > indexing 6 string fields from 25000+ model records (all of the same > model). The index appears to be working. I guess I was expecting a few > hundred files after optimzing, not more files that records indexed. 
Hi Jeff, this doesn't sound right at all. Could you send a partial listing of the directory so I can see what files are in it? Do `ls -l` so I can see their sizes too. > Please understand I am brand spanking new to Lucene, Ferret, and AaF. No problem, we're here to help. > I was using acts_as_ferret with > > :fields => ["user_id", > "answer1", > "answer2", > "answer3", > "answer4", > "answer5", > "answer6"], > :merge_factor => 1000, > :max_merge_document => 10000, > :max_memory_buffer => 0x4000000 > > > The fields are from 15 to 500 characters long. > > Also, was there any way to stop AaF from trying to create a new index > with all the existing model data? I was surprised when after creating > and updating one model object in the Rails Console, AaF took off trying > to index all 8 million rows of the underlying table! I'll leave these kinds of questions to the acts_as_ferret users. Cheers, Dave From dbalmain.ml at gmail.com Thu Oct 12 23:04:48 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 13 Oct 2006 12:04:48 +0900 Subject: [Ferret-talk] Ferret::StateError while using acts_as_ferret In-Reply-To: <72fcb47db245987eb017036d002e51c6@ruby-forum.com> References: <72fcb47db245987eb017036d002e51c6@ruby-forum.com> Message-ID: On 10/13/06, Jim Kane wrote: > I'm fairly new to ferret / aaf and finding it much easier to use than > HyperEstraier (which I migrated from). However, I am getting a few > errors and I need to figure out if they're problems with my usage of > ferret or a bug I should report. I'm currently running Ferret 0.10.11 > with acts_as_ferret (latest via svn external) and 3 times today I've > seen the following error in production: > > A Ferret::StateError occurred in directory#search: > > State Error occurred at :79 in xraise > Error occurred in index.c:2098 - stde_doc_num > Illegal state of TermDocEnum.
You must call #next before you call > #doc_num > > > /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:370:in > `search.each' > > > The index in question is on a single model and contains about 330K > items. I'm not doing anything unusual as far as I know -- just a call > to find_by_contents sorted by a timestamp (stored nontokenized). Can > anyone offer some advice on what I might be doing to cause this? I'm > also getting occasional segfaults (already submitted as a ticket on the > trac) but they don't _appear_ to be tied to this error. > > Jim Hi Jim, I'm not sure what might be causing this error. Did you reindex when you upgraded to 0.10.11? That may help. If it doesn't, try going back to version 0.10.9. This error may have been introduced in the performance enhancements I added in version 0.10.10. Let me know how you go. Cheers, Dave From tennisbum2002 at hotmail.com Fri Oct 13 03:41:36 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Fri, 13 Oct 2006 09:41:36 +0200 Subject: [Ferret-talk] multi_search error undefined method Message-ID: Hi, Im having problems using the multi_search command. I keep getting the following error. "undefined method `<<' for Book:Class" here is the code associated with this. class Book < ActiveRecord::Base acts_as_ferret :store_class_name => true end class User < ActiveRecord::Base acts_as_ferret :store_class_name => true end and the call is the following t=User.multi_search(@query,Book). I tried deleting my browser session and rebuilding the index (by deleting it). Im still getting the same error. Any ideas? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Fri Oct 13 04:31:00 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 13 Oct 2006 10:31:00 +0200 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? 
In-Reply-To: References: Message-ID: <20061013083100.GA14271@cordoba.webit.de> On Fri, Oct 13, 2006 at 09:40:23AM +0900, David Balmain wrote: > On 10/13/06, Jeff Gortatowsky wrote: [..] > > Also, was there any way to stop AaF from trying to create a new > > index with all the existing model data? I was surprised when after > > creating and updating one model object in the Rails Console, AaF > > took off trying to index all 8 million rows of the underlying table! aaf always tries to create the index if it doesn't exist yet. The whole point about aaf is to keep the index in sync with your database. Therefore it is necessary to add all existing records to a newly created index. Although it would be easy to add an option to suppress the indexing of existing data, I don't think this is useful, because you'll end up with an index only containing new or updated records, but not those that already existed at index creation time. I can't imagine this is what you want ;-) To keep the index creation from happening when the index is accessed the first time from your app (could be a search, or some update/create operation), you can build up the index from the console, i.e. RAILS_ENV=production script/console >> Model.rebuild_index cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Fri Oct 13 04:41:40 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 13 Oct 2006 10:41:40 +0200 Subject: [Ferret-talk] multi_search error undefined method In-Reply-To: References: Message-ID: <20061013084140.GB14271@cordoba.webit.de> On Fri, Oct 13, 2006 at 09:41:36AM +0200, Eric Gross wrote: > Hi, > > Im having problems using the multi_search command. I keep getting the > following error. > > "undefined method `<<' for Book:Class" > > here is the code associated with this.
> > class Book < ActiveRecord::Base > acts_as_ferret :store_class_name => true > end > > > class User < ActiveRecord::Base > acts_as_ferret :store_class_name => true > end > > and the call is the following > > t=User.multi_search(@query,Book). try this: t = User.multi_search(@query, [ Book ]) I just committed a fix so that t=User.multi_search(@query,Book) will work, too. cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From f at andreas-s.net Fri Oct 13 13:18:44 2006 From: f at andreas-s.net (Andreas Schwarz) Date: Fri, 13 Oct 2006 19:18:44 +0200 Subject: [Ferret-talk] Ferret 0.10.11 & AAF: sorting Time fields doesn't work Message-ID: Ferret 0.10.11 & AAF: the time seems to be stored in a format that can't be sorted, the order doesn't make any sense. Workaround: use to_i on the Time object before putting it into the index. -- Posted via http://www.ruby-forum.com/. From f at andreas-s.net Fri Oct 13 13:37:52 2006 From: f at andreas-s.net (Andreas Schwarz) Date: Fri, 13 Oct 2006 19:37:52 +0200 Subject: [Ferret-talk] uninitialized constant LockException Message-ID: <2291824ddec9a4b58c780483562a1496@ruby-forum.com> When I try to search a locked index, I get the exception uninitialized constant LockException: /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:383:in `[]' /usr/local/ruby-1.8.5/lib/ruby/1.8/monitor.rb:238:in `synchronize' /usr/local/ruby-1.8.5/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:382:in `[]' -- Posted via http://www.ruby-forum.com/. From anrake at gmail.com Fri Oct 13 21:28:50 2006 From: anrake at gmail.com (Eric Obershaw) Date: Sat, 14 Oct 2006 03:28:50 +0200 Subject: [Ferret-talk] customer analyzer? Message-ID: I'd like to make my own analyzer for stemming, but where do I put it or how do I reference it? -- Posted via http://www.ruby-forum.com/.
From dbalmain.ml at gmail.com Sat Oct 14 04:03:03 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 14 Oct 2006 17:03:03 +0900 Subject: [Ferret-talk] uninitialized constant LockException In-Reply-To: <2291824ddec9a4b58c780483562a1496@ruby-forum.com> References: <2291824ddec9a4b58c780483562a1496@ruby-forum.com> Message-ID: On 10/14/06, Andreas Schwarz wrote: > When I try to search a locked index, I get the exception > > uninitialized constant LockException: Fixed. I also fixed that error you got when running the unit tests. Thanks. Dave From dbalmain.ml at gmail.com Sat Oct 14 04:07:54 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 14 Oct 2006 17:07:54 +0900 Subject: [Ferret-talk] Ferret 0.10.11 & AAF: sorting Time fields doesn't work In-Reply-To: References: Message-ID: On 10/14/06, Andreas Schwarz wrote: > Ferret 0.10.11 & AAF: the time seems to be stored in a format that can't > be sorted, the order doesn't make any sense. Workaround: use to_i on the > Time object before putting it into the index. This is fine for sorting but it won't work when you want to do range queries. If you want to run range queries against the field you will need to pad the integer to a fixed width. I usually add dates in YYYYMMDD format so that they are also human readable. Another thing to take into account is the precision you want. The higher the precision you store the longer indexing and searching will take. For small indexes however the difference will be negligible. Cheers, Dave From dbalmain.ml at gmail.com Sat Oct 14 04:13:12 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 14 Oct 2006 17:13:12 +0900 Subject: [Ferret-talk] customer analyzer? In-Reply-To: References: Message-ID: On 10/14/06, Eric Obershaw wrote: > I'd like to make my own analyzer for stemming, but where do I put it or > how do I reference it?
You create the analyzer like in the example here: http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StemFilter.html Then you pass it to your Index or IndexWriter as the :analyzer parameter:

  index = Ferret::Index::Index.new(:analyzer => MyAnalyzer.new)
  # or
  writer = Ferret::Index::IndexWriter.new(:analyzer => MyAnalyzer.new)

Hope that makes sense. Dave From f at andreas-s.net Sat Oct 14 04:20:46 2006 From: f at andreas-s.net (Andreas Schwarz) Date: Sat, 14 Oct 2006 10:20:46 +0200 Subject: [Ferret-talk] Ferret 0.10.11 & AAF: sorting Time fields doesn't work In-Reply-To: References: Message-ID: David Balmain wrote: > On 10/14/06, Andreas Schwarz wrote: >> Ferret 0.10.11 & AAF: the time seems to be stored in a format that can't >> be sorted, the order doesn't make any sense. Workaround: use to_i on the >> Time object before putting it into the index. > > This is fine for sorting but it won't work when you want to do range > queries. If you want to run range queries against the field you will > need to pad the integer to a fixed width. Thanks, I didn't think of that. The width of a unix timestamp isn't going to change for the next few years, though. > I usually add dates in YYYYMMDD format so that they are also human > readable. I need at least seconds in this case. > Another thing to take into account is the precision you > want. The higher the precision you store the longer indexing and > searching will take. For small indexes however the difference will be > negligible. Will this also affect :sort performance? I have another index with 450k documents that I also would like to sort by time, that's probably no longer a small index. Is the sorting capable of :sorting and :limiting a large number of results (e.g. 100k), or is this something that should only be done with small result sets? Andreas -- Posted via http://www.ruby-forum.com/.
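A note on the fixed-width padding point raised in this thread: the sketch below uses plain Ruby only (no Ferret calls) to show why an unpadded integer string can sort out of order when compared as text, and how zero-padding avoids that. The helper name `sortable_timestamp` and the width of 12 digits are assumptions made for illustration; they are not part of Ferret or acts_as_ferret.

```ruby
# Hypothetical helper, not part of Ferret or acts_as_ferret.
# Assumes non-negative Unix timestamps; the width of 12 is an arbitrary choice.
TIMESTAMP_WIDTH = 12

def sortable_timestamp(time)
  # Left-pad the integer seconds with zeros so every value has equal length.
  time.to_i.to_s.rjust(TIMESTAMP_WIDTH, "0")
end

early = Time.at(999_999_999)    # 9-digit timestamp
late  = Time.at(1_160_784_000)  # 10-digit timestamp

# Unpadded strings compare character by character, so the 9-digit
# value ends up after the 10-digit one.
unpadded = [early, late].map { |t| t.to_i.to_s }.sort

# Padded strings have equal length, so string order matches time order.
padded = [early, late].map { |t| sortable_timestamp(t) }.sort
```

The value returned by such a helper is what would be placed in the indexed field instead of the raw Time object; whether a given width is sufficient depends on the range of dates being stored.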
From dbalmain.ml at gmail.com Sat Oct 14 11:16:17 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 15 Oct 2006 00:16:17 +0900 Subject: [Ferret-talk] Ferret 0.10.11 & AAF: sorting Time fields doesn't work In-Reply-To: References: Message-ID: On 10/14/06, Andreas Schwarz wrote: > David Balmain wrote: > > On 10/14/06, Andreas Schwarz wrote: > >> Ferret 0.10.11 & AAF: the time seems to be stored in a format that can't > >> be sorted, the order doesn't make any sense. Workaround: use to_i on the > >> Time object before putting it into the index. > > > > This is fine for sorting but it won't work when you want to do range > > queries. If you want to run range queries against the field you will > > need to pad the integer to a fixed width. > > Thanks, I didn't think of that. The width of a unix timestamp isn't > going to change for the next few years, though. > > > I usually add dates in YYYYMMDD format so that they are also human > > readable. > > I need at least seconds in this case. > > > Another thing to take into account is the precision you > > want. The higher the precision you store the longer indexing and > > searching will take. For small indexes however the difference will be > > negligible. > > Will this also affect :sort performance? I have another index with 450k > documents that I also would like to sort by time, that's probably no > longer a small index. Is the sorting capable of :sorting and :limiting a > large numer of results (e.g. 100k), or is this something that should > only be done with small result sets? It should sort the first time in under one second. After that the sort will be cached so it will be even quicker. To use sort caching though, be sure to use a sort object and not a sort_field or plain old string. Also, for best performance, sort by byte instead of integer (which will only work if you have a fixed width integer field). 
i.e. :sort => Sort.new(SortField.new(:field_name, :type => :byte)) Cheers, Dave From bfkeats at engmail.uwaterloo.ca Sat Oct 14 12:49:10 2006 From: bfkeats at engmail.uwaterloo.ca (Brian Keats) Date: Sat, 14 Oct 2006 18:49:10 +0200 Subject: [Ferret-talk] Using the wildcard plus partial searched In-Reply-To: <2426e1f71ec93c48e175f489fdb7d9be@ruby-forum.com> References: <2426e1f71ec93c48e175f489fdb7d9be@ruby-forum.com> Message-ID: <973e0bd78d8c40876ede62f6c0f93f88@ruby-forum.com> > I've noticed that typing "t" in the search box will return no results > however if I type "t*" it brings up all results beginning with t. > > I would like this behaviour on by default without having to type the > wildcard. Is there a way to do this? Add the following search method to your model:

  def self.search(query)
    criteria = query.split
    wild_criteria = []
    criteria.each do |criterion|
      if criterion == "AND" || criterion == "OR"
        wild_criteria.push(criterion)
      else
        wild_criterion = "*"+criterion+"*"
        wild_criteria.push(wild_criterion)
      end
    end
    wild_criteria.pop if wild_criteria.last == "AND" || wild_criteria.last == "OR"
    wild_query = wild_criteria.join(" ")
    self.find_by_contents(wild_query)
  end

This also removes trailing logical operators, which will prevent it from screwing up a live search. -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Sat Oct 14 17:02:30 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 14 Oct 2006 23:02:30 +0200 Subject: [Ferret-talk] Attaching files in Tracs to tickets is broken! Message-ID: <5cd8a1d649ac8acce29f56889805c1c9@ruby-forum.com> Hi, I tried opening a new ticket with a patch I wanted to submit. But when I tried to attach files to the ticket I keep getting a ticket not found error. I kept trying to get it to work, but I accidentally opened 4 new tickets. Sorry. Charlie -- Posted via http://www.ruby-forum.com/.
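A note on the wildcard search method posted by Brian Keats earlier in this digest: the query rewriting can be tried on its own, without a model or an index. The sketch below is a standalone variant that only builds the query string; the name `wildcard_query` is hypothetical, and it makes no call into Ferret or acts_as_ferret, so the resulting string would still have to be passed to find_by_contents as in the original method.

```ruby
# Hypothetical standalone helper; it only builds the query string and
# does not call Ferret or acts_as_ferret.
OPERATORS = %w[AND OR].freeze

def wildcard_query(query)
  # Wrap every plain term in wildcards, but leave boolean operators alone.
  terms = query.split.map do |term|
    OPERATORS.include?(term) ? term : "*#{term}*"
  end
  # Drop one dangling operator so a half-typed live search stays valid.
  terms.pop if OPERATORS.include?(terms.last)
  terms.join(" ")
end
```

Like the original, this removes only a single trailing operator, and it wraps each term on both sides, so "t" becomes "*t*" rather than "t*". Leading wildcards are usually slower to evaluate than trailing ones, so a prefix-only variant may be preferable on large indexes.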
From charlie.hubbard at gmail.com Sat Oct 14 17:45:37 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 14 Oct 2006 23:45:37 +0200 Subject: [Ferret-talk] How can I do my own search limits? Message-ID: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> I'm trying to add a way to query across associations for a model in acts_as_ferret. Say I have a model A and it has a relationship with model B. Like say a Book has many pages. I want to search across the pages of the Book and produce a list of unique books whose pages match the terms. So if I have a page that hits then I will add that book to my list of results. Right now the multi_search returns all pages and books that match the query. This sort of gets difficult with pagination because you can't just keep track of it yourself. Also the total hits will include all hits on pages and books. The way the pagination works today with ferret is you hand it :offset and :limit params. But, these are fixed width params. I could end up with 100s of pages that all belong to the same book so I have to skip all of those. This sort of seems like a different kind of search. Not a multi_search or find_by_contents, but a find_by_association. Where the hit on the association returns an object of the associated type. Is there something in ferret that allows me to scroll through the results one by one and stop when I've reached my limit? Charlie -- Posted via http://www.ruby-forum.com/. From anrake at gmail.com Sat Oct 14 23:04:34 2006 From: anrake at gmail.com (anrake o.) Date: Sun, 15 Oct 2006 05:04:34 +0200 Subject: [Ferret-talk] customer analyzer? In-Reply-To: References: Message-ID: Thanks. I'll try that. -- Posted via http://www.ruby-forum.com/. From none at none.com Sat Oct 14 23:30:29 2006 From: none at none.com (koloa) Date: Sun, 15 Oct 2006 05:30:29 +0200 Subject: [Ferret-talk] acts_as_attachment and tagging?
Message-ID: hi, i read this: http://www.johnnysthoughts.com/2006/08/27/ruby-on-rails-using-full-text-search-with-tagging/ does this mean i do not have to install the acts_as_taggable plugin? all i need to do is something like this is my model class? acts_as_ferret :field=>['name', :tag_list] -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Sat Oct 14 23:46:38 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 15 Oct 2006 12:46:38 +0900 Subject: [Ferret-talk] How can I do my own search limits? In-Reply-To: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> Message-ID: On 10/15/06, Charlie Hubbard wrote: > > I'm trying to add a way to query across associations for a model in > acts_as_ferret. Say I have a model A and it has a relationship with > model B. Like say a Book has many pages. I want to search across the > pages of the Book and produce a list of unique books who's pages match > the terms. So if I have a page that hits then I will add that book to > my list of results. Right now the multi_search returns all pages and > books that match the query. > > This sort of gets difficult with pagination because you can't just keep > track of it yourself. Also the total hits will include all hits on > pages and books. The way the pagination works today with ferret is you > hand it :offsets and :limit params. But, these are fixed width params. > I could end up with 100's of pages that all belong to the same book so I > have to skip all of those. > > This sort of seems like a different kind of search. Not a multi_search > or find_by_contents, but a find_by_association. Where the hit on the > association returns an object of the associated type. If I manage to implement the Ferret object database[1] this will be simple. Currently though there are two ways to do this. You can index all of the Page data in the Book document, presumably in a :page field. 
Or you can store the Book ids in the Pages and create a Book id set by scanning through all matching pages. [1] http://www.ruby-forum.com/topic/82086#142613 > Is there something in ferret that allows me to scroll through the > results one by one and stop when I've reached my limit? Sure. Set :limit => :all and call search_each. Then break when you reach your limit. From dbalmain.ml at gmail.com Sat Oct 14 23:49:41 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 15 Oct 2006 12:49:41 +0900 Subject: [Ferret-talk] Attaching files in Tracs to tickets is broken! In-Reply-To: <5cd8a1d649ac8acce29f56889805c1c9@ruby-forum.com> References: <5cd8a1d649ac8acce29f56889805c1c9@ruby-forum.com> Message-ID: On 10/15/06, Charlie Hubbard wrote: > Hi, > > I tried opening a new ticket with a patch I wanted to submit. But when > I tried to attach files to the ticket I keep getting a ticket not found > error. I kept trying to get it to work, but I accidentally opened 4 new > tickets. Sorry. > > Charlie Just to let Jens know, this is on the acts_as_ferret trac, not Ferret's trac. Cheers, Dave From charlie.hubbard at gmail.com Sun Oct 15 10:30:47 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sun, 15 Oct 2006 16:30:47 +0200 Subject: [Ferret-talk] How can I do my own search limits? In-Reply-To: References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> Message-ID: <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> David Balmain wrote: > If I manage to implement the Ferret object database[1] this will be > simple. Currently though there are two ways to do this. You can index > all of the Page data in the Book document, presumably in a :page > field. Or you can store the Book ids in the Pages and create a Book id > set by scanning through all matching pages. > > [1] http://www.ruby-forum.com/topic/82086#142613 The first option has problems because a book's content will be too large for a single field. It would overrun ferret's maximum field length. 
I'm pretty much doing the second option now. But, its drawback is pagination gets tough. I'm not sure how having the ferret object database would actually work to solve this problem. How would your queries express what the user intends? How would it know I want to include all the Page objects as part of a search on Books? Seems like you'd have to specify that sort of thing as options to the search. Like we have to specify eager loading with :include option to find. >> Is there something in ferret that allows me to scroll through the >> results one by one and stop when I've reached my limit? > > Sure. Set :limit => :all and call search_each. Then break when you > reach your limit. That will work for creating a list of Books, and ensuring I show, say, 10 unique books per page. But, I won't be able to tell what the total number of hits were. Any ideas? Also it gets hard to do pagination because you can't compute where the next window starts and ends. So how do you know what the offset parameter is for the previous pages? Or what the offset for the 9th page is? Charlie -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Sun Oct 15 12:16:17 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 16 Oct 2006 01:16:17 +0900 Subject: [Ferret-talk] How can I do my own search limits? In-Reply-To: <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> Message-ID: On 10/15/06, Charlie Hubbard wrote: > David Balmain wrote: > > > If I manage to implement the Ferret object database[1] this will be > > simple. Currently though there are two ways to do this. You can index > > all of the Page data in the Book document, presumably in a :page > > field. Or you can store the Book ids in the Pages and create a Book id > > set by scanning through all matching pages.
> > > > [1] http://www.ruby-forum.com/topic/82086#142613 > > The first option has problems because a book's content will be too large > for a single field. It would overrun ferret's maximum field length. Then change the maximum field length. IndexWriter has a :max_field_length parameter. > I'm pretty much doing the second option now. But, it's drawback is > pagination gets tough. I'm not sure how having the ferret object > database would actually work to solve this problem. How would your > queries express what the user intends? How would it know I want to > include all the Page objects as apart of a search on Books? Seems like > you'd have to specify that sort of thing as options to the search. Like > we have to specify eager loading with :include option to find. Well the user would just type their query as usual but you'd write the query something like: Books.find("pages match '#{query}'", :limit => 10) Or something like that. I haven't worked the details yet. And you would be able to specify whether you wanted lazy or eager loading too. > >> Is there something in ferret that allows me to scroll through the > >> results one by one and stop when I've reached my limit? > > > > Sure. Set :limit => :all and call search_each. Then break when you > > reach your limit. > > That will work for creating a list of Books, and ensuring a show say 10 > unique books per page. But, I won't be able to tell what the total > number of hits were. Any ideas? > > Also it gets hard to do pagination because you can't compute where the > next window starts and ends. So how do you know what the offset > parameter is for the previous pages. Or the offset for the 9th page is? > Scroll through all matches or use option 1. Cheers, Dave From charlie.hubbard at gmail.com Sun Oct 15 18:05:24 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Mon, 16 Oct 2006 00:05:24 +0200 Subject: [Ferret-talk] How can I do my own search limits? 
In-Reply-To:
References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com>
Message-ID: <873164f696bd71734fa95fba4be5eac4@ruby-forum.com>

David Balmain wrote:
> Well the user would just type their query as usual but you'd write the
> query something like:
>
> Books.find("pages match '#{query}'", :limit => 10)
>
> Or something like that. I haven't worked out the details yet. And you
> would be able to specify whether you wanted lazy or eager loading too.

That's what I guessed you'd have to do: change the query language to support this concept. I was actually working on adding a new method to acts_as_ferret where you could pass these association matches in like:

    Book.find_by_association( query, [:pages], { :limit => 20 } )

I can't change the query language, but this way I could express the same sort of behavior. This would result in a multi_index query across the Book and Page indexes. But tracking total_hits and paging just don't work with this approach. The only option you have is to iterate over all the matches.

When we do ferret queries, does ferret actually go over the entire search space to calculate all the possible documents that matched the query, and then just return the ones within the offset and limits?

If that's the case then it's doable to create this type of search, but it would make more sense to modify ferret to support this type of query.

I'm interested in your database approach. It could help simplify this problem. It seems doable to add this to acts_as_ferret without needing a separate project. Not to mention it's really needed in Rails apps as well.

Charlie

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Sun Oct 15 18:06:48 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 16 Oct 2006 00:06:48 +0200
Subject: [Ferret-talk] Attaching files in Tracs to tickets is broken!
In-Reply-To:
References: <5cd8a1d649ac8acce29f56889805c1c9@ruby-forum.com>
Message-ID: <20061015220648.GB9556@cordoba.webit.de>

On Sun, Oct 15, 2006 at 12:49:41PM +0900, David Balmain wrote:
> On 10/15/06, Charlie Hubbard wrote:
> > Hi,
> >
> > I tried opening a new ticket with a patch I wanted to submit. But when
> > I tried to attach files to the ticket I keep getting a ticket not found
> > error. I kept trying to get it to work, but I accidentally opened 4 new
> > tickets. Sorry.

no need to be sorry, I broke this when trying to fix the attachment downloads. Now everything should be fine again.

Could you please try to upload a diff against aaf trunk for your patch? I'm very interested in this and would like to integrate it into the next version of aaf, which I'll release by the end of this week.

thanks,
Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From kraemer at webit.de Sun Oct 15 18:13:39 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 16 Oct 2006 00:13:39 +0200
Subject: [Ferret-talk] acts_as_attachment and tagging?
In-Reply-To:
References:
Message-ID: <20061015221339.GC9556@cordoba.webit.de>

On Sun, Oct 15, 2006 at 05:30:29AM +0200, koloa wrote:
> hi, i read this:
>
> http://www.johnnysthoughts.com/2006/08/27/ruby-on-rails-using-full-text-search-with-tagging/
>
>
> does this mean i do not have to install the acts_as_taggable plugin? all
> i need to do is something like this in my model class?
> acts_as_ferret :field=>['name', :tag_list]

from quick-reading that post, you'll still need acts_as_taggable to have your taggings get saved correctly. he just uses aaf to search posts by tags.

Maybe you could get away with just storing tags in a text field on each record, and searching this field with ferret - but this way you'll lose the possibility to do relational db queries on your tags.

cheers,
Jens

--
webit!
Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From howardmoon at hitcity.com.au Sun Oct 15 19:53:25 2006
From: howardmoon at hitcity.com.au (Peter Royle)
Date: Mon, 16 Oct 2006 01:53:25 +0200
Subject: [Ferret-talk] Very small scores for search results
Message-ID:

Hi Everyone,

I'm using Ferret 0.10.11 with acts_as_ferret from SVN (same results with 0.10.10 and 0.10.9 though).

I'm running into an odd problem where the scores of my top-ranking search results are ridiculously small - even when the query is one that should match at least one document with a decent score.

To give an example, I have just the names of 5 businesses indexed using the standard analyzer. (The same happens with thousands of records indexed by many fields but I've simplified for this example.) One of those businesses is called "ABC Master Building Designers". When I do a query for "building" I get "ABC Master Building Designers" as the top result, but with the following explanation (via code I added to acts_as_ferret for debugging):

QUERY: id:building name:building

EXPLANATION of building:
8.438619e-42 = product of:
  1.687724e-41 = weight(name:building in 3), product of:
    0.6125279 = query_weight(name:building), product of:
      2.386294 = idf(doc_freq=1)
      0.2566858 = query_norm
    2.755373e-41 = field_weight(name:building in 3), product of:
      1.0 = tf(term_freq(name:building)=1)
      2.386294 = idf(doc_freq=1)
      1.15467e-41 = field_norm(field=name, doc=3)
  0.5 = coord(1/2)

Note the tiny value of field_norm, which is throwing the whole score out. The net result is that the records aren't differentiated by much, so the ordering of the results rarely makes much sense. I sometimes get restaurants in the search results!

I haven't used any boost or anything on the name field.
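As a worked check of the explanation output above, the score is a plain product of the listed factors, so the tiny field_norm drags the whole result down. The sketch below only multiplies the numbers quoted in the explanation; the method name score_from_explanation is made up for illustration, and the second call uses a field_norm of 0.5 purely as an assumed "sane" value for comparison, not as a value Ferret is known to store.

```ruby
# Recomputes the score from the factors in the explanation output.
# score_from_explanation is a hypothetical name used only for this sketch.
def score_from_explanation(idf, query_norm, tf, field_norm, coord)
  query_weight = idf * query_norm
  field_weight = tf * idf * field_norm
  { :query_weight => query_weight,
    :field_weight => field_weight,
    :score        => query_weight * field_weight * coord }
end

# Factors copied from the explanation in the message above.
reported = score_from_explanation(2.386294, 0.2566858, 1.0, 1.15467e-41, 0.5)
printf("query_weight = %.7f\n", reported[:query_weight])
printf("field_weight = %.6e\n", reported[:field_weight])
printf("score        = %.6e\n", reported[:score])

# Same factors, but with an assumed field_norm of 0.5 for comparison.
assumed = score_from_explanation(2.386294, 0.2566858, 1.0, 0.5, 0.5)
printf("score with assumed field_norm 0.5 = %.4f\n", assumed[:score])
```

The recomputed values agree with the reported ones up to the rounding of the printed factors, which suggests the arithmetic is fine and only the stored norm is off.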
My Business class calls AaF like this:

    class Business < ActiveRecord::Base

      acts_as_ferret(
        :fields => { :name => { } },
        :or_default => true
      )

      ...

    end

Does anyone have any ideas as to what might be causing this? Any help would be greatly appreciated.

Thanks,

Pete.

--
Posted via http://www.ruby-forum.com/.

From brent.salo at scgcanada.com Sun Oct 15 20:07:41 2006
From: brent.salo at scgcanada.com (Brent Salo)
Date: Mon, 16 Oct 2006 02:07:41 +0200
Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem
In-Reply-To: <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com>
References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com>
Message-ID: <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com>

Hello Ilya and Dave,
Have you figured out what is causing this issue?
It is happening for me with acts_as_ferret and rmagick.
My rails app works great on some windows boxes, but on my 2003 server it gives me the nondescript EBADF error whenever I hit any controller functions that use ferret or rmagick.
Initially it would not load on any of my windows boxes, but the quick workaround of modifying base.rb to replace tabs with spaces worked.
Unfortunately, however, the app is still crapping out whenever I hit rmagick or ferret functions on the 2003 box.
Any updates?
Thanks
Brent

--
Posted via http://www.ruby-forum.com/.

From john at squirl.info Sun Oct 15 21:20:35 2006
From: john at squirl.info (John Mcgrath)
Date: Mon, 16 Oct 2006 03:20:35 +0200
Subject: [Ferret-talk] seg fault, ferret 0.10.11
Message-ID:

Hi,

we're using Ferret 0.10.11 with acts_as_ferret (stable from svn), on a unix box, running rails 1.1.6 in production.

a few days ago i rebuilt the index (by deleting the previous one and letting acts_as_ferret do its thing), and it ran fine for a few days.
this evening i got a seg fault when one of the indexes was being updated via aaf, and now all ferret searches are busted, returning this error:

EOFError (End-of-File Error occured at :79 in xraise
Error occured in compound_io.c:123 - cmpdi_read_i
        Tried to read past end of file. File length is <86> and tried to read to <164>
):

We had a similar problem last week, and I had hoped that upgrading from 0.10.1 to 0.10.11 would fix it, but apparently not.

we have 36 models using acts_as_ferret, each with a separate index. between them all there are about 13,000 items in the index.

i'm going to rebuild the index again, which seems to fix this temporarily, but i'd really love to figure out what's causing it and fix the problem.

any help gratefully appreciated.

thanks,
john

--
Posted via http://www.ruby-forum.com/.

From jordan.w.frank at gmail.com Mon Oct 16 02:16:59 2006
From: jordan.w.frank at gmail.com (Jordan Frank)
Date: Mon, 16 Oct 2006 02:16:59 -0400
Subject: [Ferret-talk] seg faults and problems with new version
Message-ID:

Hi all,
first off, it's 2AM and I'm not thinking properly, so please forgive me if this one's easy, but I just need to get this going.

First problem, using 0.9.6 on all of our development machines, works great, then we move it to a server running x86_64 linux and it segfaults as soon as it tries to create an Index. I've tried rebuilding with different optimization flags, to no avail. So I figured, hey, let's upgrade to 0.10.x and see how that goes. So I updated to the latest gem of ferret, and switched over to the stable tagged branch of acts_as_ferret, but now I get the following:

>> Person.rebuild_index
NoMethodError: undefined method `exists?' for {:index=>:yes, :term_vector=>:no, :store=>:no, :boost=>1.0}:Hash
        from /usr/lib/ruby/site_ruby/1.8/ferret/index/field_infos.rb:20:in `initialize'
        from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:166:in `rebuild_index'
        from (irb):1

Any thoughts?
I don't want to start going into the code and messing with stuff in the acts_as_ferret plugin, because I'm sure it must be working for others, and I'm just doing something stupid...so what am I doing wrong?

Cheers,
Jordan

From cussen at gmail.com Mon Oct 16 02:21:39 2006
From: cussen at gmail.com (Johnny Cussen)
Date: Mon, 16 Oct 2006 16:21:39 +1000
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To:
References:
Message-ID:

Pete,

I noticed the same thing over the weekend. Haven't started investigating yet though.

Johnny

On 16/10/2006, at 9:53 AM, Peter Royle wrote:

> Hi Everyone,
>
> I'm using Ferret 0.10.11 with acts_as_ferret from SVN (same results with
> 0.10.10 and 0.10.9 though).
>
> I'm running into an odd problem where the scores of my top-ranking
> search results are ridiculously small - even when the query is one that
> should match at least one document with a decent score.
>
> To give an example, I have just the names of 5 businesses indexed using
> the standard analyzer. (The same happens with thousands of records
> indexed by many fields but I've simplified for this example.) One of
> those businesses is called "ABC Master Building Designers". When I do a
> query for "building" I get "ABC Master Building Designers" as the top
> result, but with the following explanation (via code I added to
> acts_as_ferret for debugging):
>
> QUERY: id:building name:building
>
> EXPLANATION of building:
> 8.438619e-42 = product of:
>   1.687724e-41 = weight(name:building in 3), product of:
>     0.6125279 = query_weight(name:building), product of:
>       2.386294 = idf(doc_freq=1)
>       0.2566858 = query_norm
>     2.755373e-41 = field_weight(name:building in 3), product of:
>       1.0 = tf(term_freq(name:building)=1)
>       2.386294 = idf(doc_freq=1)
>       1.15467e-41 = field_norm(field=name, doc=3)
>   0.5 = coord(1/2)
>
> Note the tiny value of field_norm, which is throwing the whole score out.
> The net result is that the records aren't differentiated by much, so
> the ordering of the results rarely makes much sense. I sometimes get
> restaurants in the search results!
>
> I haven't used any boost or anything on the name field. My Business
> class calls AaF like this:
>
> class Business < ActiveRecord::Base
>
>   acts_as_ferret(
>     :fields => { :name => { } },
>     :or_default => true
>   )
>
>   ...
>
> end
>
> Does anyone have any ideas as to what might be causing this? Any help
> would be greatly appreciated.
>
> Thanks,
>
> Pete.
>
> --
> Posted via http://www.ruby-forum.com/.

From ork at orkland.de Mon Oct 16 02:49:50 2006
From: ork at orkland.de (Benjamin Krause)
Date: Mon, 16 Oct 2006 08:49:50 +0200
Subject: [Ferret-talk] seg faults and problems with new version
In-Reply-To:
References:
Message-ID: <45332B8E.3010303@orkland.de>

Hey ..

> First problem, using 0.9.6 on all of our development machines,
> works great, then we move it to a server running x86_64 linux and it
> segfaults as soon as it tries to create an Index. I've tried
>
>

We had some problems with 0.9.x on x86_64 as well, but it is running very stable on 0.10.x now. So I guess you shouldn't use 0.9.x anymore.

Not sure about the AAF issue, though. But I guess you should try to get the latest svn version of the plugin. Jens is currently updating the code to match the latest version of ferret.

Ben

From dbalmain.ml at gmail.com Mon Oct 16 02:54:19 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 15:54:19 +0900
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To:
References:
Message-ID:

On 10/16/06, Peter Royle wrote:
> Hi Everyone,
>
> I'm using Ferret 0.10.11 with acts_as_ferret from SVN (same results with
> 0.10.10 and 0.10.9 though).
>
> I'm running into an odd problem where the scores of my top-ranking
> search results are ridiculously small - even when the query is one that
> should match at least one document with a decent score.
>
> To give an example, I have just the names of 5 businesses indexed using
> the standard analyzer. (The same happens with thousands of records
> indexed by many fields but I've simplified for this example.) One of
> those businesses is called "ABC Master Building Designers". When I do a
> query for "building" I get "ABC Master Building Designers" as the top
> result, but with the following explanation (via code I added to
> acts_as_ferret for debugging):
>
> QUERY: id:building name:building
>
> EXPLANATION of building:
> 8.438619e-42 = product of:
>   1.687724e-41 = weight(name:building in 3), product of:
>     0.6125279 = query_weight(name:building), product of:
>       2.386294 = idf(doc_freq=1)
>       0.2566858 = query_norm
>     2.755373e-41 = field_weight(name:building in 3), product of:
>       1.0 = tf(term_freq(name:building)=1)
>       2.386294 = idf(doc_freq=1)
>       1.15467e-41 = field_norm(field=name, doc=3)
>   0.5 = coord(1/2)
>
> Note the tiny value of field_norm, which is throwing the whole score out.
> The net result is that the records aren't differentiated by much, so
> the ordering of the results rarely makes much sense. I sometimes get
> restaurants in the search results!
>
> I haven't used any boost or anything on the name field. My Business
> class calls AaF like this:
>
> class Business < ActiveRecord::Base
>
>   acts_as_ferret(
>     :fields => { :name => { } },
>     :or_default => true
>   )
>
>   ...
>
> end
>
> Does anyone have any ideas as to what might be causing this? Any help
> would be greatly appreciated.

Hi Pete,

Are you on a Mac by any chance? There are problems with the scoring on OS X but I'm not sure why.
Cheers,
Dave

From dbalmain.ml at gmail.com Mon Oct 16 03:03:04 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 16:03:04 +0900
Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem
In-Reply-To: <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com>
References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com>
Message-ID:

On 10/16/06, Brent Salo wrote:
> Hello Ilya and Dave,
> Have you figured out what is causing this issue?
> It is happening for me with acts_as_ferret and rmagick.
> My rails app works great on some windows boxes, but on my 2003 server it
> gives me the nondescript EBADF error whenever I hit any controller
> functions that use ferret or rmagick.
> Initially it would not load on any of my windows boxes, but the quick
> workaround of modifying base.rb to replace tabs with spaces worked.
> Unfortunately, however, the app is still crapping out whenever I hit
> rmagick or ferret functions on the 2003 box.
> Any updates?
> Thanks
> Brent

Afraid not. But it is interesting that the problem is not consistent across all Windows machines. I still haven't been able to reproduce the problem here.

From dbalmain.ml at gmail.com Mon Oct 16 03:23:28 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 16:23:28 +0900
Subject: [Ferret-talk] How can I do my own search limits?
In-Reply-To: <873164f696bd71734fa95fba4be5eac4@ruby-forum.com>
References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> <873164f696bd71734fa95fba4be5eac4@ruby-forum.com>
Message-ID:

On 10/16/06, Charlie Hubbard wrote:
> David Balmain wrote:
>
> > Well the user would just type their query as usual but you'd write the
> > query something like:
> >
> > Books.find("pages match '#{query}'", :limit => 10)
> >
> > Or something like that.
I haven't worked out the details yet. And you
> > would be able to specify whether you wanted lazy or eager loading too.
>
> That's what I guessed you'd have to do: change the query language to
> support this concept. I was actually working on adding a new method to
> acts_as_ferret where you could pass these association matches in like:
>
> Book.find_by_association( query, [:pages], { :limit => 20 } )
>
> I can't change the query language, but this way I could express the same
> sort of behavior. This would result in a multi_index query across the Book
> and Page indexes. But tracking total_hits and paging just don't work
> with this approach. The only option you have is to iterate over all the
> matches.
>
> When we do ferret queries, does ferret actually go over the entire search
> space to calculate all the possible documents that matched the query,
> and then just return the ones within the offset and limits?

Yes, that's exactly how it works.

> If that's the case then it's doable to create this type of search, but
> it would make more sense to modify ferret to support this type of query.

I don't see a way to add this feature cleanly. It is just as easy for you to iterate through all the results yourself. Besides, you still haven't explained why you can't add all Pages to each Book document. As I said, the field length limit isn't an issue. This would be the best way to solve this problem.

> I'm interested in your database approach. It could help simplify this
> problem. It seems doable to add this to acts_as_ferret without needing
> a separate project. Not to mention it's really needed in Rails apps as
> well.
>

In my suggested database approach the search would be the equivalent of a simple SQL join query. By adding a feature like this to acts_as_ferret you'll need to pull all the matching page ids out of the index and perform a much slower SQL query for all books that include those page ids.
I'm not sure it is feasible but I'll leave that decision to the acts_as_ferret developers. The best solution is definitely to index all the pages with the book document, even if it means indexing each page twice.

Cheers,
Dave

From samuelgiffney at gmail.com Mon Oct 16 04:37:33 2006
From: samuelgiffney at gmail.com (Sam)
Date: Mon, 16 Oct 2006 10:37:33 +0200
Subject: [Ferret-talk] Ruby Hacker Interview: Dave Balmain
Message-ID:

Guess Dave wasn't going to blow his own trumpet so someone else has to do it for him. An interesting interview with Dave that filled in the gaps on the About Me page I had asked about...

http://on-ruby.blogspot.com/2006/10/ruby-hacker-interview-dave-balmain.html

--
Posted via http://www.ruby-forum.com/.

From howardmoon at hitcity.com.au Mon Oct 16 05:49:30 2006
From: howardmoon at hitcity.com.au (Peter Royle)
Date: Mon, 16 Oct 2006 11:49:30 +0200
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To:
References:
Message-ID:

> Hi Pete,
>
> Are you on a Mac by any chance? There are problems with the scoring on
> OS X but I'm not sure why.
>
> Cheers,
> Dave

Hi Dave.

Yes, I am! I've deployed on my Linux box and reindexed and everything seems to be going fine. Thanks for the tip.

Johnny, does this solve it for you too?

Pete.

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Mon Oct 16 06:03:29 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 16 Oct 2006 12:03:29 +0200
Subject: [Ferret-talk] How can I do my own search limits?
In-Reply-To:
References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> <873164f696bd71734fa95fba4be5eac4@ruby-forum.com>
Message-ID: <20061016100329.GF14271@cordoba.webit.de>

Hi!

On Mon, Oct 16, 2006 at 04:23:28PM +0900, David Balmain wrote:
> On 10/16/06, Charlie Hubbard wrote:
[..]
> > I'm interested in your database approach. It could help simplify this
> > problem.
It seems doable to add this to acts_as_ferret without needing
> > a separate project. Not to mention it's really needed in Rails apps as
> > well.
> >
>
> In my suggested database approach the search would be the equivalent
> of a simple SQL join query. By adding a feature like this to
> acts_as_ferret you'll need to pull all the matching page ids out of
> the index and perform a much slower SQL query for all books that
> include those page ids. I'm not sure it is feasible but I'll leave
> that decision to the acts_as_ferret developers. The best solution is
> definitely to index all the pages with the book document, even if it
> means indexing each page twice.

I'd suggest going that route, too.

An imho interesting question around this is, how much the size of the value for that pages field containing all pages of a book really would influence the total index size (when not storing the contents and not storing term vectors), i.e. will the index size grow in a linear way, or will it grow slower over time, as with bigger size of the value of a field more terms occur more than once?

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From kraemer at webit.de Mon Oct 16 06:08:13 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 16 Oct 2006 12:08:13 +0200
Subject: [Ferret-talk] seg faults and problems with new version
In-Reply-To:
References:
Message-ID: <20061016100813.GG14271@cordoba.webit.de>

On Mon, Oct 16, 2006 at 02:16:59AM -0400, Jordan Frank wrote:
> Hi all,
> first off, it's 2AM and I'm not thinking properly, so please
> forgive me if this one's easy, but I just need to get this going.
>
> First problem, using 0.9.6 on all of our development machines,
> works great, then we move it to a server running x86_64 linux and it
> segfaults as soon as it tries to create an Index.
I've tried
> rebuilding with different optimization flags, to no avail. So I
> figured, hey, let's upgrade to 0.10.x and see how that goes. So I
> updated to the latest gem of ferret, and switched over to the stable
> tagged branch of acts_as_ferret, but now I get the following:
>
> >> Person.rebuild_index
> NoMethodError: undefined method `exists?' for {:index=>:yes,
> :term_vector=>:no, :store=>:no, :boost=>1.0}:Hash
> from /usr/lib/ruby/site_ruby/1.8/ferret/index/field_infos.rb:20:in

that field_infos.rb seems to belong to an older version of Ferret. Mine (0.10.11) doesn't call exists? anywhere.

I think I've seen that problem before. You should check where your gem install command installed ferret to, and whether you have an older version lying around in /usr/lib/ruby/site_ruby (that location doesn't look like a gem repository anyway).

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From cussen at gmail.com Mon Oct 16 06:24:10 2006
From: cussen at gmail.com (Johnny Cussen)
Date: Mon, 16 Oct 2006 20:24:10 +1000
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To:
References:
Message-ID: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com>

Yep. Weird huh.

On 16/10/2006, at 7:49 PM, Peter Royle wrote:

>> Hi Pete,
>>
>> Are you on a Mac by any chance? There are problems with the scoring on
>> OS X but I'm not sure why.
>>
>> Cheers,
>> Dave
>
> Hi Dave.
>
> Yes, I am! I've deployed on my Linux box and reindexed and everything
> seems to be going fine. Thanks for the tip.
>
> Johnny, does this solve it for you too?
>
> Pete.
>
> --
> Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Mon Oct 16 06:45:35 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 19:45:35 +0900
Subject: [Ferret-talk] How can I do my own search limits?
In-Reply-To: <20061016100329.GF14271@cordoba.webit.de>
References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> <873164f696bd71734fa95fba4be5eac4@ruby-forum.com> <20061016100329.GF14271@cordoba.webit.de>
Message-ID:

On 10/16/06, Jens Kraemer wrote:
> Hi!
>
> On Mon, Oct 16, 2006 at 04:23:28PM +0900, David Balmain wrote:
> > On 10/16/06, Charlie Hubbard wrote:
> [..]
> > > I'm interested in your database approach. It could help simplify this
> > > problem. It seems doable to add this to acts_as_ferret without needing
> > > a separate project. Not to mention it's really needed in Rails apps as
> > > well.
> > >
> >
> > In my suggested database approach the search would be the equivalent
> > of a simple SQL join query. By adding a feature like this to
> > acts_as_ferret you'll need to pull all the matching page ids out of
> > the index and perform a much slower SQL query for all books that
> > include those page ids. I'm not sure it is feasible but I'll leave
> > that decision to the acts_as_ferret developers. The best solution is
> > definitely to index all the pages with the book document, even if it
> > means indexing each page twice.
>
> I'd suggest going that route, too.
>
> An imho interesting question around this is, how much the size of the
> value for that pages field containing all pages of a book really would
> influence the total index size (when not storing the contents and not
> storing term vectors), i.e.
will the index size grow in a linear way, or
> will it grow slower over time, as with bigger size of the value of a
> field more terms occur more than once?
>
> Jens

That is an interesting question. I haven't done any tests to back this up but I would guess you are correct. Indexing the content as a single field in Book will take up a lot less space than it would if separated into multiple documents as pages. So indexing the field twice as I suggested shouldn't double the size of your index. In fact, if you give the fields the same name (i.e. :content for both Page and Book) then the increase in index size will be negligible. There will however be a noticeable difference in indexing time but again, it shouldn't be double. As far as search goes this solution will probably be orders of magnitude better.

Dave

From dbalmain.ml at gmail.com Mon Oct 16 07:00:06 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 20:00:06 +0900
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com>
References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com>
Message-ID:

On 10/16/06, Johnny Cussen wrote:
> Yep. Weird huh.

Not as weird as you might think. OS X (and other *BSD based systems) have a different endianness to Windows and Linux. Unfortunately I don't have a Mac to test on. I'm waiting for someone to donate a Mac or enough money for me to buy one. ;-)

Alternatively, I'm sure I could fix the problem if someone could offer me an ssh login to an OS X server. Or better yet, someone could send me a patch. If any Mac users are reading this and they'd like to have a go at fixing this themselves, the problem has something to do with the way floats are compressed into bytes in c/src/helper.c. The C unit tests probably won't pass so if you can fix them the problem should be fixed. Let me know if anyone wants to have a go at fixing this.
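To make the float-to-byte compression mentioned above concrete, here is a minimal sketch of a byte-order-independent round-trip check. It assumes the Lucene-style norm encoding (3 mantissa bits, 5 exponent bits), since Ferret is a Lucene port; the actual code in c/src/helper.c may differ, and the method names float_to_byte and byte_to_float are illustrative only. Byte order is a property of the CPU (PowerPC Macs of that era are big-endian), so the sketch goes through pack('g') and unpack('N'), which always use big-endian order no matter where the script runs.

```ruby
# Assumed Lucene-style encoding; not taken from Ferret's helper.c.
# Keeps the exponent and the top 3 mantissa bits of a single-precision float.
def float_to_byte(f)
  return 0 if f <= 0.0
  bits  = [f].pack('g').unpack('N').first   # IEEE 754 single, big-endian bits
  small = bits >> 21                        # drop the low 21 mantissa bits
  return 1   if small <= (48 << 3)          # positive but too small to encode
  return 255 if small >= (48 << 3) + 0x100  # too large to encode
  small - (48 << 3)
end

def byte_to_float(b)
  return 0.0 if b == 0
  bits = ((b & 0xff) << 21) + (48 << 24)
  [bits].pack('N').unpack('g').first
end

# Every byte value should survive byte -> float -> byte unchanged.
mismatches = (0..255).reject { |b| float_to_byte(byte_to_float(b)) == b }
puts "bytes that do not survive the round trip: #{mismatches.inspect}"
```

A C implementation that reinterprets the float's bytes in memory order instead of working on its integer bit pattern would be one way to get platform-dependent results, although the thread does not confirm that this is the actual bug.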
Cheers,
Dave

From marvin at rectangular.com Mon Oct 16 08:24:30 2006
From: marvin at rectangular.com (Marvin Humphrey)
Date: Mon, 16 Oct 2006 05:24:30 -0700
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To:
References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com>
Message-ID: <46F52197-FADA-4215-B469-954266E62108@rectangular.com>

On Oct 16, 2006, at 4:00 AM, David Balmain wrote:

> If any Mac users
> are reading this and they'd like to have a go at fixing this
> themselves, the problem has something to do with the way floats are
> compressed into bytes in c/src/helper.c. The C unit tests probably
> won't pass so if you can fix them the problem should be fixed. Let me
> know if anyone wants to have a go at fixing this.

/me raises hand.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

From dbalmain.ml at gmail.com Mon Oct 16 08:51:12 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Mon, 16 Oct 2006 21:51:12 +0900
Subject: [Ferret-talk] Very small scores for search results
In-Reply-To: <46F52197-FADA-4215-B469-954266E62108@rectangular.com>
References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> <46F52197-FADA-4215-B469-954266E62108@rectangular.com>
Message-ID:

On 10/16/06, Marvin Humphrey wrote:
>
> On Oct 16, 2006, at 4:00 AM, David Balmain wrote:
>
> > If any Mac users
> > are reading this and they'd like to have a go at fixing this
> > themselves, the problem has something to do with the way floats are
> > compressed into bytes in c/src/helper.c. The C unit tests probably
> > won't pass so if you can fix them the problem should be fixed. Let me
> > know if anyone wants to have a go at fixing this.
>
> /me raises hand.
>
> Marvin Humphrey
> Rectangular Research
> http://www.rectangular.com/

You have a Mac, Marvin? I didn't realize. I'm guessing you already know how to run the C unit tests. Let me know if there is anything else I can do to help.
Dave From marvin at rectangular.com Mon Oct 16 09:16:46 2006 From: marvin at rectangular.com (Marvin Humphrey) Date: Mon, 16 Oct 2006 06:16:46 -0700 Subject: [Ferret-talk] Very small scores for search results In-Reply-To: References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> Message-ID: <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> On Oct 16, 2006, at 4:00 AM, David Balmain wrote: > The C unit tests probably > won't pass so if you can fix them the problem should be fixed. Test output for subversion repository revision 653 on my G4 PowerBook below... I don't see anything failing specifically relating to how Similarity encodes/decodes norms. Is there a test for that? Have a look at... The important test is the one that just takes 0 .. 255, transforms those to 256 floats, transforms them back again and checks that we get 0 .. 255. You have something like that? /me investigates ... Ah, don't see something like that in test_similarity.c Probably I can add that and send you a patch. Think that's the right direction, based on the test results? PS: I have a Mac Mini that just sits there as a backup in case the PowerBook has to go into the shop. If you write a script that emails you results in case of test failures, I can set up a cron to do nightly smokes. Marvin Humphrey Rectangular Research http://www.rectangular.com/ slothbear:~/projects/ferret010/ruby marvin$ ruby setup.rb test Running tests... Loading once Loaded suite test Started ........................................................................ ...........................................FF.............F............. ............... Finished in 8.037183 seconds. 
1) Failure: test_sorts(SearchAndSortTest) [./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_search_and_sort.rb:40:in `do_test_top_docs' ./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_search_and_sort.rb:39:in `times' ./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_search_and_sort.rb:39:in `do_test_top_docs' ./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_search_and_sort.rb:113:in `test_sorts']: <8> expected but was <1>. 2) Failure: test_boolean_query(SearcherTest) [./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_index_searcher.rb:39:in `check_hits' ./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tm_searcher.rb:98:in `test_boolean_query']: <14> expected but was <2>. 3) Failure: test_boolean_query(SimpleMultiSearcherTest) [./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tc_index_searcher.rb:39:in `check_hits' ./test/unit/../unit/analysis/../../unit/index/../../unit/ query_parser/../../unit/search/tm_searcher.rb:98:in `test_boolean_query']: <14> expected but was <2>. 159 tests, 11469 assertions, 3 failures, 0 errors slothbear:~/projects/ferret010/ruby marvin$ From none at none.com Mon Oct 16 09:19:55 2006 From: none at none.com (poipu) Date: Mon, 16 Oct 2006 15:19:55 +0200 Subject: [Ferret-talk] acts_as_ferret: can i specify a search on 1 field as suppose Message-ID: to the ones i defined in my model? for example if in the model i specify acts_as_ferret to index only column 1, 2, and 3 in my table....how can i perform a search just for column 1 if need be. for example, id like to give the user the ability to just search on title name vs description, etc... thanks! -- Posted via http://www.ruby-forum.com/. 
From brent.salo at scgcanada.com Mon Oct 16 09:42:13 2006 From: brent.salo at scgcanada.com (Guest) Date: Mon, 16 Oct 2006 15:42:13 +0200 Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem In-Reply-To: References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> Message-ID: <00bdd2f88d162d0e77b50a3b8e92c441@ruby-forum.com> Ok thanks... On that note, does anyone have a quick method of removing all tabs and unwelcome characters from a rails app? David Balmain wrote: > On 10/16/06, Brent Salo wrote: >> Any updates? >> Thanks >> Brent > > Afraid not. But it is interesting that the problem is not consistent > across all Windows machines. I still haven't been able to reproduce > the problem here. -- Posted via http://www.ruby-forum.com/. From anrake at gmail.com Mon Oct 16 09:42:31 2006 From: anrake at gmail.com (anrake o.) Date: Mon, 16 Oct 2006 15:42:31 +0200 Subject: [Ferret-talk] Experience with ferret on Dreamhost ? In-Reply-To: References: Message-ID: I think I figured out this problem. All you had to do was add this line to the top of environment.rb ENV['GEM_PATH'] = '/home/USERNAME/.gems' + ':/usr/lib/ruby/gems/1.8' The DH wiki says to put the following, but it didn't seem to work. ENV['GEM_PATH'] = File.expand_path('~/.gems') + ':/usr/lib/ruby/gems/1.8' I actually created a new test project in dev. mode and saw an error somewhere like "couldn't expand ~" or "unknown command expand_path" or something like that and just put in the absolute path as above on a whim. Then it worked fine. anrake wrote: > Hi, I am experiencing exactly the same phenomenon. > Everything works fine on my powerbook, but not on DH. I changed > bash_profile to add my local .gems directory and installed ferret with > no apparent problems.
I added a line to environment.rb as instructed in > the wiki but still get the same problems when I try to deploy my new > site (via Capistrano). Likewise I think dispatch.fcgi is not starting. > > Any ideas? > > Chris Lowis wrote: >> Does anybody have experience with running ferret on dreamhost ? >> >> My app is running ok until I install the acts_as_ferret plugin, at which >> point I get "Rails application failed to start properly" errors. I've >> used script/console to confirm that I can require 'ferret' and make a >> new Index object . Everything appears to be ok in that respect. >> Unfortunately there is nothing logged in these circumstances, except : >> >> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: comm >> with (dynamic) server >> "/home/c_lowis/residence-review.com/public/dispatch.fcgi" aborted: >> (first read) idle timeout (120 sec) >> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: >> incomplete headers (0 bytes) received from server >> "/home/c_lowis/residence-review.com/public/dispatch.fcgi" >> >> in the "apache" type logs that dreamhost gives me . Through trial and >> error I am fairly sure it is ferret that is causing this, as when I >> remove the plugin the site works ok. >> >> I am using ferret 0.9.5 . As far as I can see dispatch.fcgi is not >> starting. >> >> Would appreciate any comments, >> >> Chris -- Posted via http://www.ruby-forum.com/. 
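A note on the GEM_PATH fix reported above: a likely explanation, offered here as an assumption rather than something confirmed in the thread, is that HOME is not set in the FastCGI environment, so File.expand_path('~/.gems') has nothing to expand. A minimal sketch that builds the same GEM_PATH value but falls back to an absolute home directory when HOME is missing. The helper name gem_path and the '/home/USERNAME' placeholder are hypothetical; the system gem directory is the one quoted in the post.

```ruby
# Hypothetical helper: build a GEM_PATH string without relying on "~" expansion.
# home_dir      - value of ENV['HOME'], which may be nil or empty under FastCGI (assumption)
# fallback_home - absolute home directory to use when HOME is not available
def gem_path(home_dir, fallback_home, system_gems = "/usr/lib/ruby/gems/1.8")
  home = (home_dir.nil? || home_dir.empty?) ? fallback_home : home_dir
  "#{File.join(home, '.gems')}:#{system_gems}"
end

# At the top of environment.rb this could be used as:
#   ENV['GEM_PATH'] = gem_path(ENV['HOME'], '/home/USERNAME')
puts gem_path(nil, "/home/USERNAME")
```

With HOME unset this yields the same absolute-path value that worked in the post; with HOME set it behaves like the wiki's version.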
From shammond at patientslikeme.com Mon Oct 16 10:02:32 2006 From: shammond at patientslikeme.com (Steven Hammond) Date: Mon, 16 Oct 2006 10:02:32 -0400 Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem In-Reply-To: <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> Message-ID: <64AFD52B-3981-4390-920C-93710CAA4070@patientslikeme.com> Can you post your changes to base.rb? I'd like to give that a try. Thanks, Steve On Oct 15, 2006, at 8:07 PM, Brent Salo wrote: > Hello Ilya and Dave, > Have you figured out what is causing this issue? > It is happening for me with acts_as_ferret and rmagick. > My rails app works great on some windows boxes, but on my 2003 > server it > gives me the nondecript EBADF error whenever I hit any controller > functions that use ferret or rmagick. > Initially it would not load on any of my windows boxes, but the quick > workaround of modifying base.rb to replace tabs with spaces worked. > Unfortunately however the app is still crapping out whenever I hit > rmagick or ferret functions on the 2003 box. > Any updates? > Thanks > Brent > > -- > Posted via http://www.ruby-forum.com/. 
From dbalmain.ml at gmail.com Mon Oct 16 10:16:03 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 16 Oct 2006 23:16:03 +0900 Subject: [Ferret-talk] Very small scores for search results In-Reply-To: <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> Message-ID: On 10/16/06, Marvin Humphrey wrote: > > On Oct 16, 2006, at 4:00 AM, David Balmain wrote: > > > The C unit tests probably > > won't pass so if you can fix them the problem should be fixed. > > Test output for subversion repository revision 653 on my G4 PowerBook > below... Hmmm. The failures are related to float/byte encoding as they are occurring because the scoring is wrong. But the float/byte conversion test doesn't seem to be failing. > I don't see anything failing specifically relating to how Similarity > encodes/decodes norms. Is there a test for that? Have a look at... > > > > The important test is the one that just takes 0 .. 255, transforms > those to 256 floats, transforms them back again and checks that we > get 0 .. 255. You have something like that? > > /me investigates ... > > Ah, don't see something like that in test_similarity.c Probably I > can add that and send you a patch. Think that's the right direction, > based on the test results? Yeah, it is in test/test_helper.c. I guess I should put a comment in test_similarity about that since that is where most people would expect to find such a test. Anyway, since it is passing the error must be occurring somewhere else. I can't think why though as it definitely seems to have something to do with the norms. > PS: I have a Mac Mini that just sits there as a backup in case the > PowerBook has to go into the shop.
If you write a script that emails > you results in case of test failures, I can set up a cron to do > nightly smokes. > test results That'd be great thanks. I'll probably take a while to get around to it. Anyway, don't spend too much time on this. I think it is better for both of us if you concentrate on Lucy. I've seen a lot of action recently on the commits list. :D Cheers, Dave From dbalmain.ml at gmail.com Mon Oct 16 10:18:41 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 16 Oct 2006 23:18:41 +0900 Subject: [Ferret-talk] acts_as_ferret: can i specify a search on 1 field as suppose In-Reply-To: References: Message-ID: On 10/16/06, poipu wrote: > to the ones i defined in my model? for example if in the model i specify > acts_as_ferret to index only column 1, 2, and 3 in my table....how can i > perform a search just for column 1 if need be. > > for example, id like to give the user the ability to just search on > title name vs description, etc... > > > thanks! You prepend the search with the field-name and a colon. For example: title:"War and Peace" author:(Leo Tolstoy) Hope that helps, Dave From charlie.hubbard at gmail.com Mon Oct 16 10:20:18 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Mon, 16 Oct 2006 16:20:18 +0200 Subject: [Ferret-talk] How can I do my own search limits? In-Reply-To: References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> <873164f696bd71734fa95fba4be5eac4@ruby-forum.com> Message-ID: <63bb73430a267e70943677e89c222972@ruby-forum.com> David Balmain wrote: >> If that's the case then it's doable to create this type of search, but >> it would make more sense to modify ferret to support this type of query. > > I don't see a way to add this feature cleanly. It is just as easy for > you to do iterate through all the results yourself. Besides, you still > haven't explained why you can't add all Pages to each Book document? 
> As I said, the field length limit isn't an issue. This would be the > best way to solve this problem. There is no reason why I couldn't. I was just trying to figure out a way to avoid it. The big drawback to indexing all the pages onto a single field in book would mean I'd have to pick a size of the field up front that could be the maximum. I don't have a lot of data yet, but I tried running some tests. A 94 chapter book it's somewhere around of 100,000. But that's a smaller book. It's just something you have to watch closely which I was trying to avoid is all. Right now your right the best approach is to store it twice. > In my suggested database approach the search would be the equivalent > of a simple SQL join query. By adding a feature like this to > acts_as_ferret you'll need to pull all the matching page ids out of > the index and peform a much slower SQL query for all books that > include those page ids. I'm not sure it is feasible but I'll leave > that decision to the acts_as_ferret developers. The best solution is > definitely to index all the pages with the book document, even if it > means indexing each page twice. I was thinking it would be more like a SQL union. In other words the query didn't have to match the Book document in order to be included. It just had to match the Page object to be included. For example, say I have a book title of Lucene in Action, but you'd expect a query "java" would pull that one back. Java is probably mentioned in the text of that book. I sort of saw it as a multi_index query, since aaf maps the objects that way, where you'd first query Book Documents, then query the Page documents. Instead of adding those Page documents to the resulting array. They would only add a new entry if there was a Book not already there. I suppose I could do that in Ruby, but it just seems like it might be more optimized if ferret understood this type of relationship since it is already iterating over this already. 
-- Posted via http://www.ruby-forum.com/. From fastjames at gmail.com Mon Oct 16 10:40:59 2006 From: fastjames at gmail.com (Jim Kane) Date: Mon, 16 Oct 2006 16:40:59 +0200 Subject: [Ferret-talk] Ferret::QueryParser::QueryParseException Message-ID: <0bf798aa2a169dfba412cf502e9b2100@ruby-forum.com> During our last week of Ferret / aaf usage (also our first week of Ferret / aaf usage), I have received 8 messages stating that our app encountered a Ferret::QueryParser::QueryParseException. For instance: A Ferret::QueryParser::QueryParseException occurred in foo#search: Error occurred in src/q_parser.y:279 - yyerror couldn't parse query "com -- 404". Error message was syntax error /where/gems/are/stored/gems/ferret-0.10.11/lib/ferret/index.rb:709 in 'parse' (thanks to the excellent exception_notification plugin for what you see above) So I did a little research into the Exception, which only left me with more questions. According to the RDoc for Ferret, the Ferret::QueryParser#new method has a :clean_string option that should escape any special characters, and by default it's on. I emailed the author of aaf and he said that he doesn't do anything with this flag, so it should still be on. In that case, why am I still seeing this exception? Jim -- Posted via http://www.ruby-forum.com/. From waspfactory at gmail.com Mon Oct 16 11:25:57 2006 From: waspfactory at gmail.com (Caspar Bl) Date: Mon, 16 Oct 2006 17:25:57 +0200 Subject: [Ferret-talk] Sorting by score Message-ID: <85bfb43ed70f65124de978016e63e3a8@ruby-forum.com> Hi I think this is a very easy question but here goes: I want to sort my results by a boolean field and then by score, I thought this would be a default configuration but apparently not. sort_fields = [] sort_fields << Ferret::Search::SortField.new(:sponsored, :reverse => :true) that is my current code, how do iu alter it so that the results are then sorted by highest score first? thanks very much. regards caspar -- Posted via http://www.ruby-forum.com/. 
From brent.salo at scgcanada.com Mon Oct 16 11:31:14 2006 From: brent.salo at scgcanada.com (Guest) Date: Mon, 16 Oct 2006 17:31:14 +0200 Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem In-Reply-To: <64AFD52B-3981-4390-920C-93710CAA4070@patientslikeme.com> References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> <64AFD52B-3981-4390-920C-93710CAA4070@patientslikeme.com> Message-ID: <71d964d68f25c0e449bb423ebfaccbc3@ruby-forum.com> Steven Hammond wrote: > Can you post your changes to base.rb? I'd like to give that a try. > > Thanks, > Steve sure: actionpack-1.12.1/lib/action_view/base.rb modify compile_template to begin with template=template.gsub(/\t/," ") It completely fixed the issue on one of my win systems, but on another one of them (both 2003), it doesnt work. This may cause many other issues than it fixes however so I would label it a hack for now. I found this fix online somewhere, but I can't find it anymore. Good luck. Brent -- Posted via http://www.ruby-forum.com/. From none at none.com Mon Oct 16 11:32:31 2006 From: none at none.com (poipu) Date: Mon, 16 Oct 2006 17:32:31 +0200 Subject: [Ferret-talk] acts_as_ferret: can i specify a search on 1 field as sup In-Reply-To: References: Message-ID: hi dave, thanks for the reply. so just to clarify for myself....if in my view i have a category textbox and a query text box,,,all i need to do is model.find_by_contents(params[:category]+":"+params[:query]) ? -- Posted via http://www.ruby-forum.com/. 
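A minimal sketch of building the field-restricted query string discussed in this thread, following the title:"War and Peace" author:(Leo Tolstoy) syntax given earlier. The helper name field_query and the ALLOWED_FIELDS list are hypothetical; the resulting string would still be passed to find_by_contents as in the post above. Checking the field name against a whitelist is a suggestion here, since the field comes straight from user input.

```ruby
# Hypothetical whitelist: only fields that are actually indexed should be
# accepted from user-supplied params.
ALLOWED_FIELDS = %w[title description].freeze

# Build "field:(query)" so that the whole query, not just its first term,
# is grouped under the field prefix.
def field_query(field, query)
  field = field.to_s
  raise ArgumentError, "unknown field: #{field}" unless ALLOWED_FIELDS.include?(field)
  "#{field}:(#{query})"
end

puts field_query("title", "ruby AND rails")  # prints title:(ruby AND rails)
```

In a controller this might be called as field_query(params[:category], params[:query]), assuming the category textbox holds one of the whitelisted field names.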
From epetrie at tribune.com Mon Oct 16 11:41:23 2006 From: epetrie at tribune.com (Evan) Date: Mon, 16 Oct 2006 17:41:23 +0200 Subject: [Ferret-talk] Dynamic fields and inheritance In-Reply-To: <20061011113738.GC9323@cordoba.webit.de> References: <20061011113738.GC9323@cordoba.webit.de> Message-ID: <9f661711f29c79332af4de6db280ff03@ruby-forum.com> Jens Kraemer wrote: > On Wed, Oct 11, 2006 at 01:11:32AM +0200, Evan wrote: > Don't call acts_as_ferret in your base class, instead add the :name > field to the acts_as_ferret calls in Music and Book. That should fix > your problems. I assume that I will be unable to call Product.find_by_contents in this case. So, in order to do search of all products I would have to do a multi-index search? -- Posted via http://www.ruby-forum.com/. From jordan.w.frank at gmail.com Mon Oct 16 11:58:24 2006 From: jordan.w.frank at gmail.com (Jordan Frank) Date: Mon, 16 Oct 2006 11:58:24 -0400 Subject: [Ferret-talk] seg faults and problems with new version In-Reply-To: <20061016100813.GG14271@cordoba.webit.de> References: <20061016100813.GG14271@cordoba.webit.de> Message-ID: Yeah, it definitely was some old version kicking around. I deleted that, got the latest and greatest version of Ferret and the aaf plugin, and it all seems to work great now. Time to put it through the irons and see if the new version is as stable on x86_64 as the old version was on plain old 32-bit Linux. Thanks for the speedy response, you guys saved the day as usual. Cheers, Jordan Frank jordan.w.frank at gmail.com On 10/16/06, Jens Kraemer wrote: > On Mon, Oct 16, 2006 at 02:16:59AM -0400, Jordan Frank wrote: > > Hi all, > > first off, it's 2AM and I'm not thinking properly, so please > > forgive me if this one's easy, but I just need to get this going. > > > > First problem, using 0.9.6 on all of our development machines, > > works great, then we move it to a server running x86_64 linux and it > > segfaults as soon as it tries to create an Index. 
I've tried > > rebuilding with different optimization flags, to no avail. So I > > figured, hey, let's upgrade to 0.10.x and see how that goes. So I > > updated to the latest gem of ferret, and switched over to the stable > > tagged branch of acts_as_ferret, but now I get the following: > > > > >> Person.rebuild_index > > NoMethodError: undefined method `exists?' for {:index=>:yes, > > :term_vector=>:no, :store=>:no, :boost=>1.0}:Hash > > from /usr/lib/ruby/site_ruby/1.8/ferret/index/field_infos.rb:20:in > > that field_infos.rb seems to belong to an older version of Ferret. Mine > (0.10.11) doesn't call exists? anywhere. > > I think I've seen that problem before, you should check where your gem > install command installed ferret to, and if you have lying around an > older version in /usr/lib/ruby/site_ruby (that location doesn't look > like a gem repository anyway). > > Jens > > > -- > webit! Gesellschaft für neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de > Schnorrstraße 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 From dbalmain.ml at gmail.com Mon Oct 16 12:05:07 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 01:05:07 +0900 Subject: [Ferret-talk] Ferret::QueryParser::QueryParseException In-Reply-To: <0bf798aa2a169dfba412cf502e9b2100@ruby-forum.com> References: <0bf798aa2a169dfba412cf502e9b2100@ruby-forum.com> Message-ID: On 10/16/06, Jim Kane wrote: > During our last week of Ferret / aaf usage (also our first week of > Ferret / aaf usage), I have received 8 messages stating that our app > encountered a Ferret::QueryParser::QueryParseException. For instance: > > A Ferret::QueryParser::QueryParseException occurred in foo#search: > > Error occurred in src/q_parser.y:279 - yyerror > couldn't parse query "com -- 404".
Error message was syntax error > > /where/gems/are/stored/gems/ferret-0.10.11/lib/ferret/index.rb:709 in > 'parse' > > (thanks to the excellent exception_notification plugin for what you see > above) > > So I did a little research into the Exception, which only left me with > more questions. According to the RDoc for Ferret, the > Ferret::QueryParser#new method has a :clean_string option that should > escape any special characters, and by default it's on. I emailed the > author of aaf and he said that he doesn't do anything with this flag, so > it should still be on. In that case, why am I still seeing this > exception? > > Jim clean_string isn't perfect. It escapes special characters within phrases (except for '<>' and '|' which have special meaning within phrases). It also tries to match up quotes and brackets. If it still can't parse the query then it will raise an exception unless you set :handle_exception to true in which case the exception will be ignored and the query will be parsed as a simple boolean query and all special characters will be ignored. I hope that makes sense. If you have any suggestions I'd be happy to hear them. Cheers, Dave From dbalmain.ml at gmail.com Mon Oct 16 12:13:51 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 01:13:51 +0900 Subject: [Ferret-talk] How can I do my own search limits? In-Reply-To: <63bb73430a267e70943677e89c222972@ruby-forum.com> References: <7a422fdf109869f7a261c607c1448539@ruby-forum.com> <424e0115f98cf0a317553bb92ee90f52@ruby-forum.com> <873164f696bd71734fa95fba4be5eac4@ruby-forum.com> <63bb73430a267e70943677e89c222972@ruby-forum.com> Message-ID: On 10/16/06, Charlie Hubbard wrote: > David Balmain wrote: > > >> If that's the case then it's doable to create this type of search, but > >> it would make more sense to modify ferret to support this type of query. > > > > I don't see a way to add this feature cleanly. 
It is just as easy for > > you to do iterate through all the results yourself. Besides, you still > > haven't explained why you can't add all Pages to each Book document? > > As I said, the field length limit isn't an issue. This would be the > > best way to solve this problem. > > There is no reason why I couldn't. I was just trying to figure out a > way to avoid it. The big drawback to indexing all the pages onto a > single field in book would mean I'd have to pick a size of the field up > front that could be the maximum. I don't have a lot of data yet, but I > tried running some tests. A 94 chapter book it's somewhere around of > 100,000. But that's a smaller book. It's just something you have to > watch closely which I was trying to avoid is all. Right now your right > the best approach is to store it twice. Set it to Ferret::FIX_INT_MAX. This is the largest number that you set any of the properties too and effectively sets no limit to the field length. I'll add :all as an option at some point. > > In my suggested database approach the search would be the equivalent > > of a simple SQL join query. By adding a feature like this to > > acts_as_ferret you'll need to pull all the matching page ids out of > > the index and peform a much slower SQL query for all books that > > include those page ids. I'm not sure it is feasible but I'll leave > > that decision to the acts_as_ferret developers. The best solution is > > definitely to index all the pages with the book document, even if it > > means indexing each page twice. > > I was thinking it would be more like a SQL union. In other words the > query didn't have to match the Book document in order to be included. > It just had to match the Page object to be included. For example, say I > have a book title of Lucene in Action, but you'd expect a query "java" > would pull that one back. Java is probably mentioned in the text of > that book. 
I sort of saw it as a multi_index query, since aaf maps the > objects that way, where you'd first query Book Documents, then query the > Page documents. Instead of adding those Page documents to the resulting > array. They would only add a new entry if there was a Book not already > there. I suppose I could do that in Ruby, but it just seems like it > might be more optimized if ferret understood this type of relationship > since it is already iterating over this already. Trust me, Ferret is complex enough as it is without having to understand relationships between different documents. I need to draw the line somewhere. If I want to add features like this I need to design Ferret from the ground up to be more like a database which is exactly what I intend to do with the Ferret object database. I hope that makes sense. Dave From dbalmain.ml at gmail.com Mon Oct 16 13:30:57 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 02:30:57 +0900 Subject: [Ferret-talk] Sorting by score In-Reply-To: <85bfb43ed70f65124de978016e63e3a8@ruby-forum.com> References: <85bfb43ed70f65124de978016e63e3a8@ruby-forum.com> Message-ID: On 10/17/06, Caspar Bl wrote: > Hi I think this is a very easy question but here goes: > > I want to sort my results by a boolean field and then by score, I > thought this would be a default configuration but apparently not. > > [this] is my current code, how do iu alter it so that the results are then > sorted by highest score first? > > sort_fields = [] > sort_fields << Ferret::Search::SortField.new(:sponsored, :reverse => > :true) sort_fields << Ferret::Search::SortField::SCORE sort = Ferret::Search::Sort.new(sort_fields) You can pass the array of SortFields as the :sort parameter or even a sort string ("sponsored DESC, SCORE"). 
Cheers, Dave From dbalmain.ml at gmail.com Mon Oct 16 13:35:26 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 02:35:26 +0900 Subject: [Ferret-talk] acts_as_ferret: can i specify a search on 1 field as sup In-Reply-To: References: Message-ID: On 10/17/06, poipu wrote: > hi dave, thanks for the reply. > > so just to clarify for myself....if in my view i have a category textbox > and a query text box,,,all i need to do is > > model.find_by_contents(params[:category]+":"+params[:query]) > If params[:category] is the name of the field you want to search then yes, almost. You should also put brackets around params[:query] so the whole query is restricted to the field. model.find_by_contents(params[:category]+":("+params[:query] + ")") Otherwise the query "ruby AND rails" on the :title field would look like this: title:ruby AND rails (so only "ruby" is restricted to :title) instead of: title:(ruby AND rails) Hope that makes sense. Dave From marvin at rectangular.com Mon Oct 16 13:40:51 2006 From: marvin at rectangular.com (Marvin Humphrey) Date: Mon, 16 Oct 2006 10:40:51 -0700 Subject: [Ferret-talk] Very small scores for search results In-Reply-To: References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> Message-ID: <21A1629D-F8BB-4C49-9662-F5CC7F696515@rectangular.com> On Oct 16, 2006, at 7:16 AM, David Balmain wrote: > Yeah, it is in test/test_helper.c. helper.c was the culprit, all right... >> I can set up a cron to do nightly smokes. > I'll probably take a while to get around to it. Weirdo. :D I'd LUV to have regular smoke tests done for me on systems I don't have access to! The big one for me is Windows. Fortunately, there's a bunch of people on PerlMonks who'll run tests for me on their Windows boxes when I ask. I'll probably whip up a Perl script that smokes Ferret. If I don't generalize it (assume availability of svn, etc), that's cake -- 50 lines, including the email message.
The only reason I didn't volunteer at first is that I figured you could write one in Ruby and then you might get some other smokers besides me. > Anyway, don't spend too much time on this. I didn't. But it wasn't hard to find something which made the failing tests go away. Patch below. The patch might not be 100% optimal -- I didn't bother looking at how POSH implements those functions. I'll leave that to you. Meanwhile, I'll go implement the same functionality for Charmonizer. Funny how I've been working on this very issue! > I think it is better > for both of us if you concentrate on Lucy. I've seen a lot of action > recently on the commits list. :D Yeah, it's nice when a concept works out and stuff just flows... :) Marvin Humphrey Rectangular Research http://www.rectangular.com/ Index: c/src/helper.c =================================================================== --- c/src/helper.c (revision 653) +++ c/src/helper.c (working copy) @@ -14,13 +14,21 @@ { union { f_i32 i; float f; } tmp; tmp.f = f; +#ifdef POSH_LITTLE_ENDIAN return POSH_LittleU32(tmp.i); +#else + return POSH_BigU32(tmp.i); +#endif } float int2float(f_i32 i32) { union { f_i32 i; float f; } tmp; +#ifdef POSH_LITTLE_ENDIAN tmp.i = POSH_LittleU32(i32); +#else + tmp.i = POSH_BigU32(i32); +#endif return tmp.f; } From shammond at patientslikeme.com Mon Oct 16 13:56:43 2006 From: shammond at patientslikeme.com (Steven Hammond) Date: Mon, 16 Oct 2006 13:56:43 -0400 Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem In-Reply-To: <71d964d68f25c0e449bb423ebfaccbc3@ruby-forum.com> References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> <64AFD52B-3981-4390-920C-93710CAA4070@patientslikeme.com> <71d964d68f25c0e449bb423ebfaccbc3@ruby-forum.com> Message-ID: <2973F4EC-FCB3-41D6-BB18-7B75E89B67CB@patientslikeme.com> Thanks, I'll report back success 
or failure with this. Steve On Oct 16, 2006, at 11:31 AM, Guest wrote: > Steven Hammond wrote: >> Can you post your changes to base.rb? I'd like to give that a try. >> >> Thanks, >> Steve > > sure: > actionpack-1.12.1/lib/action_view/base.rb > modify compile_template to begin with > > template=template.gsub(/\t/," ") > > It completely fixed the issue on one of my win systems, > but on another one of them (both 2003), it doesnt work. > > This may cause many other issues than it fixes however so I would > label > it a hack for now. I found this fix online somewhere, but I can't find > it anymore. > > Good luck. > Brent > > -- > Posted via http://www.ruby-forum.com/. From ilya at fortehost.com Mon Oct 16 14:12:42 2006 From: ilya at fortehost.com (Ilya Grigorik) Date: Mon, 16 Oct 2006 20:12:42 +0200 Subject: [Ferret-talk] Win XP / Ferret & Acts_as_ferret .dump problem In-Reply-To: <71d964d68f25c0e449bb423ebfaccbc3@ruby-forum.com> References: <0dd7a17d3f24fa5d2093e07b4d3d7f18@ruby-forum.com> <20060922091703.GA11602@cordoba.webit.de> <426978784cdd2e7bb1d6e0b31e296a8e@ruby-forum.com> <8dd78e760bd44e18faa8b9d21bb6aa2b@ruby-forum.com> <64AFD52B-3981-4390-920C-93710CAA4070@patientslikeme.com> <71d964d68f25c0e449bb423ebfaccbc3@ruby-forum.com> Message-ID: <43c85f3e468012d1acaed6397c795eb0@ruby-forum.com> Interesting solution Brent.. I actually just ended up installing VMWare / Fedora Core 5 and doing my development from there. No problems on *nix distributions. :) It would be really nice if someone could resolve this problem though! (Once and for all) Cheers, Ilya P.S. Running ferret on http://www.graphics-world.com - loving it! :) -- Posted via http://www.ruby-forum.com/.
From none at none.com Mon Oct 16 14:21:24 2006
From: none at none.com (poipu)
Date: Mon, 16 Oct 2006 20:21:24 +0200
Subject: [Ferret-talk] acts_as_ferret: can i specify a search on 1 field as sup
In-Reply-To:
References:
Message-ID: <7580122dba1a4b76f2935244aa157bf7@ruby-forum.com>

hi dave,
thank you so much for your post. greatly appreciated!

--
Posted via http://www.ruby-forum.com/.

From fastjames at gmail.com Mon Oct 16 15:28:20 2006
From: fastjames at gmail.com (Jim Kane)
Date: Mon, 16 Oct 2006 21:28:20 +0200
Subject: [Ferret-talk] Ferret::QueryParser::QueryParseException
In-Reply-To:
References: <0bf798aa2a169dfba412cf502e9b2100@ruby-forum.com>
Message-ID:

David Balmain wrote:
> On 10/16/06, Jim Kane wrote:
>> 'parse'
>> exception?
>>
>> Jim
>
> clean_string isn't perfect. It escapes special characters within
> phrases (except for '<>' and '|' which have special meaning within
> phrases). It also tries to match up quotes and brackets. If it still
> can't parse the query then it will raise an exception unless you set
> :handle_exception to true in which case the exception will be ignored
> and the query will be parsed as a simple boolean query and all special
> characters will be ignored.
>
> I hope that makes sense. If you have any suggestions I'd be happy to
> hear them.
>
> Cheers,
> Dave

OK, the actual flag I needed was :handle_parse_errors, as opposed to
:handle_exception or :handle_parser_errors (that one's in the RDocs on
the ferret trac). Calling acts_as_ferret like so did the trick:

  acts_as_ferret({:fields => {:field => {:store => :compressed}}},
                 {:handle_parse_errors => true})

Thanks for pointing me in the right direction. I was going to submit a
patch to the docs but it looks like you've fixed it already in the
trunk.

Jim

--
Posted via http://www.ruby-forum.com/.
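The fallback Dave describes in the quoted reply above (try a strict parse, and on failure ignore the special characters and treat the input as plain boolean terms) can be sketched in a few lines of plain Ruby. This is a toy illustration of the idea only, not Ferret's query parser: `strict_parse`, `parse_with_fallback`, `ToyParseError` and the `SPECIAL_CHARS` list are all hypothetical names and choices made up for this example, and the "strict" check here only looks for unbalanced quotes and parentheses.

```ruby
# Toy illustration only: this is NOT Ferret's parser. All names below
# (ToyParseError, SPECIAL_CHARS, strict_parse, parse_with_fallback) are
# hypothetical and exist only for this sketch.
class ToyParseError < StandardError; end

# An assumed set of "special" query characters to drop in fallback mode.
SPECIAL_CHARS = /[:()\[\]{}!+"~^\-|<>=*?\\]/

# Raises if quotes or parentheses are unbalanced, standing in for input
# that a cleanup pass cannot repair.
def strict_parse(query)
  raise ToyParseError, "unbalanced quotes" if query.count('"').odd?
  raise ToyParseError, "unbalanced parentheses" if query.count("(") != query.count(")")
  { :type => :parsed, :query => query }
end

# When handle_parse_errors is true, swallow the parse error, strip the
# special characters, and treat the remaining words as boolean terms.
# When it is false, the error is re-raised to the caller.
def parse_with_fallback(query, handle_parse_errors = true)
  strict_parse(query)
rescue ToyParseError
  raise unless handle_parse_errors
  terms = query.gsub(SPECIAL_CHARS, " ").split
  { :type => :boolean, :terms => terms }
end

p parse_with_fallback('title:"bash guide')
```

With the flag off, the same input raises instead of degrading, which matches the exception Jim was seeing before he set the option.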
From peter at ioffer.com Mon Oct 16 16:06:50 2006 From: peter at ioffer.com (peter) Date: Mon, 16 Oct 2006 13:06:50 -0700 Subject: [Ferret-talk] Setting the boost field after indexing Message-ID: I was wondering about the potential impact of setting the boost field for an index after the index has been created and optimized. I know that some field_info settings can't really be modified, such as :store or :tokenize, as they really apply when you are indexing, but is boost? If boost is only used for searching, can it be modified dynamically when searching? Thanks for any thoughts. From none at none.com Mon Oct 16 18:26:26 2006 From: none at none.com (koloa) Date: Tue, 17 Oct 2006 00:26:26 +0200 Subject: [Ferret-talk] pagination in acts_as_ferret In-Reply-To: <20060503163044.GS29289@cordoba.webit.de> References: <20060503163044.GS29289@cordoba.webit.de> Message-ID: <17ea47bafad9e62b273baa7796003c11@ruby-forum.com> Hello Jens, just a quick question about your code... if i am using this: @test = County.find_by_contents(@params['search_string'],:first_doc=>0,:num_docs=>10) do i just setup up numbered links in my view to pass a variable to first_doc? so if i wanted to get the next set, id do: @test = County.find_by_contents(@params['search_string'],:first_doc=>@params['10'], :num_docs=>10) if this is the case, is there an easier way and also a way to count how many sets or how many pages there are in a table for a given query? would i just get a total from the array and then divide by 10 docs? etc...? thanks! Jens Kraemer wrote: > Hi! > > On Mon, May 01, 2006 at 08:55:22AM +0200, SchmakO wrote: >> end > find_by_contents has two options suitable for paging: > :first_doc (first result to retrieve) and > :num_docs (number of results to retrieve). > > so to retrieve results 10 to 20, you would use > @results = > Tutorial.find_by_contents(@query,:first_doc=>10,:num_docs=>10) > > hth, > Jens > > > -- > webit! 
Gesellschaft für neue Medien mbH www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
> Schnorrstraße 76 Tel +49 351 46766 0
> D-01069 Dresden Fax +49 351 46766 66

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Mon Oct 16 20:21:41 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Tue, 17 Oct 2006 09:21:41 +0900
Subject: [Ferret-talk] Setting the boost field after indexing
In-Reply-To:
References:
Message-ID:

On 10/17/06, peter wrote:
> I was wondering about the potential impact of setting the boost field for an
> index after the index has been created and optimized.
>
> I know that some field_info settings can't really be modified, such as
> :store or :tokenize, as they really apply when you are indexing, but is
> boost? If boost is only used for searching, can it be modified dynamically
> when searching?
>
> Thanks for any thoughts.

Hi Peter,

If you want to set boosts at search time you can set the boost on the
query. For example:

  +title:ferret^20.0 +content:ruby content:rails^0.1

Does that solve your problem? Otherwise, changing the boost on fields
is just as problematic as changing the :store or :tokenized values. At
indexing time the global field boost, the local field boost, the
document boost and the field length normalization factor are all
multiplied together to get a single normalization factor for that
field. This value is then encoded as a single byte and stored in the
norm file (which you can omit to save space by setting
:index => :untokenized_omit_norms). This saves a lot of work at search
time but it also makes it impossible to change the global field boost
value at search time.

Anyway, I hope this clears things up and the query boosting does what
you need it to.
Cheers, Dave From dbalmain.ml at gmail.com Mon Oct 16 20:30:50 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 09:30:50 +0900 Subject: [Ferret-talk] pagination in acts_as_ferret In-Reply-To: <17ea47bafad9e62b273baa7796003c11@ruby-forum.com> References: <20060503163044.GS29289@cordoba.webit.de> <17ea47bafad9e62b273baa7796003c11@ruby-forum.com> Message-ID: On 10/17/06, koloa wrote: > > > Hello Jens, just a quick question about your code... > > if i am using this: > > @test = > County.find_by_contents(@params['search_string'],:first_doc=>0,:num_docs=>10) For starters, :first_doc is now :offset and :num_docs is now :limit, as of Ferret 0.10.0. > do i just setup up numbered links in my view to pass a variable to > first_doc? so if i wanted to get the next set, id do: > > > @test = > County.find_by_contents(@params['search_string'],:first_doc=>@params['10'], > :num_docs=>10) Strange parameter name but yes, that's how you'd do it. > if this is the case, is there an easier way and also a way to count how > many sets or how many pages there are in a table for a given query? > would i just get a total from the array and then divide by 10 docs? > etc...? I'm not sure about easier way but you can get the total number of matching hits from @test.total_hits. On the other hand, @test.size will be the number of hits returned. Hope that answers your question, Dave From dbalmain.ml at gmail.com Mon Oct 16 21:33:57 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 17 Oct 2006 10:33:57 +0900 Subject: [Ferret-talk] Very small scores for search results In-Reply-To: <21A1629D-F8BB-4C49-9662-F5CC7F696515@rectangular.com> References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> <21A1629D-F8BB-4C49-9662-F5CC7F696515@rectangular.com> Message-ID: On 10/17/06, Marvin Humphrey wrote: > > On Oct 16, 2006, at 7:16 AM, David Balmain wrote: > > > Yeah, it is in test/test_helper.c. 
> > helper.c was the culprit, all right... > > >> I can set up a cron to do nightly smokes. > > > I'll probably take a while to get around to it. > > Weirdo. > > :D > > I'd LUV to have regular smoke tests done for me on systems I don't > have access to! The big one for me is Windows. Fortunately, there's > a bunch of people on PerlMonks who'll run tests for me on their > Windows boxes when I ask. You're right. I'm a fool to pass up such an offer so lightly. I guess I just really want a Mac user within the Ferret community to take ownership of this. > I'll probably whip up a Perl script that smokes Ferret. If I don't > generalize it (assume availability of svn, etc), that's cake -- 50 > lines, including the email message. The only reason I didn't > volunteer at first is that I figured you could write one in Ruby and > then you might get some other smokers besides me. You're right. I'll do this. > > Anyway, don't spend too much time on this. > > I didn't. But it wasn't hard to find something which made the > failing tests go away. Patch below. > > The patch might not be 100% optimal -- I didn't bother looking at how > POSH implements those functions. I'll leave that to you. Funnily enough the patch reduces the operation to a no-op. I guess I don't need to worry about endianess here since floats have the same endianess as integers. I should have thought about that a little more and I could have saved you the trouble of having to look at it. :P > Meanwhile, I'll go implement the same functionality for Charmonizer. > Funny how I've been working on this very issue! Great. I'm going to swap out POSH for charminizer in Ferret ASAP. Thanks again Marvin. I'll check smoke_test.rb into the base directory of the Ferret repo when I'm done. Cheers, Dave > > I think it is better > > for both of us if you concentrate on Lucy. I've seen a lot of action > > recently on the commits list. :D > > Yeah, it's nice when a concept works out and stuff just flows... 
:) > > Marvin Humphrey > Rectangular Research > http://www.rectangular.com/ > > Index: c/src/helper.c > =================================================================== > --- c/src/helper.c (revision 653) > +++ c/src/helper.c (working copy) > @@ -14,13 +14,21 @@ > { > union { f_i32 i; float f; } tmp; > tmp.f = f; > +#ifdef POSH_LITTLE_ENDIAN > return POSH_LittleU32(tmp.i); > +#else > + return POSH_BigU32(tmp.i); > +#endif > } > float int2float(f_i32 i32) > { > union { f_i32 i; float f; } tmp; > +#ifdef POSH_LITTLE_ENDIAN > tmp.i = POSH_LittleU32(i32); > +#else > + tmp.i = POSH_BigU32(i32); > +#endif > return tmp.f; > } > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From marvin at rectangular.com Mon Oct 16 22:45:41 2006 From: marvin at rectangular.com (Marvin Humphrey) Date: Mon, 16 Oct 2006 19:45:41 -0700 Subject: [Ferret-talk] Very small scores for search results In-Reply-To: References: <2AA4463B-7E7C-48A3-96A6-A2B12F795EBE@gmail.com> <124BA98F-1817-4859-AB74-27063AAABBEC@rectangular.com> <21A1629D-F8BB-4C49-9662-F5CC7F696515@rectangular.com> Message-ID: <9610FBCA-9472-45E7-AB6F-31DEF7C85144@rectangular.com> On Oct 16, 2006, at 6:33 PM, David Balmain wrote: > Funnily enough the patch reduces the operation to a no-op. Ah. Makes sense. > I guess I > don't need to worry about endianess here since floats have the same > endianess as integers. I believe that the representation is IEEE 754 both on little-endian chips like the Intels, and big-endian chips like the PowerPC. The sign bit is indeed the "leftmost" bit in that representation, regardless of chip architecture. Where the float-int union technique (which is also used by KinoSearch and CLucene) will fall down is on architectures that don't use IEEE 754, like VAX. Then the encode/decode will get all screwed up. http://www.codeproject.com/tools/libnumber.asp Fortunately, the 0 .. 
255 test will fail, so we'll know about the problem when it occurs. Non-IEEE floats are rare, these days, anyhow. POSH doesn't even support 'em. > Great. I'm going to swap out POSH for charminizer in Ferret ASAP. That will be very helpful. We'll see how soon ASAP is. :) > I'll check smoke_test.rb into the base directory > of the Ferret repo when I'm done. Grooves. Marvin Humphrey Rectangular Research http://www.rectangular.com/ From kraemer at webit.de Tue Oct 17 03:54:50 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 17 Oct 2006 09:54:50 +0200 Subject: [Ferret-talk] Dynamic fields and inheritance In-Reply-To: <9f661711f29c79332af4de6db280ff03@ruby-forum.com> References: <20061011113738.GC9323@cordoba.webit.de> <9f661711f29c79332af4de6db280ff03@ruby-forum.com> Message-ID: <20061017075450.GJ14271@cordoba.webit.de> On Mon, Oct 16, 2006 at 05:41:23PM +0200, Evan wrote: > Jens Kraemer wrote: > > On Wed, Oct 11, 2006 at 01:11:32AM +0200, Evan wrote: > > Don't call acts_as_ferret in your base class, instead add the :name > > field to the acts_as_ferret calls in Music and Book. That should fix > > your problems. > > I assume that I will be unable to call Product.find_by_contents in this > case. So, in order to do search of all products I would have to do a > multi-index search? right. You could also do it the other way around: just call acts_as_ferret in your Product class, and override to_doc in your child classes to add your dynamic properties to the ferret document. Don't forget to use :store_classname => true when you call acts_as_ferret. If you want to declare special Ferret options on a per field level, you'd have to declare them in the acts_as_ferret call in class Product for all kinds of products. That's not really nice but should work. Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Tue Oct 17 07:43:16 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 17 Oct 2006 13:43:16 +0200
Subject: [Ferret-talk] pagination in acts_as_ferret
In-Reply-To:
References: <20060503163044.GS29289@cordoba.webit.de> <17ea47bafad9e62b273baa7796003c11@ruby-forum.com>
Message-ID: <20061017114316.GL14271@cordoba.webit.de>

On Tue, Oct 17, 2006 at 09:30:50AM +0900, David Balmain wrote:
> On 10/17/06, koloa wrote:
[..]
> >
> > @test =
> > County.find_by_contents(@params['search_string'],:first_doc=>0,:num_docs=>10)
>
> For starters, :first_doc is now :offset and :num_docs is now :limit,
> as of Ferret 0.10.0.

right, though aaf still works with the old naming, too ;-)

[..]
> > if this is the case, is there an easier way and also a way to count how
> > many sets or how many pages there are in a table for a given query?
> > would i just get a total from the array and then divide by 10 docs?
> > etc...?
>
> I'm not sure about easier way but you can get the total number of
> matching hits from @test.total_hits. On the other hand, @test.size
> will be the number of hits returned.

exactly. I didn't try this, but it should be possible to use a Rails
Paginator to handle the whole pagination stuff.

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From neeraj.jsr at gmail.com Tue Oct 17 08:21:47 2006
From: neeraj.jsr at gmail.com (Raj Singh)
Date: Tue, 17 Oct 2006 14:21:47 +0200
Subject: [Ferret-talk] Error : End-of-File Error occured at
Message-ID: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com>

Everything was working fine till last night. This morning I have many
errors.
I am using acts_as_ferret.
Last updated around a month ago on linux. There are two different type of exceptions. I have over 12 exception emails but these are the two distince types. First exception: A EOFError occurred in home#event_info: End-of-File Error occured at :79 in xraise Error occured in store.c:216 - is_refill current pos = 778, file length = 778 /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:517:in `close' Second exception: A Ferret::Store::Lock::LockError occurred in home#event_info: Lock Error occured at :103 in xpop_context Error occured in index.c:5372 - iw_open Couldn't obtain write lock when opening IndexWriter /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:656:in `initialize' -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Tue Oct 17 09:59:51 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 17 Oct 2006 15:59:51 +0200 Subject: [Ferret-talk] Problems with stop word analysis and queries Message-ID: <761aae60f9052ede65a381fa02a14a8c@ruby-forum.com> I have a problem, and I think it's because stop word analysis isn't happenning in the queries. I think it's the way ferret is doing things that's causing this bug. Let's say I'm searching across documents with a title. And I have a document with a title of "Bash Guide for Beginners". If a user types in the query: Bash Guide for Beginners No quotes. I get no hits. But if I drop the "for" ferret finds it. Then say I type a quoted query like "Bash Guide for Beginners" ferret finds it. So I tried all of this from a rails app, and I thought maybe acts_as_ferret was doing something with the query to cause the "for" word to stay in the query. But, I then opened up a script/console and fetched the ferret index directly like: findex = MyModel.ferret_index findex.search("Bash Guide for Beginners") And nothing came up. I tried all three queries, full title no quotes, full title minus "for", and quoted query with the same results as above. So now I think it might be a problem with ferret. 
My theory is maybe the stop word analysis isn't taking place when I submit queries. That's my theory at least. Any ideas? Charlie -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Tue Oct 17 10:43:25 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 17 Oct 2006 16:43:25 +0200 Subject: [Ferret-talk] Error : End-of-File Error occured at In-Reply-To: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> Message-ID: Try rebuilding your index like the following: > ruby script/console >> MyModel.rebuild_index Charlie Raj Singh wrote: > Everything was working fine till last night. This morning I have many > errors. > I am using acts_as_ferret. Last updated around a month ago on linux. > There are two different type of exceptions. I have over 12 exception > emails but these are the two distince types. > > First exception: > > A EOFError occurred in home#event_info: > > End-of-File Error occured at :79 in xraise > Error occured in store.c:216 - is_refill > current pos = 778, file length = 778 > > > /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:517:in `close' > > > Second exception: > A Ferret::Store::Lock::LockError occurred in home#event_info: > > Lock Error occured at :103 in xpop_context > Error occured in index.c:5372 - iw_open > Couldn't obtain write lock when opening IndexWriter > > > /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:656:in `initialize' -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Tue Oct 17 11:17:42 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 18 Oct 2006 00:17:42 +0900 Subject: [Ferret-talk] [ANN] Ferret Smoke Test Message-ID: Hey folks, I've added a smoke test script to Ferret. It is named smoke_test.rb and it can be found in the base of the working directory. So what are you supposed to do with it you ask? 
Well, if you want to help keep Ferret working on your system of choice
then set up a cron task to run this script regularly. What the script
does is call `svn update` to get the latest working revision. If it is
already at the latest revision it stops there. Otherwise it will run
both the straight C unit tests and the Ferret unit tests and post the
results here:

  http://camping.davebalmain.com/smoke_alarm/

And I'll be notified if anything breaks. This way I'll know if I'm
breaking Ferret on a different system to mine when I make the next
check-in. So if you want to get on board, here is a summary:

  prompt> svn co svn://www.davebalmain.com/exp ferret
  prompt> cd ferret
  prompt> ruby smoke_test.rb # no-op since you are already at latest revision
  # Now add the script to your crontab[1] or whatever you use to schedule
  # processes on your system. For me, I did this:
  prompt> sudo vi /etc/crontab
  # then add the following line:
  17 * * * * root /usr/bin/ruby /path/to/smoke_test.rb true dbalmain at gmail.com

The first parameter specifies whether to do a full rebuild before
testing. This is preferable although it will take a little more
processor time. The second parameter is obviously your email address.
Don't worry, it won't be published anywhere. It just lets me contact
you about any problems that are showing up on your system.

A quick disclaimer to finish off. I hacked this up pretty quickly so
there are probably a few improvements that could be made. Please feel
free to post your suggestions.
Cheers, Dave [1] There is a brief cron tutorial here: http://www.clockwatchers.com/cron_general.html From dbalmain.ml at gmail.com Tue Oct 17 11:27:24 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 18 Oct 2006 00:27:24 +0900 Subject: [Ferret-talk] Error : End-of-File Error occured at In-Reply-To: References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> Message-ID: On 10/17/06, Charlie Hubbard wrote: > > Try rebuilding your index like the following: > > > ruby script/console > > >> MyModel.rebuild_index > Good advice. Also make sure you have the latest version of Ferret. Version 0.10.10 will corrupt your index and eventually segfault. From dbalmain.ml at gmail.com Tue Oct 17 11:35:09 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 18 Oct 2006 00:35:09 +0900 Subject: [Ferret-talk] Problems with stop word analysis and queries In-Reply-To: <761aae60f9052ede65a381fa02a14a8c@ruby-forum.com> References: <761aae60f9052ede65a381fa02a14a8c@ruby-forum.com> Message-ID: On 10/17/06, Charlie Hubbard wrote: > > I have a problem, and I think it's because stop word analysis isn't > happenning in the queries. I think it's the way ferret is doing things > that's causing this bug. Let's say I'm searching across documents with > a title. And I have a document with a title of "Bash Guide for > Beginners". If a user types in the query: > > Bash Guide for Beginners > > No quotes. I get no hits. But if I drop the "for" ferret finds it. > Then say I type a quoted query like "Bash Guide for Beginners" ferret > finds it. So I tried all of this from a rails app, and I thought maybe > acts_as_ferret was doing something with the query to cause the "for" > word to stay in the query. But, I then opened up a script/console and > fetched the ferret index directly like: > > findex = MyModel.ferret_index > findex.search("Bash Guide for Beginners") > > And nothing came up. 
I tried all three queries, full title no quotes,
> full title minus "for", and quoted query with the same results as above.
> So now I think it might be a problem with ferret. My theory is maybe
> the stop word analysis isn't taking place when I submit queries. That's
> my theory at least. Any ideas?
>
> Charlie

Hi Charlie,

I'm afraid I can't reproduce the problem here. So unless someone else
can help you I see you have two options. Try reproducing the problem
with a small script like this:

  require 'rubygems'
  require 'ferret'

  index = Ferret::I.new(:or_default => false)
  index << "Bash Guide for Beginners"
  puts index.search('Bash Guide for Beginners')
  index.close

Or you could send me the index off-list and I'll have a look at it
here.

Cheers,
Dave

From bk at benjaminkrause.com Tue Oct 17 12:29:26 2006
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Tue, 17 Oct 2006 18:29:26 +0200
Subject: [Ferret-talk] [ANN] Ferret Smoke Test
In-Reply-To:
References:
Message-ID: <453504E6.4060106@benjaminkrause.com>

Hey ..
> I've added a smoke test script to Ferret. It is named smoke_test.rb
> and it can be found in the base of the working directory.
>
Great! You will get a x86_64-linux result once a day :-)

> prompt> sudo vi /etc/crontab
> # then add the following line:
> 17 * * * * root /usr/bin/ruby /path/to/smoke_test.rb true
> dbalmain at gmail.com
>
I would suggest not doing this as root .. it will run fine as a
standard user .. If someone hacks your Makefile, a lot of people might
be in trouble ;-)

> A quick disclaimer to finish off. I hacked this up pretty quickly so
> there are probably a few improvements that could be made. Please feel
> free to post your suggestions.
>
I'll definitely take a look ..
Ben

From dbalmain.ml at gmail.com Tue Oct 17 12:58:21 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 18 Oct 2006 01:58:21 +0900
Subject: [Ferret-talk] [ANN] Ferret Smoke Test
In-Reply-To: <453504E6.4060106@benjaminkrause.com>
References: <453504E6.4060106@benjaminkrause.com>
Message-ID:

On 10/18/06, Benjamin Krause wrote:
> Hey ..
> > I've added a smoke test script to Ferret. It is named smoke_test.rb
> > and it can be found in the base of the working directory.
> >
> Great! You will get a x86_64-linux result once a day :-)
>
> > prompt> sudo vi /etc/crontab
> > # then add the following line:
> > 17 * * * * root /usr/bin/ruby /path/to/smoke_test.rb true
> > dbalmain at gmail.com
> >
> I would suggest not doing this as root .. it will run fine as a standard
> user .. If someone hacks
> your Makefile, a lot of people might be in trouble ;-)

Oh, good catch Ben. That will teach me for cutting and pasting. For
that same reason, beware of adding it to /etc/cron.hourly or
/etc/cron.daily as they are probably run by root too.

Dave

From neeraj.jsr at gmail.com Tue Oct 17 13:40:14 2006
From: neeraj.jsr at gmail.com (Raj Singh)
Date: Tue, 17 Oct 2006 19:40:14 +0200
Subject: [Ferret-talk] Error : End-of-File Error occured at
In-Reply-To:
References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com>
Message-ID: <742d3d03a03681833faeee146ccb259f@ruby-forum.com>

It might sound like a stupid question but I don't have an answer. How
do I find what version of ferret is installed on my server?

I couldn't install ferret on windows and hence I use ferret installed
on the hosting server. Is there a particular command that I could
execute to find what version of Ferret is running or should I ask this
question to the admin at hosting?

Thanks

David Balmain wrote:
> On 10/17/06, Charlie Hubbard wrote:
>>
>> Try rebuilding your index like the following:
>>
>> > ruby script/console
>>
>> >> MyModel.rebuild_index
>>
>
> Good advice. Also make sure you have the latest version of Ferret.
> Version 0.10.10 will corrupt your index and eventually segfault. -- Posted via http://www.ruby-forum.com/. From peter at ioffer.com Tue Oct 17 13:49:07 2006 From: peter at ioffer.com (peter) Date: Tue, 17 Oct 2006 10:49:07 -0700 Subject: [Ferret-talk] Error : End-of-File Error occured at In-Reply-To: <742d3d03a03681833faeee146ccb259f@ruby-forum.com> Message-ID: The command is usually gem list --local This gives the version of all of the gems, in which ferret should be one of them. > From: Raj Singh > Reply-To: ferret-talk at rubyforge.org > Date: Tue, 17 Oct 2006 19:40:14 +0200 > To: ferret-talk at rubyforge.org > Subject: Re: [Ferret-talk] Error : End-of-File Error occured at > > It might sound stupid question but I don't have an answer. How do I find > what version of ferret is installed on my server. > > I couldn't install ferret on windows and hence I use ferret installed on > the hosting server. Is there a particular command that I could execute > to find what version of Ferret is running or should I ask this question > to the admin at hosting. > > Thanks > > > David Balmain wrote: >> On 10/17/06, Charlie Hubbard wrote: >>> >>> Try rebuilding your index like the following: >>> >>>> ruby script/console >>> >>>>> MyModel.rebuild_index >>> >> >> Good advice. Also make sure you have the latest version of Ferret. >> Version 0.10.10 will corrupt your index and eventually segfault. > > > -- > Posted via http://www.ruby-forum.com/. 
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From bk at benjaminkrause.com Tue Oct 17 14:45:48 2006 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 17 Oct 2006 20:45:48 +0200 Subject: [Ferret-talk] Error : End-of-File Error occured at In-Reply-To: <742d3d03a03681833faeee146ccb259f@ruby-forum.com> References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> <742d3d03a03681833faeee146ccb259f@ruby-forum.com> Message-ID: <453524DC.1080800@benjaminkrause.com> Raj Singh schrieb: > It might sound stupid question but I don't have an answer. How do I find > what version of ferret is installed on my server. > > I couldn't install ferret on windows and hence I use ferret installed on > the hosting server. Is there a particular command that I could execute > to find what version of Ferret is running or should I ask this question > to the admin at hosting. > Hey .. There're several ways.. if you installed ferret as gem (the suggested and default way) try: benjamin at home ~ $ gem list ferret *** LOCAL GEMS *** ferret (0.10.9) Ruby indexing library. If you installed it via svn checkout, try this: benjamin at home ~/trunk $ script/console Loading development environment. >> Ferret::VERSION => "0.10.9" Ben From charlie.hubbard at gmail.com Tue Oct 17 17:07:30 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 17 Oct 2006 23:07:30 +0200 Subject: [Ferret-talk] Problems with stop word analysis and queries In-Reply-To: References: <761aae60f9052ede65a381fa02a14a8c@ruby-forum.com> Message-ID: <9b0ba2e969d53f3057e3e8f3c8616cc0@ruby-forum.com> Ok thanks for looking into. I'm going to try and reproduce it with plain ferret, if not I'll send you the index. 
Charlie David Balmain wrote: > On 10/17/06, Charlie Hubbard wrote: >> Then say I type a quoted query like "Bash Guide for Beginners" ferret >> So now I think it might be a problem with ferret. My theory is maybe >> the stop word analysis isn't taking place when I submit queries. That's >> my theory at least. Any ideas? >> >> Charlie > > Hi Charlie, > I'm afraid I can't reproduce the problem here. So unless someone else > can help you I see you have two options. Try reproducing the problem > with a small script like this: > > require 'rubygems' > require 'ferret' > > index = Ferret::I.new(:or_default => false) > index << "Bash Guide for Beginners" > puts index.search('Bash Guide for Beginners') > index.close > > Or you could send me the index off-list and I'll have a look at it here. > > Cheers, > Dave -- Posted via http://www.ruby-forum.com/. From tennisbum2002 at hotmail.com Tue Oct 17 21:24:54 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Wed, 18 Oct 2006 03:24:54 +0200 Subject: [Ferret-talk] multi_search error undefined method In-Reply-To: References: Message-ID: <876a59ccc5cc191a81c339891fd026a3@ruby-forum.com> Anyone? -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Tue Oct 17 22:09:43 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 18 Oct 2006 11:09:43 +0900 Subject: [Ferret-talk] multi_search error undefined method In-Reply-To: <876a59ccc5cc191a81c339891fd026a3@ruby-forum.com> References: <876a59ccc5cc191a81c339891fd026a3@ruby-forum.com> Message-ID: On 10/18/06, Eric Gross wrote: > Anyone? > Jens already replied to this (see below). I have no idea why it didn't make it onto rubyforum. Try joining the mailing list. Cheers, Dave On 10/13/06, Jens Kraemer wrote: > On Fri, Oct 13, 2006 at 09:41:36AM +0200, Eric Gross wrote: > > Hi, > > > > Im having problems using the multi_search command. I keep getting the > > following error. 
> > > > "undefined method `<<' for Book:Class" > > > > here is the code associated with this. > > > > class Book < ActiveRecord::Base > > acts_as_ferret :store_class_name => true > > end > > > > > > class User < ActiveRecord::Base > > acts_as_ferret :store_class_name => true > > end > > > > and the call is the following > > > > t=User.multi_search(@query,Book). > > try this: > > t = User.multi_search(@query, [ Book ]) > > > I just committed a fix so that > > t=User.multi_search(@query,Book) > will work, too. > > cheers, > Jens From tennisbum2002 at hotmail.com Tue Oct 17 23:59:54 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Wed, 18 Oct 2006 05:59:54 +0200 Subject: [Ferret-talk] multi_search error undefined method In-Reply-To: References: <876a59ccc5cc191a81c339891fd026a3@ruby-forum.com> Message-ID: <2c9c73075589f964af5594449470b2cd@ruby-forum.com> David, I dont see any other response on this topic except for yours, can you send me her response or repost it? David Balmain wrote: > On 10/18/06, Eric Gross wrote: >> Anyone? >> > > Jens already replied to this (see below). I have no idea why it didn't > make it onto rubyforum. Try joining the mailing list. > > Cheers, > Dave -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 18 01:59:25 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 18 Oct 2006 14:59:25 +0900 Subject: [Ferret-talk] multi_search error undefined method In-Reply-To: <2c9c73075589f964af5594449470b2cd@ruby-forum.com> References: <876a59ccc5cc191a81c339891fd026a3@ruby-forum.com> <2c9c73075589f964af5594449470b2cd@ruby-forum.com> Message-ID: On 10/18/06, Eric Gross wrote: > David, I dont see any other response on this topic except for yours, can > you send me her response or repost it? I forwarded you his reply in my previous email but it seems to have been stripped. I'd better let Andreas know about that. 
Anyway, you can see the full thread here:

http://thread.gmane.org/gmane.comp.lang.ruby.ferret.general/1443/focus=1445

Cheers,
Dave

From heikowebers at gmx.net Wed Oct 18 07:03:00 2006
From: heikowebers at gmx.net (hawe)
Date: Wed, 18 Oct 2006 13:03:00 +0200
Subject: [Ferret-talk] install ferret on windows
Message-ID: <4a7404ee0ad4003672ebf2632ad88d23@ruby-forum.com>

Hi!

I'm trying to install ferret on Windows, so I chose
ferret-0.10.9-mswin32.gem from the download page, as it includes an
already precompiled ferret_ext.so (is that correct?) and I don't have
any C compiler here. The gem installed correctly, but the test didn't
work.

So I called these commands:

  rake ext
  ruby setup.rb config
  ruby setup.rb setup
  ruby setup.rb install

The first was aborted (missing nmake, of course). I copied the
extconf.rb from the 0.10.11 distribution, as it was not included. The
2nd command said "The C extensions could not be installed", the 3rd
told me not to worry about that and the 4th seemed to work fine.

After that the tests run, but keep crashing with

  C:/Ruby/lib/ruby/site_ruby/1.8/ferret/index.rb:125: [BUG] Segmentation fault
  ruby 1.8.2 (2004-12-25) [i386-mswin32]

Is that because this version is buggy, or is there no other way than to
install a C compiler?

Thanks for your help.
hawe.

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Wed Oct 18 09:36:11 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 18 Oct 2006 22:36:11 +0900
Subject: [Ferret-talk] install ferret on windows
In-Reply-To: <4a7404ee0ad4003672ebf2632ad88d23@ruby-forum.com>
References: <4a7404ee0ad4003672ebf2632ad88d23@ruby-forum.com>
Message-ID:

On 10/18/06, hawe wrote:
> Hi!
>
> I'm trying to install ferret on windows, so I chose
> ferret-0.10.9-mswin32.gem from the download page, as it includes a
> already pre-compiled ferret_ext.so (is that correct?) and I don't have
> any C compiler here.

I think you'll need to upgrade your version of Ruby. It appears you have 1.8.2.
Ferret is compiled against the One-Click Installer 1.8.4-20.

> The gem installed it correctly, but the test didn't
> work.
>
> So I called these commands:
> rake ext
> ruby setup.rb config
> ruby setup.rb setup
> ruby setup.rb install

These commands won't work as you don't have a C compiler.

> the first was aborted (missing nmake of course)
> I copied the extconf.rb from the 0.10.11 distribution, as it was not
> included. The 2nd command said "The C extensions could not be
> installed", the 3rd told me not to worry about that and the 4th seemed
> to work fine.
>
> After that the tests are running, but keep crashing with
> C:/Ruby/lib/ruby/site_ruby/1.8/ferret/index.rb:125: [BUG] Segmentation
> fault
> ruby 1.8.2 (2004-12-25) [i386-mswin32]
>
> Is that because this version is buggy or is there no other way than to
> install a C compiler?

I'm not quite sure what you mean here about another way to install a C
compiler. If you want to compile Ferret yourself you'll need to get
hold of VC6 or possibly MinGW, but I recommend just upgrading Ruby and
using the precompiled gem. Let me know if you are still having
problems.

Cheers,
Dave

From howardmoon at hitcity.com.au Wed Oct 18 09:41:37 2006
From: howardmoon at hitcity.com.au (Peter Royle)
Date: Wed, 18 Oct 2006 15:41:37 +0200
Subject: [Ferret-talk] [ANN] Ferret Smoke Test
In-Reply-To:
References: <453504E6.4060106@benjaminkrause.com>
Message-ID: <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com>

Hi,

I'm on Mac OS X, PPC64. I just ran the smoke_alarm script, but before it
would run I had to install Ruby 1.8.4 (for Net::HTTP.post_form). So I now
have /usr/bin/ruby (1.8.2) and /usr/local/bin/ruby (1.8.4).

I started the script using /usr/local/bin/ruby (1.8.4) but it failed
with this during the Ruby tests:

Loading once
Loaded suite /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader
Started
................................................................rake aborted!
Command failed with status (): [/usr/bin/ruby -Ilib:test/unit "/usr/lib/ru...]

Looks like it's still trying to use /usr/bin/ruby (1.8.2). Is that going
to be the problem?

Pete.

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Wed Oct 18 10:39:17 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 18 Oct 2006 23:39:17 +0900
Subject: [Ferret-talk] [ANN] Ferret Smoke Test
In-Reply-To: <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com>
References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com>
Message-ID:

On 10/18/06, Peter Royle wrote:
> Hi,
>
> I'm on Mac OS X, PPC64. Just ran the smoke_alarm script but before it
> would run I had to install Ruby 1.8.4 (for Net::HTTP.post_form). So I now
> have /usr/bin/ruby (1.8.2) and /usr/local/bin/ruby (1.8.4)
>
> I started the script using /usr/local/bin/ruby (1.8.4) but it failed
> with this during ruby tests:
>
> Loading once
> Loaded suite
> /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader
> Started
> ................................................................rake
> aborted!
> Command failed with status (): [/usr/bin/ruby -Ilib:test/unit
> "/usr/lib/ru...]
>
> Looks like it's still trying to use /usr/bin/ruby (1.8.2). Is that going
> to be the problem?
>
> Pete.

Hi Pete,

I'm not aware of anyone running Ferret with Ruby 1.8.2 so I can't say
if it will be a problem or not. By the looks of it, it is. Could you
post the full error message?

By the way, the reason that it reverts to /usr/bin/ruby is that the
script makes a system call to rake, which automatically uses the old
version of ruby, probably because it appears first in the execution
path.
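
The PATH explanation above can be checked with a few lines of Ruby. This is only a minimal sketch and not part of the smoke test: first_on_path is a hypothetical helper that walks the directories in PATH in order and returns the first executable with the given name, which is roughly how a bare "rake" or "ruby" command gets resolved when a script shells out.

```ruby
# Hypothetical helper (not part of Ferret or the smoke test): return the
# first executable called +name+ found while walking +path+ left to right.
def first_on_path(name, path = ENV['PATH'])
  path.to_s.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

puts "ruby resolves to: #{first_on_path('ruby').inspect}"
puts "rake resolves to: #{first_on_path('rake').inspect}"
```

If the first line reports /usr/bin/ruby even though the script was started with /usr/local/bin/ruby, moving /usr/local/bin ahead of /usr/bin in PATH should make the nested rake call pick up the newer interpreter.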
Cheers, Dave From ilya at fortehost.com Wed Oct 18 15:37:17 2006 From: ilya at fortehost.com (Ilya Grigorik) Date: Wed, 18 Oct 2006 21:37:17 +0200 Subject: [Ferret-talk] [Bug] Seg Faulting in index.rb:718 Message-ID: Hey, Ferret is repeatedly seg-faulting my mongrel servers on the same line: /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:718: [BUG] Segmentation fault ruby 1.8.5 (2006-08-25) [i686-linux] I'm using ferret 0.10.11. I haven't had the time to dig into yet, it's on the backburner right now - I just keep several spare servers and restart them periodically but maybe someone else can spot the problem. Ilya http://www.igvita.com/blog -- Posted via http://www.ruby-forum.com/. From howardmoon at hitcity.com.au Wed Oct 18 19:54:45 2006 From: howardmoon at hitcity.com.au (Peter Royle) Date: Thu, 19 Oct 2006 01:54:45 +0200 Subject: [Ferret-talk] [ANN] Ferret Smoke Test In-Reply-To: References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com> Message-ID: > Hi Pete, > I'm not aware of anyone running Ferret with Ruby 1.8.2 so I can't say > if it will be a problem or not. By the looks of it, it is. Could you > post the full error message? I can't seem to reproduce it, but the results of that run did end up getting posted as "Test run on 2006-10-18 22:38:15". Now, even with a PATH that favours Ruby 1.8.2, the tests all seem to pass when I call make and rake on the command line and when I run smoke_alarm.rb manually or from cron (commenting out the no-op and form submission code). Not sure what was going on but it seems to have gone away. 1.8.2 is the version that comes stock with Mac OS X which is why I've been using it so there might be others in the same boat. I'm now testing against 1.8.4, but if you'd like someone to be testing against 1.8.2 I could do that. There appears to be someone else testing against ppc64/1.8.4 anyway. What do you think? Pete. -- Posted via http://www.ruby-forum.com/. 
From jon at mywellnet.com Wed Oct 18 20:09:36 2006 From: jon at mywellnet.com (Jon) Date: Thu, 19 Oct 2006 02:09:36 +0200 Subject: [Ferret-talk] joins and table names in ferret Message-ID: I'm having trouble figuring out how to do Ferret queries across multiple tables as you would in a normal SQL call. For example, let's say I have two ActiveRecord classes, Book, and Author, where Book has a 'description' field and Author has a 'name' field, and where Book has a belongs_to relationship with Author (Book belolngs to Author, Author has_many Books). Let's say I'd like to find all Books with the term 'programming' in their description that are by people with 'Smith' in their name. I'd like to do something like this: Book.find_by_contents("books.description:programming AND authors.name:Smith", {}, {:include => :author}) However, this function does not seem to allow the specification of table names. This type of call would be relatively easy in plain SQL, but I'd like to use Ferret for all queries to keep things uniform and to take advantage of its speed. I've looked into using the multi_search option also, but can't figure out how to use it to do even simple joins such as this one. Any help would be greatly appreciated. Thanks. -- Posted via http://www.ruby-forum.com/. From jon at mywellnet.com Wed Oct 18 20:32:37 2006 From: jon at mywellnet.com (Jon) Date: Thu, 19 Oct 2006 02:32:37 +0200 Subject: [Ferret-talk] joins and table names in ferret In-Reply-To: References: Message-ID: <364d305aca479155760801707cd785a8@ruby-forum.com> I realized that I wasn't fully clear in this example...I would want returned all Book objects where the Book description contains the word 'programming', the Author name contains the word 'Smith', AND (what I didn't explicitly say before) the book is by that author. So more informally, I want to find all programming books who are by someone with 'Smith' in their name. This would amount to the extra join statment of 'books.author_id=author.id'. 
I'm having a really hard time figure out how to do this in Ferret (using acts_as_ferret). Jon wrote: > I'm having trouble figuring out how to do Ferret queries across multiple > tables as you would in a normal SQL call. For example, let's say I have > two ActiveRecord classes, Book, and Author, where Book has a > 'description' field and Author has a 'name' field, and where Book has a > belongs_to relationship with Author (Book belolngs to Author, Author > has_many Books). Let's say I'd like to find all Books with the term > 'programming' in their description that are by people with 'Smith' in > their name. I'd like to do something like this: > > Book.find_by_contents("books.description:programming AND > authors.name:Smith", > {}, > {:include => :author}) > > However, this function does not seem to allow the specification of table > names. This type of call would be relatively easy in plain SQL, but I'd > like to use Ferret for all queries to keep things uniform and to take > advantage of its speed. I've looked into using the multi_search option > also, but can't figure out how to use it to do even simple joins such as > this one. Any help would be greatly appreciated. Thanks. -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 18 22:05:14 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 19 Oct 2006 11:05:14 +0900 Subject: [Ferret-talk] joins and table names in ferret In-Reply-To: <364d305aca479155760801707cd785a8@ruby-forum.com> References: <364d305aca479155760801707cd785a8@ruby-forum.com> Message-ID: On 10/19/06, Jon wrote: > I realized that I wasn't fully clear in this example...I would want > returned all Book objects where the Book description contains the word > 'programming', the Author name contains the word 'Smith', AND (what I > didn't explicitly say before) the book is by that author. So more > informally, I want to find all programming books who are by someone with > 'Smith' in their name. 
> This would amount to the extra join statement of
> 'books.author_id=author.id'. I'm having a really hard time figuring out
> how to do this in Ferret (using acts_as_ferret).
>
> Jon wrote:
> > I'm having trouble figuring out how to do Ferret queries across multiple
> > tables as you would in a normal SQL call. For example, let's say I have
> > two ActiveRecord classes, Book, and Author, where Book has a
> > 'description' field and Author has a 'name' field, and where Book has a
> > belongs_to relationship with Author (Book belongs to Author, Author
> > has_many Books). Let's say I'd like to find all Books with the term
> > 'programming' in their description that are by people with 'Smith' in
> > their name. I'd like to do something like this:
> >
> > Book.find_by_contents("books.description:programming AND
> > authors.name:Smith",
> > {},
> > {:include => :author})
> >
> > However, this function does not seem to allow the specification of table
> > names. This type of call would be relatively easy in plain SQL, but I'd
> > like to use Ferret for all queries to keep things uniform and to take
> > advantage of its speed. I've looked into using the multi_search option
> > also, but can't figure out how to use it to do even simple joins such as
> > this one. Any help would be greatly appreciated. Thanks.

Hi Jon,

Ferret isn't a database (yet?[1]). If you need to run queries like
this then you should index the author name in the book document and
forget about doing joins for the moment. Queries will be a lot faster
this way too. I think this is pretty easy in acts_as_ferret. Just
define the method author_name:

    # Book belongs_to a single Author, so index that author's name.
    def author_name
      author.name
    end

And make sure :author_name is in your list of fields to index.
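
To make the idea concrete without depending on a Rails app, here is a minimal plain-Ruby sketch of the denormalization described above. The Author and Book structs, to_doc, and matches? are stand-ins invented for illustration; they are not the acts_as_ferret or Ferret API. The point is only that the author's name is copied into each book's document at index time, so a single query over a single document can test both fields with no join. Since the question describes a Book that belongs to one Author, the sketch copies one name rather than mapping over a collection.

```ruby
# Illustrative stand-ins for the ActiveRecord models (assumption: a Book
# belongs to exactly one Author, as in the question).
Author = Struct.new(:name)
Book = Struct.new(:title, :description, :author) do
  # The flattened document that would be handed to the indexer. The
  # author's name is copied in here, which is what replaces the SQL join.
  def to_doc
    { :title => title, :description => description, :author_name => author.name }
  end
end

# Toy matcher standing in for a full-text query such as
# "description:programming AND author_name:smith". Every listed field
# must contain its term (case-insensitive substring match).
def matches?(doc, terms)
  terms.all? { |field, term| doc[field].to_s.downcase.include?(term.downcase) }
end

smith = Author.new('Jane Smith')
jones = Author.new('Bob Jones')
books = [
  Book.new('Ruby Basics',  'An introduction to programming', smith),
  Book.new('Garden Notes', 'Growing tomatoes',               smith),
  Book.new('C in Depth',   'Systems programming',            jones)
]

docs = books.map { |book| book.to_doc }
hits = docs.select do |doc|
  matches?(doc, { :description => 'programming', :author_name => 'smith' })
end
puts hits.map { |doc| doc[:title] }.inspect  # prints ["Ruby Basics"]
```

The trade-off is the usual one for denormalized indexes: if an author's name changes, every book document that carries a copy of it has to be reindexed.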
Cheers, Dave [1] http://www.mail-archive.com/ferret-talk at rubyforge.org/msg01183.html From dbalmain.ml at gmail.com Wed Oct 18 22:16:03 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 19 Oct 2006 11:16:03 +0900 Subject: [Ferret-talk] [Bug] Seg Faulting in index.rb:718 In-Reply-To: References: Message-ID: On 10/19/06, Ilya Grigorik wrote: > Hey, > > Ferret is repeatedly seg-faulting my mongrel servers on the same line: > > /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.11/lib/ferret/index.rb:718: > [BUG] Segmentation fault > ruby 1.8.5 (2006-08-25) [i686-linux] > > I'm using ferret 0.10.11. I haven't had the time to dig into yet, it's > on the backburner right now - I just keep several spare servers and > restart them periodically but maybe someone else can spot the problem. Hi Ilya, I'm afraid that isn't much help since the error is occuring during a call to search, meaning the segfault could be happening on any one of 50,000 lines of the C code. My first suggestion would be reindexing. If the index was built with a previous version of Ferret then that could possibly be the problem. Otherwise, if you can work out how to reproduce it consistently I'll be happy to ssh onto your server and attempt to fix the problem. Let me know how you go. Cheers, Dave From ilya at fortehost.com Thu Oct 19 00:16:48 2006 From: ilya at fortehost.com (Ilya Grigorik) Date: Thu, 19 Oct 2006 06:16:48 +0200 Subject: [Ferret-talk] [Bug] Seg Faulting in index.rb:718 In-Reply-To: References: Message-ID: <6d535845689870e08f457171400f4a96@ruby-forum.com> Dave, I've just rebuilt the index - I'll let you know if that helps at all. But, since the line number doesnt give us anything, do you have any tips for tracking down the problem? I can't seem to reproduce the error myself so I'm simply pulling the Seg Fault messages from my mongrel.log - which doesn't give much aside from the actual Seg Fault error. 
I've been trying to correlate the seg faults with my production.log, but so far, no luck. Ilya -- Posted via http://www.ruby-forum.com/. From jon at mywellnet.com Thu Oct 19 02:52:32 2006 From: jon at mywellnet.com (Jon) Date: Thu, 19 Oct 2006 08:52:32 +0200 Subject: [Ferret-talk] joins and table names in ferret In-Reply-To: References: <364d305aca479155760801707cd785a8@ruby-forum.com> Message-ID: <7e39c1d1b695a72ca99972fc5cbb6f1e@ruby-forum.com> Thanks a ton for the quick reply...very much appreciated. And thanks for the great software as well. David Balmain wrote: > On 10/19/06, Jon wrote: >> > I'm having trouble figuring out how to do Ferret queries across multiple >> > {}, >> > {:include => :author}) >> > >> > However, this function does not seem to allow the specification of table >> > names. This type of call would be relatively easy in plain SQL, but I'd >> > like to use Ferret for all queries to keep things uniform and to take >> > advantage of its speed. I've looked into using the multi_search option >> > also, but can't figure out how to use it to do even simple joins such as >> > this one. Any help would be greatly appreciated. Thanks. > > Hi Jon, > > Ferret isn't a database (yet?[1]). If you need to run queries like -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Oct 19 03:24:32 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Thu, 19 Oct 2006 16:24:32 +0900 Subject: [Ferret-talk] [ANN] Ferret Smoke Test In-Reply-To: References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com> Message-ID: On 10/19/06, Peter Royle wrote: > > Hi Pete, > > I'm not aware of anyone running Ferret with Ruby 1.8.2 so I can't say > > if it will be a problem or not. By the looks of it, it is. Could you > > post the full error message? > > I can't seem to reproduce it, but the results of that run did end up > getting posted as "Test run on 2006-10-18 22:38:15". 
Now, even with a > PATH that favours Ruby 1.8.2, the tests all seem to pass when I call > make and rake on the command line and when I run smoke_alarm.rb manually > or from cron (commenting out the no-op and form submission code). Not > sure what was going on but it seems to have gone away. Great :D. One less thing for me to worry about. > 1.8.2 is the version that comes stock with Mac OS X which is why I've > been using it so there might be others in the same boat. I'm now testing > against 1.8.4, but if you'd like someone to be testing against 1.8.2 I > could do that. There appears to be someone else testing against > ppc64/1.8.4 anyway. What do you think? > > Pete. It'd be nice to have someone testing against 1.8.2 but I wouldn't want you to have to use 1.8.2 just to run Ferret's smoke test. I'll leave it up to you. Cheers, Dave From jduflost at ben.vub.ac.be Thu Oct 19 06:19:21 2006 From: jduflost at ben.vub.ac.be (johan duflost) Date: Thu, 19 Oct 2006 12:19:21 +0200 Subject: [Ferret-talk] problem with queries Message-ID: <002801c6f368$0b3f8b60$0700000a@ORION> Hello, I upgraded to ferret 0.10.10 and I noticed a strange behaviour with queries. Now the queries return strange results. For example, the two following queries return the same results: familynames|firstnames:andre familynames|firstnames:andr Another example, the first query returns a correct result + incoherent results, the second query returns only the correct result: (familynames|firstnames:baus) (familynames|firstnames:baus*) Could you help me please? Thank you Johan From mark.puckett at gmail.com Thu Oct 19 11:55:32 2006 From: mark.puckett at gmail.com (Mark Puckett) Date: Thu, 19 Oct 2006 17:55:32 +0200 Subject: [Ferret-talk] not able to install acts_as_ferret Message-ID: I have an older version installed and want to try the latest ferret/aaf to see if it solves some perf problems, but haven't been able to get aaf on multiple tries on multiple days now. 
script/plugin install svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret svn: Can't connect to host 'projects.jkraemer.net': Operation timed out Has anybody else installed this plugin lately? Any other way to get the latest? Thanks -Mark -- Posted via http://www.ruby-forum.com/. From mark.puckett at gmail.com Thu Oct 19 12:14:29 2006 From: mark.puckett at gmail.com (Mark Puckett) Date: Thu, 19 Oct 2006 18:14:29 +0200 Subject: [Ferret-talk] Experience with ferret on Dreamhost ? In-Reply-To: References: Message-ID: I recently put a site up on dreamhost (my first on dh, and my first using ferret/aaf). I don't believe I had the problem stated below, but it's hard to tell. On occassion it appears that fastcgi isn't starting, but if I poke at it a bit (touch dispatch.fcgi) enough, it eventually will fire up. Pages without ferret appear to work fine. Search pages, however, barely/rarely work. My guess is that dreamhost is killing the fastcgi process due to high cpu and/or memory, but I'm not sure how to verify that. In script/console, I ran a search and it took about 50 seconds to return. So clearly I have something amiss with my ferret/aaf config or how I'm using it. I have about 50k rows in one table and 7 fields indexed (with only 4 of them ever being populated with data). I currently have ferret 0.9.3. I've tried upgrading to the latest, but haven't been able to get a new aaf (see post about not being able to install latest aaf) to accompany it so I'm kinda stuck there until I can get a new aaf. Anyway, if anybody has any ideas or expectations that should be set with a shared hosting plan at dreamhost, I'm all ears. Thanks -Mark anrake o. wrote: > I think I figured out this problem. All you had to do was add this line > to the top of environment.rb > > ENV['GEM_PATH'] = '/home/USERNAME/.gems' + ':/usr/lib/ruby/gems/1.8' > > The DH wiki says to put the following, but it didn't seem to work. 
> > ENV['GEM_PATH'] = File.expand_path('~/.gems') + > ':/usr/lib/ruby/gems/1.8' > > I actually created a new test project in dev. mode and saw an error > some where like "couldn't expand ~" or "unknown comand expand_path" or > something like that and just put in the absolute path as above on a > whim. Then it worked fine. > > > anrake wrote: >> Hi, I am experiencing exactly the same phenomenon. >> Everything works find on my powerbook, but not on DH. I changed >> bash_profile to add my local .gems directory and installed ferret with >> no apparent problems. I added a line to environment.rb as instructed in >> the wiki but still get the same problems when I try to deploy my new >> site (via Capistrano). Likewise I think dispatch.fcgi is not starting. >> >> Any ideas? >> >> Chris Lowis wrote: >>> Does anybody have experience with running ferret on dreamhost ? >>> >>> My app is running ok until I install the acts_as_ferret plugin, at which >>> point I get "Rails application failed to start properly" errors. I've >>> used script/console to confirm that I can require 'ferret' and make a >>> new Index object . Everything appears to be ok in that respect. >>> Unfortunately there is nothing logged in these circumstances, except : >>> >>> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: comm >>> with (dynamic) server >>> "/home/c_lowis/residence-review.com/public/dispatch.fcgi" aborted: >>> (first read) idle timeout (120 sec) >>> [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: >>> incomplete headers (0 bytes) received from server >>> "/home/c_lowis/residence-review.com/public/dispatch.fcgi" >>> >>> in the "apache" type logs that dreamhost gives me . Through trial and >>> error I am fairly sure it is ferret that is causing this, as when I >>> remove the plugin the site works ok. >>> >>> I am using ferret 0.9.5 . As far as I can see dispatch.fcgi is not >>> starting. 
>>> >>> Would appreciate any comments, >>> >>> Chris -- Posted via http://www.ruby-forum.com/. From mark.puckett at gmail.com Thu Oct 19 13:59:07 2006 From: mark.puckett at gmail.com (Mark Puckett) Date: Thu, 19 Oct 2006 19:59:07 +0200 Subject: [Ferret-talk] not able to install acts_as_ferret In-Reply-To: References: Message-ID: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com> Still can't connect, but I finally noticed the trunk export at the bottom of the aaf wiki. -Mark Mark Puckett wrote: > I have an older version installed and want to try the latest ferret/aaf > to see if it solves some perf problems, but haven't been able to get aaf > on multiple tries on multiple days now. > > script/plugin install > svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret > > svn: Can't connect to host 'projects.jkraemer.net': Operation timed out > > Has anybody else installed this plugin lately? > Any other way to get the latest? > > Thanks > -Mark -- Posted via http://www.ruby-forum.com/. From aditya_nalla at yahoo.co.in Thu Oct 19 14:13:11 2006 From: aditya_nalla at yahoo.co.in (Aditya) Date: Thu, 19 Oct 2006 20:13:11 +0200 Subject: [Ferret-talk] wrong indexing when I use disable ferret and ferret update i Message-ID: I am using acts_as_ferret plugin. ITs working great on my local but when i put it on my textdrive server indexing is not proper. If I delete the indexes created, and search then the new set of indexes seem to work fine. But when I use ferret_create(I have do so for a new entery in my db), the whole indexing thing goes wrong. I get wrong results. Is this due to character encoding? Thanks Aditya -- Posted via http://www.ruby-forum.com/. From edgargonzalez at gmail.com Thu Oct 19 15:57:00 2006 From: edgargonzalez at gmail.com (Edgar) Date: Thu, 19 Oct 2006 21:57:00 +0200 Subject: [Ferret-talk] How to deal with accentuated chars in 0.10.8? Message-ID: I'm startin to use Ferret and acts_as_ferret. 
I need to use something like EuropeanAnalyzer
(http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars).

For example, if the user searches for "gonzalez", the search should also
find documents that contain the term "gonzález".

The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter, but it
seems that this is not available in 0.10.x.

What is the way to do this?

--
Posted via http://www.ruby-forum.com/.

From adams.brad at gmail.com Thu Oct 19 16:31:57 2006
From: adams.brad at gmail.com (Brad Adams)
Date: Thu, 19 Oct 2006 22:31:57 +0200
Subject: [Ferret-talk] kill the stopwords!!!
In-Reply-To: <20060908074344.GD23939@cordoba.webit.de>
References: <64E8273D-D7CC-4DE9-8BA4-CB50DAF4D123@fhwang.net> <20060908074344.GD23939@cordoba.webit.de>
Message-ID: <5b8105f5c679ba46985c6557a80e4bd3@ruby-forum.com>

> or, with aaf:
> acts_as_ferret :analyzer => StandardAnalyzer.new([])
>

I've tried this with aaf, and it still uses stopwords. Anyone else have
this problem? I'm running Ferret 0.10.10 and the aaf plugin (as current
as today... not sure which version).

I've tried:

  acts_as_ferret :fields => [:name], :analyzer => Ferret::Analysis::StandardAnalyzer.new([])

  acts_as_ferret :fields => [:name], :analyzer => StandardAnalyzer.new([])

and even different analyzers. All of them still seem to use the
stopwords. Anyone have an idea?

--
Posted via http://www.ruby-forum.com/.

From adams.brad at gmail.com Thu Oct 19 17:27:36 2006
From: adams.brad at gmail.com (Brad Adams)
Date: Thu, 19 Oct 2006 23:27:36 +0200
Subject: [Ferret-talk] kill the stopwords!!!
In-Reply-To: <5b8105f5c679ba46985c6557a80e4bd3@ruby-forum.com>
References: <64E8273D-D7CC-4DE9-8BA4-CB50DAF4D123@fhwang.net> <20060908074344.GD23939@cordoba.webit.de> <5b8105f5c679ba46985c6557a80e4bd3@ruby-forum.com>
Message-ID: <534277142690857cc84bfded6faa8e61@ruby-forum.com>

Brad Adams wrote:
>
>> or, with aaf:
>> acts_as_ferret :analyzer => StandardAnalyzer.new([])
>>
>
> I've tried this with aaf, and it still uses stopwords.
> Anyone else have this problem? I'm running 10.10 and aaf, plugin (as
> current as today...not sure what v.).
>
> I've tried:
> acts_as_ferret :fields => [:name], :analyzer =>
> Ferret::Analysis::StandardAnalyzer.new([])
>
> acts_as_ferret :fields => [:name], :analyzer => StandardAnalyzer.new([])
>
> even different analyzers. All of them still seem to use the stopwords.
> Anyone have an idea?

I've got it to work, after countless tries with different syntax and
analyzers. It worked only when I passed 'nil':

  acts_as_ferret( { :fields => [:name] },
                  { :analyzer => Ferret::Analysis::StandardAnalyzer.new([nil]) } )

Hope that'll help anyone else who comes across this.

--
Posted via http://www.ruby-forum.com/.

From indanapt at yahoo.com Thu Oct 19 19:15:11 2006
From: indanapt at yahoo.com (Jeff Gortatowsky)
Date: Fri, 20 Oct 2006 01:15:11 +0200
Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records?
In-Reply-To:
References:
Message-ID:

David Balmain wrote:
> On 10/13/06, Jeff Gortatowsky wrote:
> Hi Jeff, this doesn't sound right at all. Could you send a partial listing
> of the directory so I can see what files are in it? Do `ls -l` so I

Below is a very, very partial listing. My environment is Windows XP
Pro. The versions of my gems are listed below as well. Basically, I
accessed the first model object and called model.save! to kick off the
indexing, which it did. BTW: this is SQL Server, if it matters.

BTW: searching the index works. Well... I found out when I called
highlight() that I never get anything back. Looking at the source code
and my fields, I find I must have set (or defaulted to) :store => :no,
so I have to retrieve the row, iterate over the fields myself to find
out which field matched, and then display the results. That is not
pretty, but I have to admit it's painless.

Still, 25,000 records made 28,000+ files. Can you imagine all 8.1
million records!! Is it because one of the fields being indexed is
always unique (think User ID/primary key)?
I was going to try it in Lucene and see what happens. I figure if it is
different, then something odd must be going on in Ferret/AaF. Plus I can
try native Ferret to create the index and forgo AaF for the initial
index creation (assuming that is a 'fix').

Thank you for any time and effort. I am becoming quite a
Ruby/Rails/Ferret fan for prototyping. I can't say that I am ready for
Rails in my production environment hosting 40k logged-in users a night,
but it's wonderful for concept exploration.

Here is the partial listing (these entries are representative of all
the other files, except for the last two, which are the only ones of
their kind). After the listing are my gem versions.

10/11/2006  08:23 PM             1,300 _z.cfs
10/11/2006  08:25 PM             1,314 _z0.cfs
10/11/2006  08:25 PM             1,705 _z1.cfs
10/11/2006  08:26 PM             3,039 _z2.cfs
10/11/2006  08:26 PM               970 _z3.cfs
10/11/2006  08:26 PM             3,015 _z4.cfs
10/11/2006  08:26 PM            14,266 _z5.cfs
10/11/2006  08:26 PM               770 _z6.cfs
10/11/2006  08:26 PM               815 _z7.cfs
10/11/2006  08:26 PM             1,150 _z8.cfs
10/11/2006  08:26 PM             1,564 _z9.cfs
10/11/2006  08:26 PM             2,283 _za.cfs
10/11/2006  08:26 PM             1,259 _zb.cfs
10/11/2006  08:26 PM             1,598 _zc.cfs
10/11/2006  08:26 PM             1,655 _zd.cfs
10/11/2006  08:26 PM             5,466 _ze.cfs
10/11/2006  08:26 PM             1,242 _zf.cfs
10/11/2006  08:26 PM            13,609 _zg.cfs
10/11/2006  08:26 PM             2,081 _zh.cfs
10/11/2006  08:26 PM             1,101 _zi.cfs
10/11/2006  08:26 PM             1,053 _zj.cfs
10/11/2006  08:26 PM             2,208 _zk.cfs
10/11/2006  08:26 PM               920 _zl.cfs
10/11/2006  08:26 PM             3,003 _zm.cfs
10/11/2006  08:26 PM             2,148 _zn.cfs
10/11/2006  08:26 PM             1,195 _zo.cfs
10/11/2006  08:26 PM             1,707 _zp.cfs
10/11/2006  08:26 PM             1,747 _zq.cfs
10/11/2006  08:26 PM            12,889 _zr.cfs
10/11/2006  08:26 PM             2,531 _zs.cfs
10/11/2006  08:26 PM             1,359 _zt.cfs
10/11/2006  08:26 PM             2,330 _zu.cfs
10/11/2006  08:26 PM             1,793 _zv.cfs
10/11/2006  08:26 PM             1,788 _zw.cfs
10/11/2006  08:26 PM             3,135 _zx.cfs
10/11/2006  08:26 PM             2,603 _zy.cfs
10/11/2006  08:26 PM             2,210 _zz.cfs
10/12/2006  08:39 AM               213 fields
10/12/2006  08:40 AM                29 segments
           28381 File(s)    261,021,758 bytes
               2 Dir(s)  35,192,127,488 bytes free

actionmailer (1.2.5)
actionpack (1.12.5)
actionwebservice (1.1.6)
activerecord (1.14.4)
activesupport (1.3.1)
ferret (0.10.9)
fxri (0.3.3)
fxruby (1.6.2, 1.6.1, 1.6.0, 1.2.6)
gem_plugin (0.2.1)
log4r (1.0.5)
mongrel (0.3.13.3)
rails (1.1.6)
rake (0.7.1)
sources (0.0.1)
win32-clipboard (0.4.1, 0.4.0)
win32-dir (0.3.0)
win32-eventlog (0.4.2, 0.4.1)
win32-file (0.5.2)
win32-file-stat (1.2.2)
win32-process (0.5.1, 0.4.2)
win32-sapi (0.1.3)
win32-service (0.5.0)
win32-sound (0.4.0)
windows-pr (0.5.4, 0.5.1)

--
Posted via http://www.ruby-forum.com/.

From john at johnleach.co.uk Thu Oct 19 19:22:15 2006
From: john at johnleach.co.uk (John Leach)
Date: Fri, 20 Oct 2006 00:22:15 +0100
Subject: [Ferret-talk] problem with queries
In-Reply-To: <002801c6f368$0b3f8b60$0700000a@ORION>
References: <002801c6f368$0b3f8b60$0700000a@ORION>
Message-ID: <1161300136.6010.38.camel@localhost.localdomain>

Hi Johan,

I seemed to have a similar problem with 0.10.10. A search for "monkey"
would produce correct results + incoherent results, whereas a search
for "monkey~" returns only the correct result. Setting various default
slop values did not help.

Anyway, before I could confirm this wasn't something dumb *I* was
doing, 0.10.11 came out and solved my problem. So try upgrading (the
gem is available).

John.
--
http://johnleach.co.uk

On Thu, 2006-10-19 at 12:19 +0200, johan duflost wrote:
> I upgraded to ferret 0.10.10 and I noticed a strange behaviour with queries.
> Another example, the first query returns a correct result + incoherent
> results, the second query returns only the correct result:
>
> (familynames|firstnames:baus)
>
> (familynames|firstnames:baus*)

From indanapt at yahoo.com Thu Oct 19 19:22:49 2006
From: indanapt at yahoo.com (Jeff Gortatowsky)
Date: Fri, 20 Oct 2006 01:22:49 +0200
Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records?
In-Reply-To: <20061013083100.GA14271@cordoba.webit.de> References: <20061013083100.GA14271@cordoba.webit.de> Message-ID: <83c1f81f8a51db26bf47a962d497d05e@ruby-forum.com> Jens Kraemer wrote: > To keep the index creation from happening when the index is accessed the > first time from your app (could be a search, or some update/create > operation), you can build up the index from the console, i.e. > > RAILS_ENV=production script/console >>> Model.rebuild_index > > cheers, > Jens Thank you Jens. While all you say is true, the original rowset was over 8.000.000 rows and would have taken days or more. I just wanted to do some experimentation to see if my code would work. AaF is not that well documented (well perhaps for those smarter than I) and therefore I thought my best best was to play with it in the console. Little did I realize it would go off and index the table to start. And while you are correct that most of the time, you want an index, really there are use cases where you only want to index data from that time forward. Anyway I created a much smaller table and worked with it to start. However it created 28.000 files for 25.000 records. Still not quite right. But it does work in that I can search it. BTW: Is there a method to say only return fields from the documents that matched and not all the fields of documents that had matches? Of course I did my own filter. Best wishes and thank you for the advice and counsel. Jeff -- Posted via http://www.ruby-forum.com/. From neeraj.jsr at gmail.com Thu Oct 19 21:04:23 2006 From: neeraj.jsr at gmail.com (Raj Singh) Date: Fri, 20 Oct 2006 03:04:23 +0200 Subject: [Ferret-talk] Error : End-of-File Error occured at In-Reply-To: <453524DC.1080800@benjaminkrause.com> References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> <742d3d03a03681833faeee146ccb259f@ruby-forum.com> <453524DC.1080800@benjaminkrause.com> Message-ID: Thanks. Rebuilding the index solved the problem. I'm running ferret 0.10.9. 
-=-Raj Benjamin Krause wrote: > Raj Singh schrieb: >> It might sound stupid question but I don't have an answer. How do I find >> what version of ferret is installed on my server. >> >> I couldn't install ferret on windows and hence I use ferret installed on >> the hosting server. Is there a particular command that I could execute >> to find what version of Ferret is running or should I ask this question >> to the admin at hosting. >> > Hey .. > > There're several ways.. if you installed ferret as gem (the suggested > and default way) try: > > benjamin at home ~ $ gem list ferret > > *** LOCAL GEMS *** > > ferret (0.10.9) > Ruby indexing library. > > > If you installed it via svn checkout, try this: > > benjamin at home ~/trunk $ script/console > Loading development environment. >>> Ferret::VERSION > => "0.10.9" > > > Ben -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Fri Oct 20 01:55:17 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 20 Oct 2006 14:55:17 +0900 Subject: [Ferret-talk] How to deal with accentuated chars in 0.10.8? In-Reply-To: References: Message-ID: On 10/20/06, Edgar wrote: > I'm startin to use Ferret and acts_as_ferret. > > I need to use something like EuropeanAnalyzer > (http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars). > > By example, if the user search by "gonzalez" you can find documents taht > contents the term "gonz?lez" (gonzález) > > The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter, but > seems that in 0.10.x this is not available. > > What is the way to do this ? # try this. Make sure you use the -KU flag. require 'rubygems' require 'ferret' require 'jcode' ACCENTUATED_CHARS = '???A?????a????????????????' REPLACEMENT_CHARS = 'aaaaaaaaaaooooeeeeeeeeuuuc' module Ferret::Analysis class TokenFilter < TokenStream # Construct a token stream filtering the given input. 
def initialize(input) @input = input end end # replace accentuated chars with ASCII one class ToASCIIFilter < TokenFilter def next() token = @input.next() unless token.nil? token.text = token.text.downcase.tr(ACCENTUATED_CHARS, REPLACEMENT_CHARS) end token end end class EuropeanAnalyzer def token_stream(field, string) return ToASCIIFilter.new(StandardTokenizer.new(string)) end end end analyzer = Ferret::Analysis::EuropeanAnalyzer.new ts = analyzer.token_stream('xxx', "Let's see what " + "happens to ???A?????a????????????????") while t = ts.next puts t end From dbalmain.ml at gmail.com Fri Oct 20 02:07:45 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 20 Oct 2006 15:07:45 +0900 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? In-Reply-To: <83c1f81f8a51db26bf47a962d497d05e@ruby-forum.com> References: <20061013083100.GA14271@cordoba.webit.de> <83c1f81f8a51db26bf47a962d497d05e@ruby-forum.com> Message-ID: On 10/20/06, Jeff Gortatowsky wrote: > Jens Kraemer wrote: > > > To keep the index creation from happening when the index is accessed the > > first time from your app (could be a search, or some update/create > > operation), you can build up the index from the console, i.e. > > > > RAILS_ENV=production script/console > >>> Model.rebuild_index > > > > cheers, > > Jens > > > Thank you Jens. While all you say is true, the original rowset was over > 8.000.000 rows and would have taken days or more. I just wanted to do > some experimentation to see if my code would work. AaF is not that well > documented (well perhaps for those smarter than I) and therefore I > thought my best best was to play with it in the console. Little did I > realize it would go off and index the table to start. > > And while you are correct that most of the time, you want an index, > really there are use cases where you only want to index data from that > time forward. 
That may be true but I don't think the goal of acts_as_ferret should be
to cover all possible use cases. Its job is to make using Ferret with
ActiveRecord as easy as possible. If you need to do anything more
complicated than usual then why not just use Ferret directly?

> Anyway I created a much smaller table and worked with it
> to start. However it created 28.000 files for 25.000 records. Still not
> quite right. But it does work in that I can search it.

There is something very wrong there but I have no idea what the problem
is. For some reason Ferret doesn't seem to be merging the index segments
(judging by your following email).

> BTW: Is there a method to say only return fields from the documents that
> matched and not all the fields of documents that had matches? Of course
> I did my own filter.

Ferret documents are lazy loading so only the fields that you view get
loaded. However, there is currently no way to find out which fields
matched.

Cheers,
Dave

From dbalmain.ml at gmail.com Fri Oct 20 02:35:14 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Fri, 20 Oct 2006 15:35:14 +0900
Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records?
In-Reply-To:
References:
Message-ID:

On 10/20/06, Jeff Gortatowsky wrote:
> David Balmain wrote:
> > On 10/13/06, Jeff Gortatowsky wrote:
>
> > Hi Jeff, this doesn't sound right at all. Could send a partial listing
> > of the directory so I can see what files are in it? Do `ls -l` so I
>
>
> Below is a very very very partial listing. My env is Windows XP Pro. THe
> verions of gems is listed below as well. Basically I accessed the first
> model object and said model.save! to kick off the indexing. Which it
> did. BTW: this is SQLServer if it matters. BTW: The searching the index
> works. well...

Ahhh. I've had this problem in Windows before but I thought it was
fixed. For some reason the operating system mustn't be allowing Ferret
to delete the index files when it is finished with them.
I'm not sure why this would be happening though. If none of the old
segment files were ever deleted, this would give us approximately
25_000 + 2500 + 250 + 25 + 2 = 27777 files after merging. This is still
short of the 28300 files you have though. :-(

> I found out when I asked to highlight() that I never get anything back.
> Looking at the soruce code and my fields I find I must have had (or
> defaulted) to :store=>no so I have to retrieve the row, iterate myself
> over the fields to find out which field matched, and then display the
> results. That is not pretty but I have to admit, it's painless.

This is one of the reasons I want to implement a database based on
Ferret, so that operations like this will be very simple. I could add a
highlighting method for externally stored fields but you need to store
term vectors for the highlighting to work exactly (i.e. for stemmed terms
and matching sloppy phrases exactly), so if you are storing
term_vectors, you may as well store the field as well. For externally
stored fields the highlighting method you are using is best.

> Still
> 25,000 records made 28,000+ files. Can you imagine all 8.1 million
> records!! Is it because one of the fields being indexed is always unique
> (think User ID/Primary key)?

No, I think the majority of those files are obsolete. In fact I'm not
sure if Windows would even allow you to open that many files at once
(and Ferret does open all of the files in the index directory). If you
open up the segments file you'll see a list of the segments that are
actually still being used by Ferret (along with a bunch of binary data).
Given that your segments file is only 29 bytes, I'm guessing that you
have optimized your index and you only have one valid index segment. The
rest is junk.

For the record I indexed 2,000,000 records the other day (approximately
4000kb each) in 2 1/2 hours and I had at most 120 files in my index
directory.

> I was going to trying it in Lucene and see what happens.
I figure if it > is different, the must be doing something odd in Ferret/AaF. Plus I can > try native ferret to create the index and forego AaF for the initial > index creation (assuming that is a 'fix'). Lucene actually records a list of files it fails to delete and continues to try and delete those files. It's a bit of a hack and I was hoping to get away with not doing that in Ferret. Looks like I was wrong. I wonder why it works for me and not for you. I have XP Home edition so it should be the same. > Thank you for any time and effort. I am becoming quite a > Ruby/Rails/Ferret fan for prototyping. I can say as I am ready for Rails > on my production envronoment hosting 40k logged in users a night, but > it's wonderful for concept exploration. > > Here is the partial listing (they are representational of all the other > files except for the last two of which they are the only ones). After > the listing is my gems versions > > 10/11/2006 08:23 PM 1,300 _z.cfs > 10/11/2006 08:25 PM 1,314 _z0.cfs > 10/11/2006 08:25 PM 1,705 _z1.cfs > 10/11/2006 08:26 PM 3,039 _z2.cfs > 10/11/2006 08:26 PM 970 _z3.cfs > 10/11/2006 08:26 PM 3,015 _z4.cfs > 10/11/2006 08:26 PM 14,266 _z5.cfs > 10/11/2006 08:26 PM 770 _z6.cfs > 10/11/2006 08:26 PM 815 _z7.cfs > 10/11/2006 08:26 PM 1,150 _z8.cfs > 10/11/2006 08:26 PM 1,564 _z9.cfs > 10/11/2006 08:26 PM 2,283 _za.cfs > 10/11/2006 08:26 PM 1,259 _zb.cfs > 10/11/2006 08:26 PM 1,598 _zc.cfs > 10/11/2006 08:26 PM 1,655 _zd.cfs > 10/11/2006 08:26 PM 5,466 _ze.cfs > 10/11/2006 08:26 PM 1,242 _zf.cfs > 10/11/2006 08:26 PM 13,609 _zg.cfs > 10/11/2006 08:26 PM 2,081 _zh.cfs > 10/11/2006 08:26 PM 1,101 _zi.cfs > 10/11/2006 08:26 PM 1,053 _zj.cfs > 10/11/2006 08:26 PM 2,208 _zk.cfs > 10/11/2006 08:26 PM 920 _zl.cfs > 10/11/2006 08:26 PM 3,003 _zm.cfs > 10/11/2006 08:26 PM 2,148 _zn.cfs > 10/11/2006 08:26 PM 1,195 _zo.cfs > 10/11/2006 08:26 PM 1,707 _zp.cfs > 10/11/2006 08:26 PM 1,747 _zq.cfs > 10/11/2006 08:26 PM 12,889 _zr.cfs > 10/11/2006 08:26 
PM 2,531 _zs.cfs > 10/11/2006 08:26 PM 1,359 _zt.cfs > 10/11/2006 08:26 PM 2,330 _zu.cfs > 10/11/2006 08:26 PM 1,793 _zv.cfs > 10/11/2006 08:26 PM 1,788 _zw.cfs > 10/11/2006 08:26 PM 3,135 _zx.cfs > 10/11/2006 08:26 PM 2,603 _zy.cfs > 10/11/2006 08:26 PM 2,210 _zz.cfs > 10/12/2006 08:39 AM 213 fields > 10/12/2006 08:40 AM 29 segments > 28381 File(s) 261,021,758 bytes > 2 Dir(s) 35,192,127,488 bytes fre > > > > actionmailer (1.2.5), actionpack (1.12.5), actionwebservice (1.1.6) > activerecord (1.14.4), activesupport (1.3.1), ferret (0.10.9), > fxri (0.3.3), fxruby (1.6.2, 1.6.1, 1.6.0, 1.2.6), gem_plugin (0.2.1) > log4r (1.0.5), mongrel (0.3.13.3) > rails (1.1.6), rake (0.7.1) > sources (0.0.1), win32-clipboard (0.4.1, 0.4.0) > win32-dir (0.3.0) > win32-eventlog (0.4.2, 0.4.1) > win32-file (0.5.2) > win32-file-stat (1.2.2) > win32-process (0.5.1, 0.4.2) > win32-sapi (0.1.3) > win32-service (0.5.0) > win32-sound (0.4.0) > windows-pr (0.5.4, 0.5.1) > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From dbalmain.ml at gmail.com Fri Oct 20 02:36:49 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 20 Oct 2006 15:36:49 +0900 Subject: [Ferret-talk] problem with queries In-Reply-To: <1161300136.6010.38.camel@localhost.localdomain> References: <002801c6f368$0b3f8b60$0700000a@ORION> <1161300136.6010.38.camel@localhost.localdomain> Message-ID: On 10/20/06, John Leach wrote: > Hi Johan, > > I seemed to have a similar problem with 0.10.10. A search for "monkey" > would produce correct results + incoherent results, whereas a search for > "monkey~" returns only the correct result. > > Setting various default slop values helped none. > > Anyway, before I could confirm this wasn't something dumb *I* was doing, > 0.10.11 came out and solved my problem. So try upgrading (the gem is > available). That's right. 
0.10.10 was a buggy release. Go for 0.10.11 or later. (0.10.12 is coming
in an hour)

From jbordier at rift.fr Fri Oct 20 08:10:26 2006
From: jbordier at rift.fr (ahFeel)
Date: Fri, 20 Oct 2006 14:10:26 +0200
Subject: [Ferret-talk] Big problem with 0.10.12 gem :O
In-Reply-To: <9d479328d9ae9d45f239027b0d16705b@ruby-forum.com>
References: <9d479328d9ae9d45f239027b0d16705b@ruby-forum.com>
Message-ID:

> Here's the dump...

http://pastie.caboo.se/18686

ahFeel <= here's the dumb :/

Cheers :)
Jérémie 'ahFeel' BORDIER

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Fri Oct 20 09:18:47 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 20 Oct 2006 15:18:47 +0200
Subject: [Ferret-talk] not able to install acts_as_ferret
In-Reply-To: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com>
References: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com>
Message-ID: <20061020131847.GA24958@cordoba.webit.de>

Hi Mark,

are you sure there's no firewall blocking things on your side?
The svn: protocol uses port 3690; sometimes this is a problem with
restrictive corporate networks and such.

Jens

On Thu, Oct 19, 2006 at 07:59:07PM +0200, Mark Puckett wrote:
> Still can't connect, but I finally noticed the trunk export at the
> bottom of the aaf wiki. -Mark
>
> Mark Puckett wrote:
> > I have an older version installed and want to try the latest ferret/aaf
> > to see if it solves some perf problems, but haven't been able to get aaf
> > on multiple tries on multiple days now.
> >
> > script/plugin install
> > svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret
> >
> > svn: Can't connect to host 'projects.jkraemer.net': Operation timed out
> >
> > Has anybody else installed this plugin lately?
> > Any other way to get the latest?
> >
> > Thanks
> > -Mark
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From dbalmain.ml at gmail.com Fri Oct 20 09:46:13 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Fri, 20 Oct 2006 22:46:13 +0900
Subject: [Ferret-talk] [ANN] Ferret 0.10.13 released
Message-ID:

Hi Folks,

I've just released Ferret 0.10.13 (skip 0.10.12, it was a bad build).
There are two interesting additions to this release. You can now access
the Filter#bits method of the built-in filters so you can use them in
your own code, possibly within your own custom filters. For example you
could implement a custom filter like so:

    class MultiFilter < Hash
      def bits(index_reader)
        bit_vector = Ferret::Utils::BitVector.new.not!
        filters = self.values
        filters.each {|filter| bit_vector.and!(filter.bits(index_reader))}
        bit_vector
      end
    end

And you would use it like this:

    mf = MultiFilter.new
    mf[:category] = category_filter
    mf[:country] = country_filter

    # run the query with the filter
    index.search(query, :filter => mf)

    # filters can be changed and deleted
    mf[:category] = new_category_filter
    mf.delete(:country)
    index.search(query, :filter => mf)

The other major addition is a MappingFilter (< TokenFilter). This can be
used to transform your text from UTF-8 to ASCII, for example. I posted an
example of how to do this earlier today. However, using the mapping
filter you can apply a list of string mappings rather than just
character mappings. Obviously you could achieve this with a list of
"String#gsub!"s but MappingFilter will compile the mappings into a DFA
so it will be a *lot* faster.
Here is an example:

    include Ferret::Analysis
    class EuropeanAnalyzer
      MAPPING = {
        ['?', '?', '?', 'A', '?', '?', '?', '?', '?', 'a'] => 'a',
        ['?', '?', '?', '?'] => 'o',
        ['?', '?', '?', '?', '?', '?', '?', '?'] => 'e',
        ['?', '?', '?'] => 'u',
        ['?'] => 'c'
      }
      def token_stream(field, string)
        return MappingFilter.new(StandardTokenizer.new(string), MAPPING)
      end
    end

Happy Ferreting and check the Ferret homepage[1] if you are able to
contribute.

Cheers,
Dave

[1] http://ferret.davebalmain.com/trac/

From dbalmain.ml at gmail.com Fri Oct 20 09:55:08 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Fri, 20 Oct 2006 22:55:08 +0900
Subject: [Ferret-talk] Big problem with 0.10.12 gem :O
In-Reply-To:
References: <9d479328d9ae9d45f239027b0d16705b@ruby-forum.com>
Message-ID:

On 10/20/06, ahFeel wrote:
> > Here's the dump...
>
> http://pastie.caboo.se/18686
>
> ahFeel <= here's the dumb :/
>
> Cheers :)
> Jérémie 'ahFeel' BORDIER

Thanks Jérémie,

I tried deleting the release but it is still on the gemserver. Anyway,
I've released 0.10.13.

Cheers,
Dave

From edgargonzalez at gmail.com Fri Oct 20 10:26:35 2006
From: edgargonzalez at gmail.com (Edgar)
Date: Fri, 20 Oct 2006 16:26:35 +0200
Subject: [Ferret-talk] How to deal with accentuated chars in 0.10.8?
In-Reply-To:
References:
Message-ID: <442beae86c479c24222e14df04a36abb@ruby-forum.com>

David,

Thanks for the tip, but I'll try your latest release (0.10.13) :-)

--
Posted via http://www.ruby-forum.com/.

From neeraj.jsr at gmail.com Fri Oct 20 11:01:04 2006
From: neeraj.jsr at gmail.com (Raj Singh)
Date: Fri, 20 Oct 2006 17:01:04 +0200
Subject: [Ferret-talk] Error : End-of-File Error occured at
In-Reply-To:
References: <1567e15bddc1e640d2b2e17e3411c84f@ruby-forum.com> <742d3d03a03681833faeee146ccb259f@ruby-forum.com> <453524DC.1080800@benjaminkrause.com>
Message-ID: <7bdd7e18d9a394b1b19d82ed22166196@ruby-forum.com>

This problem is back and now I know the pattern.

I rebuilt the index and things started working.
Then I started adding events again to the application. After adding 60/70 events the problem was back. I got the exception because of End-of-File Error occured at Then I rebuilt the index and it started working. Again I had the same issue after I added 50/60 records. Am I missing something here. I am using ferret 0.10.9 and the latest acts_as_ferret plugin. Thanks Raj Singh wrote: > Thanks. > > Rebuilding the index solved the problem. I'm running ferret 0.10.9. > > -=-Raj > > Benjamin Krause wrote: >> Raj Singh schrieb: >>> It might sound stupid question but I don't have an answer. How do I find >>> what version of ferret is installed on my server. >>> >>> I couldn't install ferret on windows and hence I use ferret installed on >>> the hosting server. Is there a particular command that I could execute >>> to find what version of Ferret is running or should I ask this question >>> to the admin at hosting. >>> >> Hey .. >> >> There're several ways.. if you installed ferret as gem (the suggested >> and default way) try: >> >> benjamin at home ~ $ gem list ferret >> >> *** LOCAL GEMS *** >> >> ferret (0.10.9) >> Ruby indexing library. >> >> >> If you installed it via svn checkout, try this: >> >> benjamin at home ~/trunk $ script/console >> Loading development environment. >>>> Ferret::VERSION >> => "0.10.9" >> >> >> Ben -- Posted via http://www.ruby-forum.com/. From jbordier at rift.fr Fri Oct 20 11:02:26 2006 From: jbordier at rift.fr (ahFeel) Date: Fri, 20 Oct 2006 17:02:26 +0200 Subject: [Ferret-talk] Bug in search matching ? Message-ID: <934d68f32a7a8f98fc139adc745963cd@ruby-forum.com> Hi :) Here's a little code reproducing something that i consider as a bug, if it's not please explain :] http://pastie.caboo.se/18693 Thanks by advance, Cheers, J?r?mie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. 
From dbalmain.ml at gmail.com Fri Oct 20 11:19:38 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Sat, 21 Oct 2006 00:19:38 +0900
Subject: [Ferret-talk] [Bug] Seg Faulting in index.rb:718
In-Reply-To: <6d535845689870e08f457171400f4a96@ruby-forum.com>
References: <6d535845689870e08f457171400f4a96@ruby-forum.com>
Message-ID:

On 10/19/06, Ilya Grigorik wrote:
> Dave,
>
> I've just rebuilt the index - I'll let you know if that helps at all.
> But, since the line number doesnt give us anything, do you have any tips
> for tracking down the problem? I can't seem to reproduce the error
> myself so I'm simply pulling the Seg Fault messages from my mongrel.log
> - which doesn't give much aside from the actual Seg Fault error. I've
> been trying to correlate the seg faults with my production.log, but so
> far, no luck.
>
> Ilya

You could try recompiling Ferret with the option -dH (as long as you are
using gcc). This will cause Ferret to dump core when it segfaults. You
can then use gdb to try and find the error or email me the core dump.
There are instructions for building your own gem here:

http://ferret.davebalmain.com/trac/wiki/DownloadCurrent

Cheers,
Dave

From dbalmain.ml at gmail.com Fri Oct 20 11:21:11 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Sat, 21 Oct 2006 00:21:11 +0900
Subject: [Ferret-talk] not able to install acts_as_ferret
In-Reply-To: <20061020131847.GA24958@cordoba.webit.de>
References: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com> <20061020131847.GA24958@cordoba.webit.de>
Message-ID:

On 10/20/06, Jens Kraemer wrote:
> Hi Mark,
>
> are you sure there's no firewall blocking things on your side ?
> The svn: protocol uses Port 3690, sometimes this is a problem with
> restrictive corporate networks and such.
>
> Jens

FYI I can connect from here too. Must be a firewall issue.
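One rough way to check the firewall theory from the affected machine is to try a plain TCP connection to the svn port. The sketch below uses only Ruby's standard library; `port_reachable?` is a made-up helper name, and the demo talks to a throwaway local listener rather than the real server so that it runs anywhere.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper (not part of svn or acts_as_ferret): returns true if
# a TCP connection to host:port succeeds within `seconds`, false otherwise.
def port_reachable?(host, port, seconds = 5)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue Timeout::Error, SocketError, SystemCallError
  # Timed out, name did not resolve, or the connection was refused/blocked.
  false
end

# Demo against a local listener; port 0 asks the OS for any free port.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
puts port_reachable?('127.0.0.1', port)  # while the listener is open
server.close
puts port_reachable?('127.0.0.1', port)  # after the listener is closed
```

For the case in this thread the call would presumably be `port_reachable?('projects.jkraemer.net', 3690)`. A false result only says the port could not be reached from that machine; it does not say which hop is blocking it.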
From dbalmain.ml at gmail.com Fri Oct 20 12:05:57 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Sat, 21 Oct 2006 01:05:57 +0900
Subject: [Ferret-talk] Bug in search matching ?
In-Reply-To: <934d68f32a7a8f98fc139adc745963cd@ruby-forum.com>
References: <934d68f32a7a8f98fc139adc745963cd@ruby-forum.com>
Message-ID:

On 10/21/06, ahFeel wrote:
> Hi :)
>
> Here's a little code reproducing something that i consider as a bug, if
> it's not please explain :]
>
> http://pastie.caboo.se/18693

Hi Jérémie,

You can get rid of this behaviour by building your own analyzer and not
including the HyphenFilter. This is a tricky issue which I haven't quite
worked out yet. For example, when you search for "set-up" do you want
that to match "set up" and "setup"? What if you search for "setup" or
"set up"? Should they match all three versions too? With the current
HyphenFilter, all three versions in queries will match all three
versions in the index. However, this comes at the loss of precision.

The problems occur during phrase queries. To make it so that "set-up"
matches both "setup" and "set up", "set-up" is analyzed as "set up" and
"setup", so in the first position there are two words in the
token stream: "set" and "setup". When I parse the phrase "set-up files" I
get the two phrases:

    "set____up__files"
    "setup______files"

So as you can see the second phrase only has two terms, so there is a
gap in between. To get the phrase "setup files" to match this I need to
give it a slop value.

Now I realize the solution is not ideal. I've had to forsake some
precision for a gain in recall but I can't think of a better way. If you
can come up with a fool-proof way to handle hyphenated terms I'd love to
hear it. I will probably remove the HyphenFilter from the StandardFilter
in a future version if I can't think of a better way to do this.

By the way, for the people reading this who think that "setup" is not a
word, I agree, so consider "e-mail" and "email" instead.
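To make the position gap concrete, here is a toy model in plain Ruby. It does not call Ferret and is not how HyphenFilter is actually implemented; `hyphen_readings` and `gap_between` are invented names, and storing term => position in a hash assumes no term repeats within the phrase.

```ruby
# Toy model only: builds the two readings of a phrase described above.
# Each reading maps a term to its position in the token stream.
def hyphen_readings(phrase)
  split = {}   # "set-up" read as two terms: "set", "up"
  joined = {}  # "set-up" read as one term: "setup"
  position = 0
  phrase.split.each do |word|
    parts = word.split("-")
    if parts.size > 1
      # The joined term shares the position of the first part ...
      joined[parts.join] = position
      # ... while the parts occupy consecutive positions.
      parts.each_with_index { |part, i| split[part] = position + i }
      position += parts.size
    else
      split[word] = position
      joined[word] = position
      position += 1
    end
  end
  [split, joined]
end

# Number of empty positions between two terms in one reading.
def gap_between(reading, first, second)
  reading[second] - reading[first] - 1
end

split, joined = hyphen_readings("set-up files")
puts split.inspect
puts joined.inspect
puts gap_between(split, "up", "files")
puts gap_between(joined, "setup", "files")
```

In this model the split reading has no gap before "files", while the joined reading leaves one empty position between "setup" and "files". That empty position is what the slop value has to absorb, and it is also why unrelated terms can slip into that slot and still match.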
Cheers, Dave PS: I've pasted the code below for reference. I'm not sure how long the pasties stick around for. require 'rubygems' require 'ferret' path = "/tmp/index" system("rm -rf #{path}; mkdir -p #{path}") index = Ferret::Index::Index.new(:path => path) index << {:type => :bug, :name => 'foo-bar'} index << {:type => :bug, :name => 'foo-bar-core'} queries = ['foo-bar', 'foo-core'] queries.each do |name| query = "type:bug AND name:#{name}" puts "\nquery : #{query}" res = index.search(query) puts "total hits = #{res.total_hits}" res.hits.each { |x| p index[x.doc].load.inspect } end From indanapt at yahoo.com Fri Oct 20 12:08:21 2006 From: indanapt at yahoo.com (Jeff Gortatowsky) Date: Fri, 20 Oct 2006 18:08:21 +0200 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? In-Reply-To: References: Message-ID: Thanks everyone for you help. Of source I meant no disrespect about AaF indexing tables as soon as it discovers there is none. I am quite appreciative of the Plugin as it is. Please don't think I don't appreciate it. :) And I really thank you all for the help C:\ruby\omi\index\development\string_answer>od -c segments 0000000 \0 \0 \0 \0 \0 \0 \0 \0 E . # ? \0 \0 \0 \0 0000020 \0 \0 n ? 001 004 _ l w a + ? 001 0000035 So am I correct in say that the file _lwa.cfs is the only file really needed? Thanks again. It's great to see that it really worked and that the only problem is 'Windoze' related. I would be working on Ubuntu but the SQLServer adapter I tried there could not page through data sets. -- Posted via http://www.ruby-forum.com/. From jbordier at rift.fr Fri Oct 20 13:35:49 2006 From: jbordier at rift.fr (ahFeel) Date: Fri, 20 Oct 2006 19:35:49 +0200 Subject: [Ferret-talk] Bug in search matching ? In-Reply-To: References: <934d68f32a7a8f98fc139adc745963cd@ruby-forum.com> Message-ID: <59abed2e61e3a63b47f8813e6537c87f@ruby-forum.com> Hi Dave ! 
Thank you for your answer, i've totally understood the matter and i must say that's quite annoying... I guess that you won't satisfy everyone with removing this feature, because it really depends on the application you wanna run.. Someone running something like a wiki would like to get the same results with e-mail and email, that's correct and it's of course a good feature, but in my case, i really don't want my query "name:package-dev" to send back 'package-foobar-dev' etc... that's a really big problem for me. Actually, i think using different operators calling different parsing methods could be a *correct* solution, like "type:e-mail" would match email and 'e mail' and "type=e-mail" would only match "e-mail" (in regexp: /^e-mail.*/). The '=' operator is quite self explanatory for exact pattern matching, so it could be easy to understand... It could be a way to keep the flexibility of the current search matching method, and to include a more strict pattern method for those who needs that.. Anyway, Thank you for the solution ! Cheers, Jeremie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/. From edgargonzalez at gmail.com Fri Oct 20 13:46:50 2006 From: edgargonzalez at gmail.com (Edgar) Date: Fri, 20 Oct 2006 19:46:50 +0200 Subject: [Ferret-talk] Newbie question - search by "cancion" found "canció n" Message-ID: My documents contains acuted characters (á é í etc ) It's possible in ferret that searching by "cancion" term found the documents contains the words "canción" and "cancion" ? -- Posted via http://www.ruby-forum.com/. From indanapt at yahoo.com Fri Oct 20 13:48:19 2006 From: indanapt at yahoo.com (Jeff Gortatowsky) Date: Fri, 20 Oct 2006 19:48:19 +0200 Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records? In-Reply-To: References: Message-ID: <5ec848bf99aca41dd92406f27e9e9543@ruby-forum.com> Jeff Gortatowsky wrote: > Thanks everyone for you help. Sorry for all the typos. 
To place an ending on this I did indeed move all the files out except
fields, segments, and _lwa.cfs and it -seems- to be working. If I have a
moment to look at where I may have created a problem closing files, I
will. And I can easily try recreating the index on Ubuntu as it involves
no paging (which I only deal with when interacting with end users).

Thanks Dave and Jens for the advice and counsel.

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Fri Oct 20 13:58:22 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Sat, 21 Oct 2006 02:58:22 +0900
Subject: [Ferret-talk] Newbie question: 28000+ files for 25000+ records?
In-Reply-To:
References:
Message-ID:

On 10/21/06, Jeff Gortatowsky wrote:
> Thanks everyone for you help. Of source I meant no disrespect about AaF
> indexing tables as soon as it discovers there is none. I am quite
> appreciative of the Plugin as it is. Please don't think I don't
> appreciate it. :)
>
> And I really thank you all for the help
>
> C:\ruby\omi\index\development\string_answer>od -c segments
> 0000000 \0 \0 \0 \0 \0 \0 \0 \0 E . # ? \0 \0 \0 \0
> 0000020 \0 \0 n ? 001 004 _ l w a + ? 001
> 0000035
>
> So am I correct in say that the file _lwa.cfs is the only file really
> needed?

Well, sort of. Ferret does write a couple of files while it is indexing
that won't appear in the segments file. Also don't delete the fields
file. Otherwise, "lwa" is a base 36 integer, so any file labeled _lw9 and
below you can delete.

> Thanks again. It's great to see that it really worked and that the only
> problem is 'Windoze' related. I would be working on Ubuntu but the
> SQLServer adapter I tried there could not page through data sets.

Ferret definitely works on Ubuntu. That's where it was developed and I
think Jens may actually develop acts_as_ferret on Ubuntu too.
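The base 36 comparison can be sketched in plain Ruby. This is only an illustration: the helper names are made up, it assumes the files are named `_<base 36 number>.cfs` as in the listing earlier in this thread, and it lists candidates without deleting anything.

```ruby
# Hypothetical helpers, not part of Ferret.

# Parse the base 36 segment number out of a name like "_lwa.cfs".
# Returns nil for anything else, such as "fields" or "segments".
def segment_number(filename)
  match = /\A_([0-9a-z]+)\.cfs\z/.match(filename)
  match && match[1].to_i(36)
end

# Return the .cfs files whose number is below the live segment's number.
def stale_segment_files(filenames, live_segment)
  live = live_segment.to_i(36)
  filenames.select do |name|
    number = segment_number(name)
    number && number < live
  end
end

files = %w[_z.cfs _lw9.cfs _lwa.cfs fields segments]
puts segment_number("_lwa.cfs")
puts stale_segment_files(files, "lwa").inspect
```

Here "lwa" works out to 21 * 36**2 + 32 * 36 + 10 = 28378, so `_z.cfs` and `_lw9.cfs` come back as stale while `_lwa.cfs`, `fields` and `segments` are left alone. Given the caveat about files written during indexing, something like this should only be run while nothing is writing to the index.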
Cheers, Dave From heikowebers at gmx.net Fri Oct 20 20:04:14 2006 From: heikowebers at gmx.net (hawe) Date: Sat, 21 Oct 2006 02:04:14 +0200 Subject: [Ferret-talk] install ferret on windows In-Reply-To: References: <4a7404ee0ad4003672ebf2632ad88d23@ruby-forum.com> Message-ID: Thanks, that's it... -- Posted via http://www.ruby-forum.com/. From payscroll at gmail.com Sat Oct 21 12:05:45 2006 From: payscroll at gmail.com (Alfred Toh) Date: Sat, 21 Oct 2006 18:05:45 +0200 Subject: [Ferret-talk] find_by_content result set Message-ID: <59a79b137c59f644c40acde82139d50a@ruby-forum.com> Hi Guys I'm experiencing with AAF and Ferret with the intention of deploying into the site that I am working on now. So I setup AAF to index 3 fields that I have in this model and i tried doing a find_by_contents and it returned the # References: <59a79b137c59f644c40acde82139d50a@ruby-forum.com> Message-ID: Ok.. after digging more into acts_as_ferret.rb, I realize I can add an attr_accessor for @results Any reason/problem that I shouldn't be doing that that since it was not left out in the latest release? Another problem I'm having now is to sort the result by using the find_options find_options is a hash passed on to active_record?s find when retrieving the data from db, useful to i.e. prefetch relationships. but I wasn't able to get it to work with :order by doing this... moto=Pay.find_by_contents("motorola",:limit=>:all,:order=>"salary DESC") anyone has similar problems too? thanks for helping! Rgds Alfred Toh Payscroll.com Alfred Toh wrote: > Hi Guys > > I'm experiencing with AAF and Ferret with the intention of deploying > into the site that I am working on now. > > So I setup AAF to index 3 fields that I have in this model and i tried > doing a find_by_contents and it returned the > > # @total_hits=1157, @results=[# @attributes={"add...... > > but it seems that there is no way to access the result set... I read > somewhere that it's not implement yet.. 
So any knows when the release > will be. and until there is there any workaround for this. > > I know that you can do a find_by_contents_id, but that returns the model > id set, and then you need to build the result set from the id. > > Thanks for any advice! > > Rgds > Alfred Toh > Payscroll.com -- Posted via http://www.ruby-forum.com/. From papipo at gmail.com Sat Oct 21 20:32:12 2006 From: papipo at gmail.com (Rodrigo Alvarez) Date: Sun, 22 Oct 2006 02:32:12 +0200 Subject: [Ferret-talk] Rails association and multiple indexes Message-ID: <9565a0f4c7830d53af9ba0aafcfb4ac2@ruby-forum.com> Hi! If I have two models, Product and Manufacturer, of course Product belongs_to :manufacturer. A search engine would allow a user to look for a product by its name or manufacturer. Is it better to define a method like: def searchable_field "#{name} #{manufacturer.name}" end and add it as indexable field (acts_as_ferret :fields => ['searchable_field'])... Or maybe that it is more advisable to index different fields: Product < ActiveRecord::Base acts_as_ferret :fields => ['name', 'manufacturer_name'] def manufacturer_name "#{manufacturer.name}" end end Thanks in advance. -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Sun Oct 22 06:12:26 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 22 Oct 2006 19:12:26 +0900 Subject: [Ferret-talk] Rails association and multiple indexes In-Reply-To: <9565a0f4c7830d53af9ba0aafcfb4ac2@ruby-forum.com> References: <9565a0f4c7830d53af9ba0aafcfb4ac2@ruby-forum.com> Message-ID: On 10/22/06, Rodrigo Alvarez wrote: > Hi! > > If I have two models, Product and Manufacturer, of course Product > belongs_to :manufacturer. > > A search engine would allow a user to look for a product by its name or > manufacturer. Is it better to define a method like: > > def searchable_field > "#{name} #{manufacturer.name}" > end > > and add it as indexable field (acts_as_ferret :fields => > ['searchable_field'])... 
> > Or maybe that it is more advisable to index different fields: > > Product < ActiveRecord::Base > acts_as_ferret :fields => ['name', 'manufacturer_name'] > > def manufacturer_name > "#{manufacturer.name}" > end > end > > Thanks in advance. Hi Rodrigo, It's better to index individual fields. You can easily search both fields like this: "name|manufacturer_name:(#{query})" I can't really think of any advantages to putting buth fields into the one search field. -- Dave Balmain http://www.davebalmain.com/ From tennisbum2002 at hotmail.com Sun Oct 22 18:40:38 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Mon, 23 Oct 2006 00:40:38 +0200 Subject: [Ferret-talk] pagination in acts_as_ferret In-Reply-To: <20061017114316.GL14271@cordoba.webit.de> References: <20060503163044.GS29289@cordoba.webit.de> <17ea47bafad9e62b273baa7796003c11@ruby-forum.com> <20061017114316.GL14271@cordoba.webit.de> Message-ID: hey guys, any idea how to use those options with multi_search I tried it on find_by_contents and it works fine, however, for multi_search i do: @results = User.multi_search(parse(@query),[Book],{:offset=>0,:limit=>5}) or @results = User.multi_search(parse(@query),[Book],:offset=>0,:limit=>5) and neither works, however I get no error either. Whats wrong? -- Posted via http://www.ruby-forum.com/. From tennisbum2002 at hotmail.com Sun Oct 22 18:41:08 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Mon, 23 Oct 2006 00:41:08 +0200 Subject: [Ferret-talk] Search multiple models In-Reply-To: References: Message-ID: <537c544d1d0870ae2f1e80de36b79f1e@ruby-forum.com> hey guys, any idea how to use those options with multi_search I tried it on find_by_contents and it works fine, however, for multi_search i do: @results = User.multi_search(parse(@query),[Book],{:offset=>0,:limit=>5}) or @results = User.multi_search(parse(@query),[Book],:offset=>0,:limit=>5) and neither works, however I get no error either. Whats wrong? -- Posted via http://www.ruby-forum.com/. 
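If multi_search appears to ignore :offset and :limit, one possible stopgap is to fetch the full result list and slice it in Ruby, much as the earlier pagination snippets on this list do. This is only a sketch: paginate is a made-up helper, not part of acts_as_ferret, and it assumes the result object can be treated like an Array.

```ruby
# Hypothetical helper, not part of acts_as_ferret: returns the slice of
# results that starts at offset and holds at most limit items.
def paginate(results, offset, limit)
  # Array#[] returns nil when offset lies past the end, so fall back to [].
  results.to_a[offset, limit] || []
end

hits = (1..12).to_a              # stand-in for the array returned by a search
page_two = paginate(hits, 5, 5)  # items 6 through 10
```

Slicing after the fact still loads every hit, so it is only reasonable for small result sets.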
From andreas.korth at gmx.net Sun Oct 22 20:52:03 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 23 Oct 2006 02:52:03 +0200 Subject: [Ferret-talk] Trouble with custom Analyzer Message-ID: Hi! I wanted to build my own custom Analyzer like so: class Analyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, string) StopFilter.new(LetterTokenizer.new(string, true), @stop_words) end end As one can easily spot, I essentially want a LetterAnalyzer with stop word filtering. However, using that analyzer (for indexing) results in a segmentation fault. /opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/ index.rb:281: [BUG] Segmentation fault ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0] This is admittedly a rather naive implementation which is extrapolated from those I found in the docs. So what am I missing here? Cheers, Andy From howardmoon at hitcity.com.au Sun Oct 22 21:09:01 2006 From: howardmoon at hitcity.com.au (Peter Royle) Date: Mon, 23 Oct 2006 03:09:01 +0200 Subject: [Ferret-talk] [ANN] Ferret Smoke Test In-Reply-To: References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com> Message-ID: <9be58b95e6e474b7b4a9355c428975c1@ruby-forum.com> OK I'll go with 1.8.2 then. I've got both 18.2 and 1.8.4 installed so it's not a drama. The only thing is that smoke_alarm.rb reports the version it's being run against (has to be 1.8.4), not necessarily the one being used to run the tests. I might have a crack at fixing this up a bit later and maybe send you a patch. I've done one trial run ("Test run on 2006-10-23 10:03:18") which uses... IO.popen('ruby -v') do |io| @version_info = io.read end ...to get the version being used to run the tests, so it'll just a matter of parsing it to extract the different pieces of info. Pete. -- Posted via http://www.ruby-forum.com/. 
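For anyone following along, the parsing step could look roughly like the sketch below. It assumes `ruby -v` output of the form "ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]"; the constant and helper names are invented for the example and are not taken from smoke_alarm.rb.

```ruby
# Matches e.g. "ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]".
VERSION_LINE_PATTERN = /\Aruby (\S+) \((\d{4}-\d{2}-\d{2})[^)]*\)[^\[]*\[([^\]]+)\]/

# Hypothetical helper: ask the ruby found on the PATH for its version line.
def ruby_version_line
  IO.popen('ruby -v') { |io| io.read }
end

# Hypothetical helper: split a version line into its pieces,
# or return nil when the line does not have the expected shape.
def parse_ruby_version(line)
  match = VERSION_LINE_PATTERN.match(line)
  return nil unless match
  { :version => match[1], :date => match[2], :platform => match[3] }
end

info = parse_ruby_version("ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]")
```

In a real run the argument would be ruby_version_line rather than a fixed string.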
From dbalmain.ml at gmail.com Mon Oct 23 00:25:08 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 23 Oct 2006 13:25:08 +0900 Subject: [Ferret-talk] [ANN] Ferret Smoke Test In-Reply-To: <9be58b95e6e474b7b4a9355c428975c1@ruby-forum.com> References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com> <9be58b95e6e474b7b4a9355c428975c1@ruby-forum.com> Message-ID: On 10/23/06, Peter Royle wrote: > OK I'll go with 1.8.2 then. I've got both 18.2 and 1.8.4 installed so > it's not a drama. > > The only thing is that smoke_alarm.rb reports the version it's being run > against (has to be 1.8.4), not necessarily the one being used to run the > tests. I might have a crack at fixing this up a bit later and maybe send > you a patch. I've done one trial run ("Test run on 2006-10-23 10:03:18") > which uses... > > IO.popen('ruby -v') do |io| > @version_info = io.read > end > > ...to get the version being used to run the tests, so it'll just a > matter of parsing it to extract the different pieces of info. I like this solution. I'm not really concerned about parsing it to extract the different fields. It is just to give me an indication of the system the test is running on so leaving it in a single field should be fine. Cheers, Dave From dbalmain.ml at gmail.com Mon Oct 23 00:32:25 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 23 Oct 2006 13:32:25 +0900 Subject: [Ferret-talk] Trouble with custom Analyzer In-Reply-To: References: Message-ID: On 10/23/06, Andreas Korth wrote: > Hi! > > I wanted to build my own custom Analyzer like so: > > class Analyzer < Ferret::Analysis::Analyzer > > include Ferret::Analysis > > def initialize(stop_words = ENGLISH_STOP_WORDS) > @stop_words = stop_words > end > > def token_stream(field, string) > StopFilter.new(LetterTokenizer.new(string, true), @stop_words) > end > > end > > As one can easily spot, I essentially want a LetterAnalyzer with stop > word filtering. 
However, using that analyzer (for indexing) results > in a segmentation fault. > > /opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/ > index.rb:281: [BUG] Segmentation fault > ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0] > > This is admittedly a rather naive implementation which is > extrapolated from those I found in the docs. So what am I missing here? Hi Andy, This works for me so I'll need a little more info to solve the problem. First, try running this: require 'rubygems' require 'ferret' class Analyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, string) StopFilter.new(LetterTokenizer.new(string, true), @stop_words) end end i = Ferret::I.new(:analyzer => Analyzer.new) i << "A sentence to analyze" puts i.search("analyze") If that works, try and track down where in your code ferret is seg-faulting. Cheers, Dave -- Dave Balmain http://www.davebalmain.com/ From aditya_nalla at yahoo.co.in Mon Oct 23 01:17:44 2006 From: aditya_nalla at yahoo.co.in (Roger) Date: Mon, 23 Oct 2006 07:17:44 +0200 Subject: [Ferret-talk] Score in ferret Message-ID: <262ace77ed48e9ce0cd79414b7c8118d@ruby-forum.com> Whats the relevance of score given by ferret? -- Posted via http://www.ruby-forum.com/. From howardmoon at hitcity.com.au Mon Oct 23 03:08:07 2006 From: howardmoon at hitcity.com.au (Peter Royle) Date: Mon, 23 Oct 2006 09:08:07 +0200 Subject: [Ferret-talk] [ANN] Ferret Smoke Test In-Reply-To: References: <453504E6.4060106@benjaminkrause.com> <9ee55408d26d9ef4e69b35942e1a633b@ruby-forum.com> <9be58b95e6e474b7b4a9355c428975c1@ruby-forum.com> Message-ID: David Balmain wrote: > I'm not really concerned about parsing it It was simple enough so I parsed it anyway. Keeps it in sync with the existing CGI I figured. http://ferret.davebalmain.com/trac/ticket/145 Pete. -- Posted via http://www.ruby-forum.com/. 
From kraemer at webit.de Mon Oct 23 03:57:27 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 23 Oct 2006 09:57:27 +0200 Subject: [Ferret-talk] find_by_content result set In-Reply-To: References: <59a79b137c59f644c40acde82139d50a@ruby-forum.com> Message-ID: <20061023075727.GC24958@cordoba.webit.de> Hi! On Sat, Oct 21, 2006 at 07:11:00PM +0200, Alfred Toh wrote: > Ok.. after digging more into acts_as_ferret.rb, I realize I can add an > attr_accessor for @results > > Any reason/problem that I shouldn't be doing that that since it was not > left out in the latest release? There's no need to do this, as the SearchResults class hands through all method calls to @results by overriding method_missing. Just use the SearchResults instance as you would use an array. > Another problem I'm having now is to sort the result by using the > find_options > > find_options is a hash passed on to active_record?s find when retrieving > the data from db, useful to i.e. prefetch relationships. > > but I wasn't able to get it to work with :order by doing this... > > moto=Pay.find_by_contents("motorola",:limit=>:all,:order=>"salary DESC") try moto=Pay.find_by_contents("motorola",{},{ :limit=>:all,:order=>"salary DESC" }) there's two optional hash arguments to find_by_contents, the first one being ferret_options, and the second one the find_options. As this problem arises quite often, I start thinking about merging them together into one in future versions. cheers, Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From dbalmain.ml at gmail.com Mon Oct 23 05:00:47 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Mon, 23 Oct 2006 18:00:47 +0900 Subject: [Ferret-talk] Score in ferret In-Reply-To: <262ace77ed48e9ce0cd79414b7c8118d@ruby-forum.com> References: <262ace77ed48e9ce0cd79414b7c8118d@ruby-forum.com> Message-ID: On 10/23/06, Roger wrote: > What's the relevance of score given by ferret? > It's a score given for how relevant the matching document is for the given query. It uses the same formula as the Lucene scoring algorithm. See here: http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html -- Dave Balmain http://www.davebalmain.com/ From kraemer at webit.de Mon Oct 23 05:22:18 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 23 Oct 2006 11:22:18 +0200 Subject: [Ferret-talk] Search multiple models In-Reply-To: <537c544d1d0870ae2f1e80de36b79f1e@ruby-forum.com> References: <537c544d1d0870ae2f1e80de36b79f1e@ruby-forum.com> Message-ID: <20061023092218.GD24958@cordoba.webit.de> On Mon, Oct 23, 2006 at 12:41:08AM +0200, Eric Gross wrote: > hey guys, any idea how to use those options with multi_search > > I tried it on find_by_contents and it works fine, however, for > multi_search i do: > > @results = > User.multi_search(parse(@query),[Book],{:offset=>0,:limit=>5}) > > or > > @results = User.multi_search(parse(@query),[Book],:offset=>0,:limit=>5) > > and neither works, however I get no error either. What's wrong? That's not implemented yet, but there's a patch in trac I plan to integrate into the next release of aaf. http://projects.jkraemer.net/acts_as_ferret/ticket/60 Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From cdfdfs at sdfds.com Mon Oct 23 05:25:43 2006 From: cdfdfs at sdfds.com (Clare) Date: Mon, 23 Oct 2006 11:25:43 +0200 Subject: [Ferret-talk] Carrot2 Message-ID: <12fa3d4823c7419afc1d820e23a55844@ruby-forum.com> Carrot2 - the clustering engine has a ready to use integration with the Lucene index. http://project.carrot2.org/architecture.html. Does anyone know whether this would work with the Ferret index as standard? Thanks -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Oct 23 05:47:25 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 23 Oct 2006 11:47:25 +0200 Subject: [Ferret-talk] Carrot2 In-Reply-To: <12fa3d4823c7419afc1d820e23a55844@ruby-forum.com> References: <12fa3d4823c7419afc1d820e23a55844@ruby-forum.com> Message-ID: <20061023094725.GF24958@cordoba.webit.de> On Mon, Oct 23, 2006 at 11:25:43AM +0200, Clare wrote: > Carrot2 - the clustering engine has a ready to use integration with the > Lucene index. http://project.carrot2.org/architecture.html. > > Does anyone know whether this would work with the Ferret index as > standard? recent ferret indexes aren't Lucene-compatible any more, so a Ferret input component would have to be implemented. Or you build an OpenSearch-compatible frontend to your Ferret index, there's already an input component for this. Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From andreas.korth at gmx.net Mon Oct 23 08:48:27 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 23 Oct 2006 14:48:27 +0200 Subject: [Ferret-talk] Trouble with custom Analyzer In-Reply-To: References: Message-ID: On 23.10.2006, at 06:32, David Balmain wrote: > Hi Andy, > > This works for me so I'll need a little more info to solve the > problem. First, try running this: > > [...] > > i = Ferret::I.new(:analyzer => Analyzer.new) > > i << "A sentence to analyze" > > puts i.search("analyze") > > If that works, try and track down where in your code ferret is > seg-faulting. Dave, thanks for the hint. I was using the add_document method instead of << to add documents to the index. Changing the above code to i = Ferret::I.new() i.add_document("A sentence to analyze", Analyzer.new) still works fine. However, changing my original code to use the << method (and specifying the Analyzer with Index.new) solves the problem. I didn't manage to distill a concise test case from my code to reproduce the segfault. And hey, why bother, it works just fine now :) Thanks again, Andy From andreas.korth at gmx.net Mon Oct 23 20:51:39 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Tue, 24 Oct 2006 02:51:39 +0200 Subject: [Ferret-talk] Locking issues when adding to the Index Message-ID: <826D1C02-11E7-49A8-9515-527396D01530@gmx.net> Hi, I keep getting locking errors when updating my index, which puzzles me since only one process is supposed to be accessing the index at a time (at least in development mode). I'm using Ferret with Rails (NOT acts_as_ferret, though) and employed an observer to add a new document to the index when a new ActiveRecord is created. I'm not sure if this has any impact on the concurrency issues, but I thought I'd mention it.
Find attached the verbatim error message and the relevant part of the stack trace. Any ideas? Cheers, Andreas -- Ferret::Store::Lock::LockError (Lock Error occured at :103 in xpop_context Error occured in index.c:5368 - iw_open Couldn't obtain write lock when opening IndexWriter ): /usr/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb: 664:in `initialize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb: 664:in `ensure_writer_open' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb: 276:in `<<' /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb: 254:in `<<' /app/models/index.rb:105:in `add' /app/models/index.rb:20:in `after_create' /usr/lib/ruby/gems/1.8/gems/activerecord-1.14.4.5263/lib/ active_record/observer.rb:154:in `update' /usr/lib/ruby/1.8/observer.rb:185:in `notify_observers' [...] From andreas.korth at gmx.net Mon Oct 23 21:31:16 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Tue, 24 Oct 2006 03:31:16 +0200 Subject: [Ferret-talk] Locking issues when adding to the Index In-Reply-To: <826D1C02-11E7-49A8-9515-527396D01530@gmx.net> References: <826D1C02-11E7-49A8-9515-527396D01530@gmx.net> Message-ID: <38D62A88-213F-4255-A40C-4FD97B0193EA@gmx.net> After careful reconsideration of the facts I eventually came to the conclusion that closing the index after write operations is good practice. I can recommend this approach to anyone running into the same problems. My apologies for not RTFM. Cheers, Andy On 24.10.2006, at 02:51, Andreas Korth wrote: > Hi, > > I keep getting locking errors when updating my index which puzzles me > since only one process is supposed to be accessing the index at a > time (at least in development mode). > > I'm using Ferret with Rails (NOT acts_as_ferret, though) and employed > an observer to add a new document to the index when a new > ActiveRecord is created. 
I'm not sure if this has any impact on the > concurrency issues, but I thought I mention it. > > Find attached the verbatim error message and the relevant part of the > stack trace. > > Any ideas? > > Cheers, > Andreas From scott at remixation.com Tue Oct 24 17:28:29 2006 From: scott at remixation.com (Scott Persinger) Date: Tue, 24 Oct 2006 23:28:29 +0200 Subject: [Ferret-talk] Problem with stop words Message-ID: I am seeing trouble with searches for 'you' not returning anything. It appears that 'you' is a stop word to the standard analyzer: require 'rubygems' require 'ferret' index = Ferret::I.new(:or_default => false) index << 'you' puts index.search('you') returns no hits. I assumed from the docs that StandardAnalyzer was using stop words as defined by: Ferret::Analysis::ENGLISH_STOP_WORDS but when I print that to the console I get: ["a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"] I don't see 'you' in there. Supplying my own stop words seems to fix the problem: STOP_WORDS = ["a", "the", "and", "or"] index = Ferret::I.new(:or_default => false, :analyzer => Ferret::Analysis::StandardAnalyzer.new(STOP_WORDS)) index << 'you' puts index.search('you') this returns a hit. I am running the latest Windows build, but I've seen the same behavior on Linux with the latest builds. I am happy with my solution, but it seems odd that 'you' should be standard stop word. -- Posted via http://www.ruby-forum.com/. 
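The effect can be imitated without Ferret. The sketch below is plain Ruby with a toy analyzer; SHORT_STOP_WORDS is the list printed above, while LONGER_STOP_WORDS and analyze are invented for the illustration and say nothing about which list the StandardAnalyzer really uses.

```ruby
# The list printed to the console in the message above.
SHORT_STOP_WORDS = %w[a an and are as at be but by for if in into is it no not
                      of on or s such t that the their then there these they
                      this to was will with]

# Hypothetical longer list, only to show the effect of 'you' being a stop word.
LONGER_STOP_WORDS = SHORT_STOP_WORDS + %w[you your]

# Toy analyzer: lowercase, split into runs of letters, drop the stop words.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z]+/).reject { |token| stop_words.include?(token) }
end

kept    = analyze("you", SHORT_STOP_WORDS)    # 'you' survives and can be found
dropped = analyze("you", LONGER_STOP_WORDS)   # nothing is left to index or match
```

With the longer list neither the document nor the query contains any term, so a search for 'you' has nothing to match.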
From andreas.korth at gmx.net Tue Oct 24 18:31:07 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Wed, 25 Oct 2006 00:31:07 +0200 Subject: [Ferret-talk] Problem with stop words In-Reply-To: References: Message-ID: <6B625C57-11C1-46B2-BFA0-3A33C919C066@gmx.net> On 24.10.2006, at 23:28, Scott Persinger wrote: > I am seeing trouble with searches for 'you' not returning anything. It > appears that 'you' is a stop word to the standard analyzer: > I assumed from the docs that StandardAnalyzer was using stop words > as defined by: > > Ferret::Analysis::ENGLISH_STOP_WORDS > > I don't see 'you' in there. StandardAnalyzer actually uses Ferret::Analysis::FULL_ENGLISH_STOP_WORDS by default. (Note the 'FULL_') > Supplying my own stop words seems to fix the problem: Standard stop words are just a one-size-fit-all reasonable default. For maximum control you should always supply your own list of stop words. > I am running the latest Windows build, but I've seen the same behavior > on Linux with the latest builds. I am happy with my solution, but it > seems odd that 'you' should be standard stop word. Depends on how you look at it. 'You' is definitely not the least adequate candidate for a stop word. Then again, it's not included in Ferret::Analysis::ENGLISH_STOP_WORDS. Cheers, Andy From jon at mywellnet.com Tue Oct 24 21:01:01 2006 From: jon at mywellnet.com (Jon) Date: Wed, 25 Oct 2006 03:01:01 +0200 Subject: [Ferret-talk] problem with TermQuery Message-ID: <55465a9df438b1f58308912d40c6aac9@ruby-forum.com> This might be more of a Lucene question, but I can't figure it out. How come this works: Item.find_id_by_contents("name:Bob") but this returns no results: Item.find_id_by_contents(Ferret::Search::TermQuery.new(:name, "Bob")) Thanks in advance! -Jon -- Posted via http://www.ruby-forum.com/. 
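One possible explanation, sketched in plain Ruby: if the analyzer lowercases terms at index time, the index only contains "bob", and a term query built by hand is compared against it verbatim. The tiny index and both lookup helpers below are hypothetical stand-ins, not Ferret classes.

```ruby
# Toy inverted index: term => ids of the documents containing it.
# Terms are lowercased at index time, as a lowercasing analyzer would do.
def build_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    text.downcase.scan(/[a-z]+/).uniq.each { |term| index[term] << id }
  end
  index
end

# Stand-in for a hand-built term query: the term is looked up exactly as given.
def term_lookup(index, term)
  index.fetch(term, [])
end

# Stand-in for the query parser path: the text is lowercased before the lookup.
def parsed_lookup(index, text)
  term_lookup(index, text.downcase)
end

index = build_index({ 1 => "Bob", 2 => "Alice and Bob" })
exact_hits  = term_lookup(index, "Bob")    # no term "Bob" exists in the index
parsed_hits = parsed_lookup(index, "Bob")  # finds the documents indexed under "bob"
```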
From dbalmain.ml at gmail.com Tue Oct 24 21:57:11 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 25 Oct 2006 01:57:11 +0000 Subject: [Ferret-talk] problem with TermQuery In-Reply-To: <55465a9df438b1f58308912d40c6aac9@ruby-forum.com> References: <55465a9df438b1f58308912d40c6aac9@ruby-forum.com> Message-ID: On 10/25/06, Jon wrote: > This might be more of a Lucene question, but I can't figure it out. How > come this works: > > Item.find_id_by_contents("name:Bob") > > but this returns no results: > > Item.find_id_by_contents(Ferret::Search::TermQuery.new(:name, "Bob")) > > > > Thanks in advance! > > -Jon > Hi Jon, You need to downcase "Bob" when you build the TermQuery yourself. The query parser does that for you. -- Dave Balmain http://www.davebalmain.com/ From dbalmain.ml at gmail.com Tue Oct 24 22:14:49 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 25 Oct 2006 02:14:49 +0000 Subject: [Ferret-talk] Problem with stop words In-Reply-To: <6B625C57-11C1-46B2-BFA0-3A33C919C066@gmx.net> References: <6B625C57-11C1-46B2-BFA0-3A33C919C066@gmx.net> Message-ID: On 10/24/06, Andreas Korth wrote: > > On 24.10.2006, at 23:28, Scott Persinger wrote: > > > I am seeing trouble with searches for 'you' not returning anything. It > > appears that 'you' is a stop word to the standard analyzer: > > > I assumed from the docs that StandardAnalyzer was using stop words > > as defined by: > > > > Ferret::Analysis::ENGLISH_STOP_WORDS > > > > I don't see 'you' in there. > > StandardAnalyzer actually uses > Ferret::Analysis::FULL_ENGLISH_STOP_WORDS by default. (Note the 'FULL_') My apologies. This had been fixed in the documentation a while ago. I just haven't updated the docs on the Ferret homepage for a while. > > Supplying my own stop words seems to fix the problem: > > Standard stop words are just a one-size-fit-all reasonable default. > For maximum control you should always supply your own list of stop > words.
> > I am running the latest Windows build, but I've seen the same behavior > > on Linux with the latest builds. I am happy with my solution, but it > > seems odd that 'you' should be standard stop word. > > Depends on how you look at it. 'You' is definitely not the least > adequate candidate for a stop word. Then again, it's not included in > Ferret::Analysis::ENGLISH_STOP_WORDS. > > Cheers, > Andy Thanks Andy. Actually the reason for the two English stop-word lists is that they come from two different sources. ENGLISH_STOP_WORDS is the list taken from Lucene. FULL_ENGLISH_STOP_WORDS is taken from Martin Porter's website[1]. I hope that clears things up a little. You are quite right in saying you should probably use your own list of stop words for best results. Cheers, Dave [1] http://snowball.tartarus.org/ From wminkstein at gmail.com Wed Oct 25 00:41:14 2006 From: wminkstein at gmail.com (William Minkstein) Date: Wed, 25 Oct 2006 06:41:14 +0200 Subject: [Ferret-talk] Search result inconsistencies due to indexing Message-ID: I seem to be having problems with getting my searcher to be consistent while indexing. I am running the latest version of ferret (0.10.13) and I am using the Searchable plugin. Currently the way it indexes is by using a callback in the model of either after_update or after_create to index the fields that I have setup to be indexed. Right now I update the index about once every 4 or 5 minutes. The problem occurs when I do a search and either no results ( or a very small amount like 2) are returned for particular keywords that previously had plenty of results. I am thinking that it has something to do with indexing or possibly that Searchable does not optimize after it adds to the index. I currently have about 10,000 records in my index and am fairly new to ferret. Any assistance would be helpful. Thanks. Andy -- Posted via http://www.ruby-forum.com/. 
From kraemer at webit.de Wed Oct 25 05:44:05 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 25 Oct 2006 11:44:05 +0200 Subject: [Ferret-talk] Search result inconsistencies due to indexing In-Reply-To: References: Message-ID: <20061025094405.GC4769@cordoba.webit.de> On Wed, Oct 25, 2006 at 06:41:14AM +0200, William Minkstein wrote: > I seem to be having problems with getting my searcher to be consistent > while indexing. I am running the latest version of ferret (0.10.13) and > I am using the Searchable plugin. Currently the way it indexes is by > using a callback in the model of either after_update or after_create to > index the fields that I have setup to be indexed. huh, I didn't ever hear about that plugin before - the DrB remote indexing stuff is quite interesting indeed. > Right now I update the index about once every 4 or 5 minutes. The > problem occurs when I do a search and either no results ( or a very > small amount like 2) are returned for particular keywords that > previously had plenty of results. I am thinking that it has something > to do with indexing or possibly that Searchable does not optimize after > it adds to the index. Did you contact the plugin author about this ? cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From brodaigh at gmail.com Wed Oct 25 06:56:02 2006 From: brodaigh at gmail.com (Georgina Lynch) Date: Wed, 25 Oct 2006 12:56:02 +0200 Subject: [Ferret-talk] i cant install acts_as_ferret Message-ID: <145109d61c1e40ab183163df8814794a@ruby-forum.com> This is what happens when i try to get acts_as_ferret ...."nothing much".... Please help me and excuse me if its really dumb, i'm new to this!
thanks C:\rails\app>gem install ferret Attempting local installation of 'ferret' Local gem file not found: ferret*.gem Attempting remote installation of 'ferret' Updating Gem source index for: http://gems.rubyforge.org Select which gem to install for your platform (i386-mswin32) 1. ferret 0.10.13 (ruby) 2. ferret 0.10.12 (ruby) 3. ferret 0.10.11 (ruby) 4. ferret 0.10.10 (ruby) 5. ferret 0.10.9 (mswin32)####{the list goes on) > 5 Successfully installed ferret-0.10.9-mswin32 Installing RDoc documentation for ferret-0.10.9-mswin32... C:\rails\app>ruby script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret C:\rails\app> -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Oct 25 07:38:48 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 25 Oct 2006 13:38:48 +0200 Subject: [Ferret-talk] i cant install acts_as_ferret In-Reply-To: <145109d61c1e40ab183163df8814794a@ruby-forum.com> References: <145109d61c1e40ab183163df8814794a@ruby-forum.com> Message-ID: <20061025113848.GD4769@cordoba.webit.de> On Wed, Oct 25, 2006 at 12:56:02PM +0200, Georgina Lynch wrote: > This is what happens when i try to get acts_as_ferret ...."nothing > much".... > Please help me and excuse me if its really dumb, i'm new to this! thanks could you please try to checkout aaf with a subversion client of your choice ? i.e. svn co svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret if that works, just move the acts_as_ferret directory to vendor/plugins and you're done. if it doesn't, chances are you're sitting behind a firewall that doesn't allow the svn-protocol. if it doesn't, there's a quite recent snapshot of the current trunk attached at the bottom of this page: http://projects.jkraemer.net/acts_as_ferret cheers, Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From wminkstein at gmail.com Wed Oct 25 09:19:29 2006 From: wminkstein at gmail.com (William (Andy) Minkstein) Date: Wed, 25 Oct 2006 15:19:29 +0200 Subject: [Ferret-talk] Search result inconsistencies due to indexing In-Reply-To: <20061025094405.GC4769@cordoba.webit.de> References: <20061025094405.GC4769@cordoba.webit.de> Message-ID: <9d0bb41ee6f98c43bf323884d1f55da7@ruby-forum.com> Jens Kraemer wrote: > On Wed, Oct 25, 2006 at 06:41:14AM +0200, William Minkstein wrote: >> I seem to be having problems with getting my searcher to be consistent >> while indexing. I am running the latest version of ferret (0.10.13) and >> I am using the Searchable plugin. Currently the way it indexes is by >> using a callback in the model of either after_update or after_create to >> index the fields that I have setup to be indexed. > > huh, I didn't ever hear about that plugin before - the DrB remote > indexing stuff is quite interesting indeed. > The URL for the author's page is: http://searchable.rubyforge.org/ There are things about it that are very convenient. >> Right now I update the index about once every 4 or 5 minutes. The >> problem occurs when I do a search and either no results ( or a very >> small amount like 2) are returned for particular keywords that >> previously had plenty of results. I am thinking that it has something >> to do with indexing or possibly that Searchable does not optimize after >> it adds to the index. > > Did you contact the plugin author about this ? > > Yes, I did contact the author about this. I checked the Searchable code that indexes records and it seems to be pretty consistent with how people are indexing with just Ferret itself. I guess I was wondering if anyone else has experienced inconsistent search results after updating or adding records to their index.
> cheers, > Jens > > -- > webit! Gesellschaft für neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de > Schnorrstraße 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 Andy -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Wed Oct 25 11:27:27 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Wed, 25 Oct 2006 15:27:27 +0000 Subject: [Ferret-talk] Search result inconsistencies due to indexing In-Reply-To: <9d0bb41ee6f98c43bf323884d1f55da7@ruby-forum.com> References: <20061025094405.GC4769@cordoba.webit.de> <9d0bb41ee6f98c43bf323884d1f55da7@ruby-forum.com> Message-ID: On 10/25/06, William (Andy) Minkstein wrote: > Jens Kraemer wrote: > > On Wed, Oct 25, 2006 at 06:41:14AM +0200, William Minkstein wrote: > >> I seem to be having problems with getting my searcher to be consistent > >> while indexing. I am running the latest version of ferret (0.10.13) and > >> I am using the Searchable plugin. Currently the way it indexes is by > >> using a callback in the model of either after_update or after_create to > >> index the fields that I have setup to be indexed. > > > > huh, I didn't ever hear about that plugin before - the DrB remote > > indexing stuff is quite interesting indeed. > > > The url for author's page is: http://searchable.rubyforge.org/ There > are things about it that are very convenient. Hehe. I hadn't seen this either. Seth Fitzsimmons, if you're out there, nice work. I'm going to be adding a DRb server to Ferret soon. I'll definitely be checking out your code. If you'd like to contribute, please do. :) > >> Right now I update the index about once every 4 or 5 minutes. The > >> problem occurs when I do a search and either no results ( or a very > >> small amount like 2) are returned for particular keywords that > >> previously had plenty of results.
I am thinking that it has something > >> to do with indexing or possibly that Searchable does not optimize after > >> it adds to the index. > > > > Did you contact the plugin author about this ? > > > > > > Yes I did contact the author about this. I checked the Searchable code > that indexes records and it seems to be pretty consistent with how > people are indexing with just Ferret itself. I guess I was wondering if > anyone else experienced having inconsistent search results after > updating or adding records to their index. I haven't seen this problem in version 0.10.13. The way I would go about debugging it, though, is to store all the fields in the index. Then if you look at a certain document in the index and it contains the data you're searching for but doesn't get matched in the search results it is a bug. In this case you can send me a zipped up copy of the index and I'll fix the problem. Otherwise I'm not sure there's much else we can do (unless you can give me ssh access to your server). Cheers, Dave From fastjames at gmail.com Wed Oct 25 14:06:54 2006 From: fastjames at gmail.com (Jim Kane) Date: Wed, 25 Oct 2006 20:06:54 +0200 Subject: [Ferret-talk] Ferret::StateError while using acts_as_ferret In-Reply-To: References: <72fcb47db245987eb017036d002e51c6@ruby-forum.com> Message-ID: <4cc78015e83a87f15934a905cb43eede@ruby-forum.com> David Balmain wrote: > On 10/13/06, Jim Kane wrote: >> Error occurred in index.c:2098 - stde_doc_num >> to find_by_contents sorted by a timestamp (stored nontokenized). Can >> anyone offer some advice on what I might be doing to cause this? I'm >> also getting occasional segfaults (already submitted as a ticket on the >> trac) but they don't _appear_ to be tied to this error. >> >> Jim > > Hi Jim, > I'm not sure what might be causing this error. Did you reindex when > you upgraded to 0.10.11? That may help. If it doesn't, try going back > to version 0.10.9. 
This error may have been introduced in the > performance enhancements I added in version 0.10.10. > > Let me know how you go. I tried reindexing with 0.10.11 but this didn't seem to help matters (and it REALLY barfed while the reindex was going on). I downgraded to 0.10.9 and reindexed this morning; since the reindex finished, I haven't had a single segfault or StateError (previously I had plenty of both). Hooray! Jim -- Posted via http://www.ruby-forum.com/. From fastjames at gmail.com Wed Oct 25 15:01:02 2006 From: fastjames at gmail.com (Jim Kane) Date: Wed, 25 Oct 2006 21:01:02 +0200 Subject: [Ferret-talk] Ferret::StateError while using acts_as_ferret In-Reply-To: <4cc78015e83a87f15934a905cb43eede@ruby-forum.com> References: <72fcb47db245987eb017036d002e51c6@ruby-forum.com> <4cc78015e83a87f15934a905cb43eede@ruby-forum.com> Message-ID: <4809687a2e2c10ddad0e450032830875@ruby-forum.com> Jim Kane wrote: > David Balmain wrote: >> On 10/13/06, Jim Kane wrote: >>> Error occurred in index.c:2098 - stde_doc_num >>> to find_by_contents sorted by a timestamp (stored nontokenized). Can >>> anyone offer some advice on what I might be doing to cause this? I'm >>> also getting occasional segfaults (already submitted as a ticket on the >>> trac) but they don't _appear_ to be tied to this error. >>> >>> Jim >> >> Hi Jim, >> I'm not sure what might be causing this error. Did you reindex when >> you upgraded to 0.10.11? That may help. If it doesn't, try going back >> to version 0.10.9. This error may have been introduced in the >> performance enhancements I added in version 0.10.10. >> >> Let me know how you go. > I tried reindexing with 0.10.11 but this didn't seem to help matters > (and it REALLY barfed while the reindex was going on). I downgraded to > 0.10.9 and reindexed this morning; since the reindex finished, I haven't > had a single segfault or StateError (previously I had plenty of both). 
Alright, I was going to add this to the open trac ticket I have regarding segfaults but akismet appears to be in a bad mood today. My earlier observations were based on a static index, which is great and all, but I do need to update it throughout the day. I restarted the update process and within an hour I encountered 2 segfaults. I'm not sure where to go from here. Jim -- Posted via http://www.ruby-forum.com/. From tamerhelmysalama at gmail.com Wed Oct 25 18:24:49 2006 From: tamerhelmysalama at gmail.com (Tamer Salama) Date: Thu, 26 Oct 2006 00:24:49 +0200 Subject: [Ferret-talk] Experience with ferret on Dreamhost ? In-Reply-To: References: Message-ID: <5977f10065865988b4bbb572c141e227@ruby-forum.com> Chris Lowis wrote: > Does anybody have experience with running ferret on dreamhost ? > > My app is running ok until I install the acts_as_ferret plugin, at which > point I get "Rails application failed to start properly" errors. I've > used script/console to confirm that I can require 'ferret' and make a > new Index object . Everything appears to be ok in that respect. > Unfortunately there is nothing logged in these circumstances, except : > > [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: comm > with (dynamic) server > "/home/c_lowis/residence-review.com/public/dispatch.fcgi" aborted: > (first read) idle timeout (120 sec) > [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: > incomplete headers (0 bytes) received from server > "/home/c_lowis/residence-review.com/public/dispatch.fcgi" > > in the "apache" type logs that dreamhost gives me . Through trial and > error I am fairly sure it is ferret that is causing this, as when I > remove the plugin the site works ok. > > I am using ferret 0.9.5 . As far as I can see dispatch.fcgi is not > starting. 
> > Would appreciate any comments, > > Chris What made the FCGI work for me was: - Creating the development database - Re-commenting out the RAILS_ENV line in environment.rb Although, I was thinking that this would make it a 'development' env, yet, to my astonismed, it was the production that kicked in. I haven't had the time to isolate this, but, I finally got it to work. Good Luck. - Tamer Salama -- Posted via http://www.ruby-forum.com/. From andreas.korth at gmx.net Wed Oct 25 19:13:19 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Thu, 26 Oct 2006 01:13:19 +0200 Subject: [Ferret-talk] Experience with ferret on Dreamhost ? In-Reply-To: <5977f10065865988b4bbb572c141e227@ruby-forum.com> References: <5977f10065865988b4bbb572c141e227@ruby-forum.com> Message-ID: Chris Lowis wrote: > Does anybody have experience with running ferret on dreamhost ? > > My app is running ok until I install the acts_as_ferret plugin, at > which > point I get "Rails application failed to start properly" errors. I've > used script/console to confirm that I can require 'ferret' and make a > new Index object . Everything appears to be ok in that respect. > Unfortunately there is nothing logged in these circumstances, except : > > [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: > comm > with (dynamic) server > "/home/c_lowis/residence-review.com/public/dispatch.fcgi" aborted: > (first read) idle timeout (120 sec) > [Wed Aug 16 07:10:23 2006] [error] [client 152.78.115.107] FastCGI: > incomplete headers (0 bytes) received from server > "/home/c_lowis/residence-review.com/public/dispatch.fcgi" > > in the "apache" type logs that dreamhost gives me . Through trial and > error I am fairly sure it is ferret that is causing this, as when I > remove the plugin the site works ok. > > I am using ferret 0.9.5 . As far as I can see dispatch.fcgi is not > starting. I'm not sure if this applies to your situation but I got this particular error message whenever Rails (i.e. 
dispatch.fcgi) returned a response starting with a whitespace character. Especially AJAX calls to actions that do a 'render :partial' are prone to this error. Maybe it's actually the acts_as_ferret plugin that causes fcgi to crash but I thought I better mention it. You might also want to consider Mongrel as an alternative to FCGI. I switched my apps a month ago and I'm really happy with it. Dealing with the peculiarities of FCGI can be very frustrating. At least it's easier with Mongrel to track down errors. Cheers, Andy From chris.lowis at gmail.com Thu Oct 26 04:00:26 2006 From: chris.lowis at gmail.com (Chris Lowis) Date: Thu, 26 Oct 2006 10:00:26 +0200 Subject: [Ferret-talk] Experience with ferret on Dreamhost ? In-Reply-To: References: <5977f10065865988b4bbb572c141e227@ruby-forum.com> Message-ID: <7c569ed0318dee463abb39dfc62e18eb@ruby-forum.com> Thank's to all above for the suggestions, hopefully these will also help someone with the same problem. I notice now that dreamhost has the ferret gem pre-installed, so this might also help. > You might also want to consider Mongrel as an alternative to FCGI. I > switched my apps a month ago and I'm really happy with it. Dealing > with the peculiarities of FCGI can be very frustrating. At least it's > easier with Mongrel to track down errors. I'd love to ! It's very nice to work with it on my development machine. At the moment dreamhost doesn't support mongrel, although dreamhost customers can vote for Mongrel support here : https://panel.dreamhost.com/index.cgi?tree=home.sugg&category=Software%20Installations&search=mongrel Thank you all again, Chris -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Oct 26 12:24:04 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 27 Oct 2006 01:24:04 +0900 Subject: [Ferret-talk] Away for a week Message-ID: Hey folks, I'm off to Vietnam for a week on my way home to Australia so I'll be off the list for a while. Don't think I'm ignoring you. 
When I get back I intend to aggressively hunt down the segfault problem that some of you are having in Ferret so that problem will soon be ancient history. If anyone can narrow down a test case that can consistently segfault it would be a massive help. Also, if anyone can figure out what the problem is with Ferret on Windows that breaks Rails views with tabs in them, you'd be a hero to a lot of people. See you all soon, -- Dave Balmain http://www.davebalmain.com/ From john at squirl.info Thu Oct 26 14:23:47 2006 From: john at squirl.info (John Mcgrath) Date: Thu, 26 Oct 2006 20:23:47 +0200 Subject: [Ferret-talk] Away for a week In-Reply-To: References: Message-ID: <68b0e3f628c7c8fb56550dcc87569e81@ruby-forum.com> That clears up one question I've had -- based on your late-night (to me) posts, I wondered if you were nocturnal, or on the other side of the world :-) Godspeed. > I'm off to Vietnam for a week on my way home to Australia -- Posted via http://www.ruby-forum.com/. From sdfdsfsdf at Sdfsdf.com Thu Oct 26 16:06:55 2006 From: sdfdsfsdf at Sdfsdf.com (Ghost) Date: Thu, 26 Oct 2006 22:06:55 +0200 Subject: [Ferret-talk] ferret finds 'tests' but not 'test' In-Reply-To: References: <94cbc17ff76e8950daeea9a13b10afd6@ruby-forum.com> Message-ID: <490ff92ace22fc678e620105f75bc5b3@ruby-forum.com> anrake wrote: > Hi, if I use this stemming analyzer, where do I put it ? /lib/ and > require it in each model? > > -Anrake > > David Balmain wrote: >> On 9/6/06, Alastair Moore wrote: Can someone give Can someone give me an idiots guide as to how to implement this custom stemming analyser. I do not know where to start. Thanks for your patience. >>> Alastair >> The default analyzer doesn't perform any stemming. You need to create >> your own analyzer with a stemmer. 
Something like this;
>>
>> require 'rubygems'
>> require 'ferret'
>>
>> module Ferret::Analysis
>>   class MyAnalyzer
>>     def token_stream(field, text)
>>       StemFilter.new(StandardTokenizer.new(text))
>>     end
>>   end
>> end
>>
>> index = Ferret::I.new(:analyzer => Ferret::Analysis::MyAnalyzer.new)
>>
>> index << "test"
>> index << "tests debate debater debating the for,"
>> puts index.search("test").total_hits
>>
>> Hope that helps,
>> Dave

--
Posted via http://www.ruby-forum.com/.

From andreas.korth at gmx.net Thu Oct 26 17:36:19 2006
From: andreas.korth at gmx.net (Andreas Korth)
Date: Thu, 26 Oct 2006 23:36:19 +0200
Subject: [Ferret-talk] ferret finds 'tests' but not 'test'
In-Reply-To: <490ff92ace22fc678e620105f75bc5b3@ruby-forum.com>
References: <94cbc17ff76e8950daeea9a13b10afd6@ruby-forum.com>
 <490ff92ace22fc678e620105f75bc5b3@ruby-forum.com>
Message-ID: <6EABC590-396E-4CB6-A289-56E7D4CB970B@gmx.net>

On 26.10.2006, at 22:06, Ghost wrote:

> Can someone give me an idiots guide as to how to implement this custom
> stemming analyser. I do not know where to start.

1. Create the analyzer as David outlined it and name the file
"my_analyzer.rb". If you put it in /app/models you don't need any
require statements since every .rb file in /app/models gets
automagically 'required' by Rails.

> # file: app/models/my_analyzer.rb
>
> require 'rubygems'
> require 'ferret'
>
> module Ferret::Analysis
>   class MyAnalyzer
>     def token_stream(field, text)
>       StemFilter.new(StandardTokenizer.new(text))
>     end
>   end
> end

2. When you create an Index instance, pass it your analyzer, like so:

index = Ferret::I.new(:analyzer => Ferret::Analysis::MyAnalyzer.new)

3. Test your analyzer, e.g.

index << "walking"
index << "walked"
index << "walks"

index.search("walk").total_hits # -> 3

> Thanks for your patience.

You're welcome. And may I kindly ask you to use a valid email address
and perhaps your real name for future posts?
Kind regards,
Andreas

From mark.puckett at gmail.com Fri Oct 27 03:26:38 2006
From: mark.puckett at gmail.com (Mark Puckett)
Date: Fri, 27 Oct 2006 09:26:38 +0200
Subject: [Ferret-talk] not able to install acts_as_ferret
In-Reply-To:
References: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com>
 <20061020131847.GA24958@cordoba.webit.de>
Message-ID: <044eb642614ff53acead543d527b51da@ruby-forum.com>

David Balmain wrote:
> On 10/20/06, Jens Kraemer wrote:
>> Hi Mark,
>>
>> are you sure there's no firewall blocking things on your side ?
>> The svn: protocol uses Port 3690, sometimes this is a problem with
>> restrictive corporate networks and such.
>>
>> Jens
>
> FYI I can connect from here too. Must be a firewall issue.

Yes, it is. Sorry, and thanks for the reply.

-Mark

--
Posted via http://www.ruby-forum.com/.

From waspfactory at gggggmmmmmail.com Fri Oct 27 05:58:34 2006
From: waspfactory at gggggmmmmmail.com (Ghost)
Date: Fri, 27 Oct 2006 11:58:34 +0200
Subject: [Ferret-talk] ferret finds 'tests' but not 'test'
In-Reply-To: <6EABC590-396E-4CB6-A289-56E7D4CB970B@gmx.net>
References: <94cbc17ff76e8950daeea9a13b10afd6@ruby-forum.com>
 <490ff92ace22fc678e620105f75bc5b3@ruby-forum.com>
 <6EABC590-396E-4CB6-A289-56E7D4CB970B@gmx.net>
Message-ID:

Hi, I'm still having trouble with this. Probably something stupid, but
here goes.

I'm using ferret version 0.13 and aaf.

I created this file in my app/models directory

require 'ferret'
include Ferret

module Ferret::Analysis
  class MyAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end

naming it my_analyzer.rb as directed.

Then, in my ferret model, I have the following declaration.
acts_as_ferret :fields => ['short_description'],
               :analyzer => Ferret::Analysis::MyAnalyzer.new

I tried to rebuild my index but it crashes out with the following error:

>> VoObject.rebuild_index
NameError: uninitialized constant MyAnalyzer
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:123:in `const_missing'
        from script/../config/../config/../app/models/vo_object.rb:14
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:140:in `load'
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:56:in `require_or_load'
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:30:in `depend_on'
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:85:in `require_dependency'
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:98:in `const_missing'
        from /usr/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:131:in `const_missing'
        from (irb):11
>>

Nasty, eh? Any idea what is going on here? Why can't my VoObject model
see the new analyzer?

Thanks again.

> You're welcome. And may I kindly ask you to use a valid email address
> and perhaps your real name for future posts?

I used to post with a valid email address. But then the number of spam
messages I received went from 1 or 2 a week to 50-60 a day. Ruby Forum
used to print the email addresses on the page. Here's a compromise.

Regards
Caspar

--
Posted via http://www.ruby-forum.com/.
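
The NameError above is what one would expect if Rails goes looking for a file whose path is derived from the fully qualified constant name. The following is only a rough sketch of that naming convention, assuming the usual CamelCase-to-underscore rule; expected_file_for is a made-up helper for illustration and is not part of Rails or Ferret.

```ruby
# Hypothetical helper (not a Rails API): derive the file path that a
# convention-based autoloader would be expected to look for, given a
# fully qualified constant name.
def expected_file_for(const_name)
  const_name.split("::").map { |part|
    # Same idea as an "underscore" inflection: CamelCase -> camel_case
    part.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2').
         gsub(/([a-z\d])([A-Z])/, '\1_\2').
         downcase
  }.join("/") + ".rb"
end

puts expected_file_for("Ferret::Analysis::MyAnalyzer")
puts expected_file_for("VoObject")
```

Under that assumption, a class nested in Ferret::Analysis would be looked up in a matching subdirectory of the load path rather than at its top level.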
From andreas.korth at gmx.net Fri Oct 27 08:44:05 2006
From: andreas.korth at gmx.net (Andreas Korth)
Date: Fri, 27 Oct 2006 14:44:05 +0200
Subject: [Ferret-talk] ferret finds 'tests' but not 'test'
In-Reply-To:
References: <94cbc17ff76e8950daeea9a13b10afd6@ruby-forum.com>
 <490ff92ace22fc678e620105f75bc5b3@ruby-forum.com>
 <6EABC590-396E-4CB6-A289-56E7D4CB970B@gmx.net>
Message-ID: <99502401-110C-40D0-8B23-918A040EA6E3@gmx.net>

Hi Caspar,

On 27.10.2006, at 11:58, Ghost wrote:

> Hi I'm still having trouble with this. Probably something stupid but
> here goes.
>
> I created this file in my app/models directory
> naming it my_analyzer.rb as directed.
>
> I tried to rebuild my index but it crashes out with the following
> error:
>
>>> VoObject.rebuild_index
> NameError: uninitialized constant MyAnalyzer

Sorry, I forgot to mention that the directory structure needs to
resemble the module nesting, i.e. the file must go in
app/models/ferret/analysis instead of just app/models.

Cheers,
Andy

From heikowebers at gmx.net Fri Oct 27 11:15:57 2006
From: heikowebers at gmx.net (hawe)
Date: Fri, 27 Oct 2006 17:15:57 +0200
Subject: [Ferret-talk] Regexpr. analyzer
Message-ID:

Hi!

I want to index HTML files, but without the tags, so I was thinking I
would either remove them before indexing (expensive) or set up a
RegExpAnalyzer.

BTW, when using an analyzer, does that mean that everything it rejects
(i.e. whatever the RegExpAnalyzer doesn't match) won't be put into the
index files (i.e. won't bloat them)?

I came up with a simple test, which didn't work in acts_as_ferret and
doesn't work in pure Ferret either. I expected, with the code below,
that only "abc" would be indexed, as only it matches the regexp.
What's wrong?

@index = Ferret::Index::Index.new(:path => 'c:/projects/peter/lib/ferretidx',
                                  :analyzer => RegExpAnalyzer.new(/[a-f]/))

@index << {:id => "15", :title => "Programming Ruby",
           :content => "some thing abc"}

@index.search_each('content:"some"') do |id, score|
  puts "Document #{id} found with a score of #{score}"
end

Thanks a lot, hawe.

--
Posted via http://www.ruby-forum.com/.

From ryansking at gmail.com Fri Oct 27 12:32:53 2006
From: ryansking at gmail.com (Ryan King)
Date: Fri, 27 Oct 2006 09:32:53 -0700
Subject: [Ferret-talk] not able to install acts_as_ferret
In-Reply-To: <044eb642614ff53acead543d527b51da@ruby-forum.com>
References: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com>
 <20061020131847.GA24958@cordoba.webit.de>
 <044eb642614ff53acead543d527b51da@ruby-forum.com>
Message-ID: <846f30c70610270932w404cd73ew5e0b501adee82570@mail.gmail.com>

On 10/27/06, Mark Puckett wrote:
> David Balmain wrote:
> > On 10/20/06, Jens Kraemer wrote:
> >> Hi Mark,
> >>
> >> are you sure there's no firewall blocking things on your side ?
> >> The svn: protocol uses Port 3690, sometimes this is a problem with
> >> restrictive corporate networks and such.
> >>
> >> Jens
> >
> > FYI I can connect from here too. Must be a firewall issue.
>
> Yes, it is. Sorry, and thanks for the reply.

Why not allow HTTP access to the svn repo?

-ryan

From andreas.korth at gmx.net Fri Oct 27 13:45:50 2006
From: andreas.korth at gmx.net (Andreas Korth)
Date: Fri, 27 Oct 2006 19:45:50 +0200
Subject: [Ferret-talk] Regexpr. analyzer
In-Reply-To:
References:
Message-ID: <75E62FF6-B897-4F78-B842-C4E39FE01AB8@gmx.net>

On 27.10.2006, at 17:15, hawe wrote:

> I want to index html files, but w/o the tags, so I was thinking
> either I remove them before I index it (expensive), or put up an
> RegExpAnalyzer.

What's so expensive about stripping the tags prior to adding the html
to the index?
I'm not sure which regex engine RegExpAnalyzer uses, but Ruby's regex
engine is implemented in C, so it shouldn't make much of a difference.

> BTW, when using an analyzer, does that mean that everything which it
> declines (i.e. the RegExpAnalyzer doesn't match) won't be put into the
> index files (i.e. blows it up)?

Yep. That's why you should use this analyzer only for the field that's
used to index the HTML, perhaps by using a PerFieldAnalyzer.

> I came up with a simple test, which didn't work in act_as_ferret, but
> now in pure ferret doesn't work as well. I expected, with the code
> below, that only "abc" will be indexed, as only it matches the
> regexpr.
> What's wrong?
>
> @index = Ferret::Index::Index.new(:path =>
>   'c:/projects/peter/lib/ferretidx',
>   :analyzer => RegExpAnalyzer.new(/[a-f]/))
>
> @index << {:id => "15", :title => "Programming Ruby", :content =>
>   "some thing abc"}
>
> @index.search_each('content:"some"') do |id, score|
>   puts "Document #{id} found with a score of #{score}"
> end

Consider:

index = Ferret::I.new(:analyzer =>
                      Ferret::Analysis::RegExpAnalyzer.new(/[a-f]/))

index << "prose"
index << "fade"

index.search("prose").total_hits # -> 2

What happens is that "prose" becomes "e" and "fade" goes untouched.
Ferret uses the same analyzer for indexing and query parsing. As a
consequence, index.search("prose") becomes index.search("e"), which
matches both "fade" and "prose".

I'd suggest you use a separate tag stripper instead of using
RegExpAnalyzer. Proper tag stripping is not a trivial RegExp,
especially if you're dealing with non-well-formed documents.

HTH,
Andy

From brodaigh at gmail.com Fri Oct 27 21:43:40 2006
From: brodaigh at gmail.com (crissy crissy)
Date: Sat, 28 Oct 2006 03:43:40 +0200
Subject: [Ferret-talk] i cant install acts_as_ferret
In-Reply-To: <20061025113848.GD4769@cordoba.webit.de>
References: <145109d61c1e40ab183163df8814794a@ruby-forum.com>
 <20061025113848.GD4769@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Wed, Oct 25, 2006 at 12:56:02PM +0200, Georgina Lynch wrote:
>> This is what happens when i try to get acts_as_ferret ...."nothing
>> much"....
>> Please help me and excuse me if its really dumb, i'm new to this! thanks
>
> could you please try to checkout aaf with a subversion client of your
> choice ? i.e.
>
> svn co svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret
>
> if that works, just move the acts_as_ferret directory to vendor/plugins
> and you're done. if it doesn't, chances are you're sitting behind a
> firewall that doesn't allow the svn-protocol. if it doesn't, there's a
> quite recent snapshot of the current trunk attached at the bottom of
> this page: http://projects.jkraemer.net/acts_as_ferret
>
> cheers,
> Jens
>
>
> --
> webit! Gesellschaft für neue Medien mbH          www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
> Schnorrstraße 76                         Tel +49 351 46766  0
> D-01069 Dresden                          Fax +49 351 46766 66

Thank you for your reply, Jens.

I can install it on my remote server OK (probably because it's not
Windows), but when I do I get "Rails application failed to start
properly". If I delete aaf, the app works fine again, but with no aaf,
sigh.

Ummm... What am I not doing to get this to work?

--
Posted via http://www.ruby-forum.com/.
From brodaigh at gmail.com Fri Oct 27 21:58:21 2006 From: brodaigh at gmail.com (crissy crissy) Date: Sat, 28 Oct 2006 03:58:21 +0200 Subject: [Ferret-talk] i cant install acts_as_ferret In-Reply-To: References: <145109d61c1e40ab183163df8814794a@ruby-forum.com> <20061025113848.GD4769@cordoba.webit.de> Message-ID: <9248bd3755d6d1aba83e8cfa19c3fef6@ruby-forum.com> crissy crissy wrote: > >> Sorry, The hosting company said they installed the ferret gem but i cant see it listed. So if i can get on to them there shouldnt be a problem. Please ignore last post. sorry! Thanks again -- Posted via http://www.ruby-forum.com/. From wminkstein at gmail.com Sat Oct 28 11:06:56 2006 From: wminkstein at gmail.com (William (Andy) Minkstein) Date: Sat, 28 Oct 2006 17:06:56 +0200 Subject: [Ferret-talk] Search result inconsistencies due to indexing In-Reply-To: References: <20061025094405.GC4769@cordoba.webit.de> <9d0bb41ee6f98c43bf323884d1f55da7@ruby-forum.com> Message-ID: <11c4e0d249c89dd40326426dfe5d397d@ruby-forum.com> Hey David, Thanks for the quick response. Sorry about the delay in responding, we had a deadline we had to meet for this app we're developing and that has been taking up most of my life. I am having a lot of trouble reproducing the bug I first reported. I talked to the writer of the Searchable plugin (Seth) and he is stumped as well. It seems that creating an index from scratch works fine and the bug only occurs when documents are added or deleted. A weird thing happens though once searchable starts updating the index. It creates folders inside the index directory called development and production. Seth claims that this is not desired behavior but I guess I was wondering if having files or directories like that in the index directory would cause it to behave strangely? Another question I have is can certain characters in the text that is being indexed cause problems when trying to retrieve search results? Should I be filtering out any non-alphanumerics? 
Not sure if that matters at all but like I said I am new to this. Andy David Balmain wrote: > On 10/25/06, William (Andy) Minkstein wrote: >> > >> The url for author's page is: http://searchable.rubyforge.org/ There >> are things about it that are very convenient. > > Hehe. I hadn't seen this either. Seth Fitzsimmons, if you're out > there, nice work. I'm going to be adding a DRb server to Ferret soon. > I'll definitely be checking out your code. If you'd like to > contribute, please do. :) > >> >> Yes I did contact the author about this. I checked the Searchable code >> that indexes records and it seems to be pretty consistent with how >> people are indexing with just Ferret itself. I guess I was wondering if >> anyone else experienced having inconsistent search results after >> updating or adding records to their index. > > I haven't seen this problem in version 0.10.13. The way I would go > about debugging it, though, is to store all the fields in the index. > Then if you look at a certain document in the index and it contains > the data you're searching for but doesn't get matched in the search > results it is a bug. In this case you can send me a zipped up copy of > the index and I'll fix the problem. Otherwise I'm not sure there's > much else we can do (unless you can give me ssh access to your > server). > > Cheers, > Dave -- Posted via http://www.ruby-forum.com/. From clarecav at nospamblistit.com Sun Oct 29 16:44:37 2006 From: clarecav at nospamblistit.com (Clare) Date: Sun, 29 Oct 2006 22:44:37 +0100 Subject: [Ferret-talk] Thesaurus search Message-ID: <35ef8784892f0c1c666590a3d25936a6@ruby-forum.com> Can anyone help me with doing searches using thesaurus. I really want to do searches that are simple that I make up. For example, a search on "TV" will bring back results that include "Television" and vice versa. Any help appreciated. Clare -- Posted via http://www.ruby-forum.com/. 
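
A hand-made synonym list of the kind described here can be applied by expanding the search term before it is handed to the index. The following is a minimal sketch in plain Ruby. The SYNONYMS table and the expand_query helper are invented for illustration, and it assumes the query parser in use accepts OR between terms.

```ruby
# Hand-made synonym table; the entries are examples, not a real thesaurus.
SYNONYMS = {
  "tv"         => ["television"],
  "television" => ["tv"]
}

# Hypothetical helper: returns a query string that matches the term or
# any of its listed synonyms. Terms without an entry come back unchanged.
def expand_query(term, synonyms = SYNONYMS)
  words = [term] + synonyms.fetch(term.downcase, [])
  words.uniq.join(" OR ")
end

puts expand_query("TV")
puts expand_query("radio")
```

The expanded string would then be passed to the search call in place of the raw term. Multi-word synonyms would need quoting, which this sketch does not handle.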
From andreas.korth at gmx.net Sun Oct 29 18:45:08 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 30 Oct 2006 00:45:08 +0100 Subject: [Ferret-talk] Thesaurus search In-Reply-To: <35ef8784892f0c1c666590a3d25936a6@ruby-forum.com> References: <35ef8784892f0c1c666590a3d25936a6@ruby-forum.com> Message-ID: <081F10A0-F7CA-4312-BAE9-60E49031936A@gmx.net> On 29.10.2006, at 22:44, Clare wrote: > Can anyone help me with doing searches using thesaurus. > > I really want to do searches that are simple that I make up. For > example, a search on "TV" will bring back results that include > "Television" and vice versa. I'm not quite sure if I get the point of your question. Are you looking for a thesaurus dictionary or a concept of implementing a thesaurus based search with Ferret? There are several thesauri available for download. WordNet, for example: http://www.semanticweb.org/library/ These thesauri are in RDF format which is XML. You could parse an RDF file and write it to a database or generate a huge Ruby hash for fast lookup. You could even use a separate Ferret index as your thesaurus database. Implementing the search is fairly easy. Look up a search term in the thesaurus and use the synonyms to build the actual query for Ferret. Cheers, Andy From lists at qutek.net Sun Oct 29 19:16:23 2006 From: lists at qutek.net (Quinn Harris) Date: Sun, 29 Oct 2006 18:16:23 -0600 Subject: [Ferret-talk] File Store permissions Message-ID: <200610291716.29523.lists@qutek.net> I am using Ferret for a Rails app in which Rails runs as one user but I have other processes that run as a different user that modify the ferret index. This is done in large part to mitigate the damage if a major exploit is found in Rails again. The problem is Ferret creates all its index files with rw for the user only. I have included a small patch that changes Ferret to create these files with the rw permissions for the group based on the parent directory permissions. 
Could this patch or similar find its way into the official releases? A read only file store mode would also be usefull but not essential. Thanks, Quinn diff -puN ext-orig/store.h ext/store.h --- ext-orig/store.h 2006-09-23 22:11:22.000000000 -0600 +++ ext/store.h 2006-10-21 14:36:50.000000000 -0600 @@ -176,6 +176,8 @@ struct Store CompoundStore *cmpd; /* for compound_store only */ } dir; + mode_t file_mode; + HashSet *locks; /** diff -puN ext-orig/fs_store.c ext/fs_store.c --- ext-orig/fs_store.c 2006-09-23 22:11:22.000000000 -0600 +++ ext/fs_store.c 2006-10-21 15:06:47.000000000 -0600 @@ -51,7 +51,7 @@ static void fs_touch(Store *store, char int f; char path[MAX_FILE_PATH]; join_path(path, store->dir.path, filename); - if ((f = creat(path, S_IRUSR | S_IWUSR)) == 0) { + if ((f = creat(path, store->file_mode)) == 0) { RAISE(IO_ERROR, "couldn't create file %s: <%s>", path, strerror(errno)); } @@ -252,7 +252,7 @@ static OutStream *fs_new_output(Store *s { char path[MAX_FILE_PATH]; int fd = open(join_path(path, store->dir.path, filename), - O_WRONLY | O_CREAT | O_BINARY, S_IRUSR | S_IWUSR); + O_WRONLY | O_CREAT | O_BINARY, store->file_mode); OutStream *os; if (fd < 0) { RAISE(IO_ERROR, "couldn't create OutStream %s: <%s>", @@ -431,9 +431,19 @@ static void fs_close_i(Store *store) static Store *fs_store_new(const char *pathname) { + struct stat stt; Store *new_store = store_new(); new_store->dir.path = estrdup(pathname); + + new_store->file_mode = S_IRUSR | S_IWUSR; + if (!stat(new_store->dir.path, &stt) && + stt.st_gid == getgid()) { + if (stt.st_mode & S_IWGRP) + umask(S_IWOTH); + new_store->file_mode |= stt.st_mode & (S_IRGRP | S_IWGRP); + } + new_store->touch = &fs_touch; new_store->exists = &fs_exists; new_store->remove = &fs_remove; From miguel.wong at gmail.com Mon Oct 30 10:32:06 2006 From: miguel.wong at gmail.com (Miguel) Date: Mon, 30 Oct 2006 16:32:06 +0100 Subject: [Ferret-talk] PerFieldAnalyzer and AAF Message-ID: 
<4f882c29d756c84d6123a29992be7e24@ruby-forum.com>

Hi All,

Does anyone know if you can use PerFieldAnalyzer with the
acts_as_ferret method? My goal is to index fields with different
analyzers for a class.

Thanks in advance!

Miguel

--
Posted via http://www.ruby-forum.com/.

From jeffrey at silveregg.co.jp Tue Oct 31 01:55:07 2006
From: jeffrey at silveregg.co.jp (Jeffrey Gelens)
Date: Tue, 31 Oct 2006 15:55:07 +0900
Subject: [Ferret-talk] No search results using Searcher
Message-ID: <1162277707.11601.11.camel@jeffrey.esaka>

I just started using Ferret and I successfully indexed some documents.
I can search this index using the following code:

index = Index::Index.new(:path => path)
index.search_each("something") do |doc, score|
  print "##{doc} #{index[doc]['url']} - #{score}"
  print "\n"
end

However, when I try to use Search::Searcher and QueryParser I don't get
any results. I tried the following code:

queryparser = QueryParser.new()
searcher = Searcher.new(path)
queryparser.fields = searcher.reader.fields
searcher.search(queryparser.parse("something"))

I index all my documents as follows:

index = Index::Index.new(:path => path,
                         :analyzer => Analysis::RegExpAnalyzer.new(/./, false))
index << { :title => title, :url => link, :body => page }

What am I doing wrong?

Thanks!

--
Jeffrey Gelens

From tennisbum2002 at hotmail.com Tue Oct 31 04:02:50 2006
From: tennisbum2002 at hotmail.com (Eric Gross)
Date: Tue, 31 Oct 2006 10:02:50 +0100
Subject: [Ferret-talk] conditional boost? friends to come up at top of search...
Message-ID: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com>

Hey guys, I'm trying to get my friends to come up at the top of the
acts_as_ferret search. I would query the whole result set first and
then move my friends to the top, but I'm paginating my results and use
the offset and limit parameters of the multi_search() function. Does
anyone know how to do this?

Thanks in advance...

--
Posted via http://www.ruby-forum.com/.
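
One possible way to keep pagination while listing friends first is to run two searches (one restricted to friends, one excluding them) and split each page window across the two result lists. The sketch below only shows the offset/limit arithmetic in plain Ruby. split_window is a made-up helper, and whether the two restricted searches can be expressed with multi_search() is an assumption left open here.

```ruby
# Hypothetical helper: split one page window (offset, limit) across two
# ranked result lists, friend hits first and everyone else after them.
# friend_total is the total hit count of the friends-only search.
def split_window(offset, limit, friend_total)
  friend_offset = [offset, friend_total].min
  friend_limit  = [limit, friend_total - friend_offset].min
  other_offset  = [offset - friend_total, 0].max
  other_limit   = limit - friend_limit
  { :friends => [friend_offset, friend_limit],
    :others  => [other_offset, other_limit] }
end

# Page 1 and page 2 of a search with 3 matching friends, 10 hits per page.
p split_window(0, 10, 3)
p split_window(10, 10, 3)
```

Each pair is the offset and limit to pass to the corresponding search. A limit of 0 means that search can be skipped for the page.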
From john at squirl.info Tue Oct 31 12:02:25 2006 From: john at squirl.info (John Mcgrath) Date: Tue, 31 Oct 2006 18:02:25 +0100 Subject: [Ferret-talk] corrupted index preventing save Message-ID: <1ea825f9c558ed0e493a7af70fbd371d@ruby-forum.com> Hi, I'm using Rails/AAF with Ferret 0.10.11, and my index occasionally (every few weeks, roughly) becomes corrupted. If the index is busted, our users are unable to save anything until I rebuild it. I get errors like the one below, and the save rolls back. My question is, is there any way to catch the error and continue with the save even if the model isn't indexed? What would be ideal is if I could catch the error and have it send me a notification, which would solve two problems: the save would still happen, so our users wouldn't be impacted, and I would know exactly when the index had become corrupted and could rebuild it. TIA for any help, John Here's the error I'm getting: Processing UserController#signup (for 72.227.101.170 at 2006-10-31 11:23:06) [POST] Session ID: 7f854fa9eaf95becbb9723a9bd48f9c2 Parameters: {"user"=>{"subscribe_to_newsletter"=>"0", "password_confirmation"=>"[FILTERED]", "terms"=>"1", "password"=>"[FILTERED]", "login"=>"jmcgrath", "email"=>"jmcgrath at whoi.edu"}, "commit"=>"Sign Up", "action"=>"signup", "controller"=>"user"} Unable to send confirmation E-Mail: Lock Error occured at :103 in xpop_context Error occured in index.c:5371 - iw_open Couldn't obtain write lock when opening IndexWriter -- Posted via http://www.ruby-forum.com/. 
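The behaviour asked for above (record the failure, but let the save go through) can be sketched in plain Ruby. This is only an illustration of the rescue pattern, not acts_as_ferret code: IndexLockError, SafeIndexing, safely_index, index_document and rescue_error_in_index are all hypothetical names, and the Record class merely stands in for an ActiveRecord model whose indexing callback may raise. How this would be wired into the plugin's own callbacks is left open here.

```ruby
# Plain-Ruby sketch; no Rails or Ferret required.
# All names below are hypothetical and exist only for this illustration.
class IndexLockError < StandardError; end

module SafeIndexing
  # Runs the indexing block. If it raises, the exception is handed to an
  # optional hook instead of aborting the surrounding save. Without a hook
  # the exception is re-raised unchanged.
  def safely_index
    yield
    true
  rescue StandardError => e
    raise unless respond_to?(:rescue_error_in_index)
    rescue_error_in_index(e)
    false
  end
end

class Record
  include SafeIndexing
  attr_reader :saved, :notifications

  def initialize(index_ok)
    @index_ok = index_ok
    @saved = false
    @notifications = []
  end

  def save
    @saved = true                     # stands in for the database write
    safely_index { index_document }   # stands in for the indexing callback
    @saved
  end

  # Hook called when indexing fails; a real application might send mail here.
  def rescue_error_in_index(error)
    @notifications << "index update failed: #{error.message}"
  end

  private

  # Simulates the index update, failing the way a locked index would.
  def index_document
    raise IndexLockError, "Couldn't obtain write lock" unless @index_ok
  end
end
```

With a healthy index, Record.new(true).save returns true and records no notification. With a failing one, Record.new(false).save still returns true, and the failure message ends up in notifications, which is the point where a rebuild could be triggered.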
From kraemer at webit.de Tue Oct 31 12:35:15 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 31 Oct 2006 18:35:15 +0100 Subject: [Ferret-talk] not able to install acts_as_ferret In-Reply-To: <846f30c70610270932w404cd73ew5e0b501adee82570@mail.gmail.com> References: <7bd13bf4ff73d6a2401a52fa6d142afb@ruby-forum.com> <20061020131847.GA24958@cordoba.webit.de> <044eb642614ff53acead543d527b51da@ruby-forum.com> <846f30c70610270932w404cd73ew5e0b501adee82570@mail.gmail.com> Message-ID: <20061031173515.GC13698@cordoba.webit.de> On Fri, Oct 27, 2006 at 09:32:53AM -0700, Ryan King wrote: > On 10/27/06, Mark Puckett wrote: > > David Balmain wrote: > > > On 10/20/06, Jens Kraemer wrote: > > >> Hi Mark, > > >> > > >> are you sure there's no firewall blocking things on your side ? > > >> The svn: protocol uses Port 3690, sometimes this is a problem with > > >> restrictive corporate networks and such. > > >> > > >> Jens > > > > > > FYI I can connect from here too. Must be a firewall issue. > > > > Yes, it is. Sorry, and thanks for the reply. > > Why not allow HTTP access to the svn repo? Initially I didn't use Apache as web server on this machine, so there was no mod_svn and hence no http access to the repo. I migrated (back) to Apache some time ago, so I could give mod_svn a try. I'll look into this. Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From andreas.korth at gmx.net Tue Oct 31 13:47:30 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Tue, 31 Oct 2006 19:47:30 +0100 Subject: [Ferret-talk] corrupted index preventing save In-Reply-To: <1ea825f9c558ed0e493a7af70fbd371d@ruby-forum.com> References: <1ea825f9c558ed0e493a7af70fbd371d@ruby-forum.com> Message-ID: <8CD4FCFF-8E3C-432A-8796-E5B52F62A631@gmx.net> On 31.10.2006, at 18:02, John Mcgrath wrote: > Hi, I'm using Rails/AAF with Ferret 0.10.11, and my index occasionally > (every few weeks, roughly) becomes corrupted. > > If the index is busted, until I rebuild it our users are unable to > save > anything. I get errors like the one below, and the save rolls back. The acts_as_ferret plugin employs ActiveRecord callbacks such as after_update to index the models. If an exception is thrown inside a callback method, the action is rolled back. > My question is, is there any way to catch the error, and continue with > the save even if the model isn't indexed? Several ways. You could override the save method (either on a per-model basis or for ActiveRecord::Base) to read: def save begin create_or_update rescue => any_exception # deal with exceptions you can handle or re-raise end end Or, even better, you could patch the acts_as_ferret code to resort to a callback such as "rescue_error_in_ferret". See the 'ferret_create' method of 'acts_as_ferret/lib/instance_methods.rb'. You'd basically wrap the method in a begin/rescue block and see if the model respond_to? :rescue_error_in_ferret. If it does, call that method or else re-raise the exception. Cheers, Andy From mark.puckett at gmail.com Tue Oct 31 16:10:33 2006 From: mark.puckett at gmail.com (Mark Puckett) Date: Tue, 31 Oct 2006 22:10:33 +0100 Subject: [Ferret-talk] searchable or acts_as_ferret or neither? 
Message-ID: <73119141bf0e84da110ae6d0e1a7ef78@ruby-forum.com> Has anybody tried both and favored one over the other? Or maybe tried both and in the end used neither? I'm having perf issues (generic searches across 4 fields take upwards of 120 seconds) and index rebuild issues (I'm not sure how long a full rebuild takes, since the process either gets killed or dies due to a file lock exception) with a dreamhost site that is using AAF, has 65k rows in 1 model, and has 4 columns indexed, using the StandardAnalyzer (only because using the Stemming Analyzer appeared to make the issues significantly worse). I'm using ferret (0.10.11) and the 10/12 trunk dump of aaf. So I'm wondering if I might find better performance and/or fewer issues with searchable, or if ultimately I just need to (RTFM and) write the code myself. Any (constructive) suggestions are appreciated. -Mark -- Posted via http://www.ruby-forum.com/.