From jan.prill at gmail.com Wed Nov 1 03:26:42 2006 From: jan.prill at gmail.com (Jan Prill) Date: Wed, 1 Nov 2006 08:26:42 +0000 Subject: [Ferret-talk] searchable or acts_as_ferret or neither? In-Reply-To: <73119141bf0e84da110ae6d0e1a7ef78@ruby-forum.com> References: <73119141bf0e84da110ae6d0e1a7ef78@ruby-forum.com> Message-ID: <562a35c10611010026k3807ccc5h8217efccc9abb381@mail.gmail.com> Hi Mark, would you mind posting some of your search-code? How is the performance on your development machine. There's something going wrong big time in your dreamhost installation.. No chance a query is taking two minutes on this very moderate amount of data. Cheers, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061101/a3ba8a2d/attachment.html From kraemer at webit.de Wed Nov 1 06:07:55 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 1 Nov 2006 12:07:55 +0100 Subject: [Ferret-talk] corrupted index preventing save In-Reply-To: <8CD4FCFF-8E3C-432A-8796-E5B52F62A631@gmx.net> References: <1ea825f9c558ed0e493a7af70fbd371d@ruby-forum.com> <8CD4FCFF-8E3C-432A-8796-E5B52F62A631@gmx.net> Message-ID: <20061101110755.GL4769@cordoba.webit.de> On Tue, Oct 31, 2006 at 07:47:30PM +0100, Andreas Korth wrote: > > On 31.10.2006, at 18:02, John Mcgrath wrote: > > > Hi, I'm using Rails/AAF with Ferret 0.10.11, and my index occasionally > > (every few weeks, roughly) becomes corrupted. > > > > If the index is busted, until I rebuild it our users are unable to > > save > > anything. I get errors like the one below, and the save rolls back. > > The acts_as_ferret plugin employs ActiveRecord callbacks such as > after_update to index the models. If an exception is thrown inside a > callback method, the action is rolled back. > > > My question is, is there any way to catch the error, and continue with > > the save even if the model isn't indexed? > > Several ways. 
You could overwrite the save method (either on a per-
> model-basis or for ActiveRecord::Base) to read:
>
>   def save
>     begin
>       create_or_update
>     rescue => any_exception
>       # deal with exceptions you can handle or re-raise
>     end
>   end
>
> Or, even better, you could patch the acts_as_ferret code to resort to
> a callback such as "rescue_error_in_ferret". See the 'ferret_create'
> method of 'acts_as_ferret/lib/instance_methods.rb'. You'd basically
> wrap the method in a begin/rescue block and see if the model
> respond_to? :rescue_error_in_ferret. If it does, call that method or
> else re-raise the exception.

Overwriting the callback handlers in your model would be another
possibility:

class MyModel < AR::Base
  acts_as_ferret ...

  # ferret_create is declared by aaf, and used for before_update and
  # before_create events.
  alias :old_ferret_create :ferret_create
  def ferret_create
    old_ferret_create
  rescue
    # handle the error...
    true # tell AR everything is fine
  end
end

Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Wed Nov 1 08:04:08 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Nov 2006 14:04:08 +0100
Subject: [Ferret-talk] No search results using Searcher
In-Reply-To: <1162277707.11601.11.camel@jeffrey.esaka>
References: <1162277707.11601.11.camel@jeffrey.esaka>
Message-ID: <20061101130408.GO4769@cordoba.webit.de>

On Tue, Oct 31, 2006 at 03:55:07PM +0900, Jeffrey Gelens wrote:
> I just started using Ferret and I successfully indexed some documents. I
> can search this index using the following code:
>
> index = Index::Index.new(:path => path)
> index.search_each("something") do |doc, score|
>   print "##{doc} #{index[doc]['url']} - #{score}"
>   print "\n"
> end
>
> However, when I try to use Search::Searcher and QueryParser I don't get
> any results. I tried the following code:
>
> queryparser = QueryParser.new()
> searcher = Searcher.new(path)
> queryparser.fields = searcher.reader.fields
> searcher.search(queryparser.parse("something"))
>
> I index all my documents as follows:
>
> index = Index::Index.new(:path => path, :analyzer =>
>   Analysis::RegExpAnalyzer.new(/./, false))
> index << { :title => title, :url => link, :body => page }
>
> What am I doing wrong?

Basically you should use the same analyzer to analyze queries as you
used to analyze your content. So if you construct your queryparser like
this:

qp = QueryParser.new(:analyzer => Analysis::RegExpAnalyzer.new(/./, false))

your searches should work.

However, your regexp for the analyzer looks strange - /./ matches every
single character, including whitespace. So each field's value would be
indexed as 1-character long terms, which probably is not what you want.

However, I don't know why searching through the Index class worked; I'd
expect that not to work either.

Jens

--
webit!
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Wed Nov 1 08:05:42 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 1 Nov 2006 14:05:42 +0100 Subject: [Ferret-talk] conditional boost? friends to come up at top of search... In-Reply-To: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> References: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> Message-ID: <20061101130542.GP4769@cordoba.webit.de> On Tue, Oct 31, 2006 at 10:02:50AM +0100, Eric Gross wrote: > Hey guys, im trying to get my friends to come up at the top of the act > as ferret search. I would query the whole result set first, then move my > friends to the top, but the thing is, Im paginating my results and use > the offset and limit parameters in the multi_search() function. > > Anyone know how to do this? We'd need some more info on how you store your friend-of-relationship, and how your index looks like (i.e. what fields does it contain). cheers, Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From curtis.hatter at insightbb.com Wed Nov 1 09:54:25 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Wed, 1 Nov 2006 09:54:25 -0500 Subject: [Ferret-talk] aaf and stop words; query parser Message-ID: <002f01c6fdc5$a009f6b0$0202a8c0@again> I've been trying to implement acts_as_ferret in my latest project and ran into a snag. If I do a search for 'auditor state' then the search works perfectly. If I include a stop word, as in 'auditor of state', then I get no results. I'd prefer not to set stop words to nil and index everything. The solution, that I have yet to attempt, is to use Ferret::QueryParser instead of passing the query as a string to the search method. 
I couldn't find a way to do this with the current acts_as_ferret plugin and was wondering if modifying the plugin to have a "ferret_query_parser" method would be better than trying to use Ferret directly from my app model. Also, wouldn't this approach be necessary if I implement my own analyzer? I was thinking of possibly using the double metaphone algorithm and thinking that without the query parser to analyze the search string using my custom analyzer that I wouldn't get any results. I hope that I haven't missed something obvious in aaf's api. On a side note, is there any recommended place to place custom analyzers for rails apps? Thanks, Curtis -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061101/ad65daf5/attachment-0001.html From kraemer at webit.de Wed Nov 1 12:27:31 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 1 Nov 2006 18:27:31 +0100 Subject: [Ferret-talk] aaf and stop words; query parser In-Reply-To: <002f01c6fdc5$a009f6b0$0202a8c0@again> References: <002f01c6fdc5$a009f6b0$0202a8c0@again> Message-ID: <20061101172731.GA16601@cordoba.webit.de> Hi! On Wed, Nov 01, 2006 at 09:54:25AM -0500, Curtis Hatter wrote: > I've been trying to implement acts_as_ferret in my latest project and ran into a snag. If I do a search for 'auditor state' then the search works perfectly. If I include a stop word, as in 'auditor of state', then I get no results. I'd prefer not to set stop words to nil and index everything. what version of AAF/Ferret do you use ? Afair that issue isn't new, and should have been fixed some time ago. cheers, Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From curtis.hatter at insightbb.com Wed Nov 1 13:14:21 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Wed, 1 Nov 2006 13:14:21 -0500 Subject: [Ferret-talk] aaf and stop words; query parser References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> Message-ID: <002801c6fde1$8dcb2ed0$0202a8c0@again> Currently I'm using AAF 0.10 and windows build of Ferret version 0.10.9 I'm currently moving my development platform to a FreeBSD machine which is why I haven't been able to do much testing. The FreeBSD version will be 0.10.13 I looked into the archives I have but only solution I found was to set the stopwords to nil. Thanks, Curtis ----- Original Message ----- From: "Jens Kraemer" To: Sent: Wednesday, November 01, 2006 12:27 PM Subject: Re: [Ferret-talk] aaf and stop words; query parser Hi! On Wed, Nov 01, 2006 at 09:54:25AM -0500, Curtis Hatter wrote: > I've been trying to implement acts_as_ferret in my latest project and ran into a snag. If I do a search for 'auditor state' then the search works perfectly. If I include a stop word, as in 'auditor of state', then I get no results. I'd prefer not to set stop words to nil and index everything. what version of AAF/Ferret do you use ? Afair that issue isn't new, and should have been fixed some time ago. cheers, Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
_______________________________________________
Ferret-talk mailing list
Ferret-talk at rubyforge.org
http://rubyforge.org/mailman/listinfo/ferret-talk

From anotherbritt at gmail.com Thu Nov 2 13:28:38 2006
From: anotherbritt at gmail.com (Britt Selvitelle)
Date: Thu, 2 Nov 2006 19:28:38 +0100
Subject: [Ferret-talk] highlighting with find_by_contents
Message-ID: <7c12e24c9b9539b7e21bf84296c52290@ruby-forum.com>

I'm trying to highlight keyword snippets using the highlight method of
the results returned from find_by_contents (the actual models), but
always come up with an empty array. Any ideas what could be going wrong?

--
Posted via http://www.ruby-forum.com/.

From curtis.hatter at insightbb.com Thu Nov 2 16:48:04 2006
From: curtis.hatter at insightbb.com (curtis.hatter at insightbb.com)
Date: Thu, 02 Nov 2006 16:48:04 -0500
Subject: [Ferret-talk] highlighting with find_by_contents
In-Reply-To: <7c12e24c9b9539b7e21bf84296c52290@ruby-forum.com>
References: <7c12e24c9b9539b7e21bf84296c52290@ruby-forum.com>
Message-ID:

Have you defined the field(s) as storable?

Link to FieldInfo class:
http://ferret.davebalmain.com/api/classes/Ferret/Index/FieldInfo.html

You need to set up your fields with acts_as_ferret. This is how I have
mine set up (still learning Ferret and AAF so may not be totally correct
but highlighting works):

acts_as_ferret( :fields => {
  :name => {},
  :desc => {},
  :body => {:store => :yes},
  :role => {},
})

This allows me to use the highlighting with the "body" field. The other
ones still can't highlight.
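
As a rough illustration of why only the stored field can be highlighted, here is a toy model in plain Ruby. It does not use Ferret at all, and ToyField with its methods is a hypothetical name made up for this sketch. The assumption behind it is simply that an excerpt can only be built from text the index has kept, while matching needs nothing more than the terms.

```ruby
# Toy model only; this is not Ferret's implementation and every name
# here (ToyField, match?, highlight) is hypothetical.
class ToyField
  def initialize(text, store:)
    # The terms are always kept: enough to answer "does this field match?"
    @terms = text.downcase.scan(/\w+/)
    # The original text is kept only when the field is marked as stored.
    @stored_text = store ? text : nil
  end

  def match?(term)
    @terms.include?(term.downcase)
  end

  # Returns an array of highlighted excerpts, or [] when the original
  # text was not stored (or the term does not occur).
  def highlight(term)
    return [] unless @stored_text && match?(term)
    [@stored_text.gsub(/\b#{Regexp.escape(term)}\b/i) { |hit| "<b>#{hit}</b>" }]
  end
end

body = ToyField.new("Ferret makes search easy", store: true)
name = ToyField.new("Ferret notes", store: false)

stored_result   = body.highlight("ferret")
unstored_result = name.highlight("ferret")
```

In this sketch both fields match the term, but only the stored one can return a snippet; the other returns an empty array, which is the same symptom described at the start of the thread.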
Curtis ----- Original Message ----- From: Britt Selvitelle Date: Thursday, November 2, 2006 13:57 Subject: [Ferret-talk] highlighting with find_by_contents To: ferret-talk at rubyforge.org > I'm trying to highlight keyword snippets using the highlight > method of > the results returned from find_by_contents (the actual models), > but > always come up with an empty array. Any ideas what could be > going wrong? > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061102/ae89831a/attachment.html From cgansen at gmail.com Thu Nov 2 17:57:00 2006 From: cgansen at gmail.com (Chris Gansen) Date: Thu, 2 Nov 2006 16:57:00 -0600 Subject: [Ferret-talk] Indexing and searching across multiple locales Message-ID: Hi - I'm currently investigating support for Ferret and content that spans multiple locales. I am particularly interested in using stemming and fuzzy searches (e.g. with slop factor) across multiple locales. So far I've followed the online docs for implementing a Stemming Analyzer, and it is working for English terms just fine. I've also written a method to import data from the legacy XML files and save as ActiveRecord objects (using AAF). However, I'm not certain the the locale-switching is working properly: doc = Document.import_from_xml(filename) Ferret::locale = doc.locale_id # locale_id is "en.UTF-8" or "fr.UTF-8" for example doc.save What's the best way to handle the import of data, where locale is changing from document to document? What other considerations should I keep in mind when using Ferret across multiple locales? Thanks for any tips! --chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061102/752792e0/attachment.html From dan at tut0r.com Thu Nov 2 18:47:21 2006 From: dan at tut0r.com (Dan Yelp) Date: Fri, 3 Nov 2006 00:47:21 +0100 Subject: [Ferret-talk] Safe to read index while it is being written to? Message-ID: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> Is it safe to open the index to do searches on while another process is writing to the index? -- Posted via http://www.ruby-forum.com/. From erik at ehatchersolutions.com Thu Nov 2 21:48:28 2006 From: erik at ehatchersolutions.com (Erik Hatcher) Date: Thu, 2 Nov 2006 21:48:28 -0500 Subject: [Ferret-talk] Safe to read index while it is being written to? In-Reply-To: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> References: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> Message-ID: <87493A44-CF8F-414D-A6F2-E4AFB7E043A2@ehatchersolutions.com> On Nov 2, 2006, at 6:47 PM, Dan Yelp wrote: > Is it safe to open the index to do searches on while another > process is > writing to the index? Yes. The caveat is that an IndexReader or IndexSearcher only sees what was indexed at the time the index was opened, and will not see documents written since. Seeing changes requires re-opening. See Solr discussions on auto-warming if you need faster searches out of the gate. Erik From erik at ehatchersolutions.com Fri Nov 3 06:25:33 2006 From: erik at ehatchersolutions.com (Erik Hatcher) Date: Fri, 3 Nov 2006 06:25:33 -0500 Subject: [Ferret-talk] Safe to read index while it is being written to? In-Reply-To: <87493A44-CF8F-414D-A6F2-E4AFB7E043A2@ehatchersolutions.com> References: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> <87493A44-CF8F-414D-A6F2-E4AFB7E043A2@ehatchersolutions.com> Message-ID: <78EB930A-BA04-4C49-A729-B98B8C7433A6@ehatchersolutions.com> oops... i replied as if this was the java-user list, not the Ferret one. sorry for mixing up my languages. ai ya! 
On Nov 2, 2006, at 9:48 PM, Erik Hatcher wrote: > > On Nov 2, 2006, at 6:47 PM, Dan Yelp wrote: >> Is it safe to open the index to do searches on while another >> process is >> writing to the index? > > Yes. The caveat is that an IndexReader or IndexSearcher only sees > what was indexed at the time the index was opened, and will not see > documents written since. Seeing changes requires re-opening. See > Solr discussions on auto-warming if you need faster searches out of > the gate. > > Erik > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From anotherbritt at gmail.com Fri Nov 3 10:23:06 2006 From: anotherbritt at gmail.com (Britt Selvitelle) Date: Fri, 3 Nov 2006 10:23:06 -0500 Subject: [Ferret-talk] highlighting with find_by_contents In-Reply-To: References: <7c12e24c9b9539b7e21bf84296c52290@ruby-forum.com> Message-ID: <9fd96fa70611030723k1dad5c19re5e801520dff9fd5@mail.gmail.com> Thanks Curtis and Jens. That worked great (after rebuilding index)! Britt On 11/2/06, curtis.hatter at insightbb.com wrote: > Have you defined the field(s) as storable? > > Link to FieldInfo class: > http://ferret.davebalmain.com/api/classes/Ferret/Index/FieldInfo.html > > You need to set up your fields with acts_as_ferret. This is how I have mine > setup (still learning Ferret and AAF so may not be totally correct but > highlighting works): > > acts_as_ferret( :fields => { > :name => {}, > :desc => {}, > :body => {:store => :yes}, > :role => {}, > }) > > This allows me to use the highlighting with the "body" field. The other ones > still can't highlight. 
> > Curtis > > ----- Original Message ----- > From: Britt Selvitelle > Date: Thursday, November 2, 2006 13:57 > Subject: [Ferret-talk] highlighting with find_by_contents > To: ferret-talk at rubyforge.org > > > I'm trying to highlight keyword snippets using the highlight > > method of > > the results returned from find_by_contents (the actual models), > > but > > always come up with an empty array. Any ideas what could be > > going wrong? > > > > -- > > Posted via http://www.ruby-forum.com/. > > _______________________________________________ > > Ferret-talk mailing list > > Ferret-talk at rubyforge.org > > http://rubyforge.org/mailman/listinfo/ferret-talk > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > > From fastjames at gmail.com Fri Nov 3 15:17:51 2006 From: fastjames at gmail.com (Jim Kane) Date: Fri, 3 Nov 2006 21:17:51 +0100 Subject: [Ferret-talk] Safe to read index while it is being written to? In-Reply-To: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> References: <28a7193190a403b6853c595fe94a50fa@ruby-forum.com> Message-ID: <7fc3ea6e72c1982188168173eceda963@ruby-forum.com> Dan Yelp wrote: > Is it safe to open the index to do searches on while another process is > writing to the index? I don't know the technical details, but I have experienced some problems when trying to do this. I have an index of about 275K short documents, and in an ideal world I'd be updating it continuously. Right now I wait until the off-hours because if I'm updating the index at all, searches tend to cause my mongrel procs to either hang (100% CPU util), or segfault (documented in a ticket on the ferret track). I have processes in place to correct these problems when they occur, but it's still discouraging because I can't keep the data as fresh as I'd like it to be. Jim -- Posted via http://www.ruby-forum.com/. 
From andreas.korth at gmx.net Fri Nov 3 16:21:53 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Fri, 3 Nov 2006 22:21:53 +0100 Subject: [Ferret-talk] Indexing and searching across multiple locales In-Reply-To: References: Message-ID: <9201B88B-178C-4174-9E4B-DFD41C34A619@gmx.net> These are very good questions indeed. I'm afraid I don't have the answers but I'd like to add some questions and remarks of my own and hope someone will eventually provide some insight. On 02.11.2006, at 23:57, Chris Gansen wrote: > I'm currently investigating support for Ferret and content that > spans multiple locales. I am particularly interested in using > stemming and fuzzy searches (e.g. with slop factor) across multiple > locales. > > So far I've followed the online docs for implementing a Stemming > Analyzer, and it is working for English terms just fine. I've also > written a method to import data from the legacy XML files and save > as ActiveRecord objects (using AAF). However, I'm not certain the > the locale-switching is working properly: > > doc = Document.import_from_xml(filename) > Ferret::locale = doc.locale_id # locale_id is "en.UTF-8" or > "fr.UTF-8" for example > doc.save I don't think setting the locale has any effect on already created StemFilters and StopFilters, so the above code doesn't change anything. According to the docs the locale setting doesn't even affect the default stop words or stemming algorithms used when creating a new StopFilter or StemFilter, respectively. The default language is English in both cases, no matter what the current locale is. This leads me to the ultimate question: What is the locale setting good for anyway? Could it be that only the character encoding portion of the locale string is actually relevant? > What's the best way to handle the import of data, where locale is > changing from document to document? What other considerations > should I keep in mind when using Ferret across multiple locales? 
From what I have observed, you'll need to create different Analyzers with a StemFilter and StopFilter explicitly created for the respective locale. I don't know about French but the German stemming algorithm is very inaccurate. Stemming algorithms for the English language are probably easier to implement, since German and French have more complex rules and lots of exceptions. But even the English stemming algorithm seems to be entirely rule-based and thus fails on irregular verbs. I think it might be a good idea to provide a facility to extend the stemmer, very much like the inflection rules can be extended in Rails. Cheers, Andy From cgansen at gmail.com Fri Nov 3 18:18:49 2006 From: cgansen at gmail.com (Chris Gansen) Date: Fri, 3 Nov 2006 17:18:49 -0600 Subject: [Ferret-talk] Indexing and searching across multiple locales In-Reply-To: <9201B88B-178C-4174-9E4B-DFD41C34A619@gmx.net> References: <9201B88B-178C-4174-9E4B-DFD41C34A619@gmx.net> Message-ID: On 11/3/06, Andreas Korth wrote: > > These are very good questions indeed. I'm afraid I don't have the > answers but I'd like to add some questions and remarks of my own and > hope someone will eventually provide some insight. > Thanks for the response. I guess my real question is: how have other people handled indexing data across many locales? What works and what doesn't? From my initial work, the basic indexing works across languages; however, it's the "fun" stuff like stemming and fuzzy searches that I am particularly interested in. Any pointers are appreciated. --chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061103/cf0d95bc/attachment.html From bk at benjaminkrause.com Sat Nov 4 10:52:54 2006 From: bk at benjaminkrause.com (Benjamin Krause) Date: Sat, 04 Nov 2006 16:52:54 +0100 Subject: [Ferret-talk] Indexing and searching across multiple locales In-Reply-To: References: <9201B88B-178C-4174-9E4B-DFD41C34A619@gmx.net> Message-ID: <454CB756.3040401@benjaminkrause.com> Chris Gansen schrieb: > Thanks for the response. I guess my real question is: how have other > people handled indexing data across many locales? What works and what > doesn't? From my initial work, the basic indexing works across > languages; however, it's the "fun" stuff like stemming and fuzzy > searches that I am particularly interested in. Hey Chris, i store content in different languages in different fields.. i have an object, that has content in de/pl/en and i got a field content_de, content_en and content_pl for that object. now i can implement a per_field_analyzer to stem each field in its locale. this might not exactly match your example, as this is really one db-object with different translations attached to it, not different objects in different languages. Ben From reverri at gmail.com Sat Nov 4 18:53:35 2006 From: reverri at gmail.com (Daniel Reverri) Date: Sun, 5 Nov 2006 00:53:35 +0100 Subject: [Ferret-talk] index updates Message-ID: The ferret documentation reports that using the option "key" can affect performance of indexing operations. If the option "key" is used when creating a ferret index, is there a performance hit when creating a new document, or is the performance hit only for updated records? -- Posted via http://www.ruby-forum.com/. 
From navneetaron at gmail.com Sun Nov 5 22:39:10 2006 From: navneetaron at gmail.com (Navneet Aron) Date: Mon, 6 Nov 2006 04:39:10 +0100 Subject: [Ferret-talk] NameError uninitialized constant Ferret::Index::FieldInfos Message-ID: <4136acc47cbb47a83a5a190e7a823557@ruby-forum.com> Hi Everyone, I've a RoR application. I am trying to build full text search capability into it. I installed Ferret. After that I installed the act_as_ferret plugin. I've also put the acts_as_ferret inside the .rb file . I'm using the find_by_contents to get the search results. I'm getting the following error. I've no clue and I didn't find any previous posts discussing this issue . I'll really appreciate if any of you can point out what I might be doing wrong. NameError in MaintenanceController#search uninitialized constant Ferret::Index::FieldInfos RAILS_ROOT: script/../config/.. Application Trace | Framework Trace | Full Trace vendor/rails/activesupport/lib/active_support/dependencies.rb:260:in `load_missing_constant' vendor/rails/activesupport/lib/active_support/dependencies.rb:431:in `const_missing' vendor/plugins/acts_as_ferret/lib/class_methods.rb:170:in `rebuild_index' vendor/plugins/acts_as_ferret/lib/class_methods.rb:223:in `create_index_instance' vendor/plugins/acts_as_ferret/lib/class_methods.rb:216:in `ferret_index' vendor/plugins/acts_as_ferret/lib/class_methods.rb:381:in `find_id_by_contents' vendor/plugins/acts_as_ferret/lib/class_methods.rb:248:in `find_by_contents' app/controllers/maintenance_controller.rb:99:in `search' -- Posted via http://www.ruby-forum.com/. 
From kraemer at webit.de Mon Nov 6 03:31:49 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 6 Nov 2006 09:31:49 +0100 Subject: [Ferret-talk] NameError uninitialized constant Ferret::Index::FieldInfos In-Reply-To: <4136acc47cbb47a83a5a190e7a823557@ruby-forum.com> References: <4136acc47cbb47a83a5a190e7a823557@ruby-forum.com> Message-ID: <20061106083148.GA14929@cordoba.webit.de> On Mon, Nov 06, 2006 at 04:39:10AM +0100, Navneet Aron wrote: > Hi Everyone, > I've a RoR application. I am trying to build full text search capability > into it. I installed Ferret. After that I installed the act_as_ferret > plugin. I've also put the acts_as_ferret inside the .rb file . > I'm using the find_by_contents to get the search results. > > I'm getting the following error. I've no clue and I didn't find any > previous posts discussing this issue . I'll really appreciate if any of > you can point out what I might be doing wrong. > > NameError in MaintenanceController#search > > uninitialized constant Ferret::Index::FieldInfos looks like aaf can't find Ferret. What happens if you try to require 'ferret' in script/console ? Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Mon Nov 6 13:53:49 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Mon, 6 Nov 2006 19:53:49 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
Message-ID:

I have the following models:

class Book < ActiveRecord::Base
  acts_as_ferret
  belongs_to :author
end

class Author < ActiveRecord::Base
  has_many :books
end

and in the controller:

def search
  if params[:query]
    @query = params[:query]
    @total, @books = Book.full_text_search(@query, :page => (params[:page]||1))
    @pages = pages_for(@total)
  else
    @books = []
  end
end

I can use the acts_as_ferret plugin to search for a book by title but it
isn't picking up the authors through the association. So I can't do a
search for books by author.

I've a feeling I've left some config out here. Can anyone help me out?

--
Posted via http://www.ruby-forum.com/.

From anotherbritt at gmail.com Mon Nov 6 15:07:22 2006
From: anotherbritt at gmail.com (Britt Selvitelle)
Date: Mon, 6 Nov 2006 15:07:22 -0500
Subject: [Ferret-talk] Updating the index.
Message-ID: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com>

I've been reading through the lucene, ferret, and aaf docs, but I'm
still a bit new at full text indexing.

When I create a new instance of an indexed model, and save it, it
doesn't show up in searches. Should I have to update the index of the
entire model (which works) before it will return in queries?

Britt

From anotherbritt at gmail.com Mon Nov 6 15:56:51 2006
From: anotherbritt at gmail.com (Britt Selvitelle)
Date: Mon, 6 Nov 2006 15:56:51 -0500
Subject: [Ferret-talk] Updating the index.
In-Reply-To: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com> References: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com> Message-ID: <9fd96fa70611061256i55d25b8dm5bc55aea45c632b3@mail.gmail.com> Ah, is this the answer here? http://www.ruby-forum.com/topic/81658#new Which is the same problem I'm seeing. Britt On 11/6/06, Britt Selvitelle wrote: > I've been reading through the lucene, ferret, and aaf docs, but I'm > still a bit new at full text indexing. > > When I create a new instance of an indexed model, and save it, it > doesn't show up in searches. Should I have to update the index of the > entire model (which works) before it will return in queries? > > Britt > From howardmoon at hitcity.com.au Mon Nov 6 17:47:03 2006 From: howardmoon at hitcity.com.au (Pete Royle) Date: Mon, 6 Nov 2006 23:47:03 +0100 Subject: [Ferret-talk] acts_as_ferret and associations In-Reply-To: References: Message-ID: Hi Matthew I'm fairly new to AAF as well, but I think if you read through this: http://projects.jkraemer.net/acts_as_ferret/rdoc/classes/FerretMixin/Acts/ARFerret/ClassMethods.html#M000006 it might help you. I think essentially it boils down to the fact that attributes are indexed by default but associations are not, so you might have to pass some options in your call to acts_as_ferret. Pete. -- Posted via http://www.ruby-forum.com/. From howardmoon at hitcity.com.au Mon Nov 6 18:08:10 2006 From: howardmoon at hitcity.com.au (Pete Royle) Date: Tue, 7 Nov 2006 00:08:10 +0100 Subject: [Ferret-talk] acts_as_ferret and associations In-Reply-To: References: Message-ID: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> To clarify, say your Author class has the fields, 'first_name' and 'surname'. 
To have these indexed by the Book class you would need to do two things:

1) Create methods in Book.rb to access these fields:

def author_first_name
  return author.first_name
end

def author_surname
  return author.surname
end

2) Pass these fields as options to the call to acts_as_ferret:

class Book < ActiveRecord::Base
  acts_as_ferret :additional_fields => ['author_first_name',
    'author_surname']
  belongs_to :author
end

I'm not sure if there's a more elegant solution, but that should do the
trick.

Pete.

--
Posted via http://www.ruby-forum.com/.

From jeffrey at silveregg.co.jp Mon Nov 6 22:36:18 2006
From: jeffrey at silveregg.co.jp (Jeffrey Gelens)
Date: Tue, 07 Nov 2006 12:36:18 +0900
Subject: [Ferret-talk] No search results using Searcher
Message-ID: <1162870578.21503.5.camel@jeffrey.esaka>

> On Tue, Oct 31, 2006 at 03:55:07PM +0900, Jeffrey Gelens wrote:
> > I just started using Ferret and I successfully indexed some documents. I
> > can search this index using the following code:
> >
> > index = Index::Index.new(:path => path)
> > index.search_each("something") do |doc, score|
> >   print "##{doc} #{index[doc]['url']} - #{score}"
> >   print "\n"
> > end
> >
> > However, when I try to use Search::Searcher and QueryParser I don't get
> > any results. I tried the following code:
> >
> > queryparser = QueryParser.new()
> > searcher = Searcher.new(path)
> > queryparser.fields = searcher.reader.fields
> > searcher.search(queryparser.parse("something"))
> >
> > I index all my documents as follows:
> >
> > index = Index::Index.new(:path => path, :analyzer =>
> >   Analysis::RegExpAnalyzer.new(/./, false))
> > index << { :title => title, :url => link, :body => page }
> >
> > What am I doing wrong?
>
> Basically you should use the same analyzer to analyze queries as you
> used to analyze your content. So constructing your queryparser like
> this:
> qp = QueryParser.new(:analyzer => Analysis::RegExpAnalyzer.new(/./, false))
> your searches should work.
> > However, your regexp for the analyzer looks strange - /./ matches > every > single character, including whitespace. So each field's value would be > indexed as 1-character long terms, which probably is not what you > want. > > However I don't know why searching through the Index class worked, > I'd > suspect it not to work, too. > > > Jens Now I constructed the queryparser using the same analyzer and it seems that was indeed the problem. Thanks. The reason why I'm using the regexp /./ is that I'm indexing Japanese sites. I found a solution on this mailinglist to use a RegExpAnalyzer using this regexp for asian characters. It is working fine. Am I using it correctly or should I index Japanese characters ? -- Jeffrey Gelens From kraemer at webit.de Tue Nov 7 02:36:24 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 7 Nov 2006 08:36:24 +0100 Subject: [Ferret-talk] acts_as_ferret and associations In-Reply-To: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> Message-ID: <20061107073623.GA27583@cordoba.webit.de> On Tue, Nov 07, 2006 at 12:08:10AM +0100, Pete Royle wrote: [..] > 1) Create methods in Book.rb to access these fields: [..] > 2) Pass these fields as options to the call to acts_as_ferret: > > class Book < ActiveRecord::Base > acts_as_ferret :additional_fields => ['author_first_name', > 'author_surname'] > belongs_to :author > end > > I'm not sure it there's a more elegant solution, but that should do the > trick. that's exactly what I would have suggested, but please use symbols for field names in your call to acts_as_ferret. cheers, Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 7 02:39:32 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 7 Nov 2006 08:39:32 +0100 Subject: [Ferret-talk] Updating the index. In-Reply-To: <9fd96fa70611061256i55d25b8dm5bc55aea45c632b3@mail.gmail.com> References: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com> <9fd96fa70611061256i55d25b8dm5bc55aea45c632b3@mail.gmail.com> Message-ID: <20061107073932.GB27583@cordoba.webit.de> On Mon, Nov 06, 2006 at 03:56:51PM -0500, Britt Selvitelle wrote: > Ah, is this the answer here? http://www.ruby-forum.com/topic/81658#new > Which is the same problem I'm seeing. Well, If you use some other plugin that does magic things after saving your record (lice acts_as_taggable does), then this answer might as well solve your problem ;-) cheers, Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 7 03:34:52 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 7 Nov 2006 09:34:52 +0100 Subject: [Ferret-talk] No search results using Searcher In-Reply-To: <1162870578.21503.5.camel@jeffrey.esaka> References: <1162870578.21503.5.camel@jeffrey.esaka> Message-ID: <20061107083452.GC14929@cordoba.webit.de> On Tue, Nov 07, 2006 at 12:36:18PM +0900, Jeffrey Gelens wrote: [..] > Now I constructed the queryparser using the same analyzer and it seems > that was indeed the problem. Thanks. > > The reason why I'm using the regexp /./ is that I'm indexing Japanese > sites. I found a solution on this mailinglist to use a RegExpAnalyzer > using this regexp for asian characters. > It is working fine. Am I using it correctly or should I index Japanese > characters ? 
I don't know anything about indexing Japanese characters, so just ignore
me if it is working fine ;-)

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Tue Nov 7 04:22:20 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Tue, 7 Nov 2006 10:22:20 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com>
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com>
Message-ID:

Thanks Pete. I'll give this a go.

Do you mean:

> def author_surname
>   return author.surname
> end

and not:

> def author_surname
>   return author_surname
> end

M.

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Tue Nov 7 05:15:41 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Tue, 7 Nov 2006 11:15:41 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To: <20061107073623.GA27583@cordoba.webit.de>
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> that's exactly what I would have suggested, but please use symbols for
> field names in your call to acts_as_ferret.

So:

  class Book < ActiveRecord::Base
    acts_as_ferret :additional_fields => [:author_first_name,
                                          :author_surname]
    belongs_to :author

    def author_first_name
      return author.first_name
    end

    def author_surname
      return author.surname
    end
  end

?

--
Posted via http://www.ruby-forum.com/.
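
For anyone wanting to try the accessor methods from this thread outside Rails, here is a minimal plain-Ruby sketch. It is only an illustration under stated assumptions: Author and Book below are simple stand-ins rather than ActiveRecord models, acts_as_ferret is not loaded, and the nil guard for a book without an author is an addition made here, not something proposed in the thread.

```ruby
# Plain-Ruby stand-ins for the models in the thread
# (hypothetical; no Rails or acts_as_ferret needed to run this).
Author = Struct.new(:first_name, :surname)

class Book
  attr_accessor :author

  def initialize(author = nil)
    @author = author
  end

  # Methods like these are what :additional_fields would name.
  # The nil guard is an added assumption for books without an author.
  def author_first_name
    author ? author.first_name : ''
  end

  def author_surname
    author ? author.surname : ''
  end
end

book = Book.new(Author.new('Jane', 'Austen'))
puts book.author_first_name
puts book.author_surname
puts Book.new.author_surname.empty?
```

Whether an empty string is the right fallback depends on how missing values should be indexed; the guard is there only so that a record without an author does not raise while the field values are being collected.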
From matt at planchant.co.uk Tue Nov 7 05:22:40 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Tue, 7 Nov 2006 11:22:40 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To:
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de>
Message-ID: <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com>

Matthew Planchant wrote:
> class Book < ActiveRecord::Base
>   acts_as_ferret :additional_fields => [:author_first_name,
>                                         :author_surname]
>   belongs_to :author
>
>   def author_first_name
>     return author.first_name
>   end
>
>   def author_surname
>     return author.surname
>   end
> end

This works. Thanks.

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Tue Nov 7 05:35:38 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Tue, 7 Nov 2006 11:35:38 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To: <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com>
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com>
Message-ID: <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com>

Will this work with many-to-many relationships?

For example:

  class Book < ActiveRecord::Base
    acts_as_ferret :additional_fields => [:topic_title]

    has_many :book_topics, :dependent => true
    has_many :topics, :through => :book_topics

    def topic_title
      return topic.title
    end
  end

--
Posted via http://www.ruby-forum.com/.
From kraemer at webit.de Tue Nov 7 08:15:30 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 7 Nov 2006 14:15:30 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To: <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com>
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com>
Message-ID: <20061107131530.GG14929@cordoba.webit.de>

On Tue, Nov 07, 2006 at 11:35:38AM +0100, Matthew Planchant wrote:
> Will this work with many-to-many relationships?
>
> For example:
>
> class Book < ActiveRecord::Base
>   acts_as_ferret :additional_fields => [:topic_title]
>
>   has_many :book_topics, :dependent => true
>   has_many :topics, :through => :book_topics
>
>   def topic_title
>     return topic.title
>   end

this won't work, as there is no method 'topic' in your Book class.
But you could index the titles of all topics:

  acts_as_ferret :additional_fields => [:topic_titles]
  def topic_titles
    topics.collect { |topic| topic.title }.join ' '
  end

Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Tue Nov 7 09:13:14 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Tue, 7 Nov 2006 15:13:14 +0100
Subject: [Ferret-talk] acts_as_ferret and associations
In-Reply-To: <20061107131530.GG14929@cordoba.webit.de>
References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de>
Message-ID: <070cd419b72e4e428decbad2de835cb7@ruby-forum.com>

Jens Kraemer wrote:
> On Tue, Nov 07, 2006 at 11:35:38AM +0100, Matthew Planchant wrote:
>>   def topic_title
>>     return topic.title
>>   end
>
> this won't work, as there is no method 'topic' in your Book class.
> But you could index the titles of all topics:
>
>   acts_as_ferret :additional_fields => [:topic_titles]
>   def topic_titles
>     topics.collect { |topic| topic.title }.join ' '
>   end

Thanks. I'll give this a shot.

--
Posted via http://www.ruby-forum.com/.

From charlie.hubbard at gmail.com Tue Nov 7 09:40:31 2006
From: charlie.hubbard at gmail.com (Charlie Hubbard)
Date: Tue, 7 Nov 2006 15:40:31 +0100
Subject: [Ferret-talk] aaf and stop words; query parser
In-Reply-To: <002801c6fde1$8dcb2ed0$0202a8c0@again>
References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again>
Message-ID: <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com>

I'm using the same version of AAF and Ferret 0.3.0 and 0.10.9
respectively. I sent David Balmain my index so he could analyze it. I
posted a similar message here:

http://www.ruby-forum.com/topic/84909

Any index I built with AAF seemed to demonstrate this problem.
I checked the code, but I couldn't see where it might have been modifying the query string in anyway. Any help? Charlie Curtis Hatter wrote: > Currently I'm using AAF 0.10 and windows build of Ferret version 0.10.9 > > I'm currently moving my development platform to a FreeBSD machine which > is > why I haven't been able to do much testing. The FreeBSD version will be > 0.10.13 > > I looked into the archives I have but only solution I found was to set > the > stopwords to nil. > > Thanks, > Curtis > > ----- Original Message ----- > From: "Jens Kraemer" > To: > Sent: Wednesday, November 01, 2006 12:27 PM > Subject: Re: [Ferret-talk] aaf and stop words; query parser > > > Hi! > > On Wed, Nov 01, 2006 at 09:54:25AM -0500, Curtis Hatter wrote: >> I've been trying to implement acts_as_ferret in my latest project and ran > into a snag. If I do a search for 'auditor state' then the search works > perfectly. If I include a stop word, as in 'auditor of state', then I > get no > results. I'd prefer not to set stop words to nil and index everything. > > what version of AAF/Ferret do you use ? Afair that issue isn't new, and > should have been fixed some time ago. > > cheers, > Jens > > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Tue Nov 7 09:48:21 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 7 Nov 2006 15:48:21 +0100 Subject: [Ferret-talk] Memory consumption too high Message-ID: Hi, I'm having trouble with ferret and AAF blowing up with a NoMemoryError. Sometimes when I add documents inside my rails app. Ferret starts consuming huge amounts of memory. 
I'm on a machine with 2GB of memory and it still runs out of memory. Sometimes I'm able to run MyObject.rebuild_index and the memory doesn't move up at all. However, sometimes it blows up horribly with a NoMemoryError. I'm running it from the script\console. Here is the stack trace from runnning MyObject.rebuild_index D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277 :in `add_document': failed to allocate memory (NoMemoryError) from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret /index.rb:277:in `<<' from D:/dev/ruby/lib/ruby/1.8/monitor.rb:229:in `synchronize' from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret /index.rb:252:in `<<' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/c lass_methods.rb:199:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/c lass_methods.rb:198:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/c lass_methods.rb:197:in `rebuild_index' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/connection_adapters/abstract/database_statements.rb:51:in `transaction' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/transactions.rb:91:in `transaction' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/c lass_methods.rb:196:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/c lass_methods.rb:194:in `rebuild_index' from (irb):1:in `irb_binding' from D:/dev/ruby/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding' from D:/dev/ruby/lib/ruby/1.8/irb/workspace.rb:52 Here is the stack trace I see when I try running my unit tests: D:\dev\src\booksmart>rake test:units:rcov (in D:/dev/src/booksmart) rm -rf ./coverage/units D:/dev/ruby/bin/ruby "D:/dev/src/booksmart/vendor/plugins/rails_rcov/tasks/rails _rcov.rake" --run-rake-task=test:units (in D:/dev/src/booksmart) rcov.cmd 
-o "D:/dev/src/booksmart/coverage/units" -T -x "rubygems/*,rcov*" --rai ls -Ilib;test "D:/dev/ruby/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test _loader.rb" "test/unit/book_test.rb" "test/unit/cart_test.rb" "test/unit/credit_ card_test.rb" "test/unit/line_item_test.rb" "test/unit/note_test.rb" "test/unit/ notifications_test.rb" "test/unit/publisher_test.rb" "test/unit/purchase_test.rb " "test/unit/user_test.rb" "test/unit/pinning_test.rb" "test/unit/page_test.rb" Loaded suite D:/dev/ruby/bin/rcov Started D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277 :in `add_document': failed to allocate memory (NoMemoryError) from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret /index.rb:277:in `<<' from D:/dev/ruby/lib/ruby/1.8/monitor.rb:229:in `synchronize' from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret /index.rb:252:in `<<' from D:/dev/src/booksmart/config/../vendor/plugins/acts_as_ferret/lib/in stance_methods.rb:85:in `ferret_create' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/callbacks.rb:344:in `callback' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/callbacks.rb:341:in `callback' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/callbacks.rb:266:in `create_without_timestamps' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_r ecord/timestamp.rb:30:in `create' ... 22 levels... from D:/dev/ruby/lib/ruby/1.8/test/unit/autorunner.rb:200:in `run' from D:/dev/ruby/lib/ruby/1.8/test/unit/autorunner.rb:13:in `run' from D:/dev/ruby/lib/ruby/1.8/test/unit.rb:285 from D:/dev/ruby/bin/rcov:18 -- Posted via http://www.ruby-forum.com/. 
From charlie.hubbard at gmail.com Tue Nov 7 09:50:30 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 7 Nov 2006 15:50:30 +0100 Subject: [Ferret-talk] Memory consumption too high In-Reply-To: References: Message-ID: <92759e5576bfad22e32e48bf08e19df7@ruby-forum.com> Sorry here is a better format stack trace: D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `add_document': failed to allocate memory (NoMemoryError) from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `<<' from D:/dev/ruby/lib/ruby/1.8/monitor.rb:229:in `synchronize' from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:252:in `<<' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:199:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:198:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:197:in `rebuild_index' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/connection_adapters/abstract/database_statements.rb:51:in `transaction' from D:/dev/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:91:in `transaction' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:196:in `rebuild_index' from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:194:in `rebuild_index' from (irb):1:in `irb_binding' from D:/dev/ruby/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding' from D:/dev/ruby/lib/ruby/1.8/irb/workspace.rb:52 Oh why doesn't this forum have preview! -- Posted via http://www.ruby-forum.com/. 
From charlie.hubbard at gmail.com Tue Nov 7 09:59:00 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Tue, 7 Nov 2006 15:59:00 +0100 Subject: [Ferret-talk] aaf and stop words; query parser In-Reply-To: <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again> <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> Message-ID: <8edde78d5bd915d5d39d93582a0f340f@ruby-forum.com> Charlie Hubbard wrote: > I'm using the same version of AAF and Ferret 0.3.0 and 0.10.9 > respectively. I sent David Balmain my index so he could analyze it. I > posted a similiar message here: > > http://www.ruby-forum.com/topic/84909 > > Any index I built with AAF seemed to demostrate this problem. I checked > the code, but I couldn't see where it might have been modifying the > query string in anyway. > > Any help? I should also say that I was not able to reproduce this when I created an index using just ferret. So doing something similar to what David suggested in the other thread. I got hits when I submitted queries with stop words. Hope that helps. Charlie -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Tue Nov 7 11:04:37 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 7 Nov 2006 17:04:37 +0100 Subject: [Ferret-talk] Which version for Win32? Message-ID: <8adc3f4059228cd93d95abac91c18cab@ruby-forum.com> Which version of the ferret gem should I be installing in Win32? I currently have the latest 'mswin32' version (0.10.9) installed. Is this the version I should be using? Should I install one of the 'ruby' versions? -- Posted via http://www.ruby-forum.com/. From mark.puckett at gmail.com Tue Nov 7 12:25:12 2006 From: mark.puckett at gmail.com (Mark Puckett) Date: Tue, 7 Nov 2006 18:25:12 +0100 Subject: [Ferret-talk] searchable or acts_as_ferret or neither? 
In-Reply-To: <562a35c10611010026k3807ccc5h8217efccc9abb381@mail.gmail.com> References: <73119141bf0e84da110ae6d0e1a7ef78@ruby-forum.com> <562a35c10611010026k3807ccc5h8217efccc9abb381@mail.gmail.com> Message-ID: <79addec0544666213bae52714517f0c9@ruby-forum.com> I have this in my model: acts_as_ferret :fields => ['name', 'brand', 'primary_category', 'secondary_category'] Those fields are defined as follows: name varchar(255) brand varchar(255) primary_category varchar(40) secondary_category text The index appears to have hit some sort of equillibrium now (?), with the longest requests taking about 8 seconds, but with the average still being around 4 seconds. The perf on my dev machine is better (averaging around 1.5 secs) but not great, of course it's running in development mode. I'll try it in production mode tonight. Thanks -Mark Jan Prill wrote: > Hi Mark, > > would you mind posting some of your search-code? How is the performance > on > your development machine. There's something going wrong big time in your > dreamhost installation.. No chance a query is taking two minutes on this > very moderate amount of data. > > Cheers, > Jan -- Posted via http://www.ruby-forum.com/. From anotherbritt at gmail.com Tue Nov 7 12:50:46 2006 From: anotherbritt at gmail.com (Britt Selvitelle) Date: Tue, 7 Nov 2006 12:50:46 -0500 Subject: [Ferret-talk] Updating the index. In-Reply-To: <20061107073932.GB27583@cordoba.webit.de> References: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com> <9fd96fa70611061256i55d25b8dm5bc55aea45c632b3@mail.gmail.com> <20061107073932.GB27583@cordoba.webit.de> Message-ID: <9fd96fa70611070950i491ef3bbl383685f01e12454f@mail.gmail.com> And if I don't? Must I call rebuild_index after a new record is created? On 11/7/06, Jens Kraemer wrote: > On Mon, Nov 06, 2006 at 03:56:51PM -0500, Britt Selvitelle wrote: > > Ah, is this the answer here? http://www.ruby-forum.com/topic/81658#new > > Which is the same problem I'm seeing. 
> > Well, If you use some other plugin that does magic things after saving > your record (lice acts_as_taggable does), then this answer might as well > solve your problem ;-) > > cheers, > Jens > > > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From tennisbum2002 at hotmail.com Tue Nov 7 13:01:10 2006 From: tennisbum2002 at hotmail.com (Eric Gross) Date: Tue, 7 Nov 2006 19:01:10 +0100 Subject: [Ferret-talk] conditional boost? friends to come up at top of search.. In-Reply-To: <20061101130542.GP4769@cordoba.webit.de> References: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> <20061101130542.GP4769@cordoba.webit.de> Message-ID: <5a9fb5c210b35c4e9fbf8e972ad8a519@ruby-forum.com> ok for the user, for now i am only indexing the users full name, i have class User < ActiveRecord::Base acts_as_ferret :store_class_name => true, :fields => { :full_name => { :boost => 3 }} has_and_belongs_to_many :friends,:class_name=>"User", :join_table=> "friends_users",:association_foreign_key => "friend_id", :after_add => :become_friend_to_friend,:after_remove=> :remove_user_as_friend so thats pretty much all there is there. The thing is Im calling the search as a multisearch - User.multisearch(query,[ Book ], (with :limit and :offset)). Jens Kraemer wrote: > On Tue, Oct 31, 2006 at 10:02:50AM +0100, Eric Gross wrote: >> Hey guys, im trying to get my friends to come up at the top of the act >> as ferret search. I would query the whole result set first, then move my >> friends to the top, but the thing is, Im paginating my results and use >> the offset and limit parameters in the multi_search() function. >> >> Anyone know how to do this? 
> > We'd need some more info on how you store your friend-of-relationship, > and > how your index looks like (i.e. what fields does it contain). > > cheers, > Jens > > -- > webit! Gesellschaft f?r neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de > Schnorrstra?e 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 -- Posted via http://www.ruby-forum.com/. From curtis.hatter at insightbb.com Tue Nov 7 18:25:20 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Tue, 7 Nov 2006 18:25:20 -0500 Subject: [Ferret-talk] aaf and stop words; query parser References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again> <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> <8edde78d5bd915d5d39d93582a0f340f@ruby-forum.com> Message-ID: <001901c702c3$fe314ab0$0202a8c0@again> I believe the problem was in how I was creating my index. My acts_as_ferret declaration was as follows: acts_as_ferret( :fields => { :name => {}, :desc => {:index => :untokenized_omit_norms}, :body => {:store => :yes}, :role => {}, }) With the above a search that used stop words, ex. "auditor of state", would return no hits. When I removed the ":index => :untokenized_omit_norms" and rebuilt the index that same search started to work with acts_as_ferret. I haven't played around with just using ferret and seeing what would happen because of time constraints on this current project. If there's any suggestions or anything I'd gladly try them. I would like to keep the "desc" untokenized and omit the norms because I don't do boosting and may wish to sort by the "desc" field. Thanks, Curtis ----- Original Message ----- From: "Charlie Hubbard" To: Sent: Tuesday, November 07, 2006 9:59 AM Subject: Re: [Ferret-talk] aaf and stop words; query parser > Charlie Hubbard wrote: > > I'm using the same version of AAF and Ferret 0.3.0 and 0.10.9 > > respectively. 
I sent David Balmain my index so he could analyze it. I > > posted a similiar message here: > > > > http://www.ruby-forum.com/topic/84909 > > > > Any index I built with AAF seemed to demostrate this problem. I checked > > the code, but I couldn't see where it might have been modifying the > > query string in anyway. > > > > Any help? > > I should also say that I was not able to reproduce this when I created > an index using just ferret. So doing something similar to what David > suggested in the other thread. I got hits when I submitted queries with > stop words. Hope that helps. > > Charlie > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From kraemer at webit.de Wed Nov 8 03:55:36 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 8 Nov 2006 09:55:36 +0100 Subject: [Ferret-talk] aaf and stop words; query parser In-Reply-To: <001901c702c3$fe314ab0$0202a8c0@again> References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again> <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> <8edde78d5bd915d5d39d93582a0f340f@ruby-forum.com> <001901c702c3$fe314ab0$0202a8c0@again> Message-ID: <20061108085536.GI14929@cordoba.webit.de> On Tue, Nov 07, 2006 at 06:25:20PM -0500, Curtis Hatter wrote: > I believe the problem was in how I was creating my index. My acts_as_ferret > declaration was as follows: > > acts_as_ferret( :fields => { > :name => {}, > :desc => {:index => :untokenized_omit_norms}, > :body => {:store => :yes}, > :role => {}, > }) > > With the above a search that used stop words, ex. "auditor of state", would > return no hits. When I removed the ":index => :untokenized_omit_norms" and > rebuilt the index that same search started to work with acts_as_ferret. 
I > haven't played around with just using ferret and seeing what would happen > because of time constraints on this current project. > > If there's any suggestions or anything I'd gladly try them. I would like to > keep the "desc" untokenized and omit the norms because I don't do boosting > and may wish to sort by the "desc" field. you really should tokenize the desc field if you want to run searches across it. If you have to sort by the desc field and therefore can't tokenize it, you could index it twice, once tokenized for searching and once untokenized (and maybe truncated to save some space in your index) for sorting. Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Wed Nov 8 04:13:57 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 8 Nov 2006 10:13:57 +0100 Subject: [Ferret-talk] Updating the index. In-Reply-To: <9fd96fa70611070950i491ef3bbl383685f01e12454f@mail.gmail.com> References: <9fd96fa70611061207p70a265a3xb93c1ddd9fedb4d4@mail.gmail.com> <9fd96fa70611061256i55d25b8dm5bc55aea45c632b3@mail.gmail.com> <20061107073932.GB27583@cordoba.webit.de> <9fd96fa70611070950i491ef3bbl383685f01e12454f@mail.gmail.com> Message-ID: <20061108091357.GJ14929@cordoba.webit.de> On Tue, Nov 07, 2006 at 12:50:46PM -0500, Britt Selvitelle wrote: > And if I don't? Must I call rebuild_index after a new record is created? no, aaf will update the index after save and create operations. if it doesn't, this is a bug. Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From waspfactory at ggggmmmmaail.com Wed Nov 8 05:15:20 2006 From: waspfactory at ggggmmmmaail.com (Caspar) Date: Wed, 8 Nov 2006 11:15:20 +0100 Subject: [Ferret-talk] conditional boost? 
 friends to come up at top of search..
In-Reply-To: <5a9fb5c210b35c4e9fbf8e972ad8a519@ruby-forum.com>
References: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> <20061101130542.GP4769@cordoba.webit.de> <5a9fb5c210b35c4e9fbf8e972ad8a519@ruby-forum.com>
Message-ID:

Hi, I have a similar thing in my search results. I have sponsored
listings appearing above other results. This is done with a simple
sponsored boolean flag and a sort field that orders by sponsored and
then score.

regards
caspar

--
Posted via http://www.ruby-forum.com/.

From tennisbum2002 at hotmail.com Wed Nov 8 05:34:16 2006
From: tennisbum2002 at hotmail.com (Eric Gross)
Date: Wed, 8 Nov 2006 11:34:16 +0100
Subject: [Ferret-talk] conditional boost? friends to come up at top of search..
In-Reply-To:
References: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> <20061101130542.GP4769@cordoba.webit.de> <5a9fb5c210b35c4e9fbf8e972ad8a519@ruby-forum.com>
Message-ID: <99ca84b6414e958a1e60f488d921abc2@ruby-forum.com>

Hey Caspar, I'm having trouble understanding what you mean. Could you
show me some of your code? I don't know what you mean by a sponsored
boolean flag and a sort field. Any examples would be highly appreciated.

--
Posted via http://www.ruby-forum.com/.

From waspfactory at ggggmmmmaail.com Wed Nov 8 05:54:41 2006
From: waspfactory at ggggmmmmaail.com (Caspar)
Date: Wed, 8 Nov 2006 11:54:41 +0100
Subject: [Ferret-talk] conditional boost? friends to come up at top of search..
In-Reply-To: <99ca84b6414e958a1e60f488d921abc2@ruby-forum.com>
References: <4d8a5a69850433ec2e2ef883f9c92835@ruby-forum.com> <20061101130542.GP4769@cordoba.webit.de> <5a9fb5c210b35c4e9fbf8e972ad8a519@ruby-forum.com> <99ca84b6414e958a1e60f488d921abc2@ruby-forum.com>
Message-ID: <4311a69ca45318bcd57b290b0be2b921@ruby-forum.com>

Okay, firstly, the sponsored listings are what we call the paid-for
premier style of listing on our site. In the listings db table there are
a few fields that describe the sponsored feature.
Ferret is used to index one of them, a boolean called sponsored, which
simply marks a listing as being sponsored. Ferret doesn't know (as far
as I'm aware..) how to compare booleans, so you need this declaration
somewhere in your model, which tells ferret how to handle boolean
comparisons.

  def false.<=>(o)
    o ? -1 : 0
  end

  def true.<=>(o)
    !o ? 1 : 0
  end

Then it's just a case of defining your sort fields and then using them
in your ferret search like so.

  sort_fields = []
  sort_fields << Ferret::Search::SortField.new(:sponsored, :reverse => true)
  sort_fields << Ferret::Search::SortField::SCORE
  results = VoObject.find_by_contents(query, :sort => sort_fields,
                                      :offset => page,
                                      :limit => RESULTS_PER_PAGE)

I'm a rank amateur when it comes to ruby/rails/ferret so please don't
take this as the right/best way to do it, but it works for me. Hope this
helps.

Regards
Caspar

--
Posted via http://www.ruby-forum.com/.

From reverri at gmail.com Thu Nov 9 07:15:05 2006
From: reverri at gmail.com (Daniel Reverri)
Date: Thu, 9 Nov 2006 13:15:05 +0100
Subject: [Ferret-talk] index updates
In-Reply-To:
References:
Message-ID: <9dd2100d5f0241a8070e8c5973c47069@ruby-forum.com>

Daniel Reverri wrote:
> The ferret documentation reports that using the option "key" can affect
> performance of indexing operations. If the option "key" is used when
> creating a ferret index, is there a performance hit when creating a new
> document, or is the performance hit only for updated records?

In case anyone else cares: when the :key option is used, Ferret will
search the index for the document being added before creating a new
document. This check occurs for all documents (new and updated).

--
Posted via http://www.ruby-forum.com/.
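
The behaviour described above can be pictured with a small toy model. The sketch below is plain Ruby and is not Ferret's implementation: ToyKeyedIndex and its methods are hypothetical names, and a linear scan stands in for the real index lookup. It is only meant to show that, once a key is configured, every add first looks for an existing document with the same key, whether or not one turns out to exist.

```ruby
# Toy model of "look up by key before adding" (hypothetical; not Ferret code).
class ToyKeyedIndex
  attr_reader :docs, :lookups

  def initialize(key = nil)
    @key = key      # field used to identify a document, or nil for none
    @docs = []
    @lookups = 0    # counts how many key searches were performed
  end

  def <<(doc)
    if @key
      # The lookup happens for every document, new or updated.
      @lookups += 1
      existing = @docs.index { |d| d[@key] == doc[@key] }
      @docs.delete_at(existing) if existing
    end
    @docs << doc
    self
  end
end

keyed = ToyKeyedIndex.new(:id)
keyed << { :id => 1, :title => 'first' }
keyed << { :id => 2, :title => 'second' }
keyed << { :id => 1, :title => 'first, updated' }

puts keyed.docs.size   # documents left after the update
puts keyed.lookups     # one lookup per add
```

In this model an index created without a key skips the lookup entirely, which is why duplicates would then accumulate and why the cost shows up only when a key is set.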
From mattias at oncotype.dk Thu Nov 9 08:49:11 2006 From: mattias at oncotype.dk (Mattias Bud) Date: Thu, 9 Nov 2006 14:49:11 +0100 Subject: [Ferret-talk] Problem searching with special characters In-Reply-To: References: <8f10bea7d388a58c7a7222950cb41fa4@ruby-forum.com> Message-ID: Hi Did you find a solution? - I have the same problem but with Danish. /mattias -- Posted via http://www.ruby-forum.com/. From waspfactory at ggggmmaaiil.com Fri Nov 10 14:13:40 2006 From: waspfactory at ggggmmaaiil.com (Caspar) Date: Fri, 10 Nov 2006 20:13:40 +0100 Subject: [Ferret-talk] backgroundrb and ferret Message-ID: Hi I'm using ferret happily on a site with 260ish documents in the index. Problem is that the site is still in development and I have to rebuild the index fairly regularly as new fields are added to it, like today when I added price. I have been using a rebuild index action in the admin section but this is clearly the wrong way to do it as it completely kills the site during the rebuild. So question is how do people handle index rebuilds on sites with large indexes and keep them live? I have come across backgroundrb and think this is the way to go but am unsure about some things. E.g. if you have backgroundrb rebuilding the index, what happens in the meantime with the old index? Is it still accessible while the index is rebuilt? What happens if someone adds a new record during the index getting rebuilt? If anyone has experience using backgroundrb and ferret then I would really appreciate your comments or code. Thanks very much caspar -- Posted via http://www.ruby-forum.com/. From waspfactory at ggggmmaaiil.com Fri Nov 10 14:14:27 2006 From: waspfactory at ggggmmaaiil.com (Caspar) Date: Fri, 10 Nov 2006 20:14:27 +0100 Subject: [Ferret-talk] backgroundrb and ferret In-Reply-To: References: Message-ID: <12277571e7d70032312fb429b0c42211@ruby-forum.com> Caspar wrote: > Hi I'm using ferret happily on a site with 260ish documents in the should have been 260 000ish...
-- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Nov 13 03:47:31 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 13 Nov 2006 09:47:31 +0100 Subject: [Ferret-talk] backgroundrb and ferret In-Reply-To: References: Message-ID: <20061113084731.GM5753@cordoba.webit.de> On Fri, Nov 10, 2006 at 08:13:40PM +0100, Caspar wrote: > Hi I'm using ferret happily on a site with 260ish documents in the > index. Problem is that the site is still in development and I have to > rebuild the index fairly regularly as new fields are added to it, like > today when i added price. I have been using a rebuild index action in > the admin section but this is clearly the wrong way to do it as it > completly kills the site during the rebuild. So question is how do > people handle index rebuilds on sites with large indexes and keep them > live? I have come accross backgroundrb and think this is the way to go > but am unsure about some things. E.g. if you have backgroundrb > rebuilding the index, what happens in the meantime with the old index? > Is it still accessable while the index is rebuilt? If you open up a searcher on the old index before you start rebuilding, that searcher should be able to search the old index while the rebuild runs. Another way is to create a whole new index, and just swap indexes once you're finished rebuilding. > What happens if someone adds a new record during the index getting > rebuilt? It's up to you to keep track of these changes, i.e. with a db flag telling you if this record needs indexing or not. Once you have such a flag, you could do all your indexing in a backgroundrb job regularly, i.e. check every 10 minutes for records to be indexed, and index them. Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From banshee at banshee.com Mon Nov 13 14:06:22 2006 From: banshee at banshee.com (James Moore) Date: Mon, 13 Nov 2006 11:06:22 -0800 Subject: [Ferret-talk] Stemming, stop words, acts_as_ferret Message-ID: <007601c70756$ceda66b0$6401a8c0@BansheeSoftware.local> I'd like to get the following behavior: 1. Stemming. The search is on a database of summaries of California legal cases. Things like a search for "thermal image" needs to hit "thermal imaging." 2. Stop words. Searches for "failing to instruct the jury" should come up with hits on a search for "fail to instruct." 3. Case-insensitive. What I tried was: class StemmedAnalyzer < Ferret::Analysis::Analyzer def token_stream(field, reader) return Ferret::Analysis::PorterStemFilter.new(Ferret::Analysis::LowerCaseTokenizer.new(reader)) end end class Summary < ActiveRecord::Base acts_as_ferret(:analyzer => StemmedAnalyzer.new) But this doesn't appear to give me either stemming or stopwords. It does give me basic searching (searches for exact keywords without stopwords work, searches with stopwords return no results). I've looked through the archives, and I'm still confused. Suggestions? - James Moore From fxn at hashref.com Mon Nov 13 16:11:11 2006 From: fxn at hashref.com (Xavier Noria) Date: Mon, 13 Nov 2006 22:11:11 +0100 Subject: [Ferret-talk] Makefile for gcc on Solaris? Message-ID: <8904DDA3-FB21-416B-9413-8A19A0EFBD6B@hashref.com> I just grabbed 0.10.13 and tried to build it on Solaris, but the generated Makefile uses cc, which is not installed. gcc is installed but some options like -KPIC are not valid. I did a search with no luck, except that I see people compiling ferret with gcc on Solaris somehow. What should I do? -- fxn PS: Sun Microsystems Inc.
SunOS 5.11 snv_43 October 2007 From kraemer at webit.de Tue Nov 14 04:38:31 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 14 Nov 2006 10:38:31 +0100 Subject: [Ferret-talk] Stemming, stop words, acts_as_ferret In-Reply-To: <007601c70756$ceda66b0$6401a8c0@BansheeSoftware.local> References: <007601c70756$ceda66b0$6401a8c0@BansheeSoftware.local> Message-ID: <20061114093831.GE639@cordoba.webit.de> On Mon, Nov 13, 2006 at 11:06:22AM -0800, James Moore wrote: > I'd like to get the following behavior: > > 1. Stemming. The search is on a database of summaries of California legal > cases. Things like a search for "thermal image" needs to hit "thermal > imaging." > > 2. Stop words. Searches for "failing to instruct the jury" should come up > with hits on a search for "fail to instruct." > > 3. Case-insensitive. > > What I tried was: > > class StemmedAnalyzer < Ferret::Analysis::Analyzer > def token_stream(field, reader) > return > Ferret::Analysis::PorterStemFilter.new(Ferret::Analysis::LowerCaseTokenizer. > new(reader)) > end > end > > class Summary < ActiveRecord::Base > acts_as_ferret(:analyzer => StemmedAnalyzer.new) > > But this doesn't appear to give me either stemming or stopwords. It does > give me basic searching (searches for exact keywords without stopwords work, > searches with stopwords return no results). What version of Ferret/AAF are you using? In the most recent Ferret (0.10.13) there is no class PorterStemFilter. With said Ferret version, the following seems to suit your needs: http://pastie.caboo.se/22629 Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From fxn at hashref.com Tue Nov 14 06:30:56 2006 From: fxn at hashref.com (Xavier Noria) Date: Tue, 14 Nov 2006 12:30:56 +0100 Subject: [Ferret-talk] Makefile for gcc on Solaris?
In-Reply-To: <8904DDA3-FB21-416B-9413-8A19A0EFBD6B@hashref.com> References: <8904DDA3-FB21-416B-9413-8A19A0EFBD6B@hashref.com> Message-ID: <737E9C9C-C844-45D7-B640-F8EE89114979@hashref.com> On Nov 13, 2006, at 10:11 PM, Xavier Noria wrote: > I just grabbed 0.10.13 and tried to build it on Solaris, but the > generated Makefile uses cc, which is not installed. gcc is installed > but some options like -KPIC are not valid. > > I did a search with no luck, except that I see people compiling > ferret with gcc on Solaris somehow. What should I do? For the archives: there are some instructions about how to modify extconf.rb in the TextDrive forums: http://forum.textdrive.com/viewtopic.php?id=12630 But if the real source of this mismatch is that the interpreter was compiled with cc, I would try to compile a tarball with gcc and work with that one instead. Otherwise extensions would be compiled with a different suite than the one used for the interpreter, and AFAIK that is not advisable. -- fxn From aditya_nalla at yahoo.co.in Tue Nov 14 06:44:57 2006 From: aditya_nalla at yahoo.co.in (Roger) Date: Tue, 14 Nov 2006 12:44:57 +0100 Subject: [Ferret-talk] transaction support in ferret Message-ID: I have a model called "Business". Each object, say @business, can be tagged by any user. When I do a ferret update for this object (on adding a new tag or editing it), there is always a possibility that another user is also tagging it. I am using the standard rails plugin for tagging, 'acts_as_taggable'. So is the indexing done properly, or would there be any problem? Thanks Roger -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Nov 14 10:25:28 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 14 Nov 2006 16:25:28 +0100 Subject: [Ferret-talk] transaction support in ferret In-Reply-To: References: Message-ID: <20061114152527.GH639@cordoba.webit.de> On Tue, Nov 14, 2006 at 12:44:57PM +0100, Roger wrote: > I have a model called "Business".
Each object say @business can be > tagged by any user. When i do ferret update for this object(on adding a > new tag or editing it), there always a possibility that another user is > also tagging it. I am using the std rails pluggin for tagging > 'acts_as_taggable'. > > So is the indexing done properly. Or would there be any problem? There is no transaction support of any kind in Ferret itself. The worst thing that might happen in your example is the typical 'lost update' thing - users A and B both tag the same object, and B overwrites A's tags with his own. Without optimistic locking this will probably even happen to your database records, depending on how acts_as_taggable does its tagging. With optimistic locking the DB call will fail, and any after_create hooks will be skipped - if you use acts_as_ferret or do your indexing manually in an after_create hook, it shouldn't get called in this case. cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From brendon at spikeinsights.co.nz Thu Nov 16 01:04:55 2006 From: brendon at spikeinsights.co.nz (Brendon Muir) Date: Thu, 16 Nov 2006 07:04:55 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> Message-ID: <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> Hi there, I have a similar problem. I'm building a CMS and we have a tree model called component_instances. This model has polymorphic associations with many other models (e.g. Links, Folders, Pages etc...)
I want the user to be able to search for an instance using data stored in the associated models. To start with in the backend interface, I only want them to be able to search for items by name, so I came up with this: class ComponentInstance < ActiveRecord::Base has_many :permissions has_many :groups, :through => :permissions belongs_to :component #will be in different database ?? acts_as_tree :order => 'position' acts_as_list :scope => 'parent_id' belongs_to :instance, :polymorphic => true acts_as_ferret( :fields => :instance_name ) def instance_name instance.name end end Now that looks like it should work. But when I search for an item by name (knowing that it exists), nothing shows. I know the search itself works because before I added the :fields condition, it would pick up and return results for the word "folder" as it was used to define the polymorphic associations on the instances table. So firstly your help with that problem would be most appreciated. Then we make things trickier by adding the fact that I'd like the end users on the front end to be able to search the site not just using the name field, but basically anything in any of the instance tables. As a laughing point, when I wrote the original application in PHP using MySQL fulltext search, the query for the frontend search was 2 pages long! :) Looking forward to your great ideas! :) -- Posted via http://www.ruby-forum.com/.
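One thing worth checking with an accessor like instance_name above is what it returns when the associated instance is nil, or when the associated model has no such attribute. A minimal plain-Ruby sketch follows, using Structs as stand-ins for the ActiveRecord models. Page, Link, ComponentInstanceSketch and safe_instance_value are hypothetical names, and none of this is acts_as_ferret code:

```ruby
# Stand-ins for two polymorphic targets (hypothetical, not real models).
Page = Struct.new(:name)            # has a name, no linkurl
Link = Struct.new(:name, :linkurl)  # has both

class ComponentInstanceSketch
  attr_accessor :instance

  def initialize(instance = nil)
    @instance = instance
  end

  def instance_name
    safe_instance_value(:name)
  end

  def instance_linkurl
    safe_instance_value(:linkurl)
  end

  private

  # Returns the attribute as a string, or "" when the instance is missing
  # or does not respond to the attribute.
  def safe_instance_value(attr)
    return "" if instance.nil? || !instance.respond_to?(attr)
    instance.send(attr).to_s
  end
end

page   = ComponentInstanceSketch.new(Page.new("About us"))
link   = ComponentInstanceSketch.new(Link.new("Ferret", "http://example.com/"))
orphan = ComponentInstanceSketch.new
```

With this shape, page.instance_linkurl and orphan.instance_name both come back as empty strings instead of raising NoMethodError, so an indexer calling the accessors would see a blank field rather than an exception.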
From kraemer at webit.de Thu Nov 16 10:42:38 2006 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 16 Nov 2006 16:42:38 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> Message-ID: <20061116154238.GP639@cordoba.webit.de> Hi! On Thu, Nov 16, 2006 at 07:04:55AM +0100, Brendon Muir wrote: [..] > acts_as_ferret( > :fields => :instance_name > ) > > def instance_name > instance.name > end > > end > > Now that looks like it should work. But when I search for an item by > name (knowing that it exists), nothing shows. I know the search itself > works because becure I added the :fields condition, it would pick up and > return results for the word "folder" as it was used to define the > polymorphic associations on the instances table. have a look in your development log when you save an instance of class ComponentInstance. There should be a line like 'adding field instance_name with value ...' showing what value aaf indexed for your instance name. Maybe the instance isn't there yet when the save takes place ? > So firstly your help with that problem would be most appreciated. Then > we make things trickier by adding the fact that I'd like the end users > on the front end to be able to search the site not just using the name > field, but basically anything in any of the instance tables. 
that's easy, just index all the fields you want for frontend and backend search (including the instance_name field) and use a special QueryParser restricted to only the instance_name field for your backend search: qp = Ferret::QueryParser.new(:fields => [:instance_name]) ComponentInstance.find_by_contents(qp.parse(user_query)) with this solution, backend users could still manually construct complex queries to search other fields, but queries not using any field names will default to only search the instance_name field. > As a laughing point, when I wrote the original application in PHP using > Mysql fulltext search, the query for the frontend search was 2 pages > long! :) don't tell me any details ;-) cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From curtis.hatter at insightbb.com Thu Nov 16 11:45:51 2006 From: curtis.hatter at insightbb.com (curtis.hatter at insightbb.com) Date: Thu, 16 Nov 2006 11:45:51 -0500 Subject: [Ferret-talk] Strange indexing issues with CachedModel, STI, and AAF Message-ID: I started using robotcoop's CachedModel class in my project but have encountered problems when using it with the acts_as_ferret plugin. It seems it doesn't index everything in my STI model, also if I do a search from my base STI class I get a result count but no results. If I run the same search from one of the children STI models I get the appropriate results (if the information was indexed). Here's my setup: class Record < CachedModel acts_as_nested_set acts_as_ferret( :fields => { :lft => { :index => :untokenized_omit_norms }, :name => {}, :desc => {}, :body => {:store => :yes}, :role => {}, }) def self.inheritance_column 'role' end # methods below .....
end class FirstRecord < Record end class SecondRecord < Record end class ApplicationController < ActionController::Base after_filter { CachedModel.cache_reset } end Here's my CachedModel setup: - config/environment.rb: # Include your application configuration below require 'cached_model' CachedModel.use_local_cache = true - config/environments/development.rb (last line) CACHE = MemCache.new 'localhost:11211', :namespace => 'ohio_development' - config/environments/production.rb (last line) CACHE = MemCache.new 'localhost:11211', :namespace => 'ohio_production' As far as I can tell I've set both up properly. Also I get the same problems when running in production mode. This is on a FreeBSD 6.1 server, with memcached-1.1.12_3, mysql 5.0.26, and rails 1.1.6. Any help would be appreciated as I've been at this one for 2 days. Here's example output I get from irb: >> Record.find_by_contents("search code") => # This makes me think it has something to do with the 'find' method being overridden by CachedModel but not sure how to verify that at this point. Thanks, Curtis From marvin at rectangular.com Thu Nov 16 15:18:50 2006 From: marvin at rectangular.com (Marvin Humphrey) Date: Thu, 16 Nov 2006 12:18:50 -0800 Subject: [Ferret-talk] Away for a week In-Reply-To: References: Message-ID: On Oct 26, 2006, at 9:24 AM, David Balmain wrote: > I'm off to Vietnam for a week on my way home to Australia so I'll be > off the list for a while. Anybody know what's up with Dave? He indicated he'd be gone for a week, but that was three weeks ago. 
Marvin Humphrey Rectangular Research http://www.rectangular.com/ From brendon at spikeinsights.co.nz Thu Nov 16 21:20:43 2006 From: brendon at spikeinsights.co.nz (Brendon Muir) Date: Fri, 17 Nov 2006 03:20:43 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <20061116154238.GP639@cordoba.webit.de> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> <20061116154238.GP639@cordoba.webit.de> Message-ID: <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> Jens Kraemer wrote: > Hi! > > On Thu, Nov 16, 2006 at 07:04:55AM +0100, Brendon Muir wrote: > [..] >> Now that looks like it should work. But when I search for an item by >> name (knowing that it exists), nothing shows. I know the search itself >> works because becure I added the :fields condition, it would pick up and >> return results for the word "folder" as it was used to define the >> polymorphic associations on the instances table. > > have a look in your development log when you save an instance of class > ComponentInstance. There should be a line like > 'adding field instance_name with value ...' > showing what value aaf indexed for your instance name. > > Maybe the instance isn't there yet when the save takes place ? > I already had a few records in the database so I assumed when i added the ferrit aa that it would index those. I will try adding a new record and see what it says when i get back to work on Monday. :) >> So firstly your help with that problem would be most appreciated. 
Then >> we make things trickier by adding the fact that I'd like the end users >> on the front end to be able to search the site not just using the name >> field, but basically anything in any of the instance tables. > > that's easy, just index all the fields you want for frontend and backend > search (including the instance_name field) and use a special QueryParser > restricted to only the instance_name field for your backend search: > > QueryParser qp = QueryParser.new(:fields => [:instance_name]) > ComponentInstance.find_by_contents(qp.parse user_query) > > with this solution, backend users could still manually construct > complex queries to search other fields, but queries not using any field > names will default to only search the instance_name field. > Excellent! Sounds like a good solution. Will this mean I will have to write a whole bunch of these accessors in my model, one for each field in my separate polymorphic instances? def instance_linkurl instance.linkurl end etc... Will that break when ferret tries to trawl a polymorphic model that doesn't have a linkurl attribute? >> As a laughing point, when I wrote the original application in PHP using >> Mysql fulltext search, the query for the frontend search was 2 pages >> long! :) > > dont tell me any details ;-) > > cheers, > Jens > > > -- > webit! Gesellschaft für neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de > Schnorrstraße 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 -- Posted via http://www.ruby-forum.com/. From reverri at gmail.com Thu Nov 16 23:02:59 2006 From: reverri at gmail.com (Daniel Reverri) Date: Fri, 17 Nov 2006 05:02:59 +0100 Subject: [Ferret-talk] undefined method `exists?' Message-ID: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> Anyone ever run into this error message when creating a new FieldInfos? Ferret::Index::FieldInfos.new(:store=>:no) NoMethodError: undefined method `exists?'
for {:store=>:no}:Hash -- Posted via http://www.ruby-forum.com/. From bk at benjaminkrause.com Fri Nov 17 04:55:17 2006 From: bk at benjaminkrause.com (Benjamin Krause) Date: Fri, 17 Nov 2006 10:55:17 +0100 Subject: [Ferret-talk] Away for a week In-Reply-To: References: Message-ID: On 16.11.2006, at 21:18, Marvin Humphrey wrote: > > On Oct 26, 2006, at 9:24 AM, David Balmain wrote: > >> I'm off to Vietnam for a week on my way home to Australia so I'll be >> off the list for a while. > > Anybody know what's up with Dave? He indicated he'd be gone for a > week, but that was three weeks ago. I don't know.. he wanted to travel to South Korea and Vietnam for a vacation.. he was then about to move from japan back to australia .. so he might be busy with his move.. so Dave, if you read this, just give us a short sign that everything is allright .. Ben From maccman at gmail.com Fri Nov 17 05:41:16 2006 From: maccman at gmail.com (Alex MacCaw) Date: Fri, 17 Nov 2006 11:41:16 +0100 Subject: [Ferret-talk] acts_as_ferret and searching word docs Message-ID: I was wondering if it is possible to search word documents using ferret. The actual text in a word document isn't in a binary format - only the formatting. Surely it would be possible to parse that? -- Posted via http://www.ruby-forum.com/. From toby-wan-kenobi at web.de Fri Nov 17 06:57:48 2006 From: toby-wan-kenobi at web.de (Tobias Rademacher) Date: Fri, 17 Nov 2006 12:57:48 +0100 Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing' Message-ID: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> Hey Folks, after following the instructions for tweaking Typo to use rather ferret than DB queries to search article I got a strange NoMethod error when starting the console or the server scripts. 
/usr/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/base.rb:1129:in `method_missing':NoMethodError: undefined method `acts_as_ferret' for Content:Class As you can see I'm using activerecord version 1.14.4 together with rails 1.1, ferret 0.10.13 and typo 4.0.3. This is the directory structure listing of my acts_as_ferret plugin: -rw-r--r-- 1 init.rb drwxr-xr-x 2 lib -rw-r--r-- 1 LICENSE -rw-r--r-- 1 rakefile -rw-r--r-- 1 README I installed the plugin with this command line operation script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret Are there any hints? Something I missed? Traps or pitfalls? Highly appreciating your assistance. Thx a lot Toby -- Posted via http://www.ruby-forum.com/. From none at none.com Fri Nov 17 11:43:05 2006 From: none at none.com (mixplate) Date: Fri, 17 Nov 2006 17:43:05 +0100 Subject: [Ferret-talk] acts_as_ferret + :tag_list, how to use it? Message-ID: hello, i am currently using acts_as_ferret in one of my applications and now would like to also incorporate tags. i came across someone's blog that had this: ////// Having Ferret index method results (like :tag_list) worked out-of-the-box for me with acts_as_ferret. I specified attributes (table columns) as strings and methods as symbols, like so: acts_as_ferret :fields => ['title', :tag_list] I've had issues with corrupted indexes and various platform-related problems, but indexing and searching has been a breeze. ////// now since tag_list symbol is part of my index, how do i access it or add to it? where is this symbol stored? is it associated to my model that inherits the acts_as_ferret plugin? thanks. -- Posted via http://www.ruby-forum.com/.
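Going by the quoted blog comment, :tag_list there names a method rather than a table column, so it would simply be a method on the model whose return value gets indexed. A plain-Ruby sketch of such a method, assuming the tags live in an associated collection. Tag and ArticleSketch are hypothetical stand-ins, and the last lines only imitate an indexer turning field symbols into method calls; this is not acts_as_ferret code:

```ruby
# Hypothetical stand-in for a tag record.
Tag = Struct.new(:name)

# Hypothetical stand-in for the model carrying the tags.
class ArticleSketch
  attr_reader :title, :tags

  def initialize(title, tags = [])
    @title = title
    @tags = tags
  end

  # A method an indexer could call by symbol: joins the tag names
  # into a single space-separated string.
  def tag_list
    tags.map { |t| t.name }.join(" ")
  end
end

article = ArticleSketch.new("Ferret tips", [Tag.new("ruby"), Tag.new("search")])

# Imitate an indexer: turn each field symbol into a method call.
fields  = [:title, :tag_list]
indexed = fields.inject({}) { |h, f| h[f] = article.send(f); h }
```

Here indexed ends up with the title and the joined tag string; adding a tag means changing the underlying collection and saving the model again so it gets reindexed.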
From charlie.hubbard at gmail.com Sat Nov 18 10:29:43 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 18 Nov 2006 16:29:43 +0100 Subject: [Ferret-talk] aaf and stop words; query parser In-Reply-To: <20061108085536.GI14929@cordoba.webit.de> References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again> <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> <8edde78d5bd915d5d39d93582a0f340f@ruby-forum.com> <001901c702c3$fe314ab0$0202a8c0@again> <20061108085536.GI14929@cordoba.webit.de> Message-ID: <6ae3f669fcd35a2104ceb4f2d0cca6a5@ruby-forum.com> Jens Kraemer wrote: > you really should tokenize the desc field if you want to run searches > across it. If you have to sort by the desc field and therefore > can't tokenize it, you could index it twice, once tokenized for > searching > and once untokenized (and maybe truncated to save some space in your > index) for sorting. > Jens, I'm seeing this same behavior as Curtis, but here is how I"m building my index: acts_as_ferret( { :additional_fields => [:content] } ) See my other thread for some observations from what I initially tested. http://www.ruby-forum.com/topic/84909 However, when I tried to reproduce this using just ferret I couldn't. Any ideas? Charlie -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Sat Nov 18 10:33:26 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 18 Nov 2006 16:33:26 +0100 Subject: [Ferret-talk] acts_as_ferret and searching word docs In-Reply-To: References: Message-ID: Alex MacCaw wrote: > I was wondering if it is possible to search word documents using ferret. > The actual text in a word document isn't in a binary format - only the > formatting. Surely it would be possible to parse that? You might be able to use some of the extensions for M$ platform and ruby to use COM to get the data. 
Or if you don't want to run on M$ platform you could possibly use Java's POI from Jakarta to parse out the text and put it into something that Ruby could then put into ferret. Charlie -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Sat Nov 18 10:43:58 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 18 Nov 2006 16:43:58 +0100 Subject: [Ferret-talk] acts_as_ferret + :tag_list, how to use it? In-Reply-To: References: Message-ID: <0c501867738463cf3532fbe04373faa3@ruby-forum.com> > now since tag_list symbol is part of my index, how do i access it or add > to it? > where is this symbol stored? is it associated to my model that inheriets > the acts_as_ferret plugin? I would use the :additional_fields option instead of fields so that you get all the default behavior plus the tag list. For example: acts_as_ferret( { :additional_fields => [:tag_list] } ) Then implement the tag_list method to return the tags concatenated together. If you want to limit your searches to just the tags you would do the following: MyModel.find_by_contents( "tag_list:(#{my_tags})" ) Otherwise if you wanted to find anything regardless of the fields just do a normal search without the field prepended. Charlie -- Posted via http://www.ruby-forum.com/. From koloa at none.com Sat Nov 18 11:27:09 2006 From: koloa at none.com (koloa) Date: Sat, 18 Nov 2006 17:27:09 +0100 Subject: [Ferret-talk] acts_as_ferret + :tag_list, how to use it? In-Reply-To: <0c501867738463cf3532fbe04373faa3@ruby-forum.com> References: <0c501867738463cf3532fbe04373faa3@ruby-forum.com> Message-ID: hello charlie, thank you for responding. sorry for my noob questions, but :tag_list will be a column in my model's database correct? so if i have a form with a text box for tags, i basically just save that text field in my column tag_list? or does the additional field option create a column in wherever ferret stores the indexes? thanks. i also would like to be able to fetch the most popular tags.
would this be a method i define? -- Posted via http://www.ruby-forum.com/. From charlie.hubbard at gmail.com Sat Nov 18 13:39:55 2006 From: charlie.hubbard at gmail.com (Charlie Hubbard) Date: Sat, 18 Nov 2006 19:39:55 +0100 Subject: [Ferret-talk] acts_as_ferret + :tag_list, how to use it? In-Reply-To: References: <0c501867738463cf3532fbe04373faa3@ruby-forum.com> Message-ID: <9f23b00d652b94216fd35016ce5b6e68@ruby-forum.com> koloa wrote: > hello charlie, thank you for responding. sorry for my noob questions, > but :tag_list will be a column in my model's database correct? so if i > have a form with a text box for tags, i basically just save that text > field in my column tag_list? or does the additional field option create > a column in wherever ferret stores the indexes? No, I assumed you'd be storing the tags in a separate table, like maybe you were using the acts_as_taggable plugin. In this case AAF won't index model objects that are associations. So in order to put them into your index you'd use the :additional_fields option to include associations in your index. When you use the :additional_fields option the value of that option is an array of symbols. Those symbols will be turned into method calls on your object. Then define a new method to return your data in a string so AAF can index it. You could use an existing method just as well. > i also would like to be able to fetch the most popular tags. would this > be a method i define? I would not suggest using ferret for finding the most popular tags. If you stored these in a separate table it's an easy query to the database to figure that out. Something like: class MyTaggableObject < ActiveRecord::Base acts_as_ferret( { :additional_fields => [:tags] } ) has_many :tags #( or acts_as_taggable, point is this is a many to many) def topTags( limit ) topTags = MyTaggableObject.count( :all, :group => "tags.name", :include => :tags, :order => "count", :limit => limit ) # then build up your map or whatever.
end end You'll have to check my math, I'm just kinda hacking it out by memory, but it's close. I can't remember how you specify the sorting when you're doing a count on a column. (i.e. I can't remember what rails calls the column when doing a sort. You'll need to figure that out in order to do the above). While you could store the tags unnormalized in a column, I think you're gonna find it very hard to operate and build features on top of that. While you might be able to use ferret to search, doing your next feature might not come so easy. I would suggest you pull those out of the columns and put them in a proper table so you can use SQL to do your dirty work. Charlie -- Posted via http://www.ruby-forum.com/. From sreechand at hotmail.com Sun Nov 19 16:50:44 2006 From: sreechand at hotmail.com (Sreechand Boppudi) Date: Sun, 19 Nov 2006 22:50:44 +0100 Subject: [Ferret-talk] score for wildcard searches Message-ID: <5be43a5f2e025a20e4f3ea007f8fc415@ruby-forum.com> Hello All, I have a rails app that maintains movie data index and uses "acts_as_ferret" for search. I ran into an issue with the scoring of wildcard searches. When I search for word "super*", the record containing the word "superman" is ranked above the one having just "super". Is this normal or am I missing something? Any ideas on how scoring can be controlled so that the shorter word is ranked higher? Thanks. -- Posted via http://www.ruby-forum.com/. From holden at pigscanfly.ca Sun Nov 19 21:52:21 2006 From: holden at pigscanfly.ca (Holden Karau) Date: Mon, 20 Nov 2006 03:52:21 +0100 Subject: [Ferret-talk] Parallal Building? Message-ID: I'm trying to index ~130,000 documents [soon to grow to about 500,000 documents] and I'm wondering if it's possible to combine ferret databases or in some other way split up the building process.
Normally, indexing 130k documents wouldn't be that painful except that there are different types of links between these documents and they are not absolute (so for example doc a refers to a document b, but there are multiple different documents labelled document a and document b, and to prevent false links I have to use some fairly computationally intensive heuristics). If it's not possible to split up the building of a ferret index I'll probably resolve the links into absolute links as a separate part of the process [which I can split up] and then build the ferret index on one machine after that. -- Posted via http://www.ruby-forum.com/. From alex at blackkettle.org Mon Nov 20 02:01:18 2006 From: alex at blackkettle.org (Alex Young) Date: Mon, 20 Nov 2006 07:01:18 +0000 Subject: [Ferret-talk] acts_as_ferret and searching word docs In-Reply-To: References: Message-ID: <456152BE.4080709@blackkettle.org> Charlie Hubbard wrote: > Alex MacCaw wrote: >> I was wondering if it is possible to search word documents using ferret. >> The actual text in a word document isn't in a binary format - only the >> formatting. Surely it would be possible to parse that? > > You might be able to use some of the extensions for M$ platform and ruby > to use COM to get the data. Or if you don't want to run on M$ platform > you could possibly use Java's POI from Jakarta to parse out the text and > put it into something that Ruby could then put into ferret. > > Charlie > Or there's Abiword - runs on all platforms, and outputs nice text. If you don't want graphical dependencies, there's wvWare, too. I'm using it at the moment.
-- Alex From kraemer at webit.de Mon Nov 20 03:47:55 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 09:47:55 +0100 Subject: [Ferret-talk] acts_as_ferret and searching word docs In-Reply-To: References: Message-ID: <20061120084754.GB22508@cordoba.webit.de> On Sat, Nov 18, 2006 at 04:33:26PM +0100, Charlie Hubbard wrote: > Alex MacCaw wrote: > > I was wondering if it is possible to search word documents using ferret. > > The actual text in a word document isn't in a binary format - only the > > formatting. Surely it would be possible to parse that? > > You might be able to use some of the extensions for M$ platform and ruby > to use COM to get the data. Or if you don't want to run on M$ platform > you could possibly use Java's POI from Jakarta to parse out the text and > put it into something that Ruby could then put into ferret. I successfully used the wv-utilities (wvText or wvHtml, on debian do 'apt-get install wv') to index word documents with Ferret. You can have a look at RDig (http://rubyforge.org/projects/rdig) to see an example of how this could be done. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Mon Nov 20 05:46:31 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 11:46:31 +0100 Subject: [Ferret-talk] Parallal Building? In-Reply-To: References: Message-ID: <20061120104631.GD22508@cordoba.webit.de> On Mon, Nov 20, 2006 at 03:52:21AM +0100, Holden Karau wrote: > I'm trying to index ~130,000 documents [soon to grow to about 500,000 > documents] and I'm wondering if its possible to combine ferret databases > or in some other way split up the building process.
> > Normally, indexing 130k documents wouldn't be that painful except that > there are different types of links between these documents and they are > not absolute (so for example doc a refers to a document b but there are > multiple different documents laballed document a and document b and to > prevent false links I have to use some fairly computationally intensive > heuristics]. > > If its not possible to split up the building of a ferret index I'll > probably resolve the links into absolute links as a separate part of the > process [which I can split up] and then build the ferret index one one > machine after that. Only one process or thread may write to the index at once, so you'll have to serialize your writing to the index somehow, i.e. gathering the data on two machines (or threads) and hand it over to the indexer. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Mon Nov 20 05:50:47 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 11:50:47 +0100 Subject: [Ferret-talk] undefined method `exists?' In-Reply-To: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> References: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> Message-ID: <20061120105047.GE22508@cordoba.webit.de> On Fri, Nov 17, 2006 at 05:02:59AM +0100, Daniel Reverri wrote: > Anyone ever run into this error message when creating a new FieldInfos? > Ferret::Index::FieldInfos.new(:store=>:no) > NoMethodError: undefined method `exists?' for {:store=>:no}:Hash no, what version of Ferret are you using ? here's how this looks on my system: irb(main):001:0> require 'ferret' => false irb(main):002:0> Ferret::Index::FieldInfos.new(:store=>:no) => default: store: :no index: :untokenized term_vector: :with_positions_offsets fields: irb(main):003:0> Ferret::VERSION => "0.10.13" Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Mon Nov 20 06:01:23 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 12:01:23 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> <20061116154238.GP639@cordoba.webit.de> <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> Message-ID: <20061120110123.GF22508@cordoba.webit.de> On Fri, Nov 17, 2006 at 03:20:43AM +0100, Brendon Muir wrote: > Jens Kraemer wrote: [..] > >> So firstly your help with that problem would be most appreciated. Then > >> we make things trickier by adding the fact that I'd like the end users > >> on the front end to be able to search the site not just using the name > >> field, but basically anything in any of the instance tables. > > > > that's easy, just index all the fields you want for frontend and backend > > search (including the instance_name field) and use a special QueryParser > > restricted to only the instance_name field for your backend search: > > > > QueryParser qp = QueryParser.new(:fields => [:instance_name]) > > ComponentInstance.find_by_contents(qp.parse user_query) > > > > with this solution, backend users could still manually construct > > complex queries to search other fields, but queries not using any field > > names will default to only search the instance_name field. > > > > Excellent! Sounds like a good solution.
Will this mean I will have to do > a while bunch of these type of accessor's in my model. One for each > field in my seperate polymorphic instances? > > def instance_linkurl > instance.linkurl > end > > etc... that depends on what you want to do. If you want fine grained searches on single fields like linkurl or name then yes, you'll have to have these accessors. but with ruby you can easily declare them in a programmatic way, i.e. loop over the attributes of instance and use define_method inside the loop... if you don't need that many fields for querying you can also aggregate the values of your instance attributes to a single string and index that instead. > Will that break when ferret trys to trawl a polymorphic model that > doesn't have a linkurl attribute? it probably will, you'll have to check if the object has this method and return '' if it doesn't. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Mon Nov 20 06:58:12 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 12:58:12 +0100 Subject: [Ferret-talk] Index only partially built Message-ID: I have an application which I'm running using Mongrel and Apache as described here http://www.napcs.com/howto/rails/deploy/. I have a model Person which I am attempting to use acts_as_ferret with. When I first try to do a search the index begins to get built but it fails halfway through with the following error in the browser: === Proxy Error The proxy server received an invalid response from an upstream server. The proxy server could not handle the request POST /myapp/people/search. Reason: Error reading from remote server === I'm guessing this is Apache giving up on receiving anything from Mongrel as the index is taking so long to build.
If I attempt to do the search again then only half of the data seems to have been indexed. How can I index all of the database entries? -- Posted via http://www.ruby-forum.com/. From neeraj.jsr at gmail.com Mon Nov 20 07:32:22 2006 From: neeraj.jsr at gmail.com (Raj Singh) Date: Mon, 20 Nov 2006 13:32:22 +0100 Subject: [Ferret-talk] End-of-File Error occured at :79 in xraise Message-ID: On average I get this error two or three times a week. After I rebuild the index with Event.rebuild_index it works fine. I'm a bit puzzled by this behavior. Why does this happen? I am using AAF. End-of-File Error occured at :79 in xraise Error occured in store.c:216 - is_refill current pos = 301, file length = 301 /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:517:in `close' /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:517:in `flush' /usr/local/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:514:in `flush' /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:280:in `<<' /usr/local/lib/ruby/1.8/monitor.rb:229:in `synchronize' /usr/local/lib/ruby/site_ruby/1.8/ferret/index.rb:252:in `<<' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/instance_methods.rb:85:in `ferret_update' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/callbacks.rb:344:in `callback' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/callbacks.rb:341:in `callback' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/callbacks.rb:279:in `update_without_timestamps' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/timestamp.rb:39:in `update' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/base.rb:1718:in `create_or_update_without_callbacks' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/callbacks.rb:253:in `create_or_update' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/base.rb:1392:in `save_without_validation'
/usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/validations.rb:736:in `save_without_transactions' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:126:in `save' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/connection_adapters/abstract/database_statements.rb:51:in `transaction' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:91:in `transaction' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:118:in `transaction' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:126:in `save' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/base.rb:1398:in `save_without_validation!' /usr/local/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/validations.rb:746:in `save!' #{RAILS_ROOT}/app/controllers/home_controller.rb:33:in `event_info' -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Mon Nov 20 07:41:55 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 13:41:55 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: Message-ID: Is there a way I can manually build the index before using the application? From the console for example? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Nov 20 07:50:20 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 13:50:20 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: Message-ID: <20061120125020.GG22508@cordoba.webit.de> On Mon, Nov 20, 2006 at 01:41:55PM +0100, Matthew Planchant wrote: > Is there a way I can manually build the index before using the > application? From the console for example? yeah, just do Person.rebuild_index in your console. Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From neeraj.jsr at gmail.com Mon Nov 20 07:55:21 2006 From: neeraj.jsr at gmail.com (Raj Singh) Date: Mon, 20 Nov 2006 13:55:21 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: Message-ID: Matthew Planchant wrote: > Is there a way I can manually build the index before using the > application? From the console for example? ruby script/console production Person.rebuild_index -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Mon Nov 20 07:56:42 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 13:56:42 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: Message-ID: Raj Singh wrote: > Matthew Planchant wrote: >> Is there a way I can manually build the index before using the >> application? From the console for example? > > ruby script/console production > Person.rebuild_index Thanks. Thought it might have been something like that. -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Mon Nov 20 08:12:32 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 14:12:32 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: Message-ID: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> > ruby script/console production > Person.rebuild_index When I try this, false is returned and some of my data still isn't being indexed. How can I find out what is going wrong? -- Posted via http://www.ruby-forum.com/.
From kraemer at webit.de Mon Nov 20 09:14:44 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 15:14:44 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> References: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> Message-ID: <20061120141444.GB22902@cordoba.webit.de> On Mon, Nov 20, 2006 at 02:12:32PM +0100, Matthew Planchant wrote: > > ruby script/console production > > Person.rebuild_index > > When I try this false is returned and some of my data still isn't being > index. How can I find out what is going wrong? the return value of rebuild_index has no special meaning, so this is ok. how do you know some of your data isn't indexed ? However, AAF logs the fields and values it indexes when a record is saved or created, so you might find some helpful info in the log file (you might have to set the log level to debug when doing this in production mode) Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Mon Nov 20 09:17:15 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 15:17:15 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <20061120125020.GG22508@cordoba.webit.de> References: <20061120125020.GG22508@cordoba.webit.de> Message-ID: <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> > yeah, just do > > Person.rebuild_index in your console. This is returning false and some of the data is still not indexed.
I'm getting some output like this in the log: Error retrieving value for field primary_organisation_name: undefined method `name' for nil:NilClass Error retrieving value for field preferred_address_address: undefined method `address' for nil:NilClass Error retrieving value for field primary_organisation_name: undefined method `name' for nil:NilClass Could this be the cause of the rebuild missing out some of the data? -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Mon Nov 20 09:33:19 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 15:33:19 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <20061120141444.GB22902@cordoba.webit.de> References: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> <20061120141444.GB22902@cordoba.webit.de> Message-ID: <31b0090d6233826f75f03a7ae7f21273@ruby-forum.com> > the return value of rebuild_index has no special meaning, so this is ok. > how do you know some of your data isn't indexed ? When I do a search, data which I know is in the db isn't found. Here's how I know it isn't all being indexed: if I search for person X, they are not found. If I go directly to X's edit page and make an amendment, X can then be found with a search. > However, AAF logs the fields and values it indexes when a record is > saved or created, so you might find some helpful info in the log file > (you might have to set the log level to debug when doing this in > production mode) OK. I'll take a look. -- Posted via http://www.ruby-forum.com/.
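A note on the "Error retrieving value for field ...: undefined method `name' for nil:NilClass" messages quoted above: they suggest the indexed methods call through an association that is nil for some records. Whether that is what leaves records out of the index is not confirmed in this thread, but one possible workaround is to make those methods nil-safe. The following is only a minimal sketch in plain Ruby; Organisation, Address and this Person class are hypothetical stand-ins, not the poster's actual models, and the method names are simply taken from the field names in the log.

```ruby
# Hypothetical stand-ins for the associated models; not the poster's real classes.
Organisation = Struct.new(:name)
Address      = Struct.new(:address)

class Person
  attr_accessor :primary_organisation, :preferred_address

  def initialize(primary_organisation = nil, preferred_address = nil)
    @primary_organisation = primary_organisation
    @preferred_address    = preferred_address
  end

  # Named after the indexed fields seen in the log output.
  # Each returns '' instead of raising when the association is nil.
  def primary_organisation_name
    primary_organisation ? primary_organisation.name.to_s : ''
  end

  def preferred_address_address
    preferred_address ? preferred_address.address.to_s : ''
  end
end

with_org    = Person.new(Organisation.new('Acme'), Address.new('1 High St'))
without_org = Person.new

puts with_org.primary_organisation_name.inspect
puts without_org.primary_organisation_name.inspect
puts without_org.preferred_address_address.inspect
```

In a real model the same guard would go into whatever methods back the indexed fields, so a record without an organisation or address contributes an empty string rather than an error.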
From matt at planchant.co.uk Mon Nov 20 09:35:44 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 15:35:44 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <20061120141444.GB22902@cordoba.webit.de> References: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> <20061120141444.GB22902@cordoba.webit.de> Message-ID: <999349a7d4474aee22f1e5063a2d076e@ruby-forum.com> Jens Kraemer wrote: > (you might have to set the log level to debug when doing this in > production mode) How do I do this? -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Mon Nov 20 09:54:10 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 15:54:10 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <999349a7d4474aee22f1e5063a2d076e@ruby-forum.com> References: <266383b43a0b47a02e20bf4ffbac43d8@ruby-forum.com> <20061120141444.GB22902@cordoba.webit.de> <999349a7d4474aee22f1e5063a2d076e@ruby-forum.com> Message-ID: Matthew Planchant wrote: > Jens Kraemer wrote: > >> (you might have to set the log level to debug when doing this in >> production mode) > > How do I do this? OK. I added 'config.log_level = :debug' to production.rb. -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Nov 20 10:03:24 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 16:03:24 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> Message-ID: <20061120150324.GA14323@cordoba.webit.de> On Mon, Nov 20, 2006 at 03:17:15PM +0100, Matthew Planchant wrote: > > > yeah, just do > > > > Person.rebuild_index in your console. > > This is returning false and some of the data is still no indexed? 
> > I'm getting some output like this in the log: > > Error retrieving value for field primary_organisation_name: undefined > method `name' for nil:NilClass > Error retrieving value for field preferred_address_address: undefined > method `address' for nil:NilClass > Error retrieving value for field primary_organisation_name: undefined > method `name' for nil:NilClass > > Could this be the cause of rebulid missing out some of the data? seems your primary_organisation and preferred_address are nil, indeed. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From pritchie at videotron.ca Mon Nov 20 09:04:21 2006 From: pritchie at videotron.ca (Patrick Ritchie) Date: Mon, 20 Nov 2006 09:04:21 -0500 Subject: [Ferret-talk] Parallal Building? In-Reply-To: <20061120104631.GD22508@cordoba.webit.de> References: <20061120104631.GD22508@cordoba.webit.de> Message-ID: <4561B5E5.7070907@videotron.ca> Jens Kraemer wrote: > On Mon, Nov 20, 2006 at 03:52:21AM +0100, Holden Karau wrote: > >> I'm trying to index ~130,000 documents [soon to grow to about 500,000 >> documents] and I'm wondering if its possible to combine ferret databases >> or in some other way split up the building process. >> >> Normally, indexing 130k documents wouldn't be that painful except that >> there are different types of links between these documents and they are >> not absolute (so for example doc a refers to a document b but there are >> multiple different documents laballed document a and document b and to >> prevent false links I have to use some fairly computationally intensive >> heuristics]. >> >> If its not possible to split up the building of a ferret index I'll >> probably resolve the links into absolute links as a separate part of the >> process [which I can split up] and then build the ferret index one one >> machine after that.
>> > > Only one process or thread may write to the index at once, so you'll > have to serialize your writing to the index somehow, i.e. gathering the > data on two machines (or threads) and hand it over to the indexer. *Ferret newbie warning* Shouldn't it be possible to use the add_indexes method to merge one or more indexes? http://ferret.davebalmain.com/api/classes/Ferret/Index/Index.html#M000035 Cheers! Patrick From matt at planchant.co.uk Mon Nov 20 10:06:31 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 16:06:31 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <20061120150324.GA14323@cordoba.webit.de> References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> Message-ID: Jens Kraemer wrote: > On Mon, Nov 20, 2006 at 03:17:15PM +0100, Matthew Planchant wrote: >> method `name' for nil:NilClass >> Error retrieving value for field preferred_address_address: undefined >> method `address' for nil:NilClass >> Error retrieving value for field primary_organisation_name: undefined >> method `name' for nil:NilClass >> >> Could this be the cause of rebulid missing out some of the data? > > seems your primary_organisation and preferred_address are nil, indeed. Does this mean that contacts with NULL values in the database for primary_organisation and preferred_address will not be included in the index? If so how can I get around this? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Mon Nov 20 11:07:24 2006 From: kraemer at webit.de (Jens Kraemer) Date: Mon, 20 Nov 2006 17:07:24 +0100 Subject: [Ferret-talk] Parallal Building?
In-Reply-To: <4561B5E5.7070907@videotron.ca> References: <20061120104631.GD22508@cordoba.webit.de> <4561B5E5.7070907@videotron.ca> Message-ID: <20061120160723.GB14323@cordoba.webit.de> On Mon, Nov 20, 2006 at 09:04:21AM -0500, Patrick Ritchie wrote: > Jens Kraemer wrote: [..] > >Only one process or thread may write to the index at once, so you'll > >have to serialize your writing to the index somehow, i.e. gathering the > >data on two machines (or threads) and hand it over to the indexer. > *Ferret newbie warning* > > Shouldn't it be possible to use the add_indexes method to merge one or > more indexes? > > http://ferret.davebalmain.com/api/classes/Ferret/Index/Index.html#M000035 interesting :-) I didn't ever try this, so if you do please let me know how it worked. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From andreas.korth at gmx.net Mon Nov 20 11:44:00 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 20 Nov 2006 17:44:00 +0100 Subject: [Ferret-talk] Request for separate AAF list In-Reply-To: <20061120105047.GE22508@cordoba.webit.de> References: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> <20061120105047.GE22508@cordoba.webit.de> Message-ID: <83174290-8017-44B6-9827-CEC15EFA1FE6@gmx.net> Hi there! I'd like to suggest setting up a separate mailing list for the ActsAsFerret plugin since the majority of posts are AAF-related. This results in a pretty low signal-to-noise ratio for folks not interested in ActsAsFerret. I really prefer low-traffic lists with a clear focus. One can still subscribe to both lists and cross-post if appropriate. Cheers, Andy From pritchie at videotron.ca Mon Nov 20 11:55:37 2006 From: pritchie at videotron.ca (Patrick Ritchie) Date: Mon, 20 Nov 2006 11:55:37 -0500 Subject: [Ferret-talk] Parallal Building?
In-Reply-To: <20061120160723.GB14323@cordoba.webit.de> References: <20061120104631.GD22508@cordoba.webit.de> <4561B5E5.7070907@videotron.ca> <20061120160723.GB14323@cordoba.webit.de> Message-ID: <4561DE09.1070101@videotron.ca> Jens Kraemer wrote: > On Mon, Nov 20, 2006 at 09:04:21AM -0500, Patrick Ritchie wrote: > >> Jens Kraemer wrote: >> > [..] > >>> Only one process or thread may write to the index at once, so you'll >>> have to serialize your writing to the index somehow, i.e. gathering the >>> data on two machines (or threads) and hand it over to the indexer. >>> >> *Ferret newbie warning* >> >> Shouldn't it be possible to use the add_indexes method to merge one or >> more indexes? >> >> http://ferret.davebalmain.com/api/classes/Ferret/Index/Index.html#M000035 >> > > interesting :-) > I didn't ever try this, so if you do please let me know how it worked. > > Jens > > I just did the following in IRB:
  i1 = Index.new
  i2 = Index.new
  i1 << {:text => 'one'}
  i2 << {:text => 'two'}
  i1.search_each("text:one") {|id, score| puts "#{i1[id][:text]}"} => "one"
  i1.search_each("text:two") {|id, score| puts "#{i1[id][:text]}"} => nil
  i1.add_indexes i2
  i1.search_each("text:two") {|id, score| puts "#{i1[id][:text]}"} => "two"
Seems to work as advertised... Cheers! Patrick From matt at planchant.co.uk Mon Nov 20 11:58:57 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Mon, 20 Nov 2006 17:58:57 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> Message-ID: The rebuild_index seems to be working OK but then terminates prematurely. Why might this happen? -- Posted via http://www.ruby-forum.com/.
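On the "Parallal Building?" thread quoted above: besides merging with add_indexes, the earlier advice to serialize writes (several workers gather the data, one indexer writes it) can be sketched with a queue. This is only an illustration in plain Ruby under stated assumptions: prepare_document is a hypothetical stand-in for the expensive link-resolution step, and the indexed array stands in for the Ferret index so the sketch runs without Ferret installed.

```ruby
require 'thread' # Queue; needed on older Rubies, harmless on newer ones

# Hypothetical stand-in for the expensive link-resolution step.
def prepare_document(raw)
  { :id => raw, :text => "document #{raw}" }
end

queue   = Queue.new
indexed = []      # stand-in for the index; only the writer thread touches it
done    = :done   # sentinel telling the writer that no more documents will arrive

# Several producers do the heavy preparation work in parallel...
producers = [(1..50), (51..100)].map do |range|
  Thread.new do
    range.each { |raw| queue << prepare_document(raw) }
  end
end

# ...while exactly one writer adds documents to the index.
writer = Thread.new do
  while (doc = queue.pop) != done
    indexed << doc  # with a real Ferret index this would be: index << doc
  end
end

producers.each { |t| t.join }
queue << done
writer.join

puts indexed.size
```

The point of the design is that the index is only ever touched from one thread, which is the constraint stated in the thread; how much it helps depends on how much of the time is spent in preparation rather than in the index write itself.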
From holden at pigscanfly.ca Mon Nov 20 14:47:36 2006 From: holden at pigscanfly.ca (Holden Karau) Date: Mon, 20 Nov 2006 20:47:36 +0100 Subject: [Ferret-talk] Parallal Building? In-Reply-To: <4561B5E5.7070907@videotron.ca> References: <20061120104631.GD22508@cordoba.webit.de> <4561B5E5.7070907@videotron.ca> Message-ID: <3957ce2fbbb4af82b4788ac47567bf72@ruby-forum.com> Patrick Ritchie wrote: > Jens Kraemer wrote: >>> prevent false links I have to use some fairly computationally intensive >> data on two machines (or threads) and hand it over to the indexer. > *Ferret newbie warning* > > Shouldn't it be possible to use the add_indexes method to merge one or > more indexes? > > http://ferret.davebalmain.com/api/classes/Ferret/Index/Index.html#M000035 > > Cheers! > Patrick I can't believe I missed that. I'll give it a shot sometime over the weekend, thanks :-) -- Posted via http://www.ruby-forum.com/. From brendon at spikeinsights.co.nz Mon Nov 20 16:44:35 2006 From: brendon at spikeinsights.co.nz (Brendon Muir) Date: Mon, 20 Nov 2006 22:44:35 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <20061120110123.GF22508@cordoba.webit.de> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> <20061116154238.GP639@cordoba.webit.de> <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> <20061120110123.GF22508@cordoba.webit.de> Message-ID: <1577daf4cc669d362a9725626f395568@ruby-forum.com> Thanks for all your help above :) Coming back to the original query (now that I'm back at work to try it), when I create a "component instance" i get the following in the logs: Processing FolderController#new (for 127.0.0.1 at 2006-11-21 10:23:29) [POST] Session ID: 
e3154e167b8382c5050468c261bd0ead Parameters: {"commit"=>"Add", "folder"=>{"name"=>"cool folder"}, "action"=>"new", "controller"=>"admin/construction/folder", "parent_id"=>"12"} Folder Columns (0.016000) SHOW FIELDS FROM folders SQL (0.000000) BEGIN SQL (0.046000) INSERT INTO folders (`name`) VALUES('cool folder') SQL (0.094000) COMMIT ComponentInstance Columns (0.016000) SHOW FIELDS FROM component_instances Component Load (0.015000) SELECT * FROM components WHERE (technical_name = 'folder') LIMIT 1 Component Columns (0.000000) SHOW FIELDS FROM components SQL (0.000000) BEGIN ComponentInstance Load (0.000000) SELECT * FROM component_instances WHERE (parent_id) ORDER BY position DESC LIMIT 1 SQL (0.032000) INSERT INTO component_instances (`deleted_root_item`, `instance_type`, `deleted_on`, `enabled`, `instance_id`, `component_id`, `parent_id`, `position`) VALUES(NULL, 'Folder', NULL, 1, 4, 2, 12, 5) ferret_create/update: ComponentInstance : 20 creating doc for class: ComponentInstance, id: 20 SQL (0.031000) COMMIT Redirected to http://localhost:3001/admin/construction/construction_zone/12 Completed in 1.01600 (0 reqs/sec) | DB: 0.25000 (24%) | 302 Found [http://localhost/admin/construction/folder/12/new] The current model code is: class ComponentInstance < ActiveRecord::Base has_many :permissions has_many :groups, :through => :permissions belongs_to :component #will be in different database ?? acts_as_tree :order => 'position' acts_as_list :scope => 'parent_id' belongs_to :instance, :polymorphic => true acts_as_ferret( :fields => :instance_name ) def instance_name instance.name end -- And here is the controller method to make a new folder (component_instance) def new if request.post? 
@folder = Folder.new(params[:folder]) success = @folder.save component_instance = ComponentInstance.new component_instance.enabled = 1; #ie True component_instance.instance = @folder component_instance.parent_id = params[:parent_id] component_instance.component = Component.find(:first, :conditions =>"technical_name = 'folder'") if success && component_instance.save flash[:notice] = 'A new folder was successfully added.' redirect_to :controller => 'construction_zone', :parent_id => params[:parent_id] else @folder.destroy if success #Make sure the link and component_instances tables are consistent flash[:fail] = 'There was an error when saving the folder' end else @folder = Folder.new end end Any ideas would be greatly appreciated. :) Brendon -- Posted via http://www.ruby-forum.com/. From brendon at spikeinsights.co.nz Mon Nov 20 17:39:30 2006 From: brendon at spikeinsights.co.nz (Brendon Muir) Date: Mon, 20 Nov 2006 23:39:30 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <1577daf4cc669d362a9725626f395568@ruby-forum.com> References: <8e76359318638fb9f0d8ac5f5994aa6f@ruby-forum.com> <20061107073623.GA27583@cordoba.webit.de> <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> <20061116154238.GP639@cordoba.webit.de> <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> <20061120110123.GF22508@cordoba.webit.de> <1577daf4cc669d362a9725626f395568@ruby-forum.com> Message-ID: <919b866d6f9fef08c079bbb7722eeb0e@ruby-forum.com> Oh so, here's how to do it :) acts_as_ferret :fields => ['instance_name'] def instance_name instance.name end Seems to be the correct syntax :) Not sure why the symbol way didn't do it? Cheers, Brendon -- Posted via http://www.ruby-forum.com/. 
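A follow-up to the suggestion made earlier in this thread (declare the instance_* accessors programmatically with define_method, and return '' when the instance has no such attribute): the sketch below shows one way that could look. It is plain Ruby so it runs on its own; Folder and Link are hypothetical stand-ins for the polymorphic instance classes, and the INSTANCE_FIELDS list is an assumption made for illustration.

```ruby
# Hypothetical stand-ins for the polymorphic instance classes.
Folder = Struct.new(:name)
Link   = Struct.new(:name, :linkurl)

class ComponentInstance
  attr_accessor :instance

  def initialize(instance)
    @instance = instance
  end

  # Attributes that some, but not all, instance types provide (assumed list).
  INSTANCE_FIELDS = [:name, :linkurl]

  INSTANCE_FIELDS.each do |field|
    # Defines instance_name, instance_linkurl, ...
    define_method("instance_#{field}") do
      # Return '' when the instance is missing or lacks the attribute.
      instance.respond_to?(field) ? instance.send(field).to_s : ''
    end
  end
end

folder = ComponentInstance.new(Folder.new('cool folder'))
link   = ComponentInstance.new(Link.new('home', 'http://example.com/'))

puts folder.instance_name
puts folder.instance_linkurl.inspect
puts link.instance_linkurl
```

With acts_as_ferret the generated method names would presumably then be listed as strings in :fields, as in the syntax reported to work above, e.g. ['instance_name', 'instance_linkurl'].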
From miguel.wong at gmail.com Mon Nov 20 20:15:54 2006 From: miguel.wong at gmail.com (Miguel) Date: Tue, 21 Nov 2006 02:15:54 +0100 Subject: [Ferret-talk] acts_as_ferret with STI models Message-ID: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> Can acts_as_ferret search only one of the inherit models in the hierarchy of STI models? Say you have Contents, with types articles and comments. I know that you do Contents.find_by_contents, but can you also create indexed for Comment and Articles? Thanks for you help Miguel -- Posted via http://www.ruby-forum.com/. From curtis.hatter at insightbb.com Mon Nov 20 21:41:44 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Mon, 20 Nov 2006 21:41:44 -0500 Subject: [Ferret-talk] [Libraries] Strange indexing issues with CachedModel, STI, and AAF References: Message-ID: <002b01c70d16$9597bea0$0202a8c0@again> I solved this problem finally. After investigating Ferret, Acts_As_Ferret and CachedModel I finally turned to checking out Rails. It turns out Rails 1.1.6 does not properly scope queries for STI models if they have an abstract_class. This seems to be fixed now (http://dev.rubyonrails.org/ticket/5704). The problem was that Rails current method did not check to see if the class' parent was abstract. Hope this helps someone else who may try to use STI and CachedModel together. Now that it works I'm very pleased with the solution. Almost split my data into separate tables. Sorry for the noise, Curtis ----- Original Message ----- From: To: ; Sent: Thursday, November 16, 2006 11:45 AM Subject: [Libraries] Strange indexing issues with CachedModel, STI, and AAF > I started using robotcoop's CachedModel class in my project but have encountered problems when using it with the acts_as_ferret plugin. It seems it doesn't index everything in my STI model, also if I do a search from my base STI class I get a result count but no results. 
If I run the same search from one of the children STI models I get the appropriate results (if the information was indexed). > > Here's my setup: > > class Record < CachedModel > acts_as_nested_set > acts_as_ferret( :fields => { > :lft { :index => :untokenized_omit_norms }, > :name => {}, > :desc => {}, > :body => {:strore => :yes}, > :role => {}, > }) > > def self.inheritance_column > 'role' > end > > # methods below ..... > end > > class FirstRecord < Record > end > > class SecondRecord < Record > end > > class ApplicationController < ActionController::Base > after_filter { CachedModel.cache_reset } > end > > Here's my CachedModel setup: > - config/environment.rb: > # Include your application configuration below > require 'cached_model' > CachedModel.use_local_cache = true > > - config/environments/development.rb (last line) > CACHE = MemCache.new 'localhost:11211', :namespace => 'ohio_development' > > - config/environments/production.rb (last line) > CACHE = MemCache.new 'localhost:11211', :namespace => 'ohio_production' > > As far as I can tell I've set both up properly. Also I get the same problems when running in production mode. > > This is on a FreeBSD 6.1 server, with memcached-1.1.12_3, mysql 5.0.26, and rails 1.1.6. > > Any help would be appreciated as I've been at this one for 2 days. > > Here's example output I get from irb: > >> Record.find_by_contents("search code") > => # > > This makes me think it has something to do with the 'find' method being overridden by CachedModel but not sure how to verify that at this point. 
> > Thanks, > Curtis > _______________________________________________ > Libraries mailing list > Libraries at lists.robotcoop.com > http://lists.robotcoop.com/mailman/listinfo/libraries From curtis.hatter at insightbb.com Mon Nov 20 21:34:54 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Mon, 20 Nov 2006 21:34:54 -0500 Subject: [Ferret-talk] acts_as_ferret with STI models References: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> Message-ID: <001701c70d15$a128fa00$0202a8c0@again> You can most definitely do this. Just index your base class in the STI model. Then you can run "find_by_contents" from any of the children. I created a method in my base STI class so I can scope my query. For scoping I used something like the following line: query << " role:#{self.class.eql?(Contents) ? '*' : self.class}" Though you could make it more generic by simply asking "self.descends_from_active_record?" which is how Rails decides if it should scope your "find" query for STI models. You can check out "base.rb" in activerecord to see that. I do believe that eventually AAF will scope queries for STI as I see a TODO in the source about it but Jens would be a better person to comment on that. One last note, if you are using Rails 1.1.6 or earlier and plan on using something like CachedModel (an abstract class that sits between ActiveRecord::Base and the base of your STI model) please see my email "Strange indexing issues with CachedModel, STI, and AAF" as Rails 1.1.6 does not properly scope queries for STI that have an abstract_class parent. Hope that helps, Curtis ----- Original Message ----- From: "Miguel" To: Sent: Monday, November 20, 2006 8:15 PM Subject: [Ferret-talk] acts_as_ferret with STI models > Can acts_as_ferret search only one of the inherit models in the > hierarchy of STI models? Say you have Contents, with types articles and > comments. I know that you do Contents.find_by_contents, but can you > also create indexed for Comment and Articles?
> > Thanks for you help > > Miguel > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From miguel.wong at gmail.com Mon Nov 20 22:38:13 2006 From: miguel.wong at gmail.com (Miguel) Date: Tue, 21 Nov 2006 04:38:13 +0100 Subject: [Ferret-talk] acts_as_ferret with STI models In-Reply-To: <001701c70d15$a128fa00$0202a8c0@again> References: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> <001701c70d15$a128fa00$0202a8c0@again> Message-ID: Thanks Curtis for your quick response! I do have another question. > > query << " role:#{self.class.eql?(Contents) ? '*' : self.class}" > Does AAF look for a field called "role" by default? Or is :role a field that is added to any STI model? Shouldn't that be :type? Thanks! Miguel -- Posted via http://www.ruby-forum.com/. From curtis.hatter at insightbb.com Mon Nov 20 23:12:19 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Mon, 20 Nov 2006 23:12:19 -0500 Subject: [Ferret-talk] acts_as_ferret with STI models References: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> <001701c70d15$a128fa00$0202a8c0@again> Message-ID: <001301c70d23$3cd0ecd0$0202a8c0@again> Ahhh.. sorry about that. I redefined the field name for my STI because "type" causes way too many problems. class YourModel < ActiveRecord::Base ... code ... def self.inheritance_column 'role' end end You may come up with a better column name than that but I recommend redefining it because you can't do record.type without getting warnings/errors (it caused problems elsewhere for me as well). Currently AAF does not scope queries for you (to my knowledge).
Hope that clears up the confusion I may have caused you, Curtis A link to the Rails docs explaining what I just said: http://rubyonrails.com/rails/classes/ActiveRecord/Base.html#M000879 ----- Original Message ----- From: "Miguel" To: Sent: Monday, November 20, 2006 10:38 PM Subject: Re: [Ferret-talk] acts_as_ferret with STI models > Thanks Curtis for your quick response! I do have another question. > > > > > query << " role:#{self.class.eql?(Contents) ? '*' : self.class}" > > > > Does AAF look for a field called "role" by default? Or is :role a field > that is added to any STI model? Shouldn't that be :type? > > Thanks! > > Miguel > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From miguel.wong at gmail.com Tue Nov 21 00:08:16 2006 From: miguel.wong at gmail.com (Miguel) Date: Tue, 21 Nov 2006 06:08:16 +0100 Subject: [Ferret-talk] acts_as_ferret with STI models In-Reply-To: <001301c70d23$3cd0ecd0$0202a8c0@again> References: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> <001701c70d15$a128fa00$0202a8c0@again> <001301c70d23$3cd0ecd0$0202a8c0@again> Message-ID: Thank you so much Curtis. That was exactly what I needed to know!!! You're awesome -- Posted via http://www.ruby-forum.com/.
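The scoping idea discussed in this thread can be sketched in plain Ruby, without Rails or Ferret. The classes below are stand-ins for an STI hierarchy, scoped_query is a hypothetical helper name, and the sketch assumes the inheritance column is indexed as a field called "role" and that the query parser accepts "+" for required clauses.

```ruby
# Plain-Ruby stand-ins for an STI hierarchy; no Rails or Ferret required.
class Content
  # Hypothetical helper: restrict a query string to the calling subclass.
  # Assumes the STI column is indexed under the field name "role".
  def self.scoped_query(query)
    return query if self == Content   # base class: search every type
    "+(#{query}) +role:#{name}"       # subclass: require a matching role
  end
end

class Article < Content; end
class Comment < Content; end

puts Article.scoped_query('ruby')   # +(ruby) +role:Article
puts Content.scoped_query('ruby')   # ruby
```

In a real model the returned string would be what gets handed to find_by_contents; whether the role term matches depends on how that field was analyzed at index time.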
From kraemer at webit.de Tue Nov 21 03:51:27 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 09:51:27 +0100 Subject: [Ferret-talk] acts_as_ferret with polymorphic associations In-Reply-To: <919b866d6f9fef08c079bbb7722eeb0e@ruby-forum.com> References: <1012fbe7cb9fb07606c33a80dd630701@ruby-forum.com> <40216b7035e101d7cbb1b3b7bf82e9d8@ruby-forum.com> <20061107131530.GG14929@cordoba.webit.de> <070cd419b72e4e428decbad2de835cb7@ruby-forum.com> <3ed8a36595173a05d493e072a6d416a1@ruby-forum.com> <20061116154238.GP639@cordoba.webit.de> <0916c27162c0b712d588246eed5fcd9d@ruby-forum.com> <20061120110123.GF22508@cordoba.webit.de> <1577daf4cc669d362a9725626f395568@ruby-forum.com> <919b866d6f9fef08c079bbb7722eeb0e@ruby-forum.com> Message-ID: <20061121085127.GE14323@cordoba.webit.de> On Mon, Nov 20, 2006 at 11:39:30PM +0100, Brendon Muir wrote: > Oh so, here's how to do it :) > > acts_as_ferret :fields => ['instance_name'] > > def instance_name > instance.name > end > > > Seems to be the correct syntax :) Not sure why the symbol way didn't do > it? imho it's not the symbol that was the problem, but the missing brackets around it - in your original post you wrote acts_as_ferret :fields => :instance_name So acts_as_ferret :fields => [ :instance_name ] should work, too. Jens > > Cheers, > > Brendon > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 21 04:07:12 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 10:07:12 +0100 Subject: [Ferret-talk] acts_as_ferret with STI models In-Reply-To: <001701c70d15$a128fa00$0202a8c0@again> References: <6ffad8c06c3dde138199a012e595f457@ruby-forum.com> <001701c70d15$a128fa00$0202a8c0@again> Message-ID: <20061121090712.GF14323@cordoba.webit.de> On Mon, Nov 20, 2006 at 09:34:54PM -0500, Curtis Hatter wrote: > You can most definitely do this. Just index your base class in the STI > model. Then you can run "find_by_contents" from any of the children. > > I created a method in my base STI class so I can scope my query. For scoping > I used something like the following line: > > query << " role:#{self.class.eql?(Contents) ? '*' : self.class}" > > Though you could make it more generic by simply asking > "self.descends_from_active_record?" which is how rails decides if it should > scope your "find" query for STI models. You can check out "base.rb" in > activerecord to see that. > > I do believe that eventually AAF will scope queries for STI as I see a TODO > in the source about it but Jens would be a better person to comment on that. AAF does no scoping for searches on an STI model, but as AAF uses activerecord to retrieve the results delivered by find_by_contents, the usual Rails scoping will take place, so you'll get only instances of the class you called find_by_contents on. However the :limit and :offset options of find_by_contents become quite useless since they will be applied to the (unscoped) result set from ferret, and the scoping will only take place later when retrieving the actual records from the db. So scoping the ferret query is a good idea, and it should be possible to integrate that into aaf once I find the time to make a new release...
cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 21 04:23:12 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 10:23:12 +0100 Subject: [Ferret-talk] score for wildcard searches In-Reply-To: <5be43a5f2e025a20e4f3ea007f8fc415@ruby-forum.com> References: <5be43a5f2e025a20e4f3ea007f8fc415@ruby-forum.com> Message-ID: <20061121092312.GG14323@cordoba.webit.de> On Sun, Nov 19, 2006 at 10:50:44PM +0100, Sreechand Boppudi wrote: > Hello All, > I have a rails app that maintains movie data index and uses > "acts_as_ferret" for search. I ran into an issue with the scoring of > wildcard searches. When I search for word "super*", the record > containing the word "superman" is ranked above the one having just > "super". > > Is this normal or am I missing something? Any ideas on how scoring can > be controlled so that the shorter word is ranked higher? Thanks. There's a method named 'explain' in Ferret::Index::Index which shows how the score of a query result is calculated. This might help to find out why your scores are the way they are, but it requires a deep understanding of how the index works (I for myself only understand parts of it ;-)) I think Dave once explained the output of this method in a post some time ago. Reading Lucene in Action will definitely help in understanding what happens, too ;-) As a quick uneducated guess - the document with 'superman' might score better because its overall amount of text is smaller than the text of the document containing the word 'super'. In general, a hit in a smaller amount of words is considered more relevant. But that's only one part of the equation, so I might well be wrong here... Jens -- webit!
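The :limit problem described above can be shown with plain arrays. This is only an illustration with made-up hit data; it calls neither Ferret nor ActiveRecord, and the variable names are hypothetical.

```ruby
# Each hit is [id, class_name], in the order the index returned it.
hits = [[1, 'Comment'], [2, 'Article'], [3, 'Comment'],
        [4, 'Article'], [5, 'Article'], [6, 'Article']]
limit = 3

# Limit applied to the unscoped hits, class filter applied afterwards.
limit_then_scope = hits.first(limit).select { |_, klass| klass == 'Article' }

# Class filter applied first (as a scoped index query would do), then the limit.
scope_then_limit = hits.select { |_, klass| klass == 'Article' }.first(limit)

p limit_then_scope.map { |id, _| id }   # [2]
p scope_then_limit.map { |id, _| id }   # [2, 4, 5]
```

With the limit applied first, a page that should hold three articles comes back with one, which is why scoping inside the index query matters for paging.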
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Tue Nov 21 04:43:24 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 21 Nov 2006 10:43:24 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> Message-ID: Here's a summary of what I have: === class Person < ActiveRecord::Base acts_as_ferret :additional_fields => [:organisation_names, :preferred_address_address, :primary_organisation_name] has_many :person_organisations, :dependent => true has_many :organisations, :through => :person_organisations belongs_to :preferred_address, :foreign_key => 'preferred_address_id', :class_name => 'Address' belongs_to :primary_organisation, :foreign_key => 'primary_organisation_id', :class_name => 'Organisation' def primary_organisation_name return primary_organisation.name end def preferred_address_address return preferred_address.address end def organisation_names organisations.collect { |organisation| organisation.name }.join ' ' end def self.full_text_search(q, options = {}) return nil if q.nil? or q == "" default_options = {:limit => 10, :page => 1} options = default_options.merge options options[:offset] = options[:limit] * (options.delete(:page).to_i-1) results = Person.find_by_contents(q, options) return [results.total_hits, results] end end class Organisation < ActiveRecord::Base acts_as_ferret :additional_fields => [:address_address] belongs_to :address has_many :documents has_many :person_organisations, :dependent => true has_many :persons, :through => :person_organisations def address_address return address.address end def self.full_text_search(q, options = {}) return nil if q.nil? 
or q=="" default_options = {:limit => 10, :page => 1} options = default_options.merge options options[:offset] = options[:limit] * (options.delete(:page).to_i-1) results = Organisation.find_by_contents(q, options) return [results.total_hits, results] end end class Document < ActiveRecord::Base acts_as_ferret :additional_fields => [:organisation_name, :topic_titles] has_many :document_topics, :dependent => true has_many :topics, :through => :document_topics belongs_to :organisation def topic_titles topics.collect { |topic| topic.title }.join ' ' end def organisation_name return organisation.name end def self.full_text_search(q, options = {}) return nil if q.nil? or q == "" default_options = {:limit => 10, :page => 1} options = default_options.merge options options[:offset] = options[:limit] * (options.delete(:page).to_i-1) results = Document.find_by_contents(q, options) return [results.total_hits, results] end end === -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Nov 21 04:49:52 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 10:49:52 +0100 Subject: [Ferret-talk] aaf and stop words; query parser In-Reply-To: <6ae3f669fcd35a2104ceb4f2d0cca6a5@ruby-forum.com> References: <002f01c6fdc5$a009f6b0$0202a8c0@again> <20061101172731.GA16601@cordoba.webit.de> <002801c6fde1$8dcb2ed0$0202a8c0@again> <1e470bad1a8dc04e8a77380e9dba3224@ruby-forum.com> <8edde78d5bd915d5d39d93582a0f340f@ruby-forum.com> <001901c702c3$fe314ab0$0202a8c0@again> <20061108085536.GI14929@cordoba.webit.de> <6ae3f669fcd35a2104ceb4f2d0cca6a5@ruby-forum.com> Message-ID: <20061121094952.GH14323@cordoba.webit.de> Hi! On Sat, Nov 18, 2006 at 04:29:43PM +0100, Charlie Hubbard wrote: > Jens, > > I'm seeing this same behavior as Curtis, but here is how I"m building my > index: > > acts_as_ferret( { :additional_fields => [:content] } ) > > See my other thread for some observations from what I initially tested. 
> > http://www.ruby-forum.com/topic/84909 > > However, when I tried to reproduce this using just ferret I couldn't. > Any ideas? Yes, I think it's a Ferret bug that was introduced some time after 0.10.1. Have a look at this script: http://pastie.caboo.se/22886 This reproduces the problem by adding an untokenized field to the index. If there is no untokenized field, everything is fine. As AAF uses untokenized fields to store IDs and class names, the problem is always present. I checked the following versions of Ferret: working: 0.10.1 not working: 0.10.9, 0.10.11, 0.10.13 I already tried to contact Dave about this, but he still seems to be offline. Hope he's fine and back soon to help us out here ;-) cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 21 05:00:25 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 11:00:25 +0100 Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing' In-Reply-To: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> References: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> Message-ID: <20061121100025.GI14323@cordoba.webit.de> On Fri, Nov 17, 2006 at 12:57:48PM +0100, Tobias Rademacher wrote: > Hey Folks, > > after following the instructions for tweaking Typo to use rather ferret > than DB queries to search article I got a strange NoMethod error when > starting the console or the server scripts. > > /usr/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/base.rb:1129:in > `method_missing':NoMethodError: undefined method `acts_as_ferret' for > Content:Class > > As you can see I'm using activerecord version 1.14.4 together with rails > 1.1, ferret 0.10.13 and typo 4.0.3.
> > This is the directory structure listing of my acs_as_ferret plugin: > > -rw-r--r-- 1 init.rb > drwxr-xr-x 2 lib > -rw-r--r-- 1 LICENSE > -rw-r--r-- 1 rakefile > -rw-r--r-- 1 README > > I installed the pluging with this command line operation > > script/plugin install > svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret > > Are there any hints? Something I mised? Traps or pitfalls? This is really strange. Looks like the plugin isn't loaded at all. Maybe Typo is doing something strange there ? The instructions for integrating aaf wih Typo are rather old, and I don't use Typo anymore. So I have no idea what might have changed in this area... I'd try to find out if init.rb is loaded at all at first, and then go on step by step to see where things go wrong. Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Tue Nov 21 08:23:12 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 21 Nov 2006 14:23:12 +0100 Subject: [Ferret-talk] Starting from scratch Message-ID: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> I have the following models: === class Person < ActiveRecord::Base has_many :person_organisations, :dependent => true has_many :organisations, :through => :person_organisations has_many :person_categories, :dependent => true has_many :categories, :through => :person_categories end class Category < ActiveRecord::Base has_many :person_categories, :dependent => true has_many :persons, :through => :person_categories end class Organisation < ActiveRecord::Base has_many :documents has_many :person_organisations, :dependent => true has_many :persons, :through => :person_organisations end class Document < ActiveRecord::Base belongs_to :organisation has_many :document_topics, :dependent => true has_many :topics, :through => :document_topics end class Topic < 
ActiveRecord::Base has_many :document_topics, :dependent => true has_many :documents, :through => :document_topics end === I'd like to be able to search for: * A person by using an organisation name or by using a category name (as well as the person attributes - surname etc.). * A document using topic names and organisation names (as well as the document attributes - title etc.). My first attempt at this is here (http://www.ruby-forum.com/topic/88678) -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Tue Nov 21 11:44:31 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 21 Nov 2006 17:44:31 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <20061120150324.GA14323@cordoba.webit.de> References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> Message-ID: <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> Jens Kraemer wrote: > On Mon, Nov 20, 2006 at 03:17:15PM +0100, Matthew Planchant wrote: >> method `name' for nil:NilClass >> Error retrieving value for field preferred_address_address: undefined >> method `address' for nil:NilClass >> Error retrieving value for field primary_organisation_name: undefined >> method `name' for nil:NilClass >> >> Could this be the cause of rebulid missing out some of the data? > > seems your primary_organisation and preferred_address are nil, indeed. So what does this mean? Is this likely to bring the rebuild process to an end prematurely? -- Posted via http://www.ruby-forum.com/. 
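The "undefined method `name' for nil:NilClass" errors quoted above come from calling a method on an association that is nil. A minimal plain-Ruby sketch of the failure and one way to guard against it follows; the Struct and class here are stand-ins for the ActiveRecord models, and the "_unguarded" method name is made up for the comparison.

```ruby
# Stand-in for the associated model; only the name attribute matters here.
Organisation = Struct.new(:name)

class Person
  attr_accessor :primary_organisation

  # Unguarded: raises NoMethodError when the association is nil.
  def primary_organisation_name_unguarded
    primary_organisation.name
  end

  # Guarded: returns nil, so the field is simply left empty for that record.
  def primary_organisation_name
    primary_organisation ? primary_organisation.name : nil
  end
end
```

An explicit nil check like this only covers the missing association, while a blanket rescue would also hide unrelated errors raised inside the method.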
From matt at planchant.co.uk Tue Nov 21 12:08:17 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 21 Nov 2006 18:08:17 +0100 Subject: [Ferret-talk] Starting from scratch In-Reply-To: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> Message-ID: <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> How does this look: === class Person < ActiveRecord::Base acts_as_ferret :additional_fields => [:organisation_names, :category_titles] has_many :person_categories, :dependent => true has_many :categories, :through => :person_categories has_many :person_organisations, :dependent => true has_many :organisations, :through => :person_organisations def organisation_names organisations.collect { |organisation| organisation.name }.join ' ' end def category_titles categories.collect { |category| category.name }.join ' ' end end class Category < ActiveRecord::Base has_many :person_categories, :dependent => true has_many :persons, :through => :person_categories end class Organisation < ActiveRecord::Base acts_as_ferret has_many :documents has_many :person_organisations, :dependent => true has_many :contacts, :through => :person_organisations end class Document < ActiveRecord::Base acts_as_ferret :additional_fields => [:organisation_name] belongs_to :organisation def organisation_name return organisation.name end end === My rebuild is still ending prematurely with only half of the data indexed. -- Posted via http://www.ruby-forum.com/.
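A plain-Ruby sketch of joining the names of associated records into one indexed string, as the additional fields above are meant to do. The function takes the collection as an argument so it runs without ActiveRecord, and Category here is a Struct stand-in. Note that the block has to call name on the block variable (category), not on the collection itself.

```ruby
# Stand-in for the associated model.
Category = Struct.new(:name)

# Hypothetical stand-in for Person#category_titles, written as a function
# over the collection. compact drops records whose name is nil.
def category_titles(categories)
  categories.collect { |category| category.name }.compact.join(' ')
end

puts category_titles([Category.new('Press'), Category.new('Staff')])  # Press Staff
puts category_titles([]).inspect                                       # ""
```

An empty collection yields an empty string and does not raise, so has_many fields need less guarding than belongs_to ones.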
From kraemer at webit.de Tue Nov 21 12:48:05 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 18:48:05 +0100 Subject: [Ferret-talk] Index only partially built In-Reply-To: <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> Message-ID: <20061121174805.GA9110@cordoba.webit.de> On Tue, Nov 21, 2006 at 05:44:31PM +0100, Matthew Planchant wrote: > Jens Kraemer wrote: > > On Mon, Nov 20, 2006 at 03:17:15PM +0100, Matthew Planchant wrote: > >> method `name' for nil:NilClass > >> Error retrieving value for field preferred_address_address: undefined > >> method `address' for nil:NilClass > >> Error retrieving value for field primary_organisation_name: undefined > >> method `name' for nil:NilClass > >> > >> Could this be the cause of rebulid missing out some of the data? > > > > seems your primary_organisation and preferred_address are nil, indeed. > > So what does this mean? Is this likely to bring the rebuild process to > an end prematurely? should not, but you should handle the case your relationship is nil, i.e.: def primary_organisation_name primary_organisation.name rescue nil end Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 21 12:49:26 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 18:49:26 +0100 Subject: [Ferret-talk] Starting from scratch In-Reply-To: <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> Message-ID: <20061121174926.GB9110@cordoba.webit.de> On Tue, Nov 21, 2006 at 06:08:17PM +0100, Matthew Planchant wrote: > How does this look: > > === > class Person < ActiveRecord::Base > acts_as_ferret :additional_fields => [:organisation_names, > :category_titles] > > has_many :person_categories, :dependent => true > has_many :categories, :through => :person_categories > > has_many :person_organisations, :dependent => true > has_many :organisations, :through => :person_organisations > > def organisation_names > organisations.collect { |organisation| organisation.name }.join ' ' > end > > def category_titles > categories.collect { |category| categories.name }.join ' ' > end > > end > > class Category < ActiveRecord::Base > has_many :person_categories, :dependent => true > has_many :persons, :through => :person_categories > end > > class Organisation < ActiveRecord::Base > acts_as_ferret > > has_many :documents > > has_many :person_organisations, :dependent => true > has_many :contacts, :through => :person_organisations > end > > class Document < ActiveRecord::Base > acts_as_ferret :additional_fields => [:organisation_name] > > belongs_to :organisation > > def organisation_name > return organisation.name > end > end > === > > My rebuild is still ending prematurely with only half of the data > indexed. ok, so what exactly do you do to rebuild your index, and what searches do you run to check for completeness of your indexes ? Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 21 12:57:08 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 21 Nov 2006 18:57:08 +0100 Subject: [Ferret-talk] Request for separate AAF list In-Reply-To: <83174290-8017-44B6-9827-CEC15EFA1FE6@gmx.net> References: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> <20061120105047.GE22508@cordoba.webit.de> <83174290-8017-44B6-9827-CEC15EFA1FE6@gmx.net> Message-ID: <20061121175708.GC9110@cordoba.webit.de> On Mon, Nov 20, 2006 at 05:44:00PM +0100, Andreas Korth wrote: > Hi there! > > I'd like to suggest setting up a separate mailing list for the > ActsAsFerret plugin since the majority of posts are AAF-related. This > results in a pretty low signal-to-noise ratio for folks not > interested in ActsAsFerret. I really prefer low-traffic lists with a > clear focus. One can still subscribe to both lists and cross-post if > appropriate. I'm no big fan of forking the list, since often it's hard to decide for people if something is an aaf issue or a ferret one. I rather don't want to send people from one list to another all the time, just to keep messages on topic. In the end everybody would end up cross-posting and I'd get every mail twice... What about tagging messages in the subject, like [AAF] for aaf-related messages? So everybody not interested in aaf could skip those messages or filter them away. However, if there's a strong demand for a separate aaf mailing list, I could set up one, of course. cheers, Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From andreas.korth at gmx.net Tue Nov 21 13:40:50 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Tue, 21 Nov 2006 19:40:50 +0100 Subject: [Ferret-talk] Request for separate AAF list In-Reply-To: <20061121175708.GC9110@cordoba.webit.de> References: <14ae0b100f9bb1ed70ef27c4c3519afd@ruby-forum.com> <20061120105047.GE22508@cordoba.webit.de> <83174290-8017-44B6-9827-CEC15EFA1FE6@gmx.net> <20061121175708.GC9110@cordoba.webit.de> Message-ID: On 21.11.2006, at 18:57, Jens Kraemer wrote: >> I'd like to suggest setting up a separate mailing list for the >> ActsAsFerret plugin since the majority of posts are AAF-related. This >> results in a pretty low signal-to-noise ratio for folks not >> interested in ActsAsFerret. I really prefer low-traffic lists with a >> clear focus. One can still subscribe to both lists and cross-post if >> appropriate. > > I'm no big fan of forking the list, since often it's hard to decide > for > people if something is an aaf issue or a ferret one. I rather don't > want > to send people from one list to another all the time, just to keep > messages on topic. > > In the end everybody would end up cross-posting and I'd get every mail > twice... As a general rule, people using AAF should post to the AAF list. If it turns out to be a Ferret issue, this can be posted to Ferret-talk. I think these cases are quite rare. > What about tagging messages in the subject, like [AAF] for aaf-related > messages? So everybody not interested in aaf could skip those messages > or filter them away. That would be fine for me. But since this relies on posting discipline, I doubt it will work. > However, if there's a strong demand for a separate aaf mailing list, I > could set up one, of course. 
I think a single request is not considered a "strong demand" ;) I don't want to split the community, it's just that the AAF-related traffic on this list seems overwhelming. If anyone feels the same way, feel free to support my request by posting a short note. Thanks, Andy From matt at planchant.co.uk Tue Nov 21 18:10:45 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Wed, 22 Nov 2006 00:10:45 +0100 Subject: [Ferret-talk] Starting from scratch In-Reply-To: <20061121174926.GB9110@cordoba.webit.de> References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> Message-ID: >> My rebuild is still ending prematurely with only half of the data >> indexed. > > ok, so what exactly do you do to rebuild your index, and what searches > do you run to check for completeness of your indexes ? ruby script/console production Person.rebuild_index Organisation.rebuild_index Document.rebuild_index Then I try a find_by_contents on some of the people (using any of the fields, e.g. surname). I can find people up to about two thirds of the way through the data (in id order) but the final third aren't found. If I edit, say, the last record (which previously could not be found using a search) then I can see in the log that the edited data is added to the index, and when I search for it, it is found. So for some reason it looks as though a large part of the data isn't being added to the index. -- Posted via http://www.ruby-forum.com/.
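One way to narrow down which records a rebuild might be choking on is to call every indexed field method on every record and collect the ones that raise. The sketch below is plain Ruby: find_field_failures is a hypothetical helper, not an acts_as_ferret call, and it only assumes each record responds to id and to the listed methods.

```ruby
# Hypothetical diagnostic: call each field method on each record and
# collect [record id, field, error class] for every call that raises.
def find_field_failures(records, field_methods)
  failures = []
  records.each do |record|
    field_methods.each do |field|
      begin
        record.send(field)
      rescue StandardError => e
        failures << [record.id, field, e.class]
      end
    end
  end
  failures
end
```

In the console this could be run over something like Person.find(:all) with the model's additional field names; an empty result would suggest the field methods are not where the rebuild stops.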
From matt at planchant.co.uk Tue Nov 21 18:13:49 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Wed, 22 Nov 2006 00:13:49 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To: <20061121174805.GA9110@cordoba.webit.de>
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Tue, Nov 21, 2006 at 05:44:31PM +0100, Matthew Planchant wrote:
>> > seems your primary_organisation and preferred_address are nil, indeed.
>>
>> So what does this mean? Is this likely to bring the rebuild process to
>> an end prematurely?
>
> should not, but you should handle the case your relationship is nil,
> i.e.:
>
> def primary_organisation_name
>   primary_organisation.name rescue nil
> end

Ok. Thanks. What does this do?

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Tue Nov 21 18:16:13 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Wed, 22 Nov 2006 00:16:13 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To: <20061121174805.GA9110@cordoba.webit.de>
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de>
Message-ID: <0f4ce669d9ea18c578716ea864ab8d48@ruby-forum.com>

Jens Kraemer wrote:
> On Tue, Nov 21, 2006 at 05:44:31PM +0100, Matthew Planchant wrote:
>> > seems your primary_organisation and preferred_address are nil, indeed.
>>
>> So what does this mean? Is this likely to bring the rebuild process to
>> an end prematurely?
>
> should not, but you should handle the case your relationship is nil,
> i.e.:
>
> def primary_organisation_name
>   primary_organisation.name rescue nil
> end

How does this work for relationships such as:

def organisation_names
  organisations.collect { |organisation| organisation.name }.join ' '
end

?

--
Posted via http://www.ruby-forum.com/.

From charlie.hubbard at gmail.com Tue Nov 21 19:02:26 2006
From: charlie.hubbard at gmail.com (Charlie Hubbard)
Date: Wed, 22 Nov 2006 01:02:26 +0100
Subject: [Ferret-talk] Away for a week
In-Reply-To:
References:
Message-ID: <29a93adba32e7e71005200ff7f539b53@ruby-forum.com>

Benjamin Krause wrote:
> On 16.11.2006, at 21:18, Marvin Humphrey wrote:
>
>>
>> On Oct 26, 2006, at 9:24 AM, David Balmain wrote:
>>
>>> I'm off to Vietnam for a week on my way home to Australia so I'll be
>>> off the list for a while.
>>
>> Anybody know what's up with Dave? He indicated he'd be gone for a
>> week, but that was three weeks ago.

I followed up with him by email a while back. He sent me a reply on
Nov 2nd saying he was back from Vietnam. I haven't heard from him since.
Maybe he just got tired of us. ;-)

Charlie

--
Posted via http://www.ruby-forum.com/.

From Neville.Burnell at bmsoft.com.au Tue Nov 21 23:42:46 2006
From: Neville.Burnell at bmsoft.com.au (Neville Burnell)
Date: Wed, 22 Nov 2006 15:42:46 +1100
Subject: [Ferret-talk] Help with Multiple Readers, 1 Writer scenario
Message-ID: <126EC586577FD611A28E00A0C9A0375886EF6F@maui.bmsoft.com.au>

Some time back in September, [sorry to be so slow], Dave wrote:

> When you open an IndexReader on the index it is opened up on
> that particular version (or state) of the index. So any
> operations on the IndexReader (like searches) will only show
> what was in the index at the time you opened it. Any modifications
> to the index (usually through an IndexWriter) that occur after
> you open the IndexReader will not appear in your searches.
> So to keep searches up to date you need to close and reopen your
> IndexReader every time you commit changes to the index.

Would it be possible to enhance IndexReader to report the "version" of
the index it is using, and to report if a newer version of the index
exists, e.g. perhaps #version and #current_version?

This would allow a long running IndexReader to detect that a new index
version is available, which is a problem if the IndexWriter updates the
index from another process. For example, I am planning to divide my
application into two apps, one which services reader requests, and the
other which periodically updates the index if needed.

Kind Regards

Neville

From Neville.Burnell at bmsoft.com.au Wed Nov 22 00:28:33 2006
From: Neville.Burnell at bmsoft.com.au (Neville Burnell)
Date: Wed, 22 Nov 2006 16:28:33 +1100
Subject: [Ferret-talk] Help with Multiple Readers, 1 Writer scenario
Message-ID: <126EC586577FD611A28E00A0C9A0375886EF71@maui.bmsoft.com.au>

Oops, Dave already answered this!

> Each version of the index has an internal version number
> and there is an IndexReader#latest? method to determine
> if the version of the index that you are reading is
> the current version.

Silly me!

> -----Original Message-----
> From: Neville Burnell
> Sent: Wednesday, 22 November 2006 3:43 PM
> To: 'ferret-talk at rubyforge.org'
> Subject: RE: [Ferret-talk] Help with Multiple Readers, 1
> Writer scenario
>
> Some time back in September, [sorry to be so slow], Dave wrote:
>
> > When you open an IndexReader on the index it is opened up on that
> > particular version (or state) of the index. So any operations on the
> > IndexReader (like searches) will only show what was in the index at
> > the time you opened it. Any modifications to the index (usually
> > through an IndexWriter) that occur after you open the IndexReader
> > will not appear in your searches.
> >
> > So to keep searches up to date you need to close and reopen your
> > IndexReader every time you commit changes to the index.
>
> Would it be possible to enhance IndexReader to report the
> "version" of the index it is using, and to report if a newer
> version of the index exists, eg perhaps #version and #current_version?
>
> This would allow a long running IndexReader to detect that a
> new index version is available, which is a problem if the
> IndexWriter updates the index from another process. For
> example, I am planning to divide my application into two
> apps, one which services reader requests, and the other which
> periodically updates the index if needed.
>
> Kind Regards
>
> Neville

From matt at planchant.co.uk Wed Nov 22 07:27:33 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Wed, 22 Nov 2006 13:27:33 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To: <0f4ce669d9ea18c578716ea864ab8d48@ruby-forum.com>
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de> <0f4ce669d9ea18c578716ea864ab8d48@ruby-forum.com>
Message-ID: <2942deed59f7ef128b366e0e9b79fdfa@ruby-forum.com>

Has anyone else had the rebuild ending prematurely?

--
Posted via http://www.ruby-forum.com/.
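On the organisation_names question raised in this thread: a minimal sketch
of how the same nil tolerance might be applied to a collection. This is only
an illustration under stated assumptions, not code from acts_as_ferret.
Organisation and Person below are plain Ruby stand-ins for the poster's
ActiveRecord models, and the nil handling shown is just one possible choice.

```ruby
# Plain-Ruby stand-ins for the ActiveRecord models (hypothetical, for illustration only).
Organisation = Struct.new(:name)

class Person
  attr_accessor :primary_organisation, :organisations

  # Single association: when primary_organisation is nil, nil.name raises
  # NoMethodError and the rescue modifier turns that into nil.
  def primary_organisation_name
    primary_organisation.name rescue nil
  end

  # Collection: tolerate a nil collection and nil entries, then drop the
  # nils before joining so the indexed string contains names only.
  def organisation_names
    (organisations || []).map { |organisation| organisation.name rescue nil }.compact.join(' ')
  end
end

person = Person.new
puts person.primary_organisation_name.inspect   # nil
puts person.organisation_names.inspect          # ""

person.organisations = [Organisation.new('Acme'), nil, Organisation.new('Webit')]
puts person.organisation_names                  # Acme Webit
```

Note that the rescue modifier catches any StandardError, so it can also hide
unrelated bugs inside the accessor; an explicit nil check is the stricter
alternative.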
From matt at planchant.co.uk Wed Nov 22 09:45:20 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Wed, 22 Nov 2006 15:45:20 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To: <2942deed59f7ef128b366e0e9b79fdfa@ruby-forum.com>
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de> <0f4ce669d9ea18c578716ea864ab8d48@ruby-forum.com> <2942deed59f7ef128b366e0e9b79fdfa@ruby-forum.com>
Message-ID:

Forgot to mention I'm using Win32. Are there any known issues?

--
Posted via http://www.ruby-forum.com/.

From john at fivesquare.net Wed Nov 22 13:45:03 2006
From: john at fivesquare.net (John Clayton)
Date: Wed, 22 Nov 2006 19:45:03 +0100
Subject: [Ferret-talk] acts_as_ferret, Displaying score on a search results page
Message-ID: <0cef0cdac97b5c1cf88c0f7951e6934e@ruby-forum.com>

Am I missing something, or would I need to use find_id_by_contents and
populate a results list with my own objects (or add 'score' to my AR
instances and populate that) in order to be able to get a score out when
iterating the results (like on a search results page)?

Said another way, it seems find_by_contents does not provide any access
to score in the resulting list. Is this true?

Thanks!

John

--
Posted via http://www.ruby-forum.com/.
From caleb at inforadical.net Wed Nov 22 14:47:42 2006
From: caleb at inforadical.net (Caleb Clausen)
Date: Wed, 22 Nov 2006 11:47:42 -0800
Subject: [Ferret-talk] crash while retrieving term vectors
Message-ID: <4564A95E.8070205@inforadical.net>

This program reliably crashes for me (usually a segfault):

require 'rubygems'
require 'ferret'

reader=Ferret::Index::IndexReader.new ARGV
fields=reader.field_infos.fields
reader.max_doc.times{|n|
  fields.each{|field|
    reader.term_vector(n,field)
  } unless reader.deleted?(n)
  print "."; STDOUT.flush
}

As you can see, it just goes through the index, retrieving all the term
vectors. I imagine term vectors must be enabled in at least one field to
trigger this...

I've seen this problem on two different systems, running debian and
ubuntu. It may well be the result of something I've done wrong, but if
so, I don't know what. If anyone can provide some assistance with or
information about this problem, I'd appreciate it.

From howardmoon at hitcity.com.au Thu Nov 23 00:29:19 2006
From: howardmoon at hitcity.com.au (Pete Royle)
Date: Thu, 23 Nov 2006 06:29:19 +0100
Subject: [Ferret-talk] acts_as_ferret, Displaying score on a search results pag
In-Reply-To: <0cef0cdac97b5c1cf88c0f7951e6934e@ruby-forum.com>
References: <0cef0cdac97b5c1cf88c0f7951e6934e@ruby-forum.com>
Message-ID: <4ea4149d81561c7a12b8e0747ee4623b@ruby-forum.com>

Hi John,

Looks like it's possible in SVN:
http://projects.jkraemer.net/acts_as_ferret/ticket/52

Pete.

--
Posted via http://www.ruby-forum.com/.
From kraemer at webit.de Thu Nov 23 04:12:25 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 10:12:25 +0100
Subject: [Ferret-talk] acts_as_ferret, Displaying score on a search results pag
In-Reply-To: <4ea4149d81561c7a12b8e0747ee4623b@ruby-forum.com>
References: <0cef0cdac97b5c1cf88c0f7951e6934e@ruby-forum.com> <4ea4149d81561c7a12b8e0747ee4623b@ruby-forum.com>
Message-ID: <20061123091225.GD9110@cordoba.webit.de>

On Thu, Nov 23, 2006 at 06:29:19AM +0100, Pete Royle wrote:
> Hi John,
>
> Looks like it's possible in SVN:
> http://projects.jkraemer.net/acts_as_ferret/ticket/52

that's right :)

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Thu Nov 23 04:14:24 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 10:14:24 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To:
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de>
Message-ID: <20061123091424.GE9110@cordoba.webit.de>

On Wed, Nov 22, 2006 at 12:13:49AM +0100, Matthew Planchant wrote:
> Jens Kraemer wrote:
> > On Tue, Nov 21, 2006 at 05:44:31PM +0100, Matthew Planchant wrote:
> >> > seems your primary_organisation and preferred_address are nil, indeed.
> >>
> >> So what does this mean? Is this likely to bring the rebuild process to
> >> an end prematurely?
> >
> > should not, but you should handle the case your relationship is nil,
> > i.e.:
> >
> > def primary_organisation_name
> >   primary_organisation.name rescue nil
> > end
>
> Ok. Thanks. What does this do?

it returns nil in case the expression before the 'rescue' raises an
exception.

Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Thu Nov 23 04:19:34 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 10:19:34 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To:
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de>
Message-ID: <20061123091934.GF9110@cordoba.webit.de>

On Wed, Nov 22, 2006 at 12:10:45AM +0100, Matthew Planchant wrote:
> >> My rebuild is still ending prematurely with only half of the data
> >> indexed.
> >
> > ok, so what exactly do you do to rebuild your index, and what searches
> > do you run to check for completeness of your indexes ?
>
> ruby script/console production
> Person.rebuild_index
> Organisation.rebuild_index
> Document.rebuild_index
>
> Then I try a find_by_contents on some of the people (using any of the
> fields i.e. surname). I can find people up to about two thirds of the
> way through the data (in id order) but the final third aren't found.

ok, to keep things simple please keep trying with only the Person class
for now. what does the log look like if you do Person.rebuild_index ?

What happens if you do the same (with the same data) in development
mode (maybe on another machine) ?

cheers,
Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Thu Nov 23 04:25:16 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 10:25:16 +0100
Subject: [Ferret-talk] crash while retrieving term vectors
In-Reply-To: <4564A95E.8070205@inforadical.net>
References: <4564A95E.8070205@inforadical.net>
Message-ID: <20061123092516.GG9110@cordoba.webit.de>

On Wed, Nov 22, 2006 at 11:47:42AM -0800, Caleb Clausen wrote:
> This program reliably crashes for me (usually a segfault):
>
> require 'rubygems'
> require 'ferret'
>
> reader=Ferret::Index::IndexReader.new ARGV
> fields=reader.field_infos.fields
> reader.max_doc.times{|n|
>   fields.each{|field|
>     reader.term_vector(n,field)
>   } unless reader.deleted?(n)
>   print "."; STDOUT.flush
> }
>
> As you can see, it just goes through the index, retrieving all the term
> vectors. I imagine term vectors must be enabled in at least one field to
> trigger this...
>
> I've seen this problem on two different systems, running debian and
> ubuntu. It may well be the result of something I've done wrong, but if
> so, I don't know what. If anyone can provide some assistance with or
> information about this problem, I'd appreciate it.

hm, I have this snippet running here without problems (Ferret 0.10.13 on
Debian):

i = Ferret::I.new
i << 'only a short test'
i << 'another document'
reader = i.reader
fields = reader.field_infos.fields
reader.max_doc.times{|n|
  fields.each{|field|
    puts reader.term_vector(n,field)
  } unless reader.deleted?(n)
  print "."; STDOUT.flush
}

Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 23 04:41:51 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 10:41:51 +0100
Subject: [Ferret-talk] Index only partially built
In-Reply-To: <20061123091424.GE9110@cordoba.webit.de>
References: <20061120125020.GG22508@cordoba.webit.de> <2d58413ff45dffdd143309a510fa9c60@ruby-forum.com> <20061120150324.GA14323@cordoba.webit.de> <4a06d98c59215db1e1e91009aaa70198@ruby-forum.com> <20061121174805.GA9110@cordoba.webit.de> <20061123091424.GE9110@cordoba.webit.de>
Message-ID: <0ca774b7a050b17e0cb095f8f2197cbe@ruby-forum.com>

Jens Kraemer wrote:
> On Wed, Nov 22, 2006 at 12:13:49AM +0100, Matthew Planchant wrote:
>> > def primary_organisation_name
>> >   primary_organisation.name rescue nil
>> > end
>>
>> Ok. Thanks. What does this do?
>
> it returns nil in case the expression before the 'rescue' raises an
> exception.

Ah I see. Thanks for the explanation. Is there a way of catching the
exception here:

def organisation_names
  organisations.collect { |organisation| organisation.name }.join ' '
end

I take it you have no experience of the rebuild ending prematurely?

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Thu Nov 23 04:55:02 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 10:55:02 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <20061123091934.GF9110@cordoba.webit.de>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Wed, Nov 22, 2006 at 12:10:45AM +0100, Matthew Planchant wrote:
>>
>> Then I try a find_by_contents on some of the people (using any of the
>> fields i.e. surname).
>> I can find people up to about two thirds of the
>> way through the data (in id order) but the final third aren't found.
>
> ok, to keep things simple please keep trying with only the Person class
> for now. what does the log look like if you do Person.rebuild_index ?

Good idea. I'll give this a go. I'll try:

class Person < ActiveRecord::Base
  acts_as_ferret
end

> What happens if you do the same (with the same data) in development
> mode (maybe on another machine) ?

Same thing happens in development mode.

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Thu Nov 23 05:28:45 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 11:28:45 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To:
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de>
Message-ID: <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com>

OK. Here is what is happening. When the indexing starts it selects the
first 1000 records to add to the index. These seem to be added to the
index. When it has added the 1000th record another select appears in the
log file to get the rest of the records (there are 1561 records in the
table):

SELECT * FROM (SELECT TOP 561 * FROM (SELECT TOP 1561 * FROM persons) AS
tmp1 ) AS tmp2

However this select doesn't get the records from 1001 to 1561. It gets 1
to 1000. So these first 500 or so records are added to the index twice
but the final 500 are never added.

--
Posted via http://www.ruby-forum.com/.
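One way to sidestep an offset-based second batch like the one in the log
above is to page on the primary key instead. The following is a minimal
sketch of that idea, not the batching code acts_as_ferret actually uses.
The in-memory TABLE, the Row struct and the fetch_batch helper are
hypothetical stand-ins; the ActiveRecord call named in the comment is only
a rough equivalent for Rails of that era.

```ruby
# In-memory stand-in for the persons table (hypothetical data, ids 1..1561).
Row = Struct.new(:id)
TABLE = (1..1561).map { |id| Row.new(id) }

# Hypothetical helper: returns up to batch_size rows with id > last_id,
# ordered by id. With ActiveRecord this would roughly correspond to
#   Person.find(:all, :conditions => ['id > ?', last_id],
#               :order => 'id', :limit => batch_size)
def fetch_batch(last_id, batch_size)
  TABLE.select { |row| row.id > last_id }.sort_by { |row| row.id }.first(batch_size)
end

# Yields every row exactly once. Each batch starts after the highest id
# seen so far, so no :offset is needed.
def each_in_batches(batch_size = 1000)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }
    last_id = batch.last.id
  end
end

seen = []
each_in_batches(1000) { |row| seen << row.id }
puts seen.size        # 1561
puts seen.uniq.size   # 1561
```

Because only a :limit and an id condition are involved, this approach does
not depend on how the database adapter emulates :offset.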
From matt at planchant.co.uk Thu Nov 23 05:39:46 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 11:39:46 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com>
Message-ID: <9ab658cb09532ebae2a10c5aeda90abb@ruby-forum.com>

Matthew Planchant wrote:
> However this select doesn't get the records from 1001 to 1561. It gets 1
> to 1000. So these first 500 or so records are added to the index twice
> but the final 500 are never added.

Small mistake here; it should read:

However this select doesn't get the records from 1001 to 1561. It gets 1
to 561. So these first 500 or so records are added to the index twice
but the final 500 are never added.

--
Posted via http://www.ruby-forum.com/.

From matt at planchant.co.uk Thu Nov 23 05:55:12 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 11:55:12 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <9ab658cb09532ebae2a10c5aeda90abb@ruby-forum.com>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <9ab658cb09532ebae2a10c5aeda90abb@ruby-forum.com>
Message-ID:

I have a model with only ~350 records; this seems to have been added as
it should have been. I can find all the records from searching. The
problem seems to occur when there are more than 1000 records (the
batch_size in rebuild_index).

--
Posted via http://www.ruby-forum.com/.
From cameron at cameronyule.com Thu Nov 23 06:07:07 2006
From: cameron at cameronyule.com (Cameron Yule)
Date: Thu, 23 Nov 2006 12:07:07 +0100
Subject: [Ferret-talk] Conditional queries
Message-ID:

Hi. I'm attempting to return a result set which filters Pages by their
active state and their start/end date. Whilst the code I have at the
moment is doing this without problem, some of the Page items may not
have an end date set and I can't see a way of conditionally doing the
query so these are not excluded.

Perhaps showing some code will help explain better;

search_controller.rb

@results = Page.multi_search(
  "active:(true) *:(#{params[:s]}) search_start_date:( <= #{Time.now.strftime("%Y%m%d")} ) search_end_date:( >= #{Time.now.strftime("%Y%m%d")} )",
  [],
  {:offset => @offset, :limit => @limit}
)

Page.rb

def search_start_date
  self.start_at.strftime("%Y%m%d") if self.start_at
end

def search_end_date
  self.end_at.strftime("%Y%m%d") if self.end_at
end

The problem being that I can't figure out how to do the equivalent of:

search_end_date:( IS NULL OR >= #{Time.now.strftime("%Y%m%d")} )",

Thanks for any help/advice,

Cam

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Thu Nov 23 06:11:31 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 12:11:31 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To:
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <9ab658cb09532ebae2a10c5aeda90abb@ruby-forum.com>
Message-ID: <20061123111131.GH9110@cordoba.webit.de>

On Thu, Nov 23, 2006 at 11:55:12AM +0100, Matthew Planchant wrote:
> I have a model with only ~350 records this seems to have been added as
> it should have been. I can find all the records from searching.
> The problem seems to occur when there are more than 1000 records (the
> batch_size in rebuild_index).

so does setting the batch size to a higher value, say 10000, work for
you ?

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 23 06:14:10 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 12:14:10 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To:
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <9ab658cb09532ebae2a10c5aeda90abb@ruby-forum.com>
Message-ID: <7c16cfd0336383abdf1f3ea6cb367c47@ruby-forum.com>

By changing batch_size in rebuild_index to 2000 I've been able to index
all my records (I only have ~1500 records).

There may be a problem with this patch:

http://projects.jkraemer.net/acts_as_ferret/ticket/24

I'm using MS SQL Server. Should this make a difference?

--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Thu Nov 23 07:49:18 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 13:49:18 +0100
Subject: [Ferret-talk] Conditional queries
In-Reply-To:
References:
Message-ID: <20061123124918.GI9110@cordoba.webit.de>

On Thu, Nov 23, 2006 at 12:07:07PM +0100, Cameron Yule wrote:
> Hi. I'm attempting to return a result set which filters Pages by their
> active state and their start/end date. Whilst the code I have at the
> moment is doing this without problem, some of the Page items may not
> have an end date set and I can't see a way of conditionally doing the
> query so these are not excluded.
> [..]
> The problem being that I can't figure out how to do the equivalent of:
>
> search_end_date:( IS NULL OR >= #{Time.now.strftime("%Y%m%d")}
> )",

indexing some special value like 99999999 for the end_date of records
not having an end date could do the trick.

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Thu Nov 23 07:54:00 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 13:54:00 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com>
Message-ID: <20061123125400.GJ9110@cordoba.webit.de>

On Thu, Nov 23, 2006 at 11:28:45AM +0100, Matthew Planchant wrote:
> OK. Here is what is happening. When the indexing starts it selects the
> first 1000 records to add to the index. These seem to be added to the
> index. When it has added the 1000th record another select appears in the
> log file to get the rest of records (There are 1561 records in the
> table):
>
>
> SELECT * FROM (SELECT TOP 561 * FROM (SELECT TOP 1561 * FROM persons) AS
> tmp1 ) AS tmp2

ehm, what kind of database is this ? looks really strange ;-)

> However this select doesn't get the records from 1001 to 1561. It gets 1
> to 1000. So these first 500 or so records are added to the index twice
> but the final 500 are never added.

is it possible the :limit and :offset options of ActiveRecord are not
supported or buggy for your kind of database ?

what do you get when calling

Person.find(:all, :limit => 1000, :offset => 1000)

on the console ?

Jens

--
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 23 08:20:14 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 14:20:14 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <20061123125400.GJ9110@cordoba.webit.de>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <20061123125400.GJ9110@cordoba.webit.de>
Message-ID:

I think we have the problem.

http://dev.rubyonrails.org/ticket/6254

--
Posted via http://www.ruby-forum.com/.

From cameron at cameronyule.com Thu Nov 23 09:30:44 2006
From: cameron at cameronyule.com (Cameron Yule)
Date: Thu, 23 Nov 2006 15:30:44 +0100
Subject: [Ferret-talk] Conditional queries
In-Reply-To: <20061123124918.GI9110@cordoba.webit.de>
References: <20061123124918.GI9110@cordoba.webit.de>
Message-ID: <420e44d69021ea6723d72a567a572e9d@ruby-forum.com>

Jens Kraemer wrote:
>
> indexing some special value like 99999999 for the end_date of records
> not having an end data could to the trick.
>
> Jens
>

Hi Jens, thanks for the reply. I'd considered something along those
lines, but was curious if there was a neater method actually using the
AAF querying syntax itself. Still, I think I'll use your solution in the
meantime.

Many thanks

--
Posted via http://www.ruby-forum.com/.
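The sentinel suggestion above could look roughly like the following. This is
a minimal sketch under stated assumptions: Page here is a plain Ruby stand-in
for the ActiveRecord model from the original question, and the NO_END_DATE
constant is a name introduced purely for illustration.

```ruby
# Plain-Ruby stand-in for the Page model (hypothetical; the real one is an
# ActiveRecord class using acts_as_ferret).
class Page
  # Sentinel that sorts after any real YYYYMMDD value (name is an assumption).
  NO_END_DATE = '99999999'

  attr_accessor :start_at, :end_at

  def search_start_date
    start_at.strftime('%Y%m%d') if start_at
  end

  # Pages without an end date get the sentinel, so a range condition such as
  # search_end_date >= today still matches them.
  def search_end_date
    end_at ? end_at.strftime('%Y%m%d') : NO_END_DATE
  end
end

open_ended = Page.new
dated = Page.new
dated.end_at = Time.local(2006, 12, 31)

today = Time.local(2006, 11, 23).strftime('%Y%m%d')
puts open_ended.search_end_date            # 99999999
puts dated.search_end_date                 # 20061231
puts open_ended.search_end_date >= today   # true
```

Since the indexed value changes, records indexed before such a change would
have to be re-indexed before the sentinel shows up in searches.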
From kraemer at webit.de Thu Nov 23 12:50:36 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 23 Nov 2006 18:50:36 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To:
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <20061123125400.GJ9110@cordoba.webit.de>
Message-ID: <20061123175036.GK9110@cordoba.webit.de>

On Thu, Nov 23, 2006 at 02:20:14PM +0100, Matthew Planchant wrote:
> I think we have the problem.
>
> http://dev.rubyonrails.org/ticket/6254

does it work for you with that patch applied ?

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 23 14:00:13 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 23 Nov 2006 20:00:13 +0100
Subject: [Ferret-talk] Starting from scratch
In-Reply-To: <20061123175036.GK9110@cordoba.webit.de>
References: <12084c8470d853301ed7b2d3b590b156@ruby-forum.com> <0df8126d09c420dc1532f2cabb254350@ruby-forum.com> <20061121174926.GB9110@cordoba.webit.de> <20061123091934.GF9110@cordoba.webit.de> <1c219b564d7630c8c6dd88dd4edd5a92@ruby-forum.com> <20061123125400.GJ9110@cordoba.webit.de> <20061123175036.GK9110@cordoba.webit.de>
Message-ID: <933cf982ade89a6cef7203fbd199a0e3@ruby-forum.com>

Jens Kraemer wrote:
> On Thu, Nov 23, 2006 at 02:20:14PM +0100, Matthew Planchant wrote:
>> I think we have the problem.
>>
>> http://dev.rubyonrails.org/ticket/6254
>
> does it work for you with that patch applied ?

You mean if I reverse the patch and go back to not using batches? I
don't know, I haven't tried that yet, but I assume that it will, as it's
the SQL generated to create the batches which doesn't work with
MS SQL Server.
For the moment I've set the batch size to 5000 (i.e. greater than the
number of records I have in any of my models) and it works.

--
Posted via http://www.ruby-forum.com/.

From caleb at inforadical.net Thu Nov 23 15:08:03 2006
From: caleb at inforadical.net (Caleb Clausen)
Date: Thu, 23 Nov 2006 12:08:03 -0800
Subject: [Ferret-talk] crash while retrieving term vectors
Message-ID: <4565FFA3.9090003@inforadical.net>

Jens Kraemer wrote:
> hm, I have this snippet running here without problems (Ferret 0.10.13 on
> Debian):

[snippet snipped]

Jens, thanks for trying it. Your snippet works perfectly for me as well,
so I modified it until I could get it to fail again. I should have
mentioned that I'm working with sizable indexes (10000-1000000 entries).
Anyway, here's another version that crashes for me. Here I build an
index from my system's man files:

require 'rubygems'
require 'ferret'
require 'zlib'

i = Ferret::I.new #:path=>'temp_index'

manfiles=Dir["/usr/share/man/man*/*.gz"]
manfiles.each{|mf|
  fd=Zlib::GzipReader.open(mf)
  i<<{:text=>fd.read}
  fd.close
}

reader=i.reader
reader.max_doc.times{|n|
  reader.term_vector(n,:text) unless reader.deleted?(n)
  print "."; STDOUT.flush
}

This problem seems to be fairly sensitive to initial conditions.
Printing out each term vector as it is found, like you did in your
snippet, makes the problem go away.

I also have 0.10.13, but the problem seems to be common to all the 0.10
series.

From matt at mattschnitz.com Thu Nov 23 18:25:12 2006
From: matt at mattschnitz.com (Matt Schnitz)
Date: Thu, 23 Nov 2006 15:25:12 -0800
Subject: [Ferret-talk] Two repeatable crash bugs in Ferret proper
Message-ID: <497cc4a0611231525t8f4e171ua65756c62a6f5428@mail.gmail.com>

Hi guys! Been reading this list for a while.

I have two repeatable Ferret crash bugs, both seg faults.

1. The first bug appears to seg fault Ferret when you use quotes in a
   search argument (eg 'file_name:"file name"')
2.
   The second bug appears to seg fault Ferret when you attempt to index
   text with very long tokens (above 256 chars). It may have something
   to do with URL characters and the default analyzer, since other very
   long tokens parse successfully.

The code and my system specs are below. I've sent the first one to
David, but the second I haven't. He recommended I talk to you guys.

They're both relatively easy to work around. So don't worry about me.
I'd fix them in the C++ myself, but I'm not really geared up for that
environment. I figure someone here is better equipped to handle this.

Schnitz

--- First bug: quotes in search terms

#!/usr/bin/ruby

require 'rubygems'
require 'ferret'

# Strangely, the omit_norms is required to exercise the bug.
field_infos = Ferret::Index::FieldInfos.new(:index => :omit_norms)
field_infos.add_field( :phile_id )
field_infos.add_field( :file_name )

index = Ferret::Index::Index.new( :field_infos => field_infos,
                                  :path =>'./exercisequotebugindex',
                                  :create => true )

index << { :file_name => "[new] Yo La Tengo - Beanbag Chair.mp3", :phile_id => "428570" }

# Works
docs = index.search( "file_name:'yo la'" )
puts index[docs.hits[0].doc][:phile_id]
docs = index.search( "file_name:\"yo\"" )
puts index[docs.hits[0].doc][:phile_id]

# Does not work; will seg fault
docs = index.search( "file_name:\"yo la\"" )

# This doesn't either
docs = index.search( %Q!file_name:"yo la"! )

--- Second bug: long tokens (?)

#!/usr/bin/ruby

require 'rubygems'
require 'ferret'

# Strangely, the omit_norms is required to exercise the bug.
field_infos = Ferret::Index::FieldInfos.new()
field_infos.add_field( :comment_id )
field_infos.add_field( :comment_body )

index = Ferret::Index::Index.new( :field_infos => field_infos,
                                  :path =>'./exercisequotebugindex',
                                  :create => true )

index << { :comment_id => 1, :comment_body => "weird URL, huh?
[a href=\" http://www.hotelbogotaberlin.com/bogota_e/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html/index_e.html\"]" } --- My system specs Ubuntu 6.06.1 LTS (under VMWare) ruby 1.8.4 (2005-12-24) [i486-linux] ferret-0.10.12 gem_plugin- 0.2.1 rubygems-update-0.9.0 rails-1.1.6 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20061123/0fd67b95/attachment-0001.html From atomgiant at gmail.com Thu Nov 23 18:57:01 2006 From: atomgiant at gmail.com (Tom Davies) Date: Thu, 23 Nov 2006 18:57:01 -0500 Subject: [Ferret-talk] Segmentation Faults Message-ID: Hi, I am using Ferret 0.9.4 and my index appears to be corrupt as any attempt to read or write it causes a segmentation fault. I had been using it with minimal problems for the past few months. Is there a way to fix this without rebuilding the entire index? Since I am on a shared host ferret takes too much memory to rebuild these roughly 8500 records in one go and in the past I have had to rebuild them in batches. Also, is there a way to trap when ferret segfaults using rescue? I have had to disable searching for the time being on my site until I get this worked out since every search will segfault and kill my fcgi. I am actually considering switching to Mysql full text search just for stability reasons, but if there was a good technique for combating these segfaults I would rather stick with Ferret. NOTE: I did try running with ferret 0.10.13 but I was getting a bunch of segfaults just rebuilding my indexes. 
Thanks, Tom Davies http://atomgiant.com http://gifthat.com From toby-wan-kenobi at web.de Fri Nov 24 01:57:40 2006 From: toby-wan-kenobi at web.de (Tobias Rademacher) Date: Fri, 24 Nov 2006 07:57:40 +0100 Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing' In-Reply-To: <20061121100025.GI14323@cordoba.webit.de> References: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> <20061121100025.GI14323@cordoba.webit.de> Message-ID: <8ee65b787ada92d76d3d64269c6eaf0a@ruby-forum.com> Jens Kraemer wrote: > On Fri, Nov 17, 2006 at 12:57:48PM +0100, Tobias Rademacher wrote: > > This is really strange. Looks like the plugin isn't loaded at all. Maybe > Typo is doing something strange there ? The instructions for integrating > aaf wih Typo are rather old, and I don't use Typo anymore. So I have no > idea > what might have changed in this area... This there any alternative for Typo at the moment? > > I'd try to find out if init.rb is loaded at all at first, and then go on > step by step to see where things go wrong. After setting some STDERR and STDOUT in all vendor-plugins I see no reaction wen starting the console script. Maybe the typo-tweaks of config/boot.rb causing some side-effects and so Typo don't load any plugin anymore. Unfortuanalty the typo trac side is down since a couple of month now. So I can't post an issues about this problem at the moment. Toby -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Fri Nov 24 04:04:49 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 24 Nov 2006 10:04:49 +0100 Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing' In-Reply-To: <8ee65b787ada92d76d3d64269c6eaf0a@ruby-forum.com> References: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> <20061121100025.GI14323@cordoba.webit.de> <8ee65b787ada92d76d3d64269c6eaf0a@ruby-forum.com> Message-ID: <20061124090449.GL9110@cordoba.webit.de> On Fri, Nov 24, 2006 at 07:57:40AM +0100, Tobias Rademacher wrote: [..] 
> This there any alternative for Typo at the moment? I switched to mephisto a while ago (http://mephistoblog.com/). > > I'd try to find out if init.rb is loaded at all at first, and then go on > > step by step to see where things go wrong. > > After setting some STDERR and STDOUT in all vendor-plugins I see no > reaction wen starting the console script. > > Maybe the typo-tweaks of config/boot.rb causing some side-effects and so > Typo don't load any plugin anymore. seems so... > Unfortuanalty the typo trac side is down since a couple of month now. So > I can't post an issues about this problem at the moment. one of the reasons for my switch to mephisto ;-) Jens -- webit! Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Fri Nov 24 04:24:49 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 24 Nov 2006 10:24:49 +0100 Subject: [Ferret-talk] Segmentation Faults In-Reply-To: References: Message-ID: <20061124092449.GM9110@cordoba.webit.de> Hi! On Thu, Nov 23, 2006 at 06:57:01PM -0500, Tom Davies wrote: > Hi, > > I am using Ferret 0.9.4 and my index appears to be corrupt as any > attempt to read or write it causes a segmentation fault. I had been > using it with minimal problems for the past few months. > > Is there a way to fix this without rebuilding the entire index? Since > I am on a shared host ferret takes too much memory to rebuild these > roughly 8500 records in one go and in the past I have had to rebuild > them in batches. I don't know of a way to fix the index without rebuilding it. > Also, is there a way to trap when ferret segfaults using rescue? I > have had to disable searching for the time being on my site until I > get this worked out since every search will segfault and kill my fcgi. Imho a seg fault is far too low level to be caught on the ruby side. 
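
A possible workaround, since rescue cannot help here: run the risky call in a child process, so that a crash in native code takes down the child rather than the fcgi process itself. What follows is only a minimal sketch under stated assumptions. It needs a platform with fork (so not Windows), the helper name run_isolated is made up for illustration and is not part of Ferret or acts_as_ferret, and the block's return value has to be something Marshal can dump (for example an array of document ids).

```ruby
# Sketch only: run_isolated is a hypothetical helper, not a Ferret API.
# It runs the given block in a forked child and reports how the child ended.
def run_isolated
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield)) # send the block's result to the parent
    writer.close
    exit!(0)                          # skip at_exit handlers inherited from the parent
  end
  writer.close
  payload = reader.read               # returns once the child closes its end or dies
  reader.close
  _, status = Process.wait2(pid)
  return [:crashed, status.termsig] if status.signaled?
  return [:failed, status.exitstatus] unless status.success?
  [:ok, Marshal.load(payload)]
end

# Hypothetical usage; `index` and `query` are assumed to exist in the caller:
#   state, value = run_isolated { index.search(query).hits.map { |h| h.doc } }
#   value is the list of doc ids when state == :ok, and the signal number
#   when state == :crashed.
```

Forking per search is not free, so this is more of a stopgap to keep a site up while the actual cause of the crash is tracked down.
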
> I am actually considering switching to Mysql full text search just for
> stability reasons, but if there was a good technique for combating
> these segfaults I would rather stick with Ferret.
>
> NOTE: I did try running with ferret 0.10.13 but I was getting a bunch
> of segfaults just rebuilding my indexes.

That's strange, maybe we should try to find the problem in this area. Ferret's API changed in sometimes subtle ways from 0.9 to 0.10, and segfaults often happen when the API is used in a wrong way (e.g. unexpected argument types and things like that).

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From bk at benjaminkrause.com Fri Nov 24 04:10:25 2006
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Fri, 24 Nov 2006 10:10:25 +0100
Subject: [Ferret-talk] Away for a week
In-Reply-To: <29a93adba32e7e71005200ff7f539b53@ruby-forum.com>
References: <29a93adba32e7e71005200ff7f539b53@ruby-forum.com>
Message-ID: <4566B701.1010403@benjaminkrause.com>

> I followed up on him with an email a while back. He sent me a reply on
> Nov 2nd saying he was back from Vietnam. I haven't heard from him
> since. Maybe he just got tired of us. ;-)
>

Hey ..

Just got some news from Dave.. he's back in Australia but has some difficulties getting online.
He hopes to be back within the next few days, but getting a connection while living in a remote part of Australia seems to be a real problem over there :)

Ben

From toby-wan-kenobi at web.de Fri Nov 24 05:12:33 2006
From: toby-wan-kenobi at web.de (Tobias Rademacher)
Date: Fri, 24 Nov 2006 11:12:33 +0100
Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing'
In-Reply-To: <20061124090449.GL9110@cordoba.webit.de>
References: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> <20061121100025.GI14323@cordoba.webit.de> <8ee65b787ada92d76d3d64269c6eaf0a@ruby-forum.com> <20061124090449.GL9110@cordoba.webit.de>
Message-ID:

Jens Kraemer wrote:
> On Fri, Nov 24, 2006 at 07:57:40AM +0100, Tobias Rademacher wrote:
> [..]
>> This there any alternative for Typo at the moment?
>
> I switched to mephisto a while ago (http://mephistoblog.com/).

Ah! I recently heard about that.

>> > I'd try to find out if init.rb is loaded at all at first, and then go on
>> > step by step to see where things go wrong.
>>
>> After setting some STDERR and STDOUT in all vendor-plugins I see no
>> reaction wen starting the console script.
>>
>> Maybe the typo-tweaks of config/boot.rb causing some side-effects and so
>> Typo don't load any plugin anymore.
>
> seems so...

Oh dear! I noticed similar problems when extending the sidebar with a simple plugin that parses the blogrolling RSS feed....

>
>> Unfortuanalty the typo trac side is down since a couple of month now. So
>> I can't post an issues about this problem at the moment.
>
> one of the reasons for my switch to mephisto ;-)

Okay. Maybe I should spend some effort on migrating to mephisto altogether and not invest in an individual Typo theme any more. ...

Is mephisto able to do search on top of Ferret?

Toby
http://tradem.name/blog

--
Posted via http://www.ruby-forum.com/.
From agnieszka.figiel at gmail.com Fri Nov 24 06:07:47 2006 From: agnieszka.figiel at gmail.com (Agnieszka Figiel) Date: Fri, 24 Nov 2006 12:07:47 +0100 Subject: [Ferret-talk] advanced search with ferret? Message-ID: <4e973a8731ab163f542b6dbf2ccde2fc@ruby-forum.com> Hello, I'm a novice to ferret, so far only used it via acts_as_ferret. My question is about a recommended pattern for an 'advanced search', which would be searching by all fields of a model and some fields from related models, with range search, expression search and wildcards. The kind of search in which a user is presented with a huge form that allows them to set the variuos criteria. is this something ferret (acts_as_ferret?) is suited for and is there a clean way to do it? Thanks, Agnieszka Figiel -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Fri Nov 24 07:20:34 2006 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 24 Nov 2006 13:20:34 +0100 Subject: [Ferret-talk] [Tweaking-Typo-4.0.3] acts_as_ferret `method_missing' In-Reply-To: References: <9f2807844ea6a08bd102dddd1df3c4e2@ruby-forum.com> <20061121100025.GI14323@cordoba.webit.de> <8ee65b787ada92d76d3d64269c6eaf0a@ruby-forum.com> <20061124090449.GL9110@cordoba.webit.de> Message-ID: <20061124122034.GN9110@cordoba.webit.de> On Fri, Nov 24, 2006 at 11:12:33AM +0100, Tobias Rademacher wrote: [..] > Okay. Maybe I spending some efforts in migrating to mephisto together > and don't investigate into a individual typo theme any more. ... > > This mephisto able to do search on top of ferrent? on my blog at www.jkraemer.net it does :-) It was not hard to do. I could offer you diffs, but I guess they would be of little help with current versions of mephisto... Jens -- webit! 
Gesellschaft f?r neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Kr?mer kraemer at webit.de Schnorrstra?e 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From david.wennergren at gmail.com Fri Nov 24 08:02:01 2006 From: david.wennergren at gmail.com (David Wennergren) Date: Fri, 24 Nov 2006 14:02:01 +0100 Subject: [Ferret-talk] Strange error. Index corrupt on production server Message-ID: <81434d2cf70f48951fa5c0d3ca2398a1@ruby-forum.com> We've been running Ferret for a few months on our site with great result. But, just a monent ago the index suddenly became corrupt. It all started with this error message: :108250 is out of range [0..108183] for IndexWriter#[] /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:382:in `[]' And after that every search resulted in this error: A IOError occurred in search#rss: IO Error occured at :79 in xraise Error occured in fs_store.c:323 - fs_open_input couldn't create InStream /home/newsdesk_prod/current/config/../index/production/pressrelease/_2tap.fdt: /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:679:in `initialize' I couldn't find any other solution than rebuilding the index, which takes a few hours... Has anyone experienced anything similar? Is there any way to repair the index without rebuilding it? Thanks a lot for any help or advice! /David Wennergren -- Posted via http://www.ruby-forum.com/. From atomgiant at gmail.com Fri Nov 24 08:21:59 2006 From: atomgiant at gmail.com (Tom Davies) Date: Fri, 24 Nov 2006 08:21:59 -0500 Subject: [Ferret-talk] Segmentation Faults In-Reply-To: <20061124092449.GM9110@cordoba.webit.de> References: <20061124092449.GM9110@cordoba.webit.de> Message-ID: Hi Jens, When I tried switching to 0.10.13 I did have to rewrite portions to work with the new API and I had it working using a small number of documents. It was only when I tried to do the entire import of my existing records that it would routinely seg fault. 
Also, one other odd thing I noticed is on both 0.9.4 and 0.10.13 I will always get a segfault if I try to rebuild the index using the script/runner. Due to the lack of being able to handle seg faults from the Rails code I may have no choice but to switch to mysql full text for the time being just for stability. Otherwise, all it takes is a seg fault to bring my whole application down. And in this case, since it corrupted the index my app will permanently seg fault until I rebuild it. Thanks for your help. Tom Davies http://atomgiant.com http://gifthat.com From atomgiant at gmail.com Fri Nov 24 09:02:41 2006 From: atomgiant at gmail.com (Tom Davies) Date: Fri, 24 Nov 2006 09:02:41 -0500 Subject: [Ferret-talk] Strange error. Index corrupt on production server In-Reply-To: <81434d2cf70f48951fa5c0d3ca2398a1@ruby-forum.com> References: <81434d2cf70f48951fa5c0d3ca2398a1@ruby-forum.com> Message-ID: Hi David, The same thing just happened to me yesterday. The response I received on this list is that there is no way to fix it other than rebuilding the index. As a result, I had to disable searching on my site as I look for an alternative. I personally am looking into another solution for searching since I can't afford to have my index become corrupt even once. Ferret is great though when it works so I may revisit it in the future. Good luck, Tom On 11/24/06, David Wennergren wrote: > We've been running Ferret for a few months on our site with great > result. But, just a monent ago the index suddenly became corrupt. 
> > It all started with this error message: > > :108250 is out of range [0..108183] for IndexWriter#[] > /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:382:in > `[]' > > And after that every search resulted in this error: > > A IOError occurred in search#rss: > > IO Error occured at :79 in xraise > Error occured in fs_store.c:323 - fs_open_input > couldn't create InStream > /home/newsdesk_prod/current/config/../index/production/pressrelease/_2tap.fdt: > > > /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:679:in > `initialize' > > I couldn't find any other solution than rebuilding the index, which > takes a few hours... > > Has anyone experienced anything similar? Is there any way to repair the > index without rebuilding it? > > Thanks a lot for any help or advice! > > /David Wennergren > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Tom Davies http://atomgiant.com http://gifthat.com From andreas.korth at gmx.net Fri Nov 24 14:46:01 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Fri, 24 Nov 2006 20:46:01 +0100 Subject: [Ferret-talk] advanced search with ferret? In-Reply-To: <4e973a8731ab163f542b6dbf2ccde2fc@ruby-forum.com> References: <4e973a8731ab163f542b6dbf2ccde2fc@ruby-forum.com> Message-ID: On 24.11.2006, at 12:07, Agnieszka Figiel wrote: > I'm a novice to ferret, so far only used it via acts_as_ferret. My > question is about a recommended pattern for an 'advanced search', > which > would be searching by all fields of a model and some fields from > related > models, with range search, expression search and wildcards. The > kind of > search in which a user is presented with a huge form that allows > them to > set the variuos criteria. > > is this something ferret (acts_as_ferret?) is suited for and is > there a > clean way to do it? 
When I last checked, you could use Ferret's advanced query syntax in acts_as_ferret. So you can use wildcards, ranges, phrases, boolean expressions and field qualifiers just as if you used Ferret directly.

If you haven't used Ferret's query language before, check out the RDoc documentation for Ferret::QueryParser. It's pretty well explained there.

The 'pattern' for implementing an advanced search form would be to gather the information from the form and build a query string from that.

As for indexing/searching fields from related models: This has been extensively discussed on this list recently, so you might want to consult the archives. In short, you provide an accessor method for the field in the related model and index that.

Cheers,
Andy

From agnieszka.figiel at gmail.com Fri Nov 24 16:05:15 2006
From: agnieszka.figiel at gmail.com (Agnieszka Figiel)
Date: Fri, 24 Nov 2006 22:05:15 +0100
Subject: [Ferret-talk] advanced search with ferret?
In-Reply-To:
References: <4e973a8731ab163f542b6dbf2ccde2fc@ruby-forum.com>
Message-ID:

Andreas Korth wrote:
> If you haven't used Ferret's query language before, check out the
> RDoc documentation for Ferret::QueryParser. It's pretty well
> explained there.

Hello,

Thank you, this is definitely the part I have been missing so far.

As for the advanced search pattern, I'm only worried that my code for gluing the query together from all sorts of fields will be very complex. That's why I'm trying to think of something easier to maintain. So far I was thinking of a solution going along these lines:

- data from the search form would be used to create a pattern object
- the query would be constructed by iterating through all the field_infos of the index
- for each field_info, add to the query a pair attribute : value of that attribute from the pattern object

I'm considering whether it's a good course to take. It seems tempting because of its dryness -- no need to alter the search method after extending the index.
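
The loop described above might look roughly like the following. This is a sketch under assumptions, not tested against acts_as_ferret: build_query is a made-up helper name, the field list is passed in by hand (it could just as well come from the index's field infos), the criteria are a plain hash standing in for the pattern object, and nothing here escapes query-syntax characters that a user might type into the form.

```ruby
# Hypothetical helper: build a "field:value AND field:value" query string
# from a list of field names and a hash of form values.
def build_query(fields, criteria)
  clauses = fields.map do |field|
    value = criteria[field].to_s.strip
    next if value.empty?                    # field left blank in the form
    value = %("#{value}") if value =~ /\s/  # treat multi-word input as a phrase
    "#{field}:#{value}"
  end
  clauses.compact.join(" AND ")
end

query = build_query([:title, :author, :year],
                    :title => "ruby", :author => "Dave Balmain", :year => "")
puts query
```

The resulting string would then be handed to the search call as-is. Range and wildcard criteria would need their own formatting per field type, which is probably where the remaining ifs end up.
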
And it would make the search method a few lines only, simple loop and not a terrifying sequence of ifs and concats :) I'm trying to spot the weaknesses of this idea before I start coding. I would be happy to hear your thoughts on this idea! Agnieszka -- Posted via http://www.ruby-forum.com/. From curtis.hatter at insightbb.com Sat Nov 25 16:38:04 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Sat, 25 Nov 2006 16:38:04 -0500 Subject: [Ferret-talk] Metaphone analysis Message-ID: <200611251638.05102.curtis.hatter@insightbb.com> Not sure how much this will interest people but I don't have a blog so I'm posting something I threw together today cause I think it might be useful. In what little free time I have I've been wanting to put together a Rails/Ferret based restful dictionary. So I finally got a chance to get started today so the first thing I wanted to do was implement a metaphone analyzer and filter. Some links for more info on the metaphone algorithms: http://en.wikipedia.org/wiki/Metaphone http://en.wikipedia.org/wiki/Double_Metaphone The jist of it is that it breaks words down into its phonetic parts. For example, the word 'cool' and 'kewl' both become 'KL' in the double-metaphone algorithm. Indexing dictionary words in this manner is almost essential so that users can find the proper spelling of a word by spelling it how it sounds. The first thing I did was create a MetaphoneFilter class that would run the metaphone algorithm over a token stream. It's a fairly simple class, but does require the 'Text' gem be installed. require 'ferret' require 'text' module Curtis module Analysis # TODO write tests! class MetaphoneFilter < Ferret::Analysis::TokenStream def initialize(token_stream, version = :double) @input = token_stream @version = version end def next t = @input.next return nil if t.nil? t.text = @version.eql?(:double) ? 
Text::Metaphone.double_metaphone(t.text) : Text::Metaphone.metaphone(t.text) end end end end Second I created a MetaphoneAnalyzer class that would use the MetaphoneFilter created above. The MetaphoneAnalyzer also makes use of the StemFilter so that words like "eat" and "eating" both equal to "eat". require 'ferret' # TODO write tests module Curtis module Analysis class MetaphoneAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(version = :double, stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words @version = version end def token_stream(field, str) MetaphoneFilter.new(StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)), @version) end end end end I saved both of these files, 'metaphone_filter.rb' and 'metaphone_analyzer.rb' to RAILS_ROOT/extras. Next I added the following line to my 'config/environments.rb' file: config.load_paths += %W{ #{RAILS_ROOT}/extras } after that i fired up script/console to test it all out: >> require 'metaphone_analyzer' => true >> include Curtis::Analysis => Object >> ts = MetaphoneAnalyzer.new.token_stream(nil, "the quick brown fox jumped over the lazy dog") => $ >> while token = ts.next >> p token >> end ["KK", nil] ["PRN", nil] ["FKS", nil] ["JMP", "AMP"] ["AFR", nil] ["LS", nil] ["TK", nil] => nil As you can see it has been metaphoned. Now if someone were to search but inadvertently type 'qwick' instead of 'quick' it would still match because 'qwick' metaphoned also becomes 'KK'. Still a lot to do, such as test it with AAF, and see how it interacts with using slop (which measures the Levenshtein distance, http://en.wikipedia.org/wiki/Levenshtein, between two terms) so that I can put in a "Did you mean xxx" feature (where xxx is a list of terms within a certain distance of the original query). Plus many other ideas also, such as thesaurus searching. Hopefully this has been informative. 
Wanted to show how to create new Analyzers and Filters for anyone who was curious (I know I was until today), as well as give a general idea for how I'm going to put them to use. I'd be happy to hear any questions or comments on the above. Oh, one last thing.. the MetaphoneAnalyzer and MetaphoneFilter default to the double-metaphone algorithm. Just pass pass nil (or anything other than :double when constructing the analyzer to use just the metaphone algorithm. Thanks, Curtis From andreas.korth at gmx.net Sat Nov 25 17:21:35 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Sat, 25 Nov 2006 23:21:35 +0100 Subject: [Ferret-talk] Metaphone analysis In-Reply-To: <200611251638.05102.curtis.hatter@insightbb.com> References: <200611251638.05102.curtis.hatter@insightbb.com> Message-ID: <6FC68244-13EB-4C6D-B9E2-3DB2565BC288@gmx.net> On 25.11.2006, at 22:38, Curtis Hatter wrote: > As you can see it has been metaphoned. Now if someone were to > search but > inadvertently type 'qwick' instead of 'quick' it would still match > because > 'qwick' metaphoned also becomes 'KK'. You can achieve almost the same result using Ferret's built-in FuzzyQuery. It works even better for misspellings than phonetic algorithms, and it's language-neutral. Consider: i = Ferret::I.new i << "the quick brown fox" i.search("quikc~").total_hits => 1 i.search("qwick~").total_hits => 1 Whereas metaphone yields: Text::Metaphone.double_metaphone("quick") => ["KK", nil] Text::Metaphone.double_metaphone("quikc") => ["KKK", nil] Cheers, Andy From maccman at gmail.com Sun Nov 26 05:35:11 2006 From: maccman at gmail.com (Alex MacCaw) Date: Sun, 26 Nov 2006 11:35:11 +0100 Subject: [Ferret-talk] acts_as_ferret and searching word docs In-Reply-To: <20061120084754.GB22508@cordoba.webit.de> References: <20061120084754.GB22508@cordoba.webit.de> Message-ID: > I successfully used the wv-utilities (wvText or wvHtml, on debian do > 'apt-get install wv') to index word documents with Ferret. 
Thanks Jens, Is there any way to do this on windows - or I'll just have to wait till I deploy on linux. -- Posted via http://www.ruby-forum.com/. From curtis.hatter at insightbb.com Sun Nov 26 16:35:03 2006 From: curtis.hatter at insightbb.com (Curtis Hatter) Date: Sun, 26 Nov 2006 16:35:03 -0500 Subject: [Ferret-talk] Metaphone analysis In-Reply-To: <6FC68244-13EB-4C6D-B9E2-3DB2565BC288@gmx.net> References: <200611251638.05102.curtis.hatter@insightbb.com> <6FC68244-13EB-4C6D-B9E2-3DB2565BC288@gmx.net> Message-ID: <200611261635.03718.curtis.hatter@insightbb.com> On Saturday 25 November 2006 17:21, Andreas Korth wrote: > On 25.11.2006, at 22:38, Curtis Hatter wrote: > > As you can see it has been metaphoned. Now if someone were to > > search but > > inadvertently type 'qwick' instead of 'quick' it would still match > > because > > 'qwick' metaphoned also becomes 'KK'. > > You can achieve almost the same result using Ferret's built-in > FuzzyQuery. It works even better for misspellings than phonetic > algorithms, and it's language-neutral. > > Consider: > > i = Ferret::I.new > i << "the quick brown fox" > > i.search("quikc~").total_hits > => 1 > i.search("qwick~").total_hits > => 1 > > Whereas metaphone yields: > > Text::Metaphone.double_metaphone("quick") > => ["KK", nil] > Text::Metaphone.double_metaphone("quikc") > => ["KKK", nil] > I'm looking at trying to use both. My reason: i = Ferret::I.new i << "The quick brown fox" i.search("qwik~").total_hits => 0 Where as double metaphoning "quick" or "qwik" both become "KK". What I'm thinking might be a good solution is to index the word and it's double-metaphone equivalent. Then search for exact hits against the metaphone and fuzzy hits against the word field. Then sort based on score, with hopefully exact matches being 100. Still investigating the best solutions. 
Thanks for the ideas, Curtis Curtis From andreas.korth at gmx.net Sun Nov 26 18:34:27 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 27 Nov 2006 00:34:27 +0100 Subject: [Ferret-talk] Metaphone analysis In-Reply-To: <200611261635.03718.curtis.hatter@insightbb.com> References: <200611251638.05102.curtis.hatter@insightbb.com> <6FC68244-13EB-4C6D-B9E2-3DB2565BC288@gmx.net> <200611261635.03718.curtis.hatter@insightbb.com> Message-ID: On 26.11.2006, at 22:35, Curtis Hatter wrote: >> i = Ferret::I.new >> i << "the quick brown fox" >> >> i.search("quikc~").total_hits >> => 1 >> i.search("qwick~").total_hits >> => 1 >> >> Whereas metaphone yields: >> >> Text::Metaphone.double_metaphone("quick") >> => ["KK", nil] >> Text::Metaphone.double_metaphone("quikc") >> => ["KKK", nil] >> > > I'm looking at trying to use both. My reason: > > i = Ferret::I.new > i << "The quick brown fox" > > i.search("qwik~").total_hits > => 0 Which is OK I guess, since 'qwik' and 'quick' are quite different. Still, you can adjust the tolerance of FuzzyQuery if desired: i.search("qwik~0.4").total_hits => 1 > Where as double metaphoning "quick" or "qwik" both become "KK". Yep. In the same way as 'bag', 'pack', 'back', 'poke' and 'pike' all become 'PK'. I think the accurracy of this particular phonetic algorithm is disputable. > What I'm thinking might be a good solution is to index the word and > it's > double-metaphone equivalent. Then search for exact hits against the > metaphone > and fuzzy hits against the word field. Then sort based on score, with > hopefully exact matches being 100. You should in any case index the actual terms, because the metaphones alone would make exact matches impossible. If you use FuzzySearch, you don't need an extra field and you autmatically get a score based on how close the match is. 
Example:

  i = Ferret::I.new

  i << "quick"
  i << "quikc"
  i << "quack"
  i << "quake"
  i << "quark"
  i << "quid"
  i << "quiche"

  i.search_each("quikc~0.3") do |doc, score|
    printf "%6s %1.2f\n", i[doc][:id], score
  end

   quikc 0.88
   quick 0.53
   quake 0.53
    quid 0.44
   quack 0.35
   quark 0.35
  quiche 0.35

As you can see, the exact match ranks highest.

--
Andy

From curtis.hatter at insightbb.com Sun Nov 26 22:26:39 2006
From: curtis.hatter at insightbb.com (Curtis Hatter)
Date: Sun, 26 Nov 2006 22:26:39 -0500
Subject: [Ferret-talk] Metaphone analysis
In-Reply-To:
References: <200611251638.05102.curtis.hatter@insightbb.com> <200611261635.03718.curtis.hatter@insightbb.com>
Message-ID: <200611262226.39388.curtis.hatter@insightbb.com>

> Yep. In the same way as 'bag', 'pack', 'back', 'poke' and 'pike' all
> become 'PK'. I think the accurracy of this particular phonetic
> algorithm is disputable.

true.. and had I not been introduced to guitar hero 2 this weekend I think I might have realized that myself. I haven't had good success with the Soundex algorithm either. Metaphone seemed good at first but I think you've convinced me otherwise. 'Pike' and 'bag' should not be the same.

> You should in any case index the actual terms, because the metaphones
> alone would make exact matches impossible.
>
> If you use FuzzySearch, you don't need an extra field and you
> autmatically get a score based on how close the match is.
>
> Example:
>
> i = Ferret::I.new
>
> i << "quick"
> i << "quikc"
> i << "quack"
> i << "quake"
> i << "quark"
> i << "quid"
> i << "quiche"
>
> i.search_each("quikc~0.3") do |doc, score|
> printf "%6s %1.2f\n", i[doc][:id], score
> end
>
> quikc 0.88
> quick 0.53
> quake 0.53
> quid 0.44
> quack 0.35
> quark 0.35
> quiche 0.35
>
> As you can see, the exact match ranks highest.

I think I'll try this approach first and add in a phonetic algorithm if necessary. At least I discovered how to write filters for Ferret, which was much easier than I would have imagined.
Thanks for the information, is nice to learn a bit more about the things Ferret can already do so well. Curtis From andreas.korth at gmx.net Sun Nov 26 23:30:53 2006 From: andreas.korth at gmx.net (Andreas Korth) Date: Mon, 27 Nov 2006 05:30:53 +0100 Subject: [Ferret-talk] Metaphone analysis In-Reply-To: <200611262226.39388.curtis.hatter@insightbb.com> References: <200611251638.05102.curtis.hatter@insightbb.com> <200611261635.03718.curtis.hatter@insightbb.com> <200611262226.39388.curtis.hatter@insightbb.com> Message-ID: <6153FA9A-5F07-4EAD-A158-CAC8D24309A8@gmx.net> On 27.11.2006, at 04:26, Curtis Hatter wrote: >> Yep. In the same way as 'bag', 'pack', 'back', 'poke' and 'pike' all >> become 'PK'. I think the accurracy of this particular phonetic >> algorithm is disputable. > > true.. and had I not been introduced to guitar hero 2 this weekend > I think I > might have realized that myself. Sounds like a lot of FN. > I think I'll try this approach first and add in a phonetic > algorithm if > necessary. It depends on what you want to achieve. If you want to compensate for typos, FuzzyQuery is probably better. A simple letter-swap can easily trick a phonetic algorithm: metaphone => MTFN metahpone => MTPN FuzzyQuery will catch this even at a relatively low sensitivity of 0.75 ('metahpone~0.75') I used FuzzyQuery to build a doublet detection into my Rails app. When a user creates a new Person record, a fuzzy search is run in the background as the form fields are filled out. Possible doublets are displayed next to the form in a "Did you mean..." fashion. For example, if the user enters "Rachel Welsh", the doublet detection would find "Raquel Welch". Before Ferret I tried to achieve this with MySQL's SOUNDEX function, which didn't work quite as well. (Although SOUNDEX, which is based on the algorithm of the same name, still works way better than metaphone.) > At least I discovered how to write filters for Ferret, which was > much easier > that I would have imagined. Yep. 
It's great that Ferret can be extended and customized in so many ways.

> Thanks for the information, is nice to learn a bit more about the
> things
> Ferret can already do so well.

Ferret is the single best Ruby library I've come across in the past two years. It just rocks. Period.

Thanks David for giving it to us!

Cheers,
Andreas

From daniel at flyingmachinestudios.com Mon Nov 27 01:14:09 2006
From: daniel at flyingmachinestudios.com (Daniel)
Date: Mon, 27 Nov 2006 07:14:09 +0100
Subject: [Ferret-talk] find_conditions in acts_as_ferret find_by_contents
Message-ID: <047ad205468d193120688c35aece2c1d@ruby-forum.com>

Hi all,

Every time I try to add options for the find_conditions argument of find_by_contents I get the following:

  a = AnnotatedLink.find_by_contents('test', {}, {:conditions => 'category_id IS NOT NULL'})
  >> NoMethodError: You have a nil object when you didn't expect it!
  You might have expected an instance of Array.
  The error occured while evaluating nil.sort!
  from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:288:in `find_by_contents'
  from (irb):1

I'm using rev 112 of acts_as_ferret and the latest version of ferret. Has anyone else had this problem?

Thanks!
Daniel

--
Posted via http://www.ruby-forum.com/.

From daniel at flyingmachinestudios.com Mon Nov 27 01:29:09 2006
From: daniel at flyingmachinestudios.com (Daniel)
Date: Mon, 27 Nov 2006 07:29:09 +0100
Subject: [Ferret-talk] find_conditions in acts_as_ferret find_by_contents
In-Reply-To: <047ad205468d193120688c35aece2c1d@ruby-forum.com>
References: <047ad205468d193120688c35aece2c1d@ruby-forum.com>
Message-ID: <933658dad3371a2c8bbb2922069eceaa@ruby-forum.com>

I think I found the problem. On line 470, this:

  def combine_conditions(conditions, additional_conditions)

should be this:

  def combine_conditions(conditions, *additional_conditions)

Furthermore, the rescue clause at line 282 might be more useful if it re-raised the error, since at the moment the real cause is hidden behind the nil.sort! message.

Thanks!
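
To illustrate what the splat changes: with *additional_conditions the method accepts any number of extra conditions (including none) and always sees them as an array. The body below is a hypothetical stand-in written only for this illustration, not the plugin's actual implementation, and it handles plain SQL strings only (no ["sql = ?", value] style conditions).

```ruby
# Hypothetical stand-in for the plugin method, to show the effect of the splat.
def combine_conditions(conditions, *additional_conditions)
  # The splat collects zero or more extra arguments into an array.
  extras = additional_conditions.flatten.compact.reject { |c| c.to_s.strip.empty? }
  return conditions if extras.empty?
  ([conditions] + extras).compact.map { |c| "(#{c})" }.join(" AND ")
end

puts combine_conditions("id IN (1,2)", "category_id IS NOT NULL")
puts combine_conditions("id IN (1,2)")
```

With the non-splat signature, the second call would raise an ArgumentError because the second argument is mandatory.
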
Daniel wrote: > Hi all, > > Every time I try to add options for the find_conditions argument of > find_by_contents I get the following: > > a = AnnotatedLink.find_by_contents('test', {}, {:conditions => > 'category_id IS NOT NULL'}) >>> NoMethodError: You have a nil object when you didn't expect it! > You might have expected an instance of Array. > The error occured while evaluating nil.sort! > from > ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/class_methods.rb:288:in > `find_by_contents' > from (irb):1 > > I'm using rev 112 of acts_as ferret and the latest version of ferret. > Has anyone else had this problem? > > Thanks! > Daniel -- Posted via http://www.ruby-forum.com/. From jperkins at equationresearch.com Mon Nov 27 19:36:43 2006 From: jperkins at equationresearch.com (John Perkins) Date: Tue, 28 Nov 2006 01:36:43 +0100 Subject: [Ferret-talk] Search on data accross many tables, linked by belongs_to In-Reply-To: References: <20060707222231.GA31705@cordoba.webit.de> Message-ID: <55a68e22f7b312927b27aea1f77c64a8@ruby-forum.com> Should this be able to work from the Console? For example, let's assume there is a Post record with a Category name of "Help"... Post.find_by_contents("Help") For some reason this does not generate any results... Maxime Curioni wrote: > Thanks Jens, I didn't know it was that simple ! > > >> you could use >> >> acts_as_ferret :fields => [ 'title', 'content', :category_name] >> def category_name >> category.name >> end >> >> to achieve this. >> >> Jens -- Posted via http://www.ruby-forum.com/. From cswilliams at gmail.com Tue Nov 28 03:03:20 2006 From: cswilliams at gmail.com (Chris Williams) Date: Tue, 28 Nov 2006 09:03:20 +0100 Subject: [Ferret-talk] find_by_contents never finds anything on my model Message-ID: Hi, Let me preface by saying I am very new to ferret and aaf. Anyhow, I'm using the aaf plugin on a model named Book. 
This model isnt a typical rails model in the fact that it doesnt have an "id" column as its primary key but instead has a string column named ISBN that is used as the primary key. When I try to search for anything in the model using find_by_contents it never finds anything. Trying to troubleshoot, I added an id column in my model as an integer and made it auto increment. Once I added this, added my sample data back in, and rebuilt my index, I noticed that I could search all of a sudden using find_by_contents. I changed it back again, and then my searches had no results. I was wondering if there is anything to get it to work without having to add an id field to my model? Many thanks for your help in advance! -Chris -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Tue Nov 28 04:57:44 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 10:57:44 +0100 Subject: [Ferret-talk] Index not being updated Message-ID: My index is not being updated when I add new records or amend existing ones. Can anyone point me in the direction of where I should be looking for what is going wrong? I'm running this in the production environment. -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Nov 28 05:21:30 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Nov 2006 11:21:30 +0100 Subject: [Ferret-talk] [AaF] find_by_contents never finds anything on my model In-Reply-To: References: Message-ID: <20061128102130.GA7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 09:03:20AM +0100, Chris Williams wrote: > Hi, > Let me preface by saying I am very new to ferret and aaf. > > Anyhow, I'm using the aaf plugin on a model named Book. This model isnt > a typical rails model in the fact that it doesnt have an "id" column as > its primary key but instead has a string column named ISBN that is used > as the primary key. When I try to search for anything in the model > using find_by_contents it never finds anything. 
Trying to troubleshoot, > I added an id column in my model as an integer and made it auto > increment. Once I added this, added my sample data back in, and rebuilt > my index, I noticed that I could search all of a sudden using > find_by_contents. I changed it back again, and then my searches had no > results. I was wondering if there is anything to get it to work without > having to add an id field to my model? This should be possible, but not without modifying aaf. Aaf relies on your model having an 'id' attribute in several places; if you changed these places to use 'isbn' instead, this should work. Maybe one could even solve this in a more generic way by asking ActiveRecord what the actual primary key of the table is. Anyway, renaming your isbn column to 'id' seems to be the easiest way ;-) cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Tue Nov 28 05:22:57 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Nov 2006 11:22:57 +0100 Subject: [Ferret-talk] [AaF] Search on data accross many tables, linked by belongs_to In-Reply-To: <55a68e22f7b312927b27aea1f77c64a8@ruby-forum.com> References: <20060707222231.GA31705@cordoba.webit.de> <55a68e22f7b312927b27aea1f77c64a8@ruby-forum.com> Message-ID: <20061128102257.GB7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 01:36:43AM +0100, John Perkins wrote: > Should this be able to work from the Console? For example, let's assume > there is a Post record with a Category name of "Help"... > > Post.find_by_contents("Help") > > For some reason this does not generate any results... It should work from the console. What if you call Post.rebuild_index before the find_by_contents call? Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Tue Nov 28 09:46:16 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 15:46:16 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: References: Message-ID: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> Matthew Planchant wrote: > My index is not being updated when I add new records or amend existing > ones. > > Can anyone point me in the direction of where I should be looking for > what is going wrong? > > I'm running this in the production environment. Is an amended record added to the index when .save is called? -- Posted via http://www.ruby-forum.com/. From anotherbritt at gmail.com Tue Nov 28 10:08:50 2006 From: anotherbritt at gmail.com (Britt Selvitelle) Date: Tue, 28 Nov 2006 10:08:50 -0500 Subject: [Ferret-talk] Index not being updated In-Reply-To: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> Message-ID: <9fd96fa70611280708h2982efbdu69c75db71d5e1dd9@mail.gmail.com> This is the same problem I am experiencing, but I have not had time to investigate it yet. I believe it is a bug. Britt On 11/28/06, Matthew Planchant wrote: > Matthew Planchant wrote: > > My index is not being updated when I add new records or amend existing > > ones. > > > > Can anyone point me in the direction of where I should be looking for > > what is going wrong? > > > > I'm running this in the production environment. > > Is an amended record added to the index when .save is called? > > -- > Posted via http://www.ruby-forum.com/.
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From matt at planchant.co.uk Tue Nov 28 10:10:38 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 16:10:38 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> Message-ID: <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> >> My index is not being updated when I add new records or amend existing >> ones. >> >> Can anyone point me in the direction of where I should be looking for >> what is going wrong? >> >> I'm running this in the production environment. > > Is an amended record added to the index when .save is called? Looks as though attributes from many to many relationships are not being added to the index when a record is amended. -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Tue Nov 28 10:18:44 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 16:18:44 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> Message-ID: <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> > Looks as though attributes from many to many relationships are not being > added to the index when a record is amended. 
I have the following relationship between documents and topics: class Document < ActiveRecord::Base acts_as_ferret :additional_fields => [:topic_titles] has_many :document_topics, :dependent => true has_many :topics, :through => :document_topics def topic_titles topics.collect { |topic| topic.title }.join ' ' end end class Topic < ActiveRecord::Base has_many :document_topics, :dependent => true has_many :documents, :through => :document_topics end The update action in documents_controller.rb looks like this: def update params[:document][:topic_ids] ||= [] @document = Document.find(params[:id]) @topics = (params[:topics] or []).collect { |item| item.to_i } @document.attributes = params[:document] @document.topic_ids = @topics @document.save if @document.update_attributes(params[:document]) flash[:notice] = 'Document was successfully updated.' redirect_to :action => 'show', :id => @document else render :action => 'edit' end end I suspect there is something here which means the document has no topics when acts_as_ferret attempts to add the topic_titles to the index. -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Nov 28 11:21:11 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Nov 2006 17:21:11 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> Message-ID: <20061128162111.GC7968@cordoba.webit.de> Hi! please see comments below. On Tue, Nov 28, 2006 at 04:18:44PM +0100, Matthew Planchant wrote: > > > Looks as though attributes from many to many relationships are not being > > added to the index when a record is amended. 
> > I have the following relationship between documents and topics: > > > > class Document < ActiveRecord::Base > acts_as_ferret :additional_fields => [:topic_titles] > > has_many :document_topics, :dependent => true > has_many :topics, :through => :document_topics > > def topic_titles > topics.collect { |topic| topic.title }.join ' ' > end > end > > class Topic < ActiveRecord::Base > has_many :document_topics, :dependent => true > has_many :documents, :through => :document_topics > end > > > The update action in documents_controller.rb looks like this: > > def update > params[:document][:topic_ids] ||= [] > @document = Document.find(params[:id]) > @topics = (params[:topics] or []).collect { |item| item.to_i } > @document.attributes = params[:document] > @document.topic_ids = @topics actually I wonder if this works at all - I'd be surprised if you can assign ids to a has_many :through relationship like that. However you could try this: @document.disable_ferret if @document.update_attributes(params[:document]) @document.ferret_update flash[:notice] = 'Document was successfully updated.' redirect_to :action => 'show', :id => @document else render :action => 'edit' end end > I suspect there is something here which means the document has no topics > when acts_as_ferret attempts to add the topic_titles to the index. the problem with indexing data from related objects seems to be that the after_update hook of aaf is called too early, that is, before all relationships are saved. Most often it helps to first save the record (and skip the indexing, which saves some time) and index the record after that. cheers, Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Tue Nov 28 11:29:23 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 17:29:23 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <20061128162111.GC7968@cordoba.webit.de> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> Message-ID: Jens Kraemer wrote: > > the problem with indexing data from related objects seems to be that the > after_update hook of aaf is called too early, that is, before all > relationships are saved. Yes. This is what I thought might be going on. > Most often it helps to first save the record and skip the indexing, > (which saves some time) and index the record after that. How do I do this? Can I explicitly stop the indexing from starting, then trigger it once I have finished saving the record? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Nov 28 11:36:02 2006 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Nov 2006 17:36:02 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> Message-ID: <20061128163602.GD7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 05:29:23PM +0100, Matthew Planchant wrote: > Jens Kraemer wrote: > > > > the problem with indexing data from related objects seems to be that the > > after_update hook of aaf is called too early, that is, before all > > relationships are saved. > > Yes. This is what I thought might be going on.
> > > Most often it helps to first save the record and skip the indexing, > > (which saves some time) and index the record after that. > > How do I do this? Can I explicitly stop the indexing beginning then > start it when I have completed saving the record? I already answered this in my last mail, but it seems you snipped a bit too much ;-) cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From matt at planchant.co.uk Tue Nov 28 11:39:40 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 17:39:40 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <20061128163602.GD7968@cordoba.webit.de> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> <20061128163602.GD7968@cordoba.webit.de> Message-ID: > I already answered these in my last mail, but seems you snipped a bit > too much away ;-) Thanks :D -- Posted via http://www.ruby-forum.com/. From matt at planchant.co.uk Tue Nov 28 11:47:47 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 17:47:47 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <20061128162111.GC7968@cordoba.webit.de> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> Message-ID: <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com> Jens Kraemer wrote: >> has_many :documents, :through => :document_topics >> @document.topic_ids = @topics > actually I wonder if this works at all - I'd be surprised if you can > assign ids to a has_many :through relationship like that. Yep. It works.
> However you could try this: > > @document.disable_ferret > if @document.update_attributes(params[:document]) > @document.ferret_update > flash[:notice] = 'Document was successfully updated.' > redirect_to :action => 'show', :id => @document > else > render :action => 'edit' > end > end > I get the error below. Do I need to include anything at the top of my controller for this to work? == undefined method `disable_ferret' for nil:NilClass == -- Posted via http://www.ruby-forum.com/. From cswilliams at gmail.com Tue Nov 28 13:16:15 2006 From: cswilliams at gmail.com (Chris Williams) Date: Tue, 28 Nov 2006 19:16:15 +0100 Subject: [Ferret-talk] how to update index from a script Message-ID: <5542b84741b5b84f17e2a06392b0e458@ruby-forum.com> Hello all, I'm using AAF right now to index my ~3million db records. However, any additions to these records are added to the database through an external script so the aaf activerecord hooks will not catch any updates. Since new records are only added rarely, I figured I could just add the new records manually in ferret from some type of script. I've been looking at the ferret documentation, but I'm sort of lost about how to update the aaf ferret index from a ruby script. I was wondering if anyone had any examples on how to do this. Thanks! -Chris -- Posted via http://www.ruby-forum.com/. From nappin713 at yahoo.com Tue Nov 28 14:14:25 2006 From: nappin713 at yahoo.com (Raymond O'connor) Date: Tue, 28 Nov 2006 20:14:25 +0100 Subject: [Ferret-talk] Update/Create record only if field is true Message-ID: I have a sellable flag in my database. I'm trying to have ferret only add/update records where sellable == true. What is the best way to do this? I've tried editing instance_methods.rb in the AAF, but I still can't get it to work. Thanks for the help -- Posted via http://www.ruby-forum.com/. 
From matt at planchant.co.uk Tue Nov 28 14:33:26 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Tue, 28 Nov 2006 20:33:26 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <20061128162111.GC7968@cordoba.webit.de> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> Message-ID: <6645ae8c21a2d14dc701fa282fb5e628@ruby-forum.com> >> has_many :documents, :through => :document_topics >> @document.topic_ids = @topics > actually I wonder if this works at all Forgot to mention that topic_ids is defined elsewhere. -- Posted via http://www.ruby-forum.com/. From nappin713 at yahoo.com Tue Nov 28 14:44:26 2006 From: nappin713 at yahoo.com (Raymond O'connor) Date: Tue, 28 Nov 2006 20:44:26 +0100 Subject: [Ferret-talk] Update/Create record only if field is true In-Reply-To: References: Message-ID: I think I figured it out. I edited the following: instance_methods.rb#ferret_create line 85 I changed to: self.class.ferret_index << self.to_doc if !configure[:ignore_flag] or self.send(configure[:ignore_flag]) == true class_methods.rb#rebuild_index I made the same change to line 199: index << rec.to_doc if !configure[:ignore_flag] or self.send(configure[:ignore_flag]) == true Jens do you think you could add this functionality to AAF in a future update? It would be much appreciated! Thanks again so much for the great plugin. Cheers! -Ray -- Posted via http://www.ruby-forum.com/. From nappin713 at yahoo.com Tue Nov 28 14:50:55 2006 From: nappin713 at yahoo.com (Raymond O'connor) Date: Tue, 28 Nov 2006 20:50:55 +0100 Subject: [Ferret-talk] Update/Create record only if field is true In-Reply-To: References: Message-ID: <2cb3e4dc880ddec3e34d3fc452ee6997@ruby-forum.com> Raymond O'connor wrote: > I think I figured it out. 
I edited the following: > > instance_methods.rb#ferret_create > line 85 I changed to: > self.class.ferret_index << self.to_doc if !configure[:ignore_flag] or > self.send(configure[:ignore_flag]) == true > > class_methods.rb#rebuild_index > I made the same change to line 199: > index << rec.to_doc if !configure[:ignore_flag] or > self.send(configure[:ignore_flag]) == true > > Jens do you think you could add this functionality to AAF in a future > update? It would be much appreciated! Thanks again so much for the > great plugin. > > Cheers! > -Ray Whoops, I made a typo above. Configure should be configuration so the changes should actually read instance_methods.rb#ferret_create line 85 I changed to: self.class.ferret_index << self.to_doc if !configuration[:ignore_flag] or self.send(configuration[:ignore_flag]) == true class_methods.rb#rebuild_index I made the same change to line 199: index << rec.to_doc if !configuration[:ignore_flag] or self.send(configuration[:ignore_flag]) == true -- Posted via http://www.ruby-forum.com/. From ckozus at gmail.com Tue Nov 28 19:55:12 2006 From: ckozus at gmail.com (Carlos Kozuszko) Date: Wed, 29 Nov 2006 01:55:12 +0100 Subject: [Ferret-talk] Update/Create record only if field is true In-Reply-To: <2cb3e4dc880ddec3e34d3fc452ee6997@ruby-forum.com> References: <2cb3e4dc880ddec3e34d3fc452ee6997@ruby-forum.com> Message-ID: I want to do the same in my app. Is there a way to do this without hacking the plugin code? -- Posted via http://www.ruby-forum.com/. From wmorgan-ferret at masanjin.net Tue Nov 28 20:53:57 2006 From: wmorgan-ferret at masanjin.net (William Morgan) Date: Tue, 28 Nov 2006 17:53:57 -0800 Subject: [Ferret-talk] [ANN] sup 0.0.1 Released -- rubies for emails Message-ID: <1164765184-redwood-1705@south> Do you use rubies to read your emails? Well, sup version 0.0.1 has been released! 
http://sup.rubyforge.org Sup is an attempt to take the UI innovations of web-based email readers (ok, really just GMail) and to combine them with the traditional wholesome goodness of a console-based email client. Sup is designed to work with massive amounts of email, potentially spread out across different mbox files, IMAP folders, and GMail accounts, and to pull them all together into a single interface. The goal of Sup is to become the email client of choice for nerds everywhere. Sup is made possible thanks to Dave Balmain's ferret and Matt Armstrong's rmail. Changes: == 0.0.1 / 2006-11-28 * Initial release. Unix-centrism, support for mbox only, no i18n. Untested on anything other than 1.8.5. Other than that, works great! http://sup.rubyforge.org -- William From matt at planchant.co.uk Wed Nov 29 05:23:01 2006 From: matt at planchant.co.uk (Matthew Planchant) Date: Wed, 29 Nov 2006 11:23:01 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com> Message-ID: <464e2a3000e4f1654697287e61dffbc6@ruby-forum.com> > undefined method `disable_ferret' for nil:NilClass Anyone know what I should do about this? -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Nov 29 07:29:35 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 29 Nov 2006 13:29:35 +0100 Subject: [Ferret-talk] [ANN] sup 0.0.1 Released -- rubies for emails In-Reply-To: <1164765184-redwood-1705@south> References: <1164765184-redwood-1705@south> Message-ID: <20061129122935.GE7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 05:53:57PM -0800, William Morgan wrote: > Do you use rubies to read your emails? Well, sup version 0.0.1 has > been released! 
> > http://sup.rubyforge.org > > Sup is an attempt to take the UI innovations of web-based email > readers (ok, really just GMail) and to combine them with the > traditional wholesome goodness of a console-based email client. sounds really great, I'll definitely give it a go! Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Wed Nov 29 07:47:17 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 29 Nov 2006 13:47:17 +0100 Subject: [Ferret-talk] Index not being updated In-Reply-To: <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com> References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com> <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com> <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com> <20061128162111.GC7968@cordoba.webit.de> <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com> Message-ID: <20061129124717.GF7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 05:47:47PM +0100, Matthew Planchant wrote: > Jens Kraemer wrote: > > @document.disable_ferret > > if @document.update_attributes(params[:document]) > > @document.ferret_update > > flash[:notice] = 'Document was successfully updated.' > > redirect_to :action => 'show', :id => @document > > else > > render :action => 'edit' > > end > > end > > > > I get the error below. Do I need to include anything at the top of my > controller for this to work? > > == > undefined method `disable_ferret' for nil:NilClass > == That error means that @document (the object you call disable_ferret on) is nil, so it presumably has not been loaded (e.g. via Document.find(params[:id])) before the call. Jens -- webit!
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Wed Nov 29 08:03:56 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 29 Nov 2006 14:03:56 +0100 Subject: [Ferret-talk] Update/Create record only if field is true In-Reply-To: References: <2cb3e4dc880ddec3e34d3fc452ee6997@ruby-forum.com> Message-ID: <20061129130356.GG7968@cordoba.webit.de> On Wed, Nov 29, 2006 at 01:55:12AM +0100, Carlos Kozuszko wrote: > I want to do the same in my app. Is there a way to do this without > hacking the plugin code? not a nice one, but using the disable_ferret method you could build a custom save method that disables ferret for the next save if some condition is met, and then saves the record: def conditional_ferret_save disable_ferret if do_not_index? save end note that the disable_ferret method disables indexing for this record for the next call to save - this is definitely not thread safe, but in a Rails context this usually is not a problem. However I like the idea of having an :if option to acts_as_ferret to specify a condition (symbol pointing to a method, or a proc) telling whether indexing should take place or not. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From kraemer at webit.de Wed Nov 29 08:10:39 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 29 Nov 2006 14:10:39 +0100 Subject: [Ferret-talk] how to update index from a script In-Reply-To: <5542b84741b5b84f17e2a06392b0e458@ruby-forum.com> References: <5542b84741b5b84f17e2a06392b0e458@ruby-forum.com> Message-ID: <20061129131039.GH7968@cordoba.webit.de> On Tue, Nov 28, 2006 at 07:16:15PM +0100, Chris Williams wrote: > Hello all, > I'm using AAF right now to index my ~3million db records.
However, any > additions to these records are added to the database through an external > script so the aaf activerecord hooks will not catch any updates. Since > new records are only added rarely, I figured I could just add the new > records manually in ferret from some type of script. I've been looking > at the ferret documentation, but I'm sort of lost about how to update > the aaf ferret index from a ruby script. I was wondering if anyone had > any examples on how to do this. You'd have to mimic the way aaf uses the index in your script. The to_doc method in instance_methods.rb should be a good starting point. It would be way easier to use ActiveRecord in the script and run it through script/runner - that way aaf will catch the updates. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From john at digitalpulp.com Wed Nov 29 19:33:00 2006 From: john at digitalpulp.com (John Bachir) Date: Wed, 29 Nov 2006 19:33:00 -0500 Subject: [Ferret-talk] non-searchable columns, normalization In-Reply-To: References: Message-ID: <52723956-8FE7-463D-A448-3E378CEFDD63@digitalpulp.com> > On Nov 29, 2006, at 7:16 PM, John Bachir wrote: > > [1] A typical request will be "select id from articles where > KEYWORDS % body". Will id be indexed for fulltext searching? > clearly the fulltext index on id will never be used... id is only > in the index so that it can be returned. > > [2] I anticipate that a response might be "because id is numeric > and has a cardinality of 1/1, ferret is intelligent enough to store > this efficiently in a b-tree and not waste space or time on a > fulltext index". what about if there is a year column, against > which we will never search?
Okay, I just discovered the :index option http://ferret.davebalmain.com/api/classes/Ferret/Index/FieldInfo.html So I answered the first 2 questions :) John From john at digitalpulp.com Wed Nov 29 19:16:22 2006 From: john at digitalpulp.com (John Bachir) Date: Wed, 29 Nov 2006 19:16:22 -0500 Subject: [Ferret-talk] non-searchable columns, normalization Message-ID: Hello. I am new to Ferret. I am using it through Acts as Ferret. Let's say I have such a table, and all columns are indexed using the default behavior provided by acts_as_ferret: ARTICLES -id -year -body [1] A typical request will be "select id from articles where KEYWORDS % body". Will id be indexed for fulltext searching? clearly the fulltext index on id will never be used... id is only in the index so that it can be returned. [2] I anticipate that a response might be "because id is numeric and has a cardinality of 1/1, ferret is intelligent enough to store this efficiently in a b-tree and not waste space or time on a fulltext index". what about if there is a year column, against which we will never search? [3] This is an acts_as_ferret-specific question: are there any convenient methods to do queries against the regular (mysql) database and the ferret database at the same time? for example, still using the example above, if year (and maybe several other columns) were NOT stored in the ferret DB, is there any level of abstraction at which i can query: "select id, year, author, title from articles where KEYWORDS % body" ? thanks for any pointers. john From kraemer at webit.de Thu Nov 30 04:36:30 2006 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 30 Nov 2006 10:36:30 +0100 Subject: [Ferret-talk] [AaF] non-searchable columns, normalization In-Reply-To: References: Message-ID: <20061130093630.GL7968@cordoba.webit.de> Hi! On Wed, Nov 29, 2006 at 07:16:22PM -0500, John Bachir wrote: > Hello. I am new to Ferret. I am using it through Acts as Ferret. 
> > Let's say I have such a table, and all columns are indexed using the > default behavior provided by acts_as_ferret: > > ARTICLES > -id > -year > -body > [..] > > [3] This is an acts_as_ferret-specific question: are there any > convenient methods to do queries against the regular (mysql) database > and the ferret database at the same time? for example, still using > the example above, if year (and maybe several other columns) were NOT > stored in the ferret DB, is there any level of abstraction at which i > can query: "select id, year, author, title from articles where > KEYWORDS % body" ? You can give ActiveRecord conditions (and any other options such as :include or :order) in the second parameter hash to find_by_contents: Model.find_by_contents( query, {}, { :conditions => ['year > ?',year] } ) but be aware that these conditions will be applied *after* searching the Ferret index, and will further reduce the result set returned by ferret. This makes Ferret's :limit and :offset options quite useless. You can however use these options inside the second hash so they will be applied to the ActiveRecord find call. cheers, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From henke at mac.se Thu Nov 30 05:28:42 2006 From: henke at mac.se (Henrik Zagerholm) Date: Thu, 30 Nov 2006 11:28:42 +0100 Subject: [Ferret-talk] Fatal error when require ferret Message-ID: <63047075-8A28-4B59-A126-F4856B9AB77E@mac.se> Hello list, I just started using ferret and it really doesn't go my way. Doing gem install ferret outputs -> make install /usr/bin/install -c -m 0755 ferret_ext.so /var/lib/gems/1.8/gems/ferret-0.10.13/lib make clean Successfully installed ferret-0.10.13 Installing ri documentation for ferret-0.10.13... Installing RDoc documentation for ferret-0.10.13...

In ferret.rb ->
require 'ferret'
include Ferret

ruby ferret.rb outputs ->
./ferret.rb:3: uninitialized constant Ferret (NameError)
        from ferret.rb:2:in `require'
        from ferret.rb:2

Hmm something doesn't smell right!

Regards,
Henrik

From andreas.korth at gmx.net Thu Nov 30 06:33:00 2006
From: andreas.korth at gmx.net (Andreas Korth)
Date: Thu, 30 Nov 2006 12:33:00 +0100
Subject: [Ferret-talk] Fatal error when require ferret
In-Reply-To: <63047075-8A28-4B59-A126-F4856B9AB77E@mac.se>
References: <63047075-8A28-4B59-A126-F4856B9AB77E@mac.se>
Message-ID:

require "rubygems"
> require 'ferret'
> include Ferret

:)

From henke at mac.se Thu Nov 30 06:53:09 2006
From: henke at mac.se (Henrik Zagerholm)
Date: Thu, 30 Nov 2006 12:53:09 +0100
Subject: [Ferret-talk] SOLVED Re: Fatal error when require ferret
In-Reply-To: <63047075-8A28-4B59-A126-F4856B9AB77E@mac.se>
References: <63047075-8A28-4B59-A126-F4856B9AB77E@mac.se>
Message-ID: <4238A062-FAD5-457C-A975-FCEC6FCF27B0@mac.se>

Solved it.... :)

From kraemer at webit.de Thu Nov 30 09:13:17 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 30 Nov 2006 15:13:17 +0100
Subject: [Ferret-talk] [AaF] find_conditions in acts_as_ferret find_by_contents
In-Reply-To: <933658dad3371a2c8bbb2922069eceaa@ruby-forum.com>
References: <047ad205468d193120688c35aece2c1d@ruby-forum.com>
 <933658dad3371a2c8bbb2922069eceaa@ruby-forum.com>
Message-ID: <20061130141317.GO7968@cordoba.webit.de>

On Mon, Nov 27, 2006 at 07:29:09AM +0100, Daniel wrote:
> I think I found the problem. On line 470, this:
>
> def combine_conditions(conditions, additional_conditions)
>
> should be this:
>
> def combine_conditions(conditions, *additional_conditions)

correct, that way it works with a single string as conditions argument,
too.

> Furthermore, the rescue clause at 282 might be more useful if it
> reraised the error

I changed the rescue to only catch RecordNotFound errors, and raised the
log level to warn.

just committed.

Jens

-- webit!
Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 30 12:00:58 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 30 Nov 2006 18:00:58 +0100
Subject: [Ferret-talk] Index not being updated
In-Reply-To: <20061129124717.GF7968@cordoba.webit.de>
References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com>
 <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com>
 <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com>
 <20061128162111.GC7968@cordoba.webit.de>
 <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com>
 <20061129124717.GF7968@cordoba.webit.de>
Message-ID: <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>

I got this working by disabling ferret for a block like this:

  def update
    params[:document][:topic_ids] ||= []
    @document = Document.find(params[:id])

    @document.disable_ferret do
      @topics = (params[:topics] or []).collect { |item| item.to_i }
      @document.attributes = params[:document]
      @document.topic_ids = @topics
      @document.save
    end

    if @document.update_attributes(params[:document])
      flash[:notice] = 'Document was successfully updated.'
      redirect_to :action => 'show', :id => @document
    else
      render :action => 'edit'
    end
  end

--
Posted via http://www.ruby-forum.com/.
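
[Editor's sketch] The pattern in the message above (switch indexing off
for a block, do several saves, then index once) can be illustrated in
plain Ruby. This is only a stand-in, not acts_as_ferret's code: the
class IndexedRecord and the methods without_indexing and reindex are
hypothetical names made up for the illustration.

```ruby
# Hypothetical stand-in for a model with an after-save indexing hook.
# Not acts_as_ferret's implementation; all names are illustrative.
class IndexedRecord
  attr_reader :index_calls

  def initialize
    @index_calls = 0
    @indexing_enabled = true
  end

  # Run the block with indexing switched off, restore the previous
  # setting afterwards (even if the block raises), and return the
  # value of the block.
  def without_indexing
    previous = @indexing_enabled
    @indexing_enabled = false
    yield
  ensure
    @indexing_enabled = previous
  end

  # Stand-in for a save that would normally trigger an index update.
  def save
    reindex if @indexing_enabled
    true
  end

  def reindex
    @index_calls += 1
  end
end

record = IndexedRecord.new
saved = record.without_indexing do
  record.save   # no index update while indexing is switched off
  record.save   # still none
end
record.reindex if saved   # one explicit index update at the end
puts record.index_calls
```

Because the block's value is handed back to the caller, the result of
the last save can drive the following if statement directly, so no
second save is needed just to find out whether the first one worked.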

From kraemer at webit.de Thu Nov 30 12:21:12 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 30 Nov 2006 18:21:12 +0100
Subject: [Ferret-talk] Index not being updated
In-Reply-To: <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>
References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com>
 <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com>
 <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com>
 <20061128162111.GC7968@cordoba.webit.de>
 <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com>
 <20061129124717.GF7968@cordoba.webit.de>
 <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>
Message-ID: <20061130172112.GP7968@cordoba.webit.de>

On Thu, Nov 30, 2006 at 06:00:58PM +0100, Matthew Planchant wrote:
> I got this working by disabling ferret for a block like this:

glad to hear :-)

>   def update
>     params[:document][:topic_ids] ||= []
>     @document = Document.find(params[:id])
>
>     @document.disable_ferret do
>       @topics = (params[:topics] or []).collect { |item| item.to_i }
>       @document.attributes = params[:document]
>       @document.topic_ids = @topics
>       @document.save
>     end
>

imho it would be better to not call update_attributes in this place, as
you already saved the document inside the block above. It's one line
more for the explicit call to ferret_update but should save you one
update call to your DB.

so instead of this:
>     if @document.update_attributes(params[:document])

this should work, too:

      if @document.valid?
        @document.ferret_update
>       flash[:notice] = 'Document was successfully updated.'
>       redirect_to :action => 'show', :id => @document
>     else
>       render :action => 'edit'
>     end
>   end

you could even store the return value from the save call inside the
block and use that in the if statement...

Jens

-- webit!
Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From matt at planchant.co.uk Thu Nov 30 12:35:54 2006
From: matt at planchant.co.uk (Matthew Planchant)
Date: Thu, 30 Nov 2006 18:35:54 +0100
Subject: [Ferret-talk] Index not being updated
In-Reply-To: <20061130172112.GP7968@cordoba.webit.de>
References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com>
 <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com>
 <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com>
 <20061128162111.GC7968@cordoba.webit.de>
 <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com>
 <20061129124717.GF7968@cordoba.webit.de>
 <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>
 <20061130172112.GP7968@cordoba.webit.de>
Message-ID:

> imho it would be better to not call update_attributes in this place, as
> you already saved the document inside the block above. It's one line
> more for the explicit call to ferret_update but should save you one
> update call to your DB.
>
> so instead of this:
>>     if @document.update_attributes(params[:document])
>
> this should work, too:
>
>       if @document.valid?
>         @document.ferret_update
>>       flash[:notice] = 'Document was successfully updated.'
>>       redirect_to :action => 'show', :id => @document
>>     else
>>       render :action => 'edit'
>>     end
>>   end
>
> you could even store the return value from the save call inside the
> block and use that in the if statement...

Thanks for the reply Jens.

Is this because:

  @document.attributes = params[:document]

does the same as:

  @document.update_attributes(params[:document])

?

--
Posted via http://www.ruby-forum.com/.
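
[Editor's sketch] The difference being asked about can be shown with a
small plain-Ruby stand-in. In ActiveRecord, attributes= only assigns
values in memory, while update_attributes assigns them and then calls
save. FakeDocument below is a hypothetical class that mimics just that
relationship and counts saves; it is not ActiveRecord itself.

```ruby
# Hypothetical stand-in for an ActiveRecord model; it only counts saves.
class FakeDocument
  attr_reader :title, :save_count

  def initialize
    @title = nil
    @save_count = 0
  end

  # Assign attributes in memory only; nothing is persisted here.
  def attributes=(attrs)
    attrs.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  # Stand-in for writing the record to the database.
  def save
    @save_count += 1
    true
  end

  # Assign and save in one call.
  def update_attributes(attrs)
    self.attributes = attrs
    save
  end
end

params = { :title => 'Ferret notes' }

# Two explicit steps: assign, then save.
two_step = FakeDocument.new
two_step.attributes = params
two_step.save

# One call that does both.
one_step = FakeDocument.new
one_step.update_attributes(params)

# Doing both, as in the controller quoted above, saves twice.
both = FakeDocument.new
both.attributes = params
both.save
both.update_attributes(params)

puts two_step.save_count
puts one_step.save_count
puts both.save_count
```

The third case is why a controller that already saves inside the block
gains nothing from a later update_attributes call with the same
parameters: it only adds a second write.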

From kraemer at webit.de Thu Nov 30 12:40:16 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 30 Nov 2006 18:40:16 +0100
Subject: [Ferret-talk] Index not being updated
In-Reply-To: <20061130172112.GP7968@cordoba.webit.de>
References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com>
 <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com>
 <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com>
 <20061128162111.GC7968@cordoba.webit.de>
 <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com>
 <20061129124717.GF7968@cordoba.webit.de>
 <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>
 <20061130172112.GP7968@cordoba.webit.de>
Message-ID: <20061130174016.GQ7968@cordoba.webit.de>

Update:

As it turned out I have a very similar situation in some controller I'm
just writing, so I just committed a small tweak to make it less noisy:

  def update
    params[:document][:topic_ids] ||= []
    @document = Document.find(params[:id])

    # this will call ferret_update after the block, if the block evals
    # to true (which is the case when save succeeds). as a bonus, it
    # also returns the value returned by the block so we can use it in
    # the if statement :-)
    if @document.disable_ferret(:index_when_true) {
        @topics = (params[:topics] or []).collect { |item| item.to_i }
        @document.attributes = params[:document]
        @document.topic_ids = @topics
        @document.save
      }
      flash[:notice] = 'Document was successfully updated.'
      redirect_to :action => 'show', :id => @document
    else
      render :action => 'edit'
    end
  end

the method name 'disable_ferret' doesn't fit this usage pattern that
nicely, I'm open to suggestions ;-)

cheers,
Jens

-- webit!
Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From kraemer at webit.de Thu Nov 30 12:41:22 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 30 Nov 2006 18:41:22 +0100
Subject: [Ferret-talk] Index not being updated
In-Reply-To:
References: <80feb33cd993bed52e00ced96788c15f@ruby-forum.com>
 <19dbdc03be629ab1e3638fcd61494e65@ruby-forum.com>
 <21fb8bde88c861b0c0dc3097163fdfc4@ruby-forum.com>
 <20061128162111.GC7968@cordoba.webit.de>
 <6e24c047cc52e8238320a7bc25b22e5d@ruby-forum.com>
 <20061129124717.GF7968@cordoba.webit.de>
 <90d0a6b7fc99e2b32e6eb7d9322b23fa@ruby-forum.com>
 <20061130172112.GP7968@cordoba.webit.de>
Message-ID: <20061130174122.GR7968@cordoba.webit.de>

On Thu, Nov 30, 2006 at 06:35:54PM +0100, Matthew Planchant wrote:
>
> > imho it would be better to not call update_attributes in this place, as
> > you already saved the document inside the block above. It's one line
> > more for the explicit call to ferret_update but should save you one
> > update call to your DB.
> >
> > so instead of this:
> >>     if @document.update_attributes(params[:document])
> >
> > this should work, too:
> >
> >       if @document.valid?
> >         @document.ferret_update
> >>       flash[:notice] = 'Document was successfully updated.'
> >>       redirect_to :action => 'show', :id => @document
> >>     else
> >>       render :action => 'edit'
> >>     end
> >>   end
> >
> > you could even store the return value from the save call inside the
> > block and use that in the if statement...
>
> Thanks for the reply Jens.
>
> Is this because:
>
>   @document.attributes = params[:document]
>
> does the same as:
>
>   @document.update_attributes(params[:document])

nearly ;-)

  @document.attributes = params[:document]
  @document.save

does the same as

  @document.update_attributes(params[:document])

Jens

-- webit!
Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

From john at digitalpulp.com Thu Nov 30 12:50:32 2006
From: john at digitalpulp.com (John Bachir)
Date: Thu, 30 Nov 2006 12:50:32 -0500
Subject: [Ferret-talk] usage and benefits of single-index with AAF
Message-ID: <03293525-F6B0-4BA5-B268-A06AD2965C75@digitalpulp.com>

The documentation states:

"single_index: set this to true to let this class use a Ferret index
that is shared by all classes having :single_index set to true.
:store_class_name is set to true implicitly, as well as index_dir, so
don't bother setting these when using this option. the shared index
will be located in index//shared ."

[1] If I'm reading the code correctly, it seems that single-model
searches will behave the same as before, and AAF/Ferret will add the
extra column to the query for me. Is this correct?

[2] How can I take advantage of the single-index when doing multi-
model searches? Through which model will I perform the query? Or do I
need to do this with raw Ferret queries and not through AAF?

[3] Are there other advantages or gotchas that I'm missing?

Thanks,
John