From jicuss at gmail.com Tue Apr 10 16:37:41 2007
From: jicuss at gmail.com (Josh Icuss)
Date: Tue, 10 Apr 2007 13:37:41 -0700
Subject: mousehole 2 syntax
Message-ID:

The syntax seems to have changed since I last used mousehole. What is the
current way to select a div with an id of 'header' and delete it? Any
references I could look at?

From jowensbysandifer at gmail.com Tue Apr 10 18:29:54 2007
From: jowensbysandifer at gmail.com (Jessica Owensby-Sandifer)
Date: Tue, 10 Apr 2007 18:29:54 -0400
Subject: mousehole 2 syntax
In-Reply-To:
References:
Message-ID: <777ac5ee0704101529v7889f15bve35cea64a705d1bf@mail.gmail.com>

The Hpricot showcase is a good reference. Did you try something like...

> doc.search('#header').remove
> doc.to_html

?

Jessica

On 4/10/07, Josh Icuss wrote:
> The syntax seems to have changed since I last used mousehole. What is
> the current way to select a div with an id of 'header' and delete it? Any
> references I could look at?
>
> _______________________________________________
> Mousehole-scripters mailing list
> Mousehole-scripters at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mousehole-scripters

From why at hobix.com Tue Apr 10 21:05:41 2007
From: why at hobix.com (why the lucky stiff)
Date: Tue, 10 Apr 2007 20:05:41 -0500
Subject: mousehole 2 syntax
In-Reply-To:
References:
Message-ID: <20070411010541.GO36970@beekeeper.hobix.com>

On Tue, Apr 10, 2007 at 01:37:41PM -0700, Josh Icuss wrote:
> The syntax seems to have changed since I last used mousehole. What is
> the current way to select a div with an id of 'header' and delete it? Any
> references I could look at?
Yeah, MH2 uses Hpricot:

  http://code.whytheluckystiff.net/doc/hpricot/ <- docs
  http://code.whytheluckystiff.net/hpricot/ <- showcase

So, your example would go like:

  (document/"#header").remove

_why

From jicuss at gmail.com Thu Apr 12 02:21:02 2007
From: jicuss at gmail.com (Josh Icuss)
Date: Wed, 11 Apr 2007 23:21:02 -0700
Subject: requiring other libs
Message-ID:

I'm really digging the new syntax. Hpricot is a well-developed choice.
Having trouble loading the Scrapi gem. Any known issues? require 'tidy' or
require 'scrapi' causes the script not to load. Also, could anyone provide a
quick insert_before example?

From lwu.two at gmail.com Sat Apr 14 18:24:28 2007
From: lwu.two at gmail.com (Leslie Wu)
Date: Sat, 14 Apr 2007 15:24:28 -0700
Subject: requiring other libs
In-Reply-To:
References:
Message-ID: <78ec96990704141524s25160fe0i79fb651ba0ece751@mail.gmail.com>

% gem list --local | grep "scrapi\|tidy\|hpricot"
hpricot (0.5.134, ...)
scrapi (1.2.0)
tidy (1.1.2)
% ruby -v
ruby 1.8.6 (2007-02-19 patchlevel 5000) [i686-darwin8.8.1]
% irb
require 'scrapi'
require 'hpricot'
# Works for me?

Google found this use of insert_before:
http://shanesbrain.net/articles/2006/10/02/screen-scraping-wikipedia

~L

On 4/11/07, Josh Icuss wrote:
> I'm really digging the new syntax. Hpricot is a well-developed choice.
> Having trouble loading the Scrapi gem. Any known issues?
> require 'tidy' or require 'scrapi' causes the script not to load. Also,
> could anyone provide a quick insert_before example?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://rubyforge.org/pipermail/mousehole-scripters/attachments/20070414/defb2bec/attachment.html

From jicuss at gmail.com Tue Apr 17 14:09:35 2007
From: jicuss at gmail.com (Josh Icuss)
Date: Tue, 17 Apr 2007 11:09:35 -0700
Subject: requiring other libs.
Message-ID:

Sure. This is what I was working on for IFilm. I'm looking to define classes
for each object to make it easier to work with and later store in a DB. I've
noticed that Mousehole now supports SQLite databases. I like the direction
you're taking the project. One thing I like about Scrapi is the built-in
class definitions.

------------------------------------------------------------------------

require 'tidy'
require 'scrapi'

list=(page/"div.similarlist"/'div.related_meta_data')

related_media = Scraper.define do
  process "a.title_link", :description=>:text, :url=>"@href"
  process "p.stats", :stats=>:text
  result :description, :url, :stats
end

class IFilm < MouseHole::App
  title "IFilm Ad remover"
  namespace ''
  description 'removes ifilm ads'
  version "0.1"
  + url("http://*.ifilm.com/*")
  # + url("http://www.ifilm.com/")

  def rewrite(page)
    (document/"#HEADERAD").remove
    (document/"div.ad-rectangle").inner_html="funtimes"
    similar_videos=(document/"div.similar_videos_div")
    (document/"div[@id='SUPPLEMENT']").inner_html=similar_videos
    (document/"div[@id='comment_box']").remove
    (document/"div[@id='MYIFILM_BUMP']").remove
    (document/"div[@id='FOOTER']").remove
    (document/"h3[@id='UPLOAD']").remove
    (document/"div[@id='HEADER']").remove
    (document/"div[@id='TASKBAR']").remove
    list=(document/"div.similarlist"/'div.related_meta_data')
    media_urls=list.collect{|x| related_media.scrape(x.inner_html).url}

    string=""
    media_urls.each do |x|
      string << "" << x << ""
    end
    string << ""
    (document/"div[@id='SUPPLEMENT']").inner_html=string
  end
end

On 4/14/07, Leslie Wu wrote:
> Ah, you posted this to mousehole-scripters.
>
> Do you have a sample where this breaks?
>
> ~L

From lwu.two at gmail.com Wed Apr 18 01:12:33 2007
From: lwu.two at gmail.com (Leslie Wu)
Date: Tue, 17 Apr 2007 22:12:33 -0700
Subject: mouseHole... a balloon!
Message-ID: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com>

Anyone want to give this a try?!

http://balloon.hobix.com/mouseHole

~L

From lwu.two at gmail.com Wed Apr 18 02:51:07 2007
From: lwu.two at gmail.com (Leslie Wu)
Date: Tue, 17 Apr 2007 23:51:07 -0700
Subject: requiring other libs.
In-Reply-To:
References:
Message-ID: <78ec96990704172351w4c440b19g4d2cb5aec67316d1@mail.gmail.com>

If I comment out the first instance of 'list=(page/"div..." I'm able to get
the script to load properly. As is, note that mouseHole will complain on the
command line that the script is broken (and also on the mouseHole apps page).

~L

On 4/17/07, Josh Icuss wrote:
> Sure. This is what I was working on for IFilm.
> I'm looking to define classes for each object to make it easier to work
> with and later store in a DB. I've noticed that Mousehole now supports
> SQLite databases. I like the direction you're taking the project. One thing
> I like about Scrapi is the built-in class definitions.

From joe.mauriello at gmail.com Wed Apr 18 09:52:35 2007
From: joe.mauriello at gmail.com (Joe Mauriello)
Date: Wed, 18 Apr 2007 09:52:35 -0400
Subject: mouseHole... a balloon!
In-Reply-To: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com>
References: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com>
Message-ID: <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>

Gave it a try and it worked perfectly. This is great! Thanks.

On 4/18/07, Leslie Wu wrote:
> Anyone want to give this a try?!
>
> http://balloon.hobix.com/mouseHole
>
> ~L

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://rubyforge.org/pipermail/mousehole-scripters/attachments/20070418/ecaec0d3/attachment.html

From jicuss at gmail.com Wed Apr 18 14:04:40 2007
From: jicuss at gmail.com (Josh Icuss)
Date: Wed, 18 Apr 2007 11:04:40 -0700
Subject: mouseHole... a balloon!
In-Reply-To: <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>
References: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com> <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>
Message-ID:

Awesome! I love those balloons. I had never heard of them; very convenient,
like an inflated gem.

On 4/18/07, Joe Mauriello wrote:
> Gave it a try and it worked perfectly. this is great! Thanks.

--
Joshua Icuss - Biomedical Engineering Undergraduate, UCI

From edheil at edheil.com Wed Apr 18 15:05:10 2007
From: edheil at edheil.com (Edward Heil)
Date: Wed, 18 Apr 2007 15:05:10 -0400
Subject: mouseHole... a balloon!
In-Reply-To: <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>
References: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com> <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>
Message-ID:

On Apr 18, 2007, at 9:52 AM, Joe Mauriello wrote:
> Gave it a try and it worked perfectly. this is great! Thanks.
>

Gave it a try with a recently compiled latest-and-greatest-stable Ruby and
got...

Select which gem to install for your platform (i686-darwin8.9.1)
 1. mongrel 0.3.14 (ruby)
 2. mongrel 0.3.13.4 (mswin32)
 3. Cancel installation
> 1
RuntimeError: cgi_multipart_eof_fix requires Ruby version <= 1.8.5

:(

Contrariwise, if I use the musty old Ruby 1.8.2 shipped with OS X Tiger, I
get...

Select which gem to install for your platform (universal-darwin8.0)
 1. mongrel 0.3.14 (ruby)
 2. mongrel 0.3.13.4 (mswin32)
 3. Cancel installation
> 1
RuntimeError: mongrel requires Ruby version >= 1.8.4

I've got to go back and get ruby 1.8.5 to hit the sweet spot?

From joe.mauriello at gmail.com Thu Apr 19 15:32:40 2007
From: joe.mauriello at gmail.com (Joe Mauriello)
Date: Thu, 19 Apr 2007 14:32:40 -0500
Subject: mouseHole... a balloon!
In-Reply-To:
References: <78ec96990704172212y5a1ebfd7ode31dbeae2c2cd7c@mail.gmail.com> <1f7424050704180652m19a0b06fi5d1a1381dadbf763@mail.gmail.com>
Message-ID: <1f7424050704191232n74f843e8yc851307535a57f0f@mail.gmail.com>

Hey,

Actually... I got the same... I sent that email a little early, and then
never corrected it. Sorry for the confusion. I am also running 1.8.4
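
Going only by the two RuntimeError messages quoted in this thread, the workable Ruby range can be checked mechanically. The sketch below is a minimal illustration using RubyGems' Gem::Version and Gem::Requirement classes. The names CONSTRAINTS, CANDIDATES, blockers and compatible? are made up for this example, and the constraints are copied from the error text, not read from the gems' actual specs. Joe reports hitting the same problem on 1.8.4, so the real dependency picture may be more involved than these two messages suggest.

```ruby
# Sketch: which Ruby versions satisfy both constraints reported in the
# install errors in this thread? Uses only RubyGems' own version classes.
require 'rubygems'

# Assumption: constraints are taken verbatim from the two RuntimeError
# messages, not from the gems' real gemspecs.
CONSTRAINTS = {
  'cgi_multipart_eof_fix' => Gem::Requirement.new('<= 1.8.5'),
  'mongrel'               => Gem::Requirement.new('>= 1.8.4')
}

# Ruby versions mentioned by posters in this thread.
CANDIDATES = %w[1.8.2 1.8.4 1.8.5 1.8.6]

# Returns the names of the gems whose constraint rejects this Ruby version.
def blockers(version_string)
  version = Gem::Version.new(version_string)
  CONSTRAINTS.reject { |_name, req| req.satisfied_by?(version) }.keys
end

# True when no listed constraint rejects the version.
def compatible?(version_string)
  blockers(version_string).empty?
end

CANDIDATES.each do |v|
  blocked_by = blockers(v)
  status = blocked_by.empty? ? 'ok' : "blocked by #{blocked_by.join(', ')}"
  puts "ruby #{v}: #{status}"
end
```

Under these two constraints alone, 1.8.4 and 1.8.5 pass, 1.8.2 is rejected by mongrel, and 1.8.6 is rejected by cgi_multipart_eof_fix, which matches Edward's guess about the sweet spot.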