requiring other libs.

Leslie Wu lwu.two at gmail.com
Wed Apr 18 02:51:07 EDT 2007


If I comment out the first instance of 'list=(page/"div...', I'm able to get
the script to load properly. As written, MouseHole complains on the
command line (and also on the MouseHole apps page) that the script is
broken.
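
A likely explanation, sketched below in plain Ruby: a line at the top level of a script runs as soon as the script is loaded, and at that point nothing named `page` exists, so the load fails with a NameError. The same expression inside `rewrite` only runs when the method is called. `try_load` here is a hypothetical stand-in for a script loader, not MouseHole's actual loading code, and the two script bodies are cut-down illustrations.

```ruby
# Hypothetical stand-in for a script loader (NOT MouseHole's real loader):
# evaluate the script source and report whether it loaded.
def try_load(source)
  Object.new.instance_eval(source)
  :loaded
rescue NameError
  # Raised when the script refers to an undefined name while loading.
  :broken
end

# Top-level use of `page`: evaluated at load time, when `page` is undefined.
BROKEN_SCRIPT = <<~'RUBY'
  list = (page / "div.similarlist")
  def rewrite(page)
    page
  end
RUBY

# Same expression moved inside the method: only evaluated when rewrite runs.
FIXED_SCRIPT = <<~'RUBY'
  def rewrite(page)
    list = (page / "div.similarlist")
    list
  end
RUBY

puts try_load(BROKEN_SCRIPT)
puts try_load(FIXED_SCRIPT)
```

The same scoping rule suggests a second thing to check: a local variable assigned at the top level of a script (such as `related_media`) is not visible inside a method body defined with `def`, so it may need to become a constant or move into the class.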

~L

On 4/17/07, Josh Icuss <jicuss at gmail.com> wrote:
>
> Sure. This is what I was working on for IFilm. I'm looking to define
> classes for each object to make it easier to work with and later store in a
> DB. I've noticed that MouseHole now supports SQLite databases. I like the
> direction you're taking the project. One thing I like about Scrapi is the
> built-in class definitions.
>
> ----------------------------------------------------------------------
>
> require 'tidy'
> require 'scrapi'
>
>
> list=(page/"div.similarlist"/'div.related_meta_data')
>
> related_media = Scraper.define do
>   process "a.title_link", :description=>:text, :url=>"@href"
>   process "p.stats", :stats=>:text
>   result :description, :url, :stats
> end
>
> class IFilm < MouseHole::App
>   title "IFilm Ad remover"
>   namespace ''
>   description 'removes ifilm ads'
>   version "0.1"
>   + url("http://*.ifilm.com/*")
>   # + url("http://www.ifilm.com/")
>
>   def rewrite(page)
>     (document/"#HEADERAD").remove
>     (document/"div.ad-rectangle").inner_html="funtimes"
>     similar_videos=(document/"div.similar_videos_div")
>     (document/"div[@id='SUPPLEMENT']").inner_html=similar_videos
>     (document/"div[@id='comment_box']").remove
>     (document/"div[@id='MYIFILM_BUMP']").remove
>     (document/"div[@id='FOOTER']").remove
>     (document/"h3[@id='UPLOAD']").remove
>     (document/"div[@id='HEADER']").remove
>     (document/"div[@id='TASKBAR']").remove
>     list=(document/"div.similarlist"/'div.related_meta_data')
>     media_urls=list.collect{|x| related_media.scrape(x.inner_html).url}
>
>     string="<div id='nu'>"
>     media_urls.each do |x|
>       string << "<p>" << x << "</p>"
>     end
>     string << "</div>"
>     (document/"div[@id='SUPPLEMENT']").inner_html=string
>   end
> end
>
>
>
>
>
>
> On 4/14/07, Leslie Wu <lwu.two at gmail.com> wrote:
> >
> > Ah, you posted this to mousehole-scripters.
> >
> > Do you have a sample where this breaks?
> >
> > ~L
> >
> > On 4/11/07, Josh Icuss <jicuss at gmail.com> wrote:
> > >
> > > I'm really digging the new syntax. Hpricot is a well-developed choice.
> > > I'm having trouble loading the Scrapi gem. Any known issues?
> > > Either require 'tidy' or require 'scrapi' causes the script not to load.
> > > Also, could anyone provide a quick insert_before example?
> > >
> > > _______________________________________________
> > > Mousehole-scripters mailing list
> > > Mousehole-scripters at rubyforge.org
> > > http://rubyforge.org/mailman/listinfo/mousehole-scripters
> > >
> >
> >