requiring other libs.
Leslie Wu
lwu.two at gmail.com
Wed Apr 18 02:51:07 EDT 2007
If I comment out the first instance of 'list=(page/"div...' (the one at the
top level, before the class definition), I'm able to get the script to load
properly. As written, MouseHole complains on the command line that the
script is broken (and also on the MouseHole apps page).
~L
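
A minimal sketch of why that top-level line would likely break loading,
assuming the script is evaluated once at load time, before any page object
exists. The constants and helper names below are hypothetical, and eval only
stands in for MouseHole's real loader, which is not shown in this thread.

```ruby
# Hypothetical stand-ins for a user script; eval mimics "loading" it.
# This is NOT MouseHole's actual loader, only an illustration.

# Top-level code runs at load time, when no `page` exists yet.
BROKEN_SCRIPT = <<~SCRIPT
  list = (page / "div.similarlist")
SCRIPT

# Inside a method body, `page` is only looked up when the method is called.
FIXED_SCRIPT = <<~SCRIPT
  def rewrite_sketch(page)
    page / "div.similarlist"
  end
SCRIPT

# Returns true if the source evaluates without a NameError.
def loads_cleanly?(source)
  eval(source, TOPLEVEL_BINDING)
  true
rescue NameError
  false
end

puts loads_cleanly?(BROKEN_SCRIPT)
puts loads_cleanly?(FIXED_SCRIPT)
```

Under these assumptions, moving the lookup into rewrite (as the later
'list=(document/...' line in the quoted script already does) avoids the
load-time failure.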
On 4/17/07, Josh Icuss <jicuss at gmail.com> wrote:
>
> Sure. This is what I was working on for IFilm. I'm looking to define
> classes for each object to make it easier to work with and later store in a
> DB. I've noticed that MouseHole now supports SQLite databases. I like the
> direction you're taking the project. One thing I like about Scrapi is the
> built-in class definitions.
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> require 'tidy'
> require 'scrapi'
>
>
> list=(page/"div.similarlist"/'div.related_meta_data')
>
> related_media = Scraper.define do
>   process "a.title_link", :description=>:text, :url=>"@href"
>   process "p.stats", :stats=>:text
>   result :description, :url, :stats
> end
>
> class IFilm < MouseHole::App
>   title "IFilm Ad remover"
>   namespace ''
>   description 'removes ifilm ads'
>   version "0.1"
>   + url("http://*.ifilm.com/*")
>   # + url("http://www.ifilm.com/")
>
>   def rewrite(page)
>     (document/"#HEADERAD").remove
>     (document/"div.ad-rectangle").inner_html="funtimes"
>     similar_videos=(document/"div.similar_videos_div")
>     (document/"div[@id='SUPPLEMENT']").inner_html=similar_videos
>     (document/"div[@id='comment_box']").remove
>     (document/"div[@id='MYIFILM_BUMP']").remove
>     (document/"div[@id='FOOTER']").remove
>     (document/"h3[@id='UPLOAD']").remove
>     (document/"div[@id='HEADER']").remove
>     (document/"div[@id='TASKBAR']").remove
>     list=(document/"div.similarlist"/'div.related_meta_data')
>     media_urls=list.collect{|x| related_media.scrape(x.inner_html).url}
>
>     string="<div id='nu'>"
>     media_urls.each do |x|
>       string << "<p>" << x << "</p>"
>     end
>     string << "</div>"
>     (document/"div[@id='SUPPLEMENT']").inner_html=string
>   end
> end
>
>
>
>
>
>
> On 4/14/07, Leslie Wu <lwu.two at gmail.com> wrote:
> >
> > Ah, you posted this to mousehole-scripters.
> >
> > Do you have a sample where this breaks?
> >
> > ~L
> >
> > On 4/11/07, Josh Icuss < jicuss at gmail.com> wrote:
> > >
> > > I'm really digging the new syntax. Hpricot is a well-developed choice.
> > > I'm having trouble loading the Scrapi gem. Any known issues?
> > > Either require 'tidy' or require 'scrapi' causes the script not to load.
> > > Also, could anyone provide a quick insert_before example?
> > >
> > > _______________________________________________
> > > Mousehole-scripters mailing list
> > > Mousehole-scripters at rubyforge.org
> > > http://rubyforge.org/mailman/listinfo/mousehole-scripters
> > >
> >
> >
>
>