requiring other libs.

Josh Icuss jicuss at gmail.com
Tue Apr 17 14:09:35 EDT 2007


Sure. This is what I was working on for IFilm. I'm looking to define classes
for each object to make them easier to work with and to store in a DB later.
I've noticed that Mousehole now supports SQLite databases. I like the direction
you're taking the project. One thing I like about Scrapi is the built-in class
definitions.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

require 'tidy'
require 'scrapi'

# Scraper for one related-video entry. It is held in a constant so that it
# is visible inside IFilm#rewrite; a top-level local variable would not be.
RelatedMediaScraper = Scraper.define do
  process "a.title_link", :description => :text, :url => "@href"
  process "p.stats", :stats => :text
  result :description, :url, :stats
end

class IFilm < MouseHole::App
  title "IFilm Ad remover"
  namespace ''
  description 'removes ifilm ads'
  version "0.1"
  + url("http://*.ifilm.com/*")
  # + url("http://www.ifilm.com/")

  def rewrite(page)
    # Strip the ads and the page chrome.
    (document/"#HEADERAD").remove
    (document/"div.ad-rectangle").inner_html = "funtimes"
    similar_videos = (document/"div.similar_videos_div")
    (document/"div[@id='SUPPLEMENT']").inner_html = similar_videos
    (document/"div[@id='comment_box']").remove
    (document/"div[@id='MYIFILM_BUMP']").remove
    (document/"div[@id='FOOTER']").remove
    (document/"h3[@id='UPLOAD']").remove
    (document/"div[@id='HEADER']").remove
    (document/"div[@id='TASKBAR']").remove

    # Scrape the URL out of each related-video block.
    list = (document/"div.similarlist"/'div.related_meta_data')
    media_urls = list.collect { |x| RelatedMediaScraper.scrape(x.inner_html).url }

    # Rebuild the SUPPLEMENT div as a plain list of those URLs.
    string = "<div id='nu'>"
    media_urls.each do |x|
      string << "<p>" << x << "</p>"
    end
    string << "</div>"
    (document/"div[@id='SUPPLEMENT']").inner_html = string
  end
end
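
As a rough sketch of the "class per object" idea mentioned above, here is
one way the scraped fields could be held in a plain Ruby object before they
go into a DB. This is not part of the script: MediaItem, escape_html and
related_media_html are made-up names, the sample URL is a placeholder, and
the only assumption carried over is that each item has the description, url
and stats fields the scraper returns. It uses nothing outside core Ruby.

```ruby
# Characters that need escaping before text goes into HTML.
HTML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&#39;'
}.freeze

# Escape text so it can be placed safely inside markup or an attribute.
def escape_html(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# Hypothetical value class for one related video. The field names mirror
# the scraper's result fields.
MediaItem = Struct.new(:description, :url, :stats) do
  # Render the item as a paragraph containing a link.
  def to_html
    "<p><a href='#{escape_html(url)}'>#{escape_html(description)}</a></p>"
  end
end

# Build the replacement markup for the SUPPLEMENT div from a list of items.
def related_media_html(items)
  "<div id='nu'>" + items.map { |item| item.to_html }.join + "</div>"
end

# Placeholder data, not taken from a real page.
items = [MediaItem.new("Tom & Jerry", "http://www.ifilm.com/video/1", "10 views")]
puts related_media_html(items)
```

In rewrite, the scraped results could be wrapped in MediaItem objects and
the output of related_media_html assigned to the SUPPLEMENT div instead of
building the string by hand.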






On 4/14/07, Leslie Wu <lwu.two at gmail.com> wrote:
>
> Ah, you posted this to mousehole-scripters.
>
> Do you have a sample where this breaks?
>
> ~L
>
> On 4/11/07, Josh Icuss <jicuss at gmail.com> wrote:
> >
> > I'm really digging the new syntax. Hpricot is a well-developed choice.
> > I'm having trouble loading the Scrapi gem. Are there any known issues?
> > Either require 'tidy' or require 'scrapi' causes the script not to load.
> > Also, could anyone provide a quick insert_before example?
> >
>
>