requiring other libs.
Josh Icuss
jicuss at gmail.com
Tue Apr 17 14:09:35 EDT 2007
Sure. This is what I was working on for IFilm. I'm looking to define classes
for each object to make them easier to work with and to store in a DB later.
I've noticed that Mousehole now supports SQLite databases. I like the direction
you're taking the project. One thing I like about Scrapi is the built-in class
definitions.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
require 'tidy'
require 'scrapi'

list = (page/"div.similarlist"/'div.related_meta_data')

related_media = Scraper.define do
  process "a.title_link", :description => :text, :url => "@href"
  process "p.stats", :stats => :text
  result :description, :url, :stats
end

class IFilm < MouseHole::App
  title "IFilm Ad remover"
  namespace ''
  description 'removes ifilm ads'
  version "0.1"
  + url("http://*.ifilm.com/*")
  # + url("http://www.ifilm.com/")

  def rewrite(page)
    (document/"#HEADERAD").remove
    (document/"div.ad-rectangle").inner_html = "funtimes"
    similar_videos = (document/"div.similar_videos_div")
    (document/"div[@id='SUPPLEMENT']").inner_html = similar_videos
    (document/"div[@id='comment_box']").remove
    (document/"div[@id='MYIFILM_BUMP']").remove
    (document/"div[@id='FOOTER']").remove
    (document/"h3[@id='UPLOAD']").remove
    (document/"div[@id='HEADER']").remove
    (document/"div[@id='TASKBAR']").remove

    list = (document/"div.similarlist"/'div.related_meta_data')
    media_urls = list.collect { |x| related_media.scrape(x.inner_html).url }

    string = "<div id='nu'>"
    media_urls.each do |x|
      string << "<p>" << x << "</p>"
    end
    string << "</div>"
    (document/"div[@id='SUPPLEMENT']").inner_html = string
  end
end
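
As a rough, gem-free sketch of the class-per-object idea (plain Ruby only, no
Scrapi or Hpricot): RelatedMedia, SAMPLE_MEDIA, urls_to_html and sample_html
are hypothetical names, and the example.com URLs are made-up sample data. It
also illustrates one Ruby scoping rule that may matter for the script above,
separately from the require problem: a top-level local variable such as
related_media is not visible inside a method body like rewrite, whereas a
constant is.

```ruby
# Hypothetical value class for one related-media entry (not a Scrapi class).
RelatedMedia = Struct.new(:description, :url, :stats)

# Made-up sample data. A constant is reachable from inside method bodies,
# unlike a local variable assigned at the top level of the script.
SAMPLE_MEDIA = [
  RelatedMedia.new("First clip", "http://example.com/video/1", "10 views"),
  RelatedMedia.new("Second clip", "http://example.com/video/2", "25 views")
].freeze

# Build the replacement markup: one <p> per URL, wrapped in <div id='nu'>.
def urls_to_html(media)
  paragraphs = media.map { |m| "<p>#{m.url}</p>" }.join
  "<div id='nu'>#{paragraphs}</div>"
end

# Reads the constant from inside a method, which a top-level local could not do.
def sample_html
  urls_to_html(SAMPLE_MEDIA)
end
```

Under these assumptions, the scraper definition could be held in a constant in
the same way, and the string returned by urls_to_html could be assigned to
inner_html as the script already does.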
On 4/14/07, Leslie Wu <lwu.two at gmail.com> wrote:
>
> Ah, you posted this to mousehole-scripters.
>
> Do you have a sample where this breaks?
>
> ~L
>
> On 4/11/07, Josh Icuss < jicuss at gmail.com> wrote:
> >
> > I'm really digging the new syntax. Hpricot is a well-developed choice.
> > I'm having trouble loading the Scrapi gem. Any known issues?
> > require 'tidy' or require 'scrapi' causes the script not to load. Also,
> > could anyone provide a quick insert_before example?