From boghra at gmail.com Mon Aug 1 05:09:41 2011
From: boghra at gmail.com (boghra)
Date: Mon, 1 Aug 2011 02:09:41 -0700
Subject: [Mechanize-users] weird encoding issue
Message-ID:

Hi,

This is my first day with Mechanize, and I am trying to access the wall
posts of a Facebook fan page, i.e. facebook.com/walmart?sk=wall. Example:

require 'rubygems'
require 'mechanize'

agent = Mechanize.new
agent.user_agent_alias = 'Windows Mozilla'
page = agent.get('http://www.facebook.com')

form = page.form_with(:id => 'login_form')
# just removed actual username password
form.email = 'fb-user-name'
form.pass = 'fb-password'
page = agent.submit(form)

page = agent.get("/walmart?sk=wall")
span = page.search "//span[@class='messageBody']"
puts span

"puts span" doesn't print anything (the actual page does have spans with the
given class name), and if I print page.body, every HTML opening tag '<' is
printed as \u and the HTML is escaped. I'm not sure how to tell it to use
UTF encoding, or why things get escaped. Please advise!

Thanks,
Boghra

From boghra at gmail.com Tue Aug 2 00:05:26 2011
From: boghra at gmail.com (boghra)
Date: Mon, 1 Aug 2011 21:05:26 -0700
Subject: [Mechanize-users] weird encoding issue
In-Reply-To:
References:
Message-ID:

I'm still stuck on this issue and still not sure what is going on. Any tips
to narrow down the problem?

Many thanks!
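
One possible cause of the escaping described above, which nothing in this
thread confirms, is that the markup arrives inside a script block as a JSON
string, where '<' is written as \u003c. If that is the case, it is not a
character-encoding problem, and the XPath search finds nothing because those
spans are not part of the parsed document. The sketch below decodes such a
body using only Ruby's standard json library. SAMPLE_BODY, JSON_STRING and
decode_escaped_html are hypothetical names invented for illustration, and the
sample markup is made up, not captured from Facebook.

```ruby
require 'json'

# Made-up stand-in for page.body: HTML carried inside a JSON string in a
# script block, with '<' escaped as \u003c (single quotes keep the
# backslashes literal).
SAMPLE_BODY = '<script>render({"html":"\u003cspan class=\"messageBody\">Hello wall\u003c/span>"});</script>'

# Matches one double-quoted JSON string literal, including escaped characters.
JSON_STRING = /"(?:[^"\\]|\\.)*"/

# Returns the decoded text of every JSON string literal in +body+ that
# contains an escaped '<'. Each literal is wrapped in an array before parsing
# so that older json versions, which reject bare scalars, also accept it.
def decode_escaped_html(body)
  body.scan(JSON_STRING)
      .select { |literal| literal.include?('\u003c') }
      .map { |literal| JSON.parse("[#{literal}]").first }
end

puts decode_escaped_html(SAMPLE_BODY)
# prints: <span class="messageBody">Hello wall</span>
```

With Mechanize, page.body could be passed in place of SAMPLE_BODY, and each
decoded fragment could then be parsed with Nokogiri::HTML(fragment) and
searched with the original XPath. Whether the real page is structured this
way is an assumption that should be checked against the actual body first.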