From chad.one at gmail.com Thu Jul 1 16:25:28 2010
From: chad.one at gmail.com (Chad Seeger)
Date: Thu, 1 Jul 2010 13:25:28 -0700
Subject: [Celerity-users] Celerity::ClickableElement Download Cache
Message-ID:

I can't seem to prevent ClickableElement from caching the response returned in #download. I'm not sure if this is a Celerity or HtmlUnit specific question. Anyhow, I'd love some feedback on this, if anyone has figured out a way to clear it.

Our current workaround uses browser.goto() with the current page URL, effectively resetting the cache. While this works, it sure does slow things down for us.

-Chad

From jari.bakken at gmail.com Sun Jul 4 07:51:06 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Sun, 4 Jul 2010 13:51:06 +0200
Subject: [Celerity-users] Celerity::ClickableElement Download Cache
In-Reply-To:
References:
Message-ID:

On Thu, Jul 1, 2010 at 10:25 PM, Chad Seeger wrote:
> I can't seem to prevent ClickableElement from caching the response
> returned in #download. I'm not sure if this is a Celerity or HtmlUnit
> specific question. Anyhow, I'd love some feedback on this, if anyone
> has figured out a way to clear it.

Could you provide some more detail, or a reproducible example? How do you know it is "cached" - what kind of change do you expect?

As mentioned in the docs for #download, the element will be clicked but the "current page stays unchanged" - #download is really meant for the simplest cases, where the content type of the response would pop up a download dialog in a real browser. Without having tested it, I can imagine a possible problem if clicking the link updates the DOM (i.e. onclick handler that changes the link to point to some other content).
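The reload workaround described in the first message above can be sketched roughly as follows. This is a minimal sketch, not a confirmed fix: the helper name with_fresh_page and the stand-in ReloadStub are hypothetical, and it only assumes an object that responds to #url and #goto, which Celerity::Browser does.

```ruby
# Sketch only: reload the current page before touching the element again,
# so whatever state the old page object held is discarded.
# `with_fresh_page` is a hypothetical helper name, not part of Celerity.
def with_fresh_page(browser)
  browser.goto(browser.url) # reload the current URL, as in the workaround
  yield browser             # re-locate the element here, then call #download
end

# Stand-in so the sketch runs without JRuby/Celerity installed.
# It records every URL passed to #goto.
ReloadStub = Struct.new(:url, :visits) do
  def goto(target)
    visits << target
  end
end

stub = ReloadStub.new("http://localhost/report", [])
with_fresh_page(stub) { |b| b.url }
puts stub.visits.inspect
```

With a real browser the block would look up the link again and call #download on it. Every call costs a full page load, which matches the slowdown reported above.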
From tomas.pospisek at zhdk.ch Mon Jul 5 12:29:46 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Mon, 5 Jul 2010 18:29:46 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
Message-ID: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>

Hello,

I use celerity through culerity. Our app uses ExtJS, which HtmlUnit seems to have trouble with [1]. So I wanted to see whether htmlunit/trunk fixes the problem. I succeeded in building htmlunit [2], but celerity has trouble picking up the new jars or something. I tried two approaches:

* remove old htmlunit*2.7*.jars from jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9/lib/celerity/htmlunit/ and place the new htmlunit*2.8*.jars there.
* or replace the old htmlunit*2.7*.jars with the new 2.8 jars

Both times the result was the same when running celerity:

NameError: uninitialized constant HtmlUnit::WebClient (Culerity::CulerityException)
/Users/itz/Library/jruby-1.3.1/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/module.rb:23:in `const_missing'
/Users/itz/Library/jruby-1.3.1/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/module.rb:23:in `const_missing'
/Users/itz/Library/jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9/lib/celerity/browser.rb:836:in `setup_webclient'
/Users/itz/Library/jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9/lib/celerity/browser.rb:74:in `initialize'
/Library/Ruby/Gems/1.8/gems/culerity-0.2.10/lib/../bin/../lib/culerity/celerity_server.rb:43:in `new_browser'
/Library/Ruby/Gems/1.8/gems/culerity-0.2.10/lib/../bin/../lib/culerity/celerity_server.rb:20:in `initialize'
/Library/Ruby/Gems/1.8/gems/culerity-0.2.10/lib/../bin/run_celerity_server.rb:3

Any clues or advice for me? Has anybody tried/succeeded in using current celerity 0.7.9 with trunk htmlunit?
Thanks,
*t

[1] https://sourceforge.net/tracker/index.php?func=detail&aid=2969230&group_id=47038&atid=448266

[2] 1 - checkout trunk from https://sourceforge.net/scm/?type=svn&group_id=47038
    2 - compile:
        - cd trunk/htmlunit && MAVEN_OPTS=-Xmx512m mvn -Dmaven.test.skip=true -up clean site package
        - cd ../core-js && ant jar-all
    3 - copy jars to jruby/celerity directory
        - cp htmlunit/artifacts/htmlunit-2.8-SNAPSHOT.jar \
             ~/Library/jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9/lib/celerity/htmlunit/
        - cp core-js/target/htmlunit-core-js-2.8-SNAPSHOT.jar \
             ~/Library/jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9/lib/celerity/htmlunit/

From tomas.pospisek at zhdk.ch Mon Jul 5 12:48:28 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Mon, 5 Jul 2010 18:48:28 +0200
Subject: [Celerity-users] I/O exception caught when processing request: The server localhost failed to respond
Message-ID:

I'm seeing this when running my tests on the first call:

05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request

Then things seem to proceed. Is this due to a http timeout?
*t

From jari.bakken at gmail.com Mon Jul 5 12:58:31 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Mon, 5 Jul 2010 18:58:31 +0200
Subject: [Celerity-users] I/O exception caught when processing request: The server localhost failed to respond
In-Reply-To:
References:
Message-ID:

On Mon, Jul 5, 2010 at 6:48 PM, wrote:
> I'm seeing this when running my tests on the first call:
>
> 05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
> INFO: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
> 05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
> INFO: Retrying request
>
> Then things seem to proceed. Is this due to a http timeout?
> *t
>

You'll probably get a better answer on the HtmlUnit list, but a timeout seems like a plausible explanation.

From tomas.pospisek at zhdk.ch Mon Jul 5 13:06:11 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Mon, 5 Jul 2010 19:06:11 +0200
Subject: [Celerity-users] I/O exception caught when processing request: The server localhost failed to respond
In-Reply-To:
References:
Message-ID: <3796E7B4-9984-4D9E-AF8A-F7A55C70C339@zhdk.ch>

On 05.07.2010 at 18:58, Jari Bakken wrote:

> On Mon, Jul 5, 2010 at 6:48 PM, wrote:
>> I'm seeing this when running my tests on the first call:
>>
>> 05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
>> INFO: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
>> 05.07.2010 18:16:40 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
>> INFO: Retrying request
>>
>> Then things seem to proceed. Is this due to a http timeout?
>> *t
>>
>
> You'll probably get a better answer on the HtmlUnit list, but a
> timeout seems like a plausible explanation.

Thanks.
Is there a way to set http timeouts from celerity? I have found this [1] however I can't see how to get access to that setting from celerity?

*t

[1] http://osdir.com/ml/java.htmlunit.general/2008-07/msg00106.html

From jari.bakken at gmail.com Mon Jul 5 13:09:28 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Mon, 5 Jul 2010 19:09:28 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
In-Reply-To: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>
References: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>
Message-ID:

On Mon, Jul 5, 2010 at 6:29 PM, wrote:
>
> Any clues or advice for me?
>

Try checking out Celerity from GitHub [1], which includes a 2.8 snapshot from today.
I've heard rumors that 2.8 will be released within a week - I'll do a new Celerity release when it is out.

[1] http://github.com/jarib/celerity

From jari.bakken at gmail.com Mon Jul 5 13:15:37 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Mon, 5 Jul 2010 19:15:37 +0200
Subject: [Celerity-users] I/O exception caught when processing request: The server localhost failed to respond
In-Reply-To: <3796E7B4-9984-4D9E-AF8A-F7A55C70C339@zhdk.ch>
References: <3796E7B4-9984-4D9E-AF8A-F7A55C70C339@zhdk.ch>
Message-ID:

> Is there a way to set http timeouts from celerity? I have found this [1]
> however I can't see how to get access to that setting from celerity?
>
> *t
>

Celerity has no API for this, but you can dig into the underlying WebClient instance:

  browser.webclient.setTimeout(5000) # milliseconds

From tomas.pospisek at zhdk.ch Mon Jul 5 13:28:46 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Mon, 5 Jul 2010 19:28:46 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
In-Reply-To:
References: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>
Message-ID: <218554AA-3CC2-4F1B-BA88-B39217B3FCF7@zhdk.ch>

On 05.07.2010 at 19:09, Jari Bakken wrote:

> On Mon, Jul 5, 2010 at 6:29 PM, wrote:
>>
>> Any clues or advice for me?
>>
>
> Try checking out Celerity from GitHub [1], which includes a 2.8
> snapshot from today.
> I've heard rumors that 2.8 will be released within a week - I'll do a
> new Celerity release when it is out.
>
> [1] http://github.com/jarib/celerity

Oh wow, thanks a lot. Actually I figured it out myself:

  $ cd ~/Library/jruby-1.3.1/lib/ruby/gems/1.8/gems/celerity-0.7.9
  $ sed -i bak 's/2.7/2.8/' tasks/snapshot.rake
  $ rake snapshot

Works now. I tried to clone the git Celerity repo, but github doesn't reply to me ATM. I'll try tomorrow again.

Thanks!
*t

From win at wincent.com Mon Jul 5 13:19:25 2010
From: win at wincent.com (Wincent Colaiuta)
Date: Mon, 5 Jul 2010 19:19:25 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
In-Reply-To:
References: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>
Message-ID:

On 05/07/2010, at 19:09, Jari Bakken wrote:

> On Mon, Jul 5, 2010 at 6:29 PM, wrote:
>>
>> Any clues or advice for me?
>
> Try checking out Celerity from GitHub [1], which includes a 2.8
> snapshot from today.
> I've heard rumors that 2.8 will be released within a week - I'll do a
> new Celerity release when it is out.

Great! Looking forward to that.

Wincent

From tomas.pospisek at zhdk.ch Tue Jul 6 07:49:46 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Tue, 6 Jul 2010 13:49:46 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
In-Reply-To:
References: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch>
Message-ID: <09A100B5-93A6-4FC9-A8A1-0C1626A56041@zhdk.ch>

On 05.07.2010 at 19:09, Jari Bakken wrote:

> On Mon, Jul 5, 2010 at 6:29 PM, wrote:
>>
>> Any clues or advice for me?
>>
>
> Try checking out Celerity from GitHub [1], which includes a 2.8
> snapshot from today.
> I've heard rumors that 2.8 will be released within a week - I'll do a
> new Celerity release when it is out.
>
> [1] http://github.com/jarib/celerity

Did that, some problems of htmlunit went away (notably problems it had with evaluating json) but new problems appeared - my logs are now full of

  06.07.2010 13:33:36 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
  WARNUNG: CSS error: null [17:108] Fehler in Ausdruck. Ungültiger Token "=". Erwartet wurde einer von: , , "/", , "-", , , ")", , "inherit", , , , , , , , , , , , , , , , , , , , .

Would you mind bumping the version number in celerity.gemspec, so your old release and the current github checkout can play along well with each other?

Thanks!
*t

From tomas.pospisek at zhdk.ch Tue Jul 6 07:51:55 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Tue, 6 Jul 2010 13:51:55 +0200
Subject: [Celerity-users] I/O exception caught when processing request: The server localhost failed to respond
In-Reply-To:
References: <3796E7B4-9984-4D9E-AF8A-F7A55C70C339@zhdk.ch>
Message-ID:

On 05.07.2010 at 19:15, Jari Bakken wrote:

>> Is there a way to set http timeouts from celerity? I have found this [1]
>> however I can't see how to get access to that setting from celerity?
>>
>> *t
>>
>
> Celerity has no API for this, but you can dig into the underlying
> WebClient instance:
>
> browser.webclient.setTimeout(5000) # milliseconds

Thanks, that works, however it did not rid me of the original problem:

06.07.2010 13:33:29 org.apache.http.impl.client.DefaultRequestDirector execute
INFO: I/O exception (org.apache.http.NoHttpResponseException) caught when processing request: The target server failed to respond
06.07.2010 13:33:29 org.apache.http.impl.client.DefaultRequestDirector execute
INFO: Retrying request

I'll ask on the htmlunit list. Thanks!
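For reference, the WebClient timeout call quoted above could be wrapped in a small helper. This is a minimal sketch under stated assumptions: set_http_timeout, TimeoutClientStub and TimeoutBrowserStub are hypothetical names, and the only real call is browser.webclient.setTimeout(milliseconds) as quoted in the thread. As noted above, setting the timeout did not make the "Retrying request" log lines go away.

```ruby
# Sketch only: convert seconds to the milliseconds the underlying
# WebClient expects and pass them on. `set_http_timeout` is a
# hypothetical helper name, not part of Celerity.
def set_http_timeout(browser, seconds)
  millis = (seconds * 1000).round
  browser.webclient.setTimeout(millis) # same call as quoted in the thread
  millis
end

# Stand-ins so the sketch runs without JRuby/Celerity installed;
# a real Celerity::Browser would be passed instead.
TimeoutClientStub = Struct.new(:timeout) do
  def setTimeout(millis)
    self.timeout = millis
  end
end
TimeoutBrowserStub = Struct.new(:webclient)

browser = TimeoutBrowserStub.new(TimeoutClientStub.new(nil))
set_http_timeout(browser, 5)
puts browser.webclient.timeout
```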
*t

From jari.bakken at gmail.com Tue Jul 6 08:11:09 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 6 Jul 2010 14:11:09 +0200
Subject: [Celerity-users] using celerity with htmlunit svn
In-Reply-To: <09A100B5-93A6-4FC9-A8A1-0C1626A56041@zhdk.ch>
References: <63639F25-E8DC-468A-986B-55C58C46355D@zhdk.ch> <09A100B5-93A6-4FC9-A8A1-0C1626A56041@zhdk.ch>
Message-ID:

On Tue, Jul 6, 2010 at 1:49 PM, wrote:
> On 05.07.2010 at 19:09, Jari Bakken wrote:
>
>> On Mon, Jul 5, 2010 at 6:29 PM, wrote:
> Did that, some problems of htmlunit went away (notably problems it had
> with evaluating json) but new problems appeared - my logs are now full of
>
>  06.07.2010 13:33:36 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
>  WARNUNG: CSS error: null [17:108] Fehler in Ausdruck. Ungültiger Token "=". Erwartet wurde einer von: , , "/", , "-", , , ")", , "inherit", , , , , , , , , , , , , , , , , , , , .
>

CSS is now enabled by default (needed for e.g. Element#visible?). You should be able to get rid of the warnings by decreasing the log level (http://wiki.github.com/jarib/celerity/faq)

> Would you mind bumping the version number in celerity.gemspec, so your
> old release and the current github checkout can play along well with each other?
>

I'll do that.

From jari.bakken at gmail.com Tue Jul 6 08:24:19 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 6 Jul 2010 14:24:19 +0200
Subject: [Celerity-users] [ANN] 0.8.0.beta.1
Message-ID:

I've pushed a pre-release gem of Celerity with recent HtmlUnit snapshots. To install:

  gem install celerity --prerelease

I'll do a proper release when HtmlUnit 2.8 is released (rumors say in ~1 week).

Please try it out and report any problems to either the HtmlUnit [1] or Celerity [2] tracker.
Jari

[1] http://htmlunit.sourceforge.net/submittingBugs.html
[2] http://github.com/jarib/celerity/issues

From tomas.pospisek at zhdk.ch Tue Jul 6 10:09:54 2010
From: tomas.pospisek at zhdk.ch (tomas.pospisek at zhdk.ch)
Date: Tue, 6 Jul 2010 16:09:54 +0200
Subject: [Celerity-users] 0.8.0.beta.1 OK
Message-ID:

  jruby -S gem install celerity --prerelease

In order for "--prerelease" to work, one needs rubygems >= 1.3.4, which isn't included before jruby >= 1.4.

That said, 0.8.0.beta.1 is working well here. I had to switch off CSS parsing, otherwise I'd get swamped with CSS parsing errors as described in [1].

*t

[1] http://groups.google.com/group/culerity-dev/browse_thread/thread/cb9f49b90dc43990

From peter at hexagile.com Mon Jul 12 08:41:38 2010
From: peter at hexagile.com (Peter Szinek)
Date: Mon, 12 Jul 2010 14:41:38 +0200
Subject: [Celerity-users] Celerity vs Nokogiri strange bug??
Message-ID:

Hi guys,

It took me a few hours to find this out and I am totally baffled.

My stuff:

OS X leopard
java version 1.6.0_17
jruby 1.5.0 (ruby 1.8.7 patchlevel 249)
celerity 0.7.9
nokogiri-1.5.0.beta.1-java

So, this script:

========================
require 'rubygems'
require "celerity"

@agent = Celerity::Browser.new
@agent.goto 'http://royalarsenalresidential.co.uk/search/sales'
link = @agent.element_by_xpath("//span[@class='property_search_page' and contains(.,'2')]")
link.click
puts @agent.text
========================

works as it should (the output contains 'Showing 11 to 20 of 23 properties to buy', i.e. it crawled to the next page).
However, this one (the only difference is on line 2 - require 'nokogiri'):

========================
require 'rubygems'
require 'nokogiri'
require "celerity"

@agent = Celerity::Browser.new
@agent.goto 'http://royalarsenalresidential.co.uk/search/sales'
link = @agent.element_by_xpath("//span[@class='property_search_page' and contains(.,'2')]")
link.click
puts @agent.text
========================

does *not* crawl (even though it does not raise any Exception or print any error) - it looks like it's working, but the agent.text clearly shows we are still on the first page.

Does anyone have an idea what's going on here? I thought nokogiri and celerity are not related?

Cheers,
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carstep at gmail.com Mon Jul 12 14:45:55 2010
From: carstep at gmail.com (Bolla Sandor)
Date: Mon, 12 Jul 2010 20:45:55 +0200
Subject: [Celerity-users] how does target=_blank pages behave
Message-ID: <4C3B62E3.2020006@gmail.com>

Hi there,

does anybody know how the browser instance behaves when I use b.link(:text, 'targetblanklinktest').click?

r.
Sandor

From jari.bakken at gmail.com Tue Jul 13 11:15:28 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 13 Jul 2010 17:15:28 +0200
Subject: [Celerity-users] Celerity vs Nokogiri strange bug??
In-Reply-To:
References:
Message-ID:

On Mon, Jul 12, 2010 at 2:41 PM, Peter Szinek wrote:
>
> Does anyone have an idea what's going on here? I thought nokogiri and
> celerity are not related?
>

That's indeed very strange. I have no idea why this would happen. I am only able to reproduce with the 1.5.0.beta.1 version of Nokogiri though, 1.4.2 works fine. If I were you I'd take a look at what changed between those two versions, especially on the Java side.

From jari.bakken at gmail.com Tue Jul 13 12:05:30 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 13 Jul 2010 18:05:30 +0200
Subject: [Celerity-users] Celerity vs Nokogiri strange bug??
In-Reply-To:
References:
Message-ID:

On Tue, Jul 13, 2010 at 5:15 PM, Jari Bakken wrote:
> On Mon, Jul 12, 2010 at 2:41 PM, Peter Szinek wrote:
>>
> That's indeed very strange. I have no idea why this would happen. I am
> only able to reproduce with the 1.5.0.beta.1 version of Nokogiri
> though, 1.4.2 works fine. If I were you I'd take a look at what
> changed between those two versions, especially on the Java side.
>

The jar files here

http://github.com/tenderlove/nokogiri/tree/master/lib/

are likely the culprit - e.g. HtmlUnit is using NekoHTML internally, and the versions are probably different. If you switch the requires around (require celerity first, then nokogiri), the code seems to work as expected, but of course then there's no guarantee that nokogiri will.

From peter at hexagile.com Tue Jul 13 12:37:54 2010
From: peter at hexagile.com (Peter Szinek)
Date: Tue, 13 Jul 2010 18:37:54 +0200
Subject: [Celerity-users] Celerity vs Nokogiri strange bug??
In-Reply-To:
References:
Message-ID:

Thanks Jari!

Unfortunately I don't know the nokogiri codebase too much - especially not the java branch, which is a pure Java (no FFI) experimental rewrite, so thanks a million for digging this out :)

Cheers,
Peter

On Tue, Jul 13, 2010 at 6:05 PM, Jari Bakken wrote:

> On Tue, Jul 13, 2010 at 5:15 PM, Jari Bakken wrote:
> > On Mon, Jul 12, 2010 at 2:41 PM, Peter Szinek wrote:
> >>
> > That's indeed very strange. I have no idea why this would happen. I am
> > only able to reproduce with the 1.5.0.beta.1 version of Nokogiri
> > though, 1.4.2 works fine. If I were you I'd take a look at what
> > changed between those two versions, especially on the Java side.
> >
>
> The jar files here
>
> http://github.com/tenderlove/nokogiri/tree/master/lib/
>
> are likely the culprit - e.g. HtmlUnit is using NekoHTML internally,
> and the versions are probably different.
> If you switch the requires
> around (require celerity first, then nokogiri), the code seems to work
> as expected, but of course then there's no guarantee that nokogiri
> will.
> _______________________________________________
> Celerity-users mailing list
> Celerity-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/celerity-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From grant at pikimal.com Wed Jul 14 15:29:15 2010
From: grant at pikimal.com (Grant Olson)
Date: Wed, 14 Jul 2010 15:29:15 -0400
Subject: [Celerity-users] Ajax posts but page results don't change.
Message-ID: <4C3E100B.3020003@pikimal.com>

I'm brand new to celerity and HtmlUnit, so apologies if this is something stupid.

I'm loading a page with ajax tabs. I click on the tab. I can see that celerity issues a request for the page with the tab content. But my browser.html object doesn't change.

I've inspected the output of the whole page, but I'm going to keep the reproducible here a little smaller, and just check for a number that should be on the page. The last line of the output seems to indicate that the page with the tab text was requested by celerity.

I'm also not sure if the content-type warnings will prevent some of the javascript from executing.

Am I doing something wrong? Does anyone have any pointers? Thanks in advance.

johnmudhead:~ grant$ cat celerity_test.rb
require "rubygems"
require "celerity"

browser = Celerity::Browser.new(:resynchronize => true,
                                :javascript_exceptions => true,
                                :log_level => :all)
browser.goto('http://www.radioshack.com/product/index.jsp?productId=4285605')
browser.link(:text, "Tech Specs").click
puts "Got text" if browser.html.include? "22.2"
johnmudhead:~ grant$ jruby celerity_test.rb
Jul 14, 2010 3:23:05 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Expected content type of 'application/javascript' or 'application/ecmascript' for remotely loaded JavaScript element at 'http://RSK.imageg.net/js/gomez-gtagb4_noobj.js', but got 'application/x-javascript'.
Jul 14, 2010 3:23:06 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
...
... (similar warnings omitted for brevity)
...
WARNING: Expected content type of 'application/javascript' or 'application/ecmascript' for remotely loaded JavaScript element at 'http://www.radioshack.com/include/omniture-h.js', but got 'application/x-javascript'.
Jul 14, 2010 3:23:16 PM com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController processSynchron
INFO: Re-synchronized call to http://www.radioshack.com/product/TecSpec.jsp?productId=4285605

--
Grant

"I am very hungry," Kobayashi said. "I wish there were hot dogs in jail."

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 559 bytes
Desc: OpenPGP digital signature
URL:

From caius at brightbox.co.uk Wed Jul 14 15:35:47 2010
From: caius at brightbox.co.uk (Caius Durling)
Date: Wed, 14 Jul 2010 20:35:47 +0100
Subject: [Celerity-users] Ajax posts but page results don't change.
In-Reply-To: <4C3E100B.3020003@pikimal.com>
References: <4C3E100B.3020003@pikimal.com>
Message-ID:

On 14 Jul 2010, at 20:29, Grant Olson wrote:

> I'm loading a page with ajax tabs. I click on the tab. I can see that
> celerity issues a request for the page with the tab content. But my
> browser.html object doesn't change.

#html doesn't update as js changes the dom, but #xml does.
HTH,
C
---
Caius Durling
Brightbox
caius at brightbox.co.uk
http://brightbox.co.uk/

From grant at pikimal.com Wed Jul 14 15:48:10 2010
From: grant at pikimal.com (Grant Olson)
Date: Wed, 14 Jul 2010 15:48:10 -0400
Subject: [Celerity-users] Ajax posts but page results don't change.
In-Reply-To:
References: <4C3E100B.3020003@pikimal.com>
Message-ID: <4C3E147A.5050006@pikimal.com>

On 7/14/10 3:35 PM, Caius Durling wrote:
> On 14 Jul 2010, at 20:29, Grant Olson wrote:
>
>> I'm loading a page with ajax tabs. I click on the tab. I can see that
>> celerity issues a request for the page with the tab content. But my
>> browser.html object doesn't change.
>
>
> #html doesn't update as js changes the dom, but #xml does.
>

That worked. Thanks a million!

--
Grant

"I am very hungry," Kobayashi said. "I wish there were hot dogs in jail."

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 559 bytes
Desc: OpenPGP digital signature
URL:

From peter at hexagile.com Thu Jul 15 18:23:01 2010
From: peter at hexagile.com (Peter Szinek)
Date: Fri, 16 Jul 2010 00:23:01 +0200
Subject: [Celerity-users] page 'caching' for crawling?
Message-ID:

Hi guys,

I am trying to find a generic solution for page crawling, but do not seem to be able to come up with one... The task is to crawl a structure like

- list page 1
    - detail 1
        - content 11
    - detail 2
        - content 12
    ...
    detail n
        - content 1n
- list page 2
    - detail 1
        - content 21
    - detail 2
        - content 22
    ...
    detail n
        - content 2n

So I want to get content 11 through content 2n - after getting content11, to get content12 I need to go back to list page 1 etc, then crawl to detail 2 etc etc

For non-JS sites a pretty simple one seems to work - I always store

original_page = agent.page

when crawling to a detail page, then restoring it before crawling to the next detail page etc.
However this doesn't seem to work with JS pages - once I navigate away from the original page, I am not able to crawl to the next page and/or to other pages - it seems that it's not possible to store a page for future crawling...

Is this correct (and if so, what's the solution/workaround) or did I just run into a few funky pages?

Cheers,
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jari.bakken at gmail.com Thu Jul 15 18:50:11 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Fri, 16 Jul 2010 00:50:11 +0200
Subject: [Celerity-users] page 'caching' for crawling?
In-Reply-To:
References:
Message-ID:

On Fri, Jul 16, 2010 at 12:23 AM, Peter Szinek wrote:
> Hi guys,
>
> For non-JS sites a pretty simple one seems to work - I always store
>
> original_page = agent.page
>
> when crawling to a detail page, then restoring it before crawling to the
> next detail page etc.
>

Why not use Browser#back? As long as you don't try to reuse old element references (which can be easily avoided), that should work just fine. In general, trying to think about how you would navigate through the site as a user is helpful in finding the right approach.

If you really need to reuse old pages, you'll probably have more luck on the HtmlUnit list as to why that's not working with JS sites.

From peter at hexagile.com Fri Jul 16 06:02:05 2010
From: peter at hexagile.com (Peter Szinek)
Date: Fri, 16 Jul 2010 12:02:05 +0200
Subject: [Celerity-users] page 'caching' for crawling?
In-Reply-To:
References:
Message-ID:

Hi Jari

> Why not use Browser#back? As long as you don't try to reuse old
> element references (which can be easily avoided), that should work
> just fine.

Yeah, that was my first idea too - but on the 2 pages I tried, browser.back seemed to be kind of inconsistent - maybe the pages were too AJAXy or something? I need to test it on a larger sample.
So you are saying agent.back should work (what I mean is: even in modern browsers you are getting 'are you sure you want to resubmit the form' and stuff like that, cookies etc - despite all this, in celerity it's expected to work?)

Cheers,
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From calbers at neomantic.com Tue Jul 20 10:36:11 2010
From: calbers at neomantic.com (Chad Albers)
Date: Tue, 20 Jul 2010 10:36:11 -0400
Subject: [Celerity-users] Celerity's contains_text doesn't work
Message-ID:

Hi,

I'm using cucumber and culerity's connection to celerity to test if a document contains a particular text content. Culerity's step for "I should see" looks for a div with the text. However, the document I'm checking has the text in a p tag. Changing the step to the p tag makes the step pass. Celerity seems to be very specific then.

If I try to make the "I should see" step more generic using celerity's browser object's contains_text method, the step, though, still fails. Any reason why celerity can't find the text?

Chad

--
Chad Albers
http://www.neomantic.com
(pgp signature available on request)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jari.bakken at gmail.com Tue Jul 20 10:49:07 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 20 Jul 2010 16:49:07 +0200
Subject: [Celerity-users] Celerity's contains_text doesn't work
In-Reply-To:
References:
Message-ID:

On Tue, Jul 20, 2010 at 4:36 PM, Chad Albers wrote:
> Hi,
> I'm using cucumber and culerity's connection to celerity to test if a
> document contains a particular text content. Culerity's step for "I should
> see" looks for a div with the text. However, the document I'm checking has
> the text in a p tag. Changing the step to the p tag makes the step pass.
> Celerity seems to be very specific then.
> If I try to make the "I should see" step more generic using celerity's
> browser object's contains_text method, the step, though, still fails. Any
> reason why celerity can't find the text?

Not sure I understand what's going on. Could you provide some example HTML/Ruby code that will reproduce the issue?

From calbers at neomantic.com Tue Jul 20 11:52:48 2010
From: calbers at neomantic.com (Chad Albers)
Date: Tue, 20 Jul 2010 11:52:48 -0400
Subject: [Celerity-users] Celerity's contains_text doesn't work
In-Reply-To:
References:
Message-ID:

Thanks for your help, Jari.

I don't think showing the code is necessary. I think it's a culerity bug. Here's the questionable step:

http://gist.github.com/483131

As you can see, culerity looks for a div containing the text. It seems to me it would be better if it used celerity's contains_text. Celerity's contains_text returns either the index of the text - if it exists - or nil - if the text doesn't exist. So div.should be_exist doesn't work. Instead the correct step should read:

http://gist.github.com/483143

Does that look correct? I'm not sure this would handle ajax requests, though. What do you think?

Thanks for your help again,

Chad

--
Chad Albers
http://www.neomantic.com
(pgp signature available on request)

On Tue, Jul 20, 2010 at 10:49 AM, Jari Bakken wrote:

> On Tue, Jul 20, 2010 at 4:36 PM, Chad Albers wrote:
> > Hi,
> > I'm using cucumber and culerity's connection to celerity to test if a
> > document contains a particular text content. Culerity's step for "I should
> > see" looks for a div with the text. However, the document I'm checking has
> > the text in a p tag. Changing the step to the p tag makes the step pass.
> > Celerity seems to be very specific then.
> > If I try to make the "I should see" step more generic using celerity's
> > browser object's contains_text method, the step, though, still fails. Any
> > reason why celerity can't find the text?
>
> Not sure I understand what's going on.
> Could you provide some example
> HTML/Ruby code that will reproduce the issue?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jari.bakken at gmail.com Tue Jul 20 12:29:30 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 20 Jul 2010 18:29:30 +0200
Subject: [Celerity-users] Celerity's contains_text doesn't work
In-Reply-To:
References:
Message-ID:

On Tue, Jul 20, 2010 at 5:52 PM, Chad Albers wrote:
> Does that look correct? I'm not sure this would handle ajax
> requests, though. What do you think?

That looks reasonable, however contains_text is being deprecated in various Watir implementations, so you should probably rather do:

  browser.text =~ /#{Regexp::escape(text)}/

Regarding the comment about Browser#html, there's also Browser#xml which has the updated DOM after JS manipulations.

From calbers at neomantic.com Tue Jul 20 14:52:10 2010
From: calbers at neomantic.com (Chad Albers)
Date: Tue, 20 Jul 2010 14:52:10 -0400
Subject: [Celerity-users] Celerity's contains_text doesn't work
In-Reply-To:
References:
Message-ID:

Awesome! Thanks for your help. I created a personal fork of culerity to get it to work happily with celerity.

--
Chad Albers
http://www.neomantic.com
(pgp signature available on request)

On Tue, Jul 20, 2010 at 12:29 PM, Jari Bakken wrote:

> On Tue, Jul 20, 2010 at 5:52 PM, Chad Albers wrote:
> > Does that look correct? I'm not sure this would handle ajax
> > requests, though. What do you think?
>
> That looks reasonable, however contains_text is being deprecated in
> various Watir implementations, so you should probably rather do:
>
> browser.text =~ /#{Regexp::escape(text)}/
>
> Regarding the comment about Browser#html, there's also Browser#xml
> which has the updated DOM after JS manipulations.
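The two suggestions quoted above (matching against browser.text with an escaped pattern, and reading Browser#xml for the DOM after JavaScript has run) can be sketched as small helpers. This is a minimal sketch under stated assumptions: page_has_text?, dom_includes? and TextStub are hypothetical names, and the stand-in object only mimics the #text, #html and #xml readers of a Celerity browser with made-up content.

```ruby
# Sketch only: helper names are hypothetical, not part of Celerity or Culerity.
def page_has_text?(browser, text)
  # Regexp.escape keeps characters such as "(", "$" and "." literal.
  !(browser.text =~ /#{Regexp.escape(text)}/).nil?
end

def dom_includes?(browser, fragment)
  # Per the thread, #xml reflects the DOM after JS manipulations,
  # while #html does not update.
  browser.xml.include?(fragment)
end

# Stand-in so the sketch runs without JRuby/Celerity installed.
TextStub = Struct.new(:text, :html, :xml)
page = TextStub.new(
  "Total (USD): $5.00",       # visible text
  "<p id='specs'></p>",       # markup as first loaded (made up)
  "<p id='specs'>22.2</p>"    # markup after a script filled it in (made up)
)

puts page_has_text?(page, "Total (USD): $5.00")
puts dom_includes?(page, "22.2")
puts page.html.include?("22.2")
```

In a Cucumber step the first helper would typically be wrapped in an expectation; the escaping matters because an unescaped pattern would treat the parentheses and the dollar sign as regular expression syntax.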

From peter at hexagile.com Sat Jul 24 09:05:42 2010
From: peter at hexagile.com (Peter Szinek)
Date: Sat, 24 Jul 2010 15:05:42 +0200
Subject: [Celerity-users] browser.back weirdness
Message-ID:

Hi all,

Some time ago I asked about page caching - i.e. storing a page for later
reuse. I got the advice (from Jari) to simply use the browser's capability of
storing pages, and use browser.back - however, I am finding it buggy... for
example:

b = Celerity::Browser.new

# start here
b.goto 'http://www.brownandbrooke.co.uk/search.asp?pricetype=1'

# click the submit button
b.element_by_xpath("//input[@id='searchsubmit']").click

# on the result page, print the number of records found
b.element_by_xpath("//td[@id='results_resultsbar_id_xofy1']").text
=> "Displaying 1 to 10 of 59 properties found"

# click a detail page link
b.element_by_xpath("//a[child::img[@src='images/moreinfo.gif']]").click

# go back
b.back

# print the number of records again - it's different!!
b.element_by_xpath("//td[@id='results_resultsbar_id_xofy1']").text
=> "Displaying 1 to 10 of 83 properties found"

So, we should be on the same page after hitting back where we were after
submit - but that's simply not true (the above test is just a primitive proof,
but I tested it more extensively and it's b0rk3d). Moreover, this happened to
me on more pages already - and we are not even talking AJAX / cookies /
whatnot here, just simple web1.0...

Am I doing something wrong / assuming something I should not / is this a bug?

Cheers,
Peter

From peter at hexagile.com Sat Jul 24 09:43:45 2010
From: peter at hexagile.com (Peter Szinek)
Date: Sat, 24 Jul 2010 15:43:45 +0200
Subject: [Celerity-users] DOM bug?
Message-ID:

Hey all,

Consider this snippet:

b = Celerity::Browser.new
b.goto 'http://www.brownandbrooke.co.uk/search.asp?pricetype=1'

irb(main):037:0> b.element_by_xpath("//td[@class='results1_priceask']").parent.parent.parent
=> ##} @object=nil>
irb(main):038:0> b.element_by_xpath("//td[@class='results1_priceask']").parent.parent.parent.parent
=> ##} @object=nil>

According to this, the parent of a HtmlTable is a HtmlTableRow - which sounds
wrong (because a HtmlTableRow should have HtmlTableDataCell children and
nothing else, e.g. no HtmlTable as in this case). The HTML also reflects
this - i.e. that <tr> has <td> children and no <table> children at all - so I
am wondering why celerity thinks that the parent of that <table> is a <tr>?

Cheers,
Peter

From jari.bakken at gmail.com Sat Jul 24 09:53:10 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Sat, 24 Jul 2010 15:53:10 +0200
Subject: [Celerity-users] browser.back weirdness
In-Reply-To:
References:
Message-ID:

On Sat, Jul 24, 2010 at 3:05 PM, Peter Szinek wrote:
>
> Am I doing something wrong / assuming something I should not / is this a
> bug?
>

Your assumptions look correct to me. Not sure what's happening there.
It's certainly not a bug in Celerity, as we just delegate down to
HtmlUnit's back functionality. So you should raise this on the
HtmlUnit mailing list / bug tracker - perhaps describe your high-level
use case to them as well.

You might want to also try the latest prerelease gem (0.8.0.beta.2),
just to confirm that it's not already fixed:

  gem install celerity --prerelease

From jari.bakken at gmail.com Sat Jul 24 12:39:38 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Sat, 24 Jul 2010 18:39:38 +0200
Subject: [Celerity-users] DOM bug?
In-Reply-To:
References:
Message-ID:

Hi Peter

On Sat, Jul 24, 2010 at 3:43 PM, Peter Szinek wrote:
> Hey all,
>
> Consider this snippet:
>
> b = Celerity::Browser.new
> b.goto 'http://www.brownandbrooke.co.uk/search.asp?pricetype=1'
>
> irb(main):037:0>
> b.element_by_xpath("//td[@class='results1_priceask']").parent.parent.parent
> => # @conditions={:object=>#} @object=nil>

I'm not able to get this far, as that element doesn't seem to exist:

http://gist.github.com/488799

From peter at hexagile.com Sat Jul 24 12:55:41 2010
From: peter at hexagile.com (Peter Szinek)
Date: Sat, 24 Jul 2010 18:55:41 +0200
Subject: [Celerity-users] DOM bug?
In-Reply-To:
References:
Message-ID:

Oops... you need to

b.element_by_xpath("//input[@id='searchsubmit']").click

after the goto...

Cheers,
Peter

On Sat, Jul 24, 2010 at 6:39 PM, Jari Bakken wrote:

> Hi Peter
>
> On Sat, Jul 24, 2010 at 3:43 PM, Peter Szinek wrote:
> > Hey all,
> >
> > Consider this snippet:
> >
> > b = Celerity::Browser.new
> > b.goto 'http://www.brownandbrooke.co.uk/search.asp?pricetype=1'
> >
> > irb(main):037:0>
> > b.element_by_xpath("//td[@class='results1_priceask']").parent.parent.parent
> > => # @conditions={:object=>#} @object=nil>
>
> I'm not able to get this far, as that element doesn't seem to exist:
>
> http://gist.github.com/488799

From jari.bakken at gmail.com Sat Jul 24 13:37:17 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Sat, 24 Jul 2010 19:37:17 +0200
Subject: [Celerity-users] DOM bug?
In-Reply-To:
References:
Message-ID:

On Sat, Jul 24, 2010 at 6:55 PM, Peter Szinek wrote:
>> > Hey all,
>> >
>> > Consider this snippet:
>> >

OK, so if you print out the internal HtmlUnit DOM, you can see that it
does have nested tables:

  puts b.xml

Looking at "View Source" for the result page, it appears to be the
same there as well. Ditto for Firefox's DOM in Firebug.

From peter at hexagile.com Sat Jul 24 14:11:38 2010
From: peter at hexagile.com (Peter Szinek)
Date: Sat, 24 Jul 2010 20:11:38 +0200
Subject: [Celerity-users] DOM bug?
In-Reply-To:
References:
Message-ID:

Yes, that's true. The problem is that both b.xml and 'view source' show that
the nested table has a <td> parent, and that <td> has a <tr> parent. However,
in the DOM, the <td> is 'missing' - the table's parent is the <tr>, not the
<td>...

On Sat, Jul 24, 2010 at 7:37 PM, Jari Bakken wrote:

> On Sat, Jul 24, 2010 at 6:55 PM, Peter Szinek wrote:
> >> > Hey all,
> >> >
> >> > Consider this snippet:
> >> >
>
> OK, so if you print out the internal HtmlUnit DOM, you can see that it
> does have nested tables:
>
>   puts b.xml
>
> Looking at "View Source" for the result page, it appears to be the
> same there as well. Ditto for Firefox's DOM in Firebug.

From jari.bakken at gmail.com Sat Jul 24 15:07:43 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Sat, 24 Jul 2010 21:07:43 +0200
Subject: [Celerity-users] DOM bug?
In-Reply-To:
References:
Message-ID:

On Sat, Jul 24, 2010 at 8:11 PM, Peter Szinek wrote:
> Yes, that's true. The problem is that both b.xml and 'view source' show that
> the nested table has a <td> parent, and that <td> has a <tr> parent. However,
> in the DOM, the <td> is 'missing' - the table's parent is the <tr>, not the
> <td>...
>

Ah, right. Found the bug. Look for a fix in the next release.

From peter at hexagile.com Tue Jul 27 07:13:16 2010
From: peter at hexagile.com (Peter Szinek)
Date: Tue, 27 Jul 2010 13:13:16 +0200
Subject: [Celerity-users] child elements of an element?
Message-ID:

Hi guys,

Is it possible to get all child elements (or child nodes, at least) of an
element? Like element.children in nokogiri... I can get h5s and ols and ps
etc. but not *any* elements I can iterate on...

Cheers,
Peter

From jari.bakken at gmail.com Tue Jul 27 07:47:30 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 27 Jul 2010 13:47:30 +0200
Subject: [Celerity-users] child elements of an element?
In-Reply-To:
References:
Message-ID:

On Tue, Jul 27, 2010 at 1:13 PM, Peter Szinek wrote:
> Hi guys,
>
> Is it possible to get all child elements (or child nodes, at least) of an
> element? Like element.children in nokogiri... I can get h5s and ols and ps
> etc. but not *any* elements I can iterate on...
>

Perhaps something like this:

  browser.elements_by_xpath(element.xpath + "//*")

You should ask yourself why you need this though - Celerity doesn't
purport to give you generic DOM traversal capabilities, and relying
too much on the DOM structure will make your tests brittle and
increase the maintenance cost. If you need Nokogiri's capabilities,
consider just using it instead:

  doc = Nokogiri.XML(browser.xml)

From lui.cicino at gmail.com Thu Jul 29 01:54:46 2010
From: lui.cicino at gmail.com (Lui Cicino)
Date: Thu, 29 Jul 2010 01:54:46 -0400
Subject: [Celerity-users] capture download speed
Message-ID:

Is it possible to capture the download speed (Kbps) for the execution of a
script?

-Lui

From peter at hexagile.com Thu Jul 29 02:53:22 2010
From: peter at hexagile.com (Peter Szinek)
Date: Thu, 29 Jul 2010 08:53:22 +0200
Subject: [Celerity-users] child elements of an element?
In-Reply-To:
References:
Message-ID:

Hi Jari

thanks for the suggestions! Well, this time I am using celerity for
scraping - therefore all the low-level DOM massaging ;) I actually used the
celerity+nokogiri combo, but I couldn't reliably install nokogiri under jruby
(and for XPath evaluations where I have to click the resulting element, I
have to use celerity anyway) so I just threw nokogiri away. So far this is
the only problem I came across (and I am doing quite a bit of DOM/XPath
hacking).

Unfortunately I don't have a browser reference at that point, and it would
be rather cumbersome to pass it there... are you sure there is no API (even
if I have to talk to htmlunit) which can do this? It sounds crazy that you
can get a specific set of elements, but not all of them...writing a method
along the lines of

%w{divs spans tables ols uls lis ps .....}.inject([]) {|a,v| a << element.send(v); a}

sounds crazy :)

Cheers,
Peter

On Tue, Jul 27, 2010 at 1:47 PM, Jari Bakken wrote:

> On Tue, Jul 27, 2010 at 1:13 PM, Peter Szinek wrote:
> > Hi guys,
> >
> > Is it possible to get all child elements (or child nodes, at least) of an
> > element? Like element.children in nokogiri... I can get h5s and ols and
> > ps etc. but not *any* elements I can iterate on...
> >
>
> Perhaps something like this:
>
>   browser.elements_by_xpath(element.xpath + "//*")
>
> You should ask yourself why you need this though - Celerity doesn't
> purport to give you generic DOM traversal capabilities, and relying
> too much on the DOM structure will make your tests brittle and
> increase the maintenance cost.
> If you need Nokogiri's capabilities, consider just using it instead:
>
>   doc = Nokogiri.XML(browser.xml)

From jari.bakken at gmail.com Thu Jul 29 06:22:37 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Thu, 29 Jul 2010 12:22:37 +0200
Subject: [Celerity-users] capture download speed
In-Reply-To:
References:
Message-ID:

On Thu, Jul 29, 2010 at 7:54 AM, Lui Cicino wrote:
> Is it possible to capture the download speed (Kbps) for the execution of a
> script?

I have no idea. The HtmlUnit list might know. Let me know what you find.

From jari.bakken at gmail.com Thu Jul 29 06:28:01 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Thu, 29 Jul 2010 12:28:01 +0200
Subject: [Celerity-users] child elements of an element?
In-Reply-To:
References:
Message-ID:

On Thu, Jul 29, 2010 at 8:53 AM, Peter Szinek wrote:
>
> Unfortunately I don't have a browser reference at that point, and it would
> be rather cumbersome to pass it there... are you sure there is no API (even
> if I have to talk to htmlunit) which can do this?

HtmlUnit probably has a way to get child nodes - you'd have to look at the
docs.

> It sounds crazy that you
> can get a specific set of elements, but not all of them...writing a method
> along the lines of
>
> %w{divs spans tables ols uls lis ps .....}.inject([]) {|a,v| a <<
> element.send(v); a}
>

Yes that's crazy, which is why I suggested using
Browser#elements_by_xpath and Element#xpath to find children, and get
them back as Celerity elements. You could even monkey patch Element to
do this:

class Celerity::Element
  def children
    browser.elements_by_xpath "#{xpath}//*"
  end
end
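
One detail of the XPath used in this thread may be worth spelling out: the
"//*" suffix selects every descendant element, not only direct children,
while "/*" selects direct children only. The sketch below shows the
difference with REXML from Ruby's standard distribution rather than Celerity,
so it runs without JRuby or HtmlUnit. The sample markup and the base variable
are assumptions standing in for browser.xml and element.xpath.

```ruby
require 'rexml/document'

# Stand-in markup (an assumption); in the thread this would be browser.xml.
doc = REXML::Document.new(<<~XML)
  <div id="box">
    <h5>Title</h5>
    <ol>
      <li>one</li>
      <li>two</li>
    </ol>
    <p>text</p>
  </div>
XML

# Plays the role of element.xpath in the snippets above (hypothetical value).
base = "/div"

# "//*" selects all descendant elements, however deeply nested.
descendants = REXML::XPath.match(doc, "#{base}//*").map(&:name)

# "/*" selects only the direct child elements.
children = REXML::XPath.match(doc, "#{base}/*").map(&:name)

puts "descendants: #{descendants.inspect}"
puts "children:    #{children.inspect}"
```

If only direct children are wanted, a "/*" suffix in a patch like the one
above might be closer to Nokogiri's element.children, assuming Element#xpath
returns a path that can be extended this way, as the thread suggests.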