From simon.a.chiang at gmail.com Wed Apr 1 11:56:10 2009
From: simon.a.chiang at gmail.com (Simon Chiang)
Date: Wed, 1 Apr 2009 09:56:10 -0600
Subject: [Nokogiri-talk] Help for a namespace newbie
Message-ID: <85fafb0c0904010856y4abb1c3fyedcb511cc6cc7c5e@mail.gmail.com>
Hi, just started messing around with Nokogiri and now I'm confused. I know
example b returns nil because of the namespace but I don't really get why or
how you work around this.
a = Nokogiri::XML %q{<root>content</root>}
puts a.root.at("/root") # => root node
b = Nokogiri::XML %q{<root xmlns="blah">content</root>}
puts b.root.at("/root") # => nil
Obviously 'blah' may be invalid simply because it's made up, but the same
thing happens with a valid http namespace (valid in that I can go to the url
and get something that looks like an XML namespace... uh... declaration?).
Any help is appreciated, thanks.
From bobf at jhu.edu Wed Apr 1 15:51:44 2009
From: bobf at jhu.edu (Bob Fitterman)
Date: Wed, 1 Apr 2009 15:51:44 -0400
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
Message-ID: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
I'm hoping this is the right place for getting some help. I recently
switched from Hpricot to Nokogiri and love the huge performance
improvement.
I do my development on Leopard, and it was a piece of cake to install.
Unfortunately, I'm hosted on a shared environment, so it gets a little
messier. I installed libxml2-2.7.3 and libxslt-1.1.24 into my own
environment. (It's sitting in local/lib & local/include off my home
directory.) When I go to do the gem install, it complains:
/usr/bin/ruby1.8 extconf.rb install nokogiri -V
checking for iconv.h in
/usr/include,/opt/local/include,/usr/local/include,/usr/include... yes
checking for libxml/parser.h in
/opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
yes
checking for libxslt/xslt.h in
/opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
no
libxslt is missing. try 'port install libxslt' or 'yum install
libxslt-devel'
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of
necessary libraries and/or headers. Check the mkmf.log file for more
details. You may need configuration options.
I downloaded the tarball and tried running "ruby extconf.rb
--with-xslt-dir=/home/myname/local" and an assortment of combinations like
that, but I can't seem to make it work. What's the secret?
Thanks.
From aaron.patterson at gmail.com Wed Apr 1 16:52:16 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Wed, 1 Apr 2009 13:52:16 -0700
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
Message-ID: <6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
Hi Bob,
2009/4/1 Bob Fitterman :
> I'm hoping this is the right place for getting some help. I recently
> switched from Hpricot to Nokogiri and love the huge performance
> improvement.
> I do my development on Leopard, and it was a piece of cake to install.
> Unfortunately, I'm hosted on a shared environment, so it gets a little
> messier. I installed libxml2-2.7.3 and libxslt-1.1.24 into my own
> environment. (It's sitting in local/lib & local/include off my home
> directory.) When I go to do the gem install, it complains:
>
> /usr/bin/ruby1.8 extconf.rb install nokogiri -V
> checking for iconv.h in
> /usr/include,/opt/local/include,/usr/local/include,/usr/include... yes
> checking for libxml/parser.h in
> /opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
> yes
> checking for libxslt/xslt.h in
> /opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
> no
> libxslt is missing. try 'port install libxslt' or 'yum install
> libxslt-devel'
> *** extconf.rb failed ***
> Could not create Makefile due to some reason, probably lack of
> necessary libraries and/or headers. Check the mkmf.log file for more
> details. You may need configuration options.
>
> I downloaded the tarball and tried running "ruby extconf.rb
> --with-xslt-dir=/home/myname/local" and an assortment of combinations like
> that, but I can't seem to make it work. What's the secret?
> Thanks.
I assume that you've already installed libxml2 and libxslt? They're
just in your local directory?
Your extconf parameters look fine. Can you send me the mkmf.log
*after* running with your custom params?
--
Aaron Patterson
http://tenderlovemaking.com/
From bobf at jhu.edu Thu Apr 2 15:57:09 2009
From: bobf at jhu.edu (Bob Fitterman)
Date: Thu, 2 Apr 2009 13:57:09 -0600
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
<6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
Message-ID: <88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
In this directory: ~/.gems/gems/nokogiri-1.2.3/ext/nokogiri
If I do this: ruby extconf.rb --opt-dir=/home/myname/local/
I see this:
checking for iconv.h in
/usr/include,/opt/local/include,/usr/local/include,/usr/include... yes
checking for libxml/parser.h in
/opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
yes
checking for libxslt/xslt.h in
/opt/local/include/,/opt/local/include/libxml2,/usr/include/libxml2,/usr/include,/usr/local/include/libxml2,/usr/include/libxml2...
no
libxslt is missing. try 'port install libxslt' or 'yum install
libxslt-devel'
*** extconf.rb failed ***
I figured this would tell it to look in the right directories but you'll
notice it's not searching through MY copies, it's looking out in /usr/local.
Everything is installed in /home/myname/local. Specifically, libxslt/xslt.h
can be found in
/home/myname/local/include/libxslt . I'm sure I'm missing something really
basic about how get it to build with the local install.
I think once I get that point, the contents of mkmf.log will be relevant.
Thanks.
Bob
From aaron.patterson at gmail.com Thu Apr 2 16:57:39 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 2 Apr 2009 13:57:39 -0700
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
<6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
<88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
Message-ID: <6959e1680904021357q2f55f748pc3d891a8d7d096d9@mail.gmail.com>
2009/4/2 Bob Fitterman :
> In this directory: ~/.gems/gems/nokogiri-1.2.3/ext/nokogiri
> If I do this: ruby extconf.rb --opt-dir=/home/myname/local/
Try something like this:
ruby extconf.rb --with-xslt-dir=/home/myname/local
You should see '/home/myname/local' show up in the searched directories.
--
Aaron Patterson
http://tenderlovemaking.com/
From bobf at jhu.edu Fri Apr 3 21:26:56 2009
From: bobf at jhu.edu (Bob Fitterman)
Date: Fri, 3 Apr 2009 19:26:56 -0600
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <6959e1680904021357q2f55f748pc3d891a8d7d096d9@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
<6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
<88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
<6959e1680904021357q2f55f748pc3d891a8d7d096d9@mail.gmail.com>
Message-ID: <88b3725b0904031826r448b5f21tf2505a9bcd51e7c7@mail.gmail.com>
Aaron, thanks it's getting closer, but there are 2 problems.
Getting there. The extconf and make work now, but the "make install" tries
to install to /usr/lib where I don't have permission. I think all I need is
some guidance on this last step and I'll be all set. Thanks for your
patience with this.
Bob
From aaron.patterson at gmail.com Fri Apr 3 21:49:03 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Fri, 3 Apr 2009 18:49:03 -0700
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <88b3725b0904031826r448b5f21tf2505a9bcd51e7c7@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
<6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
<88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
<6959e1680904021357q2f55f748pc3d891a8d7d096d9@mail.gmail.com>
<88b3725b0904031826r448b5f21tf2505a9bcd51e7c7@mail.gmail.com>
Message-ID: <6959e1680904031849q44650ca9o2e623691e720cae8@mail.gmail.com>
On Fri, Apr 3, 2009 at 6:26 PM, Bob Fitterman wrote:
> Aaron, thanks it's getting closer, but there are 2 problems.
>
> Getting there. The extconf and make work now, but the "make install" tries
> to install to /usr/lib where I don't have permission. I think all I need is
> some guidance on this last step and I'll be all set. Thanks for your
> patience with this.
No problem. Don't use 'make install' to install the gem. The gem
must be installed via the 'gem' command.
You can give the extconf.rb parameters to the gem command, and it will
use those while compiling, but install to your normal gem directory.
The command goes like this:
$ gem install nokogiri -- --with-libxslt-dir=/whatever/
All of the flags after the double dash will be supplied to extconf.rb.
Hope that helps.
--
Aaron Patterson
http://tenderlovemaking.com/
From bobf at jhu.edu Fri Apr 3 23:05:23 2009
From: bobf at jhu.edu (Bob Fitterman)
Date: Fri, 3 Apr 2009 21:05:23 -0600
Subject: [Nokogiri-talk] Advice on New Install to Local Directories
In-Reply-To: <6959e1680904031849q44650ca9o2e623691e720cae8@mail.gmail.com>
References: <88b3725b0904011251ia5a62f3lc86b39ccab938055@mail.gmail.com>
<6959e1680904011352wa747b3cv12f10243bebfb3d9@mail.gmail.com>
<88b3725b0904021257n3359974v496d42ef01c26120@mail.gmail.com>
<6959e1680904021357q2f55f748pc3d891a8d7d096d9@mail.gmail.com>
<88b3725b0904031826r448b5f21tf2505a9bcd51e7c7@mail.gmail.com>
<6959e1680904031849q44650ca9o2e623691e720cae8@mail.gmail.com>
Message-ID: <88b3725b0904032005t25f119dewa5e766d5c4913f8e@mail.gmail.com>
BINGO! Thanks a million. Your package is a lifesaver for my sluggish server.
Keep at it.
Bob
From phlip2005 at gmail.com Sat Apr 4 00:38:09 2009
From: phlip2005 at gmail.com (Phlip)
Date: Fri, 03 Apr 2009 21:38:09 -0700
Subject: [Nokogiri-talk] [Ann] Verify, a very basic testing tool.
In-Reply-To: <49D69885.6020105@comcast.net>
References: <335e48a90904031302o52992ad1m611789186e964e13@mail.gmail.com> <335e48a90904031515q5f63b694p11e9ecfaa2484bf1@mail.gmail.com> <335e48a90904031545o62efa7c1kd22c0403b263fc58@mail.gmail.com> <335e48a90904031548h46ebd6e2sb9fc2b5999385b9c@mail.gmail.com>
<49D69885.6020105@comcast.net>
Message-ID: <49D6E431.3060802@gmail.com>
Tom Cloyd wrote:
> Damn. I love this sudden burst of creativity around the topic of
> testing. A Good Thing.
Oookay. Here's a sneak preview of assert{ 2.0 } 0.4.8.
You know how Ajax works by generating JavaScript, and slinging it at your web
browser? And you know how Rails purportedly tests it with "assert_rjs"? Here's a
sample:
assert_rjs :replace_html, "advanced_filter", ""
Too cute, right?
Wrong! That expands to nothing but a big Regexp, like
/Element.update.*advanced_filter/. So a payload of "advanced_filter", or even a
subsequent Element.update('advanced_filter'), could fool it.
Further, at work we do a lot of in-house Ajax, so we are at liberty to render
entire partials at whim. We require our teeming minions to use only Firefox. But
all assert_rjs does with its third argument is drop it into assert_match. That
is not powerful enough to constrain our apps!
Just now while writing this post, I got Aaron Patterson's RKelly working in an
assert_rjs clone. RKelly uses racc to parse and evaluate JavaScript. This
matches our goal of _unit_ testing soft targets. Watir, Selenium, etc. are all
great, and they introduced a generation to testing in general. Buuuuuut they
work thru the browser. We are not inventing Ajax itself; we just need to
accurately spot-check that our own data go into the correct slots in our
JavaScript payloads.
So here's a test that simulates a Rails functional test with xhr :get :
@response = OpenStruct.new(:body => "Element.update(\"label_7\", \"I want a pet < than a chihuahua\");")
assert_rjs :replace_html, :label_7
K, so far that looks like the original assert_rjs. But under the hood, it
actually lexed the Element.update() call:
  ast.pointcut('Element.update()').matches.each do |updater|
    updater.grep(RKelly::Nodes::ArgumentsNode).each do |thang|
      div_id, html = thang.value
      if target and html
        div_id = eval(div_id.value)
        html   = eval(html.value)
        if div_id == target.to_s
          assert_match matcher, html
        end
      end
    end
  end
(Open question to Aaron - is that the best way to run the query?)
The test actually determines we really got hold of the Element.update('label_7',
...). No other JavaScript line will match.
Here are the assertions that match the text payload:
assert_rjs :replace_html, :label_7, /Top_Ranking/
assert_rjs :replace_html, :label_7, /pet < than a chihuahua/
Ho hum; so far assert_rjs Classic could have done all that. But...
Because I have an exact string, not a rough match, I can now treat it as pure
HTML, and I can drop it into the mighty assert_xhtml()! Now the assertion looks
like this:
assert_rjs :replace_html, :label_7 do
input.Top_Ranking! :type => :checked, :value => :Y
input.cross_sale_1, :type => :hidden, :value => 7
end
From here, no matter how complex that rendered partial, the assertion can keep
up with it, and help make it safe to refactor and upgrade.
BTW Verify and Testy can get on board if they A> import Test::Unit::Assertions
(like certain other test rigs we could mention should), and B> implement
flunk(). Both of those are all a custom assertion should ever need...
--
Phlip
http://www.zeroplayer.com/
From phlip2005 at gmail.com Sat Apr 4 08:59:29 2009
From: phlip2005 at gmail.com (Phlip)
Date: Sat, 04 Apr 2009 05:59:29 -0700
Subject: [Nokogiri-talk] to_xhtml should not emit
Message-ID: <49D759B1.80703@gmail.com>
Nokogiri:
My assert_xhtml failure message formerly used to_html:
"\n\n...in this sample...\n\n" +
sample.to_html
That printed:
...in this sample...
But 'checked' is too Last Millennium. So I upgrade to to_xhtml...
...and get the same thing. Ouch! use to_xml??
From aaron.patterson at gmail.com Sat Apr 4 17:26:32 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Sat, 4 Apr 2009 14:26:32 -0700
Subject: [Nokogiri-talk] to_xhtml should not emit
In-Reply-To: <49D759B1.80703@gmail.com>
References: <49D759B1.80703@gmail.com>
Message-ID: <6959e1680904041426m70700c28v36eb91589d346c15@mail.gmail.com>
On Sat, Apr 4, 2009 at 5:59 AM, Phlip wrote:
> Nokogiri:
>
> My assert_xhtml failure message formerly used to_html:
>
>                     "\n\n...in this sample...\n\n" +
>                          sample.to_html
>
> That printed:
>
> ...in this sample...
>
> value="Y">
>
> But 'checked' is too Last Millennium. So I upgrade to to_xhtml...
>
> ...and get the same thing. Ouch! use to_xml??
What version of libxml2 are you using?
--
Aaron Patterson
http://tenderlovemaking.com/
From phlip2005 at gmail.com Sat Apr 4 23:09:49 2009
From: phlip2005 at gmail.com (Phlip)
Date: Sat, 04 Apr 2009 20:09:49 -0700
Subject: [Nokogiri-talk] to_xhtml should not emit
In-Reply-To: <6959e1680904041426m70700c28v36eb91589d346c15@mail.gmail.com>
References: <49D759B1.80703@gmail.com>
<6959e1680904041426m70700c28v36eb91589d346c15@mail.gmail.com>
Message-ID: <49D820FD.1070905@gmail.com>
> What version of libxml2 are you using?
$ aptitude show libxml2
Package: libxml2
State: installed
Automatically installed: no
Version: 2.6.32.dfsg-5ubuntu3
And I suspect something inside it just screwed up my libxml-ruby.
Who needs monkey patching when you can just sudo kate the source??! (-:
From aaron.patterson at gmail.com Mon Apr 6 12:39:33 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Mon, 6 Apr 2009 09:39:33 -0700
Subject: [Nokogiri-talk] to_xhtml should not emit
In-Reply-To: <49D820FD.1070905@gmail.com>
References: <49D759B1.80703@gmail.com>
<6959e1680904041426m70700c28v36eb91589d346c15@mail.gmail.com>
<49D820FD.1070905@gmail.com>
Message-ID: <6959e1680904060939l3ef6f6e2jf7cbfaa6b11616bd@mail.gmail.com>
On Sat, Apr 4, 2009 at 8:09 PM, Phlip wrote:
>> What version of libxml2 are you using?
>
> $ aptitude show libxml2
> Package: libxml2
> State: installed
> Automatically installed: no
> Version: 2.6.32.dfsg-5ubuntu3
>
> And I suspect something inside it just screwed up my libxml-ruby.
Would you mind giving it a whirl with libxml 2.7.3? IIRC, the XHTML
functionality is busted in the 2.6.* series. 2.7.3 provides valid
XHTML for me.
--
Aaron Patterson
http://tenderlovemaking.com/
From julien.genestoux at gmail.com Fri Apr 10 02:22:38 2009
From: julien.genestoux at gmail.com (Julien Genestoux)
Date: Thu, 9 Apr 2009 23:22:38 -0700
Subject: [Nokogiri-talk] Comparing documents
Message-ID: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
Hey,
Is there an easy way to compare Nokogiri Documents?
The idea is here that I am trying to build some XML with the builder, and,
to make sure I am building correctly, I am parsing a xml chunk that should
be the result and comparing it to what the builder did.
I could compare the string versions of the doc, but then, I have errors with
the slightest space difference, which is not relevant.
Initially, I thought I could easily do it recursively by comparing nodes...
but node comparison fails even if 2 nodes have the same name and
attributes... :
doc = Nokogiri::XML::Document.new
=>
n = Nokogiri::XML::Element.new("salut", doc)
=>
n["toto"] = "tata"
=> "tata"
n
=>
m = Nokogiri::XML::Element.new("salut", doc)
=>
m["toto"] = "tata"
=> "tata"
m == n
=> false
Any help here?
Thanks a bunch!
Julien
--
Julien Genestoux, Notifixio.us
http://twitter.com/julien51
http://www.ouvre-boite.com
http://blog.notifixio.us
+1 (415) 254 7340
+33 (0)9 70 44 76 29
From mike.dalessio at gmail.com Fri Apr 10 07:56:56 2009
From: mike.dalessio at gmail.com (Mike Dalessio)
Date: Fri, 10 Apr 2009 07:56:56 -0400
Subject: [Nokogiri-talk] Comparing documents
In-Reply-To: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
References: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
Message-ID: <618c07250904100456s2492540bvd664ee64c10a31d8@mail.gmail.com>
Julien -
Check out Aaron's tree_diff.
http://github.com/tenderlove/tree_diff/tree
-m
--
mike dalessio
mike at csa.net
From aaron.patterson at gmail.com Fri Apr 10 11:46:29 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Fri, 10 Apr 2009 08:46:29 -0700
Subject: [Nokogiri-talk] Comparing documents
In-Reply-To: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
References: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
Message-ID: <6959e1680904100846m32494887rdb275aa00647ce8c@mail.gmail.com>
On Thu, Apr 9, 2009 at 11:22 PM, Julien Genestoux
wrote:
> Hey,
>
> Is there an easy way to compare Nokogiri Documents?
>
> The idea is here that I am trying to build some XML with the builder, and,
> to make sure I am building correctly, I am parsing a xml chunk that should
> be the result and comparing it to what the builder did.
> I could compare the string versions of the doc, but then, I have errors with
> the slightest space difference, which is not relevant.
>
> Initially, I thought I could easily do it recursively by comparing nodes...
> but node comparison fails even if 2 nodes have the same name and
> attributes... :
>
> doc = Nokogiri::XML::Document.new
> =>
> n = Nokogiri::XML::Element.new("salut", doc)
> =>
> n["toto"] = "tata"
> => "tata"
> n
> =>
> m = Nokogiri::XML::Element.new("salut", doc)
> =>
> m["toto"] = "tata"
> => "tata"
> m == n
> => false
>
> Any help here?
As far as the XML document is concerned, no two nodes are ever equal.
Every node in a document is different. Every node has many attributes
to compare:
1. Is the name the same?
2. How about attributes?
3. How about the namespace?
4. What about number of children?
5. Are all the children the same?
6. Is its parent node the same?
7. What about its position relative to sibling nodes?
Think about adding two nodes to the same document. They can *never*
have the same position relative to sibling nodes, therefore two nodes
in a document cannot be "equal".
You *can* however compare two different documents. But you need to
answer those 7 questions yourself as you're walking the two trees.
Your requirements for sameness may differ from others.
I wouldn't be opposed to implementing a =~ on Node that did this
comparison, but was very strict about those questions.
You could do stuff like:
doc1 =~ doc2 # => true
doc2 =~ doc3 # => false
As long as it only returned true or false. How does that sound?
--
Aaron Patterson
http://tenderlovemaking.com/
From phlip2005 at gmail.com Fri Apr 10 14:43:51 2009
From: phlip2005 at gmail.com (Phlip)
Date: Fri, 10 Apr 2009 11:43:51 -0700
Subject: [Nokogiri-talk] Comparing documents
In-Reply-To: <6959e1680904100846m32494887rdb275aa00647ce8c@mail.gmail.com>
References: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com>
<6959e1680904100846m32494887rdb275aa00647ce8c@mail.gmail.com>
Message-ID: <49DF9367.5010600@gmail.com>
Aaron Patterson wrote:
> As far as the XML document is concerned, no two nodes are ever equal.
> Every node in a document is different. Every node has many attributes
> to compare:
>
> 1. Is the name the same?
> 2. How about attributes?
> 3. How about the namespace?
> 4. What about number of children?
> 5. Are all the children the same?
> 6. Is its parent node the same?
> 7. What about its position relative to sibling nodes?
The algorithm inside assert_xhtml essentially detects when one XML is a subset
of the other. Reducing its fuzziness would produce a more exact match detector...
From phlip2005 at gmail.com Fri Apr 10 17:33:17 2009
From: phlip2005 at gmail.com (Phlip)
Date: Fri, 10 Apr 2009 14:33:17 -0700
Subject: [Nokogiri-talk] Comparing documents
In-Reply-To: <26c0cf900904101418h7237b8a2r4e42afd08e739abb@mail.gmail.com>
References: <26c0cf900904092322g352faed6x57baa7fab45fc7f4@mail.gmail.com> <6959e1680904100846m32494887rdb275aa00647ce8c@mail.gmail.com>
<26c0cf900904101418h7237b8a2r4e42afd08e739abb@mail.gmail.com>
Message-ID: <49DFBB1D.6090808@gmail.com>
Julien Genestoux wrote:
> Thank you guys for the responses...
>
> It is actually for rspec purposes that I need to compare an expected doc
> and the actual generated.
I suspect assert_xhtml can do this.
Install the gems nokogiri and assert2. Then require 'assert2/xhtml'
>
> Aaron, here is how I see the 7 questions :
> 1. Is the name the same? yes, it should
> 2. How about attributes? same
> 3. How about the namespace? same
> 4. What about number of children? same
> 5. Are all the children the same? same
> 6. Is its parent node the same? same
> 7. What about its position relative to sibling nodes? nope!
Then...
  @response.body.should be_html_with{
    xml_tag :attribute1 => 'value1' do
      nested_tag :attribute2 => 'value2',
        :xpath! => '42 = count(*) and parent::xml_tag and position() = 0'
    end
  }
Ask if that doesn't work - it's exactly what I invented assert_xhtml for, with the
slight matter that I will drop your XML into Nokogiri::HTML. That may or may not be a
problem!
Naturally, this library needs a .be_xml_with{}, for completeness.
From aaron.patterson at gmail.com Thu Apr 16 13:13:05 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 16 Apr 2009 10:13:05 -0700
Subject: [Nokogiri-talk] Help with XML Builder
In-Reply-To:
References:
Message-ID: <6959e1680904161013j7b8fde72oc06f02e6c69e9658@mail.gmail.com>
On Thu, Apr 16, 2009 at 4:11 AM, Antel wrote:
> I've this hash:
> @@node = {:nodes=>["main_question", "best_answer", "other_answer"],
> :rootnodes=>["questions_main", "main_content", "main_q1", "main_q2",
> "close_node", "best_a1", "close_node", "close_node", "main_answers",
> "close_node", "close_node", "close_node"]}
Is there any way you can merge those two arrays before building up
your XML? Having the two separate arrays makes things difficult. If
it was one array, you could do this:
http://rafb.net/p/Fbzada27.html
--
Aaron Patterson
http://tenderlovemaking.com/
From byrnejb at harte-lyne.ca Thu Apr 16 16:49:02 2009
From: byrnejb at harte-lyne.ca (James B. Byrne)
Date: Thu, 16 Apr 2009 16:49:02 -0400 (EDT)
Subject: [Nokogiri-talk] Need basic xml help
Message-ID: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca>
I am trying to parse an xml:RDF document. This is completely new
for me and I am struggling. Being a cucumber user I was already
aware of nokogiri and so I decided to push ahead using nokogiri
first.
My xml document is a central bank rss feed of exchange rates. It
looks like this.
Bank of Canada: Noon Foreign Exchange Rates
...
...
    en
    2009-04-16
       CA: 0.8290 USD = 1 CAD 2009-04-16 ...
       http://www.bankofcanada.ca/fx/daily_noon.html
       1 Canadian Dollar = 0.8290 USD ...
       en
       2009-04-16
       text/html
       CA
       CAD
       USD
       0.8290
       noon
       statistics
... # ~ 50 of these entries
All I have managed to do is load the document and then I am stuck.
>> fx = Nokogiri::XML(open(
'http://www.bankofcanada.ca/rss/fx/noon/fx-noon-all.xml'))
I can see the document if I call fx.to_xml but I cannot figure out
how to navigate through it. The final result that I am trying to
achieve is extract the exchange rate for a given currency on a given
day and put it into an ActiveRecord model for loading into a
database.
So the bits I am interested in are: , ,
, , and . Presumably,
once I discover how to reach any of the interior tags then I will be
able to use the same technique to reach any others.
I am working in the console to get some sense of how this is
supposed to work. However, fx.child or fx.children return nil.
fx.content shows the entire file. So does fx.root. fx.next yields
nil.
What I need is a brief set of instructions on how to navigate this
document using nokogiri.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
From aaron.patterson at gmail.com Thu Apr 16 17:10:24 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 16 Apr 2009 14:10:24 -0700
Subject: [Nokogiri-talk] Need basic xml help
In-Reply-To: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca>
References: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca>
Message-ID: <6959e1680904161410u28917c66uc741688a9b896c3a@mail.gmail.com>
On Thu, Apr 16, 2009 at 1:49 PM, James B. Byrne wrote:
> I am trying to parse an xml:RDF document. This is completely new
> for me and I am struggling. Being a cucumber user I was already
> aware of nokogiri and so I decided to push ahead using nokogiri
> first.
>
> My xml document is a central bank rss feed of exchange rates. It
> looks like this.
>
>
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> ... # many more namespaces declared
>
> Bank of Canada: Noon Foreign Exchange Rates
> ...
>        ...
>     en
>     2009-04-16
>
>        CA: 0.8290 USD = 1 CAD 2009-04-16 ...
>        http://www.bankofcanada.ca/fx/daily_noon.html
>        1 Canadian Dollar = 0.8290 USD ...
>        en
>        2009-04-16
>        text/html
>        CA
>
>        CAD
>        USD
>        0.8290
>        noon
>        statistics
>
>  ... # ~ 50 of these entries
>
>
>
>
> All I have managed to do is load the document and then I am stuck.
>
> >> fx = Nokogiri::XML(open(
>       'http://www.bankofcanada.ca/rss/fx/noon/fx-noon-all.xml'))
>
> I can see the document if I call fx.to_xml but I cannot figure out
> how to navigate through it. The final result that I am trying to
> achieve is extract the exchange rate for a given currency on a given
> day and put it into an ActiveRecord model for loading into a
> database.
>
> So the bits I am interested in are: , ,
> , , and . Presumably,
> once I discover how to reach any of the interior tags then I will be
> able to use the same technique to reach any others.
>
> I am working in the console to get some sense of how this is
> supposed to work. However, fx.child or fx.children return nil.
> fx.content shows the entire file. So does fx.root. fx.next yields
> nil.
> nil.
fx is pointing at the document. Try something like
'fx.root.children.length' and see what that returns.
> What I need is a brief set of instructions on how to navigate this
> document using nokogiri.
If you're familiar with CSS or XPath, it should be pretty easy to
navigate. Take a look at the synopsis in the README:
http://nokogiri.rubyforge.org/nokogiri/
Or also the xpath or css method documentation:
http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Node.html
Also, try something like this:
fx.css('item').each { |item| p item.children }
--
Aaron Patterson
http://tenderlovemaking.com/
From aaron.patterson at gmail.com Thu Apr 16 17:18:12 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 16 Apr 2009 14:18:12 -0700
Subject: [Nokogiri-talk] How to convert this XML parsing code to use a
stream?
In-Reply-To:
References:
Message-ID: <6959e1680904161418h77fa4ebbu74e4e0c082b45900@mail.gmail.com>
On Thu, Apr 16, 2009 at 2:03 PM, Jed Hurt wrote:
> I am trying to parse the Netflix catalog index XML file to cherry pick the
> movies that are available to watch instantly[1][2]. The XML file provided by
> Netflix contains every movie in Netflix's catalog and is huge (289MB). I
> wrote some Nokogiri code to find all movies available instantly and ran it
> against a small sample XML file to test it. It works well.
> The problem is that when I run the code against the actual file, my computer
> grinds to a crashing halt and ruby starts throwing malloc errors. Here is
> the code and the sample XML file:
> Code: http://pastie.textmate.org/445794
> Sample XML: http://pastie.textmate.org/445796
> Would someone be willing to convert the code to use streaming
> (one at a time) to mitigate the memory issues?
Converting your code is non-trivial work. Try starting with
Nokogiri::XML::SAX::Parser.
Basically, you'll need to create a class that inherits from
Nokogiri::XML::SAX::Document and pass that off to the SAX parser.
Then give the SAX parser your IO stream and it will call callbacks on
your document class.
Take a look at the tests for an example:
http://github.com/tenderlove/nokogiri/blob/99abb9fb042238004b779317757c8480ed2f143d/test/xml/sax/test_parser.rb
Paul Dix makes heavy use of the SAX parser too, so you might want to
check out his project:
http://github.com/pauldix/feedzirra/tree/master
--
Aaron Patterson
http://tenderlovemaking.com/
From byrnejb at harte-lyne.ca Thu Apr 16 20:35:14 2009
From: byrnejb at harte-lyne.ca (James B. Byrne)
Date: Thu, 16 Apr 2009 20:35:14 -0400 (EDT)
Subject: [Nokogiri-talk] Need basic xml help
In-Reply-To: <6959e1680904161410u28917c66uc741688a9b896c3a@mail.gmail.com>
References: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca>
<6959e1680904161410u28917c66uc741688a9b896c3a@mail.gmail.com>
Message-ID: <61918.69.157.29.96.1239928514.squirrel@webmail.harte-lyne.ca>
On Thu, April 16, 2009 17:10, Aaron Patterson wrote:
> Also, try something like this:
>
> fx.css('item').each { |item| p item.children }
>
Thank you. I have already found and read the references that you give.
What I am trying to get an example of is the type of construction
that would do this:
fx = Nokogiri::XML(open(
'http://www.bankofcanada.ca/rss/fx/noon/fx-noon-all.xml'))
fx.xpath(???).each do |xchg|
cc = CurrencyExChg.new
cc.currency_base = xchg.??
cc.currency_target = xchg.??
cc.currency_xchg_rate = xchg.??
cc.currency_xchg_date = xchg.??
cc.save!
end
To begin with, I can get nowhere using xpath on the document I am
working with.
>> fx.xpath('//item').each do
?> cnt = cnt -1
>> puts cnt
>> end
=> 0
>>
?> fx.css('item').each do
?> cnt = cnt - 1
>> puts cnt
>> end
-1
-2
-3
...
Now, my problem is that even if I use css and an iterator construct
like this:
fx.css('item').each do |rate|
I do not know how I get the data elements I desire out of the rate
variable. I cannot even grasp what the rate object is or what it
contains. In my head I imagined that one of the xml parsers
available for Ruby would take an xml doc and turn it into a nested
hash or array, so that for the example I gave in my previous message
I would obtain something akin to:
fx = item[ {:date => '2009-04-16', :base_currency => 'CAD',
:target_currency => 'USD', :value => '0.8290', ...},
{:date => '2009-04-16', :base_currency => 'CAD',
:target_currency => 'GBP', :value => '1.6354', ...},
...
]
which I could at least examine in console and could map to the
setter attributes of an ActiveRecord class in a fairly straight
forward manner. As it is I cannot seem to find any methods that
display the data structure I am working with in a manner that I can
extract the relevant parts.
I have gone through three or four tutorials now, including a good
one at http://www.zvon.org/xxl/XPathTutorial/ which explains the
xpath hierarchy very well. Unfortunately, with nokogiri all I can
accomplish with fx.xpath is to dump the entire document. It does
not seem to matter what I provide as arguments.
?> fx.xpath('//*item').each do
?> cnt = cnt + 1
>> puts cnt
>> end
Nokogiri::XML::XPath::SyntaxError: Invalid expression
?> fx.xpath('//item').each do
?> cnt = cnt + 1
>> puts cnt
>> end
=> 0
?> fx.xpath('item').each do
?> cnt = cnt + 1
>> puts cnt
>> end
=> 0
Now, I realize that this is due to ignorance on my part. But I
really cannot figure out how to obtain what I desire from the
abbreviated examples that I can find. I need an example of how to
pull successive sets of information out of that xml document. I
know that it is possible. I believe that Ruby and nokogiri can do
it. I just need instruction on how it is done.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
From phlip2005 at gmail.com Fri Apr 17 00:47:55 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 16 Apr 2009 21:47:55 -0700
Subject: [Nokogiri-talk] Need basic xml help
In-Reply-To: <61918.69.157.29.96.1239928514.squirrel@webmail.harte-lyne.ca>
References: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca> <6959e1680904161410u28917c66uc741688a9b896c3a@mail.gmail.com>
<61918.69.157.29.96.1239928514.squirrel@webmail.harte-lyne.ca>
Message-ID: <49E809FB.4050908@gmail.com>
James B. Byrne wrote:
> What I am trying to get an example of is the type of construction
> that would do this:
Here's how far I got at first crack:
fx = Nokogiri::XML(File.read('fx-noon-all.xml'))
fx.xpath('/rdf:RDF/*').each do |xchg|
  p xchg.name
  p xchg.path
end
I don't want to use /* - I want to use /item. But it does not work, and the
above returns this:
"channel"
"/rdf:RDF/*[1]"
"item"
"/rdf:RDF/*[2]"
"item"
"/rdf:RDF/*[3]"
"item"
...
So it seems there's an issue with the namespace. (Firefox XPath Checker and
XPather both broke on that XML - dunno why, but different XML implementation.)
So this hack tends to work:
fx.xpath('/rdf:RDF/*').each do |xchg|
  if xchg.name == 'item'
    p xchg.xpath('cb:baseCurrency').text
    p xchg.xpath('cb:targetCurrency').text
    p xchg.xpath('cb:value').text
    p xchg.xpath('dc:date').text
  end
end
> Now, I realize that this is due to ignorance on my part.
Though I have never used XPath with namespaces (I mostly attack XHTML with it),
I have written enough XPaths to suspect a bug in libxml2 here...
--
Phlip
http://flea.sourceforge.net/resume.html
From byrnejb at harte-lyne.ca Fri Apr 17 10:21:44 2009
From: byrnejb at harte-lyne.ca (James B. Byrne)
Date: Fri, 17 Apr 2009 10:21:44 -0400 (EDT)
Subject: [Nokogiri-talk] Need basic xml help
In-Reply-To: <49E809FB.4050908@gmail.com>
References: <57029.216.185.71.24.1239914942.squirrel@webmail.harte-lyne.ca>
<6959e1680904161410u28917c66uc741688a9b896c3a@mail.gmail.com>
<61918.69.157.29.96.1239928514.squirrel@webmail.harte-lyne.ca>
<49E809FB.4050908@gmail.com>
Message-ID: <38993.216.185.71.24.1239978104.squirrel@webmail.harte-lyne.ca>
On Fri, April 17, 2009 00:47, Phlip wrote:
>
> So this hack tends to work:
>
> fx.xpath('/rdf:RDF/*').each do |xchg|
> if xchg.name == 'item'
> p xchg.xpath('cb:baseCurrency').text
> p xchg.xpath('cb:targetCurrency').text
> p xchg.xpath('cb:value').text
> p xchg.xpath('dc:date').text
> end
> end
>
Thank you ever so much. This is exactly the type of example that I
was looking for and either did not recognize or could not find.
re: libxml2 - this is what I have:
libxml2.x86_64 2.6.26-2.1.2.7 el5 (CentOS-5.3)
Again, thanks.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
From aaron.patterson at gmail.com Fri Apr 17 13:11:44 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Fri, 17 Apr 2009 10:11:44 -0700
Subject: [Nokogiri-talk] bug tracker is changing
Message-ID: <6959e1680904171011l56b4337eh1d3819de4b1cb667@mail.gmail.com>
Hi everyone!
I am changing the bug tracker to use github issues. Please do not
file anymore tickets in lighthouse. The new ticket tracker is here:
http://github.com/tenderlove/nokogiri/issues
I will attempt to resolve the rest of the issues on lighthouse, but
from this point forward, please use the github issue tracker. Again,
the url is:
http://github.com/tenderlove/nokogiri/issues
Thanks everyone!
--
Aaron Patterson
http://tenderlovemaking.com/
From adam.vandenhoven at gmail.com Fri Apr 17 19:01:08 2009
From: adam.vandenhoven at gmail.com (Adam van den Hoven)
Date: Fri, 17 Apr 2009 16:01:08 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
Message-ID: <1240009268.6904.161.camel@vandenhoven>
Hey all,
I'm building a little sinatra app that uses nokogiri to load a third
party site, make a few changes and then send out the result (it's a
simplistic iPhone simulator). When I run it from my laptop (running
ubuntu), I get the following error:
*** glibc detected *** /usr/bin/ruby1.8: double free or
corruption (!prev): 0x087e05c0 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb7ca6454]
/lib/tls/i686/cmov/libc.so.6(cfree+0x96)[0xb7ca84b6]
/usr/lib/libruby1.8.so.1.8(ruby_xfree+0x37)[0xb7e570a7]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x20e)[0xb762face]
/usr/lib/libxml2.so.2(xmlFreeProp+0x5a)[0xb762ffea]
/usr/lib/libxml2.so.2(xmlFreePropList+0x1b)[0xb76302bb]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0xa2)[0xb762f962]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
/usr/lib/libxml2.so.2(xmlFreeDoc+0xbc)[0xb762f77c]
/usr/lib/ruby/gems/1.8/gems/nokogiri-1.2.3/lib/nokogiri/native.so[0xb7ae80c9]
/usr/lib/libruby1.8.so.1.8(rb_gc_call_finalizer_at_exit
+0xa7)[0xb7e573a7]
/usr/lib/libruby1.8.so.1.8[0xb7e3bbb7]
/usr/lib/libruby1.8.so.1.8(ruby_cleanup+0x1a2)[0xb7e48a52]
/usr/lib/libruby1.8.so.1.8(ruby_stop+0x1d)[0xb7e48b7d]
/usr/lib/libruby1.8.so.1.8[0xb7e50021]
/usr/bin/ruby1.8[0x804870d]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7c4d685]
/usr/bin/ruby1.8[0x8048621]
Now I'm not sure that's a very good thing and I'm not sure I should put
it up on Dreamhost in my dev location unless someone has an idea of what
the problem is.
Any thoughts?
--
Adam van den Hoven
Hybrid Web Developer
Little Fyr Media
p: 604.618.0845
e: adam.vandenhoven at gmail.com
w: http://www.littlefyr.com
From aaron.patterson at gmail.com Fri Apr 17 19:58:55 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Fri, 17 Apr 2009 16:58:55 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <1240009268.6904.161.camel@vandenhoven>
References: <1240009268.6904.161.camel@vandenhoven>
Message-ID: <6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
On Fri, Apr 17, 2009 at 4:01 PM, Adam van den Hoven
wrote:
> Hey all,
>
> I'm building a little sinatra app that uses nokogiri to load a third
> party site, make a few changes and then send out the result (it's a
> simplistic iPhone simulator). When I run it from my laptop (running
> ubuntu), I get the following error:
>
>        *** glibc detected *** /usr/bin/ruby1.8: double free or
>        corruption (!prev): 0x087e05c0 ***
>        ======= Backtrace: =========
>        /lib/tls/i686/cmov/libc.so.6[0xb7ca6454]
>        /lib/tls/i686/cmov/libc.so.6(cfree+0x96)[0xb7ca84b6]
>        /usr/lib/libruby1.8.so.1.8(ruby_xfree+0x37)[0xb7e570a7]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x20e)[0xb762face]
>        /usr/lib/libxml2.so.2(xmlFreeProp+0x5a)[0xb762ffea]
>        /usr/lib/libxml2.so.2(xmlFreePropList+0x1b)[0xb76302bb]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0xa2)[0xb762f962]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
>        /usr/lib/libxml2.so.2(xmlFreeNodeList+0x80)[0xb762f940]
>        /usr/lib/libxml2.so.2(xmlFreeDoc+0xbc)[0xb762f77c]
>        /usr/lib/ruby/gems/1.8/gems/nokogiri-1.2.3/lib/nokogiri/native.so[0xb7ae80c9]
>        /usr/lib/libruby1.8.so.1.8(rb_gc_call_finalizer_at_exit
>        +0xa7)[0xb7e573a7]
>        /usr/lib/libruby1.8.so.1.8[0xb7e3bbb7]
>        /usr/lib/libruby1.8.so.1.8(ruby_cleanup+0x1a2)[0xb7e48a52]
>        /usr/lib/libruby1.8.so.1.8(ruby_stop+0x1d)[0xb7e48b7d]
>        /usr/lib/libruby1.8.so.1.8[0xb7e50021]
>        /usr/bin/ruby1.8[0x804870d]
>        /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7c4d685]
>        /usr/bin/ruby1.8[0x8048621]
>
> Now I'm not sure that's a very good thing and I'm not sure I should put
> it up on Dreamhost in my dev location unless someone has an idea of what
> the problem is.
>
> Any thoughts?
Well, it's either a bug in nokogiri or a bug in libxml2. What version
of libxml2 are you using?
--
Aaron Patterson
http://tenderlovemaking.com/
From adam.vandenhoven at gmail.com Mon Apr 20 02:27:35 2009
From: adam.vandenhoven at gmail.com (Adam van den Hoven)
Date: Sun, 19 Apr 2009 23:27:35 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
Message-ID: <1240208855.7836.11.camel@vandenhoven>
synaptic says:
2.6.32.dfsg-4ubuntu1.1
--
Adam van den Hoven
Hybrid Web Developer
Little Fyr Media
p: 604.618.0845
e: adam.vandenhoven at gmail.com
w: http://www.littlefyr.com
From byrnejb at harte-lyne.ca Mon Apr 20 16:59:13 2009
From: byrnejb at harte-lyne.ca (James B. Byrne)
Date: Mon, 20 Apr 2009 16:59:13 -0400 (EDT)
Subject: [Nokogiri-talk] Ruby Rails and Nokogiri
Message-ID: <58868.216.185.71.24.1240261153.squirrel@webmail.harte-lyne.ca>
I am trying to use nokogiri in a class that pulls and parses an xml
feed. It is intended that the class be used in a stand-alone Ruby
process run via cron.
If I use the class in script/console then everything works fine. If
instead I put it into a stand alone script then I see this error:
$ bin/hll_forex_ca_feed.rb
"--- !ruby/object:ForexCASource \nforex:
!ruby/object:Nokogiri::XML::Document \n decorators: \n errors:
[]\n\n"
The script is just this:
#!/usr/bin/env ruby
require File.dirname(__FILE__) + '/../config/environment'
require 'rubygems'
require 'active_record'
require 'forex_ca_source'
fx = ForexCASource.new
...
The class looks like this:
require 'nokogiri'
require 'open-uri'
class ForexCASource
  include Nokogiri::XML
  FOREX_SITE = \
    'http://www.bankofcanada.ca/rss/fx/noon/fx-noon-all.xml'
  def initialize(source=nil)
    return xchg_source unless source
    return xchg_source(source)
  end
  def xchg_source(source=FOREX_SITE)
    @forex = Nokogiri::XML(open(source))
  rescue Exception => e
    Rails::logger.error(
      "ForexCASource unable to open #{source} \n #{e}")
    raise e
  end
  ...
end
Any idea of what I am doing wrong in setting up the environment for
nokogiri that script/console takes care of?
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
From aaron.patterson at gmail.com Mon Apr 20 17:26:06 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Mon, 20 Apr 2009 14:26:06 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <1240208855.7836.11.camel@vandenhoven>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
<1240208855.7836.11.camel@vandenhoven>
Message-ID: <6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
Hi Adam,
On Sun, Apr 19, 2009 at 11:27 PM, Adam van den Hoven
wrote:
> synaptic says:
>
> 2.6.32.dfsg-4ubuntu1.1
mailman blocked your last email because the attachment was too big.
I downloaded the app and tried it out, but I can't seem to reproduce
the problem.
Does it happen every request, or just sometimes?
Is the document you parse the same every time?
Are you able to reproduce the bug in a stand alone ruby script?
--
Aaron Patterson
http://tenderlovemaking.com/
From byrnejb at harte-lyne.ca Mon Apr 20 17:43:30 2009
From: byrnejb at harte-lyne.ca (James B. Byrne)
Date: Mon, 20 Apr 2009 17:43:30 -0400 (EDT)
Subject: [Nokogiri-talk] [SOLVED] Re: Ruby Rails and Nokogiri
In-Reply-To: <58868.216.185.71.24.1240261153.squirrel@webmail.harte-lyne.ca>
References: <58868.216.185.71.24.1240261153.squirrel@webmail.harte-lyne.ca>
Message-ID: <46729.216.185.71.24.1240263810.squirrel@webmail.harte-lyne.ca>
Found my mistake.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
From adam.vandenhoven at gmail.com Mon Apr 20 17:47:34 2009
From: adam.vandenhoven at gmail.com (Adam van den Hoven)
Date: Mon, 20 Apr 2009 14:47:34 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
<1240208855.7836.11.camel@vandenhoven>
<6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
Message-ID: <1240264054.7108.149.camel@vandenhoven>
Aaron,
I didn't intend to send it to the list but to Mike Dalessio directly
because it was too big.
For me, the problem arises when I request it a few times and rack
terminates. However, if I quit rackup (^C) I see the error I reported.
--
Adam van den Hoven
Hybrid Web Developer
Little Fyr Media
p: 604.618.0845
e: adam.vandenhoven at gmail.com
w: http://www.littlefyr.com
From aaron.patterson at gmail.com Mon Apr 20 20:45:26 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Mon, 20 Apr 2009 17:45:26 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <1240264054.7108.149.camel@vandenhoven>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
<1240208855.7836.11.camel@vandenhoven>
<6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
<1240264054.7108.149.camel@vandenhoven>
Message-ID: <6959e1680904201745h6cd70e4dlf0874cc6d4464a91@mail.gmail.com>
On Mon, Apr 20, 2009 at 2:47 PM, Adam van den Hoven
wrote:
> Aaron,
>
> I didn't intend to send it to the list but to Mike Dalessio directly
> because it was too big.
Cool, no problem. I was just letting you know.
> For me, the problem arises when I request it a few times and rack
> terminates. However, if I quit rackup (^C) I see the error I reported.
Do you know if it's parsing the same document every single time?
--
Aaron Patterson
http://tenderlovemaking.com/
From adam.vandenhoven at gmail.com Tue Apr 21 10:55:03 2009
From: adam.vandenhoven at gmail.com (Adam van den Hoven)
Date: Tue, 21 Apr 2009 07:55:03 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <6959e1680904201745h6cd70e4dlf0874cc6d4464a91@mail.gmail.com>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
<1240208855.7836.11.camel@vandenhoven>
<6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
<1240264054.7108.149.camel@vandenhoven>
<6959e1680904201745h6cd70e4dlf0874cc6d4464a91@mail.gmail.com>
Message-ID: <1240325703.7108.1179.camel@vandenhoven>
The example I sent is doing the same every time (the page itself is
pretty static). I don't know what other pages might have the problem
since it seems to be tied to some code I wrote to fix crappy and weird
content.
--
Adam van den Hoven
Hybrid Web Developer
Little Fyr Media
p: 604.618.0845
e: adam.vandenhoven at gmail.com
w: http://www.littlefyr.com
From adam.vandenhoven at gmail.com Wed Apr 22 02:12:55 2009
From: adam.vandenhoven at gmail.com (Adam van den Hoven)
Date: Tue, 21 Apr 2009 23:12:55 -0700
Subject: [Nokogiri-talk] Memory problem with Nokogiri and rack
In-Reply-To: <618c07250904212230h69fc63fay96cb407ba8277b08@mail.gmail.com>
References: <1240009268.6904.161.camel@vandenhoven>
<6959e1680904171658o540c2f8xc3261b196a7fb71b@mail.gmail.com>
<1240208855.7836.11.camel@vandenhoven>
<6959e1680904201426mfe1e0d6v9ada9806719e574d@mail.gmail.com>
<1240264054.7108.149.camel@vandenhoven>
<6959e1680904201745h6cd70e4dlf0874cc6d4464a91@mail.gmail.com>
<1240325703.7108.1179.camel@vandenhoven>
<618c07250904212207n53af99b9y7edeb8e2cf08135f@mail.gmail.com>
<618c07250904212230h69fc63fay96cb407ba8277b08@mail.gmail.com>
Message-ID: <1240380775.7108.1592.camel@vandenhoven>
It's nice to know that I wasn't totally insane.
It's an interesting question, however, as to what the right behaviour is
when trying to add a node that already exists in a document to some
other location in the document.
There are 4 possible things one could do in this situation:
1) Clone the node
2) Unlink the node
3) Throw an exception
4) Silently do nothing
Now the first and last ones seem like bad ideas (yes they're straw
men).
I lean toward the third based on the idea that a method should do only
one thing. Right now, it is doing two, in a way that is not clear from
the method name.
Having said that, a set of "move_" methods might be in order such that
head.add_next_sibling( body.unlink )
is the same as
body.move_after( head )
I would then have move_after, move_before, move_to_start_of,
move_to_end_of
But that's just me. Thanks for the help!
--
Adam van den Hoven
Hybrid Web Developer
Little Fyr Media
p: 604.618.0845
e: adam.vandenhoven at gmail.com
w: http://www.littlefyr.com
On Wed, 2009-04-22 at 01:30 -0400, Mike Dalessio wrote:
> And, one last note, that issue is now fixed in master, and will be
> included in the next Nokogiri release.
>
>
> On Wed, Apr 22, 2009 at 1:07 AM, Mike Dalessio wrote:
> Hi Adam,
>
> You can prevent this crash from occurring by changing this
> line:
>
> head.add_next_sibling( body.unlink )
>
> to this:
>
> head.add_next_sibling( body )
>
> That is, remove the unlink() call. Node#add_next_sibling does
> this implicitly for you.
>
> That said, Nokogiri certainly should be handling this more
> gracefully than it is. I've opened a ticket on github issues
> for it:
>
> http://github.com/tenderlove/nokogiri/issues#issue/22
>
> -mike
> _______________________________________________
> Nokogiri-talk mailing list
> Nokogiri-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>
>
>
>
>
>
> --
> mike dalessio
> mike at csa.net
>
>
>
>
> --
> mike dalessio
> mike at csa.net
From rubikitch at ruby-lang.org Sat Apr 25 14:55:53 2009
From: rubikitch at ruby-lang.org (rubikitch at ruby-lang.org)
Date: Sun, 26 Apr 2009 03:55:53 +0900 (JST)
Subject: [Nokogiri-talk] Encoding bug
Message-ID: <20090426.035553.93256266.rubikitch@ruby-lang.org>
Hi,
There is an encoding bug in Ruby 1.9.
# -*- coding: euc-jp -*-
require 'rubygems'
require 'nokogiri'
Nokogiri::VERSION # => "1.2.3"
euc = "?????"
nokogiri = Nokogiri(euc, nil, "EUC-JP")
x = nokogiri.at(:b).inner_text # => "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86\xE3\x81\x88\xE3\x81\x8A"
x.encoding # => #
require 'kconv'
Kconv.guess(x) # => #
x.force_encoding("UTF-8").encode("EUC-JP") # => "?????"
--
rubikitch
Blog: http://d.hatena.ne.jp/rubikitch/
Site: http://www.rubyist.net/~rubikitch/
From aaron.patterson at gmail.com Sat Apr 25 15:28:06 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Sat, 25 Apr 2009 12:28:06 -0700
Subject: [Nokogiri-talk] Encoding bug
In-Reply-To: <20090426.035553.93256266.rubikitch@ruby-lang.org>
References: <20090426.035553.93256266.rubikitch@ruby-lang.org>
Message-ID: <6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
2009/4/25 :
> Hi,
>
> There is an encoding bug in Ruby 1.9.
>
> # -*- coding: euc-jp -*-
> require 'rubygems'
> require 'nokogiri'
> Nokogiri::VERSION # => "1.2.3"
> euc = "?????"
> nokogiri = Nokogiri(euc, nil, "EUC-JP")
> x = nokogiri.at(:b).inner_text # => "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86\xE3\x81\x88\xE3\x81\x8A"
> x.encoding # => #
>
> require 'kconv'
> Kconv.guess(x) # => #
> x.force_encoding("UTF-8").encode("EUC-JP") # => "?????"
What version of libxml2 are you using? I'm using 2.7.3, and here is my output:
# -*- coding: euc-jp -*-
require 'rubygems'
require 'nokogiri'
puts Nokogiri::VERSION # => "1.2.3"
puts Nokogiri::LIBXML_VERSION # => '2.7.3'
euc = "?????"
nokogiri = Nokogiri(euc, nil, "EUC-JP")
puts x = nokogiri.at(:b).inner_text # => "?????"
p x.encoding # => #
require 'kconv'
p Kconv.guess(x) # => #
p x.force_encoding("UTF-8").encode("EUC-JP") # => "??????????"
p x.encode("UTF-8") # => "?????"
http://skitch.com/aaronp/bpbmq/terminal-bash-80x24
--
Aaron Patterson
http://tenderlovemaking.com/
From rubikitch at ruby-lang.org Sun Apr 26 13:45:30 2009
From: rubikitch at ruby-lang.org (rubikitch at ruby-lang.org)
Date: Mon, 27 Apr 2009 02:45:30 +0900 (JST)
Subject: [Nokogiri-talk] Encoding bug
In-Reply-To: <6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
References: <20090426.035553.93256266.rubikitch@ruby-lang.org>
<6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
Message-ID: <20090427.024530.175560659.rubikitch@ruby-lang.org>
From: Aaron Patterson
Subject: Re: [Nokogiri-talk] Encoding bug
Date: Sat, 25 Apr 2009 12:28:06 -0700
> What version of libxml2 are you using? I'm using 2.7.3, and here is my output:
Me too. But the result is the same.
> p x.encoding # => #
>
> require 'kconv'
> p Kconv.guess(x) # => #
It means encoding inconsistency.
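
A small plain-Ruby illustration of that kind of inconsistency, with no
Nokogiri involved. It assumes the returned string was tagged EUC-JP while
actually holding the UTF-8 bytes shown in the first report; the variable
names are only illustrative:

```ruby
# The bytes from the report: five hiragana characters encoded as UTF-8.
raw = "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86\xE3\x81\x88\xE3\x81\x8A".b

# Assumption: the string is labelled EUC-JP but holds UTF-8 bytes.
mislabeled = raw.dup.force_encoding('EUC-JP')
consistent_as_euc = mislabeled.valid_encoding?

# force_encoding only changes the label; the bytes stay the same.
relabeled = raw.dup.force_encoding('UTF-8')
consistent_as_utf8 = relabeled.valid_encoding?

# encode actually transcodes, producing real EUC-JP bytes.
euc_bytes = relabeled.encode('EUC-JP').bytes
```

The label and the bytes disagree in the first case, which is why the
force_encoding followed by encode workaround in the report is needed.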
--
rubikitch
Blog: http://d.hatena.ne.jp/rubikitch/
Site: http://www.rubyist.net/~rubikitch/
From aaron.patterson at gmail.com Sun Apr 26 14:33:34 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Sun, 26 Apr 2009 11:33:34 -0700
Subject: [Nokogiri-talk] Encoding bug
In-Reply-To: <20090427.024530.175560659.rubikitch@ruby-lang.org>
References: <20090426.035553.93256266.rubikitch@ruby-lang.org>
<6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
<20090427.024530.175560659.rubikitch@ruby-lang.org>
Message-ID: <6959e1680904261133x4b3f964erac86fcd3f91e8ab9@mail.gmail.com>
On Sun, Apr 26, 2009 at 10:45 AM, wrote:
> From: Aaron Patterson
> Subject: Re: [Nokogiri-talk] Encoding bug
> Date: Sat, 25 Apr 2009 12:28:06 -0700
>
>> What version of libxml2 are you using?  I'm using 2.7.3, and here is my output:
>
> Me too. But the result is the same.
>
>> p x.encoding                                  # => #
>>
>> require 'kconv'
>> p Kconv.guess(x)                              # => #
>
> It means encoding inconsistency.
Can you put your test file in gist? I may not have set up the file correctly.
--
Aaron Patterson
http://tenderlovemaking.com/
From rubikitch at ruby-lang.org Sun Apr 26 15:23:41 2009
From: rubikitch at ruby-lang.org (rubikitch at ruby-lang.org)
Date: Mon, 27 Apr 2009 04:23:41 +0900 (JST)
Subject: [Nokogiri-talk] Encoding bug
In-Reply-To: <6959e1680904261133x4b3f964erac86fcd3f91e8ab9@mail.gmail.com>
References: <6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
<20090427.024530.175560659.rubikitch@ruby-lang.org>
<6959e1680904261133x4b3f964erac86fcd3f91e8ab9@mail.gmail.com>
Message-ID: <20090427.042341.120037429.rubikitch@ruby-lang.org>
From: Aaron Patterson
Subject: Re: [Nokogiri-talk] Encoding bug
Date: Sun, 26 Apr 2009 11:33:34 -0700
>>> p x.encoding # => #
>>>
>>> require 'kconv'
>>> p Kconv.guess(x) # => #
>>
>> It means encoding inconsistency.
>
> Can you put your test file in gist? I may not have set up the file correctly.
http://gist.github.com/102140
--
rubikitch
Blog: http://d.hatena.ne.jp/rubikitch/
Site: http://www.rubyist.net/~rubikitch/
From aaron.patterson at gmail.com Sun Apr 26 17:35:13 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Sun, 26 Apr 2009 14:35:13 -0700
Subject: [Nokogiri-talk] paginating xml doc fragments -- help?!
In-Reply-To:
References:
Message-ID: <6959e1680904261435yaab73d4l3282e298b6f77e9e@mail.gmail.com>
On Sun, Apr 26, 2009 at 2:07 PM, Matt Mitchell wrote:
> Hi,
>
> Tenderlove gave me some ideas on how I might do this, but I still couldn't
> figure it out. Basically, I have an xml doc that is broken up by "pb" tags.
> Each "pb" tag is a page-break, and any content after, but before the next
> (if any) pb tag is considered a "page". I've got an example source doc and
> the results that I want. Anyone out there wanna take a stab at this? I
> haven't been successful (obviously), so even a tip or hint as to how I'd go
> about solving this would be just completely sweet.
>
> Here is my example: http://pastie.org/458993
What went wrong with the SAX style parser we were talking about? Do
you actually need an XML document returned?
--
Aaron Patterson
http://tenderlovemaking.com/
From aaron.patterson at gmail.com Sun Apr 26 17:45:50 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Sun, 26 Apr 2009 14:45:50 -0700
Subject: [Nokogiri-talk] Encoding bug
In-Reply-To: <20090427.042341.120037429.rubikitch@ruby-lang.org>
References: <6959e1680904251228g3cc81d6ye4c0cd5d61357227@mail.gmail.com>
<20090427.024530.175560659.rubikitch@ruby-lang.org>
<6959e1680904261133x4b3f964erac86fcd3f91e8ab9@mail.gmail.com>
<20090427.042341.120037429.rubikitch@ruby-lang.org>
Message-ID: <6959e1680904261445u38625729ya6301cb8af4caf75@mail.gmail.com>
On Sun, Apr 26, 2009 at 12:23 PM, wrote:
> From: Aaron Patterson
> Subject: Re: [Nokogiri-talk] Encoding bug
> Date: Sun, 26 Apr 2009 11:33:34 -0700
>
>>>> p x.encoding # => #
>>>>
>>>> require 'kconv'
>>>> p Kconv.guess(x) # => #
>>>
>>> It means encoding inconsistency.
>>
>> Can you put your test file in gist? I may not have set up the file correctly.
>
> http://gist.github.com/102140
Are you sure the file is written with EUC-JP and not UTF-8?
[aaron at Jordan gist-102140 (master)]$ git status
# On branch master
nothing to commit (working directory clean)
[aaron at Jordan gist-102140 (master)]$ file 27-023913.rb
27-023913.rb: UTF-8 Unicode text
[aaron at Jordan gist-102140 (master)]$ ~/.multiruby/install/1.9.1-p0/bin/ruby 27-023913.rb
27-023913.rb:7: invalid multibyte char (EUC-JP)
27-023913.rb:7: invalid multibyte char (EUC-JP)
[aaron at Jordan gist-102140 (master)]$
http://skitch.com/aaronp/bprm5/terminal-bash-80x24
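
The same question (is the file really EUC-JP, or UTF-8?) can also be checked from Ruby 1.9 itself. This is only a rough sketch and is not taken from the gist: the helper name plausible_encodings and the sample byte strings are made up for illustration. It reports which candidate encodings a raw byte string is valid in. Validity is a hint, not proof, since some byte sequences are valid in more than one encoding.

```ruby
# Hypothetical helper (not part of Nokogiri or the gist): list the candidate
# encodings in which the given bytes form a valid string.
def plausible_encodings(bytes, candidates = %w[UTF-8 EUC-JP])
  candidates.select do |name|
    # dup so the caller's string keeps its original encoding tag
    bytes.dup.force_encoding(name).valid_encoding?
  end
end

# Illustrative sample data: the same three Japanese characters, written as
# byte escapes so this snippet itself stays plain ASCII.
utf8_bytes  = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E" # encoded as UTF-8
eucjp_bytes = "\xC6\xFC\xCB\xDC\xB8\xEC"             # encoded as EUC-JP

p plausible_encodings(utf8_bytes)   # => ["UTF-8"]
p plausible_encodings(eucjp_bytes)  # => ["EUC-JP"]
```

For a file on disk, something like plausible_encodings(File.binread("27-023913.rb")) should work the same way, assuming the file is read as raw bytes.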
--
Aaron Patterson
http://tenderlovemaking.com/