[Wtr-general] W3C validation problems
Cain, Mark
Mark_Cain at rl.gov
Mon Apr 3 17:18:18 EDT 2006
You should be able to do:
  ie = IE.new
  ie.goto 'http://validator.w3.org/'
  ie.html
--Mark
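Whether the string returned by ie.html keeps the DOCTYPE is not established in this thread, so it may be worth checking before posting it to the validator. A minimal sketch of such a check follows; the helper name has_doctype? is hypothetical, and the two sample strings are taken from the original message below.

```ruby
# Hypothetical helper: true when the markup begins with a DOCTYPE,
# optionally preceded by an XML declaration and whitespace.
def has_doctype?(html)
  !!(html =~ /\A\s*(?:<\?xml[^>]*\?>\s*)?<!DOCTYPE\s/i)
end

# First line of the serialized DOM quoted in the original message
dom_dump = '<HTML lang=en xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">'

# Start of the page source as shown by IE's "view source"
raw_source = %(<?xml version="1.0"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html>)

puts has_doctype?(dom_dump)
puts has_doctype?(raw_source)
```

If the check fails for the string Watir hands back, the validator will report the same "no document type declaration" error regardless of which accessor was used.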
-----Original Message-----
From: wtr-general-bounces at rubyforge.org [mailto:wtr-general-bounces at rubyforge.org] On Behalf Of Jørgen Bang Erichsen
Sent: Monday, April 03, 2006 2:01 PM
To: wtr-general at rubyforge.org
Subject: [Wtr-general] W3C validation problems
Hi,
Inspired by
http://redgreenblu.com/svn/projects/assert_valid_markup/lib/assert_valid_markup.rb
I would like to have an easy way to validate the html on the page IE is
currently showing. Unfortunately, I have a problem with the html that
ie.document.body.parentelement.outerhtml outputs :-(
Take a look at the following example:
require 'test/unit'
require 'watir'
require 'net/http'
require 'cgi'
require 'xmlsimple'

class ValidationExample < Test::Unit::TestCase
  include Watir

  def test_w3c_validate
    ie = IE.new
    ie.goto 'validator.w3.org/'
    # Serialized DOM as IE holds it, not the source the server sent
    html = ie.document.body.parentelement.outerhtml
    # Post the markup to the validator and ask for XML output
    response = Net::HTTP.start('validator.w3.org').post2('/check', "fragment=#{CGI.escape(html)}&output=xml")
    markup_is_valid = response['x-w3c-validator-status'] == 'Valid'
    # Build one line per validator message when the markup is invalid
    message = markup_is_valid ? '' : XmlSimple.xml_in(response.body)['messages'][0]['msg'].collect { |m|
      "Invalid markup: line #{m['line']}: #{CGI.unescapeHTML(m['content'])}"
    }.join("\n")
    assert markup_is_valid, message
  ensure
    # Close the browser even when the assertion fails
    ie.close if ie
  end
end
When I run the example I get stuff like:
Invalid markup: line 1: no document type declaration; implying "<!DOCTYPE HTML SYSTEM>"
Invalid markup: line 1: there is no attribute "XML:LANG"
Invalid markup: line 1: there is no attribute "XMLNS"
The html returned by ie.document.body.parentelement.outerhtml is
<HTML lang=en xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<HEAD>
<TITLE>The W3C Markup Validation Service</TITLE>
<LINK rev=made href="mailto:www-validator at w3.org">
<LINK title="Home Page" rev=start href="./">
but if I view the source from IE itself it is something like
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>The W3C Markup Validation Service</title>
<link rev="made" href="mailto:www-validator at w3.org" />
<link rev="start" href="./" title="Home Page" />
...
The XML declaration and the DOCTYPE are missing, the tag names are
upper-cased, and the quotes around several attribute values are gone.
Is there any way to get the unmodified HTML for the current page?
If people are doing automatic validation any other way I am open
to suggestions.
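One possible workaround, sketched under the assumption that the page can be fetched again with a plain GET (no login, cookies, or POST state): request the URL directly with Net::HTTP so the markup never passes through IE's DOM serializer. The helper names fetch_raw_source and validator_form_body are hypothetical, and URI.encode_www_form stands in for the hand-built CGI.escape string used in the example above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: return the response body exactly as the server
# sent it. Assumes a plain GET is enough; redirects are not followed.
def fetch_raw_source(url)
  Net::HTTP.get(URI.parse(url))
end

# Hypothetical helper: build the form body for the validator's /check
# endpoint, with the same two fields as the example above.
def validator_form_body(html)
  URI.encode_www_form('fragment' => html, 'output' => 'xml')
end

# Usage (needs network access and a running browser, so left commented out;
# ie.url is assumed to return the address IE is currently showing):
#   html = fetch_raw_source(ie.url)
#   response = Net::HTTP.start('validator.w3.org').post2('/check', validator_form_body(html))
#   puts response['x-w3c-validator-status']
```

The trade-off is that this validates what the server returns to a fresh request, which may differ from what IE is displaying if the page depends on session state or was modified by script.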
Best regards,
Jørgen