author Matt A. Tobin <email@mattatobin.com> 2020-01-15 14:56:04 -0500
committer Matt A. Tobin <email@mattatobin.com> 2020-01-15 14:56:04 -0500
commit 6168dbe21f5f83b906e562ea0ab232d499b275a6
tree 658a4b27554c85ebcaad655fc83f2c2bb99e8e80 /parser/html/java/htmlparser/ruby-gcj/README
parent 09314667a692fedff8564fc347c8a3663474faa6
Add java htmlparser sources that match the original 52-level state
https://hg.mozilla.org/projects/htmlparser/ Commit: abe62ab2a9b69ccb3b5d8a231ec1ae11154c571d
Diffstat (limited to 'parser/html/java/htmlparser/ruby-gcj/README')
-rw-r--r-- parser/html/java/htmlparser/ruby-gcj/README | 65
1 files changed, 65 insertions, 0 deletions
diff --git a/parser/html/java/htmlparser/ruby-gcj/README b/parser/html/java/htmlparser/ruby-gcj/README
new file mode 100644
index 0000000000..b368437f77
--- /dev/null
+++ b/parser/html/java/htmlparser/ruby-gcj/README
@@ -0,0 +1,65 @@
+Disclaimer:
+
+ This code is experimental.
+
+ When some people say experimental, they mean "it may not do what it is
+ intended to do; in fact, it might even wipe out your hard drive". I mean
+ that too. But I mean something more than that.
+
+ In this case, experimental means that I don't even know what it is intended
+ to do. I just have a vague vision, and I am trying out various things in
+ the hopes that one of them will work out.
+
+Vision:
+
+ My vague vision is that I would like to see HTML5 be a success. For me to
+ consider it to be a success, it needs to be a standard, be interoperable,
+ and be ubiquitous.
+
+ I believe that the Validator.nu parser can be used to bootstrap that
+ process. It is written in Java, has been compiled into JavaScript, and has
+ been translated into C++ based on the Mozilla libraries with the intent of
+ being included in Firefox. It tracks the standard very closely.
+
+ For the moment, the effort is on extending that to another language (Ruby)
+ in a single environment (Linux). Once that is complete, the intent is to
+ evaluate the results and decide what needs to be changed and what needs to
+ be done to support other languages and environments.
+
+ The bar I'm setting for myself isn't just another SWIG-generated low-level
+ interface to a DOM, but rather a best-of-breed interface, which for Ruby
+ seems to be the one pioneered by Hpricot and adopted by Nokogiri. Success
+ will mean passing all of the tests from one of those two parsers as well as
+ all of the HTML5 tests.
+
+Build instructions:
+
+ You'll need icu4j and chardet jars. If you checked out and ran dldeps you
+ are already all set:
+
+ svn co http://svn.versiondude.net/whattf/build/trunk/ build
+ python build/build.py checkout dldeps
+
+ Fedora 11:
+
+ yum install ruby-devel rubygem-rake java-1.5.0-gcj-devel gcc-c++
+
+ Ubuntu 9.04:
+
+ apt-get install ruby ruby1.8-dev rake gcj g++
+
+ Also, at this time, you need to install a JDK (e.g. sun-java6-jdk), simply
+ because the javac that comes with gcj doesn't support -sourcepath, and
+ I haven't spent the time to find a replacement.
+
+ Finally, make sure that libjaxp1.3-java is *not* installed.
+
+ http://gcc.gnu.org/ml/java/2009-06/msg00055.html
+
+ If this is done, you should be all set.
+
+ cd htmlparser/ruby-gcj
+ rake test
+
+ If things are successful, the last lines of the output will list the
+ font attributes and values found in the test/google.html file.