Extracting and reifying RDF from XML

Serializing RDF and edge labelled graphs in XML

One issue in serializing RDF as XML, and in extracting RDF from arbitrary or colloquial XML, is that the XML object model (DOM) is a node-labelled graph, whereas the RDF object model is an edge-labelled graph.
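To make the mismatch concrete, the following minimal Python sketch reads one small XML fragment both ways. It is an illustration only, not part of the implementation described here; the element names and the `_:p1` node label are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, used only to illustrate the two graph models.
doc = ET.fromstring("<person><name>John</name><city>Boston</city></person>")

# Node-labelled view (DOM): the labels are element names on the nodes;
# the parent/child links between them carry no label of their own.
node_view = [(doc.tag, child.tag) for child in doc]

# Edge-labelled view (RDF): the element names become labels on the edges
# (predicates) that lead from an otherwise unnamed node to each value.
edge_view = [("_:p1", child.tag, child.text) for child in doc]

print(node_view)
print(edge_view)
```

In the second view the `person` label no longer names a node by itself; in RDF it would typically be attached through a separate rdf:type edge.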

Several mechanisms have been proposed to simplify the RDF syntax. After implementing Sergey Melnik's simplified RDF syntax using Dan Connolly's rdfp.xsl as a base, I have subsequently implemented Tim Berners-Lee's strawman syntax:

"The major difference between this syntax and RDF 1.0 M&S is that RDF edges correspond to elements, and RDF nodes are implicit. It is basically as the M&S syntax with parseType=resource is a default."

This proposal, with its attendant implementation, has the following properties:

  1. Uses rdf:parseType='Resource' as default
  2. Does not add to the current RDF vocabulary
  3. Implements the XLink2RDF proposal (now with extended links)
  4. Implements rdf:aboutEach, RDF collections and bagID
  5. Transforms to <rdf:Statement><rdf:predicate .../>...</rdf:Statement> form
  6. Transforms colloquial XML into RDF Statements
  7. *** Transformation of the output of a transformation results in reification
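The core idea behind properties 1 and 6 (child elements are edges, nodes are implicit) can be sketched in a few lines of Python. This is a simplified illustration and not the XSLT implementation: it handles only rdf:about and rdf:resource, ignores other attributes, and does not distinguish literal objects from resource objects. The function names `uri`, `walk` and `extract`, the '#' joining rule, and the `#/1/k` child-sequence identifiers for implicit nodes are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def uri(tag):
    """Map ElementTree's '{namespace}local' form to a single URI."""
    if not tag.startswith("{"):
        return tag
    ns, local = tag[1:].split("}", 1)
    # Assumption: join with '#' unless the namespace already ends in one.
    return ns + local if ns.endswith("#") else ns + "#" + local

def walk(node_elem, subject, path, triples):
    """Treat every child element as an edge leaving the current node."""
    for i, edge in enumerate(node_elem, start=1):
        predicate = uri(edge.tag)
        resource = edge.get("{%s}resource" % RDF_NS)
        if resource is not None:
            # The edge points at an explicitly named resource.
            triples.append((subject, predicate, resource))
        elif len(edge):
            # Nested elements: the object is an implicit (anonymous) node.
            anon = "%s/%d" % (path, i)
            triples.append((subject, predicate, anon))
            walk(edge, anon, anon, triples)
        else:
            # Leaf element: the object is the text content.
            triples.append((subject, predicate, (edge.text or "").strip()))

def extract(xml_text):
    root = ET.fromstring(xml_text)
    subject = root.get("{%s}about" % RDF_NS, "#/1")
    # The root element name types the root node.
    triples = [(subject, RDF_NS + "type", uri(root.tag))]
    walk(root, subject, "#/1", triples)
    return triples

sample = """<t:person
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:t="http://www.openhealth.org/types"
    rdf:about="http://www.openhealth.org/people/JohnDoe.xml">
  <t:name><t:first>John</t:first><t:last>Doe</t:last></t:name>
  <t:SSN>000-11-1234</t:SSN>
</t:person>"""

triples = extract(sample)
for t in triples:
    print(t)
```

A fuller treatment would also need rules for attributes, rdf:type on property elements, rdf:aboutEach and collections, all of which are left out here.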

The XSLT implementation

Example XML document using simplified RDF syntax


<t:person
    rdf:about="http://www.openhealth.org/people/JohnDoe.xml"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:t="http://www.openhealth.org/types">
  <t:name rdf:type="PersonName">
    <t:first>John</t:first>
    <t:last>Doe</t:last>
  </t:name>
  <t:pid t:entity="NEMC">123-45-6789</t:pid>
  <t:SSN>000-11-1234</t:SSN>
  <t:patient rdf:type="Role">
    <t:primary-care-physician rdf:resource=".../DrJones.xml" />
  </t:patient>
  <t:address rdf:type="Address" loc="home">
    <t:street>750 Washington Street</t:street>
    <t:city>Boston</t:city>
    <t:state>MA</t:state>
  </t:address>
</t:person>
             

And transformed via rdfExtractify:



          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#person"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#name"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object rdf:resource="#/1/1"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object rdf:resource="PersonName"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#first"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object>John</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#last"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object>Doe</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#pid"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object rdf:resource="#/1/2"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#pid"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#entity"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object>NEMC</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#value"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object>123-45-6789</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#SSN"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object>000-11-1234</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#patient"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object rdf:resource="#/1/4"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/4"/>
              <rdf:object rdf:resource="Role"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#primary-care-physician"/>
              <rdf:subject rdf:resource="#/1/4"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/DrJones.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/DrJones.xml"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#primary-care-physician"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#address"/>
              <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/5"/>
              <rdf:object rdf:resource="Address"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="file:/D:/rdf/test.xml#loc"/>
              <rdf:subject rdf:resource="#/1/5"/>
              <rdf:object>home</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#street"/>
              <rdf:subject rdf:resource="#/1/5"/>
              <rdf:object>750 Washington Street</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#city"/>
              <rdf:subject rdf:resource="#/1/5"/>
              <rdf:object>Boston</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.openhealth.org/types#state"/>
              <rdf:subject rdf:resource="#/1/5"/>
              <rdf:object>MA</rdf:object>
            </rdf:Statement>
          </rdf:RDF>
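Output in this form is straightforward to consume as plain triples. The following Python sketch reads the rdf:Statement form back into (subject, predicate, object) tuples; it assumes every statement carries all three child elements, and the helper names `_part` and `read_statements` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def _part(statement, name):
    """Return the rdf:resource of a statement part, or its text if literal."""
    el = statement.find("{%s}%s" % (RDF_NS, name))
    resource = el.get("{%s}resource" % RDF_NS)
    return resource if resource is not None else (el.text or "")

def read_statements(xml_text):
    """Collect one (subject, predicate, object) tuple per rdf:Statement."""
    root = ET.fromstring(xml_text)
    return [
        (_part(st, "subject"), _part(st, "predicate"), _part(st, "object"))
        for st in root.findall("{%s}Statement" % RDF_NS)
    ]

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Statement>
    <rdf:predicate rdf:resource="http://www.openhealth.org/types#SSN"/>
    <rdf:subject rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
    <rdf:object>000-11-1234</rdf:object>
  </rdf:Statement>
</rdf:RDF>"""

statements = read_statements(sample)
print(statements)
```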
      

And the reified result:


          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/1"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#person"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#name"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/2"/>
              <rdf:object rdf:resource="#/1/1"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/3"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/3"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/3"/>
              <rdf:object rdf:resource="#/1/1"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/3"/>
              <rdf:object rdf:resource="PersonName"/>
            </rdf:Statement>
            …
      

            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/13"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/DrJones.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/14"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/14"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/14"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/DrJones.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/14"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#primary-care-physician"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/15"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/15"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#address"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/15"/>
              <rdf:object rdf:resource="http://www.openhealth.org/people/JohnDoe.xml"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/15"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/16"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/16"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/16"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/16"/>
              <rdf:object rdf:resource="Address"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/17"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/17"/>
              <rdf:object rdf:resource="file:/D:/rdf/test.xml#loc"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/17"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/17"/>
              <rdf:object>home</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/18"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/18"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#street"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/18"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/18"/>
              <rdf:object>750 Washington Street</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/19"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/19"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#city"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/19"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/19"/>
              <rdf:object>Boston</rdf:object>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"/>
              <rdf:subject rdf:resource="#/1/20"/>
              <rdf:object rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate"/>
              <rdf:subject rdf:resource="#/1/20"/>
              <rdf:object rdf:resource="http://www.openhealth.org/types#state"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#subject"/>
              <rdf:subject rdf:resource="#/1/20"/>
              <rdf:object rdf:resource="#/1/5"/>
            </rdf:Statement>
            <rdf:Statement>
              <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#object"/>
              <rdf:subject rdf:resource="#/1/20"/>
              <rdf:object>MA</rdf:object>
            </rdf:Statement>
          </rdf:RDF>
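The pattern in the reified listing can be summarized as follows: the n-th statement of the input becomes a node `#/1/n`, described by four statements (rdf:type, rdf:predicate, rdf:subject, rdf:object, in that order). Here is a minimal Python sketch of that mapping, working directly on triples rather than through the stylesheet; the function name `reify` is hypothetical.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(triples):
    """Describe each (subject, predicate, object) triple with four triples."""
    out = []
    for n, (s, p, o) in enumerate(triples, start=1):
        stmt = "#/1/%d" % n  # position of the statement in the input
        out.append((stmt, RDF_NS + "type", RDF_NS + "Statement"))
        out.append((stmt, RDF_NS + "predicate", p))
        out.append((stmt, RDF_NS + "subject", s))
        out.append((stmt, RDF_NS + "object", o))
    return out

# The first two statements of the extracted example.
john = "http://www.openhealth.org/people/JohnDoe.xml"
extracted = [
    (john, RDF_NS + "type", "http://www.openhealth.org/types#person"),
    (john, "http://www.openhealth.org/types#name", "#/1/1"),
]

reified = reify(extracted)
for t in reified:
    print(t)
```

Each input statement yields four output statements, so the twenty statements of the extracted example give eighty in the reified result.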
      


Comments are welcome

Jonathan Borden

jonathan@openhealth.org

September 21, 2000