Leap2R: guidelines for using RDF to represent Leap2 information
This page will act as a focal point and index for an RDF approach equivalent to the Leap2A/specification. Initially this page will contain all relevant information, but the intention is soon to move the proposal, project, partners, and details to other pages, so that this page gives only the vital summary information: a kind of quick reference guide with links to further information.
If you are interested, please go ahead and code up some RDF in any dialect, or RDFa in something, and we can discuss together how much sense it makes, and whether there are alternatives.
What is the idea of Leap2R?
The Leap2A/specification has been accepted as a reasonable approach to representing portfolio information, based on the Atom Syndication Format, for download and transfer between e-portfolio systems. Being based on Atom, it is relatively straightforward to implement, compared to other plausible specifications. But also, being tied to Atom, it cannot be extended to any other format, XML or otherwise. The idea of Leap2R is to open this up completely, and allow the same information to be represented in any other XML, RDF, or Semantic Web format.
One of the most interesting and promising options for other formats is RDFa. The principle behind RDFa, as with microformats, is to allow structured, machine-processable information to be contained within and alongside human-readable HTML or XHTML data, using the same content wherever possible. There is much documentation on RDFa, and the reader is referred to this, as it will not be duplicated here. In practice, this means that a single HTML or XHTML document could simultaneously be offered both to people to read and to e-portfolio systems to import from: the systems could process the information provided in a standard way, and input it directly into their database. This would be especially valuable for portfolio presentations, but could also be used by a portfolio holder to store their complete portfolio information in a way that is easily and directly readable by them.
The ultimate intention is to have examples illustrating the whole process. We need to construct example RDFa web pages; then show the RDF extracted; then show how that can be imported into e-portfolio systems. If you have anything to do with developing e-portfolio systems, you may be able to help, for example, by
- proposing example material to format in RDFa with Leap2R
- exploring RDFa Leap2R export and import, and feeding back from your experience to guide developer-friendly decisions for Leap2R
- planning the implementation of Leap2R export and import.
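As a starting point for discussion, the first two steps (an RDFa page, then the RDF extracted from it) can be sketched in Python using only the standard library. This is a deliberately tiny subset of RDFa, not a conforming processor: it looks only at the `about`, `typeof`, `property` and `content` attributes, ignores element scoping, and assumes a property element contains no nested markup. The sample page, the class name `TinyRDFaExtractor` and the helper names are illustrative assumptions; the prefixes are the ones used in the Turtle examples on this page.

```python
from html.parser import HTMLParser

# Prefix map taken from the Turtle prefixes used on this page.
PREFIXES = {
    "leap2": "http://terms.leapspecs.org/",
    "atom": "http://www.w3.org/2005/Atom",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(curie):
    """Expand prefix:local into a full URI by plain concatenation."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


class TinyRDFaExtractor(HTMLParser):
    """Collects (subject, predicate, object) tuples from a few RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None   # most recent @about value (scoping is ignored)
        self.pending = None   # predicate waiting for the element's text
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "typeof" in attrs:
            self.triples.append(
                (self.subject, expand("rdf:type"), expand(attrs["typeof"]))
            )
        if "property" in attrs:
            predicate = expand(attrs["property"])
            if "content" in attrs:
                # @content overrides the human-readable text.
                self.triples.append((self.subject, predicate, attrs["content"]))
            else:
                self.pending = predicate
                self.buffer = []

    def handle_data(self, data):
        if self.pending is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if self.pending is not None:
            text = "".join(self.buffer).strip()
            self.triples.append((self.subject, self.pending, text))
            self.pending = None


# Hypothetical RDFa page fragment for one portfolio entry.
PAGE = """
<div about="http://www.example.ac.uk/interop/atom.aspx/reflexion/1357" typeof="leap2:entry">
  <span property="atom:updated" content="2009-03-15T14:33:12Z">15 March 2009</span>
  <p property="atom:content">A short reflection.</p>
</div>
"""


def extract(html):
    """Return the triples found in an HTML string."""
    parser = TinyRDFaExtractor()
    parser.feed(html)
    parser.close()
    return parser.triples
```

Calling `extract(PAGE)` gives one type triple and two literal-valued triples for the entry. A real implementation would want a full RDFa processor; the point here is only to show how the same HTML can serve both readers and importing systems.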
The same approach could in principle be used with any XML dialect, following the GRDDL concept. In all cases, this involves having a standard way of extracting RDF information from the HTML, XHTML or XML. This could prove very valuable in providing a bridge from other formats to the Leap2 family.
To facilitate any of these processes, we need to define types or classes of portfolio information, and relationships or properties or predicates relating them, and give them all URIs. As Leap2A was created with this in mind, this is not starting from scratch, but it still requires work. The outputs of this work are being assembled here, either on this page, or linked from this page. Please consider contributing, either directly, or by commenting.
The substantive item types or classes come from the types given in 2A/types. Other new classes are needed to represent necessary blank nodes.
|affiliation||affiliation||#activity||new concept now in Leap2A|
which is broader
|category||see categories||atom:category||Atom requires|
|datenode||see date||#valuenode||Leap requires|
|idnode||see id; id||OnlineAccount||#valuenode||Leap requires|
|location||see spatial||Location||Leap requires|
|partnode||see Whole-part||Leap requires|
|stage||see stage||#valuenode||Leap requires|
|valuenode||see label||Leap requires|
Specification of new required node types
The new node types required for RDF have these properties/relationships. All labels are text literals.
- #has line → #addressline
- #postcode → text literal
- #country → text literal
- #countrycode → 2 or 3 alphabetic characters: ISO 3166
Just as in atom:category
- #service → URI or name from vocabulary -- FOAF uses accountServiceHomepage
- #value → ID with that service -- FOAF uses accountName
- #label → describing service
Not implemented. Abstract class to allow for location information other than address: e.g. geo coordinates.
- #value → any literal, including text, URI, vocabulary terms
- #label → label with more meaning in context for user
These are the enumerated types that are the values of certain properties. They can in principle be represented by text strings, URIs, or possibly other means.
|gendertype||0 (not known)||1 (male)||2 (female)||9 (not specified)||see gender||From MIAP. FOAF uses literals.|
|contenttype||text||html||xhtml||see content||from atom:content|
Table of predicates or properties or relationships.
This is based on the tables at Leap2A/predicates, Leap2A/personal_data, Leap2A/organizational_data and possibly LEAP 2.0 predicates. Leap2A/personal_data also has a useful older cross-reference table including vCard.
vCard itself is somewhat unusual: as the microformats community spurn namespaces, hCard does not have a namespace; while the Jabber/XMPP community's vCard-XML does not assign a URI for its namespace. W3C does, however, assign a namespace URI to vCard, which is the one given.
NOTE this table is very much under construction! Feedback gratefully received.
Need to fill in the column for iCal etc.
Atom itself is not designed for RDF use -- indeed, to distinguish it from RSS, many writers abjure RDF for Atom altogether. The atom: namespace is not ideal for use directly as a prefix, as it does not end with # or /. And many Atom structures -- even basic ones like atom:content -- have attributes (type, in this case) that have to be worked around somehow. Obviously, there are several ways to work around this: you could have a blank content node, with a type as an object, or you could have three new predicates, one for text content, one for html content, and one for xhtml content.
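To make the two workarounds concrete, here is a minimal Python sketch that produces the triples for each option as plain tuples. Nothing here is settled Leap2R: the `EX` namespace and its three predicate names are hypothetical placeholders, and the use of `rdf:value` to hold the text on the blank node is just one plausible choice.

```python
import itertools

ATOM = "http://www.w3.org/2005/Atom"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder namespace for illustration only; not a Leap2 namespace.
EX = "http://example.org/hypothetical#"

_blank_ids = itertools.count(1)


def content_as_blank_node(entry, content_type, text):
    """Option 1: a blank content node carrying the type and the value."""
    node = f"_:content{next(_blank_ids)}"
    return [
        (entry, ATOM + "content", node),
        (node, ATOM + "type", content_type),
        (node, RDF + "value", text),
    ]


def content_as_typed_predicate(entry, content_type, text):
    """Option 2: one predicate per content type (names are hypothetical)."""
    predicates = {
        "text": EX + "textContent",
        "html": EX + "htmlContent",
        "xhtml": EX + "xhtmlContent",
    }
    return [(entry, predicates[content_type], text)]
```

The first option costs two extra triples per entry but keeps the predicate set close to the Atom element names; the second is more compact but needs three new terms to be defined and agreed.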
In essence, to convert to RDF, we have to try to forget native Atom, and just represent what is in a Leap2A feed, including the Atom, in a sensible way. The line which I personally favour at the moment is to introduce blank nodes wherever needed, to keep the correspondence with element names in Leap2A reasonably close, so that we have to define as few more as feasible.
As vCard was not even designed in any way for RDF or the Semantic Web, it is not surprising that it is sometimes hard to distinguish what is a type and what is a property. On the whole, the solution here is to interpret structured properties as types. ADR and ORG are understood that way. N would be a type, but as an agent is taken as having just one name, it is not needed, and the name parts are direct properties of the agent. However, if multiple structured names were useful, we would adopt a name node equivalent to N.
Street, Locality and Region are given as "#addressline" type: the addressline is composed of the label equivalent to the substructure name, plus the value.
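A rough Python sketch of that mapping, for discussion only: it takes the parts of a vCard ADR as a dictionary and returns triples as tuples. The local names `has_line`, `label`, `value`, `postcode` and `country` under the leap2 namespace are assumptions standing in for whatever URIs are finally assigned, and the way line nodes are named is arbitrary.

```python
LEAP2 = "http://terms.leapspecs.org/"


def address_triples(address_node, adr):
    """Turn vCard ADR parts into address and addressline triples.

    address_node: identifier of the address node, e.g. "_:adr".
    adr: dict with optional keys street, locality, region, postcode, country.
    """
    triples = []
    # Street, locality and region each become an addressline node whose
    # label is the vCard substructure name and whose value is the text.
    for n, part in enumerate(("street", "locality", "region"), start=1):
        if adr.get(part):
            line = f"{address_node}_line{n}"
            triples += [
                (address_node, LEAP2 + "has_line", line),
                (line, LEAP2 + "label", part),
                (line, LEAP2 + "value", adr[part]),
            ]
    # Postcode and country are direct text literals on the address node.
    for part in ("postcode", "country"):
        if adr.get(part):
            triples.append((address_node, LEAP2 + part, adr[part]))
    return triples
```

For example, `address_triples("_:adr", {"street": "1 High Street", "locality": "Bolton", "postcode": "BL1 1AA"})` yields two addressline nodes plus a postcode literal.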
UID and CATEGORIES pose problems. UID is a property of an agent, while in Leap2R an agent "#has account", where hopefully the service URL plus the identifier itself will be unique. To convert a vCard UID into an idnode, one would have to distinguish the service from the id with that service.
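One possible heuristic for that split, sketched in Python: if the UID happens to be an http(s) URL, treat the host as the service and the path as the id with that service; otherwise leave the service unknown. This is an assumption for illustration, not a rule from vCard or Leap2A, and the function name `uid_to_idnode` is hypothetical.

```python
from urllib.parse import urlsplit


def uid_to_idnode(uid):
    """Guess the #service and #value of an idnode from a vCard UID."""
    parts = urlsplit(uid)
    if parts.scheme in ("http", "https") and parts.netloc:
        # Web-style UID: host identifies the service, path identifies the account.
        service = f"{parts.scheme}://{parts.netloc}/"
        return {"service": service, "value": parts.path.lstrip("/")}
    # Opaque UID (e.g. a URN): no service can be inferred.
    return {"service": None, "value": uid}
```

A UID such as `http://openid.example.org/jbloggs` would split cleanly, whereas a `urn:uuid:` value would keep the whole string as the value and need the service supplied from elsewhere.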
Guiding principles for the specification
- Leap2R is not to have a namespace of its own, but to use other namespaces, including the "leap2:" namespace from Leap2A.
- Leap2R is by nature a permissive specification. A set of information, or document, whose content is entirely covered by the documentation can be described as "Just Leap2R".
Leap2A constructs and their representation in RDF
Target triples are represented in this section in Turtle format. To get a valid Turtle file, you have to prefix the triples with the lines corresponding to any prefixes used:
@prefix portfolio: <http://www.example.ac.uk/interop/atom.aspx/> .
@prefix leap2: <http://terms.leapspecs.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix categories: <http://wiki.leapspecs.org/2A/categories/> . # or something else
@prefix atom: <http://www.w3.org/2005/Atom> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
This is a basic bare Leap2A entry with no relationships.
Example target triples
# Turtle 1.1 requires the "/" in a prefixed local name to be escaped as "\/".
portfolio:reflexion\/1357 a leap2:entry .
portfolio:reflexion\/1357 atom:updated "2009-03-15T14:33:12Z"^^xsd:dateTime .
portfolio:reflexion\/1357 atom:type "text" .
portfolio:reflexion\/1357 atom:content """this content can split
over several lines
without any problems.""" .
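To show how such triples might be derived from a feed, here is a minimal Python sketch using only the standard library. The sample entry XML is made up to match the example above, the function name `entry_to_triples` is hypothetical, and the predicate URIs are formed by concatenating the Atom namespace with the element name, exactly as the `atom:` prefix would expand.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
LEAP2 = "http://terms.leapspecs.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical bare entry, mirroring the example triples.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://www.example.ac.uk/interop/atom.aspx/reflexion/1357</id>
  <updated>2009-03-15T14:33:12Z</updated>
  <content type="text">this content can split
over several lines</content>
</entry>"""


def atom_tag(name):
    """Return the namespaced tag name that ElementTree expects."""
    return f"{{{ATOM_NS}}}{name}"


def entry_to_triples(xml_text):
    """Convert one bare Atom entry into (subject, predicate, object) tuples."""
    entry = ET.fromstring(xml_text)
    subject = entry.findtext(atom_tag("id")).strip()
    content = entry.find(atom_tag("content"))
    return [
        (subject, RDF_TYPE, LEAP2 + "entry"),
        (subject, ATOM_NS + "updated", entry.findtext(atom_tag("updated")).strip()),
        # Atom defaults the content type to "text" when the attribute is absent.
        (subject, ATOM_NS + "type", content.get("type", "text")),
        (subject, ATOM_NS + "content", content.text or ""),
    ]
```

This follows the flat layout of the example triples, with the content type attached directly to the entry; relationships, categories and blank nodes are left out.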
- Consider making participation / creation explicit, by having explicit relationships between the portfolio holder and the recorded items.
- Add agent to Leap2A development track.
- When getting on to recommending RDFa practice, check bookmarks from recent discussions particularly about CURIEs and alternatives.