The first step would be to get the plain HTML semantically right. In the case of (X)HTML5, you should build an appropriate outline using the sectioning content elements `section`, `article`, `aside`, and `nav`, and use `header` and `footer` to separate the metadata content from the main content. Also think of inline-level semantics like `time` (publication date), `dfn` (definitions), `abbr` (abbreviations/acronyms), etc. And make use of the `meta` `name` keywords and `rel` values that are defined in the spec.
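As a rough illustration of this first step, a blog post might be marked up as sketched below. The site name, author, date, and URLs are placeholders invented for the example, not anything from a real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantic markup (example post)</title>
  <!-- meta name keywords that are defined in the HTML spec -->
  <meta name="author" content="Jane Doe">
  <meta name="description" content="A short post about semantic markup.">
  <!-- rel value that is defined in the HTML spec -->
  <link rel="license" href="https://example.com/license">
</head>
<body>
  <header>
    <h1>Example site</h1>
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/archive">Archive</a></li>
      </ul>
    </nav>
  </header>

  <article>
    <header>
      <h2>Semantic markup</h2>
      <p>Published <time datetime="2013-05-01">May 1, 2013</time></p>
    </header>

    <section>
      <h3>What it is</h3>
      <p><dfn>Semantic markup</dfn> is markup that describes the meaning of
      content rather than its appearance. It is usually written in
      <abbr title="HyperText Markup Language">HTML</abbr>.</p>
    </section>

    <aside>
      <p>Loosely related side note.</p>
    </aside>

    <footer>
      <p>Written by <a rel="author" href="/about">Jane Doe</a>.</p>
    </footer>
  </article>
</body>
</html>
```

Here the `header` and `footer` inside the `article` carry information about the post (date, author), while the sectioning elements carry the content itself.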
The second step would be to make use of metadata attribute values that are not defined in the specification but are registered at specified places (so they are valid to use), like `name` keywords for `meta` elements and `rel` values for `a`/`area`/`link` elements.
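A small sketch of what such registered values might look like in practice. The specific values (`robots`, `viewport`, `me`) are assumed here to be listed in the relevant registries; check the registries the specification points to before relying on any of them. The URL is a placeholder.

```html
<head>
  <!-- meta name keywords assumed to be registered rather than spec-defined -->
  <meta name="robots" content="noindex">
  <meta name="viewport" content="width=device-width, initial-scale=1">
</head>

<!-- rel value assumed to be registered rather than spec-defined -->
<a rel="me" href="https://example.org/~jane">My profile on another site</a>
```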
The third step would be to enhance the markup with semantic, machine-readable annotations. There are three common ways to do this:
- Microformats (using pre-defined `class` and `rel` values)
- RDFa (using attributes and URIs)
- Microdata (using attributes and URIs)
RDFa and Microdata are similar (both extensible and rather complex), while Microformats is simpler (but not as expressive/extensible). I wrote a short answer over at Programmers about the differences, and a more detailed answer about the differences between Microdata and RDFa.
In the case of RDFa or Microdata, your main job would be to find vocabularies/ontologies that are able to describe/classify your content. Such vocabularies can be created by anyone (you could even create one yourself), but it's often advisable to use well-known/popular ones, for example so that search engines can make use of your annotations (popular example: Schema.org).
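To make the comparison concrete, here is the same blog post annotated with the Schema.org vocabulary, once in Microdata and once in RDFa. This is only a sketch: the post title, author, and date are invented, and `BlogPosting` is just one type that might fit such content.

```html
<!-- Microdata: itemscope/itemtype/itemprop attributes -->
<article itemscope itemtype="https://schema.org/BlogPosting">
  <h2 itemprop="headline">Semantic markup</h2>
  <p>
    By
    <span itemprop="author" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Jane Doe</span>
    </span>,
    <time itemprop="datePublished" datetime="2013-05-01">May 1, 2013</time>
  </p>
</article>

<!-- RDFa: vocab/typeof/property attributes -->
<article vocab="https://schema.org/" typeof="BlogPosting">
  <h2 property="headline">Semantic markup</h2>
  <p>
    By
    <span property="author" typeof="Person">
      <span property="name">Jane Doe</span>
    </span>,
    <time property="datePublished" datetime="2013-05-01">May 1, 2013</time>
  </p>
</article>
```

In both syntaxes the vocabulary is identified by a URI, which is what makes them extensible: swapping in a different vocabulary only changes the URIs and property names, not the attributes used.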
In the case of Microformats, you'd have to find a Microformat (on the wiki at microformats.org) that suits your needs. If there is none for your case, you could propose a new Microformat (but that would take some time until it gets "accepted", if at all).
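For comparison, a minimal Microformats sketch, assuming the h-card format (for people and organizations) and the rel-tag format fit the content. The name, organization, and URLs are placeholders.

```html
<!-- h-card: pre-defined class values describe a person -->
<p class="h-card">
  <a class="p-name u-url" href="https://example.com/">Jane Doe</a>,
  <span class="p-org">Example Inc.</span>
</p>

<!-- rel-tag: a pre-defined rel value marks the link target as a tag -->
<a rel="tag" href="https://example.com/tags/html">html</a>
```

No URIs or extra attributes are involved, only agreed-upon `class` and `rel` values, which is why this approach is simpler but harder to extend.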
> Is HTML5 a reasonable choice if I want to be so picky about metadata, or should I be using an XML doctype?
You could also use XHTML5, if you need/want XML support. If you "only" use the (X)HTML defined in the specification and no additional XML schemas/vocabularies, it won't matter from a semantic perspective if you use HTML(5) or XHTML(5).
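As a sketch of what the XHTML5 variant might look like: the same elements are used, but the document has to be well-formed XML, declare the XHTML namespace, and be served as `application/xhtml+xml`. The content below is a placeholder.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <!-- void elements must be explicitly closed in the XML syntax -->
    <meta charset="utf-8" />
    <title>Semantic markup (example post)</title>
  </head>
  <body>
    <article>
      <h1>Semantic markup</h1>
      <p>Published <time datetime="2013-05-01">May 1, 2013</time></p>
    </article>
  </body>
</html>
```

The elements and their meaning are identical to the HTML syntax; only the serialization differs.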