How to cite a blog post using HTML microdata and schema.org?

My goal is to cite a blog post by using HTML microdata.

How can I improve the following markup for citations?

I am seeking improvements on the syntax and semantics, to produce a result that works well with HTML5 standards, renders well in current browsers, and parses well in search engines.

The bounty on this question is for expert advice and guidance. My research is turning up many opinions and snippets, so I'm seeking clear answers, complete samples, and canonical documentation.

This is my work in progress, and I'm seeking advice on its correctness:

  1. Use <div class="citation"> to wrap everything.

  2. Use <article> with itemscope and BlogPosting to wrap the post info including its nested info.

  3. Use <header> and <h1 itemprop="headline"> to wrap the post name link.

  4. Use <cite> to wrap the post name.

  5. Use <footer> to wrap the author info and blog info.

  6. Use <address> to wrap the author link and name.

  7. Use rel="author" to annotate the link to the author's name.

  8. Use itemprop="isPartOf" to connect the post to the blog.

This is my work in progress HTML source:

<!-- Hello World authored by Alice posted on Bob's blog -->
<div class="citation">
  <article itemscope itemtype="http://schema.org/BlogPosting">
    <header>
      <h1 itemprop="headline">
        <a itemprop="url" href="…">
          <cite itemprop="name">Hello World</cite>
        </a>
      </h1>
    </header>
    <footer>
      authored by
      <span itemprop="author" itemscope itemtype="http://schema.org/Person">
        <address>
          <a itemprop="url" rel="author" href="…">
            <span itemprop="name">Alice</span>
          </a>
        </address>
      </span>
      posted on
      <span itemprop="isPartOf" itemscope itemtype="http://schema.org/Blog">
        <a itemprop="url" href="…">
          <span itemprop="name">Bob's blog</span>
        </a>
      </span>
    </footer>
  </article>
</div>

Related notes thus far:

  • The <cite> tag W3 reference says the tag is "phrase level", so it works like an inline span, not a block div. But the <article> tag seems to benefit from using <h1>, <header>, <footer>. As best I can tell, the spec does not give a solution for citing an article by using <cite> to wrap <article>. Is there a solution to this or a workaround? (The work in progress fudges this by using <div class="citation">)

  • The <address> tag W3 reference says: "The address element must not be used to represent arbitrary addresses, unless those addresses are in fact the relevant contact information." As best I can tell, the spec does not give a solution for marking the article's author's URL and name, as distinct from the article's contact info. Is there a solution for this or a workaround? (The work in progress fudges this by using <address> for the author's URL and name.)

Please ask questions in the comments. I will update this post as I learn more.

1 Answer

unor (best answer):

If you had asked me which markup to use for a list of links to blog posts (your context) before I had seen your example, I’d have gone with something like this:

<body itemscope itemtype="http://schema.org/WebPage">

<ul>
  <li>
    <cite itemprop="citation" itemscope itemtype="http://schema.org/BlogPosting">
      <a href="…" itemprop="url" rel="external"><span itemprop="name headline">Hello World</span></a>, 
      authored by <span itemprop="author" itemscope itemtype="http://schema.org/Person"><a href="…" itemprop="url" rel="external"><span itemprop="name">Alice</span></a></span>,
      posted on <span itemprop="isPartOf" itemscope itemtype="http://schema.org/CreativeWork"><a href="…" itemprop="url" rel="external"><span itemprop="name">Bob’s blog</span></a></span>.
    </cite>
  </li>
  <li>
    <cite itemprop="citation" itemscope itemtype="http://schema.org/BlogPosting">…</cite>
  </li>
</ul>

</body>

Using the sectioning content element article, like in your example, is certainly possible, although perhaps unusual (if I understand your use case correctly): As article is a sectioning content element, it creates an entry in the document outline, which may or may not be what you want for your case. (You can check the outline with the HTML5 Outliner, for example.)

Another indication that a sectioning content element might not be the best choice: Your article doesn’t contain any actual "main" content. Simply put, the main content of a sectioning content element can be determined by stripping its metadata: header, footer, and address elements. (This is not explicitly specified, but it follows from the definitions in Sections.)

However, not having this content is not wrong. And one could easily imagine (and maybe you intend to do so anyway) that you’ll quote a snippet from the blog post (making this case similar to a search result entry), in which case you’d have:

<article>
  <header></header>
  <blockquote></blockquote> <!-- the non-metadata part of the article -->
  <footer></footer>
</article> 
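
As a rough sketch of that snippet variant, the skeleton could be filled in along these lines. The quoted text is a hypothetical placeholder, the link targets stay elided as in the question, and using the cite attribute on blockquote to point at the source is an assumption made here, so treat this as one possible arrangement and not as the definitive markup.

```html
<!-- Sketch only: the quoted sentence is a hypothetical placeholder -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <header>
    <h1><cite itemprop="headline name"><a itemprop="url" href="…">Hello World</a></cite></h1>
  </header>
  <!-- the non-metadata part of the article: an excerpt quoted from the post -->
  <blockquote cite="…">
    <p>(a sentence or two quoted from the post)</p>
  </blockquote>
  <footer>
    authored by
    <span itemprop="author" itemscope itemtype="http://schema.org/Person">
      <a itemprop="url" href="…"><span itemprop="name">Alice</span></a>
    </span>
  </footer>
</article>
```

With the blockquote in place, stripping header and footer leaves actual content, which is what makes article a more natural fit.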

From here on, I’ll assume that you want to use article.

Notes about your HTML5:

  • Semantically, the wrapping div is not needed. You could add the citation class to the article directly.

  • The header element is optional if it just contains a heading element (this element makes sense when your header consists of more than just a heading element). However, having it is not wrong, of course.

  • I’d prefer to include the a element in the cite element (similar to the fifth example in the spec).

  • The span element can only contain phrasing content, so address isn’t allowed as a child.

  • The address element must only be used if it contains contact information. So whether this element is appropriate depends on what is available at the linked page: if it’s a contact form, yes; if it’s a list of authored blog posts, no.

  • The author link type might not be appropriate, as it’s defined to give information about the author of the article element. But, strictly speaking, you are the author. If the article consisted only of the blog post author’s actual content, using the author link type would be appropriate; but in your case, you are writing the content ("authored by", "posted on").

  • You might want to use the external link type for all external links.
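
To illustrate the two notes about span and address: if the linked page really is contact information, one way to keep address while staying within the content model might be to put the Microdata attributes on the address element itself instead of wrapping it in a span. This is a sketch under that assumption, not a recommendation; the link target stays elided as in the question, and the author link type is left out for the reason given above.

```html
<footer>
  authored by
  <!-- address is flow content, so it may sit directly inside footer;
       it carries the Person item itself, so no span wrapper is needed -->
  <address itemprop="author" itemscope itemtype="http://schema.org/Person">
    <a itemprop="url" href="…" rel="external"><span itemprop="name">Alice</span></a>
  </address>
</footer>
```

Browsers render address as a block by default, so the name starts on its own line unless styled otherwise. If the link leads to anything other than contact information, drop address and keep the span.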

Notes about your Microdata:

Taking your example, this would give:

<body itemscope itemtype="http://schema.org/WebPage">

<article itemprop="citation" itemscope itemtype="http://schema.org/BlogPosting" class="citation">
  <header>
    <h1>
      <cite itemprop="headline name"><a itemprop="url" href="…" rel="external">Hello World</a></cite>
    </h1>
  </header>
  <footer>
    authored by
    <span itemprop="author" itemscope itemtype="http://schema.org/Person">
      <a itemprop="url" href="…" rel="external"><span itemprop="name">Alice</span></a>
    </span>
    posted on
    <span itemprop="isPartOf" itemscope itemtype="http://schema.org/Blog">
      <a itemprop="url" href="…" rel="external"><span itemprop="name">Bob’s blog</span></a>
    </span>
  </footer>
</article>

</body>

(All things considered, I still prefer the section-less variant.)