Quotations and citations: quoting text

When quoting text in HTML, there are several ways of marking up the quoted text. Which way you choose depends on what you’re quoting, how you’re quoting it, and how important it is for you that all browsers render the quotations the same way.

The available quotation elements aren’t really used a lot outside of web standards blogs. A probable reason, besides people not knowing about them, is less than perfect browser support for some features of the q element. That doesn’t mean it can’t be used though, so let’s take a look at how and when to use which quotation element.

Inline quotes: the q element

When the text you’re quoting is short enough to be displayed inline (as part of a sentence), the q element should be used. All you need to do is wrap the quoted text in a q element and web browsers automatically render quotation marks before and after the content of the q element.

Or rather, that’s what they’re supposed to do. Unfortunately, not all browsers do (hello again, IE/Win), and of the browsers that do, most don’t get it right.

All modern browsers recognise the q element and insert some form of quotation marks. Great. But there is a problem. Browsers should insert the quotation mark characters appropriate for the language specified by the lang and/or xml:lang attributes for the whole document or part of it. That’s right, different languages use different kinds of quotation marks.

Furthermore, when q elements are nested, browsers should automatically insert the correct, language specific, quotation marks for nested quotations.

As far as I know, no current browser gets all that right. Amazingly, of all the browsers that I have tested in, the one that comes closest is Internet Explorer. The Mac version, that is. It inserts the correct typographical quotation marks, even for nested quotations. It also renders different quotation marks for some languages, like French. All other browsers just insert straight double quotation marks ("), even around nested quotations.

A typographical sidenote here: the straight quotation marks aren’t really quotation marks at all. They’re called prime and double prime, are sometimes referred to as dumb quotes, and were made popular by being the only characters looking anything like quotation marks on mechanical typewriters. Correction: yes, they are quotation marks. I stand corrected. After spending some time looking through the code charts at Unicode, I found them. Thanks, Joe, for pointing it out (comment #12). Looks like Bringhurst, whose book I used to check my facts for this, is indeed wrong here.

So what is a poor markup purist to do? Give up on the q element? Manually insert quotation marks? Not yet. There is a way to improve the situation a bit.

By applying a bit of CSS you can make several browsers insert the correct quotation marks:

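    /* Quotation mark pairs: outer level first, then nested level */
    q {
        quotes: "\201C" "\201D" "\2018" "\2019";
    }
    /* Insert the opening and closing marks around each q element */
    q:before {
        content: open-quote;
    }
    q:after {
        content: close-quote;
    }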

The first rule defines a list of quotation mark pairs to use. The first pair is used for top level quotations, and the second pair for quotations within quotations. You’re not limited to two levels; provide as many pairs as you wish. You can also use any characters; just enter a backslash followed by the hexadecimal Unicode code point of the character you want. Of course, if your style sheet is encoded as UTF-8 you can type the characters directly instead of escaping them.

The next two rules tell browsers to use the quotation marks defined in the first rule. More info on the CSS used for this can be found in Specifying quotes with the ‘quotes’ property.

Things aren’t perfect now, but better. Opera displays the quotation marks defined by the CSS for both levels of quotations. Firefox (and other Gecko based browsers) only use the first pair of quotation marks for all levels of quotations. Safari is a bit stubborn, and displays double dumb quotes for all levels. Not perfect, but better than nothing.

If you quote text in languages that use different quotation marks than your document’s main language, CSS can help you out with that too:

    /* English: “outer” and ‘inner’ */
    q:lang(en) {
        quotes: "\201C" "\201D" "\2018" "\2019";
    }
    /* Swedish: the same right-hand mark opens and closes a quotation */
    q:lang(sv) {
        quotes: "\201D" "\201D" "\2019" "\2019";
    }

This allows you to specify different characters to use as quotation marks for different languages. Browser support is a bit lacking, so an alternative way would be to use an attribute selector:

    q[lang|="en"] {
        quotes: "\201C" "\201D" "\2018" "\2019";
    }

The above will match any q element that has a lang attribute whose value is exactly “en” or begins with “en-”. Again, not all browsers do this. In most cases there really isn’t any need to specify quotation marks for different languages like this anyway, unless you have a document with whole paragraphs in different languages. The quotation rules for the dominant language should be used, not those for the quoted language.
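One difference between the two selectors is worth a small sketch. The markup below is illustrative only (the lang values and the placeholder text are assumptions, not taken from this site): an attribute selector looks at the lang attribute on the q element itself, while :lang() also takes the language inherited from an ancestor into account.

```html
<!-- Matched by q[lang|="en"]: the q element itself carries lang="en-GB" -->
<p>He said <q lang="en-GB">Quotation text…</q></p>

<!-- Not matched by q[lang|="en"], because the q element has no lang
     attribute of its own. q:lang(en) does apply here, since the
     language is inherited from the p element. -->
<p lang="en">He said <q>Quotation text…</q></p>
```

So if you rely on the attribute selector, the lang attribute has to be repeated on every q element you want it to match.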

With modern browsers reasonably taken care of, in the sense that they render some kind of quotation marks around the content of q elements, there’s still IE/Win, which doesn’t render any quotation marks at all. There’s no way to make it do that automatically (well I suppose you could use some clever JavaScript, but I’ll leave that to someone else), but you can still make it display inline quotations in italics, which I think is a reasonable tradeoff:

    /* Hide from IE5-mac \*/
    * html q {
        font-style:italic;
    }
    /* End hiding from IE5-Mac */

In this example I used the Mac hack in combination with the Tan hack (more info on those hacks) to hide the rule from all browsers except IE/Win. Use your favourite way of sending IE/Win a separate stylesheet.

Here’s an example paragraph of nested inline quotations, with the CSS applied:

The W3C HTML 4.01 specification states that “the presentation of ‘phrase elements’ depends on the user agent”.
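The markup behind a paragraph like that can be sketched roughly as follows. This is a reconstruction from the rendered text, so the exact source may differ. Note that no quotation marks are typed into the markup; they are generated for each q element.

```html
<p>The W3C HTML 4.01 specification states that
<q>the presentation of <q>phrase elements</q> depends on the user agent</q>.</p>
```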

Blockquotes: the blockquote element

For quoting longer passages of text that make up one or more complete paragraphs, or are otherwise not suitable to include inline, use the blockquote element.

Quotation marks in blockquote elements are not automatically inserted by browsers. The spec does not require them to either, though it encourages implementors to provide ways of doing so. You’re free to insert your own, or otherwise style the blockquote to make it look like a quote.

One thing that’s important to remember about the blockquote element is that it can only contain block level elements (easy to remember: blockquote – block elements) when you use a Strict DOCTYPE. This means that the text it contains must be enclosed in a block level element, which in most cases will be one or more p element(s).
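A minimal sketch of the difference, using placeholder text:

```html
<!-- Valid with a Strict DOCTYPE: the quoted text sits inside a p element -->
<blockquote>
    <p>Quotation text…</p>
</blockquote>

<!-- Not valid with a Strict DOCTYPE: text directly inside the blockquote -->
<blockquote>
    Quotation text…
</blockquote>
```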

Here’s how I’ve styled blockquotes:

The following sections discuss issues surrounding the structuring of text. Elements that present text (alignment elements, font elements, style sheets, etc.) are discussed elsewhere in the specification. For information about characters, please consult the section on the document character set.

Referencing: the cite attribute

Both the q and the blockquote elements can take an optional cite attribute. The value of the cite attribute is a URI that points to the source of the quotation. The source of my example blockquote looks like this:

    <blockquote cite="http://www.w3.org/TR/1999/REC-html401-19991224/struct/text.html">

The contents of the cite attribute aren’t normally displayed by browsers (though in Mozilla/Firefox, if you right click a blockquote and select “Properties”, the URI used in the cite attribute will be displayed, along with the language specified for the blockquote), but there are ways of using CSS to display the URI. This could be useful to have in a print style sheet, to reveal any cite attributes.

The following sections discuss issues surrounding the structuring of text. Elements that present text (alignment elements, font elements, style sheets, etc.) are discussed elsewhere in the specification. For information about characters, please consult the section on the document character set.

The above blockquote uses the following CSS:

    /* Show the value of the cite attribute after the quoted text */
    blockquote.example[cite]:after {
        content: "URI: " attr(cite);
        border-top:1px dotted #999;
        padding-top:0.25em;
        display:block;
        color:#000;
    }

This works in Mozilla/Firefox, Safari, and Opera. Since this doesn’t provide any essential information or break things in other browsers, I think it’s a nice case of forward enhancement.

Citing: the cite element

While not technically a quotation, the cite element falls very close. It’s commonly used to mark up the names of external references, like the title of a book or a movie. The HTML 4.01 spec states that cite “Contains a citation or a reference to other sources”, which leaves room for a bit of interpretation. The consensus seems to be to use it to mark up titles of movies, books, and the like. The cite element can only contain inline elements, so no paragraphs. An example:

A couple of my favourite science fiction epics are the Night’s Dawn Trilogy by Peter F. Hamilton and Isaac Asimov’s Foundation series.
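The markup for a sentence like that might look something like this. It is a sketch; exactly which words are wrapped in cite elements is an assumption based on the titles mentioned.

```html
<p>A couple of my favourite science fiction epics are the
<cite>Night’s Dawn Trilogy</cite> by Peter F. Hamilton and
Isaac Asimov’s <cite>Foundation</cite> series.</p>
```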

You can also use it in combination with a blockquote element to provide a reference to the source of the quotation. If that reference is an important part of the content, this is probably a better method than the CSS technique described above. An example:

The following sections discuss issues surrounding the structuring of text. Elements that present text (alignment elements, font elements, style sheets, etc.) are discussed elsewhere in the specification. For information about characters, please consult the section on the document character set.

URI: Paragraphs, Lines, and Phrases.

The XHTML for that:

    <div>
        <blockquote cite="URI">
            <p>Quotation text…</p>
        </blockquote>
        <div>
            URI: <cite><a href="URI">Paragraphs, Lines, and Phrases</a></cite>.
        </div>
    </div>

In this example I grouped the blockquote and cite elements by wrapping them in a div to be able to treat them as a unit.

Conclusion

Phew. This ended up being much longer than I had originally intended. If you’ve skipped straight to the end, here’s the short version:

  • Use the q element for inline quotations. Do not add any quotation marks to the markup. For greater control over styling and typography, use CSS. Due to lacking browser support for the q element and the generation of quotation marks, this isn’t an exact science. Use your own judgement.
  • Use the blockquote element for quoting paragraphs of text. Make sure to enclose the quoted text in a block level element like p if you’re using a strict DOCTYPE.
  • Use the cite element to mark up the names of external references like movies or books.

Posted on November 24, 2004 in (X)HTML, Accessibility, Web Standards

Comments

  1. Thx a lot for the browser research, Roger. Already bookmarked this article. This is one of those articles that - if someone asks about quotes - one can always refer to.

  2. There are some really good tips here, especially the use of content:attr() to show the citation source.

    What I think I’d do is use that for compliant browsers and then use an IE conditional (yes, dirty, I know :) ) to include Simon Willison’s blockquote JS so it only gets pulled in when absolutely necessary.

  3. It’s the Q element and the <q> start-tag. The part about the CITE element is almost completely wrong. You can only use it when you refer to the location you got the quote from. As in: “<cite>Person Foo</cite> tell us to ‘insert quote’”.

    You might want to mention ADDRESS in a revision of the post. See the mozilla.org Markup Reference for more information.

  4. That’s an excellent article Roger and it answers a few questions I’ve been asking about quotations.

    It’s great to see all the methods of quotation dealt with in one article. Thank you.

  5. November 24, 2004 by Roger Johansson (Author comment)

    Anne: Where does it say that the cite element can only be used that way? There is an example in the HTML 4 spec that’s probably wrong if that is the case.

    And yes, perhaps it would be more correct to refer to elements not surrounded by < >. I’ll consider changing that. I’ll stick to lower case though.

    The address element could probably be used here too. I’ll take a look at the link you provided once that server starts responding.

  6. From Paragraphs, lines and phrases :

    CITE - Contains a citation or a reference to other sources.

    This definition is concurrent with the English language in that, to cite is to give reference with the intent of proving or distinguishing evidence.

    In day to day language, we cite when we refer to anything:
    “I hate crappy Eminem copies like The Streets.” Where “The Streets” is the object which was cited.

    Whilst the Mozilla Markup Reference is a very nice tool, it is strange in how it suggests certain elements:
    <div class="quote"> for a long quotation? How bizarre. Why not <p class="quote">? I’m not saying that it is wrong, they are suggestions from a good source (which is what it is meant to be). Not definitive though and I know that people often find their own ways of being semantically creative.

    When it comes to using the ADDRESS element… Well I think the definition is a little vague. From The global structure of an HTML document:

    The ADDRESS element may be used by authors to supply contact information for a document or a major part of a document such as a form. This element often appears at the beginning or end of a document.

    Not to say that Mozilla is wrong Anne but to say that people have different interpretations when they use elements. It reminds me of something Mark Pilgrim once said in Won’t somebody please think of the gerbils?. I’m interested to see how future HTML implementations will pan out with regards to how they handle citations. I know you (Anne) are a big WHATWG advocate and I’m working with people who are big XHTML2 advocates. I’m sitting somewhere at the back watching and waiting with great interest. Of course I’m getting off topic now. Shutting up.

    Oh and, top resource Roger. Not wanting to sound like a brown-nose but these are the sorts of things which need to be available on thowd internet. I have enjoyed reading it. If you’d been within earshot you’d have heard “I do that!” quite a lot.

  7. Thanks for this article, I’ve just implemented the blockquote tips in conjunction with Simon Willison’s blockquote JS, that’s nice.

    Now, just a question: why not put the <cite /> element of the last example directly in the <blockquote /> element instead of the divs? Something like:

    <blockquote cite="URI">
        <p>Quotation text</p>
        <p class="source">URI: <cite><a href="URI">Paragraphs, Lines, and Phrases</a></cite>.</p>
    </blockquote>

    It could be a more efficient way to say that both elements are connected to each other…

  8. Just to respond quickly on the note from Paul. A DIV is used since it isn’t a paragraph of text.

  9. Congrats to Paul for linking to Mark’s Gerbils article before me. I hate the element :)

  10. that is

  11. <cite> even.

    I didn’t realise the preview box converted escaped tags to rendered tags.

  12. “A typographical sidenote here: the straight quotation marks arent really quotation marks at all. Theyre called prime and double prime, are sometimes referred to as dumb quotes, and were made popular by being the only characters looking anything like quotation marks on mechanical typewriters.”

    Sadly, no.

    • " is a neutral double quotation mark. ' is a neutral apostrophe.
    • ′ and ″ are prime marks (minutes and seconds)

    The entire argument for the vapourware q element disintegrates once you start using typographer’s quotes, which are always unique. Unicode fux0red the apostrophe/single-closing-quotation-mark distinction and those two cannot actually be distinguished, but confusion only ever erupts in some forms of British English.

    See my book.

    BTW, your code processor ate my entities.

  13. November 24, 2004 by Roger Johansson (Author comment)

    Joe: Interesting. In Robert Bringhurst’s book “The Elements of Typographic Style”, ' and " are referred to as prime and double prime. Prime is an abbreviation for feet and for minutes of arc, while double prime is an abbreviation for inches and for seconds of arc.

    Single and double primes may be sloped or vertical, but should not be confused with quotation marks, which in some faces (frakturs especially) also take the shape of sloped primes.

    But now you’ve made me unsure of what the correct names of these characters are, and which ones are actually produced by hammering away on a keyboard without taking special care to enter the correct typographer’s quotes ;)

    Not sure what you mean by typographer’s quotes always being unique. Unique for many languages, yes. Some languages also have alternative quotation marks, which a typographer may use. Either way you can define the characters you want to use in the CSS. Obviously, that means there will be no quotation marks in browsers that support neither the q element nor the CSS.

    BTW, your code processor ate my entities.

    Right, will look into fixing that. I noticed it caused Phil some problems too.

    Oh, and I fixed the link to your book.

  14. November 24, 2004 by Roger Johansson (Author comment)

    David: I suppose you could put the cite element inside the blockquote (wrap it in a div or a p element if you use a strict DOCTYPE). I put it outside to have more styling options. If you put it inside the blockquote I’d suggest giving the surrounding block level element a class to make it easier to style differently in case there are multiple paragraphs in the actual quote.

  15. Hi,

    You might also be interested in this technique (Semantically Correct Knockout Quotes) which allows you to do knockout (pull) quotes without adding redundant text to your document, and add a source attribution which does not appear in the text.

    I’m definitely going to mull this over and see if it warrants any updates to my article.

    Thanks for writing this, it’s definitely a good contribution to an often-confusing topic.

  16. I’ve noticed that people’s quotes in the comments are punctuated correctly, but the one q example you give:

    q { quotes:"\201C" "\201D" "\2018" "\2019"; color:#333; } q:lang(sv) { quotes: "\201D" "\201D" "\2019" "\2019"; } “the presentation of ‘phrase elements’ depends on the user agent”

    shows up with all right double-quotes. Is this how the Swedish language does quotes? Why didn’t it show up with 2019 around the “phrase elements” part? Sorry about the ugly blockquote on my part, but hopefully you get the picture.

  17. November 24, 2004 by Roger Johansson (Author comment)

    J.J.: Yep, that’s the Swedish way of quoting. It shouldn’t show up like that in this document though, since the language is specified as English. I looked at it in Opera and you’re right. For some reason it uses the quotation marks specified for Swedish. I removed the q:lang(sv) rule from my style sheet for now.

  18. AFAIK, Opera doesn’t support :lang, although Mozilla does. For Opera, I use an attribute selector: [lang|=en].

  19. November 25, 2004 by Roger Johansson (Author comment)

    Tommy: Thanks for pointing that out. I’ve updated the post to reflect that. Seems like support for using CSS to define quotation marks for different languages is not quite there yet.

  20. Anne, the mozilla.org Markup Reference is a style guide for people writing documents for the Mozilla site, not a global standard. They’re clueful people, and it’s interesting (and even inspiring) to see their ideas about semantic markup.

    However, their use of is not the usage prescribed by the W3C, who say it “may be used by authors to supply contact information for a document or a major part of a document such as a form”. On the other hand, they have particular and precise uses for that limit the W3C’s general “contains a citation or a reference to other sources” ruling.

    Now I can understand (though don’t share) the viewpoint that W3C specs must be followed to the letter with regard to the precise meaning and purpose of elements (even if it renders elements like address almost useless). But I don’t see why we should all follow the Mozilla style guide - unless we’re writing Mozilla pages - equally slavishly.

  21. Chris, I was just pointing to the mozilla.org Markup Reference, since it’s the only document I know that speaks of such, correct in my opinion, usage. It’s, after all, the ADDRESS of the source. And the ADDRESS should be retrieved if you have comments about it. That the W3C doesn’t mention that place in the document doesn’t make it less appropriate.

    It would be stupid if mozilla.org had a Markup Reference that was not in line with the W3C specifications, since you can’t really redefine semantics.

  22. Firefox (and other Gecko based browsers) only use the first pair of quotation marks for all levels of quotations.

    If you add a descendant selector to the CSS you can make Firefox support the second level of quotation marks.

    q q {
        quotes:"\2018" "\2019";
    }
    

    I don’t know how this works in any of the Mac browsers like Safari as I don’t have a Mac. Maybe someone else can test other browsers.

  23. Tanny, you could always use the wonderful http://danvine.com/icapture/, although it seems to be broken right now …

  24. Anne (kudos for figuring out which elements I was talking about since the comment software swallowed them, they looked ok in the preview!)

    If there’s a usage for the ADDRESS element that the Mozilla style guide mentions that doesn’t appear in the W3C standard, isn’t that just a breach of the standards? Doesn’t the fact that “it’s the only document [you] know that speaks of such […] usage” tend to reinforce that impression?

    Now the W3C definition of ADDRESS is absurdly restrictive, almost to the point of making it useless. Now that we’re all trying to hide our addresses from spam bots, does anyone really want an element that draws attention to contact details? Even if they do, does it make sense for it only to “supply contact information for a document or a major part of a document”? Wouldn’t that sort of thing fit better into a META element in the document’s HEAD anyway?

    People are pushing the envelope on ADDRESS (no pun intended). There was a lengthy discussion on SimpleBits about whether it was OK to use it for, well, addresses. Mozilla are using it to mark up the source of BLOCKQUOTEs. Both seem sensible to me, though neither follows the letter of the W3C spec.

    Some people like to follow W3C specs as if they are holy writ. Like any good holy book, the specs are ambiguous and inconsistent, meaning that the zealots can have a high old time accusing each other of heresy and giving standards a bad name.

    I’m not one of them. If Mozilla want to stretch the definition of ADDRESS, I don’t think it’s “stupid”. I wouldn’t have thought it stupid if they’d been more dogmatic and demanded a “div class=’source’” instead. Or a P, or a CITE nested inside a P, or whatever else floated their boat. So long as it’s valid (X)HTML and not wildly unsemantic, it’s up to them.

  25. November 29, 2004 by Roger Johansson (Author comment)

    Chris, and others who have had HTML entities stripped from their comments: I’m aware of the problem, but I don’t know how to fix it. MovableType insists on converting entities before displaying them in the preview textarea, which means any < or > get removed when the comment is submitted. I’ve fiddled with the available attributes for comment previewing, but nothing has helped so far. I’ll take another look at it — maybe I just missed something.

  26. It appears that my editor and Joe Clark agree: double primes and primes are not quotation marks which meant that I had to enable and disable the AutoCorrect feature in Word whenever I had to switch from code to discussion.

    I have a question about the cite attribute for blockquote which doesn’t seem to have been discussed here. According to the W3C, cite should be a URI but what if I am quoting from a book: is cite="I, Robot" wrong? Not all information on the WWW is from the WWW so this restriction on cite may not always be appropriate.

    Jules

  27. You got MY solution !! A big thanks

    Thanks again.

  28. January 26, 2005 by Aankhen

    I have a question about the cite attribute for blockquote which doesn’t seem to have been discussed here. According to the W3C, cite should be a URI but what if I am quoting from a book: is cite="I, Robot" wrong?

    You might be better off leaving out the cite attribute there. Or putting it in a separate element, either as part of the blockquote, or outside, as shown in the entry.

  29. As per my usual MO, I’m going to go right ahead and advocate a server-side solution to the problem of quotes under IE, and named character entities in general.

    My PHP Lab series, which Roger has so graciously added under his Quicklinks, discusses this very issue and my own solution to it. You can jump directly to it under the section titled the head function.

    Included in the paragraph is a link to some sample source code that allows you to safely use quote elements under those browsers that support it, and replace them with the Unicode numeric codepoints under IE — all using the same markup.

    Doug

  30. Quick follow-up: in my references to IE, I’m talking about Win/IE of course. I’ve heard that Mac/IE is the only browser that comes close to implementing this correctly.

    I really have to get one of those mini-Macs so I can test this stuff. I guess I just let that cat out of the bag. lol ~d

  31. May 19, 2005 by John Weston

    Picking nits here, but isn’t your large grey quotation mark upside down? The fat part should be on top in the opening mark, and on the bottom in the closing mark, no?

  32. May 19, 2005 by John Weston

    Wait… now I’ve gone and confused myself. Maybe you’ve got it right. I think the reason your graphic looked wrong to me was that I had just come from this page, where they’re the other way around. Hmm.

  33. Has anyone found discussion online somewhere about using the q element for slang?

  34. What does quotes:"\201C" "\201D" "\2018" "\2019"; mean?

Comments are disabled for this post (read why), but if you have spotted an error or have additional info that you think should be in this post, feel free to contact me.