Get element text, including alt text for images, with JavaScript

Sometimes I find myself wanting to get the text contents of an element and its descendants. There is a DOM property called textContent that can be used for this. There is also jQuery’s text() method. Unfortunately neither returns what I want.

In both cases, the alt text of elements that can have alt attributes is omitted from the returned string. In my opinion, alt text is the text content of an img, input[type=image] or area element and should be returned by methods like these. I also find it a bit weird that they return the contents of script elements.

Not having any luck finding a method that includes alternative text and omits script elements when getting text content, I wrote my own:

var getElementText = function(el) {
    var text = '';
    // Text node (3) or CDATA node (4) - return its text
    if (el.nodeType === 3 || el.nodeType === 4) {
        text = el.nodeValue;
    } else if (el.nodeType === 1) {
        var tag = el.tagName.toLowerCase();
        var type = (el.getAttribute('type') || '').toLowerCase();
        // img, input[type=image], or area element - return its alt text
        if (tag === 'img' || tag === 'area' || (tag === 'input' && type === 'image')) {
            text = el.getAttribute('alt') || '';
        // Traverse children unless this is a script or style element
        } else if (tag !== 'script' && tag !== 'style') {
            var children = el.childNodes;
            for (var i = 0, l = children.length; i < l; i++) {
                text += getElementText(children[i]);
            }
        }
    }
    return text;
};

It expects the argument to be a reference to an element.

To get the text contents of an entire document, you’d call it like this:

var bodyText = getElementText(document.body);
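
To try the function without a browser, here is a minimal sketch that feeds it plain objects instead of real DOM nodes. The textNode and element helpers are hypothetical stand-ins written for this example, not part of any library. They provide only the properties the function actually reads (nodeType, nodeValue, tagName, childNodes and getAttribute). The function is repeated in condensed form so the snippet runs on its own.

```javascript
// Condensed copy of getElementText so this snippet is self-contained.
function getElementText(el) {
    var text = '';
    if (el.nodeType === 3 || el.nodeType === 4) {
        // Text or CDATA node
        text = el.nodeValue;
    } else if (el.nodeType === 1) {
        var tag = el.tagName.toLowerCase();
        var type = (el.getAttribute('type') || '').toLowerCase();
        if (tag === 'img' || tag === 'area' || (tag === 'input' && type === 'image')) {
            // Elements with alternative text
            text = el.getAttribute('alt') || '';
        } else if (tag !== 'script' && tag !== 'style') {
            for (var i = 0; i < el.childNodes.length; i++) {
                text += getElementText(el.childNodes[i]);
            }
        }
    }
    return text;
}

// Hypothetical helper: a stand-in for a DOM text node.
function textNode(value) {
    return { nodeType: 3, nodeValue: value };
}

// Hypothetical helper: a stand-in for a DOM element.
function element(tagName, attrs, children) {
    attrs = attrs || {};
    return {
        nodeType: 1,
        tagName: tagName.toUpperCase(),
        childNodes: children || [],
        getAttribute: function (name) {
            return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
        }
    };
}

// Roughly: <p>Logo: <img alt="Acme"><script>var x = 1;</script> rocks</p>
var sample = element('p', {}, [
    textNode('Logo: '),
    element('img', { alt: 'Acme' }),
    element('script', {}, [textNode('var x = 1;')]),
    textNode(' rocks')
]);

var sampleText = getElementText(sample);
console.log(sampleText); // prints "Logo: Acme rocks"
```

The alt text of the image ends up in the result and the script contents do not, which is the behaviour I was after. With a real DOM, textContent on the same markup would skip the alt text and include the script source.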

There could be better ways of doing this, of course, but none that I have been able to find.

Posted on May 16, 2011 in JavaScript
