Using the AMP Plugin to Protect Site Visitors and Debug Security Vulnerabilities

Recently I’ve been testing compatibility for all of Jetpack’s various widgets when used on pages served by the AMP plugin. In the process I ran across a security vulnerability in Jetpack (which I responsibly disclosed and which is now fixed), but I never would have noticed the issue if it weren’t for the AMP plugin’s internal validator.

As you may be aware, AMP is both a subset and superset of HTML. The standard HTML elements which can have problems with performance and privacy are not allowed in AMP. At the same time, AMP is also a web components library which provides custom elements that implement performance best practices and support privacy-preserving prerendering. All of the elements and attributes that AMP allows are codified in a specification which is used to programmatically validate AMP pages. Valid AMP pages can be distributed via an AMP Cache and safely prerendered to a user (e.g. in search results).

The AMP plugin internalizes the AMP specification and it uses the spec to catch invalid AMP markup to prevent it from leaking out onto the frontend. The plugin does its best to ensure your site serves valid AMP pages, not only so that Google Search Console doesn’t complain about AMP validation errors, but also in order to give you immediate feedback without having to wait for Googlebot to crawl your site. In contrast to the plugin’s Classic mode, the plugin no longer silently sanitizes the invalid AMP markup when in the Paired/Native modes; you can now be informed of what markup it is removing. This is particularly important when you have a site running ads or analytics, as you need to be alerted when the related script tags are getting stripped out (as AMP doesn’t allow custom scripts, at least not quite yet, though never like this).
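To make the idea of allowlist-based sanitization with reporting concrete, here is a minimal sketch. It is not the AMP plugin’s actual code: the tiny allowlist, the function name sketch_sanitize_attributes(), the error fields, and the 'some-plugin' source label are all assumptions made for illustration.

```php
<?php
// Hypothetical allowlist; the real AMP specification is far larger and is not reproduced here.
const SKETCH_ALLOWED_ATTRIBUTES = array(
    'a' => array( 'href', 'title', 'class' ),
);

/**
 * Keeps allowlisted attributes and records an error for each attribute removed.
 * The error shape (code, element, attribute, source) is illustrative only.
 */
function sketch_sanitize_attributes( string $element, array $attributes, string $source ): array {
    $allowed = SKETCH_ALLOWED_ATTRIBUTES[ $element ] ?? array();
    $kept    = array();
    $errors  = array();

    foreach ( $attributes as $name => $value ) {
        if ( in_array( $name, $allowed, true ) ) {
            $kept[ $name ] = $value;
        } else {
            // Instead of dropping the attribute silently, remember what was removed and where it came from.
            $errors[] = array(
                'code'      => 'invalid_attribute',
                'element'   => $element,
                'attribute' => $name,
                'source'    => $source,
            );
        }
    }

    return array( $kept, $errors );
}

list( $kept, $errors ) = sketch_sanitize_attributes(
    'a',
    array( 'href' => 'https://example.com/', 'onclick' => 'track()', 'bogus' => '' ),
    'some-plugin'
);

foreach ( $errors as $error ) {
    printf( "Removed %s from <%s> (source: %s)\n", $error['attribute'], $error['element'], $error['source'] );
}
```

The point of returning the errors alongside the kept attributes is that the site owner can see what was stripped, rather than finding out later that an ad or analytics tag quietly disappeared.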

So, back to the Jetpack plugin. When I tested the My Community widget, I noticed some strange new AMP validation errors reported by the AMP plugin, including unrecognized attributes: ben, cowboy, and alman:

New validation errors appearing after adding Jetpack’s My Community widget.

The AMP plugin’s validator stripped out these invalid attributes—being “accepted” for sanitization—so they would not have shown up on the frontend of the site. But where did they come from? Here also the AMP plugin provides a key tool. As shown above, the plugin already identified that Jetpack was the source of the errors. Then by expanding a validation error, the full context for the error including its source information is provided:

Details for an AMP validation error as provided by the plugin’s internal validator.

Here it is clear that the invalid markup is coming from that My Community widget in Jetpack, as can be seen in the source function (Jetpack_My_Community_Widget::display_callback). When I looked at the widget output in a non-AMP version of the page, the issue became clear:

<li>
    <a 
        href="https://en.gravatar.com/978a1a2a80394217a0e39c84f07a7c16" 
        title=""Cowboy" Ben Alman"
    >
        <img alt="" src="https://0.gravatar.com/avatar/978a1a2a80394217a0e39c84f07a7c16?s=96&d=https%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&r=G" class="avatar avatar-240" height="48" width="48" originals="240" scale="1" />
    </a>
</li>

The problem was title=""Cowboy" Ben Alman". The title attribute (among other attributes) was not being escaped with esc_attr() in Jetpack_My_Community_Widget::get_community():

foreach ( $members as $member ) {
    $my_community .= sprintf(
        '<li><a href="%s" %s><img alt="" src="%s" class="avatar avatar-240" height="48" width="48" originals="240" scale="1" /></a></li>',
        $member->profile_URL,
        empty( $member->name ) ? '' : 'title="' . $member->name . '"',
        $member->avatar_URL
    );
}
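For comparison, here is a sketch of what the same output could look like with every value escaped at the point of output. This is not necessarily the exact fix that shipped in Jetpack. So that the snippet runs outside WordPress, the hypothetical helper sketch_esc_attr() stands in for esc_attr() using PHP’s htmlspecialchars(); inside WordPress you would call esc_attr() for the title and esc_url() for the two URLs. The function name sketch_render_member() and the sample member object are also assumptions.

```php
<?php
// Stand-in for WordPress's esc_attr(), assumed here so the sketch runs outside WordPress.
function sketch_esc_attr( string $text ): string {
    return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );
}

// Hypothetical helper mirroring the loop body above, with each value escaped at output time.
function sketch_render_member( stdClass $member ): string {
    return sprintf(
        '<li><a href="%s" %s><img alt="" src="%s" class="avatar avatar-240" height="48" width="48" originals="240" scale="1" /></a></li>',
        sketch_esc_attr( $member->profile_URL ), // esc_url() in WordPress.
        empty( $member->name ) ? '' : 'title="' . sketch_esc_attr( $member->name ) . '"',
        sketch_esc_attr( $member->avatar_URL )   // esc_url() in WordPress.
    );
}

// Sample data modeled on the widget output shown earlier.
$member = (object) array(
    'profile_URL' => 'https://en.gravatar.com/978a1a2a80394217a0e39c84f07a7c16',
    'name'        => '"Cowboy" Ben Alman',
    'avatar_URL'  => 'https://0.gravatar.com/avatar/978a1a2a80394217a0e39c84f07a7c16?s=96',
);

echo sketch_render_member( $member ), "\n";
```

With the quotes encoded as &quot;, the name stays inside the title attribute, and the browser no longer sees stray attributes such as ben or alman.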

This is a persistent cross-site scripting vulnerability (stored XSS). An attacker could have exploited the vulnerability to run arbitrary JavaScript on a site that uses the My Community widget. All they’d have to do to exploit it is change their account “name” to something like:

John Smith"><script>doSomething("EVIL")</script><a class="

Then, since the widget lists users who have recently interacted with the site, the attacker would just have to leave a comment and then wait 10 minutes for the transient to expire. At that point the malicious doSomething("EVIL") call would run for every visitor to the site.
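To see why that name works as a payload, the following sketch reduces the widget’s sprintf() pattern to just a link with a title and renders the malicious name both ways. The function sketch_link() and its placeholder markup are hypothetical, and htmlspecialchars() again stands in for WordPress’s esc_attr() so the snippet runs on its own.

```php
<?php
// Hypothetical reduction of the widget's sprintf() pattern to a link with a title attribute.
function sketch_link( string $name, bool $escape ): string {
    // htmlspecialchars() stands in for WordPress's esc_attr() so this runs outside WordPress.
    $title = $escape ? htmlspecialchars( $name, ENT_QUOTES, 'UTF-8' ) : $name;
    return sprintf( '<a href="#" title="%s">avatar</a>', $title );
}

$name = 'John Smith"><script>doSomething("EVIL")</script><a class="';

// Unescaped: the first quote in the name closes the title attribute,
// so the script tag that follows becomes live markup.
echo sketch_link( $name, false ), "\n";

// Escaped: the whole name stays inside the title attribute as inert text.
echo sketch_link( $name, true ), "\n";
```

The trailing <a class=" in the payload is there to absorb the closing quote and bracket that the template appends, which keeps the surrounding markup looking intact.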

I responsibly disclosed this Jetpack security vulnerability through Automattic’s HackerOne program, and I got approval to blog about the find. Many thanks to the Jetpack team for being so responsive and including the fix in a release so quickly.

Remember: Never trust external input. Always validate/sanitize all inputs early and escape all output late.

However, this vulnerability would not have been exploitable on an AMP-first site. In the plugin’s Native mode there is no non-AMP version of the site (no paired AMP). The AMP plugin removes all custom scripts (including script tags and on-event handler attributes), so on a fully AMP site the AMP plugin would have prevented this stored XSS vulnerability from being exploited. Furthermore, the AMP plugin also informs the site owner that such invalid markup is being removed and where it came from in the first place.

So the AMP plugin is useful for protecting visitors to your site, as well as providing you with tools for finding and debugging security vulnerabilities. To learn more about the plugin, check out amp-wp.org.
