Comparing Markup with PHPUnit

For the upcoming WordPress 6.3 release, I’ve been contributing to the introduction of script loading strategies (i.e. the async and defer attributes). In the WP_Scripts class, all of the script tags are manually assembled with printf() & sprintf(), and as part of that work I wanted to start making use of helper functions that assemble script tags: wp_get_script_tag(), wp_get_inline_script_tag(), and friends.

These are better than manually assembling script tags since they handle escaping out of the box, they’re much easier to read, and they provide filters that allow for CSP attributes to be added (e.g. nonce).

However, the markup produced by the helper functions differs from what WP_Scripts assembles manually in two ways:

  1. WP_Scripts uses single-quoted HTML attribute values, whereas the helper functions use double-quoted ones.
  2. The order of attributes won’t necessarily be the same.

To the browser, these syntactic differences are meaningless. However, the unit tests were making extensive use of assertSame to compare the expected script markup with the actual markup, and so switching to use these helper functions broke a lot of tests.

What was needed was a way to compare markup that ignores non-semantic differences.

At first I thought I might use the new WP_HTML_Tag_Processor to normalize the markup by sorting the attributes and ensuring double-quoted attribute values. However, my normalization routine broke during the 6.3 development cycle. This breakage wasn’t unwarranted, since Dennis Snell said the Tag Processor wasn’t designed to normalize HTML.

In discussing alternatives with Joe McGill, we first considered assertXmlStringEqualsXmlString, but this was problematic since it requires a complete, valid XML document while we’re working with HTML fragments. Then he suggested using DOMDocument for the comparisons. But how? We couldn’t possibly just do something like the following in PHPUnit, right?

$expected = "<script src='/foo.js' id='foo-js'></script>";
$actual   = '<script id="foo-js" src="/foo.js"></script>';

$actual_dom = new DOMDocument();
$actual_dom->loadHTML( $actual );

$expected_dom = new DOMDocument();
$expected_dom->loadHTML( $expected );

$this->assertEquals( $expected_dom, $actual_dom ); // ✅

It turns out, yes!

I had assumed that this would fail because these are different instances of DOMDocument, so surely they wouldn’t be equal, like so:

assert( $expected_dom == $actual_dom ); // ✅ but read on

I expected this assertion to fail, but surprisingly it passes. The PHP docs for Comparing Objects state:

When using the comparison operator (==), object variables are compared in a simple manner, namely: Two object instances are equal if they have the same attributes and values (values are compared with ==), and are instances of the same class.

[…]

Note: Extensions can define own rules for their objects comparison (==).

So is it that the dom extension is defining its own rules for comparing DOMNode objects? Apparently, but even if so it seems the comparison rules are broken because the following comparison is also true:

$foo = new DOMDocument();
$foo->loadHTML( '<div id="foo">foo!</div>' );

$bar = new DOMDocument();
$bar->loadHTML( '<div id="bar">bar</div>' );

assert( $foo == $bar ); // ✅ WTF?

Something is clearly wrong here. But then how is it working in PHPUnit? Well, its Assertions documentation states:

Equality is checked using the == operator, but more specialized comparisons are used for specific argument types for $expected and $actual, see below.

And it turns out that DOMDocument is one of these specialized comparisons:

assertEquals(DOMDocument $expected, DOMDocument $actual[, string $message])

Reports an error identified by $message if the uncommented canonical form of the XML documents represented by the two DOMDocument objects $expected and $actual are not equal.

Example 1.7 Usage of assertEquals() with DOMDocument object

<?php declare(strict_types=1);
use PHPUnit\Framework\TestCase;

final class EqualsWithDomDocumentTest extends TestCase
{
    public function testFailure(): void
    {
        $expected = new DOMDocument;
        $expected->loadXML('<foo><bar/></foo>');

        $actual = new DOMDocument;
        $actual->loadXML('<bar><foo/></bar>');

        $this->assertEquals($expected, $actual);
    }
}

Running the test shown above yields the output shown below:

./tools/phpunit tests/EqualsWithDomDocumentTest.php
PHPUnit 10.0.19 by Sebastian Bergmann and contributors.

Runtime:       PHP 8.2.7

F                                                                   1 / 1 (100%)

Time: 00:00.002, Memory: 14.31 MB

There was 1 failure:

1) EqualsWithDomDocumentTest::testFailure
Failed asserting that two DOM documents are equal.
--- Expected
+++ Actual
@@ @@
 <?xml version="1.0"?>
-<foo>
-  <bar/>
-</foo>
+<bar>
+  <foo/>
+</bar>

/path/to/tests/EqualsWithDomDocumentTest.php:14

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.

And it works not just with XML but HTML too. Perfect!
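To see why quoting and attribute order drop out of such a comparison, here is a minimal sketch in plain PHP that compares the canonical XML (C14N) serializations of two parsed fragments. It only approximates the idea described in the PHPUnit documentation quoted above; it is not PHPUnit's actual comparator code, and canonical_markup() is a hypothetical helper name introduced for illustration.

```php
<?php
// Hypothetical helper for illustration; not part of PHPUnit or WordPress.
function canonical_markup( string $html ): string {
	$previous = libxml_use_internal_errors( true ); // Silence parser warnings.

	$dom = new DOMDocument();
	$dom->loadHTML( $html );

	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	// Canonical XML sorts attributes and double-quotes their values.
	return $dom->C14N();
}

$expected = "<script src='/foo.js' id='foo-js'></script>";
$actual   = '<script id="foo-js" src="/foo.js"></script>';
$other    = '<script id="bar-js" src="/bar.js"></script>';

var_dump( canonical_markup( $expected ) === canonical_markup( $actual ) ); // bool(true)
var_dump( canonical_markup( $expected ) === canonical_markup( $other ) );  // bool(false)
```

Unlike the == operator on the DOMDocument objects themselves, comparing the serialized strings does tell genuinely different documents apart.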

So given this unit test code in WordPress 6.2:

public function test_wp_add_inline_script_before() {
	wp_enqueue_script( 'test-example', 'example.com', array(), null );
	wp_add_inline_script( 'test-example', 'console.log("before");', 'before' );

	$expected  = "<script type='text/javascript' id='test-example-js-before'>\nconsole.log(\"before\");\n</script>\n";
	$expected .= "<script type='text/javascript' src='http://example.com' id='test-example-js'></script>\n";

	$this->assertSame( $expected, get_echo( 'wp_print_scripts' ) );
}Code language: PHP (php)

To make it pass in WordPress 6.3, where wp_get_inline_script_tag() is used for assembling the script tag, the test is modified as follows:

public function test_wp_add_inline_script_before() {
	wp_enqueue_script( 'test-example', 'example.com', array(), null );
	wp_add_inline_script( 'test-example', 'console.log("before");', 'before' );

	$expected  = "<script type='text/javascript' id='test-example-js-before'>\nconsole.log(\"before\");\n</script>\n";
	$expected .= "<script type='text/javascript' src='http://example.com' id='test-example-js'></script>\n";

	$this->assertEqualMarkup( $expected, get_echo( 'wp_print_scripts' ) );
}

Note the replacement of assertSame with assertEqualMarkup. This test assertion method takes the supplied $expected and $actual strings and loads the HTML into their own respective DOMDocument instances which get passed into assertEquals. Actually, it does a bit more. A markup fragment is loaded into a DOMDocument by:

  1. Supplying a valid HTML document with UTF-8 encoding.
  2. Putting the markup fragment inside the body, since otherwise the supplied markup may get put in the head.
  3. Stripping out any whitespace nodes added at the beginning/end of the body during parsing.

These aren’t all strictly necessary, but they make the comparisons more predictable.

The values then passed into assertEquals are not actually the DOMDocument objects but rather their respective DOMElement elements for the body. It turns out that PHPUnit’s assertEquals allows passing DOMElement objects for comparison too, not just DOMDocument instances.
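The three loading steps and the body-element detail can be sketched roughly as follows. This is a standalone approximation under stated assumptions, not necessarily the code that ships in WordPress core: the function name parse_markup_fragment() and the exact wrapper document are chosen here for illustration.

```php
<?php
// Sketch only: the function name and wrapper markup are assumptions.
function parse_markup_fragment( string $markup ): DOMElement {
	$previous = libxml_use_internal_errors( true ); // Silence parser warnings.

	$dom = new DOMDocument();
	// Steps 1 and 2: a complete UTF-8 document with the fragment inside the body.
	$dom->loadHTML(
		'<!DOCTYPE html><html><head><meta charset="utf-8"></head><body>' . $markup . '</body></html>'
	);

	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	$body = $dom->getElementsByTagName( 'body' )->item( 0 );

	// Step 3: drop whitespace-only text nodes at the start and end of the body.
	foreach ( array( $body->firstChild, $body->lastChild ) as $node ) {
		if (
			$node instanceof DOMText
			&& '' === trim( $node->data )
			&& $node->parentNode === $body // Skip a node that was already removed.
		) {
			$body->removeChild( $node );
		}
	}

	return $body;
}

$expected = "<script type='text/javascript' id='a'>\nconsole.log(1);\n</script>\n";
$actual   = "<script id=\"a\" type=\"text/javascript\">\nconsole.log(1);\n</script>\n";

// Outside PHPUnit, canonical serializations stand in for assertEquals().
var_dump( parse_markup_fragment( $expected )->C14N() === parse_markup_fragment( $actual )->C14N() );
```

Inside a test case, the two returned body elements would instead be handed to $this->assertEquals(), as described above.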

The only real caveat to comparing markup in this way is that HTML comments are disregarded. So if you want to assert the presence of some comment, you’d perhaps also need to use assertStringContainsString with the supplied markup.
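The comment caveat can be illustrated with a small sketch in plain PHP. It assumes the comparison behaves like a canonical serialization without comments (the default for DOMNode::C14N()), and body_c14n() is a hypothetical helper name; the strpos() check stands in for what assertStringContainsString would verify in a real test.

```php
<?php
libxml_use_internal_errors( true ); // Silence parser warnings for this sketch.

// Hypothetical helper: canonical serialization of a fragment's body element.
function body_c14n( string $markup ): string {
	$dom = new DOMDocument();
	$dom->loadHTML( '<!DOCTYPE html><html><body>' . $markup . '</body></html>' );

	// Comments are left out of the canonical form by default.
	return $dom->getElementsByTagName( 'body' )->item( 0 )->C14N();
}

$with_comment    = '<!-- cached --><p>Hello</p>';
$without_comment = '<p>Hello</p>';

// The comment makes no difference to the canonical comparison...
var_dump( body_c14n( $with_comment ) === body_c14n( $without_comment ) ); // bool(true)

// ...so its presence has to be checked separately on the raw string.
var_dump( false !== strpos( $with_comment, '<!-- cached -->' ) ); // bool(true)
```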

I’m looking forward to seeing how this will facilitate eliminating the remaining manual construction of script tags in WP_Scripts in WordPress 6.4. (In 6.3 it is limited to inline before/after scripts.)

I hope this helps you with unit testing markup comparisons in PHPUnit!
