PHP Sadness

Comparison operators

PHP's comparison operators use a confusing, nontransitive set of rules documented on the php.net page language.operators.comparison and demonstrated with enormous truth tables on the page types.comparisons. These rules apply not only to the scary equality operators == and !=, but also to the operators <, >, <=, and >=, which seem to be considered "safer" than their scary brethren.

Here are examples of == and < being nontransitive:

$ php -r 'var_dump(TRUE == "a"); var_dump("a" == 0); var_dump(TRUE == 0);'
bool(true)
bool(true)
bool(false)

$ php -r 'var_dump(-INF < 0); var_dump(0 < TRUE); var_dump(-INF < TRUE);'
bool(true)
bool(true)
bool(false)
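If you want to hunt for more cases like these, a brute-force search is enough. The following is a minimal sketch, not code from this article: find_nontransitive_equal() is a hypothetical helper name, and the extra values in the list are just guesses at likely troublemakers. What it reports depends on your PHP version (PHP 8.0 changed how numbers compare with non-numeric strings), so no expected output is shown.

```php
<?php
// Hypothetical helper, not from the original article: returns every
// [i, j, k] where $values[i] == $values[j] and $values[j] == $values[k],
// yet $values[i] != $values[k].
function find_nontransitive_equal(array $values) {
    $found = array();
    foreach ($values as $i => $a) {
        foreach ($values as $j => $b) {
            foreach ($values as $k => $c) {
                if ($a == $b && $b == $c && $a != $c) {
                    $found[] = array($i, $j, $k);
                }
            }
        }
    }
    return $found;
}

// The three values from the session above, plus a few other usual suspects.
$values = array(TRUE, "a", 0, NULL, "0", "");

foreach (find_nontransitive_equal($values) as $triple) {
    list($i, $j, $k) = $triple;
    print var_export($values[$i], true) . " == "
        . var_export($values[$j], true) . " == "
        . var_export($values[$k], true) . ", but the two ends are != \n";
}
```

Each violation shows up twice, once in each direction, because == is at least symmetric.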

Let's examine the rules for comparison in PHP. Here is a digraph of a variety of values and how they compare; a green line means ==, and an arrow from A to B means A < B:

To keep the digraphs a little smaller, [...] stands for array(...), and {k:"v"} stands for (object)array("k" => "v").

We can try to simplify it a little by ignoring equality:

We can try to simplify it a little more by performing a transitive reduction:

There are two problems with this digraph. The first is that a transitive reduction throws away a lot of information! Some of the values in the digraph are equal to others rather than greater or less than them, but this view ignores those relationships and assumes every removed edge is implied by the ones that remain. We'll try to resolve this in a moment, but first, this reveals the second problem: PHP's comparison operators are not only nontransitive, they're also circular! There are situations where A < B, B < C, and C < A! Here's one:

$ cat circular.php
<?php

$a = INF;
$b = array();
$c = (object)array();

var_dump($a < $b);
var_dump($b < $c);
var_dump($c < $a);


$ php circular.php
bool(true)
bool(true)
bool(true)
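Cycles like this can also be found mechanically. Here is a rough sketch under a few assumptions: find_cycles() is a hypothetical helper (it is not part of this article's code), it only looks for cycles of length three, and comparing an object with a number may emit a conversion notice or warning depending on your PHP version and error settings.

```php
<?php
// Hypothetical helper, not from the original article: returns every ordered
// triple of distinct keys [i, j, k] such that
// $values[i] < $values[j], $values[j] < $values[k] and $values[k] < $values[i].
function find_cycles(array $values) {
    $cycles = array();
    $keys = array_keys($values);
    foreach ($keys as $i) {
        foreach ($keys as $j) {
            foreach ($keys as $k) {
                if ($i === $j || $j === $k || $i === $k) {
                    continue; // only consider three distinct entries
                }
                if ($values[$i] < $values[$j]
                    && $values[$j] < $values[$k]
                    && $values[$k] < $values[$i]) {
                    $cycles[] = array($i, $j, $k);
                }
            }
        }
    }
    return $cycles;
}

// The three values from circular.php, keyed by a printable label.
$values = array(
    'INF'             => INF,
    'array()'         => array(),
    '(object)array()' => (object)array(),
);

foreach (find_cycles($values) as $cycle) {
    print implode(' < ', $cycle) . ' < ' . $cycle[0] . "\n";
}
```

Every cycle is reported once per starting point, so a single circular triple produces three lines.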

Because of this very unusual property, we can construct even more ridiculous situations:

<?php

function is_sorted($array) {
  $n = count($array);
  for ($a=0; $a<$n-1; $a++) {
    for ($b=$a+1; $b<$n; $b++) {
      if ($array[$a] > $array[$b]) {
        print "Array not sorted! \$array[$a] > \$array[$b]\n";
        return false;
      }
    }
  }
  return true;
}

$array = array(INF, array(), (object)array());

sort($array);

if (is_sorted($array)) {
  print "You would expect this, wouldn't you?\n";
} else {
  print "Result of sort(\$array) wasn't sorted! :(\n";
}

The above script produces:

Array not sorted! $array[0] > $array[2]
Result of sort($array) wasn't sorted! :(

That's right - in PHP, sort() can leave an array unsorted!
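One way to poke at this further is to feed sort() the same three values in every possible starting order and see what comes back. This is only a sketch: value_label() is a hypothetical helper for printing, and the results depend on the sort implementation in your PHP version, so no output is shown here. Newer PHP versions may also print conversion notices or warnings when an object is compared with a float.

```php
<?php
// Hypothetical helper, not from the original article: a short printable
// label for each kind of value used below.
function value_label($v) {
    if (is_object($v)) {
        return 'object';
    }
    if (is_array($v)) {
        return 'array';
    }
    return var_export($v, true);
}

$inf = INF;
$arr = array();
$obj = (object)array();

// All six starting orders of the three values from the script above.
$orderings = array(
    array($inf, $arr, $obj), array($inf, $obj, $arr),
    array($arr, $inf, $obj), array($arr, $obj, $inf),
    array($obj, $inf, $arr), array($obj, $arr, $inf),
);

foreach ($orderings as $input) {
    $sorted = $input;   // sort() works in place, so sort a copy
    sort($sorted);
    print implode(', ', array_map('value_label', $input)) . '  =>  '
        . implode(', ', array_map('value_label', $sorted)) . "\n";
}
```

If the comparison rules were a real ordering, every line would end with the same sequence.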

So how can we draw a picture of PHP's value relationships without losing data? Since comparisons aren't strictly transitive in PHP, we can't blindly do a transitive reduction, but there is a subset of values which are completely well-ordered (that is, comparisons within that set of values are internally transitive). In the digraph below, that set and its comparisons are blue.

The black nodes contain the values which violate transitivity in some of the cases. They are linked into the blue set with orange edges; these edges imply that the black node follows the ordering of the blue set as if it were at that position, but it has some exceptions which are shown in green and red: green edges indicate equality (==), while red edges indicate that the pair's relationship is inverted with respect to the blue ordering (that is, < and > are backwards for these pairs).
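To get a feel for how a transitive subset like the blue one could be found, here is a greedy sketch. It is not the method used to produce the digraphs (that code lives in the repository mentioned below); stays_transitive() and greedy_transitive_subset() are hypothetical names, and the result depends on the order in which candidates are offered.

```php
<?php
// Hypothetical helper, not from the original article: true if `<` is still
// transitive after $candidate joins $set.
function stays_transitive(array $set, $candidate) {
    $all = $set;
    $all[] = $candidate;
    foreach ($all as $a) {
        foreach ($all as $b) {
            foreach ($all as $c) {
                if ($a < $b && $b < $c && !($a < $c)) {
                    return false;
                }
            }
        }
    }
    return true;
}

// Hypothetical helper: keeps each value, in input order, only if it does
// not break transitivity of `<` among the values kept so far.
function greedy_transitive_subset(array $values) {
    $set = array();
    foreach ($values as $v) {
        if (stays_transitive($set, $v)) {
            $set[] = $v;
        }
    }
    return $set;
}

$candidates = array(NULL, FALSE, -INF, 0, TRUE, "a", INF);
foreach (greedy_transitive_subset($candidates) as $kept) {
    print var_export($kept, true) . "\n";
}
```

This only checks <, not ==, so it is a weaker test than the one the digraph encodes; treat the surviving values as candidates for a consistent set rather than proof of one.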

One notable value missing from these digraphs is NAN, which compares weirdly but consistently and makes the digraph even more confusing.
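If you're curious about NAN, the quickest way to see its behavior is to compare it against a few floats. This is a small sketch with a hypothetical helper, nan_relations(). Against other floats we'd expect all three comparisons to come back false on current PHP versions, following IEEE 754, but older versions and non-float operands may behave differently, so run it rather than trusting that.

```php
<?php
// Hypothetical helper, not from the original article: reports how NAN
// relates to $x under <, == and >.
function nan_relations($x) {
    return array(
        '<'  => NAN < $x,
        '==' => NAN == $x,
        '>'  => NAN > $x,
    );
}

foreach (array(-INF, 0.0, INF, NAN) as $x) {
    print var_export($x, true) . ': ' . json_encode(nan_relations($x)) . "\n";
}
```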

If you'd like to play with these digraphs yourself, the code to generate them is in a GitHub repository.

Significance: Consistency

Language consistency is very important for developer efficiency. Every inconsistent language feature means that developers have one more thing to remember, one more reason to rely on the documentation, or one more situation that breaks their focus. A consistent language lets developers create habits and expectations that work throughout the language, learn the language much more quickly, more easily locate errors, and have fewer things to keep track of at once.