Empty T_ENCAPSED_AND_WHITESPACE tokens
Which of these lines contains a syntax error?
print "$array[$a]";      # 1: variable index
print "$array['a']";     # 2: string index
print "$array[a]";       # 3: undefined constant index
print "array['a']";      # 4: embedded single quotes, no variables
print "{$array['a']}";   # 5: index with "complex curly" syntax
I'll give you a hint; here's the syntax error:
Parse error: syntax error, unexpected '' (T_ENCAPSED_AND_WHITESPACE), expecting ...
The syntax error is in line #2, and it seems to be related to how PHP tokenizes that kind of expression. Here's a program that shows how PHP tokenizes that line:
<?php
$str = '<?php print "$array[\'a\']";';
foreach (token_get_all($str) as $t) {
    if (is_array($t)) {
        $n = token_name($t[0]);
        $v = $t[1];
    } else {
        $n = "(literal)";
        $v = $t;
    }
    printf("%26s: %s\n", $n, var_export($v,TRUE));
}
It produces this:
                T_OPEN_TAG: '<?php '
                   T_PRINT: 'print'
              T_WHITESPACE: ' '
                 (literal): '"'
                T_VARIABLE: '$array'
                 (literal): '['
 T_ENCAPSED_AND_WHITESPACE: ''
 T_ENCAPSED_AND_WHITESPACE: '\'a\']'
                 (literal): '"'
                 (literal): ';'
The token the syntax error is stuck on, a seemingly empty ('') T_ENCAPSED_AND_WHITESPACE, can be seen there. I'm not sure why the tokenizer generates it, especially since none of the other example lines above generate anything like it. It also mysteriously decides that the closing square bracket belongs to the T_ENCAPSED_AND_WHITESPACE token that follows the empty one, rather than being a token of its own.
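To check the claim that only line #2 produces such a token, a sketch along these lines could run all five example lines through token_get_all() and report any token whose text is empty. The helper name find_empty_tokens is made up for this illustration, the type declarations assume PHP 7 or later, and the exact token stream may differ between PHP versions, so no expected output is shown.

```php
<?php
// Hypothetical helper (not part of PHP): returns the names of all tokens
// in $code whose text is the empty string.
function find_empty_tokens(string $code): array
{
    $empty = [];
    foreach (token_get_all($code) as $t) {
        // Array tokens are [id, text, line]; single-character tokens are plain strings.
        if (is_array($t) && $t[1] === '') {
            $empty[] = token_name($t[0]);
        }
    }
    return $empty;
}

// The five example lines, single-quoted so nothing is interpolated here.
$lines = [
    1 => 'print "$array[$a]";',
    2 => 'print "$array[\'a\']";',
    3 => 'print "$array[a]";',
    4 => 'print "array[\'a\']";',
    5 => 'print "{$array[\'a\']}";',
];

foreach ($lines as $n => $line) {
    $names = find_empty_tokens('<?php ' . $line);
    printf("line %d: %s\n", $n, $names ? implode(', ', $names) : '(no empty tokens)');
}
```

Because token_get_all() only tokenizes by default and does not parse, the script itself runs without hitting the syntax error it is investigating.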
Conversely, here's line #5 ("complex curly" syntax) with some annotations:
                T_OPEN_TAG: '<?php '
                   T_PRINT: 'print'
              T_WHITESPACE: ' '
                 (literal): '"'
              T_CURLY_OPEN: '{'
                T_VARIABLE: '$array'
                 (literal): '['    # no empty token here
T_CONSTANT_ENCAPSED_STRING: '\'a\''
                 (literal): ']'    # correct discovery of closing square bracket
                 (literal): '}'
                 (literal): '"'
                 (literal): ';'
And, for comparison, here's what the tokenizer does when it doesn't think it's parsing a variable in a string (line #4):
                T_OPEN_TAG: '<?php '
                   T_PRINT: 'print'
              T_WHITESPACE: ' '
T_CONSTANT_ENCAPSED_STRING: '"array[\'a\']"'
                 (literal): ';'
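For reference, here is a minimal sketch of the forms from the quiz that do parse when a string key is embedded in a double-quoted string (lines #1, #3, and #5). The contents of $array are sample data invented for this illustration.

```php
<?php
// Sample data invented for this illustration.
$array = ['a' => 'apple'];
$a = 'a';

$results = [
    "$array[$a]",      // line #1: variable index
    "$array[a]",       // line #3: unquoted key, read as the string 'a' inside a string
    "{$array['a']}",   // line #5: quoted key inside "complex curly" syntax
];

foreach ($results as $r) {
    print $r . "\n";
}
```

All three forms look up the same element, so each prints apple. Only the quoted key without curly braces (line #2) fails to parse.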
Significance: Fast Debugging
It is very important to be able to debug issues in your application quickly. When every second of downtime costs your company money, bad error messages can mean thousands of dollars in unnecessary losses and hours of wasted developer time. Languages that aim to be used in large applications need to ensure that developers can quickly discern the cause of an issue.