@title PHP Pitfalls
@group php

This document discusses difficult traps and pitfalls in PHP, and how to avoid,
work around, or at least understand them.

= `array_merge()` is Incredibly Slow When Merging A List of Arrays =

If you merge a list of arrays like this:

  COUNTEREXAMPLE, lang=php
  $result = array();
  foreach ($list_of_lists as $one_list) {
    $result = array_merge($result, $one_list);
  }

...your program now has a huge runtime because it generates a large number of
intermediate arrays and copies every element it has previously seen each time
you iterate.

In an Arcanist environment, you can use @{function@arcanist:array_mergev}
instead.

= `var_export()` Hates Baby Animals =

If you try to `var_export()` an object that contains recursive references, your
program will terminate. You have no chance to intercept or react to this or
otherwise stop it from happening. Avoid `var_export()` unless you are certain
you have only simple data. You can use `print_r()` or `var_dump()` to display
complex variables safely.

= `isset()`, `empty()` and Truthiness =

A value is "truthy" if it evaluates to true in an `if` clause:

  lang=php
  $value = something();
  if ($value) {
    // Value is truthy.
  }

If a value is not truthy, it is "falsey". These values are falsey in PHP:

  null      // null
  0         // integer
  0.0       // float
  "0"       // string
  ""        // empty string
  false     // boolean
  array()   // empty array

Disregarding some bizarre edge cases, all other values are truthy.

In addition to truth tests with `if`, PHP has two special truthiness operators
which look like functions but aren't: `empty()` and `isset()`. These operators
help deal with undeclared variables.

In PHP, there are two major cases where you get undeclared variables -- either
you directly use a variable without declaring it:

  COUNTEREXAMPLE, lang=php
  function f() {
    if ($not_declared) {
      // ...
    }
  }

...or you index into an array with an index which may not exist:

  COUNTEREXAMPLE
  function f(array $mystery) {
    if ($mystery['stuff']) {
      // ...
    }
  }

When you do either of these, PHP issues a warning. Avoid these warnings by
using `empty()` and `isset()` to do tests that are safe to apply to undeclared
variables.

`empty()` evaluates truthiness exactly opposite of `if()`. `isset()` returns
`true` for everything except `null`. This is the truth table:

| Value           | `if()`  | `empty()` | `isset()` |
|-----------------|---------|-----------|-----------|
| `null`          | `false` | `true`    | `false`   |
| `0`             | `false` | `true`    | `true`    |
| `0.0`           | `false` | `true`    | `true`    |
| `"0"`           | `false` | `true`    | `true`    |
| `""`            | `false` | `true`    | `true`    |
| `false`         | `false` | `true`    | `true`    |
| `array()`       | `false` | `true`    | `true`    |
| Everything else | `true`  | `false`   | `true`    |

The value of these operators is that they accept undeclared variables and do
not issue a warning. Specifically, if you try to do this you get a warning:

```lang=php, COUNTEREXAMPLE
if ($not_previously_declared) {         // PHP Notice:  Undefined variable!
  // ...
}
```

But these are fine:

```lang=php
if (empty($not_previously_declared)) {  // No notice, returns true.
  // ...
}
if (isset($not_previously_declared)) {  // No notice, returns false.
  // ...
}
```

So, `isset()` really means
`is_declared_and_is_set_to_something_other_than_null()`. `empty()` really means
`is_falsey_or_is_not_declared()`. Thus:

  - If a variable is known to exist, test falsiness with `if (!$v)`, not
    `empty()`. In particular, test for empty arrays with `if (!$array)`. There
    is no reason to ever use `empty()` on a declared variable.
  - When you use `isset()` on an array key, like `isset($array['key'])`, it
    will evaluate to "false" if the key exists but has the value `null`! Test
    for index existence with `array_key_exists()`.

Put another way, use `isset()` if you want to type `if ($value !== null)` but
are testing something that may not be declared. Use `empty()` if you want to
type `if (!$value)` but you are testing something that may not be declared.

= Check for non-empty strings =

As already mentioned, you cannot just use an `if` or `empty()` to check for a
non-empty string, mostly because "0" is falsey, so you cannot rely on this sort
of thing to prevent users from making empty comments:

  COUNTEREXAMPLE
  if ($comment_text) {
    make_comment($comment_text);
  }

This is wrong because it prevents users from making the comment "0".

//THE COMMENT "0" IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD
BETTER NOT BREAK IT!!!//

Another approach //used to be// `strlen()`:

  COUNTEREXAMPLE
  if (strlen($comment_text)) {
    make_comment($comment_text);
  }

But passing `null` to `strlen()` has raised a deprecation warning since PHP
8.1. Also, `strlen()` does more work than is needed just to check for a
non-empty string.

Outside Phorge, this is a general way to check for non-empty strings that works
for most wild input types:

```lang=php
$value_str = (string) $value;
if ($value_str !== '') {
  // do something
}
```

To do the same thing in Phorge, use this better and safer approach:

```lang=php
$value_str = phutil_string_cast($value);
if ($value_str !== '') {
  // do something
}
```

And, if you are 100% sure that you are __only__ working with strings and
null, evaluate this instead:

```lang=php
if (phutil_nonempty_string($value)) {
  // do something
}
```

WARNING: The function `phutil_nonempty_string()` is designed to throw a nice
exception if it receives `true`, `false`, an array, an object or anything
alien that is not a string and not null. Check which types your value can
actually have before using it.

= usort(), uksort(), and uasort() are Slow =

This family of functions is often extremely slow for large datasets. You should
avoid them if at all possible. Instead, build an array which contains surrogate
keys that are naturally sortable with a function that uses native comparison
(e.g., `sort()`, `asort()`, `ksort()`, or `natcasesort()`). Sort this array
instead, and use it to reorder the original array.

In an Arcanist environment, you can often do this easily with
@{function@arcanist:isort} or @{function@arcanist:msort}.

= `array_intersect()` and `array_diff()` are Also Slow =

These functions are much slower for even moderately large inputs than
`array_intersect_key()` and `array_diff_key()`, because they cannot make the
assumption that their inputs are unique scalars as the `key` varieties can.
Strongly prefer the `key` varieties.

= `array_uintersect()` and `array_udiff()` are Definitely Slow Too =

These functions have the problems of both the `usort()` family and the
`array_diff()` family. Avoid them.

= `foreach()` Does Not Create Scope =

Variables survive outside of the scope of `foreach()`. More problematically,
references survive outside of the scope of `foreach()`. This code mutates
`$array` because the reference leaks from the first loop to the second:

```lang=php, COUNTEREXAMPLE
$array = range(1, 3);
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as &$value) {}
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as $value) {}
echo implode(',', $array); // Outputs '1,2,2'
```

The easiest way to avoid this is to avoid using foreach-by-reference. If you do
use it, unset the reference after the loop:

```lang=php
foreach ($array as &$value) {
  // ...
}
unset($value);
```

= `unserialize()` is Incredibly Slow on Large Datasets =

The performance of `unserialize()` is nonlinear in the number of zvals you
unserialize, roughly `O(N^2)`.

| zvals    | Approximate time |
|----------|------------------|
| 10000    | 5ms              |
| 100000   | 85ms             |
| 1000000  | 8,000ms          |
| 10000000 | 72 billion years |

= `call_user_func()` Breaks References =

If you use `call_user_func()` to invoke a function which takes parameters by
reference, the variables you pass in will have their references broken and will
emerge unmodified. That is, if you have a function that takes references:

```lang=php
function add_one(&$v) {
  $v++;
}
```

...and you call it with `call_user_func()`:

```lang=php, COUNTEREXAMPLE
$x = 41;
call_user_func('add_one', $x);
```

...`$x` will not be modified. The solution is to use `call_user_func_array()`
and wrap the reference in an array:

```lang=php
$x = 41;
call_user_func_array(
  'add_one',
  array(&$x)); // Note '&$x'!
```

This will work as expected.

= You Can't Throw From `__toString()` =

Before PHP 7.4, if you throw from `__toString()`, your program will terminate
uselessly and you won't get the exception.

= An Object Can Have Any Scalar as a Property =

Object properties are not limited to legal variable names:

```lang=php
$obj = new stdClass();
$property = '!@#$%^&*()';
$obj->$property = 'zebra';
echo $obj->$property;       // Outputs 'zebra'.
```

So, don't make assumptions about property names.

= There is an `(object)` Cast =

You can cast a dictionary into an object.

```lang=php
$obj = (object)array('flavor' => 'coconut');
echo $obj->flavor;      // Outputs 'coconut'.
echo get_class($obj);   // Outputs 'stdClass'.
```

This is occasionally useful, mostly to force an object to become a JavaScript
dictionary (vs a list) when passed to `json_encode()`.

= Invoking `new` With an Argument Vector is Really Hard =

If you have some `$class_name` and some `$argv` of constructor arguments
and you want to do this:

```lang=php
new $class_name($argv[0], $argv[1], ...);
```

...you'll probably invent a very interesting, very novel solution that is very
wrong. In an Arcanist environment, solve this problem with
@{function@arcanist:newv}. Elsewhere, copy `newv()`'s implementation.

= Equality is not Transitive =

This isn't terribly surprising since equality isn't transitive in a lot of
languages, but the `==` operator is not transitive:

```lang=php
$a = ''; $b = 0; $c = '0a';
$a == $b; // true
$b == $c; // true
$c == $a; // false!
```

When either operand is an integer, the other operand is cast to an integer
before comparison. (This is the behavior before PHP 8.0, which changed how
numbers compare with non-numeric strings, so the first two results above become
`false` there.) Avoid this and similar pitfalls by using the `===` operator,
which is transitive.

= All 676 Letters in the Alphabet =

This doesn't do what you'd expect it to do in C:

```lang=php
for ($c = 'a'; $c <= 'z'; $c++) {
  // ...
}
```

This is because the successor to `z` is `aa`, which is "less than" `z`.
The loop will run for ~700 iterations until it reaches `zz` and terminates.
That is, `$c` will take on these values:

```
a
b
...
y
z
aa // loop continues because 'aa' <= 'z'
ab
...
mf
mg
...
zw
zx
zy
zz // loop now terminates because 'zz' > 'z'
```

Instead, use this loop:

```lang=php
foreach (range('a', 'z') as $c) {
  // ...
}
```

= PHP casts all-digit array keys from string to int =

An array key which is a string that contains a decimal int will be cast to the
int type:

```lang=php
$key0 = "main";
$key1 = "123";
$key2 = "0123";
$array = array($key0 => "foo", $key1 => "foo", $key2 => "foo");
foreach ($array as $key => $value) {
  print(gettype($key)."\n");
}
```

This prints `string`, `integer`, `string`.

Thus running `phutil_nonempty_string($key)` complains that it expected null or
a string but got int.

Avoid this by either explicitly casting via `(string)$key`, or by using
`phutil_nonempty_scalar($key)` instead of `phutil_nonempty_string($key)`.