@title PHP Pitfalls
@group php

This document discusses difficult traps and pitfalls in PHP, and how to avoid,
work around, or at least understand them.

= `array_merge()` is Incredibly Slow When Merging A List of Arrays =

If you merge a list of arrays like this:

  COUNTEREXAMPLE, lang=php
  $result = array();
  foreach ($list_of_lists as $one_list) {
    $result = array_merge($result, $one_list);
  }

...your program now has a runtime that grows quadratically, because it
generates a large number of intermediate arrays and copies every element it has
previously seen each time you iterate.

In an Arcanist environment, you can use @{function@arcanist:array_mergev}
instead.

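Outside Arcanist, one way to avoid the repeated copying is to hand every inner
array to a single `array_merge()` call. The sketch below is a minimal
illustration, not the implementation of `array_mergev()`; the helper name
`example_merge_all()` is made up for this example.

```lang=php
// Hypothetical helper (not part of PHP or Arcanist): merge a list of arrays
// with a single array_merge() call instead of one call per inner array.
function example_merge_all(array $list_of_lists) {
  if (!$list_of_lists) {
    // Older PHP versions reject array_merge() with no arguments.
    return array();
  }
  // array_values() drops any string keys from the outer list so that each
  // inner array is passed as a positional argument.
  return call_user_func_array('array_merge', array_values($list_of_lists));
}

$merged = example_merge_all(array(
  array(1, 2),
  array(3),
  array('k' => 4),
));
// $merged is array(0 => 1, 1 => 2, 2 => 3, 'k' => 4).
```

Each element is copied once here, rather than once per later iteration.
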
= `var_export()` Hates Baby Animals =

If you try to `var_export()` an object that contains recursive references, your
program will terminate. You have no chance to intercept or react to this or
otherwise stop it from happening. Avoid `var_export()` unless you are certain
you have only simple data. You can use `print_r()` or `var_dump()` to display
complex variables safely.

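As a small illustration of the safer alternative, this sketch builds an object
that refers to itself and dumps it with `print_r()`. The variable names are
arbitrary.

```lang=php
$node = new stdClass();
$node->name = 'root';
$node->self = $node;  // The object now contains a reference to itself.

// Passing true as the second argument returns the dump as a string.
$dump = print_r($node, true);
// print_r() marks the repeated object with "*RECURSION*" instead of
// descending into it again.
```
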
= `isset()`, `empty()` and Truthiness =

A value is "truthy" if it evaluates to true in an `if` clause:

  lang=php
  $value = something();
  if ($value) {
    // Value is truthy.
  }

If a value is not truthy, it is "falsey". These values are falsey in PHP:

  null      // null
  0         // integer
  0.0       // float
  "0"       // string
  ""        // empty string
  false     // boolean
  array()   // empty array

Disregarding some bizarre edge cases, all other values are truthy.

In addition to truth tests with `if`, PHP has two special truthiness operators
which look like functions but aren't: `empty()` and `isset()`. These operators
help deal with undeclared variables.

In PHP, there are two major cases where you get undeclared variables -- either
you directly use a variable without declaring it:

  COUNTEREXAMPLE, lang=php
  function f() {
    if ($not_declared) {
      // ...
    }
  }

...or you index into an array with an index which may not exist:

  COUNTEREXAMPLE, lang=php
  function f(array $mystery) {
    if ($mystery['stuff']) {
      // ...
    }
  }

When you do either of these, PHP issues a warning. Avoid these warnings by
using `empty()` and `isset()` to do tests that are safe to apply to undeclared
variables.

`empty()` evaluates truthiness exactly opposite of `if()`. `isset()` returns
`true` for everything except `null`. This is the truth table:

| Value           | `if()`  | `empty()` | `isset()` |
|-----------------|---------|-----------|-----------|
| `null`          | `false` | `true`    | `false`   |
| `0`             | `false` | `true`    | `true`    |
| `0.0`           | `false` | `true`    | `true`    |
| `"0"`           | `false` | `true`    | `true`    |
| `""`            | `false` | `true`    | `true`    |
| `false`         | `false` | `true`    | `true`    |
| `array()`       | `false` | `true`    | `true`    |
| Everything else | `true`  | `false`   | `true`    |

The value of these operators is that they accept undeclared variables and do
not issue a warning. Specifically, if you try to do this you get a warning:

```lang=php, COUNTEREXAMPLE
if ($not_previously_declared) { // PHP Notice: Undefined variable!
  // ...
}
```

But these are fine:

```lang=php
if (empty($not_previously_declared)) { // No notice, returns true.
  // ...
}
if (isset($not_previously_declared)) { // No notice, returns false.
  // ...
}
```

So, `isset()` really means
`is_declared_and_is_set_to_something_other_than_null()`. `empty()` really means
`is_falsey_or_is_not_declared()`. Thus:

  - If a variable is known to exist, test falsiness with `if (!$v)`, not
    `empty()`. In particular, test for empty arrays with `if (!$array)`. There
    is no reason to ever use `empty()` on a declared variable.
  - When you use `isset()` on an array key, like `isset($array['key'])`, it
    will evaluate to "false" if the key exists but has the value `null`! Test
    for index existence with `array_key_exists()`.

Put another way, use `isset()` if you want to type `if ($value !== null)` but
are testing something that may not be declared. Use `empty()` if you want to
type `if (!$value)` but you are testing something that may not be declared.

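The difference between `isset()` and `array_key_exists()` on a key whose value
is `null` can be seen in a short sketch. The array contents are invented for
illustration.

```lang=php
$row = array('name' => 'alice', 'nickname' => null);

// The key exists, but its value is null, so isset() reports false.
$isset_result = isset($row['nickname']);

// array_key_exists() only looks at whether the key is present.
$exists_result = array_key_exists('nickname', $row);
$missing_result = array_key_exists('email', $row);

// empty() on a missing key is true and does not issue a warning.
$empty_result = empty($row['email']);
```

Here `$isset_result` is `false`, `$exists_result` is `true`, `$missing_result`
is `false`, and `$empty_result` is `true`.
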
= Check for non-empty strings =

As already mentioned, you cannot just use an `if` or `empty()` to check for a
non-empty string, mostly because "0" is falsey. So you cannot rely on this sort
of thing to prevent users from making empty comments:

  COUNTEREXAMPLE, lang=php
  if ($comment_text) {
    make_comment($comment_text);
  }

This is wrong because it prevents users from making the comment "0".

//THE COMMENT "0" IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD
BETTER NOT BREAK IT!!!//

Another common approach //was// `strlen()`:

  COUNTEREXAMPLE, lang=php
  if (strlen($comment_text)) {
    make_comment($comment_text);
  }

But calling `strlen(null)` causes a deprecation warning since PHP 8.1. Also,
`strlen()` does more work than necessary just to check for a non-empty string.

In short, outside Phorge, this is a general way to check for non-empty strings
for most wild input types:

```lang=php
  $value_str = (string) $value;
  if ($value_str !== '') {
    // do something
  }
```

To do the same thing in Phorge, use this better and safer approach:

```lang=php
  $value_str = phutil_string_cast($value);
  if ($value_str !== '') {
    // do something
  }
```

And, if you are 100% sure that you are __only__ working with strings and
null, evaluate this instead:

```lang=php
  if (phutil_nonempty_string($value)) {
    // do something
  }
```

WARNING: The function `phutil_nonempty_string()` is designed to throw a nice
exception if it receives `true`, `false`, an array, an object or anything
alien that is not a string and not null. Check which types you can actually
receive before choosing it.

= `usort()`, `uksort()`, and `uasort()` are Slow =

This family of functions is often extremely slow for large datasets. You should
avoid them if at all possible. Instead, build an array which contains surrogate
keys that are naturally sortable with a function that uses native comparison
(e.g., `sort()`, `asort()`, `ksort()`, or `natcasesort()`). Sort this array
instead, and use it to reorder the original array.

In an Arcanist environment, you can often do this easily with
@{function@arcanist:isort} or @{function@arcanist:msort}.

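As a rough sketch of the surrogate-key approach in plain PHP, the example below
sorts a small set of rows by name using `asort()` on an array of surrogate
values rather than `usort()` with a callback. The data and variable names are
made up for illustration.

```lang=php
$users = array(
  'u1' => array('name' => 'Zoe'),
  'u2' => array('name' => 'Adam'),
  'u3' => array('name' => 'Mia'),
);

// Build a map from each original key to a naturally sortable value.
$surrogate = array();
foreach ($users as $key => $user) {
  $surrogate[$key] = $user['name'];
}

// asort() sorts by value with native comparison and preserves the keys.
asort($surrogate);

// Rebuild the original array in the sorted key order.
$sorted = array();
foreach ($surrogate as $key => $ignored) {
  $sorted[$key] = $users[$key];
}
// array_keys($sorted) is now array('u2', 'u3', 'u1').
```
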
= `array_intersect()` and `array_diff()` are Also Slow =

These functions are much slower for even moderately large inputs than
`array_intersect_key()` and `array_diff_key()`, because they cannot make the
assumption that their inputs are unique scalars as the `key` varieties can.
Strongly prefer the `key` varieties.

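When the values you want to compare are integers or strings, one way to use the
`key` varieties is to turn the values into keys first. This is only a sketch:
it collapses duplicate values, and numeric strings come back as integers.

```lang=php
$left = array('apple', 'banana', 'cherry');
$right = array('banana', 'cherry', 'durian');

// Turn the values into keys. This only works for values that are valid
// array keys (integers or strings).
$left_map = array_fill_keys($left, true);
$right_map = array_fill_keys($right, true);

$common = array_keys(array_intersect_key($left_map, $right_map));
$only_left = array_keys(array_diff_key($left_map, $right_map));
// $common is array('banana', 'cherry'); $only_left is array('apple').
```
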
= `array_uintersect()` and `array_udiff()` are Definitely Slow Too =

These functions have the problems of both the `usort()` family and the
`array_diff()` family. Avoid them.

= `foreach()` Does Not Create Scope =

Variables survive outside of the scope of `foreach()`. More problematically,
references survive outside of the scope of `foreach()`. This code mutates
`$array` because the reference leaks from the first loop to the second:

```lang=php, COUNTEREXAMPLE
$array = range(1, 3);
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as &$value) {}
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as $value) {}
echo implode(',', $array); // Outputs '1,2,2'
```

The easiest way to avoid this is to avoid using foreach-by-reference. If you do
use it, unset the reference after the loop:

```lang=php
foreach ($array as &$value) {
  // ...
}
unset($value);
```

= `unserialize()` is Incredibly Slow on Large Datasets =

The performance of `unserialize()` is nonlinear in the number of zvals you
unserialize, roughly `O(N^2)`.

| zvals    | Approximate time |
|----------|------------------|
| 10000    | 5ms              |
| 100000   | 85ms             |
| 1000000  | 8,000ms          |
| 10000000 | 72 billion years |

= `call_user_func()` Breaks References =

If you use `call_user_func()` to invoke a function which takes parameters by
reference, the variables you pass in will have their references broken and will
emerge unmodified. That is, if you have a function that takes references:

```lang=php
function add_one(&$v) {
  $v++;
}
```

...and you call it with `call_user_func()`:

```lang=php, COUNTEREXAMPLE
$x = 41;
call_user_func('add_one', $x);
```

...`$x` will not be modified. The solution is to use `call_user_func_array()`
and wrap the reference in an array:

```lang=php
$x = 41;
call_user_func_array(
  'add_one',
  array(&$x)); // Note '&$x'!
```

This will work as expected.

= You Can't Throw From `__toString()` =

If you throw from `__toString()` on PHP versions before 7.4, your program will
terminate uselessly and you won't get the exception. PHP 7.4 and newer allow
exceptions to propagate out of `__toString()`.

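For code that may run on PHP versions where `__toString()` must not throw, one
defensive pattern is to catch the exception inside `__toString()` and return a
fallback string. This is a minimal sketch; `ExampleLabel` and its renderer
callback are hypothetical names invented for this example.

```lang=php
// Hypothetical class: wraps a callback that produces the label text.
final class ExampleLabel {
  private $renderer;

  public function __construct($renderer) {
    $this->renderer = $renderer;
  }

  public function __toString() {
    try {
      return (string)call_user_func($this->renderer);
    } catch (Exception $ex) {
      // Never let the exception escape; describe the failure instead.
      return '<render failed: '.$ex->getMessage().'>';
    }
  }
}

$label = new ExampleLabel(function () {
  throw new Exception('no data');
});
$text = (string)$label;  // '<render failed: no data>'
```
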
= An Object Can Have Any Scalar as a Property =

Object properties are not limited to legal variable names:

```lang=php
$obj = new stdClass();
$property = '!@#$%^&*()';
$obj->$property = 'zebra';
echo $obj->$property; // Outputs 'zebra'.
```

So, don't make assumptions about property names.

= There is an `(object)` Cast =

You can cast a dictionary into an object.

```lang=php
$obj = (object)array('flavor' => 'coconut');
echo $obj->flavor; // Outputs 'coconut'.
echo get_class($obj); // Outputs 'stdClass'.
```

This is occasionally useful, mostly to force an object to become a JavaScript
dictionary (vs a list) when passed to `json_encode()`.

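The effect on `json_encode()` is easiest to see with empty and list-like
arrays, as in this short sketch:

```lang=php
$empty_list = json_encode(array());            // '[]'
$empty_dict = json_encode((object)array());    // '{}'
$list = json_encode(array('a', 'b'));          // '["a","b"]'
$dict = json_encode((object)array('a', 'b'));  // '{"0":"a","1":"b"}'
```
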
= Invoking `new` With an Argument Vector is Really Hard =

If you have some `$class_name` and some `$argv` of constructor arguments
and you want to do this:

```lang=php
new $class_name($argv[0], $argv[1], ...);
```

...you'll probably invent a very interesting, very novel solution that is very
wrong. In an Arcanist environment, solve this problem with
@{function@arcanist:newv}. Elsewhere, copy `newv()`'s implementation.

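One approach that works in plain PHP is reflection. The sketch below is written
for illustration and is not necessarily how `newv()` is implemented; the names
`example_new_with_args()` and `ExamplePoint` are hypothetical. On PHP 5.6 and
newer, argument unpacking (`new $class_name(...$argv)`) is another option.

```lang=php
// Hypothetical helper: construct $class_name with a list of arguments.
function example_new_with_args($class_name, array $argv) {
  $reflector = new ReflectionClass($class_name);
  if ($argv) {
    // array_values() keeps the arguments positional.
    return $reflector->newInstanceArgs(array_values($argv));
  }
  return $reflector->newInstance();
}

// Hypothetical class used only to demonstrate the helper.
final class ExamplePoint {
  public $x;
  public $y;

  public function __construct($x = 0, $y = 0) {
    $this->x = $x;
    $this->y = $y;
  }
}

$point = example_new_with_args('ExamplePoint', array(3, 4));
$origin = example_new_with_args('ExamplePoint', array());
```
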
= Equality is not Transitive =

This isn't terribly surprising since equality isn't transitive in a lot of
languages, but the `==` operator is not transitive. On PHP 7 and earlier:

```lang=php
$a = ''; $b = 0; $c = '0a';
$a == $b; // true
$b == $c; // true
$c == $a; // false!
```

In those versions, when either operand is an integer, the other operand is cast
to an integer before comparison. PHP 8 changed how numbers are compared to
non-numeric strings, so all three comparisons above are `false` there, but `==`
is still not transitive for other combinations of values. Avoid this and
similar pitfalls by using the `===` operator, which is transitive.

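A sketch of a non-transitive triple that does not depend on comparing numbers
with non-numeric strings uses `null`, `0` and `'0'`:

```lang=php
$a = null; $b = 0; $c = '0';

$ab = ($a == $b);  // true
$bc = ($b == $c);  // true: '0' is a numeric string
$ac = ($a == $c);  // false: null is compared to '0' as the empty string

// Strict comparison treats all three values as distinct.
$any_strict = ($a === $b) || ($b === $c) || ($a === $c);  // false
```
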
= All 676 Letters in the Alphabet =

This doesn't do what you'd expect it to do in C:

```lang=php
for ($c = 'a'; $c <= 'z'; $c++) {
  // ...
}
```

This is because the successor to `z` is `aa`, which is "less than" `z`.
The loop will run for 676 iterations, through `yz`, and terminates only when
`$c` reaches `za`. That is, `$c` will take on these values:

```
a
b
...
y
z
aa // loop continues because 'aa' <= 'z'
ab
...
mf
mg
...
yx
yy
yz
za // loop now terminates because 'za' > 'z'
```

Instead, use this loop:

```lang=php
foreach (range('a', 'z') as $c) {
  // ...
}
```

= PHP casts all-digit array keys from string to int =

An array key which is a string that contains a decimal int (with no leading
zeros) will be cast to the int type:

```lang=php
$key0 = "main";
$key1 = "123";
$key2 = "0123";
$array = array($key0 => "foo", $key1 => "foo", $key2 => "foo");
foreach ($array as $key => $value) {
  print(gettype($key)."\n");
}
```

This prints `string`, `integer`, `string`.

Thus running `phutil_nonempty_string($key)` complains that it expected null or
a string but got int.

Avoid this by either explicitly casting via `(string)$key`, or by using
`phutil_nonempty_scalar($key)` instead of `phutil_nonempty_string($key)`.
395a string but got int.
396
397Avoid this by either explicitly casting via `(string)$key`, or by using
398`phutil_nonempty_scalar($key)` instead of `phutil_nonempty_string($key)`.