@title Troubleshooting Performance Problems
@group fieldmanual

Guide to troubleshooting slow pages and hangs.

Overview
========

This document describes how to isolate, examine, understand and resolve or
report performance issues like slow pages and hangs.

This document covers the general process for handling performance problems,
and outlines the major tools available for understanding them:

  - **Multimeter** helps you understand sources of load and broad resource
    utilization. This is a coarse, high-level tool.
  - **DarkConsole** helps you dig into a specific slow page and understand
    service calls. This is a general, mid-level tool.
  - **XHProf** gives you detailed application performance profiles. This
    is a fine-grained, low-level tool.

Performance and the Upstream
============================

Performance issues and hangs will often require upstream involvement to fully
resolve. The intent is for Phorge to perform well in all reasonable cases,
not require tuning for different workloads (as long as those workloads are
generally reasonable). Poor performance with a reasonable workload is likely a
bug, not a configuration problem.

However, some pages are slow because Phorge legitimately needs to do a lot
of work to generate them. For example, if you write a 100MB wiki document,
Phorge will need substantial time to process it, it will take a long time
to download over the network, and your browser will probably not be able to
render it especially quickly.

We may be able to improve performance in some cases, but Phorge is not
magic and cannot wish away real complexity. The best solution to these problems
is usually to find another way to solve your problem: for example, maybe the
100MB document can be split into several smaller documents.

Here are some examples of performance problems under reasonable workloads that
the upstream can help resolve:

  - {icon check, color=green} Commenting on a file and mentioning that same
    file results in a hang.
  - {icon check, color=green} Creating a new user takes many seconds.
  - {icon check, color=green} Loading Feed hangs on 32-bit systems.

The upstream will be less able to help resolve problems caused by unusual
workloads with high inherent complexity, like these:

  - {icon times, color=red} A 100MB wiki page takes a long time to render.
  - {icon times, color=red} A Turing-complete simulation of Conway's Game of
    Life implemented in 958,000 Herald rules executes slowly.
  - {icon times, color=red} Uploading an 8GB file takes several minutes.

Generally, the path forward will be:

  - Follow the instructions in this document to gain the best understanding of
    the issue (and of how to reproduce it) that you can.
  - In particular, is it being caused by an unusual workload (like a 100MB
    wiki page)? If so, consider other ways to solve the problem.
  - File a report with the upstream by following the instructions in
    @{article:Contributing Bug Reports}.

The remaining sections in this document walk through these steps.


Understanding Performance Problems
==================================

To isolate, examine, and understand performance problems, follow these steps:

**General Slowness**: If you are experiencing generally poor performance, use
Multimeter to understand resource usage and look for load-based causes. See
@{article:Multimeter User Guide}. If that isn't fruitful, treat this like a
reproducible performance problem on an arbitrary page.

**Hangs**: If you are experiencing hangs (pages which never return, or which
time out with a fatal after some number of seconds), they are almost always
the result of bugs in the upstream. Report them by following these
instructions:

  - Set `debug.time-limit` to a value like `5`.
  - Reproduce the hang. The page should exit after 5 seconds with a more useful
    stack trace.
  - File a report with the reproduction instructions and the stack trace in
    the upstream. See @{article:Contributing Bug Reports} for detailed
    instructions.
  - Clear `debug.time-limit` again to take your install out of debug mode.

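The steps above can be sketched from the command line. This is only a minimal
sketch: it assumes you manage this setting in local configuration with
`bin/config` (rather than through the web UI) and that your install lives at
the hypothetical path `/var/www/phorge`. The helper names (`run`,
`enable_time_limit`, `clear_time_limit`) are illustrative and are not part of
Phorge. The helpers only print the commands unless `RUN=1` is set, so you can
review them before running anything.

```shell
# Hypothetical install location; override PHORGE_ROOT for your system.
PHORGE_ROOT="${PHORGE_ROOT:-/var/www/phorge}"

# Print a command by default; execute it only when RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Make hung pages exit with a stack trace after the given number of seconds.
enable_time_limit() {
  run "$PHORGE_ROOT/bin/config" set debug.time-limit "$1"
}

# Remove the limit again to take the install out of debug mode.
clear_time_limit() {
  run "$PHORGE_ROOT/bin/config" delete debug.time-limit
}

enable_time_limit 5
# ...reproduce the hang and save the stack trace here...
clear_time_limit
```
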
If the reproduction instructions include "Create a 100MB wiki page",
the upstream may be less sympathetic to your cause than if reproducing the
issue does not require an unusual, complex workload.

In some cases, the hang may really just be a very large amount of processing
time. If you're very excited about 100MB wiki pages and don't mind waiting many
minutes for them to render, you may be able to adjust `max_execution_time` in
your PHP configuration to allow the process enough time to complete, or adjust
settings in your webserver config to let it wait longer for results.

**DarkConsole**: If you have a reproducible performance problem (for example,
loading a specific page is very slow), you can enable DarkConsole (a built-in
debugging console) to examine page performance in detail.

The two most useful tabs in DarkConsole are the "Services" tab and the
"XHProf" tab.

The "Services" tab allows you to examine service calls (network calls,
subprocesses, events, etc.) and find slow queries, slow services, inefficient
query plans, and unnecessary calls. Broadly, you're looking for slow or
repeated service calls, or calls which don't make sense given what the page
should be doing.

After installing XHProf (see @{article:Using XHProf}) you'll gain access to the
"XHProf" tab, which is a full tracing profiler. You can use the "Profile Page"
button to generate a complete trace of where a page is spending time. When
reading a profile, you're looking for the overall use of time, and for anything
which sticks out as taking unreasonably long or not making sense.

See @{article:Using DarkConsole} for complete instructions on configuring
and using DarkConsole.

**AJAX Requests**: To debug Ajax requests, activate DarkConsole and then turn
on the profiler or query analyzer on the main request by clicking the
appropriate button. The setting will cascade to Ajax requests made by the page
and they'll show up in the console with full query analysis or profiling
information.

**Command-Line Hangs**: If you have a script or daemon hanging, you can send
it `SIGHUP` to have it dump a stack trace to `sys_get_temp_dir()` (usually
`/tmp`).

Do this with:

```
$ kill -HUP <pid>
```

You can use this command to figure out where the system's temporary directory
is:

```
$ php -r 'echo sys_get_temp_dir()."\n";'
```

On most systems, this is `/tmp`. The trace should appear in that directory with
a name like `phabricator_backtrace_<pid>`. Examining this trace may provide
a key to understanding the problem.

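Putting the two commands above together, a minimal sketch might look like the
following. The function names `backtrace_path` and `dump_backtrace` are
illustrative only, and the one-second pause before looking for the file is an
assumption rather than documented behavior. The sketch also assumes `php` is
on your `PATH` and that you have permission to signal the process.

```shell
# Build the expected trace path from a temporary directory and a process ID.
# The "phabricator_backtrace_" prefix is the file name described above.
backtrace_path() {
  printf '%s/phabricator_backtrace_%s\n' "${1%/}" "$2"
}

# Send SIGHUP to a hung script or daemon, then print the trace if it appears.
dump_backtrace() {
  target_pid="$1"
  temp_dir="$(php -r 'echo sys_get_temp_dir();')" || return 1
  kill -HUP "$target_pid" || return 1
  sleep 1  # Assumed grace period for the process to write the trace.
  trace_file="$(backtrace_path "$temp_dir" "$target_pid")"
  if [ -f "$trace_file" ]; then
    cat "$trace_file"
  else
    echo "No trace found at $trace_file" >&2
    return 1
  fi
}

# Usage: dump_backtrace <pid>
```
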
**Command-Line Performance**: If you have general performance issues with
command-line scripts, you can add `--trace` to see a service call log. This is
similar to the "Services" tab in DarkConsole. This may help identify issues.

After installing XHProf, you can also add `--xprofile <filename>` to emit a
detailed performance profile. You can `arc upload` these files and then view
them in XHProf from the web UI.

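As an illustration of the two flags above, the sketch below appends them to an
arbitrary script invocation. The `with_profiling` helper, the
`./bin/example-script` name, and the `profile.xhprof` filename are hypothetical
placeholders, and the sketch assumes the flags may follow the script's other
arguments. By default it only prints the command; it runs the command when
`RUN=1` is set.

```shell
# Print (or, with RUN=1, run) a command with a service call log and a profile.
with_profiling() {
  profile_file="$1"
  shift
  if [ "${RUN:-0}" = "1" ]; then
    "$@" --trace --xprofile "$profile_file"
  else
    printf '%s\n' "$* --trace --xprofile $profile_file"
  fi
}

# Hypothetical script name; substitute the script you are investigating.
with_profiling profile.xhprof ./bin/example-script

# Afterwards, upload the profile so it can be viewed from the web UI:
#   arc upload profile.xhprof
```
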
Next Steps
==========

If you've done all you can to isolate and understand the problem you're
experiencing, report it to the upstream. Include as much relevant data as
you can, such as:

  - reproduction instructions;
  - traces from `debug.time-limit` for hangs;
  - screenshots of service call logs from DarkConsole (review these carefully,
    as they can sometimes contain sensitive information);
  - traces from CLI scripts with `--trace`;
  - traces from sending `SIGHUP` to processes; and
  - XHProf profile files from `--xprofile` or "Download .xhprof Profile" in
    the web UI.

After collecting this information:

  - follow the instructions in @{article:Contributing Bug Reports} to file
    a report in the upstream.
176After collecting this information:
177
178 - follow the instructions in @{article:Contributing Bug Reports} to file
179 a report in the upstream.