@title Cluster: Partitioning and Advanced Configuration
@group cluster

Guide to partitioning Phorge applications across multiple database hosts.

Overview
========

You can partition Phorge's applications across multiple databases. For
example, you can move an application like Files or Maniphest to a dedicated
database host.

The advantages of doing this are:

  - moving heavily used applications to dedicated hardware can help you
    scale; and
  - you can match application workloads to hardware or configuration to make
    operating the cluster easier.

This configuration is complex, and very few installs will benefit from pursuing
it. Phorge will normally run comfortably with a single database master
even for large organizations.

Partitioning generally does not do much to increase resilience or make it
easier to recover from disasters, and is primarily a mechanism for scaling and
operational convenience.

If you are considering partitioning, you likely want to configure replication
with a single master first. Even if you choose not to deploy replication, you
should review and understand how replication works before you partition. For
details, see @{article:Cluster: Databases}.

Databases also support some advanced configuration options. Briefly:

  - `persistent`: Allows use of persistent connections, reducing pressure on
    outbound ports.

See "Advanced Configuration", below, for additional discussion.


What Partitioning Does
======================

When you partition Phorge, you move all of the data for one or more
applications (like Maniphest) to a new master database host. This is possible
because Phorge stores data for each application in its own logical
database (like `phorge_maniphest`) and performs no joins between databases.

If you're running into scale limits on a single master database, you can move
one or more of your most commonly-used applications to a second database host
and continue adding users. You can keep partitioning applications until all
heavily used applications have dedicated database servers.

Alternatively or additionally, you can partition applications to make operating
the cluster easier. Some applications have unusual workloads or requirements,
and moving them to separate hosts may make things easier to deal with overall.

For example: if Files accounts for most of the data on your install, you might
move it to a different host to make backing up everything else easier.


Configuration Overview
======================

To configure partitioning, you will add multiple entries to `cluster.databases`
with the `master` role. Each `master` should specify a new `partition` key,
which contains a list of application databases it should host.

One master may be specified as the `default` partition. Applications that are
not explicitly assigned elsewhere will be assigned here.

When you define multiple `master` databases, you must also specify which master
each `replica` database follows. Here's a simple example config:

```lang=json
...
"cluster.databases": [
  {
    "host": "db001.corporation.com",
    "role": "master",
    "user": "phorge",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "partition": [
      "default"
    ]
  },
  {
    "host": "db002.corporation.com",
    "role": "replica",
    "user": "phorge",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "master": "db001.corporation.com:3306"
  },
  {
    "host": "db003.corporation.com",
    "role": "master",
    "user": "phorge",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "partition": [
      "file",
      "passphrase",
      "slowvote"
    ]
  },
  {
    "host": "db004.corporation.com",
    "role": "replica",
    "user": "phorge",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "master": "db003.corporation.com:3306"
  }
],
...
```

In this configuration, `db001` is a master and `db002` replicates it.
`db003` is a second master, replicated by `db004`.

Applications have been partitioned like this:

  - `db003`/`db004`: Files, Passphrase, Slowvote
  - `db001`/`db002`: Default (all other applications)

Not all of the database partition names are the same as the application
names. You can get a list of databases with `bin/storage databases` to identify
the correct database names.

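As a rough illustration of the assignment rules above, the following Python
sketch resolves which master a partition name maps to. It is not Phorge's
implementation: the `ref`, `master_for`, and `replicas_of` helpers are
hypothetical, and `maniphest` is used only as an example of a name that is not
listed explicitly and therefore falls through to the default master.

```python
# Illustrative sketch only: mirrors the example config, not Phorge's own code.
CLUSTER_DATABASES = [
    {"host": "db001.corporation.com", "port": 3306, "role": "master",
     "partition": ["default"]},
    {"host": "db002.corporation.com", "port": 3306, "role": "replica",
     "master": "db001.corporation.com:3306"},
    {"host": "db003.corporation.com", "port": 3306, "role": "master",
     "partition": ["file", "passphrase", "slowvote"]},
    {"host": "db004.corporation.com", "port": 3306, "role": "replica",
     "master": "db003.corporation.com:3306"},
]


def ref(spec):
    """Format a database spec as "host:port", the form replicas refer to."""
    return f'{spec["host"]}:{spec["port"]}'


def master_for(database, specs):
    """Return the master assigned to a partition name (hypothetical helper)."""
    default = None
    for spec in specs:
        if spec["role"] != "master":
            continue
        partition = spec.get("partition", [])
        if database in partition:
            return ref(spec)  # explicitly assigned to this master
        if "default" in partition:
            default = ref(spec)  # remember the fallback master
    if default is None:
        raise LookupError(f"no master hosts {database!r} and no default is set")
    return default


def replicas_of(master_ref, specs):
    """List the replicas that follow the given master."""
    return [ref(s) for s in specs
            if s["role"] == "replica" and s.get("master") == master_ref]


print(master_for("file", CLUSTER_DATABASES))       # db003.corporation.com:3306
print(master_for("maniphest", CLUSTER_DATABASES))  # db001.corporation.com:3306
print(replicas_of("db003.corporation.com:3306", CLUSTER_DATABASES))
```
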
After you have configured partitioning, it needs to be committed to the
databases. This writes a copy of the configuration to tables on the databases,
preventing errors if a webserver accidentally starts with an old or invalid
configuration.

To commit the configuration, run this command:

```
phorge/ $ ./bin/storage partition
```

Run this command after making any partition or clustering changes. Webservers
will not serve traffic if their configuration and the database configuration
differ.


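Before committing, it can help to sanity-check the configuration against the
rules described in this section. The sketch below is a hypothetical standalone
check, not something `bin/storage` is known to provide. The rule that a
database may be listed on only one master is an assumption made for this
example.

```python
def check_partitions(specs):
    """Return a list of problems found in a cluster.databases-style list."""
    problems = []
    masters = [s for s in specs if s.get("role") == "master"]
    master_refs = {f'{s["host"]}:{s["port"]}' for s in masters}

    # At most one master may carry the "default" partition.
    defaults = [s["host"] for s in masters if "default" in s.get("partition", [])]
    if len(defaults) > 1:
        problems.append(f"multiple default masters: {defaults}")

    # Assumption: each database name is assigned to at most one master.
    owner = {}
    for s in masters:
        for name in s.get("partition", []):
            if name != "default" and name in owner:
                problems.append(
                    f"{name!r} is on both {owner[name]} and {s['host']}")
            owner.setdefault(name, s["host"])

    # With several masters, every replica must name the master it follows.
    if len(masters) > 1:
        for s in specs:
            if s.get("role") == "replica" and s.get("master") not in master_refs:
                problems.append(
                    f"replica {s['host']} does not follow a known master")
    return problems


config = [
    {"host": "db001", "port": 3306, "role": "master", "partition": ["default"]},
    {"host": "db003", "port": 3306, "role": "master", "partition": ["file"]},
    {"host": "db004", "port": 3306, "role": "replica"},  # missing "master"
]
for problem in check_partitions(config):
    print(problem)
```
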
Launching a New Partition
=========================

To add a new partition, follow these steps:

  - Set up the new database host or hosts.
  - Add the new database to `cluster.databases`, but keep its "partition"
    configuration empty (just an empty list). This will let Phorge interact
    with the new host, but won't send any traffic to it yet. If this is the
    first time you are partitioning, you will also need to configure your
    existing master as the new "default".
  - Run `bin/storage partition`.
  - Run `bin/storage upgrade` to initialize the schemata on the new hosts.
  - Stop writes to the applications you want to move by putting Phorge
    in read-only mode, or shutting down the webserver and daemons, or telling
    everyone not to touch anything.
  - Dump the data from the application databases on the old master.
  - Load the data into the application databases on the new master.
  - Reconfigure the "partition" setup so that Phorge knows the databases
    have moved.
  - Run `bin/storage partition`.
  - While still in read-only mode, check that all the data appears to be
    intact.
  - Resume writes.

You can do this with a small, rarely-used application first (on most installs,
Slowvote might be a good candidate) if you want to run through the process
end-to-end before performing a larger, higher-stakes migration.


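The dump and load steps above are left to standard MySQL tooling. As a minimal
sketch, the following Python script prints one possible command sequence for
moving a set of databases. It assumes `mysqldump` and `mysql` are available and
that databases use the `phorge_` prefix shown earlier. The `migration_plan`
helper and the dump file name are hypothetical. The script only prints
commands and does not run them, and the steps for stopping and resuming writes
stay manual.

```python
import shlex


def migration_plan(databases, old_host, new_host, user="phorge",
                   dump_file="partition-move.sql"):
    """Build the ordered steps for moving databases to a new master."""
    names = [f"phorge_{name}" for name in databases]  # assumed naming scheme
    dump = ["mysqldump", f"--host={old_host}", f"--user={user}", "--password",
            "--single-transaction", f"--result-file={dump_file}",
            "--databases", *names]
    load = ["mysql", f"--host={new_host}", f"--user={user}", "--password"]
    return [
        "./bin/storage partition",  # commit the new host with an empty partition
        "./bin/storage upgrade",    # initialize schemata on the new host
        "# stop writes (read-only mode, or stop the webserver and daemons)",
        shlex.join(dump),
        shlex.join(load) + " < " + shlex.quote(dump_file),
        "# edit cluster.databases so the new master lists the moved databases",
        "./bin/storage partition",  # commit the final layout
        "# verify the data, then resume writes",
    ]


for step in migration_plan(["slowvote"], "db001.corporation.com",
                           "db003.corporation.com"):
    print(step)
```
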
How Partitioning Works
======================

If you have multiple masters, Phorge keeps the entire set of schemata up
to date on all of them. When you run `bin/storage upgrade` or other storage
management commands, they generally affect all masters (if they do not, they
will prompt you to be more specific).

When the application goes to read or write normal data (for example, to query a
list of tasks), it connects only to the master that hosts the application it is
acting on behalf of.

In most cases, a master will not have any data in most of the databases that
are not assigned to it. If it does (for example, because it previously hosted
the application), the data is ignored. This approach (of maintaining all
schemata on all hosts) makes it easier to move data and to quickly revert
changes if a configuration mistake occurs.

There are some exceptions to this rule. For example, all masters keep track
of which patches have been applied to that particular master so that
`bin/storage upgrade` can upgrade hosts correctly.

Phorge does not perform joins across logical databases, so there are no
meaningful differences in runtime behavior if two applications are on the same
physical host or different physical hosts.


Advanced Configuration
======================

Separate from partitioning, some advanced configuration is supported. These
options must be set on database specifications in `cluster.databases`. You can
configure them without actually building a cluster by defining a cluster with
only one master.

`persistent` //(bool)// Enables persistent connections. Defaults to off.

With persistent connections enabled, Phorge will keep a pool of database
connections open between web requests and reuse them when serving subsequent
requests.

The primary benefit of using persistent connections is that it will greatly
reduce pressure on how quickly outbound TCP ports are opened and closed. After
a TCP port closes, it normally can't be used again for about 60 seconds, so
rapidly cycling ports can cause resource exhaustion. If you're seeing failures
because requests are unable to bind to an outbound port, enabling this option
is likely to fix the issue. This option may also slightly increase performance.

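To get a feel for the numbers, here is a back-of-the-envelope estimate. The
port range is the common Linux default for `net.ipv4.ip_local_port_range`, and
the 60-second figure is the approximate reuse delay mentioned above. Both are
assumptions that vary by system. Under them, a web host that opens a fresh
connection per request to a single database host starts running out of ports
at roughly 470 new connections per second.

```python
# Assumed values; check your own hosts before relying on them.
PORT_RANGE = (32768, 60999)   # common Linux default ephemeral port range
REUSE_DELAY_SECONDS = 60      # approximate time a closed port stays unusable

available_ports = PORT_RANGE[1] - PORT_RANGE[0] + 1
max_new_connections_per_second = available_ports / REUSE_DELAY_SECONDS

print(available_ports)                        # 28232
print(round(max_new_connections_per_second))  # 471
```
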
The cost of using persistent connections is that you may need to raise the
MySQL `max_connections` setting: although Phorge will make far fewer
connections, the connections it does make will be longer-lived. Raising this
setting will increase MySQL memory requirements and may run into other limits,
like `open_files_limit`, which may also need to be raised.

Persistent connections are enabled per-database. If you always want to use
them, set the flag on each configured database in `cluster.databases`.


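For example, a small script along these lines could add the flag to every
entry. This is a sketch: `enable_persistent` is a hypothetical helper, and it
assumes you maintain `cluster.databases` as a JSON list like the one in the
partitioning example, shortened here to the fields that matter.

```python
import json


def enable_persistent(specs):
    """Return a copy of the database specs with "persistent" set on each."""
    return [{**spec, "persistent": True} for spec in specs]


specs = [
    {"host": "db001.corporation.com", "role": "master", "port": 3306},
    {"host": "db002.corporation.com", "role": "replica", "port": 3306},
]
print(json.dumps(enable_persistent(specs), indent=2))
```
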
Next Steps
==========

Continue by:

  - returning to @{article:Clustering Introduction}.