@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge
@title Drydock User Guide
@group userguide

Drydock, a software and hardware resource manager.

Overview
========

WARNING: Drydock is very new and has many sharp edges. Prepare yourself for
a challenging adventure in unmapped territory, not a streamlined experience
where things work properly or make sense.

Drydock is an infrastructure application that primarily helps other
applications coordinate during complex build and deployment tasks. Typically,
you will configure Drydock to enable capabilities in other applications:

  - Harbormaster can use Drydock to host builds.
  - Differential can use Drydock to perform server-side merges.

Users will not normally interact with Drydock directly.

If you want to get started with Drydock right away, see
@{article:Drydock User Guide: Quick Start} for specific instructions on
configuring integrations.


What Drydock Does
=================

Drydock manages working copies, hosts, and other software and hardware
resources that build and deployment processes may require in order to perform
useful work.

Many useful processes need a working copy of a repository (or some similar sort
of resource) so they can read files, perform version control operations, or
execute code.

For example, you might want to be able to automatically run unit tests, build a
binary, or generate documentation every time a new commit is pushed. Or you
might want to automatically merge a revision or cherry-pick a commit from a
development branch to a release branch. Any of these tasks need a working copy
of the repository before they can get underway.

These processes could just clone a new working copy when they started and
delete it when they finished.
This works reasonably well at a small scale, but
will eventually hit limitations if you want to do things like: expand the build
tier to multiple machines; or automatically scale the tier up and down based on
usage; or reuse working copies to improve performance; or make sure things get
cleaned up after a process fails; or have jobs wait if the tier is too busy.
Solving these problems effectively requires coordination between the processes
doing the actual work.

Drydock solves these scaling problems by providing a central allocation
framework for //resources//, which are physical or virtual resources like a
host or a working copy. Processes which need to share hardware or software can
use Drydock to coordinate creation, access, and destruction of those resources.

Applications ask Drydock for resources matching a description, and it allocates
a corresponding resource by either finding a suitable unused resource or
creating a new resource. When work completes, the resource is returned to the
resource pool or destroyed.


Getting Started with Drydock
============================

In general, you will interact with Drydock by configuring blueprints, which
tell Drydock how to build resources. You can jump into this topic directly
in @{article:Drydock Blueprints}.

For help on configuring specific application features:

  - to configure server-side merges from Differential, see
    @{article:Differential User Guide: Automated Landing}.

You should also understand the Drydock security model before deploying it
in a production environment. See @{article:Drydock User Guide: Security}.

The remainder of this document has some additional high-level discussion about
how Drydock works and why it works that way, which may be helpful in
understanding the application as a whole.


Drydock Concepts
================

The major concepts in Drydock are **Blueprints**, **Resources**, **Leases**,
and the **Allocator**.

**Blueprints** are configuration that tells Drydock how to create resources:
where it can put them, how to access them, how many it can make at once, who is
allowed to ask for access to them, how to actually build them, how to clean
them up when they are no longer in use, and so on.

Drydock starts without any blueprints. You'll add blueprints to configure
Drydock and enable it to satisfy requests for resources. You can learn more
about blueprints in @{article:Drydock Blueprints}.

**Resources** represent things (like hosts or working copies) that Drydock has
created, is managing the lifecycle for, and can give other applications access
to.

**Leases** are requests for resources with certain qualities by other
applications. For example, Harbormaster may request a working copy of a
particular repository so it can run unit tests.

The **Allocator** is where Drydock actually does work. It works roughly like
this:

  - An application creates a lease describing a resource it needs, and
    uses this lease to ask Drydock for an appropriate resource.
  - Drydock looks at free resources to try to find one it can use to satisfy
    the request. If it finds one, it marks the resource as in use and gives
    the application details about how to access it.
  - If it can't find an appropriate resource that already exists, it looks at
    the blueprints it has configured to try to build one. If it can, it creates
    a new resource, then gives the application access to it.
  - Once the application finishes using the resource, it frees it. Depending
    on configuration, Drydock may reuse it, destroy it, or hold onto it and
    make a decision later.

Some minor concepts in Drydock are **Slot Locks** and **Repository Operations**.
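
Before turning to those, the allocator steps listed above can be pictured with
a small sketch. This is an illustration only, not Drydock's implementation
(Drydock itself is not written in Python): every class, field, and method name
below is hypothetical, matching is reduced to an exact comparison of
attributes, and a blueprint's only setting is an assumed `limit` on how many
resources it may have at once.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)
class Resource:
    """Something being managed, such as a host or a working copy."""
    kind: str
    attributes: Dict[str, str]
    blueprint: str          # name of the blueprint that built it
    in_use: bool = False


@dataclass(eq=False)
class Blueprint:
    """Hypothetical blueprint: builds one kind of resource, up to a limit."""
    name: str
    kind: str
    limit: int

    def build(self, attributes: Dict[str, str]) -> Resource:
        return Resource(self.kind, dict(attributes), self.name)


@dataclass(eq=False)
class Lease:
    """A request for a resource with certain qualities."""
    kind: str
    attributes: Dict[str, str]
    resource: Optional[Resource] = None


class Allocator:
    def __init__(self, blueprints: List[Blueprint]) -> None:
        self.blueprints = blueprints
        self.resources: List[Resource] = []

    def acquire(self, lease: Lease) -> bool:
        # Prefer a free resource that already satisfies the lease.
        for res in self.resources:
            if (not res.in_use and res.kind == lease.kind
                    and res.attributes == lease.attributes):
                return self._grant(lease, res)
        # Otherwise, look for a blueprint with spare capacity and build one.
        for bp in self.blueprints:
            live = sum(1 for r in self.resources if r.blueprint == bp.name)
            if bp.kind == lease.kind and live < bp.limit:
                res = bp.build(lease.attributes)
                self.resources.append(res)
                return self._grant(lease, res)
        return False  # nothing could be found or built right now

    def release(self, lease: Lease, destroy: bool = False) -> None:
        # Free the resource; keep it for reuse unless asked to destroy it.
        res = lease.resource
        if res is None:
            return
        lease.resource = None
        res.in_use = False
        if destroy:
            self.resources.remove(res)

    @staticmethod
    def _grant(lease: Lease, res: Resource) -> bool:
        res.in_use = True
        lease.resource = res
        return True
```

In this sketch, a second lease for the same kind of working copy fails while a
blueprint with `limit=1` has its only resource leased, and is granted that same
resource once the first lease is released.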

**Slot Locks** are simple optimistic locks that most Drydock blueprints use to
avoid race conditions. Their design is not particularly interesting or novel;
they're just a fairly good fit for most of the locking problems that Drydock
blueprints tend to encounter, and Drydock provides APIs to make them easy to
work with.

**Repository Operations** help other applications coordinate writes to
repositories. Multiple applications perform similar kinds of writes, and these
writes require more sequencing/coordination and user feedback than other
operations.


Architecture Overview
=====================

This section describes some of Drydock's design goals and architectural
choices, so you can understand its strengths and weaknesses and which problem
domains it is well or poorly suited for.

A typical use case for Drydock is giving another application access to a
working copy in order to run a build or unit test operation. Drydock can
satisfy the request and resume execution of application code in 1-2 seconds
under reasonable conditions and with moderate tradeoffs, and can satisfy a
large number of these requests in parallel.

**Scalable**: Drydock is designed to scale easily to something in the realm of
thousands of hosts in hundreds of pools, and far beyond that with a little
work.

Drydock is intended to solve resource management problems at very large scales
and minimizes blocking operations, locks, and artificial sequencing. Drydock is
designed to fully utilize an almost arbitrarily large pool of resources and
improve performance roughly linearly with available hardware.

Because the application assumes that deployment at this scale and complexity
level is typical, you may need to configure more things and do more work than
you would under the simplifying assumptions of small scale.
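
As one illustration of coordinating without waiting, the slot locks mentioned
under Drydock Concepts can be pictured roughly as follows. This sketch assumes
that "optimistic" means a caller tries to claim a set of named slots and fails
immediately on a conflict rather than blocking. That reading, the in-memory
table, and all names here are assumptions for illustration, not Drydock's
actual API.

```python
import threading
from typing import Dict, Iterable, Optional


class SlotLockError(Exception):
    """Raised when a requested slot is already held (hypothetical name)."""


class SlotLockTable:
    """In-memory stand-in for a store that allows one owner per slot name."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}
        # Guards the dictionary itself; callers never wait on a slot.
        self._guard = threading.Lock()

    def acquire(self, owner: str, slots: Iterable[str]) -> None:
        wanted = list(slots)
        with self._guard:
            taken = [s for s in wanted if s in self._owners]
            if taken:
                # All-or-nothing: claim no slots if any one is already held.
                raise SlotLockError(f"slots already held: {taken}")
            for slot in wanted:
                self._owners[slot] = owner

    def holder(self, slot: str) -> Optional[str]:
        with self._guard:
            return self._owners.get(slot)

    def release_all(self, owner: str) -> None:
        with self._guard:
            for slot in [s for s, o in self._owners.items() if o == owner]:
                del self._owners[slot]
```

A caller that receives `SlotLockError` can retry later or try a different
resource, so no process sits blocked while holding other locks.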

**Heavy Resources**: Drydock assumes that resources are relatively
heavyweight and require a meaningful amount (a second or more) of work to
build, maintain and tear down. It also assumes that leases will often have
substantial lifespans (seconds or minutes) while performing operations.

Resources like working copies (which typically take several seconds to create
with a command like `git clone`) and VMs (which typically take several seconds
to spin up) are good fits for Drydock and for the problems it is intended to
solve.

Lease operations like running unit tests, performing builds, executing merges,
generating documentation and running temporary services (which typically last
at least a few seconds) are also good fits for Drydock.

In both cases, the general concern with lightweight resources and operations is
that Drydock operation overhead is roughly on the order of a second for many
tasks, so overhead from Drydock will be substantial if resources are built and
torn down in a few milliseconds or lease operations require only a fraction of
a second to execute.

As a rule of thumb, Drydock may be a poor fit for a problem if operations
typically take less than a second to build, execute, and destroy.

**Focus on Resource Construction**: Drydock is primarily solving a resource
construction problem: something needs a resource matching some description, so
Drydock finds or builds that resource as quickly as possible.

Drydock generally prioritizes responding to requests quickly over other
concerns, like minimizing waste or performing complex scheduling. Although you
can make adjustments to some of these behaviors, it generally assumes that
resources are cheap compared to the cost of waiting for resource construction.

This isn't to say that Drydock is grossly wasteful or has a terrible scheduler,
just that efficient utilization and efficient scheduling aren't the primary
problems the design focuses on.

This prioritization corresponds to scenarios where resources are something like
hosts or working copies, and operations are something like builds, and the cost
of hosts and storage is small compared to the cost of engineer time spent
waiting on jobs to get scheduled.

Drydock may be a weak fit for a problem if it is bounded by resource
availability and using resources as efficiently as possible is very important.
Drydock generally assumes you will respond to a resource deficit by making more
resources available (usually very cheap), rather than by paying engineers to
wait for operations to complete (usually very expensive).

**Isolation Tradeoffs**: Drydock assumes that multiple operations running at
similar levels of trust may be interested in reducing isolation to improve
performance, reduce complexity, or satisfy some other similar goal. It does not
guarantee isolation and assumes most operations will not run in total isolation.

If this isn't true for your use case, you'll need to be careful in configuring
Drydock to make sure that operations are fully isolated and cannot interact.
Complete isolation will reduce the performance of the allocator as it will
generally prevent it from reusing resources, which is one of the major ways it
can improve performance.

You can find more discussion of these tradeoffs in
@{article:Drydock User Guide: Security}.

**Agentless**: Drydock does not require an agent or daemon to be installed on
hosts. It interacts with hosts over SSH.

**Very Abstract**: Drydock's design is //extremely// abstract. Resources have
very little hardcoded behavior. The allocator has essentially zero specialized
knowledge about what it is actually doing.

One aspect of this abstractness is that Drydock is composable, and solves
complex allocation problems by //asking itself// to build the pieces it needs.
To build a working copy, Drydock first asks itself for a suitable host. It
solves this allocation sub-problem, then resolves the original request.

This allows new types of resources to build on Drydock's existing knowledge of
resource construction by just saying "build one of these other things you
already know how to build, then apply a few adjustments". This also means that
you can tell Drydock about a new way to build hosts (say, bring up VMs from a
different service provider) and the rest of the pipeline can use these new
hosts interchangeably with the old hosts.

While this design theoretically makes Drydock more powerful and more flexible
than a less abstract approach, abstraction is frequently a double-edged sword.

Drydock is almost certainly at the extreme upper end of abstraction for tools
in this space, and the level of abstraction may ultimately match poorly with a
particular problem domain. Alternative approaches may give you more specialized
and useful tools for approaching a given problem.


Next Steps
==========

Continue by:

  - understanding Drydock security concerns with
    @{article:Drydock User Guide: Security}; or
  - learning about blueprints in @{article:Drydock Blueprints}; or
  - allowing Phorge to write to repositories with
    @{article:Drydock User Guide: Repository Automation}.