Diffstat (limited to 'taskcluster/docs')
-rw-r--r--   taskcluster/docs/attributes.rst     | 124
-rw-r--r--   taskcluster/docs/caches.rst         |  43
-rw-r--r--   taskcluster/docs/docker-images.rst  |  42
-rw-r--r--   taskcluster/docs/how-tos.rst        | 220
-rw-r--r--   taskcluster/docs/index.rst          |  30
-rw-r--r--   taskcluster/docs/kinds.rst          | 144
-rw-r--r--   taskcluster/docs/loading.rst        |  31
-rw-r--r--   taskcluster/docs/parameters.rst     |  97
-rw-r--r--   taskcluster/docs/reference.rst      |  12
-rw-r--r--   taskcluster/docs/taskgraph.rst      | 276
-rw-r--r--   taskcluster/docs/transforms.rst     | 198
-rw-r--r--   taskcluster/docs/yaml-templates.rst |  49
12 files changed, 1266 insertions(+), 0 deletions(-)
diff --git a/taskcluster/docs/attributes.rst b/taskcluster/docs/attributes.rst
new file mode 100644
index 0000000000..d93964d653
--- /dev/null
+++ b/taskcluster/docs/attributes.rst
@@ -0,0 +1,124 @@
+===============
+Task Attributes
+===============
+
+Tasks can be filtered, for example to support "try" pushes which only perform a
+subset of the task graph or to link dependent tasks. This filtering is the
+difference between a full task graph and a target task graph.
+
+Filtering takes place on the basis of attributes. Each task has a dictionary
+of attributes, and filters over those attributes can be expressed in Python. A
+task may not have a value for every attribute.
+
+The attributes, and acceptable values, are defined here. In general, attribute
+names and values use the short, lower-case form, with underscores.
+
+kind
+====
+
+A task's ``kind`` attribute gives the name of the kind that generated it, e.g.,
+``build`` or ``spidermonkey``.
+
+run_on_projects
+===============
+
+The projects where this task should be in the target task set. This is how
+requirements like "only run this on inbound" get implemented. These are
+either project names or one of the following aliases:
+
+ * ``integration`` -- integration branches
+ * ``release`` -- release branches including mozilla-central
+ * ``all`` -- everywhere (the default)
+
+For try, this attribute applies only if ``-p all`` is specified. All jobs can
+be specified by name regardless of ``run_on_projects``.
+
+If ``run_on_projects`` is set to an empty list, then the task will not run
+anywhere, unless its build platform is specified explicitly in try syntax.
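+
+For illustration, a minimal sketch of how a task might opt into integration
+branches only (the ``run-on-projects`` key in the task's description is what
+ultimately sets this attribute; the value shown is just an example):
+
+.. code-block:: yaml
+
+    run-on-projects:
+        - integration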
+
+task_duplicates
+===============
+
+This is used to indicate that we want multiple copies of the task created.
+This feature is used to track down intermittent job failures.
+
+If this value is set to N, the task-creation machinery will create a total of N
+copies of the task. Only the first copy will be included in the taskgraph
+output artifacts, although all tasks will be contained in the same taskGroup.
+
+While most attributes are considered read-only, target task methods may alter
+this attribute of tasks they include in the target set.
+
+build_platform
+==============
+
+The build platform defines the platform for which the binary was built. It is
+set for both build and test jobs, although test jobs may have a different
+``test_platform``.
+
+build_type
+==========
+
+The type of build being performed. This is a subdivision of ``build_platform``,
+used for different kinds of builds that target the same platform. Values are
+
+ * ``debug``
+ * ``opt``
+
+test_platform
+=============
+
+The test platform defines the platform on which tests are run. It is only
+defined for test jobs and may differ from ``build_platform`` when the same binary
+is tested on several platforms (for example, on several versions of Windows).
+This applies for both talos and unit tests.
+
+Unlike ``build_platform``, the test platform is represented in a slash-separated
+format, e.g., ``linux64/opt``.
+
+unittest_suite
+==============
+
+This is the unit test suite being run in a unit test task. For example,
+``mochitest`` or ``cppunittest``.
+
+unittest_flavor
+===============
+
+If a unittest suite has subdivisions, those are represented as flavors. Not
+all suites have flavors, in which case this attribute should be set to match
+the suite. Examples: ``mochitest-devtools-chrome-chunked`` or ``a11y``.
+
+unittest_try_name
+=================
+
+This is the name used to refer to a unit test via try syntax. It
+may not match either of ``unittest_suite`` or ``unittest_flavor``.
+
+talos_try_name
+==============
+
+This is the name used to refer to a talos job via try syntax.
+
+test_chunk
+==========
+
+This is the chunk number of a chunked test suite (talos or unittest). Note
+that this is a string!
+
+e10s
+====
+
+For test suites which distinguish whether they run with or without e10s, this
+boolean value identifies this particular run.
+
+image_name
+==========
+
+For the ``docker_image`` kind, this attribute contains the docker image name.
+
+nightly
+=======
+
+Signals whether the task is part of a nightly graph. Useful when filtering
+out nightly tasks from the full task set at the target stage.
diff --git a/taskcluster/docs/caches.rst b/taskcluster/docs/caches.rst
new file mode 100644
index 0000000000..9f19035d72
--- /dev/null
+++ b/taskcluster/docs/caches.rst
@@ -0,0 +1,43 @@
+.. _taskcluster_caches:
+
+=============
+Common Caches
+=============
+
+There are various caches used by the in-tree tasks. This page attempts to
+document them and their appropriate use.
+
+Version Control Caches
+======================
+
+``level-{{level}}-checkouts-{{version}}``
+ This cache holds version control checkouts, each in a subdirectory named
+ after the repo (e.g., ``gecko``).
+
+ Checkouts should be read-only. If a task needs to create new files from
+ content of a checkout, this content should be written in a separate
+ directory/cache (like a workspace).
+
+ A ``version`` parameter appears in the cache name to allow
+ backwards-incompatible changes to the cache's behavior.
+
+``level-{{level}}-{{project}}-tc-vcs`` (deprecated)
+ This cache is used internally by ``tc-vcs``. This tool is deprecated and
+ should be replaced with ``hg robustcheckout``.
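+
+As an illustration, a task might mount the checkouts cache described above
+roughly like this (a sketch only; the authoritative ``caches`` schema lives in
+``taskcluster/taskgraph/transforms/task.py``, and the level, version, and
+mount point shown are examples):
+
+.. code-block:: yaml
+
+    worker:
+        caches:
+            - type: persistent
+              name: level-3-checkouts-v1
+              mount-point: /home/worker/checkouts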
+
+Workspace Caches
+================
+
+``level-{{level}}-*-workspace``
+ These caches (of various names typically ending with ``workspace``)
+ contain state to be shared between task invocations. Use cases are
+ dependent on the task.
+
+Other
+=====
+
+``tooltool-cache``
+ Tooltool invocations should use this cache. Tooltool will store files here
+ indexed by their hash, and will verify hashes before copying files from
+ this directory, so there is no concern with sharing the cache between jobs
+ of different levels.
diff --git a/taskcluster/docs/docker-images.rst b/taskcluster/docs/docker-images.rst
new file mode 100644
index 0000000000..22dea4dead
--- /dev/null
+++ b/taskcluster/docs/docker-images.rst
@@ -0,0 +1,42 @@
+.. _taskcluster_dockerimages:
+
+=============
+Docker Images
+=============
+
+TaskCluster Docker images are defined in the source directory under
+``testing/docker``. Each subdirectory therein defines one image used as part
+of the task graph, and the directory name is the image name.
+
+Adding Extra Files to Images
+============================
+
+Dockerfile syntax has been extended to allow *any* file from the
+source checkout to be added to the image build *context*. (Traditionally
+you can only ``ADD`` files from the same directory as the Dockerfile.)
+
+Simply add the following syntax as a comment in a Dockerfile::
+
+ # %include <path>
+
+e.g.::
+
+ # %include mach
+ # %include testing/mozharness
+
+The argument to ``# %include`` is a relative path from the root level of
+the source directory. It can be a file or a directory. If a file, only that
+file will be added. If a directory, every file under that directory will be
+added (even files that are untracked or ignored by version control).
+
+Files added using ``# %include`` syntax are available inside the build
+context under the ``topsrcdir/`` path.
+
+Files are added as they exist on disk; e.g., executable flags should be
+preserved. However, the file owner/group is changed to ``root`` and the
+``mtime`` of the file is normalized.
+
+Here is an example Dockerfile snippet::
+
+ # %include mach
+ ADD topsrcdir/mach /home/worker/mach
diff --git a/taskcluster/docs/how-tos.rst b/taskcluster/docs/how-tos.rst
new file mode 100644
index 0000000000..6b143dd427
--- /dev/null
+++ b/taskcluster/docs/how-tos.rst
@@ -0,0 +1,220 @@
+How Tos
+=======
+
+All of this equipment is here to help you get your work done more efficiently.
+However, learning how task-graphs are generated is probably not the work you
+are interested in doing. This section should help you accomplish some of the
+more common changes to the task graph with minimal fuss.
+
+.. important::
+
+ If you cannot accomplish what you need with the information provided here,
+ please consider whether you can achieve your goal in a different way.
+ Perhaps something simpler would cost a bit more in compute time, but save
+ the much more expensive resource of developers' mental bandwidth.
+ Task-graph generation is already complex enough!
+
+ If you want to proceed, you may need to delve into the implementation of
+ task-graph generation. The documentation and code are designed to help, as
+ are the authors - ``hg blame`` may help track down helpful people.
+
+ As you write your new transform or add a new kind, please consider the next
+ developer. Where possible, make your change data-driven and general, so
+ that others can make a much smaller change. Document the semantics of what
+ you are changing clearly, especially if it involves modifying a transform
+ schema. And if you are adding complexity temporarily while making a
+ gradual transition, please open a new bug to remind yourself to remove the
+ complexity when the transition is complete.
+
+Hacking Task Graphs
+-------------------
+
+The recommended process for changing task graphs is this:
+
+1. Find a recent decision task on the project or branch you are working on,
+ and download its ``parameters.yml`` from the Task Inspector. This file
+ contains all of the inputs to the task-graph generation process. Its
+ contents are simple enough if you would like to modify it, and it is
+ documented in :doc:`parameters`.
+
+2. Run one of the ``mach taskgraph`` subcommands (see :doc:`taskgraph`) to
+ generate a baseline against which to measure your changes. For example:
+
+ .. code-block:: none
+
+ ./mach taskgraph tasks --json -p parameters.yml > old-tasks.json
+
+3. Make your modifications under ``taskcluster/``.
+
+4. Run the same ``mach taskgraph`` command, sending the output to a new file,
+ and use ``diff`` to compare the old and new files. Make sure your changes
+ have the desired effect and no undesirable side-effects.
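+
+   For example, re-using the parameters file from step 2 (the output file name
+   is arbitrary):
+
+   .. code-block:: none
+
+       ./mach taskgraph tasks --json -p parameters.yml > new-tasks.json
+       diff old-tasks.json new-tasks.json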
+
+5. When you are satisfied with the changes, push them to try to ensure that the
+ modified tasks work as expected.
+
+Common Changes
+--------------
+
+Changing Test Characteristics
+.............................
+
+First, find the test description. This will be in
+``taskcluster/ci/*/tests.yml``, for the appropriate kind (consult
+:doc:`kinds`). You will find a YAML stanza for each test suite, and each
+stanza defines the test's characteristics. For example, the ``chunks``
+property gives the number of chunks to run. This can be specified as a simple
+integer if all platforms have the same chunk count, or it can be keyed by test
+platform. For example:
+
+.. code-block:: yaml
+
+ chunks:
+ by-test-platform:
+ linux64/debug: 10
+ default: 8
+
+The full set of available properties is in
+``taskcluster/taskgraph/transforms/tests/test_description.py``. Some other
+commonly-modified properties are ``max-run-time`` (useful if tests are being
+killed for exceeding maxRunTime) and ``treeherder-symbol``.
+
+.. note::
+
+ Android tests are also chunked at the mozharness level, so you will need to
+ modify the relevant mozharness config, as well.
+
+Adding a Test Suite
+...................
+
+To add a new test suite, you will need to know the proper mozharness invocation
+for that suite, and which kind it fits into (consult :doc:`kinds`).
+
+Add a new stanza to ``taskcluster/ci/<kind>/tests.yml``, copying from the other
+stanzas in that file. The meanings should be clear, but authoritative
+documentation is in
+``taskcluster/taskgraph/transforms/tests/test_description.py`` should you need
+it. The stanza name is the name by which the test will be referenced in try
+syntax.
+
+Add your new test to a test set in ``test-sets.yml`` in the same directory. If
+the test should only run on a limited set of platforms, you may need to define
+a new test set and reference that from the appropriate platforms in
+``test-platforms.yml``. If you do so, include some helpful comments in
+``test-sets.yml`` for the next person.
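+
+As a sketch only -- the stanza name, suite, and symbol below are hypothetical,
+and the full set of valid fields is defined in ``test_description.py`` -- a new
+entry in ``tests.yml`` might look like:
+
+.. code-block:: yaml
+
+    mochitest-my-new-suite:
+        description: "My hypothetical new mochitest flavor"
+        suite: mochitest/my-new-suite
+        treeherder-symbol: tc-M(mns)
+        chunks: 4
+        max-run-time: 3600
+
+The mozharness invocation for the suite is configured with additional fields
+documented in ``test_description.py``.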
+
+Greening Up a New Test
+......................
+
+When a test is not yet reliably green, configuration for that test should not
+be landed on integration branches. Of course, you can control where the
+configuration is landed! For many cases, it is easiest to green up a test in
+try: push the configuration to run the test to try along with your work to fix
+the remaining test failures.
+
+When working with a group, check out a "twig" repository to share among your
+group, and land the test configuration in that repository. Once the test is
+green, merge to an integration branch and the test will begin running there as
+well.
+
+Adding a New Task
+.................
+
+If you are adding a new task that is not a test suite, there are a number of
+options. A few questions to consider:
+
+ * Is this a new build platform or variant that will produce an artifact to
+ be run through the usual test suites?
+
+ * Does this task depend on other tasks? Do other tasks depend on it?
+
+ * Is this one of a few related tasks, or will you need to generate a large
+ set of tasks using some programmatic means (for example, chunking)?
+
+ * How is the task actually executed? Mozharness? Mach?
+
+ * What kind of environment does the task require?
+
+Armed with that information, you can choose among a few options for
+implementing this new task. Try to choose the simplest solution that will
+satisfy your near-term needs. Since this is all implemented in-tree, it
+is not difficult to refactor later when you need more generality.
+
+Existing Kind
+`````````````
+
+The simplest option is to add your task to an existing kind. This is most
+practical when the task "makes sense" as part of that kind -- for example, if
+your task is building an installer for a new platform using mozharness scripts
+similar to the existing build tasks, it makes most sense to add your task to
+the ``build`` kind. If you need some additional functionality in the kind,
+it's OK to modify the implementation as necessary, as long as the modification
+is complete and useful to the next developer to come along.
+
+New Kind
+````````
+
+The next option to consider is adding a new kind. A distinct kind gives you
+some isolation from other task types, which can be nice if you are adding an
+experimental kind of task.
+
+Kinds can range in complexity. The simplest sort of kind uses the
+``TransformTask`` implementation to read a list of jobs from the ``jobs`` key,
+and applies the standard ``job`` and ``task`` transforms:
+
+.. code-block:: yaml
+
+ implementation: taskgraph.task.transform:TransformTask
+ transforms:
+ - taskgraph.transforms.job:transforms
+ - taskgraph.transforms.task:transforms
+ jobs:
+ - ..your job description here..
+
+Custom Kind Implementation
+``````````````````````````
+
+If your task depends on other tasks, then the decision of which tasks to create
+may require some code. For example, the ``upload-symbols`` kind iterates over
+the builds in the graph, generating a task for each one. This specific
+post-build behavior is implemented in the general
+``taskgraph.task.post_build:PostBuildTask`` kind implementation. If your task
+needs something more purpose-specific, then it may be time to write a new kind
+implementation.
+
+Custom Transforms
+`````````````````
+
+If your task needs to create many tasks from a single description, for example
+to implement chunking, it is time to implement some custom transforms. Ideally
+those transforms will produce job descriptions, so you can use the existing ``job``
+and ``task`` transforms:
+
+.. code-block:: yaml
+
+ transforms:
+ - taskgraph.transforms.my_stuff:transforms
+ - taskgraph.transforms.job:transforms
+ - taskgraph.transforms.task:transforms
+
+Similarly, if you need to include dynamic task defaults -- perhaps some feature
+is only available in level-3 repositories, or on specific projects -- then
+custom transforms are the appropriate tool. Try to keep transforms simple,
+single-purpose and well-documented!
+
+Custom Run-Using
+````````````````
+
+If the way your task is executed is unique (so, not a mach command or
+mozharness invocation), you can add a new implementation of the job
+description's "run" section. Before you do this, consider that it might be a
+better investment to modify your task to support invocation via mozharness or
+mach, instead. If this is not possible, then adding a new file in
+``taskcluster/taskgraph/transforms/job`` with a structure similar to its peers
+will make the new run-using option available for job descriptions.
+
+Something Else?
+...............
+
+If you make another change not described here that turns out to be simple or
+common, please include an update to this file in your patch.
diff --git a/taskcluster/docs/index.rst b/taskcluster/docs/index.rst
new file mode 100644
index 0000000000..d1a5c600b4
--- /dev/null
+++ b/taskcluster/docs/index.rst
@@ -0,0 +1,30 @@
+.. _taskcluster_index:
+
+TaskCluster Task-Graph Generation
+=================================
+
+The ``taskcluster`` directory contains support for defining the graph of tasks
+that must be executed to build and test the Gecko tree. This is more complex
+than you might suppose! This implementation supports:
+
+ * A huge array of tasks
+ * Different behavior for different repositories
+ * "Try" pushes, with special means to select a subset of the graph for execution
+ * Optimization -- skipping tasks that have already been performed
+ * Extremely flexible generation of a variety of tasks using an approach of
+ incrementally transforming job descriptions into task definitions.
+
+This section of the documentation describes the process in some detail,
+referring to the source where necessary. If you are reading this with a
+particular goal in mind and would rather avoid becoming a task-graph expert,
+check out the :doc:`how-to section <how-tos>`.
+
+.. toctree::
+
+ taskgraph
+ loading
+ transforms
+ yaml-templates
+ docker-images
+ how-tos
+ reference
diff --git a/taskcluster/docs/kinds.rst b/taskcluster/docs/kinds.rst
new file mode 100644
index 0000000000..44bddb360b
--- /dev/null
+++ b/taskcluster/docs/kinds.rst
@@ -0,0 +1,144 @@
+Task Kinds
+==========
+
+This section lists and documents the available task kinds.
+
+build
+------
+
+Builds are tasks that produce an installer or other output that can be run by
+users or automated tests. This is more restrictive than most definitions of
+"build" in a Mozilla context: it does not include tasks that run build-like
+actions for static analysis or to produce instrumented artifacts.
+
+artifact-build
+--------------
+
+This kind performs an artifact build: one based on precompiled binaries
+discovered via the TaskCluster index. This task verifies that such builds
+continue to work correctly.
+
+hazard
+------
+
+Hazard builds are similar to "regular" builds, but use a compiler extension to
+extract a bunch of data from the build and then analyze that data looking for
+hazardous behaviors.
+
+l10n
+----
+
+TBD (Callek)
+
+source-check
+------------
+
+Source-checks are tasks that look at the Gecko source directly to check
+correctness. This can include linting, Python unit tests, source-code
+analysis, or measurement work -- basically anything that does not require a
+build.
+
+upload-symbols
+--------------
+
+Upload-symbols tasks run after builds and upload the symbols files generated by
+build tasks to Socorro for later use in crash analysis.
+
+valgrind
+--------
+
+Valgrind tasks produce builds instrumented by valgrind.
+
+static-analysis
+---------------
+
+Static analysis builds use the compiler to perform some detailed analysis of
+the source code while building. The useful output from these tasks is their
+build logs; while they produce a binary, they do not upload it as an
+artifact.
+
+toolchain
+---------
+
+Toolchain builds create the compiler toolchains used to build Firefox. These
+will eventually be dependencies of the builds themselves, but for the moment
+are run manually via try pushes and the results uploaded to tooltool.
+
+spidermonkey
+------------
+
+Spidermonkey tasks check out the full gecko source tree, then compile only the
+spidermonkey portion. Each task runs specific tests after the build.
+
+marionette-harness
+------------------
+
+TBD (Maja)
+
+Tests
+-----
+
+Test tasks for Gecko products are divided into several kinds, but share a
+common implementation. The process goes like this, based on a set of YAML
+files named in ``kind.yml``:
+
+ * For each build task, determine the related test platforms based on the build
+   platform. For example, a Windows build might be tested on Windows 7
+ and Windows 10. Each test platform specifies a "test set" indicating which
+ tests to run. This is configured in the file named
+ ``test-platforms.yml``.
+
+ * Each test set is expanded to a list of tests to run. This is configured in
+   the file named ``test-sets.yml``.
+
+ * Each named test is looked up in the file named ``tests.yml`` to find a
+ test description. This test description indicates what the test does, how
+ it is reported to treeherder, and how to perform the test, all in a
+ platform-independent fashion.
+
+ * Each test description is converted into one or more tasks. This is
+ performed by a sequence of transforms defined in the ``transforms`` key in
+   ``kind.yml``. See :doc:`transforms` for more information on these
+ transforms.
+
+ * The resulting tasks become a part of the task graph.
+
+.. important::
+
+ This process generates *all* test jobs, regardless of tree or try syntax.
+ It is up to a later stage of the task-graph generation (the target set) to
+ select the tests that will actually be performed.
+
+desktop-test
+............
+
+The ``desktop-test`` kind defines tests for Desktop builds. Its ``tests.yml``
+defines the full suite of desktop tests and their particulars, leaving it to
+the transforms to determine how those particulars apply to Linux, OS X, and
+Windows.
+
+android-test
+............
+
+The ``android-test`` kind defines tests for Android builds.
+
+It is very similar to ``desktop-test``, but the details of running the tests
+differ substantially, so they are defined separately.
+
+docker-image
+------------
+
+Tasks of the ``docker-image`` kind build the Docker images in which other
+Docker tasks run.
+
+The tasks to generate each docker image have predictable labels:
+``build-docker-image-<name>``.
+
+Docker images are built from subdirectories of ``testing/docker``, using
+``docker build``. There is currently no capability for one Docker image to
+depend on another in-tree docker image, without uploading the latter to a
+Docker repository.
+
+The task definition used to create the image-building tasks is given in
+``image.yml`` in the kind directory, and is interpreted as a :doc:`YAML
+Template <yaml-templates>`.
diff --git a/taskcluster/docs/loading.rst b/taskcluster/docs/loading.rst
new file mode 100644
index 0000000000..1fa3c50f1e
--- /dev/null
+++ b/taskcluster/docs/loading.rst
@@ -0,0 +1,31 @@
+Loading Tasks
+=============
+
+The full task graph generation involves creating tasks for each kind. Kinds
+are ordered to satisfy ``kind-dependencies``, and then the ``implementation``
+specified in ``kind.yml`` is used to load the tasks for that kind.
+
+Specifically, the class's ``load_tasks`` class method is called, and returns a
+list of new ``Task`` instances.
+
+TransformTask
+-------------
+
+Most kinds generate their tasks by starting with a set of items describing the
+jobs that should be performed and transforming them into task definitions.
+This is the familiar ``transforms`` key in ``kind.yml`` and is further
+documented in :doc:`transforms`.
+
+Such kinds generally specify their tasks in a common format: either based on a
+``jobs`` property in ``kind.yml``, or on YAML files listed in ``jobs-from``.
+This is handled by the ``TransformTask`` class in
+``taskcluster/taskgraph/task/transform.py``.
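+
+Putting those pieces together, a minimal ``kind.yml`` for such a kind might
+look roughly like this (a sketch assembled from keys described in this
+documentation; the ``jobs.yml`` file name is just an example):
+
+.. code-block:: yaml
+
+    implementation: taskgraph.task.transform:TransformTask
+    transforms:
+       - taskgraph.transforms.job:transforms
+       - taskgraph.transforms.task:transforms
+    jobs-from:
+       - jobs.yml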
+
+For kinds producing tasks that depend on other tasks -- for example, signing
+tasks depend on build tasks -- ``TransformTask`` has a ``get_inputs`` method
+that can be overridden in subclasses and written to return a set of items based
+on tasks that already exist. You can see a nice example of this behavior in
+``taskcluster/taskgraph/task/post_build.py``.
+
+For more information on how all of this works, consult the docstrings and
+comments in the source code itself.
diff --git a/taskcluster/docs/parameters.rst b/taskcluster/docs/parameters.rst
new file mode 100644
index 0000000000..8514259cef
--- /dev/null
+++ b/taskcluster/docs/parameters.rst
@@ -0,0 +1,97 @@
+==========
+Parameters
+==========
+
+Task-graph generation takes a collection of parameters as input, in the form of
+a JSON or YAML file.
+
+During decision-task processing, some of these parameters are supplied on the
+command line or by environment variables. The decision task helpfully produces
+a full parameters file as one of its output artifacts. The other ``mach
+taskgraph`` commands can take this file as input. This can be very helpful
+when working on a change to the task graph.
+
+When experimenting with local runs of the task-graph generation, it is always
+best to find a recent decision task's ``parameters.yml`` file, and modify that
+file if necessary, rather than starting from scratch. This ensures you have a
+complete set of parameters.
+
+The properties of the parameters object are described here, divided roughly by
+topic.
+
+Push Information
+----------------
+
+``triggered_by``
+ The event that precipitated this decision task; one of ``"nightly"`` or
+ ``"push"``.
+
+``base_repository``
+ The repository from which to do an initial clone, utilizing any available
+ caching.
+
+``head_repository``
+ The repository containing the changeset to be built. This may differ from
+ ``base_repository`` in cases where ``base_repository`` is likely to be cached
+ and only a few additional commits are needed from ``head_repository``.
+
+``head_rev``
+ The revision to check out; this can be a short revision string.
+
+``head_ref``
+ For Mercurial repositories, this is the same as ``head_rev``. For
+ git repositories, which do not allow pulling explicit revisions, this gives
+ the symbolic ref containing ``head_rev`` that should be pulled from
+ ``head_repository``.
+
+``owner``
+ Email address indicating the person who made the push. Note that this
+ value may be forged and *must not* be relied on for authentication.
+
+``message``
+ The commit message
+
+``pushlog_id``
+ The ID from the ``hg.mozilla.org`` pushlog
+
+``pushdate``
+ The timestamp of the push to the repository that triggered this decision
+ task. Expressed as integer seconds since the UNIX epoch.
+
+``build_date``
+ The timestamp of the build date. Defaults to ``pushdate`` and falls back to the
+ time of the taskgraph invocation. Expressed as integer seconds since the UNIX epoch.
+
+``moz_build_date``
+ A formatted timestamp of ``build_date``. Expressed as a string in the format
+ ``%Y%m%d%H%M%S``.
+
+Tree Information
+----------------
+
+``project``
+ Another name for what may otherwise be called tree or branch or
+ repository. This is the unqualified name, such as ``mozilla-central`` or
+ ``cedar``.
+
+``level``
+ The `SCM level
+ <https://www.mozilla.org/en-US/about/governance/policies/commit/access-policy/>`_
+ associated with this tree. This dictates the names of resources used in the
+ generated tasks, and those tasks will fail if it is incorrect.
+
+Target Set
+----------
+
+The "target set" is the set of task labels which must be included in a task
+graph. The task graph generation process will include any tasks required by
+those in the target set, recursively. In a decision task, this set can be
+specified programmatically using one of a variety of methods (e.g., parsing try
+syntax or reading a project-specific configuration file).
+
+``target_tasks_method``
+ The method to use to determine the target task set. This is the suffix of
+ one of the functions in ``taskcluster/taskgraph/target_tasks.py``.
+
+``optimize_target_tasks``
+ If true, then target tasks are eligible for optimization.
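+
+Example
+-------
+
+A parameters file for a try push might look roughly like the following sketch.
+All values are illustrative; always prefer downloading ``parameters.yml`` from
+a real decision task, which will contain the complete, current set of
+parameters:
+
+.. code-block:: yaml
+
+    base_repository: https://hg.mozilla.org/mozilla-central
+    head_repository: https://hg.mozilla.org/try
+    head_rev: abcdef123456
+    head_ref: abcdef123456
+    owner: someone@example.com
+    message: "try: -b do -p linux64 -u all"
+    project: try
+    level: 1
+    pushlog_id: 10000
+    pushdate: 1470000000
+    build_date: 1470000000
+    moz_build_date: "20160801000000"
+    triggered_by: push
+    target_tasks_method: try_option_syntax
+    optimize_target_tasks: false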
diff --git a/taskcluster/docs/reference.rst b/taskcluster/docs/reference.rst
new file mode 100644
index 0000000000..813a3f630a
--- /dev/null
+++ b/taskcluster/docs/reference.rst
@@ -0,0 +1,12 @@
+Reference
+=========
+
+These sections contain some reference documentation for various aspects of
+taskgraph generation.
+
+.. toctree::
+
+ kinds
+ parameters
+ attributes
+ caches
diff --git a/taskcluster/docs/taskgraph.rst b/taskcluster/docs/taskgraph.rst
new file mode 100644
index 0000000000..5d3e7c7d3f
--- /dev/null
+++ b/taskcluster/docs/taskgraph.rst
@@ -0,0 +1,276 @@
+======================
+TaskGraph Mach Command
+======================
+
+The task graph is built by linking different kinds of tasks together, pruning
+out tasks that are not required, then optimizing by replacing subgraphs with
+links to already-completed tasks.
+
+Concepts
+--------
+
+* *Task Kind* - Tasks are grouped by kind, where tasks of the same kind do not
+ have interdependencies but have substantial similarities, and may depend on
+ tasks of other kinds. Kinds are the primary means of supporting diversity,
+ in that a developer can add a new kind to do just about anything without
+ impacting other kinds.
+
+* *Task Attributes* - Tasks have string attributes which can be used for
+ filtering. Attributes are documented in :doc:`attributes`.
+
+* *Task Labels* - Each task has a unique identifier within the graph that is
+ stable across runs of the graph generation algorithm. Labels are replaced
+ with TaskCluster TaskIds at the latest time possible, facilitating analysis
+ of graphs without distracting noise from randomly-generated taskIds.
+
+* *Optimization* - replacement of a task in a graph with an equivalent,
+ already-completed task, or a null task, avoiding repetition of work.
+
+Kinds
+-----
+
+Kinds are the focal point of this system. They provide an interface between
+the large-scale graph-generation process and the small-scale task-definition
+needs of different kinds of tasks. Each kind may implement task generation
+differently. Some kinds may generate task definitions entirely internally (for
+example, symbol-upload tasks are all alike, and very simple), while other kinds
+may do little more than parse a directory of YAML files.
+
+A ``kind.yml`` file contains data about the kind, as well as referring to a
+Python class implementing the kind in its ``implementation`` key. That
+implementation may rely on lots of code shared with other kinds, or contain a
+completely unique implementation of some functionality.
+
+The full list of pre-defined keys in this file is:
+
+``implementation``
+ Class implementing this kind, in the form ``<module-path>:<object-path>``.
+ This class should be a subclass of ``taskgraph.kind.base:Kind``.
+
+``kind-dependencies``
+ Kinds which should be loaded before this one. This is useful when the kind
+ will use the list of already-created tasks to determine which tasks to
+ create, for example adding an upload-symbols task after every build task.
+
+Any other keys are subject to interpretation by the kind implementation.
+
+The result is a nice segmentation of implementation so that the more esoteric
+in-tree projects can do their crazy stuff in an isolated kind without making
+the bread-and-butter build and test configuration more complicated.
+
+Dependencies
+------------
+
+Dependencies between tasks are represented as labeled edges in the task graph.
+For example, a test task must depend on the build task creating the artifact it
+tests, and this dependency edge is named 'build'. The task graph generation
+process later resolves these dependencies to specific taskIds.
+
+Decision Task
+-------------
+
+The decision task is the first task created when a new graph begins. It is
+responsible for creating the rest of the task graph.
+
+The decision task for pushes is defined in-tree, in ``.taskcluster.yml``. That
+task description invokes ``mach taskcluster decision`` with some metadata about
+the push. That mach command determines the optimized task graph, then calls
+the TaskCluster API to create the tasks.
+
+Note that this mach command is *not* designed to be invoked directly by humans.
+Instead, use the mach commands described below, supplying ``parameters.yml``
+from a recent decision task. These commands allow testing everything the
+decision task does except the command-line processing and the
+``queue.createTask`` calls.
+
+Graph Generation
+----------------
+
+Graph generation, as run via ``mach taskgraph decision``, proceeds as follows:
+
+#. For all kinds, generate all tasks. The result is the "full task set".
+#. Create dependency links between tasks using kind-specific mechanisms. The
+ result is the "full task graph".
+#. Select the target tasks (based on try syntax or a tree-specific
+ specification). The result is the "target task set".
+#. Based on the full task graph, calculate the transitive closure of the target
+ task set. That is, the target tasks and all requirements of those tasks.
+ The result is the "target task graph".
+#. Optimize the target task graph based on kind-specific optimization methods.
+ The result is the "optimized task graph" with fewer nodes than the target
+ task graph.
+#. Create tasks for all tasks in the optimized task graph.
+
+Transitive Closure
+..................
+
+Transitive closure is a fancy name for this sort of operation:
+
+ * start with a set of tasks
+ * add all tasks on which any of those tasks depend
+ * repeat until nothing changes
+
+The effect is this: imagine you start with a linux32 test job and a linux64 test job.
+In the first round, each test task depends on the test docker image task, so add that image task.
+Each test also depends on a build, so add the linux32 and linux64 build tasks.
+
+Then repeat: the test docker image task is already present, as are the build
+tasks, but those build tasks depend on the build docker image task. So add
+that build docker image task. Repeat again: this time, none of the tasks in
+the set depend on a task not in the set, so nothing changes and the process is
+complete.
+
+And as you can see, the graph we've built now includes everything we wanted
+(the test jobs) plus everything required to do that (docker images, builds).
+
+Optimization
+------------
+
+The objective of optimization is to remove as many tasks from the graph as
+possible, as efficiently as possible, thereby delivering useful results as
+quickly as possible. For example, ideally if only a test script is modified in
+a push, then the resulting graph contains only the corresponding test suite
+task.
+
+A task is said to be "optimized" when it is either replaced with an equivalent,
+already-existing task, or dropped from the graph entirely.
+
+A task can be optimized if all of its dependencies can be optimized and none of
+its inputs have changed. For a task on which no other tasks depend (a "leaf
+task"), the optimizer can determine what has changed by looking at the
+version-control history of the push: if the relevant files are not modified in
+the push, then it considers the inputs unchanged. For tasks on which other
+tasks depend ("non-leaf tasks"), the optimizer must replace the task with
+another, equivalent task, so it generates a hash of all of the inputs and uses
+that to search for a matching, existing task.
+
+In some cases, such as try pushes, tasks in the target task set have been
+explicitly requested and are thus excluded from optimization. In other cases,
+the target task set is almost the entire task graph, so targeted tasks are
+considered for optimization. This behavior is controlled with the
+``optimize_target_tasks`` parameter.
+
+Action Tasks
+------------
+
+Action Tasks are tasks which help you to schedule new jobs via Treeherder's
+"Add New Jobs" feature. The Decision Task creates a YAML file named
+``action.yml`` which can be used to schedule Action Tasks after suitably replacing
+``{{decision_task_id}}`` and ``{{task_labels}}``, which correspond to the decision
+task ID of the push and a comma-separated list of task labels which need to be
+scheduled.
+
+This task invokes ``mach taskgraph action-task``, which builds up a task graph
+of the requested tasks. This graph is optimized against the tasks that the
+decision task already scheduled in the same push.
+
+So for instance, if you had already requested a build task in the ``try`` command,
+and you wish to add a test which depends on this build, the original build task
+is re-used.
+
+Action Tasks are currently scheduled by
+`pulse_actions <https://github.com/mozilla/pulse_actions>`_. This feature is only
+present on ``try`` pushes for now.
+
+Mach commands
+-------------
+
+A number of mach subcommands are available aside from ``mach taskgraph
+decision`` to make this complex system more accessible to those trying to
+understand or modify it. They allow you to run portions of the
+graph-generation process and output the results.
+
+``mach taskgraph tasks``
+ Get the full task set
+
+``mach taskgraph full``
+ Get the full task graph
+
+``mach taskgraph target``
+ Get the target task set
+
+``mach taskgraph target-graph``
+ Get the target task graph
+
+``mach taskgraph optimized``
+ Get the optimized task graph
+
+Each of these commands takes a ``--parameters`` option giving a file with
+parameters to guide the graph generation. The decision task helpfully produces
+such a file on every run, and that is generally the easiest way to get a
+parameter file. The parameter keys and values are described in
+:doc:`parameters`; using that information, you may modify an existing
+``parameters.yml`` or create your own.
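+
+For example, to write out the optimized task graph using a parameters file
+downloaded from a recent decision task:
+
+.. code-block:: shell
+
+    ./mach taskgraph optimized --parameters parameters.yml --json > optimized.json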
+
+Task Parameterization
+---------------------
+
+A few components of tasks are only known at the very end of the decision task
+-- just before the ``queue.createTask`` call is made. These are specified
+using simple parameterized values, as follows:
+
+``{"relative-datestamp": "certain number of seconds/hours/days/years"}``
+ Objects of this form will be replaced with an offset from the current time
+ just before the ``queue.createTask`` call is made. For example, an
+ artifact expiration might be specified as ``{"relative-datestamp": "1
+ year"}``.
+
+``{"task-reference": "string containing <dep-name>"}``
+ The task definition may contain "task references" of this form. These will
+ be replaced during the optimization step, with the appropriate taskId for
+ the named dependency substituted for ``<dep-name>`` in the string.
+ Multiple labels may be substituted in a single string, and ``<<>`` can be
+ used to escape a literal ``<``.
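+
+For example, a fragment of a task definition produced by the transforms might
+contain both forms (a sketch; the artifact path and field placement are
+illustrative only):
+
+.. code-block:: yaml
+
+    expires: {"relative-datestamp": "1 year"}
+    payload:
+        env:
+            INSTALLER_URL:
+                task-reference: "https://queue.taskcluster.net/v1/task/<build>/artifacts/public/build/target.tar.bz2"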
+
+Taskgraph JSON Format
+---------------------
+
+Task graphs -- both the graph artifacts produced by the decision task and those
+output by the ``--json`` option to the ``mach taskgraph`` commands -- are JSON
+objects, keyed by label, or for optimized task graphs, by taskId. For
+convenience, the decision task also writes out ``label-to-taskid.json``
+containing a mapping from label to taskId. Each task in the graph is
+represented as a JSON object.
+
+Each task has the following properties:
+
+``task_id``
+ The task's taskId (only for optimized task graphs)
+
+``label``
+ The task's label
+
+``attributes``
+ The task's attributes
+
+``dependencies``
+ The task's in-graph dependencies, represented as an object mapping
+ dependency name to label (or to taskId for optimized task graphs)
+
+``task``
+ The task's TaskCluster task definition.
+
+``kind_implementation``
+ The module and the class name which was used to implement this particular task.
+ It is always of the form ``<module-path>:<object-path>``.
+
+The results from each command are in the same format, but with some differences
+in the content:
+
+* The ``tasks`` and ``target`` subcommands both return graphs with no edges.
+ That is, just collections of tasks without any dependencies indicated.
+
+* The ``optimized`` subcommand returns tasks that have been assigned taskIds.
+ The dependencies array, too, contains taskIds instead of labels, with
+ dependencies on optimized tasks omitted. However, the ``task.dependencies``
+ array is populated with the full list of dependency taskIds. All task
+ references are resolved in the optimized graph.
+
+The output of the ``mach taskgraph`` commands is suitable for processing with
+the `jq <https://stedolan.github.io/jq/>`_ utility. For example, to extract all
+tasks' labels and their dependencies:
+
+.. code-block:: shell
+
+ jq 'to_entries | map({label: .value.label, dependencies: .value.dependencies})'
+
diff --git a/taskcluster/docs/transforms.rst b/taskcluster/docs/transforms.rst
new file mode 100644
index 0000000000..1679c55894
--- /dev/null
+++ b/taskcluster/docs/transforms.rst
@@ -0,0 +1,198 @@
+Transforms
+==========
+
+Many task kinds generate tasks by a process of transforming job descriptions
+into task definitions. The basic operation is simple, although the sequence of
+transforms applied for a particular kind may not be!
+
+Overview
+--------
+
+To begin, a kind implementation generates a collection of items; see
+:doc:`loading`. The items are simply Python dictionaries, and describe
+"semantically" what the resulting task or tasks should do.
+
+The kind also defines a sequence of transformations. These are applied, in
+order, to each item. Early transforms might apply default values or break
+items up into smaller items (for example, chunking a test suite). Later
+transforms rewrite the items entirely, with the final result being a task
+definition.
+
+Transform Functions
+...................
+
+Each transformation looks like this:
+
+.. code-block:: python
+
+ @transforms.add
+ def transform_an_item(config, items):
+ """This transform ...""" # always a docstring!
+ for item in items:
+ # ..
+ yield item
+
+The ``config`` argument is a Python object containing useful configuration for
+the kind, and is a subclass of
+:class:`taskgraph.transforms.base.TransformConfig`, which specifies a few of
+its attributes. Kinds may subclass and add additional attributes if necessary.
+
+While most transforms yield one item for each item consumed, this is not always
+true: items that are not yielded are effectively filtered out. Yielding
+multiple items for each consumed item implements item duplication; this is how
+test chunking is accomplished, for example.
+
+The ``transforms`` object is an instance of
+:class:`taskgraph.transforms.base.TransformSequence`, which serves as a simple
+mechanism to combine a sequence of transforms into one.
+
+Schemas
+.......
+
+The items used in transforms are validated against some simple schemas at
+various points in the transformation process. These schemas accomplish two
+things: they provide a place to add comments about the meaning of each field,
+and they enforce that the fields are actually used in the documented fashion.
+
+Keyed By
+........
+
+Several fields in the input items can be "keyed by" another value in the item.
+For example, a test description's chunks may be keyed by ``test-platform``.
+In the item, this looks like:
+
+.. code-block:: yaml
+
+ chunks:
+ by-test-platform:
+ linux64/debug: 12
+ linux64/opt: 8
+ default: 10
+
+This is a simple but powerful way to encode business rules in the items
+provided as input to the transforms, rather than expressing those rules in the
+transforms themselves. If you are implementing a new business rule, prefer
+this mode where possible. The structure is easily resolved to a single value
+using :func:`taskgraph.transforms.base.get_keyed_by`.
+
+Organization
+-------------
+
+Task creation operates broadly in a few phases, with the interfaces of those
+stages defined by schemas. The process begins with the raw data structures
+parsed from the YAML files in the kind configuration. This data can be processed
+by kind-specific transforms resulting, for test jobs, in a "test description".
+For non-test jobs, the next step is a "job description". These transformations
+may also "duplicate" tasks, for example to implement chunking or several
+variations of the same task.
+
+In any case, shared transforms then convert this into a "task description",
+which the task-generation transforms then convert into a task definition
+suitable for ``queue.createTask``.
+
+Test Descriptions
+-----------------
+
+The transforms configured for test kinds proceed as follows, based on
+configuration in ``kind.yml``:
+
+ * The test description is validated to conform to the schema in
+ ``taskcluster/taskgraph/transforms/tests/test_description.py``. This schema
+ is extensively documented and is the primary reference for anyone
+ modifying tests.
+
+ * Kind-specific transformations are applied. These may apply default
+ settings, split tests (e.g., one to run with feature X enabled, one with it
+ disabled), or apply across-the-board business rules such as "all desktop
+ debug test platforms should have a max-run-time of 5400s".
+
+ * Transformations generic to all tests are applied. These enforce policies
+   that span multiple kinds, e.g., for treeherder tiers. This is also the
+ place where most values which differ based on platform are resolved, and
+ where chunked tests are split out into a test per chunk.
+
+ * The test is again validated against the same schema. At this point it is
+ still a test description, just with defaults and policies applied, and
+ per-platform options resolved. So transforms up to this point do not modify
+ the "shape" of the test description, and are still governed by the schema in
+ ``test_description.py``.
+
+ * The ``taskgraph.transforms.tests.make_task_description:transforms`` then
+ take the test description and create a *task* description. This transform
+ embodies the specifics of how test runs work: invoking mozharness, various
+ worker options, and so on.
+
+ * Finally, the ``taskgraph.transforms.task:transforms``, described below
+   under "Task Descriptions", are applied.
+
+Test dependencies are produced in the form of a dictionary mapping dependency
+name to task label.
+
+Job Descriptions
+----------------
+
+A job description says what to run in the task. It is a combination of a
+``run`` section and all of the fields from a task description. The run section
+has a ``using`` property that defines how this task should be run; for example,
+``mozharness`` to run a mozharness script, or ``mach`` to run a mach command.
+The remainder of the run section is specific to the run-using implementation.
+
+The effect of a job description is to say "run this thing on this worker". The
+job description must contain enough information about the worker to identify
+the workerType and the implementation (docker-worker, generic-worker, etc.).
+Any other task-description information is passed along verbatim, although it is
+augmented by the run-using implementation.
+
+The run-using implementations are all located in
+``taskcluster/taskgraph/transforms/job``, along with the schemas for their
+implementations. Those well-commented source files are the canonical
+documentation for what constitutes a job description, and should be considered
+part of the documentation.
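+
+As an illustrative sketch only -- field names other than ``run.using`` should
+be checked against the schemas mentioned above -- a job description might
+combine task-description fields with a ``run`` section like this:
+
+.. code-block:: yaml
+
+    description: "Run a hypothetical source check"
+    worker:
+        implementation: docker-worker
+        max-run-time: 1800
+    run:
+        using: mach
+        mach: lint -l eslint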
+
+Task Descriptions
+-----------------
+
+Every kind needs to create tasks, and all of those tasks have some things in
+common. They all run on one of a small set of worker implementations, each
+with their own idiosyncrasies. And they all report to TreeHerder in a similar
+way.
+
+The transforms in ``taskcluster/taskgraph/transforms/task.py`` implement
+this common functionality. They expect a "task description", and produce a
+task definition. The schema for a task description is defined at the top of
+``task.py``, with copious comments. Go forth and read it now!
+
+In general, the task-description transforms handle functionality that is common
+to all Gecko tasks. While the schema is the definitive reference, the
+functionality includes:
+
+* TreeHerder metadata
+
+* Build index routes
+
+* Information about the projects on which this task should run
+
+* Optimizations
+
+* Defaults for ``expires-after`` and ``deadline-after``, based on project
+
+* Worker configuration
+
+The parts of the task description that are specific to a worker implementation
+are isolated in a ``task_description['worker']`` object which has an
+``implementation`` property naming the worker implementation. Each worker
+implementation has its own section of the schema describing the fields it
+expects. Thus the transforms that produce a task description must be aware of
+the worker implementation to be used, but need not be aware of the details of
+its payload format.
+
+The ``task.py`` file also contains an internal dictionary mapping treeherder
+group symbols to group names. Feel free to add additional
+groups to this list as necessary.
+
+More Detail
+-----------
+
+The source files provide lots of additional detail, both in the code itself and
+in the comments and docstrings. For the next level of detail beyond this file,
+consult the transform source under ``taskcluster/taskgraph/transforms``.
diff --git a/taskcluster/docs/yaml-templates.rst b/taskcluster/docs/yaml-templates.rst
new file mode 100644
index 0000000000..515999e608
--- /dev/null
+++ b/taskcluster/docs/yaml-templates.rst
@@ -0,0 +1,49 @@
+Task Definition YAML Templates
+==============================
+
+A few kinds of tasks are described using templated YAML files. These files
+allow some limited forms of inheritance and template substitution as well as
+the usual YAML features, as described below.
+
+Please do not use these features in new kinds. If you are tempted to use
+variable substitution over a YAML file to define tasks, please instead
+implement a new kind-specific transform to accomplish your goal. For example,
+if the current push-id must be included as an argument in
+``task.payload.command``, write a transform function that makes that assignment
+while building a job description, rather than parameterizing that value in the
+input to the transforms.
+
+Inheritance
+-----------
+
+One YAML file can "inherit" from another by including a top-level ``$inherits``
+key. That key specifies the parent file in ``from``, and optionally a
+collection of variables in ``variables``. For example:
+
+.. code-block:: yaml
+
+ $inherits:
+ from: 'tasks/builds/base_linux32.yml'
+ variables:
+ build_name: 'linux32'
+ build_type: 'dbg'
+
+Inheritance proceeds as follows: First, the child document has its template
+substitutions performed and is parsed as YAML. Then, the parent document is
+parsed, with substitutions specified by ``variables`` added to the template
+substitutions. Finally, the child document is merged with the parent.
+
+To merge two JSON objects (dictionaries), each value is merged individually.
+Lists are merged by concatenating the lists from the parent and child
+documents. Atomic values (strings, numbers, etc.) are merged by preferring the
+child document's value.
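+
+For example, merging a parent containing
+
+.. code-block:: yaml
+
+    task:
+        tags: ['build']
+        priority: 'normal'
+
+with a child containing
+
+.. code-block:: yaml
+
+    task:
+        tags: ['linux32']
+        priority: 'high'
+
+yields ``tags: ['build', 'linux32']`` (the lists are concatenated) and
+``priority: 'high'`` (the child's atomic value wins). The field names here are
+purely illustrative.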
+
+Substitution
+------------
+
+Each document is expanded using the PyStache template engine before it is
+parsed as YAML. The parameters for this expansion are specific to the task
+kind.
+
+Simple value substitution looks like ``{{variable}}``. Function calls look
+like ``{{#function}}argument{{/function}}``.