Observability during Bottom Up execution - Improving PIE s change driven incremental build algorithm

Roelof Sol


Date: Thu, August 29, 2019
Time: 16:00
Room: Social Data Lab 0.E220
Note: This is a MSc thesis defense


Context: Updating an old result by selective re-execution of the inconsistent parts of some computation is usually faster than recomputing everything. Incremental build systems and interactive development pipelines use this technique to speed up feedback. They consist of different tasks. These tasks form a graph by depending on the environment and the result of other tasks. To re-execute a build requires an incremental build algorithm to find and re-execute inconsistent tasks. This can be done by traversing the dependency graph top down. When the mutations to the input environment are known in advance a build algorithm can avoid graph traversal. Thereby scaling with the size of a change instead of the size of the graph. Another important distinction between incremental build systems is their ability to handle dynamic dependencies. That is, dependencies that are discovered during a build. PIE is a bottom-up build algorithm that supports dynamic task dependencies. It schedules and executes inconsistent tasks without traversing the entire dependency graph. The current PIE algorithm is capable of adding dynamic task dependencies. However, it does not process removing dynamic task dependencies. This preserves consistency, but limits scalability over time. Each detached task is still scheduled and executed by the bottom up algorithm.

Inquiry: PIE is inefficient when it executes tasks that are no longer a transitive dependency.
In this paper we introduce task observability to solve this. This problem is unique to bottom up scheduling build systems. However, we are able to re-use techniques from incremental build systems and garbage collections to implement our solution.

Approach: We split the problem into three parts, determining task observability, scheduling, and re-observing.

Knowledge: With the new algorithm we are able to improve the efficiency of PIE and its scalability over time.

Grounding: We verify and benchmark our changes with two artificial and one real world use case in the Spoofax Workbench.

Importance: The PIE runtime is able to continue operating efficiently even when tasks are detached. In some situations the Spoofax PIE pipeline reduces the feedback time with 1800 ms when results are not observed by the user.
Additionally, new design patterns are made possible. For Spoofax, toggling observability creates the opportunity to implement various quality of life features such as file and project renaming.

As a secondary contribution we implement a garbage collector for detached tasks and create a visualization tool for displaying the dependency graphs stored in PIE.


Previous: Jeffrey Goderie |
Next: |