Scalab: a Build Tool for Scala
Master Thesis, July 4, 2008
Author: Vincent Pazeller
Supervisor: Gilles Dubochet
Professor: Martin Odersky
Programming Methods Laboratory / LAMP
Outline
• Interest of Build Tools
• Interest of Scalab
• Definition of a Build Tool
• Model
• Caches
• Internal Operation (update)
• Sabbus
• Further Work
Build Tools Interest
• Build Process: sequence of tasks that transform the sources of a project into its executable equivalent.
• Not every task needs to be executed on every build.
• Sources may not all have been modified.
A build tool automates the choice of tasks to be executed and optimizes the build process.
Increases developers’ productivity.
Interest of Scalab
• Existing tools make non-conservative approximations.
• They are also too difficult to use: Sabbus written with Ant takes 1200 lines of XML, versus fewer than 100 (reasonable) lines of Scala code with Scalab.
• Scalab makes it possible to describe situations that cannot be described with any other build system.
Task
• A task employs sources to produce products.
• Sources and products are resources (i.e. files).
• The universe is the set of all resources.
Up-to-date
• Given a set of tasks $T$, a task $t$ is up-to-date with respect to $T$ when the products of $t$ cannot be modified by the execution of any sequence of tasks $t_1, \dots, t_n$,
$\forall i.\ i \in \{1, \dots, n\} \wedge t_i \in T$
The purpose of a build tool is to make $t$ up-to-date $\forall t \in T$.
The build tool needs to know (at run-time) the sources and the products of each task.
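The definitions above can be made concrete with a small sketch. It is only an illustration: `Resource`, `Task` and `isStale` are hypothetical names rather than Scalab's API, and the staleness test approximates "up-to-date" by comparing timestamps, which is just one possible way of detecting change.

```scala
object TaskSketch {
  // A resource is identified by its path; `stamp` stands in for a last-modified time.
  final case class Resource(path: String, stamp: Long)

  // A task employs sources to produce products (both are sets of resources).
  final case class Task(name: String, sources: Set[Resource], products: Set[Resource])

  // Assumption: a task must run when it has no products yet, or when some
  // source is newer than its oldest product.
  def isStale(t: Task): Boolean =
    t.products.isEmpty ||
      t.sources.exists(s => s.stamp > t.products.map(_.stamp).min)

  val src      = Resource("Main.scala", stamp = 20L)
  val oldClass = Resource("Main.class", stamp = 10L)
  val newClass = Resource("Main.class", stamp = 30L)

  val staleCompile = Task("compile", Set(src), Set(oldClass))
  val freshCompile = Task("compile", Set(src), Set(newClass))
}
```

Here `isStale` only works because both the sources and the products of the task are known, which is the requirement stated above.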
Model: universe
• Static representation of a common Java project:
• Note: all resources depend directly on the universe.
• Idea: make them depend on the task that created them instead (more dynamic):
Model: filters
• Idea: indicate in the graph how resources can be extracted from the universe.
The build tool can detect changes dynamically.
• Filters are subdivided into three categories: selectors, scanners and mappers.
• Filters can also be used to filter the products of components (tasks and filters, so far).
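The slides do not define the three categories, so the following sketch rests on one plausible reading: a selector keeps a subset of the resources it receives, a scanner discovers further resources from them, and a mapper transforms each resource into another one. All names (`endsWith`, `listDirs`, `replaceSuffix`) are hypothetical, resources are reduced to path strings, and a `Map` stands in for the file system.

```scala
object FilterSketch {
  type Resources = Set[String]          // resources reduced to their paths
  type Filter    = Resources => Resources

  // Selector (assumed meaning): keeps the resources matching a predicate.
  def endsWith(suffix: String): Filter = _.filter(_.endsWith(suffix))

  // Scanner (assumed meaning): discovers the resources below the given directories.
  // A Map stands in for the file system to keep the sketch self-contained.
  def listDirs(fs: Map[String, Set[String]]): Filter =
    _.flatMap(dir => fs.getOrElse(dir, Set.empty[String]))

  // Mapper (assumed meaning): transforms each resource into another one.
  def replaceSuffix(from: String, to: String): Filter =
    _.map(p => if (p.endsWith(from)) p.stripSuffix(from) + to else p)

  val fs = Map("src/" -> Set("src/A.scala", "src/B.scala", "src/notes.txt"))

  // Filters compose, much like components chained with pipes.
  val scalaSources: Filter    = listDirs(fs).andThen(endsWith(".scala"))
  val expectedClasses: Filter = scalaSources.andThen(replaceSuffix(".scala", ".class"))
}
```

Because each filter is re-evaluated against the current state of the universe, a file added to `src/` would show up on the next evaluation without any change to the build description.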
Model: Pipes
• We can now simplify the graph:
Becomes
Arrows are called pipes and carry resources from component to component.
Model: Gates
• Inputs. Interest: distinguish subsets of sources. An input is either mandatory (■) or optional (□).
• Output: each component has a single output which is implicit.
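A minimal sketch of gates and pipes, under the assumption that "mandatory" means an input must be fed by at least one pipe before the component can be used. `Input`, `Pipe`, `Component` and `unconnected` are hypothetical names chosen for this illustration.

```scala
object GateSketch {
  // An input gate; a mandatory input is assumed to need at least one pipe.
  final case class Input(name: String, mandatory: Boolean)

  // A pipe carries resources from a provider's implicit output to an input gate.
  final case class Pipe(provider: String, consumer: String, input: String)

  final case class Component(name: String, inputs: List[Input]) {
    // Names of the mandatory inputs of this component that no pipe feeds.
    def unconnected(pipes: List[Pipe]): List[String] =
      inputs.filter(_.mandatory).map(_.name)
        .filterNot(in => pipes.exists(p => p.consumer == name && p.input == in))
  }

  val compiler = Component("scalac",
    List(Input("src", mandatory = true), Input("classpath", mandatory = false)))

  // Only the mandatory "src" input is connected; "classpath" is left open.
  val pipes = List(Pipe("sources", "scalac", "src"))
}
```

Since every component has a single implicit output, a pipe only needs to name its provider, while the consumer side must name the input gate it targets.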
Model: Black Boxes
• Interest: hide a sub-graph and/or make it re-usable and distributable. This reduces the risk of errors.
• The inside looks like:
• The output must be explicitly provisioned, this time.
Black boxes behave exactly as if they were tasks.
Model: Build Schemes
• Generalization of black boxes.
• This scheme can then be used with any compiler and any archiver.
• The class-path input has been omitted because it cannot be generalized to all compilers.
• Build schemes are black-box generators.
Model: Targets
• Purpose: indicate clearly which components are relevant to build.
• Interests: Build process is more explicit. Avoid any misuse.
Model: Dependencies
• Hard dependencies: if a source hard dependency of a task does not exist, the task cannot be executed. A product hard dependency indicates that the resource has certainly been created.
• Soft dependencies: the absence of a source soft dependency does not prevent the task from executing. A product soft dependency leaves a doubt about whether the resource was created.
• Pipes can form cycles in the build graph!
• Filters are not affected by (hard) dependencies.
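The difference between hard and soft source dependencies can be sketched as a simple executability check. The names `SourceDep`, `Hard`, `Soft` and `canExecute` are hypothetical and only illustrate the rule stated above.

```scala
object DependencySketch {
  sealed trait Kind
  case object Hard extends Kind
  case object Soft extends Kind

  // A source dependency of a task on a resource (identified by its path).
  final case class SourceDep(path: String, kind: Kind)

  // A task can execute when all of its hard source dependencies exist;
  // missing soft dependencies are tolerated.
  def canExecute(deps: Seq[SourceDep], existing: Set[String]): Boolean =
    deps.forall(d => d.kind == Soft || existing.contains(d.path))

  val deps = Seq(SourceDep("Main.scala", Hard), SourceDep("extra.jar", Soft))
}
```

With these dependencies, the task can run when only `Main.scala` exists, but not when only `extra.jar` does.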
Caches
• Interest: prevent tasks from repeating work they have already done.
• Principle: load/store resources from/to repository.
• Tasks write directly in the cache repository (no copy).
Caches: Behavior
• The behavior of a cache is defined by:
  Change Detection Policy: used to detect when a source has changed.
  Eviction Policy: used to select and delete the least pertinent information in a cache.
  Core Policy: defines how the cache loads and stores information and coordinates the three policies.
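One way to picture the split into policies is as traits mixed into a core class. The sketch below is an assumption-laden illustration, not Scalab's implementation: `ChangeDetection`, `Eviction`, `TimestampDetection`, `LruEviction`, `Core` and `newCache` are hypothetical names, the cache stores only timestamps instead of resources, and a logical clock replaces real access times.

```scala
object CachePolicySketch {
  // Change detection policy: has this source changed since it was recorded?
  trait ChangeDetection {
    def hasChanged(recorded: Long, current: Long): Boolean
  }
  // Eviction policy: which entry should leave the cache when it is full?
  trait Eviction {
    def victim(lastUse: Map[String, Long]): Option[String]
  }

  trait TimestampDetection extends ChangeDetection {
    def hasChanged(recorded: Long, current: Long): Boolean = current > recorded
  }
  trait LruEviction extends Eviction {
    // Least recently used: the entry with the smallest last-use time.
    def victim(lastUse: Map[String, Long]): Option[String] =
      if (lastUse.isEmpty) None else Some(lastUse.minBy(_._2)._1)
  }

  // Core policy: stores entries and coordinates the other policies.
  abstract class Core(capacity: Int) { self: ChangeDetection with Eviction =>
    private var stamps  = Map.empty[String, Long] // source -> recorded timestamp
    private var lastUse = Map.empty[String, Long] // source -> logical time of last use
    private var clock   = 0L

    // Returns true when the cached result for `source` can be reused.
    def lookup(source: String, currentStamp: Long): Boolean = {
      clock += 1
      val hit = stamps.get(source).exists(rec => !hasChanged(rec, currentStamp))
      if (!hit) {
        // Make room for a new entry before recording it.
        if (!stamps.contains(source) && stamps.size >= capacity)
          victim(lastUse).foreach { v => stamps -= v; lastUse -= v }
        stamps += (source -> currentStamp)
      }
      lastUse += (source -> clock)
      hit
    }
    def cached: Set[String] = stamps.keySet
  }

  def newCache(capacity: Int): Core =
    new Core(capacity) with TimestampDetection with LruEviction
}
```

Swapping a policy then amounts to mixing in a different trait, without touching the core.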
Caches: Conservativeness
• Caches can be conservative or not.
• Conservative caches ensure that the result of cached tasks is always sound.
Conservative caches need to know inter-dependencies among resources.
Caller.scala:
object Caller {
  def main(args: Array[String]): Unit = {
    Callee.invoke()
  }
}
Callee.scala:
object Callee {
  def invoke(): Unit = {}
}
[Figure: Scalac takes Caller.scala and Callee.scala as sources and produces Caller.class, Caller$.class, Callee.class and Callee$.class.]
Update
• First try:
trait ExecutableComponent{
  …
  protected def update: Boolean = this.inputs.forall {i =>
    i.providers forall {p => p.update}
  } && this.exec //method-call syntax so that && applies to the result of forall
  …
}
• Wrong! If the graph contains a cycle, the algorithm will never terminate!
Update: Cycle Detection
protected def update(visited: Set[Component], cycles: Set[(Output, Input)]):
    (Boolean, Set[Component], Set[(Output, Input)]) = {
  var newVisited = visited + this //add this node to the visited set
  var newCycles = cycles
  val inputsUpdated = this.inputs forall {i =>
    i.providers forall {p =>
      val pipe = (p.output, i)
      if(newCycles contains pipe) //ensures termination
        true
      else{
        if(visited contains p) //cycle detection
          newCycles = newCycles + pipe
        val (updated, moreVisited, moreCycles) = p.update(newVisited, newCycles) //update providers
        newVisited = newVisited ++ moreVisited
        newCycles = newCycles ++ moreCycles
        updated
      }
    } //input providers are up-to-date
  } //inputs are up-to-date
  (inputsUpdated && this.exec, newVisited, newCycles) //update this component
}
Update: Redundancy
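• The update algorithm presented above is not optimal: a component that provides several inputs is updated once per input.
[Figure: component with two inputs, in0 and in1]
We need to remember which components have already been updated.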
Update: Efficient Version
protected def update(visited: Set[Component], cycles: Set[(Output, Input)], updated: Set[Component]):
    (Boolean, Set[Component], Set[(Output, Input)], Set[Component]) = {
  if(updated contains this) //avoid redundant updates
    (true, visited, cycles, updated)
  else{
    var newVisited = visited + this //add this node to the visited set
    var newCycles = cycles
    var newUpdated = updated
    val inputsUpdated = this.inputs forall {i =>
      i.providers forall {p =>
        val pipe = (p.output, i)
        if(newCycles contains pipe) //ensures termination
          true
        else{
          if(visited contains p) //cycle detection
            newCycles = newCycles + pipe
          val (providerUpdated, moreVisited, moreCycles, moreUpdated) = p.update(newVisited, newCycles, newUpdated)
          newVisited = newVisited ++ moreVisited
          newCycles = newCycles ++ moreCycles
          newUpdated = newUpdated ++ moreUpdated
          providerUpdated
        }
      } //input providers are up-to-date
    } //inputs are up-to-date
    (inputsUpdated && this.exec, newVisited, newCycles, newUpdated + this) //update this component
  }
}
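To experiment with these ideas outside Scalab, here is a self-contained sketch of a depth-first update that remembers which components were already updated and skips pipes that close a cycle. It is a simplified variant, not the thesis code: components are plain names, gates are dropped so that a pipe is just a (provider, consumer) pair, execution always succeeds and is merely recorded, and `Graph`, `State` and `update` are hypothetical names.

```scala
object UpdateSketch {
  // Each component lists the components that provide its inputs.
  final case class Graph(providers: Map[String, List[String]])

  type Pipe = (String, String) // (provider, consumer)

  final case class State(
    visited: Set[String],     // components whose update has started
    cycles: Set[Pipe],        // pipes found to close a cycle
    updated: Set[String],     // components already executed
    executed: Vector[String]  // execution order, for inspection
  )

  val empty: State = State(Set.empty, Set.empty, Set.empty, Vector.empty)

  def update(g: Graph, c: String, st: State): State =
    if (st.updated(c)) st // avoid redundant updates
    else {
      val started = st.copy(visited = st.visited + c)
      val afterProviders = g.providers.getOrElse(c, Nil).foldLeft(started) { (s, p) =>
        // A provider that has started but not finished is still on the
        // current path: the pipe closes a cycle, so record it and skip it.
        if (s.visited(p) && !s.updated(p)) s.copy(cycles = s.cycles + ((p, c)))
        else update(g, p, s)
      }
      // "Execute" this component once all reachable providers are done.
      afterProviders.copy(
        updated = afterProviders.updated + c,
        executed = afterProviders.executed :+ c)
    }

  // "base" feeds both "lib1" and "lib2", which both feed "app".
  val diamond = Graph(Map(
    "app" -> List("lib1", "lib2"), "lib1" -> List("base"), "lib2" -> List("base")))

  // "a" and "b" feed each other.
  val cyclic = Graph(Map("a" -> List("b"), "b" -> List("a")))
}
```

On `diamond`, `base` is executed once even though two components depend on it; on `cyclic`, the update terminates and reports the pipe that closes the cycle.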
Sabbus
Sabbus: Preamble
val scalaHome = "/home/pazeller/pdm/scala/"
val scalaSrcs = scalaHome + "srcs/"
val scalacSrcs = scalaSrcs + "compiler/"
val scalaLib = scalaHome + "lib/"
//sources
val compilerSrcs = Universe -> Files(scalacSrcs) -> ListDirs() -> EndsWith(".scala") ->
  StartsWith(scalacSrcs + "scala/tools/ant/").complement
val libSrcs = Universe -> Files(scalaSrcs + "library/") -> ListDirs() -> EndsWith(".scala")
//old classes
val oldLib = Files(scalaLib + "scala-library.jar")
val bytecodeGen = Files(scalaLib + "fjbg.jar") // bytecode generator
val oldScalac = Files(scalaLib + "scala-compiler.jar")
Universe >> (oldLib, oldScalac, bytecodeGen)
Sabbus: Instantiating Compilers
//Building compilers
val starr = DynamicScalac("starr")
val starrLib = DynamicScalac("starrLib")
val locker = DynamicScalac("locker")
val lockerLib = DynamicScalac("lockerLib")
val quick = DynamicScalac("quick")
val quickLib = DynamicScalac("quickLib")
//grouping
val compilers = List(starr, locker, quick)
val libCompilers = List(starrLib, lockerLib, quickLib)
val allCompilers = compilers ::: libCompilers
Sabbus: Connecting Pipes
//sources
libSrcs >> (libCompilers map {c => c.src})
compilerSrcs >> (compilers map {c => c.src})
//loading classes
bytecodeGen >> (allCompilers map {c => c.load})
oldScalac >> (starrLib.load, starr.load)
starr.runDirectory >> (lockerLib.load, locker.load)
locker.runDirectory >> (quickLib.load, quick.load)
//libraries
oldLib >> (starrLib.load, starr.load)
starrLib.runDirectory >> (starr.boot, lockerLib.load, locker.load)
lockerLib.runDirectory >> (locker.boot, quickLib.load, quick.load)
quickLib.runDirectory -> quick.boot
Sabbus: Concluding
Stopwatch(allCompilers) //timing compilers executions
val stability = SameContent("stability", locker, quick) //stability check
val distr = Jar("jar", "scala-compiler.jar", stability)
//setting cache
val cache = new ConservativeCCP with TimestampCDP with LRUCEP
cache.setCacheDirectory("/home/pazeller/shared/cache/")
allCompilers foreach {c => c.setCache(cache)}
//targets
val buildDir = "./build/"
val newDistribution = Target(buildDir + "distr/", distr)
val newLibrary = Target(buildDir + "library/", lockerLib)
val newCompiler = Target(buildDir + "compiler/", locker)
val default = newCompiler //default target
Further Work
• Parallel Task Execution
• Graphical Interface
• Interactive Debug Mode
• Filters Caching
• Automatic Graph Dismantling
• Extending Library
• Thank you for your attention.
• Project can be consulted on http://scalab.googlecode.com
• Feel free to ask questions.