SCons: Big Signature Refactoring
18 April 2007
Build tools do TWO things
• Build things in the correct order
• NOT build what’s up to date
Building things in the correct order
• Simplest way to get correct order is to just write down the commands in a script
• If our dependencies never changed, we’d just script it and be done with it
• Dependency management is necessary because our software changes
Not rebuilding what’s up-to-date
• Simplest way to get up-to-date software is to build from scratch every time
• If our tools were infinitely fast, we’d just build from scratch
• Deciding what’s up-to-date is necessary because our tools take time
How does Make decide what’s up to date?
• Target file
• Source file(s)
• If timestamp(target) > timestamp(source) for all source files, it’s up to date
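The timestamp rule above can be sketched in a few lines of Python. This is only an illustration of the comparison as the slide states it (strictly newer than every source), not Make's actual implementation; the function name make_style_up_to_date is hypothetical.

```python
import os

def make_style_up_to_date(target, sources):
    """Return True if target exists and is strictly newer than every source."""
    if not os.path.exists(target):
        # A missing target always has to be built.
        return False
    target_mtime = os.path.getmtime(target)
    # The slide's rule: timestamp(target) > timestamp(source) for all sources.
    return all(target_mtime > os.path.getmtime(src) for src in sources)
```

Note that the test never opens the files, which is why it is cheap and also why it cannot notice a content change that leaves the timestamp alone.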
What’s wrong with the way Make does it?
• Contents can change without modifying timestamp (incorrectly not rebuilding)
• Timestamps can change without modifying the contents (unnecessary rebuilds)
  – This can be positive: touching the source file and rebuilding ensures a target is up-to-date
• Can’t handle timestamps rolling back in time
What’s right with the way Make does it?
• It uses file metadata (the timestamp) to approximate if the file contents have changed
• It’s a cheap test
What file metadata can we use?
• Timestamp (only metadata Make uses)
• Size
• Content checksum
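As a rough illustration, all three kinds of metadata can be collected with the Python standard library. The helper name file_metadata and the dictionary keys are assumptions made for this sketch; they only loosely mirror the field names that sconsign prints.

```python
import hashlib
import os

def file_metadata(path):
    """Collect timestamp, size, and an MD5 content checksum for one file."""
    st = os.stat(path)
    with open(path, "rb") as f:
        # Reading the whole file is the expensive part; the stat() is cheap.
        checksum = hashlib.md5(f.read()).hexdigest()
    return {
        "timestamp": int(st.st_mtime),
        "size": st.st_size,
        "csig": checksum,
    }
```

The timestamp and size come from a single stat() call, while the checksum requires reading every byte, which is the cost trade-off between the three.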
Way SCons currently does it
• Metadata stored in a .sconsign file
  – Used to be one per directory
  – Still can be configured that way
  – sconsign script to dump metadata info
  – Contrast ClearCase, which stores metadata from a custom file system
• Use the metadata for more sophisticated decisions
Example
Program('foo.c')
$ scons -Q
gcc -o foo.o -c foo.c
gcc -o foo foo.o
$ sconsign .sconsign
=== .:
foo: 8f72e133e001cb380a13bcb6a16fb16f None 1176861920 6762
        foo.o: e61afae6ccfe99a63b0b4c15f18422f6
foo.o: e61afae6ccfe99a63b0b4c15f18422f6 None 1176861920 1488
        foo.c: b489a8c34c318fc60c8dac54fd58b791
        foo.h: c864c870c5c6f984fca5b0ebd7361a7d
More readable output
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 8f72e133e001cb380a13bcb6a16fb16f
    csig: None
    timestamp: 1176861920
    size: 6762
    implicit:
        foo.o: e61afae6ccfe99a63b0b4c15f18422f6
foo.o:
    bsig: e61afae6ccfe99a63b0b4c15f18422f6
    csig: None
    timestamp: 1176861920
    size: 1488
    implicit:
        foo.c: b489a8c34c318fc60c8dac54fd58b791
        foo.h: c864c870c5c6f984fca5b0ebd7361a7d
Readable timestamps, too
$ sconsign --verbose --readable .sconsign
=== .:
foo:
    bsig: 8f72e133e001cb380a13bcb6a16fb16f
    csig: None
    timestamp: 'Tue Apr 17 19:15:50 2007'
    size: 6762
    implicit:
        foo.o: e61afae6ccfe99a63b0b4c15f18422f6
foo.o:
    bsig: e61afae6ccfe99a63b0b4c15f18422f6
    csig: None
    timestamp: 'Tue Apr 17 19:15:50 2007'
    size: 1488
    implicit:
        foo.c: b489a8c34c318fc60c8dac54fd58b791
        foo.h: c864c870c5c6f984fca5b0ebd7361a7d
[Diagram: dependency graph in which foo.o depends on foo.c and foo.h, and foo depends on foo.o]
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 8f72e133e001cb380a13bcb6a16fb16f
    csig: None
    timestamp: 1176861920
    size: 6762
    implicit:
        foo.o: e61afae6ccfe99a63b0b4c15…
foo.o:
    bsig: e61afae6ccfe99a63b0b4c15f18422f6
    csig: None
    timestamp: 1176861920
    size: 1488
    implicit:
        foo.c: b489a8c34c318fc60c8dac54…
        foo.h: c864c870c5c6f984fca5b0eb…
SourceSignatures('MD5')
Program('foo.c')
“Build signatures”
bsig(foo.o) = md5( sig(foo.c) + sig(foo.h) + sig(cmd_line) )
bsig(foo) = md5( sig(foo.o) + sig(cmd_line) )
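The two formulas can be sketched with hashlib as below. The exact way SCons concatenated and encoded the inputs is not given on the slide, so the string joining here is an assumption, and the names md5_hex and build_signature are hypothetical. The file contents and command lines are made-up sample inputs.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """MD5 of raw bytes as a hex string."""
    return hashlib.md5(data).hexdigest()

def build_signature(dep_sigs, cmd_line):
    """bsig = md5(sig(dep1) + sig(dep2) + ... + sig(cmd_line))."""
    h = hashlib.md5()
    for sig in dep_sigs:
        h.update(sig.encode())
    # The command line contributes its own signature (assumed: MD5 of its text).
    h.update(md5_hex(cmd_line.encode()).encode())
    return h.hexdigest()

# Sample source signatures (contents are illustrative only).
sig_c = md5_hex(b"int main(void) { return 0; }\n")
sig_h = md5_hex(b"#define FOO 1\n")

# bsig(foo.o) depends on foo.c, foo.h, and the compile command.
bsig_foo_o = build_signature([sig_c, sig_h], "gcc -o foo.o -c foo.c")
# bsig(foo) depends on foo.o's signature and the link command.
bsig_foo = build_signature([bsig_foo_o], "gcc -o foo foo.o")
```

Because bsig(foo) is computed from bsig(foo.o), a change to foo.h ripples up the whole chain, and reproducing any one signature requires every signature beneath it.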
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 8f72e133e001cb380a13bcb6a16fb16f
    csig: None
    timestamp: 1176861920
    size: 6762
    implicit:
        foo.o: e61afae6ccfe99a63b0b4c15…
foo.o:
    bsig: e61afae6ccfe99a63b0b4c15f18422f6
    csig: None
    timestamp: 1176861920
    size: 1488
    implicit:
        foo.c: b489a8c34c318fc60c8dac54…
        foo.h: c864c870c5c6f984fca5b0eb…
SourceSignatures('MD5')
TargetSignatures('content')
Program('foo.c')
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 27d34d21414965ce2f9cc5d8f3e3fbbb
    csig: None
    timestamp: 1176861920
    size: 6762
    implicit:
        foo.o: c7710864231f22aad3e8c1bc…
foo.o:
    bsig: e61afae6ccfe99a63b0b4c15f18422f6
    csig: None
    timestamp: 1176861920
    size: 1488
    implicit:
        foo.c: b489a8c34c318fc60c8dac54…
        foo.h: c864c870c5c6f984fca5b0eb…
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 1176915205
    csig: None
    timestamp: 1176915335
    size: 6762
    implicit:
        foo.o: 1176915205
foo.o:
    bsig: 1176915205
    csig: None
    timestamp: 1176915335
    size: 1488
    implicit:
        foo.c: 1176915205
        foo.h: 1176915196
SourceSignatures('timestamp')
Program('foo.c')
$ sconsign --verbose .sconsign
=== .:
foo:
    bsig: 1176915335
    csig: None
    timestamp: 1176915335
    size: 6762
    implicit:
        foo.o: 1176915335
foo.o:
    bsig: 1176915205
    csig: None
    timestamp: 1176915335
    size: 1488
    implicit:
        foo.c: 1176915205
        foo.h: 1176915196
SourceSignatures('timestamp')
TargetSignatures('content')
Program('foo.c')
Problems with how SCons does it now
• Can’t switch between content and timestamps
• Can’t mix content + timestamps in same config
  – Example: want to use content for all input files except one really big file where you want to use timestamps
• Stores information about last build decision, not just metadata about the current state of file
• Must have complete dependency graph to make the same signature as last time
  – Can’t build only part of the DAG
• Can’t use dependency output from tools
  – Example: gcc -MD output
New .sconsign format
$ sconsign .sconsign
=== .:
SConstruct: 059bf2bda6723d166c5dab7d54a0ca13 1176861882 17
foo: 5701724287c3d3847516781876f56d87 1176862330 6762
        foo.o: cc74a5b5cd4b174a59b58495cd2ef1f9 1176862330 1488
        c4245ece9e7108d276b3c8eb7662d921 [$LINK -o $TARGET ...]
foo.c: b489a8c34c318fc60c8dac54fd58b791 1176861903 55
foo.h: c864c870c5c6f984fca5b0ebd7361a7d 1176861911 19
foo.o: cc74a5b5cd4b174a59b58495cd2ef1f9 1176862330 1488
        foo.c: b489a8c34c318fc60c8dac54fd58b791 1176861903 55
        foo.h: c864c870c5c6f984fca5b0ebd7361a7d 1176861911 19
        d055c09cba5c626f5e38f2f17c29c6fa [$CC -o $TARGET -c...]
=== .:
SConstruct:
    csig: 059bf2bda6723d166c5dab7d54a0ca13
    timestamp: 1176861882
    size: 17
foo:
    csig: 5701724287c3d3847516781876f56d87
    timestamp: 1176862330
    size: 6762
    implicit:
        foo.o:
            csig: cc74a5b5cd4b174a59b58495cd2ef1f9
            timestamp: 1176862330
            size: 1488
    action: c4245ece9e7108d276b3c8eb7662d921 [$LINK -o $TARGET $LINKFLAGS $SOURCES ...]
foo.c:
    csig: b489a8c34c318fc60c8dac54fd58b791
    timestamp: 1176861903
    size: 55
foo.h:
    csig: c864c870c5c6f984fca5b0ebd7361a7d
    timestamp: 1176861911
    size: 19
foo.o:
    csig: cc74a5b5cd4b174a59b58495cd2ef1f9
    timestamp: 1176862330
    size: 1488
    implicit:
        foo.c:
            csig: b489a8c34c318fc60c8dac54fd58b791
            timestamp: 1176861903
            size: 55
        foo.h:
            csig: c864c870c5c6f984fca5b0ebd7361a7d
            timestamp: 1176861911
            size: 19
    action: d055c09cba5c626f5e38f2f17c29c6fa [$CC -o $TARGET -c $CFLAGS $CCFLAGS ...]
New .sconsign format
• Every file entry is consistent:
  foo.c: b489a8c34c318fc60c8dac54fd58b791 1176861903 55
  – Content signature (if read, None if not)
  – Timestamp
  – Length
  – Just stores file state at last time used
• Source files are explicitly stored
  – Can be used for caching checksums
• Actions and their signatures are explicitly stored
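Since every file entry has the same three fields, one line of the dump shown above maps onto a small record. The sketch below parses the text that sconsign prints, not the on-disk .sconsign format; FileEntry and parse_entry are hypothetical names.

```python
from typing import NamedTuple, Optional

class FileEntry(NamedTuple):
    name: str
    csig: Optional[str]  # None when the content was never read
    timestamp: int
    size: int

def parse_entry(line: str) -> FileEntry:
    """Parse a line such as 'foo.c: <csig> <timestamp> <size>'."""
    name, rest = line.split(":", 1)
    csig, timestamp, size = rest.split()
    return FileEntry(
        name=name.strip(),
        csig=None if csig == "None" else csig,
        timestamp=int(timestamp),
        size=int(size),
    )

entry = parse_entry("foo.c: b489a8c34c318fc60c8dac54fd58b791 1176861903 55")
```

The record describes only the state of the file when it was last used, with no trace of which decision was made about it.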
No build signatures!
• Up-to-date decision is done by comparing the current metadata of each input file with the information stored the last time the target was built
• Each decision can be independent:
  – Example: “Rebuild this target file if:”
    • Any input text file has different content than last time
    • Any input graphic file has a different timestamp than last time
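The example rule could look roughly like the following. This is a standalone sketch rather than SCons code: the stored metadata is a plain dictionary, graphic files are assumed to be recognised by extension, and every function name is hypothetical.

```python
import hashlib
import os

def content_changed(path, stored):
    """True if the file's MD5 differs from the stored content signature."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() != stored["csig"]

def timestamp_changed(path, stored):
    """True if the file's modification time differs from the stored one."""
    return int(os.path.getmtime(path)) != stored["timestamp"]

def choose_decider(path):
    # Assumption: graphic files are identified by extension.
    if path.endswith((".png", ".jpg", ".gif")):
        return timestamp_changed
    return content_changed

def needs_rebuild(stored_inputs):
    """stored_inputs maps each input path to its metadata from the last build."""
    return any(
        choose_decider(path)(path, stored)
        for path, stored in stored_inputs.items()
    )
```

Each input is judged on its own, using only its current state and its stored state, so no signature has to be recomputed across the dependency graph.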
Supporting Slides
• Build signature boils down the states of source + dependency files at the time the target was built
  – Not the complete state, just our signature calculation