Slide 1
Programming Languages and Systems
at Berkeley and Beyond
Past, Present, and Future
Kathy Yelick
Slide 2
The Questions
• Programming Languages and Systems (PL&S):
  – aka Languages:
    » this is too narrow (some of us don’t do much “language” research)
  – aka Software:
    » this is too broad (what doesn’t involve software?)
• Who are we?
• What do we do?
Slide 3
The Culture of PL&S
• The middle management of EECS
  – Blamed for
    » slow execution time
    » buggy software
    » low programmer productivity
    » languages that are too big, restrictive, ugly, etc.
  – Need to have control over
    » hardware complexity
    » programmer quality
    » consumers (features over robustness)
Slide 4
The Big Motivators
• Ease of Programming
  – Hardware costs -> 0
  – Software costs -> infinity
• Correctness
  – Increasing reliance on software increases cost of software errors (medical, financial, etc.)
• Performance
  – Increasing machine complexity
  – New languages and applications
    » Enabling Java; network packet filters
Slide 5
History of Programming Language Research
[Timeline figure spanning the 70s, 80s, 90s, and 2K; topics shown:]
Flop optimization
General-Purpose Language Design
Parsing Theory
Domain-Specific Language Design
Type Systems Theory
Memory Optimizations
Garbage Collection
Threads
Program Verification
Program Checking Tools
Data and Control Analysis
Type-Based Analysis
Slide 6
Topics
• Programming Language and Systems Research
  – Language Design
  – Compilers & Tools
  – Libraries & Runtime Systems
  – Software Engineering
• Berkeley Projects: Current and Future
  – BANE
  – Titanium
  – Proof Carrying Code
• Future Emphasis: Reliability
Slide 7
Language Design
• Economics of programming languages
  – Programming training is the dominant cost
    » implies languages are rarely replaced
  – Languages are adopted to fill a void
    » not because of language quality
• Is there anything left for PL designers?
  – Niche languages:
    » Everyone does language design, but doing it well is hard
  – Understanding languages:
    » E.g., Titanium’s type system is sound, Split-C’s is not
• Language design at Berkeley:
  – Lisp (Fateman), Ada (Hilfinger), Tioga (*), Titanium (*)
Slide 8
Compilers and Tools
• Economics of compilers
  – Large industrial teams built commercial compilers
• How can academia compete?
  – Focus on new algorithms and future problems
  – Need software infrastructure for experiments
    » from others (SUIF, gcc) or our own (Titanium, BANE)
• Compilers and Runtime Systems at Berkeley
  – Historical and continuing strength
    » Code gen, profiling (Graham), sw pipelining (Aiken)
    » Analysis and optimization of parallel code (Yelick)
    » Automatic (compile-time) memory management (Aiken)
    » Environments (Graham, Fateman)
Slide 9
Libraries
• Open problems in complex platforms/applications
  – Scientific libraries (overlaps with SciComp group)
  – Parallel and distributed machines
• Economics of Libraries
  – Market and competition are less intense
  – Can’t afford to hand-code for each machine
• Berkeley strength:
    » Load balancing (Graham, Yelick, and many others)
    » Data structures (Yelick), matrices (Demmel, Kahan, Yelick), meshes (Shewchuk)
    » High precision (Demmel, Fateman, Kahan, Shewchuk)
    » Symbolic (Fateman, Kahan)
  – New: tools to automate library construction
Slide 10
Software Engineering
• Economics of Software Engineering
  – Robust software is expensive
• Old approaches:
  – Formal: Verification, specification
  – Informal: Software process, patterns
• What Berkeley is doing:
    » Automatic analysis of large programs (Aiken)
    » Software fault isolation (Graham)
    » Proof Carrying Code (Necula)
    » Model checking (Henzinger, Brayton, S-V)
    » Experience (lots of large software construction projects)
• What’s missing?
  – “Core” Software Engineering
Slide 11
Projects: Titanium
• Problem: portable scientific computing
• The Approach
  – Domain-specific language and compiler:
    » Old applications: astrophysics, combustion
    » New applications in Bioengineering
      • modeling the cell to cure cancer (Arkin)
      • modeling bio-MEMS devices for treatment (Liepmann)
  – Language design
    » Dialect of Java with in-house compiler (to C)
    » Support for fast, safe multidimensional arrays
    » Types for distributed data, regions
  – Optimizations
    » Communication, memory, arrays, synchronization
Slide 12
Projects: BANE
• Problem: removing bugs from large programs
• The Approach
  – automatic analysis
  – discover small facts about big programs
  – Target: 1,000,000 line systems
• Examples:
  – Find relay races in RLL programs
    » RLL used in >50% of factories, at Disneyland, etc.
  – Prove C programs are Y2K ready
    » CVS 1.10 is OK, CVS 1.9 is not
  – Detect buffer overruns in security-critical code
Slide 13
Projects: Proof Carrying Code
• The Problem:
  – How can I trust code from another language, person, machine?
• The Approach:
  – programs carry a proof of what they promise
    » Semantic analog of digital signatures
    » Properties often from program analysis (e.g., types)
    » Passed through compilation by validating translations
  – client’s cheap trusted verifier checks the proof
• Applications
  – Very fast network packet filters
  – “Native code” in ML that is safe
  – Mobile code security
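
A minimal sketch of the "programs carry a proof, a cheap verifier checks it" idea, under heavy assumptions: the stack machine, the safety policy ("never pop an empty stack"), the certificate format (claimed stack depth before each instruction), and every name below are hypothetical and chosen only for illustration. This is not the Proof Carrying Code system itself. The point it tries to show is that the client checks the certificate in one linear pass and never has to run or re-analyze the untrusted code.

```python
# Toy stand-in for proof-carrying code. Everything here is hypothetical.
# opcode -> (values required on the stack, values left in their place)
EFFECT = {"push": (0, 1), "pop": (1, 0), "add": (2, 1), "jmp_if": (1, 0)}


def verify(program, depths):
    """Check the producer's certificate in one pass; never runs the program.

    `program` is a list of tuples such as ("push", 1) or ("jmp_if", 0).
    `depths[i]` is the claimed stack depth before instruction i, and
    `depths[len(program)]` is the claimed depth at exit.
    """
    if len(depths) != len(program) + 1 or depths[0] != 0:
        return False
    for pc, instr in enumerate(program):
        effect = EFFECT.get(instr[0])
        if effect is None:
            return False  # unknown opcode: reject
        need, give = effect
        if depths[pc] < need:
            return False  # instruction could pop an empty stack
        after = depths[pc] - need + give
        if depths[pc + 1] != after:
            return False  # certificate disagrees with fall-through
        if instr[0] == "jmp_if":
            target = instr[1]
            if not 0 <= target <= len(program):
                return False  # jump leaves the program
            if depths[target] != after:
                return False  # certificate disagrees at the jump target
    return True


# A small loop: the jump back to instruction 1 is accepted because the
# certificate claims the same depth there on both paths.
loop = [("push", 1), ("push", 1), ("jmp_if", 1), ("pop",)]
loop_certificate = [0, 1, 2, 1, 0]
accepted = verify(loop, loop_certificate)
```

With jumps in the program, finding a consistent set of depths would take an analysis on the producer's side, while checking a supplied set stays a single scan. That asymmetry is the part of the slide this sketch is meant to illustrate.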
Slide 14
Reliable Computing (Future)
• Problem: build more reliable systems
• Approaches:
  – Build from reliable components
    » Better languages for system design (H*)
    » Better environments for particular domains (F,G)
    » Build semantic models of system behavior (A,H,N)
  – Build reliable systems from unreliable components by spending cheap hardware resources (H,K,P,Y)
    » Introspection of network, disks, processor, software
    » Use statistical models to determine normal/abnormal
    » Fault tolerant, self-scrubbing data structures
    » Redundant computation: catch transient errors
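
The "redundant computation" bullet can be sketched as follows, assuming the simplest possible scheme: run the same computation several times and take a majority vote. The function names and the fault-injection helper are hypothetical, and the slides do not say which redundancy scheme was intended.

```python
from collections import Counter


def vote(compute, x, copies=3):
    """Run `compute(x)` `copies` times and return (majority value, fault seen).

    Results must be hashable. Raises RuntimeError when no value wins a
    strict majority, since the answer cannot then be trusted.
    """
    results = [compute(x) for _ in range(copies)]
    value, count = Counter(results).most_common(1)[0]
    if count <= copies // 2:
        raise RuntimeError("no majority; results disagree")
    return value, count < copies


def make_flaky_square(fault_on_call):
    """Hypothetical test helper: squares x, but is off by one on one call."""
    calls = {"n": 0}

    def square(x):
        calls["n"] += 1
        return x * x + (1 if calls["n"] == fault_on_call else 0)

    return square


# One of the three runs is corrupted; the vote masks it and reports it.
answer, fault_seen = vote(make_flaky_square(fault_on_call=2), 7)
```

This spends extra cycles (the "cheap hardware resources" of the slide) to turn a transient error into a detected and corrected one. It does nothing for a deterministic bug, which every copy reproduces.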
Slide 15
Summary of PL&S at Cal
• Good coverage in core language and compiler work
  – People move with opportunities
  – Traditional boundaries becoming blurred
• Strength in analysis
  – Semantics with practical applications
• Strength in collaborative work
  – Systems: Culler, Kubiatowicz, Patterson
  – Scientific computing: inside and outside department
• Areas that are not well represented
  – Core Software Engineering
  – Logic
Slide 16
Faculty
• Alex Aiken
• Richard Fateman
• Susan Graham
• Mike Harrison
• Tom Henzinger
• Paul Hilfinger
• George Necula
• Kathy Yelick
Slide 17
Long Term
• Language research can be loooong term
  – e.g., garbage collection
Regions
Type Inference
Pi Calculus
Partial Evaluation
Mobile Ambients
Monads
Proof Carrying Code
Set-Based Analysis
Continuations
Software Fault Isolation
Slide 18
Executive Summary
• Anything related to programming
  – How do we know it does what we think it does?
• A mix of
  – theory
  – systems
  – human factors
Slide 19
Language Design: History
• 70s & 80s:
  – Design better general purpose languages
    » pure functional, object-oriented, logic…
    » Lisp (Fateman), Ada (Hilfinger)
• 90s & 2Ks:
  – Domain-specific languages
    » Tioga (Stonebraker, Hellerstein, Aiken)
    » Titanium (Graham, Yelick, Hilfinger, Aiken)
  – Understanding semantics: type soundness, etc.
    » Titanium’s pointer types are sound (Split-C’s are not)
• Good language design is hard
• Almost everyone does it
Slide 20
Language Technology without Languages
• Increasing connections to other areas of CS
  – transfer of PL ideas to non-language tools
  – avoids language adoption problems
  – foundational ideas are portable
• High-performance thread systems
  – based on CPS conversion
• Low overhead virtual machines
  – uses software fault isolation
• More to come . . .
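
A rough sketch of why CPS conversion helps build thread systems, assuming a hand-converted "thread" and a round-robin scheduler. All names are hypothetical, and this is not the thread system the slide refers to. In continuation-passing style, each step of a thread returns the rest of its work as a closure instead of blocking, so a plain loop can interleave many threads without a stack per thread.

```python
from collections import deque


def count_to(name, n, log, i=1):
    """One step of a 'thread' written by hand in continuation-passing style."""
    if i > n:
        return None  # nothing left to do: the thread has finished
    log.append(f"{name}{i}")
    # Yield by returning the continuation (the rest of the computation).
    return lambda: count_to(name, n, log, i + 1)


def run(threads):
    """Round-robin scheduler: run one step of each ready thread in turn."""
    ready = deque(threads)
    while ready:
        step = ready.popleft()
        continuation = step()
        if continuation is not None:
            ready.append(continuation)


log = []
run([lambda: count_to("a", 2, log), lambda: count_to("b", 3, log)])
# log is now ["a1", "b1", "a2", "b2", "b3"]
```

A compiler that performs CPS conversion would generate the closure-returning form automatically. Here it is written by hand to keep the example short.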
Slide 21
Interests and Collaborations
Programming Language Design
Compilers
Systems
Software Engineering
Logic
Semantics