Facebook 2010 (confidential)

HipHop Compiler for PHP

Transforming PHP into C++

HipHop Compiler Team, Facebook, Inc.

May 2010

PHP is easy to read

<?php

function tally($count) {
  $sum = 0;
  for ($i = 0; $i < $count; ++$i) {
    $sum += $i;
  }
  return $sum;
}

print tally(10) . "\n";

PHP syntax is similar to C++/Java

<?php

class Tool extends Object {
  public $name;

  public function use($target) {}
}

$tool = new Tool();
$tool->name = 'hammer';
$tool->use($nail);

PHP Statements and Expressions

ExpressionList, AssignmentExpression, SimpleVariable, DynamicVariable, StaticMemberExpression, ArrayElementExpression, DynamicFunctionCall, SimpleFunctionCall, ScalarExpression, ObjectPropertyExpression, ObjectMethodExpression, ListAssignment, NewObjectExpression, UnaryOpExpression, IncludeExpression, BinaryOpExpression, QOpExpression, ArrayPairExpression, ClassConstantExpression, ParameterExpression, ModifierExpression, ConstantExpression, EncapsListExpression,

FunctionStatement, ClassStatement, InterfaceStatement, ClassVariable, ClassConstant, MethodStatement, StatementList, BlockStatement, IfBranchStatement, IfStatement, WhileStatement, DoStatement, ForStatement, SwitchStatement, CaseStatement, BreakStatement, ContinueStatement, ReturnStatement, GlobalStatement, StaticStatement, EchoStatement, UnsetStatement, ExpStatement, ForEachStatement, CatchStatement, TryStatement, ThrowStatement,

PHP is weakly typed

<?php

$a = 12345;
$a = "hello";
$a = array(12345, "hello", array());
$a = new Object();

$c = $a + $b; // integer or array
$c = $a . $b; // implicit casting to strings

Core PHP library is small

- Most are in functional style
- ~200 to 500 basic functions

<?php

$len = strlen("hello");  // C library
$ret = curl_exec($curl); // open source

PHP is easy to debug

<?php

function tally($count) {
  $sum = 0;
  for ($i = 0; $i < $count; ++$i) {
    $sum += $i;
    var_dump($sum);
  }
  return $sum;
}

PHP is easy to learn

- easy to read
- easy to write
- easy to debug

Hello, World!

PHP is slow

[Bar chart labeled "CPU": C++, Java, C#, Erlang, Python, Perl, PHP; axis from 0 to 40]

Source: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all

Why is Zend Engine slow?

- Byte-code interpreter
- Dynamic symbol lookups
  - functions, variables, constants
  - class methods, properties, constants
- Weak typing
  - zval
  - array()

Transforming PHP into C++

- g++ is a native code compiler
- Static binding
  - functions, variables, constants
  - class methods, properties, constants
- Type inference
  - integers, strings, arrays, objects, variants
  - struct, vector, map, array

Static Binding – Function Calls

<?php
$ret = foo($a);

// C++
Variant v_ret;
Variant v_a;

v_ret = f_foo(v_a);

Dynamic Function Calls

<?php
$func = 'foo';
$ret = $func($a);

// C++
Variant v_ret;
Variant v_a;
String v_func;

v_func = "foo";
v_ret = invoke(v_func, CREATE_VECTOR1(v_a));

Function Invoke Table

Variant invoke(CStrRef func, CArrRef params) {
  int64 hash = hash_string(func);
  switch (hash) {
    case 1234:
      // The string compare guards against hash collisions.
      if (func == "foo") return f_foo(params[0]);
      break;
  }
  throw FatalError("function not found");
}
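The hash-then-compare dispatch on this slide can be tried out in standalone C++. The sketch below is an illustration under stated assumptions, not HipHop's generated code: plain 64-bit integers stand in for Variant, f_foo and f_bar are hypothetical compiled functions, and FNV-1a is used as the hash only because it can be evaluated at compile time to produce the case labels.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical compiled PHP functions (assumed for illustration).
static std::int64_t f_foo(std::int64_t a) { return a + 1; }
static std::int64_t f_bar(std::int64_t a) { return a * 2; }

// FNV-1a hash, written as a C++11 constexpr so it can label switch cases.
// The choice of hash is an assumption; the slides do not name one.
constexpr std::uint64_t hash_string(const char* s,
                                    std::uint64_t h = 14695981039346656037ULL) {
  return *s ? hash_string(s + 1,
                          (h ^ static_cast<unsigned char>(*s)) * 1099511628211ULL)
            : h;
}

// Dispatch by name: switch on the hash, then confirm with a string compare
// so that a colliding name cannot reach the wrong function.
std::int64_t invoke(const std::string& func,
                    const std::vector<std::int64_t>& params) {
  switch (hash_string(func.c_str())) {
    case hash_string("foo"):
      if (func == "foo") return f_foo(params.at(0));
      break;
    case hash_string("bar"):
      if (func == "bar") return f_bar(params.at(0));
      break;
  }
  throw std::runtime_error("function not found: " + func);
}
```

Calling invoke("foo", {41}) reaches f_foo through the table, while an unknown name falls out of the switch and throws, mirroring the FatalError on the slide.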

Re-declared Functions

<?php
if ($condition) {
  function foo($a) { return $a + 1; }
} else {
  function foo($a) { return $a + 2; }
}
$ret = foo($a);

// C++
if (v_condition) {
  g->i_foo = i_foo$$0;
} else {
  g->i_foo = i_foo$$1;
}
v_ret = g->i_foo(v_a);
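The re-declaration scheme amounts to a function-pointer slot that is filled in when the declaring branch runs. A minimal sketch, assuming integer arguments instead of Variant and using the hypothetical names foo_v0 and foo_v1 for the two compiled bodies (the slide's i_foo$$0 and i_foo$$1):

```cpp
#include <cstdint>

// Two compiled bodies for the same PHP function name (hypothetical names).
static std::int64_t foo_v0(std::int64_t a) { return a + 1; }
static std::int64_t foo_v1(std::int64_t a) { return a + 2; }

// Stand-in for the globals object: one pointer slot per re-declared function.
struct Globals {
  std::int64_t (*i_foo)(std::int64_t) = nullptr;
};

// Mirrors the slide: the branch taken decides which body the slot points to,
// and every later call goes through the slot.
std::int64_t run(Globals& g, bool condition, std::int64_t a) {
  g.i_foo = condition ? foo_v0 : foo_v1;
  return g.i_foo(a);
}
```

The cost of a call through the slot is one indirect jump, which is still far cheaper than a name lookup at every call site.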

Volatile Functions

<?php
if (!function_exists('foo')) {
  bar($a);
} else {
  foo($a);
}
function foo($a) {}

// C++
if (!f_function_exists("foo")) {
  f_bar(v_a);
} else {
  f_foo(v_a);
}
g->declareFunction("foo");

Static Binding – Variables

<?php
$foo = 'hello';
function foo($a) {
  global $foo;
  $bar = $foo . $a;
  return $bar;
}

// C++
String f_foo(CStrRef v_a) {
  Variant &gv_foo = g->GV(foo);
  String v_bar;
  v_bar = concat(toString(gv_foo), v_a);
  return v_bar;
}

GlobalVariables Class

class GlobalVariables : public SystemGlobals {
public:
  // Direct Global Variables
  Variant gv_foo;

  // Indirect Global Variables for large compilation
  // (alternative to the direct form above)
  enum _gv_enums {
    gv_foo,
  };
  Variant gv[1];
};

Dynamic Variables

<?php
function foo() {
  $b = 10;
  $a = 'b';
  echo($$a);
}

// C++
void f_foo() {
  int64 v_b = 10;
  String v_a = "b";

  class VariableTable : public RVariableTable {
  public:
    int64 &v_b;
    String &v_a;
    VariableTable(int64 &r_b, String &r_a) : v_b(r_b), v_a(r_a) {}
    virtual Variant getImpl(const char *s) {
      // hash – switch – strcmp
    }
  } variableTable(v_b, v_a);

  echo(variableTable.get("b"));
}
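The variable table keeps locals as ordinary typed C++ variables and only adds a by-name view over them for the dynamic access. A minimal standalone sketch, assuming that values are returned as strings instead of Variant and that a plain strcmp chain replaces the hash-and-switch step:

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// By-name view over a function's locals. The locals stay typed; the table
// only holds references to them.
struct VariableTable {
  std::int64_t& v_b;
  std::string& v_a;

  // Simplification (assumption): every value is rendered as a string.
  std::string get(const char* name) const {
    if (std::strcmp(name, "b") == 0) return std::to_string(v_b);
    if (std::strcmp(name, "a") == 0) return v_a;
    return "";  // PHP would treat an unknown variable as null
  }
};

// Equivalent of: $b = 10; $a = 'b'; return $$a;
std::string f_foo() {
  std::int64_t v_b = 10;
  std::string v_a = "b";
  VariableTable table{v_b, v_a};
  return table.get(v_a.c_str());
}
```

Because the table stores references, code that never uses a dynamic variable still reads and writes v_b directly as an integer.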

Static Binding – Constants

<?php
define('FOO', 'hello');
echo FOO;

// C++
echo("hello" /* FOO */);

Dynamic Constants

<?php
if ($condition) {
  define('FOO', 'hello');
} else {
  define('FOO', 'world');
}
echo FOO;

// C++
if (v_condition) {
  g->declareConstant("FOO", g->k_FOO, "hello");
} else {
  g->declareConstant("FOO", g->k_FOO, "world");
}
echo(toString(g->k_FOO));

Static Binding with Classes

Class methods

Class properties

Class constants

Re-declared classes

Deriving from re-declared classes

Volatile classes

Summary - Dynamic Symbol Lookup

Problem is nicely solved

Rule of 90-10

Dynamic binding is a general form of static binding

Generated code is a super-set of static binding and dynamic binding

Problem 2. Weak Typing

Type Inference

Runtime Type Info (RTTI)-Guided Optimization

Type Hints

Strongly Typed Collection Classes

Type Coercions

Type Inference Example

<?php
$a = 10;
$a = 'string';

// C++
Variant v_a;

Why is strong typing faster?

$a = $b + $c;

if (is_integer($b) && is_integer($c)) {
  $a = (int)$b + (int)$c;
} else if (is_array($b) && is_array($c)) {
  $a = (array)$b + (array)$c; // array union
} else {
  …
}

int64 v_a = v_b + v_c;
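The difference can be made concrete by writing both additions in C++. In the sketch below, the Variant struct is a deliberately tiny stand-in (an assumption, covering only integers and doubles) and is not HipHop's Variant class; add_variant must inspect type tags on every call, while add_int compiles to a single machine add.

```cpp
#include <cstdint>

// Minimal tagged value: a type tag plus a payload (illustrative only).
struct Variant {
  enum class Kind { Int, Double };
  Kind kind;
  union {
    std::int64_t i;
    double d;
  };
};

Variant make_int(std::int64_t v) {
  Variant r;
  r.kind = Variant::Kind::Int;
  r.i = v;
  return r;
}

Variant make_double(double v) {
  Variant r;
  r.kind = Variant::Kind::Double;
  r.d = v;
  return r;
}

double to_double(const Variant& v) {
  return v.kind == Variant::Kind::Int ? static_cast<double>(v.i) : v.d;
}

// Untyped path: branch on both tags before doing any arithmetic.
Variant add_variant(const Variant& b, const Variant& c) {
  if (b.kind == Variant::Kind::Int && c.kind == Variant::Kind::Int) {
    return make_int(b.i + c.i);
  }
  return make_double(to_double(b) + to_double(c));
}

// Typed path: what type inference allows when both operands are integers.
std::int64_t add_int(std::int64_t b, std::int64_t c) { return b + c; }
```

A real Variant also has to cover strings, arrays, and objects, so the untyped path grows more branches while the typed path stays the same.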

Type Inference Blockers

<?php
function foo() {
  if ($success) return 10; // integer
  return false; // d'oh
}

$arr[$a] = 10; // d'oh

++$a; // $a can be a string actually!

$a = $a + 1; // $a can become a double, ouch!
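The last blocker follows from PHP's rule that integer arithmetic which overflows yields a float. A sketch of what a faithful integer add therefore has to do, assuming a hypothetical Number result type and helper name php_add (neither comes from the slides):

```cpp
#include <cstdint>
#include <limits>

// Result of a PHP-style add: either an integer or a double.
struct Number {
  bool is_double;
  std::int64_t i;
  double d;
};

// Adds two integers the way PHP does: stay integer when the sum fits,
// switch to double when it would overflow.
Number php_add(std::int64_t a, std::int64_t b) {
  const std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
  const std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
  // Check before adding, since signed overflow is undefined in C++.
  const bool overflows = (b > 0 && a > kMax - b) || (b < 0 && a < kMin - b);
  if (overflows) {
    return Number{true, 0, static_cast<double>(a) + static_cast<double>(b)};
  }
  return Number{false, a + b, 0.0};
}
```

Unless the compiler can prove the sum stays in range, the result of `$a + 1` cannot be declared as a plain int64, which is why the slide counts it as a blocker.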

RTTI-Guided Optimization

<?php
function foo($x) {
  ...
}

foo(10);
foo('test');

// C++
void foo(Variant x) {
  ...
}

Type Specialization Method 1

template<typename T>
void foo(T x) {
  // generate code with generic T (tough!)
}

- Pros: smaller generated code
- Cons: no type propagation

Type Specialization Method 2

void foo(int64 x) {
  // generate code assuming x is integer
}
void foo(Variant x) {
  // generate code assuming x is variant
}

- Pros: type propagation
- Cons: variant case is not optimized

Type Specialization Method 3

void foo(int64 x) {
  // generate code assuming x is integer
}
void foo(Variant x) {
  if (is_integer(x)) {
    foo(x.toInt64());
    return;
  }
  // generate code assuming x is variant
}

- Pros: optimized for integer case
- Cons: large code size
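Method 3 can be exercised with ordinary C++ overloading. The sketch below is only an illustration: the Variant struct is a two-case stand-in (an assumption), and the function bodies return a label so that the path taken is observable.

```cpp
#include <cstdint>
#include <string>

// Minimal stand-in for a dynamically typed value (illustrative only).
struct Variant {
  bool is_int;
  std::int64_t i;
  std::string s;
};

// Specialized body: generated under the assumption that x is an integer.
std::string foo(std::int64_t x) { return "int:" + std::to_string(x + 1); }

// Generic body: checks the runtime type first and forwards integers to the
// specialized overload, so untyped callers still get the fast path.
std::string foo(const Variant& x) {
  if (x.is_int) return foo(x.i);
  return "variant:" + x.s;
}
```

Call sites where the argument is known to be an integer bind to the first overload at compile time; all others pay one type check and then share the same specialized code. The price is that every such function is emitted twice.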

Type Hints

<?php
function foo(int $a) {
  string $b;
}

class bar {
  public array $c;
}

bar $d;

Strongly Typed Collection Classes

- That omnipotent "array" in PHP
- Swapping out underlying implementation:
  - Array escalation
  - PHP classes:
    - Vector
    - Set
    - Map: un-ordered
    - Then Array: ordered map
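One reading of "array escalation" is a container that starts with the cheapest representation and switches to a more general one only when the usage requires it. The class below sketches that reading; it is an assumption, since the slides give no implementation, and EscalatingArray is a hypothetical name. It also simplifies in one important way: std::map orders entries by key, whereas a PHP array preserves insertion order.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Starts as a dense vector and escalates to a map on the first write that
// a vector cannot represent (a negative key or a key that leaves a gap).
class EscalatingArray {
 public:
  void set(std::int64_t key, std::int64_t value) {
    if (!escalated_) {
      const std::size_t n = vec_.size();
      if (key >= 0 && static_cast<std::uint64_t>(key) < n) {
        vec_[static_cast<std::size_t>(key)] = value;  // overwrite in place
        return;
      }
      if (key >= 0 && static_cast<std::uint64_t>(key) == n) {
        vec_.push_back(value);  // append keeps the keys dense
        return;
      }
      escalate();
    }
    map_[key] = value;
  }

  // Returns false when the key is absent; otherwise stores the value in out.
  bool get(std::int64_t key, std::int64_t& out) const {
    if (!escalated_) {
      if (key < 0 || static_cast<std::uint64_t>(key) >= vec_.size()) return false;
      out = vec_[static_cast<std::size_t>(key)];
      return true;
    }
    const auto it = map_.find(key);
    if (it == map_.end()) return false;
    out = it->second;
    return true;
  }

  bool is_vector() const { return !escalated_; }

 private:
  // Copies the dense elements into the map and drops the vector storage.
  void escalate() {
    for (std::size_t i = 0; i < vec_.size(); ++i) {
      map_[static_cast<std::int64_t>(i)] = vec_[i];
    }
    vec_.clear();
    escalated_ = true;
  }

  std::vector<std::int64_t> vec_;
  std::map<std::int64_t, std::int64_t> map_;
  bool escalated_ = false;
};
```

Code that only appends, which is the common case for list-like arrays, never leaves the vector representation and never pays for hashing or tree lookups.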

Compiler Friendly Scripting Language

If all problems described here are considered when designing a new scripting language, will it run faster than Java?
