Hey, where'd my money go? Building Airbnb's Financial Data Pipeline in Spark
Mike Lewis, Jiang-Ming Yang
191 Countries, 72 Currencies, 29 Languages
Receipts
Payables
VAT
Revenue
TOT
Financial Data
Alexandria
Ruby on Rails dashboard app
Nightly Cron job
Dynamically create MySQL queries to ETL data
The good
Served us faithfully from 2012-2015
SQL queries are very easy to hand-test and share
Simple data can be handled with simple queries but...
...data doesn't stay simple forever
The bad
Each product is handled in a different way
Performance did not scale as data grew
Difficult to refactor complex SQL
Ever maintained a 1,000 line SQL query?
We needed:
An actual programming language
A pipelined architecture
Distributed computing capabilities
Goals
• Handle all products using a common flow (e.g., business trips or cleaning)
• Infer events from data or consume them in real time from production
• Easily queryable subledger output
Braavos
Home of the Iron Bank
— Game of Thrones
Our next generation event-based financial data processing system
Sub-ledger entries
• General accounts: receivable / payable / revenue / tax / etc.
• Debit/credit operation against a given account.
• Double-entry accounting rule
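The double-entry rule above can be illustrated with a minimal Python sketch. The `SubledgerEntry` type, its field names, and `is_balanced` are hypothetical names chosen for illustration; they are not taken from the Braavos code base, which the talk says is written in Scala on Spark.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SubledgerEntry:
    """One debit or credit against a general account (hypothetical shape)."""
    account: str      # e.g. "Receivable", "Payable", "Revenue", "Tax"
    operation: str    # "Debit" or "Credit"
    amount: Decimal


def is_balanced(entries):
    """Double-entry rule: total debits must equal total credits."""
    debits = sum((e.amount for e in entries if e.operation == "Debit"), Decimal(0))
    credits = sum((e.amount for e in entries if e.operation == "Credit"), Decimal(0))
    return debits == credits


# Illustrative data only: one debit offset by one credit of the same amount.
entries = [
    SubledgerEntry("Receivable", "Debit", Decimal("50")),
    SubledgerEntry("Revenue", "Credit", Decimal("50")),
]
```

A group of entries produced for a single event should always pass `is_balanced`; a lone debit or credit should not.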
Sub-ledger entries
• A reservation for $100 comes in
• Platform events generated
• Debit/credit to appropriate subledgers

1. Book
                      Debit   Credit
Receivable (Guest)      100
Deferred Revenue                  10
Deferred Payable                  90

2. Payment
• Payment call made to processor
• Payment broken into payment events
• Appropriate subledger entries made
                      Debit   Credit
Cash                    100
Receivable (Guest)               100

3. Revenue Recognition
• Finance policy determines revenue recognition date
• Events generated to debit/credit appropriate subledgers
                      Debit   Credit
Deferred Revenue         10
Revenue (Guest)                   10
Deferred Payable         90
Payable (Host)                    90
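The $100 reservation walkthrough above can be traced end to end with a small Python sketch. The function names (`book`, `pay`, `recognize`, `net_balances`) are hypothetical, and the debit/credit side of each amount follows standard double-entry convention, which is an assumption since the transcript lists the amounts without their columns.

```python
from collections import defaultdict


def book(total, fee):
    """Step 1: reservation booked; the guest owes `total`."""
    payout = total - fee
    return [
        ("Receivable (Guest)", "Debit", total),
        ("Deferred Revenue", "Credit", fee),
        ("Deferred Payable", "Credit", payout),
    ]


def pay(total):
    """Step 2: payment received; cash replaces the receivable."""
    return [
        ("Cash", "Debit", total),
        ("Receivable (Guest)", "Credit", total),
    ]


def recognize(total, fee):
    """Step 3: revenue recognized; deferred amounts move to final accounts."""
    payout = total - fee
    return [
        ("Deferred Revenue", "Debit", fee),
        ("Revenue (Guest)", "Credit", fee),
        ("Deferred Payable", "Debit", payout),
        ("Payable (Host)", "Credit", payout),
    ]


def net_balances(entries):
    """Net balance per account, counting debits as positive and credits as negative."""
    balances = defaultdict(int)
    for account, operation, amount in entries:
        balances[account] += amount if operation == "Debit" else -amount
    return dict(balances)


ledger = book(100, 10) + pay(100) + recognize(100, 10)
balances = net_balances(ledger)
```

After all three steps, the receivable and both deferred accounts net to zero, leaving $100 in cash offset by $10 of revenue and $90 payable to the host.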
Braavos pipeline
Event generation: Normalized platform events, normalized payment events
Accounting entries: Booking entries based on financial policy
Reports: Simpler summary queries on subledgers
Platform Events
Product Type
Product Id
Datetime
Guest Info
Host Info
Funding Sources
Itemized Pricing
Taxes (currency / amount / remittance currency)
Payment Events
Product Type
Product Id
Datetime
Transaction Id
Payment Info
Currency
Amount
Effective currency rate
Reconciled rate
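One way to picture the two normalized event shapes is as plain record types. The sketch below is in Python and mirrors the field names on the slides; the Python types, the `Tax` helper, and the use of dictionaries for the nested "info" fields are assumptions, since the talk does not show the actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Tax:
    currency: str
    amount: Decimal
    remittance_currency: str


@dataclass(frozen=True)
class PlatformEvent:
    product_type: str
    product_id: str
    occurred_at: datetime                    # "Datetime" on the slide
    guest_info: Dict[str, str]
    host_info: Dict[str, str]
    funding_sources: List[str]
    itemized_pricing: Dict[str, Decimal]
    taxes: List[Tax] = field(default_factory=list)


@dataclass(frozen=True)
class PaymentEvent:
    product_type: str
    product_id: str
    occurred_at: datetime                    # "Datetime" on the slide
    transaction_id: str
    payment_info: Dict[str, str]
    currency: str
    amount: Decimal
    effective_currency_rate: Decimal
    reconciled_rate: Optional[Decimal] = None   # assumed unknown until reconciliation


def same_product(platform_event, payment_event):
    """Both event types carry product type and id, so they can be matched on that pair."""
    return (platform_event.product_type, platform_event.product_id) == (
        payment_event.product_type,
        payment_event.product_id,
    )
```

Sharing `product_type` and `product_id` across both shapes is what would let every product follow a common flow, as the goals call for.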
Built in Spark / Scala
The finance system is an offline component, so we don't need to worry about latency.
Developers can focus on the business logic without worrying about scalability.
Performance (throughput): 7M events / min
--conf spark.default.parallelism=200 \
--num-executors 50 \
--executor-cores 8 \
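As a rough sanity check on the quoted throughput, the arithmetic below spreads 7M events per minute across the listed executors. It assumes every core was busy for the whole run, which the slide does not state, so the per-core figure is only an order-of-magnitude estimate.

```python
# Figures taken from the slide.
EVENTS_PER_MINUTE = 7_000_000
NUM_EXECUTORS = 50
EXECUTOR_CORES = 8

# Derived values (assumes full utilization of every core).
total_cores = NUM_EXECUTORS * EXECUTOR_CORES
events_per_second = EVENTS_PER_MINUTE / 60
events_per_core_per_second = events_per_second / total_cores

print(f"{total_cores} cores, about {events_per_core_per_second:.0f} events/s per core")
```

Under that assumption the job runs on 400 cores at roughly 290 events per second per core.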
Reports
Guest Receivable
SELECT sum(IF(Operation = 'Debit', amount, -amount))
FROM subledger_entries
WHERE account = 'ReceivableAccount'
  AND meta['Guest'] IS NOT NULL
  AND event_date < '2015-01-01';

Future Host Payout
SELECT sum(IF(Operation = 'Credit', amount, -amount))
FROM subledger_entries
WHERE account = 'PayableAccount'
  AND meta['Host'] IS NOT NULL
  AND event_date < '2015-01-01';

Deferred Revenue
SELECT sum(IF(Operation = 'Credit', amount, -amount))
FROM subledger_entries
WHERE account = 'DeferredRevenueAccount'
  AND event_date < '2015-01-01';
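All three report queries share one pattern: filter by account, optional metadata key, and cutoff date, then sum amounts with a sign that depends on the operation. A minimal Python sketch of that pattern follows; `signed_balance` and the sample rows are hypothetical and only mirror the column names used in the queries.

```python
from datetime import date


def signed_balance(entries, account, positive_operation, meta_key=None,
                   before=date(2015, 1, 1)):
    """Sum amounts for one account, adding `positive_operation` rows and
    subtracting all others, restricted to rows dated before `before`."""
    total = 0
    for e in entries:
        if e["account"] != account:
            continue
        if meta_key is not None and e["meta"].get(meta_key) is None:
            continue
        if not e["event_date"] < before:
            continue
        total += e["amount"] if e["operation"] == positive_operation else -e["amount"]
    return total


# Illustrative rows only; not real data.
rows = [
    {"account": "ReceivableAccount", "operation": "Debit", "amount": 100,
     "meta": {"Guest": "g1"}, "event_date": date(2014, 12, 1)},
    {"account": "ReceivableAccount", "operation": "Credit", "amount": 40,
     "meta": {"Guest": "g1"}, "event_date": date(2014, 12, 15)},
    {"account": "ReceivableAccount", "operation": "Debit", "amount": 70,
     "meta": {"Guest": "g2"}, "event_date": date(2015, 2, 1)},   # after the cutoff
    {"account": "PayableAccount", "operation": "Credit", "amount": 90,
     "meta": {"Host": "h1"}, "event_date": date(2014, 12, 20)},
    {"account": "DeferredRevenueAccount", "operation": "Credit", "amount": 10,
     "meta": {}, "event_date": date(2014, 12, 1)},
    {"account": "DeferredRevenueAccount", "operation": "Debit", "amount": 4,
     "meta": {}, "event_date": date(2014, 12, 31)},
]

guest_receivable = signed_balance(rows, "ReceivableAccount", "Debit", meta_key="Guest")
future_host_payout = signed_balance(rows, "PayableAccount", "Credit", meta_key="Host")
deferred_revenue = signed_balance(rows, "DeferredRevenueAccount", "Credit")
```

Receivables are debit-positive while payables and deferred revenue are credit-positive, which is why the queries flip the sign on different operations.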
Migration Process
1. Generate all platform events and payment events from the existing database account audit records.
2. Build reports on Braavos that match the existing reports.
3. Change the upstream components to generate real events, and compare the output with the existing results.
4. Switch to using the real upstream events.
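Steps 2 and 3 both come down to comparing two sets of report figures and surfacing every mismatch. A minimal Python sketch of such a comparison is below; `diff_reports` and the report keys are hypothetical, and a real comparison would likely also need per-currency rounding tolerances.

```python
def diff_reports(legacy, braavos):
    """Return {key: (legacy_value, braavos_value)} for every key whose values
    differ, including keys present in only one of the two reports."""
    mismatches = {}
    for key in sorted(set(legacy) | set(braavos)):
        old, new = legacy.get(key), braavos.get(key)
        if old != new:
            mismatches[key] = (old, new)
    return mismatches


# Illustrative figures only.
legacy_report = {"guest_receivable": 60, "future_host_payout": 90, "deferred_revenue": 6}
braavos_report = {"guest_receivable": 60, "future_host_payout": 85, "vat": 3}

mismatches = diff_reports(legacy_report, braavos_report)
```

An empty result means the new pipeline reproduces the existing reports for the compared period.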
Intercompany report
Airbnb transactions involve four entities (Airbnb Inc., Airbnb Payment Inc., Airbnb UK, and Airbnb Ireland), and the number is increasing.
Guests and hosts may belong to different entities, and their payment entities may differ as well.
We need to report intercompany money movement across entities.
[Figure: intercompany money movement among Entity A, Entity B, Entity C, and Entity D, with inbound and outbound flows. Operations shown: Debit $150, Credit $200, Credit $120, Debit $170.]
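The intercompany report can be pictured as totalling inbound and outbound amounts per entity. The Python sketch below uses hypothetical movements between three placeholder entities; it does not reproduce the figure's numbers, because the transcript does not show which operation belongs to which entity.

```python
from collections import defaultdict


def intercompany_totals(movements):
    """Given (from_entity, to_entity, amount) tuples, return per-entity
    inbound, outbound, and net (inbound minus outbound) totals."""
    inbound = defaultdict(int)
    outbound = defaultdict(int)
    for source, target, amount in movements:
        outbound[source] += amount
        inbound[target] += amount
    entities = sorted(set(inbound) | set(outbound))
    return {
        e: {"inbound": inbound[e], "outbound": outbound[e],
            "net": inbound[e] - outbound[e]}
        for e in entities
    }


# Illustrative movements only.
movements = [("A", "B", 150), ("B", "C", 200), ("C", "A", 120)]
totals = intercompany_totals(movements)
```

Because every movement leaves one entity and enters another, the net positions across all entities sum to zero, which gives a simple consistency check on the report.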
Future Work
Cash Reconciliation: Tie out internal data with processor data
Automate Treasury Rebalancing: Robo-trade currency and hedge against market fluctuation
Automate Everything: Build out financial back-office tools to give better insight into our business
Improve Monitoring/Alerting: We should catch data issues in minutes, not days
Stream-Based Processing: Consume events in realtime from our production apps
Questions?