Transactions at the PB Scale with MarkLogic Server (writings.nunojob.com/slides/2011-bbuzz.pdf)
TRANSCRIPT
PB Scalable Transactions @dscape | #bbuzz
ACID Transactions at the PB Scale with MarkLogic Server. A talk by Nuno Job. Welcome to Berlin Buzzwords 2011!
HELLO, my name is Nuno (@dscape)
past. present.
portugal, new york, toronto, san francisco, london (83, 08, 09, 10, 11)
stuff I like
foreseeable future.
the idea
timeline:
- 60’s: modern database ancestors
- 80’s: relational bloom
- 90’s: search engine
- 00’s: NoSQL
[Chart: Queries (pre-defined vs. ad-hoc) against Structure (pre-defined vs. ad-hoc), placing ims, idms, rdbms, and search engines]
dinosaurs are fun!
!(o0)—’,--!
a database for unstructured information
- unstructured: schema-less*, easy evolution, xml or json.
- native search: a database built on a search engine?
- c++ core, ~ pb scale
- features: acid, backups, replication, query language (xquery).
- also stores: text and binaries
- no tables, rows, columns. thinkin’ documents. uris? looks like a filesystem

* they have this universal index thing: an inverted index that is structure aware

stop shredding your data, start storing data as is!
application server
single tier:
- no boundaries between languages
- smaller stack
- king is dead! long live the king!

XQuery: dynamic, functional programming language

features:
- easy geospatial
- http client
- facets
- alerting
- store applications in the database
- url rewriting

github.com/dscape/rewrite (for rails like routing, session later on)

[Diagram labels: REST, api, apps, marklogic, other dbs]

stop exposing your database, start exposing your data!
ever wonder what that lotus flower video-clip is all about?
In MarkLogic we were thinkin’
✓ Unstructured Information
✓ Multiple TBs to PB scale
✓ Sub-second response times
✓ Data immediately durable
✓ Mix of complex database queries, alerting, full text search, transformations, geospatial, and real time analytics.
thinkin’ documents. json or xml?
big!
fast!
love to talk about this!
realtime!
flexible!
unstructured?

To be great, be whole;
Exclude nothing,
exaggerate nothing that is not you.
Be whole in everything.
Put all you are
Into the smallest thing you do.
So, in each lake, the moon shines
Because it blooms up above.

Ricardo Reis, Odes

[Labels on the poem: author, title, line, blank verse]
[Labels: poem / unstructured; closet / relational closet; buttons, thread, wool, silk]
universal index: an inverted index that understands structure, organization, and security

[Diagram: each term or structure key points to a list of document references. Keys shown: “berlin”, “buzzwords”, “fast”, “nosql”, “hadoop”, <preso>, <preso>/<title>, <y>2011</y>, pub nosql, Editor, Read. Example lists: 123, 126, 130, 152, …; 122, 125, 126, 130, …; 123, 126, 130, 142, …; 123, 130, 131, 135, …; 125, 131, 167, 212, …; 122, 126, 130, 131, …; 126, 130, 131, 167, …. Documents matching the query: 126, 130]

slides: http://www.slideshare.net/cbiow/mark-logic-strangeloop-2010
Like learning how relational worked in the 80s
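The lookup pictured on this slide can be sketched in a few lines of Python. This is a minimal illustration and not MarkLogic's implementation: the document contents and the helper names `build_index` and `search_all` are made up for the example, and only the document ids and terms echo the slide. A structure-aware index would add keys such as element names next to the plain words.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_all(index, terms):
    """Return ids of documents containing every term (posting-list intersection)."""
    postings = [set(index.get(term, ())) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical documents: the ids echo the slide, the contents are invented.
docs = {
    123: "fast nosql hadoop",
    126: "berlin buzzwords fast nosql",
    130: "berlin buzzwords hadoop",
    152: "buzzwords",
}
index = build_index(docs)
result = search_all(index, ["berlin", "buzzwords"])
print(result)
```

Resolving a query becomes set intersection over document references, which is why adding more terms narrows the result without scanning any document.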
“It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change.”
- Charles Darwin
ACID
- Atomicity: Either all operations of the transaction are correctly executed or none is.
- Consistency: Database will remain in a consistent state after the transaction commits.
- Isolation: In a concurrent transactional system transactions are unaware of each other.
- Durability: After a transaction completes, changes persist even if the system fails.

Helps:
- Easy to reason about data
- Guaranteed persistent state

Hurts:
- Hard to scale horizontally
- Hard to assure high availability
CAP
- Consistency: Each client always has the same view of the data.
- Availability: All clients can always read and write.
- Partition Tolerance: System works well across physical network partitions.

Pick Two!
[Diagram labels: mysql, redis, riak]
credit: blog.nahurst.com/visual-guide-to-nosql-systems
“It’s naive to explain NoSQL with CAP... for x tending to infinite it's like stating that in the world there are just 3 databases.”
- Salvatore Sanfilippo, @antirez
“There is a magic bullet! It's called relaxing the requirements.”
- Evan Weaver, @evan
scaling an inverted index
- Ingestion is limited to a size where indexes are manageable
- On query, both in-memory and on-disk stands behave the same -> transparent to the developer
- Means: fast ingestion with transactions

[Diagram: memory and disk stands in a Log-Structured Merge-Tree (LSM-Tree); journaled, durability ++]

Zero-latency ingestion and indexing
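As a rough sketch of the idea on this slide, the toy store below journals each insert, indexes it into an in-memory stand, freezes that stand once it reaches a size limit, and answers queries the same way across all stands. Everything here is an assumption made for illustration (the class name `StandStore`, the `memory_limit` threshold, and plain dictionaries standing in for on-disk files); it is not how MarkLogic lays out its stands.

```python
class StandStore:
    """Toy store: one in-memory stand plus frozen "on-disk" stands (dicts here)."""

    def __init__(self, memory_limit=2):
        self.memory_limit = memory_limit  # assumed flush threshold, in documents
        self.journal = []                 # append-only log, written before indexing
        self.memory = {}                  # in-memory stand: term -> set of doc ids
        self.memory_docs = 0
        self.disk = []                    # frozen stands, newest last

    def insert(self, doc_id, text):
        self.journal.append((doc_id, text))  # journal first, for durability
        for term in text.lower().split():
            self.memory.setdefault(term, set()).add(doc_id)
        self.memory_docs += 1
        if self.memory_docs >= self.memory_limit:
            self.flush()

    def flush(self):
        """Freeze the in-memory stand and start a fresh one."""
        if self.memory:
            self.disk.append({t: frozenset(ids) for t, ids in self.memory.items()})
        self.memory, self.memory_docs = {}, 0

    def merge(self):
        """Combine all frozen stands into one, as an LSM-tree does in the background."""
        merged = {}
        for stand in self.disk:
            for term, ids in stand.items():
                merged[term] = merged.get(term, frozenset()) | ids
        self.disk = [merged] if merged else []

    def query(self, term):
        """Consult every stand the same way; the caller never sees where data lives."""
        hits = set(self.memory.get(term, ()))
        for stand in self.disk:
            hits |= stand.get(term, frozenset())
        return sorted(hits)


store = StandStore(memory_limit=2)
store.insert(1, "berlin buzzwords")
store.insert(2, "berlin nosql")    # second document triggers a flush
store.insert(3, "berlin hadoop")   # stays in memory for now
print(store.query("berlin"))
```

A document is queryable as soon as `insert` returns, whether it still sits in memory or has already been frozen, which is the "transparent to the developer" point above.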
“You cannot take a car, grow it 10 times and expect to get a mining truck.”
- Ivan Pepelnjak, @ioshints
divide and conquer

[Diagram: a database divided into partition1, partition2, and partition3]

- level of abstraction: ease of use
- even distribution across nodes
- stand: a group of trees
- makes sense to have indexes in the same stand
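One common way to get an even distribution across partitions is to hash each document URI and take the result modulo the partition count. The slide does not say how documents are assigned, so treat the sketch below purely as an assumption: `partition_for`, the choice of SHA-256, and the `/docs/N.json` URIs are all hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(uri, partition_count):
    """Pick a partition from a stable hash of the document URI (assumed scheme)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical URIs, spread over three partitions as on the slide.
uris = [f"/docs/{n}.json" for n in range(9000)]
counts = Counter(partition_for(uri, 3) for uri in uris)
print(sorted(counts.items()))
```

Because the hash depends only on the URI, any host can compute where a document lives without asking a central coordinator.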
shared nothing cluster

[Diagram: E Host 1, E Host 2, … E Host n in front of D Host 4, D Host 5, D Host 6, … D Host k, which hold partition1, partition2, partition3, partition4, … partitionm]
MVCC
- Append only database
- High throughput
- Queries don’t require locks
- Queries and updates do not conflict
- ACID
- Cluster consistency: 2-phase commit
mvcc

[Chart: versions of john.json, maria.json, mary.json, and eric.json along a system timestamp axis (0 to 25), with insert, replace node, delete, and query events]
queries never lock!
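A minimal sketch of the MVCC idea, under stated assumptions: every write commits at a new system timestamp and appends a new version, old versions are only stamped with the timestamp at which they stopped being current, and a query reads whatever was visible at its own timestamp without taking a lock. The class `MvccStore`, its method names, and the sample documents are invented for illustration.

```python
class MvccStore:
    """Append-only store: writes add versions, old versions get a deleted stamp."""

    def __init__(self):
        self.timestamp = 0
        self.versions = []  # dicts with uri, value, created, deleted

    def _retire(self, uri):
        """Stamp the current version of uri (if any) as deleted at this timestamp."""
        for version in self.versions:
            if version["uri"] == uri and version["deleted"] is None:
                version["deleted"] = self.timestamp

    def write(self, uri, value):
        """Insert or replace: commit at a new timestamp, never overwrite content."""
        self.timestamp += 1
        self._retire(uri)
        self.versions.append(
            {"uri": uri, "value": value, "created": self.timestamp, "deleted": None}
        )
        return self.timestamp

    def delete(self, uri):
        self.timestamp += 1
        self._retire(uri)
        return self.timestamp

    def read(self, uri, at=None):
        """Lock-free read of the version visible at timestamp `at` (default: latest)."""
        at = self.timestamp if at is None else at
        for version in self.versions:
            deleted = version["deleted"]
            if (version["uri"] == uri and version["created"] <= at
                    and (deleted is None or deleted > at)):
                return version["value"]
        return None


store = MvccStore()
t1 = store.write("john.json", {"name": "john"})
t2 = store.write("john.json", {"name": "john", "city": "berlin"})
t3 = store.delete("john.json")
print(store.read("john.json", at=t1), store.read("john.json", at=t3))
```

A query that started at `t1` keeps seeing the first version even after the later replace and delete, which is why queries and updates do not conflict.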
How does the 2-phase commit work?

Operations:
- insert “/foo.json”, { “foo”: “” }
- insert “/bar.json”, { “bar”: “” }
- insert-child “foo.json”.foo, “stuff”
- delete “foo.json”
- delete “bar.json”

[Diagram: Shard A and Shard B, each with a fragment table (ID, ✔, ✗, URI) and a journal (Journal A, Journal B). Fragments shown: 123 (/foo.json), 234 (/bar.json), 345 (/foo.json), with timestamps 1 to 3]

Journal entries:
- Insert fragment 123 “/foo.json”
- Insert fragment 234 “/bar.json”
- Commit, timestamp 1, added (123)
- Insert fragment 345 “/foo.json”
- Commit, timestamp 2, added (345), deleted (123)
- Commit, timestamp 3, deleted (34567)
- Distributed Begin, A added(123), B added(234) (shown twice)
- Prepare (shown twice)
- Commit, added (123)
- Commit, deleted (234)
- Distributed End (shown twice)

doesn’t lock documents, locks uris!
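The two phases can be sketched as follows. This is a simplified model under assumptions, not MarkLogic's protocol: the `Shard` class, the record shapes in each journal, the `failing` switch used to simulate a "no" vote, and the fragment 456 with `/baz.json` are all hypothetical. Fragment ids 123, 234, and 345 echo the slide.

```python
class Shard:
    """Toy participant with its own journal; names and record shapes are assumed."""

    def __init__(self, name):
        self.name = name
        self.journal = []    # append-only list of journal records
        self.fragments = {}  # committed fragments: fragment id -> uri
        self.pending = {}    # transaction id -> staged fragments

    def prepare(self, txn_id, fragments, fail=False):
        """Phase 1: stage the work, journal it, and vote yes or no."""
        if fail:
            self.journal.append(("abort", txn_id))
            return False
        self.pending[txn_id] = dict(fragments)
        self.journal.append(("prepare", txn_id, sorted(fragments)))
        return True

    def commit(self, txn_id, timestamp):
        """Phase 2: make the staged fragments visible at the shared timestamp."""
        staged = self.pending.pop(txn_id)
        self.fragments.update(staged)
        self.journal.append(("commit", txn_id, timestamp, sorted(staged)))

    def rollback(self, txn_id):
        if self.pending.pop(txn_id, None) is not None:
            self.journal.append(("rollback", txn_id))


def two_phase_commit(txn_id, timestamp, work, failing=()):
    """work maps each shard to the fragments it adds; returns True on commit."""
    votes = [shard.prepare(txn_id, frags, fail=shard.name in failing)
             for shard, frags in work.items()]
    if all(votes):
        for shard in work:
            shard.commit(txn_id, timestamp)
        return True
    for shard in work:
        shard.rollback(txn_id)
    return False


a, b = Shard("A"), Shard("B")
ok = two_phase_commit("t1", 1, {a: {123: "/foo.json"}, b: {234: "/bar.json"}})
failed = two_phase_commit(
    "t2", 2, {a: {345: "/foo.json"}, b: {456: "/baz.json"}}, failing={"B"}
)
print(ok, failed, a.fragments, b.fragments)
```

Nothing becomes visible on either shard until both have voted yes, so a failure on one shard leaves the other unchanged.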
developer.marklogic.com
365q.ca
awesome project btw…
“You have database problem. You research blog and HN. You start use NoSQL product. Now you not know anymore if you have problem.”
- Devops BORAT, @devops_borat
@dscape
Questions?