big and fat: using mongodb with deep and diverse data

Upload: jeremymcanally

Post on 13-Jul-2015

Big and Fat

Using MongoDB with deep and diverse datasets: A case study

About me

• My name is Jeremy McAnally

• “Software architect” at Intridea

• Write a lot of books, OSS, etc.

• http://github.com/jm

• http://twitter.com/jm

• http://railsupgradehandbook.com

• http://wickhamhousebrand.com

MongoDB analytics just like mom used to make.

http://mongobase.com

Preface

The Application™

Disclaimer

We moved to (mostly) SQL.

YAK SHAVE

Lesson 1

Abstraction is a double-edged sword.

Abstract away!

Talking to all data (no matter the source) the same way will keep you sane.

users = MySQL::Query.execute("SELECT * FROM users;")
users.each do |u|
  posts = db.collection('posts').find(:user_id => u['id'])
  # [...]
  comments = db.collection('comments').find("$where" =>
    "sum(this.admin_count, this.moderator_count) == 5")
end

users = User.all
users.each do |u|
  posts = Post.find(:user_id => u.id)
  # [...]
  comments = Comment.where("sum(this.admin_count, " +
                           "this.moderator_count) == 5")
end

users = User.all
users.each do |u|
  posts = Post.find(:user_id => u.id)
  # [...]
  comments = Comment.with_five_things
end
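The last step above, hiding a query behind a named finder, can be sketched without any database. This is a minimal illustration only: the in-memory `Comment` class, its `store`, and its `create` method are hypothetical stand-ins, and only the name `with_five_things` and the two counter fields come from the slides.

```ruby
# Hypothetical in-memory stand-in for a model; no database or ODM is involved.
class Comment
  attr_reader :admin_count, :moderator_count

  def initialize(admin_count:, moderator_count:)
    @admin_count = admin_count
    @moderator_count = moderator_count
  end

  # All records live in a class-level array for this sketch.
  def self.store
    @store ||= []
  end

  def self.create(**attrs)
    new(**attrs).tap { |comment| store << comment }
  end

  # Callers ask for the result by name; the counting rule lives in one place.
  def self.with_five_things
    store.select { |c| c.admin_count + c.moderator_count == 5 }
  end
end

Comment.create(admin_count: 2, moderator_count: 3)
Comment.create(admin_count: 1, moderator_count: 1)
puts Comment.with_five_things.size
```

The calling code never sees how the condition is expressed, so the backing store could change without touching callers.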

...but wait!

MongoDB has a lot of features that perform better and take less (and often better) code.

pharmacists = {}
Patient.all.each do |patient|
  patient.prescriptions.each do |prescription|
    pharmacists[prescription.name] ||= 0
    pharmacists[prescription.name] += 1
  end
end

SLOW AS CRAP

map = "function() {
    this.prescriptions.forEach(
      function(p) {
        emit(p.name, { count : 1 });
    });
}"

reduce = "function(k, v) {
  var number = 0;
  v.forEach(function(value) {
    number += value.count;
  });
  return { count : number };
}"

pharms = @patients.map_reduce(map, reduce)
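To see what the map and reduce functions compute, the same steps can be simulated in plain Ruby over in-memory hashes. This is only a sketch of the logic, not of how MongoDB executes it; the sample patients and prescription names are invented.

```ruby
# Plain-Ruby simulation of the map and reduce steps; the sample data is made up.
patients = [
  { name: "patient-1", prescriptions: [{ name: "rx-a" }, { name: "rx-b" }] },
  { name: "patient-2", prescriptions: [{ name: "rx-a" }] }
]

# Map: emit one [key, { count: 1 }] pair per embedded prescription.
emitted = patients.flat_map do |patient|
  patient[:prescriptions].map { |p| [p[:name], { count: 1 }] }
end

# Group emitted values by key, as a map/reduce engine would before reducing.
grouped = emitted.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }

# Reduce: sum the counts for each key.
counts = grouped.transform_values do |values|
  { count: values.sum { |v| v[:count] } }
end

p counts
```

The reduce step returns a value with the same shape as what map emits (`{ count: n }`), which is what allows a reducer to be applied repeatedly to partial results.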

Lesson 2

Schema design matters.

DATA MODEL

Embedding works.

Embedding documents is a smart decision in a lot of cases.

SELECT * FROM patients WHERE id=212;
SELECT * FROM prescriptions WHERE patient_id=212;
SELECT * FROM appointments WHERE patient_id=212;
SELECT * FROM contacts WHERE patient_id=212;
SELECT * FROM claims WHERE patient_id=212;
...
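With embedding, the rows those per-table queries fetch can live inside one patient document. A minimal sketch as a Ruby hash: the collection names mirror the tables on the slide, while every field name and value inside them is invented for illustration.

```ruby
# One patient document with related records embedded; field values are invented.
patient = {
  "_id" => 212,
  "prescriptions" => [{ "name" => "example-rx" }],
  "appointments"  => [{ "on" => "2010-11-18" }],
  "contacts"      => [],
  "claims"        => [{ "status" => "open" }, { "status" => "paid" }]
}

# Everything the separate queries returned is reachable from the single document.
%w[prescriptions appointments contacts claims].each do |key|
  puts "#{key}: #{patient.fetch(key).size}"
end
```

One read by `_id` returns the whole structure, at the cost of a document that grows with every embedded record.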

...but watch it.

You can also hit a ton of performance and design issues.

OUR GIANT DOCUMENT

Mongo’s Pre-Allocated Space

Patient

Pharmacy

“Reference” Pharmacy

Search, listing, etc.

Lesson 3

Don’t go nuts.

Schemaless is fun!

Having schemaless data has its own battery of advantages.

Schemaless Joy

• Transforming data models is a delight

• Formless data isn’t awkward

• Arbitrary embedding is awesome

• Building to work with schemaless data can lead to some really powerful app concepts

...but be wary.

Going nuts will create headaches for you.

Schemaless Pain

• Weird app behavior

• Huge, long-running data transformations

• Annoying data transforms for development environments

• Difficult to version data models
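One way teams sometimes soften the versioning and long-running-transform problems is to stamp each document with a version number and upgrade old documents as they are read. The talk does not describe this approach; the sketch below is an assumption-laden illustration, and the field names `schema_version`, `name`, `first_name`, and `last_name` are all hypothetical.

```ruby
# Hypothetical upgrade steps, keyed by the version they upgrade FROM.
UPGRADES = {
  1 => ->(doc) {
    first, last = doc.delete("name").to_s.split(" ", 2)
    doc.merge("first_name" => first, "last_name" => last, "schema_version" => 2)
  }
}.freeze

CURRENT_VERSION = 2

# Apply upgrade steps until the document reaches the current version.
# Documents without a version stamp are treated as version 1.
def upgrade(doc)
  doc = doc.dup
  version = doc.fetch("schema_version", 1)
  while version < CURRENT_VERSION
    doc = UPGRADES.fetch(version).call(doc)
    version = doc.fetch("schema_version")
  end
  doc
end

old_doc = { "name" => "Ada Lovelace" }
p upgrade(old_doc)
```

Documents already at the current version pass through unchanged, so the same read path serves old and new data.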

Lesson 4

Dig deep.

> db.runCommand({"serverStatus" : 1})
{
	"version" : "1.4.3",
	"uptime" : 96,
	"localTime" : "Thu Nov 18 2010 01:49:38 GMT-0500 (EST)",
	"globalLock" : {
		"totalTime" : 96005290,
		"lockTime" : 174040,
		"ratio" : 0.0018128167729090762
	},
	"mem" : {
		"bits" : 64,
		"resident" : 2,
		"virtual" : 2396,
		"supported" : true,
		"mapped" : 0
	},
	"connections" : {
		"current" : 1,
		"available" : 19999
	},
	"extra_info" : {
		"note" : "fields vary by platform"
	},
	"indexCounters" : {
		"btree" : {
			"accesses" : 0,
			"hits" : 0,
			"misses" : 0,
			"resets" : 0,
			"missRatio" : 0
		}
	},
	"backgroundFlushing" : {
		"flushes" : 1,
		"total_ms" : 0,
		"average_ms" : 0,
		"last_ms" : 0,
		"last_finished" : "Thu Nov 18 2010 01:49:02 GMT-0500 (EST)"
	},
	"opcounters" : {
		"insert" : 0,
		"query" : 1,
		"update" : 0,
		"delete" : 0,
		"getmore" : 0,
		"command" : 3
	},
	"asserts" : {
		"regular" : 0,
		"warning" : 0,
		"msg" : 0,
		"user" : 0,
		"rollovers" : 0
	},
	"ok" : 1
}

"opcounters" : {
  "insert" : 0,
  "query" : 1,
  "update" : 0,
  "delete" : 0,
  "getmore" : 0,
  "command" : 3
}

"connections" : {
  "current" : 1,
  "available" : 19999
}
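A few of these fields combine into quick health indicators. The sketch below uses a plain Ruby hash holding a subset of the serverStatus values as read off the slide; how the hash would be fetched from a live server is deliberately left out, and the derived metric names are my own.

```ruby
# Subset of the serverStatus output, as a Ruby hash (values read off the slide).
status = {
  "globalLock"  => { "totalTime" => 96_005_290, "lockTime" => 174_040 },
  "connections" => { "current" => 1, "available" => 19_999 },
  "opcounters"  => { "insert" => 0, "query" => 1, "update" => 0,
                     "delete" => 0, "getmore" => 0, "command" => 3 }
}

# Fraction of time spent holding the global lock.
lock = status["globalLock"]
lock_ratio = lock["lockTime"].to_f / lock["totalTime"]

# Share of the connection budget currently in use.
conn = status["connections"]
conn_used = conn["current"].to_f / (conn["current"] + conn["available"])

# Total operations counted since the server started.
total_ops = status["opcounters"].values.sum

puts format("lock ratio: %.5f", lock_ratio)
puts format("connections used: %.5f", conn_used)
puts "total ops: #{total_ops}"
```

The computed lock ratio agrees with the `ratio` field that the server reports in `globalLock`, which is a useful sanity check when reading these numbers.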

Jeremy-McAnallys-MacBook-Pro:~ jeremymcanally$ mongostat
connected to: 127.0.0.1
insert/s query/s update/s delete/s getmore/s command/s mapped  vsize    res % locked % idx miss  conn     time
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:32
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:33
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:34
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:35
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:36
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:37
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:38
       0       0        0        0         0         1      0   2396      3        0          0     1 01:53:39

db._adminCommand({ diagLogging : 1 })

db.currentOp()
{ inprog: [ { "opid" : 35 , "op" : "query" , "ns" : "fundb.parties" ,
              "query" : "{ score : 1.0 }" , "inLock" : 1 }
          ]
}

> db.oplog.$main.find()
{ "ts" : { "t" : 1290063566000, "i" : 1 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceceabb1b65158000001"), "thing" : 2 } }
{ "ts" : { "t" : 1290063569000, "i" : 1 }, "op" : "n", "ns" : "", "o" : { } }
{ "ts" : { "t" : 1290063579000, "i" : 1 }, "op" : "n", "ns" : "", "o" : { } }
{ "ts" : { "t" : 1290063581000, "i" : 1 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceddabb1b65158000002"), "thing" : 2 } }
{ "ts" : { "t" : 1290063581000, "i" : 2 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceddabb1b65158000003"), "thing" : 2 } }

{ "ts" :
  { "t" : 1290063566000,
    "i" : 1
  },
  "op" : "i",
  "ns" : "ming.foo",
  "o" : {
     "_id" : ObjectId("4ce4ceceabb1b65158000001"),
     "field" : 2
  }
}
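Oplog entries are easy to summarize once they are treated as plain data. The sketch below works on Ruby hashes shaped like the entries in the oplog output, with the ObjectIds left out; the `OP_NAMES` table and the `summary` structure are my own naming, and reading `t` as milliseconds since the epoch is an assumption based on how the values look.

```ruby
# Entries shaped like the oplog output on the slide; ObjectIds omitted for brevity.
entries = [
  { "ts" => { "t" => 1_290_063_566_000, "i" => 1 }, "op" => "i", "ns" => "ming.foo", "o" => { "thing" => 2 } },
  { "ts" => { "t" => 1_290_063_569_000, "i" => 1 }, "op" => "n", "ns" => "",         "o" => {} },
  { "ts" => { "t" => 1_290_063_579_000, "i" => 1 }, "op" => "n", "ns" => "",         "o" => {} },
  { "ts" => { "t" => 1_290_063_581_000, "i" => 1 }, "op" => "i", "ns" => "ming.foo", "o" => { "thing" => 2 } },
  { "ts" => { "t" => 1_290_063_581_000, "i" => 2 }, "op" => "i", "ns" => "ming.foo", "o" => { "thing" => 2 } }
]

# Readable names for the one-letter op codes.
OP_NAMES = {
  "i" => "insert", "u" => "update", "d" => "delete",
  "c" => "command", "n" => "no-op"
}.freeze

# Count real operations per namespace, skipping the periodic no-op entries.
summary = Hash.new(0)
entries.each do |entry|
  next if entry["op"] == "n"
  op = OP_NAMES.fetch(entry["op"], entry["op"])
  summary[[entry["ns"], op]] += 1
end

# Assumption: "t" is milliseconds since the Unix epoch.
first_seen = Time.at(entries.first["ts"]["t"] / 1000).utc

summary.each { |(ns, op), count| puts "#{ns} #{op}: #{count}" }
puts "first entry at #{first_seen}"
```

The `i` counter inside `ts` distinguishes entries that share the same `t`, as the last two inserts do.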

That’s all I got.

Questions?