
Upload: megan-bowra-dean

Post on 18-Aug-2015


Revenge of the ORMs

Why SQL skills still matter in 2015

Lightning strikes © flickr.com/photos/snowpeak CC-BY

First, An Intro

Megan Bowra-Dean

• Rails/JS/Android/iOS/kitchen sink developer at Rabid Tech

• 2 years .NET enterprise web dev

• 2 years C/C++ embedded dev

• Ruby NZ Committee Member

“I've seen things you people wouldn't believe.”

–Roy Batty, Blade Runner

Reintroducing ORMs

ORMs

• “Object Relational Mappers”

• Translate and serialise objects to a relational database

• Generally database agnostic

Cat.new(fur: 'calico', name: 'Ms Tibbles')

cats table

id  fur     name
1   calico  Ms Tibbles

ms_tibbles.owner = Owner.new(name: 'Megan')

cats table

id  fur     name        owner_id
1   calico  Ms Tibbles  1

owners table

id  name
1   Megan
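The mapping shown in the tables above can be sketched in plain Ruby. This is a toy illustration, not how ActiveRecord or any other ORM is implemented; the `to_row` helper and the Struct-based models are hypothetical names introduced here.

```ruby
# Toy illustration only: real ORMs also handle type casting,
# lazy loading, SQL generation and much more.
Owner = Struct.new(:id, :name)
Cat   = Struct.new(:id, :fur, :name, :owner)

# Hypothetical helper: flatten an object into the row its table would hold.
# An associated object is stored as a foreign key, not as a nested object.
def to_row(record)
  record.to_h.each_with_object({}) do |(field, value), row|
    if value.is_a?(Struct)
      row[:"#{field}_id"] = value.id
    else
      row[field] = value
    end
  end
end

megan      = Owner.new(1, 'Megan')
ms_tibbles = Cat.new(1, 'calico', 'Ms Tibbles', megan)

cat_row   = to_row(ms_tibbles)
owner_row = to_row(megan)
p cat_row
p owner_row
```

The cat's `owner` object becomes an `owner_id` column in its row, while the owner is written to its own table.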

Examples

• Entity Framework

• ActiveRecord

• django.db

The Good 👍

Saves development time & easier to maintain

SELECT *
FROM books
INNER JOIN libraries ON books.library_id = libraries.id
WHERE libraries.name = 'Wellington City Library'

VS

Book
  .joins(:library)
  .where(libraries: { name: 'Wellington City Library' })

Security

• Most ORMs stop SQL injection attacks 💉

• Some restrict which columns user input can update (e.g. Rails 4’s strong parameters)
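The idea behind restricting updatable columns can be sketched without Rails. The allow-list and the `permitted_attributes` helper below are hypothetical and only mimic the spirit of the Rails feature, not its API.

```ruby
# Hypothetical allow-list of fields a user may set on a cat.
PERMITTED_CAT_FIELDS = %i[name fur].freeze

# Keep only the permitted fields and drop everything else.
def permitted_attributes(user_input)
  user_input.slice(*PERMITTED_CAT_FIELDS)
end

submitted = { name: 'Ms Tibbles', fur: 'calico', owner_id: 42, admin: true }
safe = permitted_attributes(submitted)
p safe
```

Anything outside the allow-list, such as `owner_id` or `admin` here, never reaches the database layer.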

BUT

When and How ORMs Can Break Down

Ultimately ORMs are an Abstraction

• Simplified for specific use cases

• What happens when the relationships between your models get more complex?

• What happens when you need data not tied to a model’s fields? 📊

N+1 Problem

A book has an editor, and magazines are a type of book. How do we find and print all the editors belonging to magazines?

Naive Way

magazines = Book.where(type: 'magazine')

magazines.each do |magazine|
  puts magazine.editor
end

Resultant SQL

SELECT * FROM books WHERE type = 'magazine'

SELECT * FROM editors WHERE book_id = 1
SELECT * FROM editors WHERE book_id = 2
..
SELECT * FROM editors WHERE book_id = n

We end up with 1 + n queries, where n is the number of magazines, hence the N+1 problem.
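The query count can be made visible with a stand-in database that simply counts the calls it receives. `FakeDatabase` and its methods are hypothetical and exist only for this sketch; the data is made up.

```ruby
# Stand-in for a database connection that counts the queries it receives.
class FakeDatabase
  attr_reader :query_count

  def initialize(magazine_ids, editors_by_book_id)
    @magazine_ids = magazine_ids
    @editors_by_book_id = editors_by_book_id
    @query_count = 0
  end

  # Stands for: SELECT * FROM books WHERE type = 'magazine'
  def magazines
    @query_count += 1
    @magazine_ids
  end

  # Stands for: SELECT * FROM editors WHERE book_id = ?
  def editor_for(book_id)
    @query_count += 1
    @editors_by_book_id[book_id]
  end
end

db = FakeDatabase.new([1, 2, 3], { 1 => 'Ana', 2 => 'Ben', 3 => 'Cy' })

# One query for the magazines, then one more per magazine for its editor.
editors = db.magazines.map { |book_id| db.editor_for(book_id) }
puts editors
puts "queries: #{db.query_count}"
```

With three magazines the counter ends at four: one query for the parents plus one per child lookup.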

Optimised Way

magazines = Book
  .includes(:editor)
  .where(type: 'magazine')

magazines.each do |magazine|
  puts magazine.editor
end

Resultant SQL

SELECT * FROM books WHERE type = 'magazine'

SELECT * FROM editors WHERE (book_id IN (1,2,3..n))

• Not always as obvious as this

• Can have a lot of things happening between fetching the parent model and the child models

• Still, (most) ORMs have the capability to help
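What an eager-loading ORM does with the two-query result can be approximated as follows. This is a sketch under stated assumptions: plain hashes stand in for rows, the data is invented, and no real ORM is claimed to work exactly this way.

```ruby
# Plain hashes stand in for database rows; names and data are made up.
books   = [{ id: 1, title: 'Magazine A' }, { id: 2, title: 'Magazine B' }]
editors = [{ book_id: 1, name: 'Ana' }, { book_id: 2, name: 'Ben' }]

# Query 1 fetched the parent rows (the books array above).
# Query 2 is a single IN (...) query built from the parent ids,
# with one placeholder per id so the values are bound, not interpolated.
ids = books.map { |book| book[:id] }
placeholders = ids.map { '?' }.join(', ')
in_query = "SELECT * FROM editors WHERE book_id IN (#{placeholders})"

# Index the children by foreign key once, then attach each one in memory.
editor_by_book_id = editors.each_with_object({}) do |editor, lookup|
  lookup[editor[:book_id]] = editor
end
books_with_editors = books.map do |book|
  book.merge(editor: editor_by_book_id[book[:id]])
end

puts in_query
books_with_editors.each { |book| puts "#{book[:title]}: #{book[:editor][:name]}" }
```

The number of queries stays at two however many magazines there are; the matching of editors to books happens in memory.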

Modern Scripting Languages are SLOW* 🐢

* At handling large data sets

• Page load times can slow down noticeably with just a few thousand instances of models.

• You may need to run operations on hundreds of thousands of instances.

• Is it WEBSCALE?

We Usually Deal With This by Over-engineering

• Adding extra caching layers

• Load balancing with horizontal scaling

• Progressive page loading

Subverting Your ORM for Fun and Profit

• We can wrest the raw database connection from the ORM

• With this we can improve performance without greatly increasing complexity

A real world example

• Web app for client that surveyed organisational performance

• Produced an online report with several different breakdowns of statistics from the survey

• Was surprisingly slow - hit web server timeout

Looking closer

• Most of the time was spent outside of the database

• 100,532 calls to Class#new ‼

• A simple page was only creating ~900 objects

• One suspect was a function calculating the average of responses to a group of questions (a “domain”)
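A rough way to spot this kind of object churn is to compare Ruby's allocation counter before and after a piece of code. The sketch below assumes MRI Ruby 2.2 or later, where `GC.stat(:total_allocated_objects)` is available; the `Response` struct and the `allocations_during` helper are hypothetical, and this is not necessarily the profiling tool used in the talk.

```ruby
# Hypothetical model class standing in for an ORM model.
Response = Struct.new(:value)

# Count the objects allocated while the given block runs.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

raw_values = Array.new(1_000) { |i| i % 5 }

# Wrapping each raw value in a model object costs at least one allocation each.
allocated = allocations_during do
  raw_values.map { |value| Response.new(value) }
end

puts "objects allocated: #{allocated}"
```

The exact number varies between Ruby versions, but it will be at least one allocation per wrapped value.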

sum = 0.0
count = 0.0
domain.questions.each do |q|
  response = q.response_for(respondent)
  sum += response.value
  count += 1.0
end

if count > 0
  return sum / count
else
  return 1.0
end

conn = ActiveRecord::Base.connection
result = conn.execute <<-SQL
  SELECT SUM(responses.value) as sum, COUNT(*) as count
  FROM domains
  INNER JOIN questions ON questions.domain_id = domains.id
  INNER JOIN responses ON responses.question_id = questions.id
    AND responses.respondent_id = #{respondent.id}
  WHERE domains.id = #{domain.id}
SQL

score = 1.0

sum = result[0]['sum'].to_f
count = result[0]['count'].to_f

score = sum / count if count > 0
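The handling of the aggregate row can be isolated into a small function and checked without a database. The sketch assumes the adapter returns column values as strings or `nil`, which varies between adapters; `score_from` is a hypothetical name.

```ruby
# Hypothetical helper: turn one aggregate row into a score.
# Assumes values arrive as strings or nil, as some adapters return them.
def score_from(row, default: 1.0)
  sum   = row['sum'].to_f   # nil.to_f is 0.0, so a NULL sum is safe
  count = row['count'].to_f
  count > 0 ? sum / count : default
end

answered   = score_from({ 'sum' => '14', 'count' => '4' })
unanswered = score_from({ 'sum' => nil, 'count' => '0' })

puts answered
puts unanswered
```

When no responses match, `SUM` yields NULL and `COUNT(*)` yields 0, so the default score is kept rather than dividing by zero.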

• Reduced page load time by more than half

• Reduced number of objects created by 30%

Words of Caution ⚠

• Not as maintainable

• Need to keep an eye on security. Never insert user provided values into raw SQL.

• Not as portable.
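The security warning above is easiest to see side by side. Both functions below are hypothetical; how bound values are actually passed depends on the database driver or ORM in use.

```ruby
# Unsafe: user input is interpolated straight into the SQL text.
def unsafe_query(name)
  "SELECT * FROM cats WHERE name = '#{name}'"
end

# Safer shape: the SQL text is constant and the values travel separately,
# ready for the database driver to bind.
def parameterised_query(name)
  ['SELECT * FROM cats WHERE name = ?', [name]]
end

input = "x'; DROP TABLE cats; --"

injected = unsafe_query(input)
sql, params = parameterised_query(input)

puts injected
puts sql
p params
```

In the first form the input changes the structure of the statement; in the second it can only ever be a value.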

XKCD #327 https://xkcd.com/327/

You can sometimes get the ORM to help you defeat itself

sql = Cat.where(name: 'Ms Tibbles').to_sql
ActiveRecord::Base.connection.execute sql

Cool Things Beyond Performance 😎

• Database functions

• Common table expressions

• Views

• GIS extensions for geographical data 🌏

• Non-standard data types

Finally: Remember your indexes!
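Why indexes matter can be shown with a loose analogy in plain Ruby. This is an analogy only: database indexes are usually B-trees rather than hashes, and the rows here are invented.

```ruby
# Made-up rows standing in for a books table.
rows = (1..10_000).map { |id| { id: id, library_id: id % 100 } }

# Without an index: every row is examined (a full table scan).
examined = 0
scan_result = rows.select do |row|
  examined += 1
  row[:library_id] == 42
end

# With an index: build the lookup once, then jump straight to the matches.
index = rows.group_by { |row| row[:library_id] }
indexed_result = index.fetch(42, [])

puts "rows examined by scan: #{examined}"
puts "rows returned: #{indexed_result.size}"
```

Both approaches return the same rows, but the scan touches every row on each lookup while the index pays its cost once up front.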

To Summarise

• ORMs bring great benefits much of the time

• However, being aware of what they’re doing is essential

• If need be, it is possible to work around them.

• Databases are your friend.