getting sql right the first try (most of the time!) may, 2008 ©2007 dan tow, all rights reserved...

39
Getting SQL Right the First Try (Most of the Time!) May, 2008 ©2007 Dan Tow, All rights reserved [email protected] www.singingsql.com Singing Singing SQL SQL Presents Presents:

Upload: melvyn-webster

Post on 30-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Getting SQL Right the First Try (Most of the Time!)

May, 2008

©2007 Dan Tow, All rights [email protected]

SingingSingingSQL SQL PresentsPresents:

Getting SQL Right the First Try

•Introduction

•Code what you know

•Know what you code

Getting SQL Right the First Try - Introduction

•This presentation deals with writing the initial SQL, before it is even tested the first time.

•If the initial SQL is functionally right, SQL tuners can take it as a clean spec for the rows the application requires at that point in the flow of control, and they don’t need an understanding of the application or even the tables.

Getting SQL Right the First Try - Introduction

•To write that clean initial SQL, the developer absolutely does need a precise understanding of the application requirements and the tables.

•There is no point in writing initial SQL that is not a clean, correct spec for what the application needs at that stage in its flow of control!

•Iteration, or trial and error, is a terrible way to write the initial SQL.

Getting SQL Right the First Try - Introduction

•Clean initial SQL, written by a developer who fully understands the functional requirements, is likely to get a good execution plan from the optimizer without manual tuning.

•Even if clean initial SQL does not perform well, as a clean, transparent, correct spec for the functional requirements of the application, it is much easier to tune than unclean SQL, and it avoids wasting time tuning something that isn’t even functionally sensible.

Code what you knowKnow what you code

First principles:

If you don’t really understand what the tables and/or views represent, the SQL will likely perform poorly,…

Code what you knowKnow what you code

First principles:

If you don’t really understand what the tables and/or views represent, the SQL will likely perform poorly,…

and even if it doesn’t, it will likely be functionally wrong!

Code what you knowKnow what you code

First Principles:

If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…

Code what you knowKnow what you code

First Principles:

If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…

and even if it didn’t, it will likely be functionally wrong!

Code what you knowKnow what you code

First Principles:If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…and even if it didn’t, it will likely be functionally wrong…And even if it isn’t, it will be hard to understand and maintain!

Code what you know

•What set of entities does each table represent?

•What is the complete primary key to each table?

•What set of entities does each view represent?

•What is the virtual primary key of each view?

•Roughly how many rows in production will there be in each table or view?

Know the objects

If you don’t know enough to…

Fully understand the tables and views…

You don’t know enough to write the SQL!

Join Trees, a Crash Introduction

SELECT …FROM Orders O, Order_Details OD, Products P, Customers C, Shipments S, Addresses A, Code_Translations ODT, Code_Translations OTWHERE UPPER(C.Last_Name) LIKE :Last_Name||'%' AND UPPER(C.First_Name) LIKE :First_Name||'%' AND OD.Order_ID = O.Order_ID AND O.Customer_ID = C.Customer_ID AND OD.Product_ID = P.Product_ID(+) AND OD.Shipment_ID = S.Shipment_ID(+) AND S.Address_ID = A.Address_ID(+) AND O.Status_Code = OT.Code AND OT.Code_Type = 'ORDER_STATUS' AND OD.Status_Code = ODT.Code AND ODT.Code_Type = 'ORDER_DETAIL_STATUS' AND O.Order_Date > :Now - 366ORDER BY …;

S P O ODT

OD

A OT C

Join Trees, a Crash IntroductionSELECT …FROM Orders O, Order_Details OD, Products P, Customers C, Shipments S, Addresses A, Code_Translations ODT, Code_Translations OTWHERE UPPER(C.Last_Name) LIKE :Last_Name||'%' AND UPPER(C.First_Name) LIKE :First_Name||'%' AND OD.Order_ID = O.Order_ID AND O.Customer_ID = C.Customer_ID AND OD.Product_ID = P.Product_ID(+) AND OD.Shipment_ID = S.Shipment_ID(+) AND S.Address_ID = A.Address_ID(+) AND O.Status_Code = OT.Code AND OT.Code_Type = 'ORDER_STATUS' AND OD.Status_Code = ODT.Code AND ODT.Code_Type = 'ORDER_DETAIL_STATUS' AND O.Order_Date > :Now - 366ORDER BY …;

•Each table is a node, represented by its alias.•Each join is a link, with (usually downward-pointing) arrows pointing toward any side of the join that is unique.•Midpoint arrows point to optional side of any outer join.

S P O ODT

OD

A OT C

Expectations of Simple Queries

•Query maps to one tree.•The tree has one root, exactly one table with no join to its primary key.•All joins have downward-pointing arrows (joins unique on one end).•Outer joins point down, with only outer joins below outer joins.

S P O ODT

OD

A OT C

Normal Simple Queries

•The question that the query answers is basically a question about the entity represented at the top (root) of the tree (or about aggregations of that entity).•The other tables just provide reference data stored elsewhere for normalization.

S P O ODT

OD

A OT C

Know what you code

•Are the tables joined in a tree (no loops, no missing joins)? If not, why, exactly?

•What is the direction of every join? (Which side or sides are unique?) If any join is many-to-many, why is it, and how can that be right?

•What is the purpose of each join? Are you using anything from the joined-to table?

Know the joins

If you don’t know enough to…

Understand the nature and purpose of the joins…

You don’t know enough to write the SQL!

Know what you code

•What is the entity that each result row (or each pre-aggregated result row) should represent? What table or view maps to that entity?

•Is the table or the view mapping to the results the (only!) root detail table, the only table not joined to the other tables with its primary key? (Are all joins from the single root node downward-pointing?)

Know the join-tree structure

If you don’t know enough to…

Understand the structure of the join tree, and how it maps to the desired query result…

You don’t know enough to write the SQL!

A Really Wild and Crazy (and good!) Idea*

•Create the join tree before you even begin to write the SQL!•Write the SQL to precisely match the join tree. (See the whitepaper for details.)•Use the join trees to document the SQL.

*Thanks to Fahd Mirza!

Know what you code - views

If you use a view, the view-defining SQL is part of your SQL, and your SQL is only right if the whole combination is right!

Know what you code - views

•Do rows from the view map cleanly, one-to-one, to a single entity? (Are the view-defining join trees clean?) Is that exactly the right entity for purposes of your query?

•Are there elements in the view (joins, subqueries) that are unnecessary or redundant to your query?

•Will use of the view still be correct when and if the view changes?

Know the views

If you don’t know enough to…

Understand the structure of the views you use…

You don’t know enough to write the SQL!

Know what you code - views

•Would the complete join diagram, which explodes the view-defining queries into the view-using query, still be a normal, clean join tree? If not, exactly what would be the corner-case behaviors of that query, and would those behaviors be correct?

•How could you code the query to base-tables, only? If the resulting base-table query would be unusual or complex, is that complexity necessary and correct?

Know the view-decomposition

If you don’t know enough to…

Understand how to write the query direct to the base tables…

You don’t know enough to write the SQL!

Know the view-decomposition

If you don’t know enough to…

Understand how to write the query direct to the base tables…

You don’t know enough to write the SQL!

Oh, and by the way…

Know the view-decomposition

If you do know enough to

Understand how to write the query direct to the base tables…

Why don’t you!?!

Know the view-decomposition

If you do know enough to

Understand how to write the query direct to the base tables…

Why don’t you!?!

(Yes, there are reasons, but have a reason.)

Know what you code

•Roughly, how many rows should the query return? Is that a useful result set (not too large for human consumption)?

•Does the query return any rows or columns that aren’t really needed?

•When, and how often, does the query run? Can it perform well in that context, given reasonable per-row performance, or do we have an example of an “unsolvable” tuning problem?

Know the query context

If you don’t know enough to…

Understand the context of the query and the rows it returns…

You don’t know enough to write the SQL!

Know what you code

•What is the main, most-selective filter that will trim the result set from the unfiltered rowcount of the root detail table to the desired rowcount? If this isn’t obvious to the casual SQL reviewer, make it obvious, with comments.

•Is there a clean, indexed path to that that best filter, and to the other big tables, from there, and have you provided it?

Know the path to the data

If you don’t know enough to…

Understand a clean path to the data from the best (most selective) filter…

You don’t know enough to write the SQL!

Know what you code

•Is the SQL clear enough that the next developer (who will know far less about that module of the application), or even a SQL tuning specialist (who may know almost nothing about the application) can easily understand it?

•Are the unusual features of the SQL, if any, well commented with clear explanations?

•Is every column preceded with its alias?

Know how to make it clear

If you don’t know enough to…

Make the SQL transparently understandable and clear…

You don’t know enough to write the SQL!

Know what you code

•Are these rules seemingly impossible to follow on your database? Problems coding clean SQL often point to schema-design flaws – push back on these flaws when you find them.

•Does is seem impossible to make the deadline with clean SQL, following these rules? Would it be better to release the wrong SQL on time? (NO!)

Know the value of getting it right

If you don’t know enough to…

Know the value of getting it right in the first place…

You don’t know enough to write the SQL!

Conclusion

•All SQL can be built right!•All SQL should be built right!•SQL that is built following the right principles rarely needs fixing, and can always be fixed easily when necessary!

Questions?