getting sql right the first try (most of the time!) may, 2008 ©2007 dan tow, all rights reserved...

Getting SQL Right the First Try (Most of the Time!)

May, 2008

©2007 Dan Tow, All rights [email protected]

SingingSingingSQL SQL PresentsPresents:

Getting SQL Right the First Try

•Introduction

•Code what you know

•Know what you code

Getting SQL Right the First Try - Introduction

•This presentation deals with writing the initial SQL, before it is even tested the first time.

•If the initial SQL is functionally right, SQL tuners can take it as a clean spec for the rows the application requires at that point in the flow of control, and they don’t need an understanding of the application or even the tables.


•To write that clean initial SQL, the developer absolutely does need a precise understanding of the application requirements and the tables.

•There is no point in writing initial SQL that is not a clean, correct spec for what the application needs at that stage in its flow of control!

•Iteration, or trial and error, is a terrible way to write the initial SQL.


•Clean initial SQL, written by a developer who fully understands the functional requirements, is likely to get a good execution plan from the optimizer without manual tuning.

•Even if clean initial SQL does not perform well, as a clean, transparent, correct spec for the functional requirements of the application, it is much easier to tune than unclean SQL, and it avoids wasting time tuning something that isn’t even functionally sensible.

Code what you knowKnow what you code

First principles:

If you don’t really understand what the tables and/or views represent, the SQL will likely perform poorly,…


First principles:

If you don’t really understand what the tables and/or views represent, the SQL will likely perform poorly,…

and even if it doesn’t, it will likely be functionally wrong!


First Principles:

If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…


First Principles:

If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…

and even if it didn’t, it will likely be functionally wrong!


First Principles:If you don’t really understand what the SQL is supposed to accomplish, it will likely perform poorly,…and even if it didn’t, it will likely be functionally wrong…And even if it isn’t, it will be hard to understand and maintain!

Code what you know

•What set of entities does each table represent?

•What is the complete primary key to each table?

•What set of entities does each view represent?

•What is the virtual primary key of each view?

•Roughly how many rows in production will there be in each table or view?

Know the objects

If you don’t know enough to…

Fully understand the tables and views…

You don’t know enough to write the SQL!

Join Trees, a Crash Introduction

SELECT …FROM Orders O, Order_Details OD, Products P, Customers C, Shipments S, Addresses A, Code_Translations ODT, Code_Translations OTWHERE UPPER(C.Last_Name) LIKE :Last_Name||'%' AND UPPER(C.First_Name) LIKE :First_Name||'%' AND OD.Order_ID = O.Order_ID AND O.Customer_ID = C.Customer_ID AND OD.Product_ID = P.Product_ID(+) AND OD.Shipment_ID = S.Shipment_ID(+) AND S.Address_ID = A.Address_ID(+) AND O.Status_Code = OT.Code AND OT.Code_Type = 'ORDER_STATUS' AND OD.Status_Code = ODT.Code AND ODT.Code_Type = 'ORDER_DETAIL_STATUS' AND O.Order_Date > :Now - 366ORDER BY …;

S P O ODT

OD

A OT C

Join Trees, a Crash IntroductionSELECT …FROM Orders O, Order_Details OD, Products P, Customers C, Shipments S, Addresses A, Code_Translations ODT, Code_Translations OTWHERE UPPER(C.Last_Name) LIKE :Last_Name||'%' AND UPPER(C.First_Name) LIKE :First_Name||'%' AND OD.Order_ID = O.Order_ID AND O.Customer_ID = C.Customer_ID AND OD.Product_ID = P.Product_ID(+) AND OD.Shipment_ID = S.Shipment_ID(+) AND S.Address_ID = A.Address_ID(+) AND O.Status_Code = OT.Code AND OT.Code_Type = 'ORDER_STATUS' AND OD.Status_Code = ODT.Code AND ODT.Code_Type = 'ORDER_DETAIL_STATUS' AND O.Order_Date > :Now - 366ORDER BY …;

•Each table is a node, represented by its alias.•Each join is a link, with (usually downward-pointing) arrows pointing toward any side of the join that is unique.•Midpoint arrows point to optional side of any outer join.

S P O ODT

OD

A OT C

Expectations of Simple Queries

•Query maps to one tree.•The tree has one root, exactly one table with no join to its primary key.•All joins have downward-pointing arrows (joins unique on one end).•Outer joins point down, with only outer joins below outer joins.

S P O ODT

OD

A OT C

Normal Simple Queries

•The question that the query answers is basically a question about the entity represented at the top (root) of the tree (or about aggregations of that entity).•The other tables just provide reference data stored elsewhere for normalization.

S P O ODT

OD

A OT C

Know what you code

•Are the tables joined in a tree (no loops, no missing joins)? If not, why, exactly?

•What is the direction of every join? (Which side or sides are unique?) If any join is many-to-many, why is it, and how can that be right?

•What is the purpose of each join? Are you using anything from the joined-to table?

Know the joins


Understand the nature and purpose of the joins…


Know what you code

•What is the entity that each result row (or each pre-aggregated result row) should represent? What table or view maps to that entity?

•Is the table or the view mapping to the results the (only!) root detail table, the only table not joined to the other tables with its primary key? (Are all joins from the single root node downward-pointing?)

Know the join-tree structure


Understand the structure of the join tree, and how it maps to the desired query result…


A Really Wild and Crazy (and good!) Idea*

•Create the join tree before you even begin to write the SQL!•Write the SQL to precisely match the join tree. (See the whitepaper for details.)•Use the join trees to document the SQL.

*Thanks to Fahd Mirza!

Know what you code - views

If you use a view, the view-defining SQL is part of your SQL, and your SQL is only right if the whole combination is right!


•Do rows from the view map cleanly, one-to-one, to a single entity? (Are the view-defining join trees clean?) Is that exactly the right entity for purposes of your query?

•Are there elements in the view (joins, subqueries) that are unnecessary or redundant to your query?

•Will use of the view still be correct when and if the view changes?

Know the views


Understand the structure of the views you use…



•Would the complete join diagram, which explodes the view-defining queries into the view-using query, still be a normal, clean join tree? If not, exactly what would be the corner-case behaviors of that query, and would those behaviors be correct?

•How could you code the query to base-tables, only? If the resulting base-table query would be unusual or complex, is that complexity necessary and correct?

Know the view-decomposition


Understand how to write the query direct to the base tables…






Oh, and by the way…


If you do know enough to


Why don’t you!?!


If you do know enough to


Why don’t you!?!

(Yes, there are reasons, but have a reason.)

Know what you code

•Roughly, how many rows should the query return? Is that a useful result set (not too large for human consumption)?

•Does the query return any rows or columns that aren’t really needed?

•When, and how often, does the query run? Can it perform well in that context, given reasonable per-row performance, or do we have an example of an “unsolvable” tuning problem?

Know the query context


Understand the context of the query and the rows it returns…


Know what you code

•What is the main, most-selective filter that will trim the result set from the unfiltered rowcount of the root detail table to the desired rowcount? If this isn’t obvious to the casual SQL reviewer, make it obvious, with comments.

•Is there a clean, indexed path to that that best filter, and to the other big tables, from there, and have you provided it?

Know the path to the data


Understand a clean path to the data from the best (most selective) filter…


Know what you code

•Is the SQL clear enough that the next developer (who will know far less about that module of the application), or even a SQL tuning specialist (who may know almost nothing about the application) can easily understand it?

•Are the unusual features of the SQL, if any, well commented with clear explanations?

•Is every column preceded with its alias?

Know how to make it clear


Make the SQL transparently understandable and clear…


Know what you code

•Are these rules seemingly impossible to follow on your database? Problems coding clean SQL often point to schema-design flaws – push back on these flaws when you find them.

•Does is seem impossible to make the deadline with clean SQL, following these rules? Would it be better to release the wrong SQL on time? (NO!)

Know the value of getting it right


Know the value of getting it right in the first place…


Conclusion

•All SQL can be built right!•All SQL should be built right!•SQL that is built following the right principles rarely needs fixing, and can always be fixed easily when necessary!

Questions?

getting sql right the first try (most of the time!) may, 2008 ©2007 dan tow, all rights reserved...

Documents

codegetting sql

sql tuners

unclean sql

codefirst principles

viewsyou dont

clean spec

application requirements

application needs