70-461 Chapter 01 - Foundations Of Querying

Microsoft Exam ID : 70-461 Microsoft Exam Title : Querying Microsoft SQL Server 2012 Microsoft Exam Url : http://www.microsoft.com/learning/en-us/exam-70-461.aspx

Chapter One - Foundations Of Querying

  • RDBMS - Relational Database Management System.
  • SQL is a standard of both Intentional Organisation Standards (ISO) and American National Institute (ANSI) though there has been many major revisions over time including SQL:2008 and SQL:2011.
  • Writing SQL in standard manor is considered best practice and makes it more portable.
  • T-SQL supports two 'not equal to' expressions which are '<>' (standard) and '!=' (non-standard).
  • CAST is the standard where as CONVERT is not. Both have different usage styles.
  • CTE - Common Table Expression
  • Standard SQL is based on the relational database model which is a mathematical model initially created by Edgar F. Codd in 1969. Relational in relation database does NOT have anything to do with the relationships between tables (foreign keys), it actually comes from the mathematical concept relation.
  • A relation in the relational model is what SQL calls a table to represent a relation though some say this is not a very clear or successful attempt.
  • SQL is based on the relation model though does differ.
  • A relation (table) has a heading and body. The heading is a set of attributes (represented by columns) and an attribute is identified by name and type name. The body is a set of tuples (rows) and each tuple's heading is the heading of the relation.
  • Most important principals regarding T-SQL stem from relational model's core foundations the set theory and predicate logic.
  • Set theory - interact with the set as a whole not on a element by element basis as well as the set containing no duplicates. I.e {a,b,c} is equal to {a,b,b,c,c}. Another aspect that isn't defined by implied is that there aren't any relevance to the order of the elements in a set.
  • Predicate logic - an predicate is an expression when attributed to an object, i.e 'Price less than £10.00'. You can evaluate an predicate to get a true or false proposition. Predicates are used to enforce data integrity as well as filtering data.

Using T-SQL In A Relational Way

  • In T-SQL you should try to avoid using iterative constructs like cursors and loops that iterate through rows one by one.
  • T-SQL doesn't always enforce that a set doesn't have duplicates, i.e you can have a table without a key and duplicate rows. You should though enforce uniqueness in your tables using a primary key or unique constraint for example. Even though a table doesn't allow duplicate rows a query against that table can still return duplicates in the superset (not a 'set' as that doesn't contain duplicates). I.e selecting just the city column may return duplicates.
  • T-SQL is actually more based on multiset theory than on set theory. A multiset can also be called a superset and does allow duplicates though this isn't based on the relational theory. The DISTINCT clause is used to remove duplicates to make the data returned a set rather than a superset.
  • A 'set' doesn't state an relevance to the order of the elements and even though you may understand the physical representation of the data you shouldn't assume any order.
  • When using an ORDER BY the results aren't relational and is what standard SQL calls a cursor.
  • Columns are presented in the order as they are stored in the table definition.
  • Resulting columns in T-SQL don't have to have an assigned name on the target column though this means it breaks away from the relation model where all attributes must have names. These attributes must also be unique.
  • As well as a predicate evaluating to True or False, Codd wanted to reflect two different missing values. though being as T-SQL is based on SQL there is only one general purpose mark called NULL.
  • It's important to understand what happens when NULL's are involved in the data your querying like sorting, filtering, grouping, joining and intersecting.

Using Correct Terminology

  • A column and row are not an field and record. Tables are logical and have logical rows and columns.
  • "Null Value" is wrong. NULL is not a value, it is a term to describe a missing value.

Logical Query Processing

  • Logical is the conceptual interpretation of the query which explains what the correct result is. Physical side of query processing is the done by the database engine which produces the results defined by the logical query processing.
  • The database engine can apply optimisation which means it can rearrange steps from the logical query or remove steps but only as the end result remains the same as defined by the logical query.
  • T-SQL is a declarative english like language. You define what you want as opposed to imperative languages that also define how to achieve the result. T-SQL is written to define the 'What' part of the query and the database engine is responsible to figure out the physical how part of the operation.
  • Don't draw any performance related conclusions regarding logical query processing. It's important to understand what's happening under the hood during the optimisation stage of the database engine.
  • Originally SQL was called SEQUEL which stood for 'Structured English Query Language' though was renamed after a trademark dispute that now stands for 'Structured Query Language'.
Logical Query Processing Phases
  1. FROM
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. ORDER BY

Keyed In Order

  1. SELECT
  2. FROM
  3. WHERE
  4. GROUP BY
  5. HAVING
  6. ORDER BY
  • Output from each logical steps is returned as a virtual table and is considered the input from the next phase.
  • Using Order By means the results are relational.
  • EXAM TIP - Rows which the predicate evaluates to false or to an unknown state as not returned.
  • Column aliases cannot be used in the WHERE statement as the WHERE statement is evaluated before the SELECT statement. A common mistake by someone who doesn't understand logical processing. Aliases can however be used in latter logical processing stages like ORDER BY though cannot be used in the same logical processing phrase they are declared in. I.e creating an alias in the SELECT, cannot then also be used else where in the SELECT statement.
  • HAVING filters results based on a predicate after the results have been grouped compared to WHERE which filters results on an per row basis.
  • As SQL is more based on multiset theory it's your responsibility to remove duplicates using the DISTINCT clause.
  • As SELECT is processed by the ORDER BY, the SELECT output is considered to be relation still.
  • The result of ORDER BY is what standard SQL calls a cursor though the term cursor here is used conceptual. T-SQL also supports an object called a cursor that is defined on results of a query and allows rows to be fetched one by one in a specified order.

Posted in Databases with : SQL Server, Training