SQL Development



























SET XACT_ABORT

When SET XACT_ABORT is ON, if a Transact-SQL statement raises a run-time error, the entire transaction is terminated and rolled back.
When SET XACT_ABORT is OFF, in some cases only the Transact-SQL statement that raised the error is rolled back and the transaction continues processing. Depending upon the severity of the error, the entire transaction may be rolled back even when SET XACT_ABORT is OFF. OFF is the default setting.
Compile errors, such as syntax errors, are not affected by SET XACT_ABORT.
XACT_ABORT must be set ON for data modification statements in an implicit or explicit transaction against most OLE DB providers, including SQL Server. The only case where this option is not required is if the provider supports nested transactions.
When ANSI_WARNINGS=OFF, permissions violations cause transactions to abort.
The setting of SET XACT_ABORT is set at execute or run time and not at parse time.
To view the current setting for this setting, run the following query.

GO
INSERT INTO t1 VALUES (1);
INSERT INTO t1 VALUES (3);
INSERT INTO t1 VALUES (4);
INSERT INTO t1 VALUES (6);
GO
SET XACT_ABORT OFF;
GO
BEGIN TRANSACTION;
INSERT INTO t2 VALUES (1);
INSERT INTO t2 VALUES (2); -- Foreign key error.
INSERT INTO t2 VALUES (3);
COMMIT TRANSACTION;
GO
SET XACT_ABORT ON;
GO
BEGIN TRANSACTION;
INSERT INTO t2 VALUES (4);
INSERT INTO t2 VALUES (5); -- Foreign key error.
INSERT INTO t2 VALUES (6);
COMMIT TRANSACTION;


sys.dm_tran_locks

Returns information about currently active lock manager resources in SQL Server 2016. Each row represents a currently active request to the lock manager for a lock that has been granted or is waiting to be granted.

The columns in the result set are divided into two main groups: resource and request. The resource group describes the resource on which the lock request is being made, and the request group describes the lock request.

sys.dm_tran_session_transactions

Returns correlation information for associated transactions and sessions.
Through bound sessions and distributed transactions, it is possible for a transaction to be running under more than one session. In such cases, sys.dm_tran_session_transactions will show multiple rows for the same transaction_id, one for each session under which the transaction is running.

Schema binding is a way of ensuring that the objects referenced within a function or view, do not change their definition in any way that would break the binded object. Objects are prevented from changing their defintions while they are schema bound

10 steps to optimize data access in SQL Server

indexing
Make sure that every table in your database has a primary key.
Create non-clustered indexes on columns which are:


  • Frequently used in the search criteria
  • Used to join other tables
  • Used as foreign key fields
  • Of having high selectivity (column which returns a low percentage (0-5%) of rows from a total number of rows on a particular value)
  • Used in the ORDER BY clause
  • Of type XML (primary and secondary indexes need to be created; more on this in the coming articles)


Use the Database Tuning Advisor's help while creating covered index

We all know, when a SQL is issued, the optimizer in the SQL Server engine dynamically generates different query plans based on:


  • Volume of data
  • Statistics
  • Index variation
  • Parameter value in TSQL
  • Load on server


index fragmentation?

Index fragmentation is a situation where index pages split due to heavy insert, update, and delete operations on the tables in the database. If indexes have high fragmentation, either scanning/seeking the indexes takes much time, or the indexes are not used at all (resulting in table scan) while executing queries. Thus, data retrieval operations perform slow.

defragment indexes


SQL Injection

In the following example, assume that a web site is being used to mount an attack on the database. If you think about a typical SQL statement, you might think of something like:

Hide   Copy Code
SELECT ProductName, QuantityPerUnit, UnitPrice 
FROM Products 
WHERE ProductName LIKE 'G%'

The objective of the attacker is to inject their own SQL into the statement that the application will use to query the database. If, for instance, the above query was generated from a search feature on a web site, then they user may have inserted the "G" as their query. If the server side code then inserts the user input directly into the SQL statement, it might look like this:

Hide   Copy Code
string sql = "SELECT ProductName, QuantityPerUnit, UnitPrice "+
    "FROM Products " +
    "WHERE ProductName LIKE '"+this.search.Text+"%';
SqlDataAdapter da = new SqlDataAdapter(sql, DbCommand);
da.Fill(productDataSet);
This is all fine if the data is valid, but what if the user types something unexpected? What happens if the user types:

Hide   Copy Code
' UNION SELECT name, type, id FROM sysobjects;--

Note the initial apostrophe; it closes the opening quote in the original SQL statement. Also, note the two dashes at the end; that starts a comment, which means that anything left in the original SQL statement is ignored.

Now, when the attacker views the page that was meant to list the products the user has searched for, they get a list of all the names of all the objects in the database and the type of object that they are. From this list, the attacker can see that there is a table called Users. If they take note of the id for the Users table, they could then inject the following:


Locking down


Security is something that needs to be tackled on many levels because a chain is only as strong as its weakest link. When a user interacts with a piece of software, there are many links in the chain; if the user is malicious, he could attempt to attack these links to find the weak point and attempt to break the system at that point. With this in mind, it is important that the developer does not become complacent about the security of the system because one security measure is put in place, or a set of security measures are in place on only one part of the system.


Encrypting data

Starting from the proposition that somehow an attacker has managed to break through all other defenses, what information is so sensitive that it needs to remain a secret? Candidates for encryption include user log in details or financial information such as credit card details.

For items such as passwords, the user's password can be stored as a "salted hash". What happens is that when a user creates a password, a randomly generated "salt" value is created by the application and appended to the password, and the password-and-salt are then passed through a one way encryption routine, such as found in the .NET Framework's helper class FormsAuthentication (HashPasswordForStoringInConfigFile method). The result is a salted hash which is stored in the database along with the clear text salt string

Least Privilege - Database account

Running an application that connects to the database using the database's administrator account has the potential for an attacker to perform almost limitless commands with the database. Anything an administrator can do, so can an attacker.

Using the example application above, an attacker could inject the following to discover the contents of the hard disk(s) on the server.


Cleaning and Validating input

In many applications, the developer has side-stepped the potential use of the apostrophe as a way to get access to the system by performing a string replace on the input given by the user. This is useful for valid reasons, such as being able to enter surnames such as "O'Brian" or "D'Arcy", and so the developer may not even realise that they have partly defeated a SQL injection attack. For example:


Second-Order Attacks

A second-order attack is one where the data lies dormant in the database until some future event occurs. It often happens because once data is in the database, it is often thought of as being clean and is not checked again. However, the data is frequently used in queries where it can still cause harm.

Consider an application that permits the users to set up some favourite search criteria. When the user defines the search parameters, the application escapes out all the apostrophes so that a first-order attack cannot occur when the data for the favourite is inserted into the database. However, when the user comes to perform the search, the data is taken from the database and used to form a second query which then performs the actual search. It is this second query which is the victim of the attack.

The application takes this input and escapes out apostrophe so that the final SQL statement might look like this:

Hide   Copy Code
INSERT Favourites (UserID, FriendlyName, Criteria)
VALUES(123, 'My Attack', ''';DELETE Orders;--')

he second query to the database, when fully expanded, now looks like this:

Hide   Copy Code
SELECT * FROM Products WHERE ProductName = ''; DELETE Orders;--


Parameterised Queries


SQL Server, like many database systems, supports a concept called parameterised queries. This is where the SQL Command uses a parameter instead of injecting the values directly into the command. The particular second-order attack above would not have been possible if parameterised queries had been used.

SQL Locking

Isolation Level
Definition
Read uncommitted
The lowest isolation level where transactions are isolated only enough to ensure that physically corrupt data is not read. In this level, dirty reads are allowed, so one transaction may see not-yet-committed changes made by other transactions.
Read committed
Allows a transaction to read data previously read (not modified) by another transaction without waiting for the first transaction to complete. The Database Engine keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed. This is the Database Engine default level.
Repeatable read
The Database Engine keeps read and write locks that are acquired on selected data until the end of the transaction. However, because range-locks are not managed, phantom reads can occur.
Serializable
The highest level where transactions are completely isolated from one another. The Database Engine keeps read and write locks acquired on selected data to be released at the end of the transaction. Range-locks are acquired when a SELECT operation uses a ranged WHERE clause, especially to avoid phantom reads.
Important note Important
DDL operations and transactions on replicated tables may fail when serializable isolation level is requested. This is because replication queries use hints that may be incompatible with serializable isolation level.
Row Versioning Isolation Level
Definition
Read Committed Snapshot
When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. That is, the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.
When the READ_COMMITTED_SNAPSHOT database option is set OFF, which is the default setting, read committed isolation uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. Both implementations meet the ISO definition of read committed isolation.
Snapshot
The snapshot isolation level uses row versioning to provide transaction-level read consistency. Read operations acquire no page or row locks; only SCH-S table locks are acquired. When reading rows modified by another transaction, they retrieve the version of the row that existed when the transaction started. You can only use Snapshot isolation against a database when the ALLOW_SNAPSHOT_ISOLATION database option is set ON. By default, this option is set OFF for user databases.
Note Note
SQL Server does not support versioning of metadata. For this reason, there are restrictions on what DDL operations can be performed in an explicit transaction that is running under snapshot isolation. The following DDL statements are not permitted under snapshot isolation after a BEGIN TRANSACTION statement: ALTER TABLE, CREATE INDEX, CREATE XML INDEX, ALTER INDEX, DROP INDEX, DBCC REINDEX, ALTER PARTITION FUNCTION, ALTER PARTITION SCHEME, or any common language runtime (CLR) DDL statement. These statements are permitted when you are using snapshot isolation within implicit transactions. An implicit transaction, by definition, is a single statement that makes it possible to enforce the semantics of snapshot isolation, even with DDL statements. Violations of this principle can cause error 3961: "Snapshot isolation transaction failed in database '%.*ls' because the object accessed by the statement has been modified by a DDL statement in another concurrent transaction since the start of this transaction. It is not allowed because the metadata is not versioned. A concurrent update to metadata could lead to inconsistency if mixed with snapshot isolation



Isolation level
Dirty read
Nonrepeatable read
Phantom
Read uncommitted
Yes
Yes
Yes
Read committed
No
Yes
Yes
Repeatable read
No
No
Yes
Snapshot
No
No
No
Serializable
No
No
No

XML Operation

The RAW Mode


The RAW mode generates a single XML element for each row in the result set returned by the query.

To use the FOR XML clause in RAW mode, you simply append the clause and RAW keyword to your SELECT statement, as shown in the following example:

SELECT e.EmployeeID, c.FirstName, c.MiddleName, c.LastName
FROM HumanResources.Employee e INNER JOIN Person.Contact c
   ON c.ContactID = e.ContactID
WHERE c.FirstName = 'Rob'
FOR XML RAW;


With the addition of the FOR XML clause, the statement returns the data as the following XML:

<row EmployeeID="4" FirstName="Rob" LastName="Walters" />
<row EmployeeID="168" FirstName="Rob" MiddleName="T" LastName="Caron" />


SELECT e.EmployeeID, c.FirstName, c.MiddleName, c.LastName
FROM HumanResources.Employee e INNER JOIN Person.Contact c
   ON c.ContactID = e.ContactID
WHERE c.FirstName = 'Rob'
FOR XML RAW ('Employee'), ROOT;
Notice that you must include a comma when adding an option such as ROOT in order to separate the elements. As the following results show, a <root> element is now included in the XML:

<root>
  <Employee EmployeeID="4" FirstName="Rob" LastName="Walters" />
  <Employee EmployeeID="168" FirstName="Rob" MiddleName="T" LastName="Caron" />
</root>

SELECT e.EmployeeID, c.FirstName, c.MiddleName, c.LastName
FROM HumanResources.Employee e INNER JOIN Person.Contact c
   ON c.ContactID = e.ContactID
WHERE c.FirstName = 'Rob'
FOR XML RAW ('Employee'), ROOT ('Employees'), ELEMENTS;
Once again, I’ve added a comma to separate the options. As you can see in the following results, each <Employee> element now includes a set of child elements that correspond to the columns returned by the query:

<Employees>
  <Employee>
    <EmployeeID>4</EmployeeID>
    <FirstName>Rob</FirstName>
    <LastName>Walters</LastName>
  </Employee>
  <Employee>
    <EmployeeID>168</EmployeeID>
    <FirstName>Rob</FirstName>
    <MiddleName>T</MiddleName>
    <LastName>Caron</LastName>
  </Employee>
</Employees>


The AUTO Mode


The AUTO mode in a FOR XML clause is slightly different from the RAW mode in the way that it generates the XML result set. The AUTO mode generates the XML by using heuristics based on how the SELECT statement is defined. The best way to understand how this works is to look at an example. The following SELECT statement, as in the previous examples, retrieves employee data from the AdventureWorks database:

SELECT Employee.EmployeeID, ContactInfo.FirstName,
   ContactInfo.MiddleName, ContactInfo.LastName
FROM HumanResources.Employee AS Employee
   INNER JOIN Person.Contact AS ContactInfo
   ON ContactInfo.ContactID = Employee.ContactID
WHERE ContactInfo.FirstName = 'Rob'
FOR XML AUTO, ROOT ('Employees');


Notice that I’ve provided meaningful alias names to the tables (Employee and Contact info). These names are used in defining the XML element names, so you’ll want to construct your SELECT statements accordingly. Now take a look at the results returned by this query:

<Employees>
  <Employee EmployeeID="4">
    <ContactInfo FirstName="Rob" LastName="Walters" />
  </Employee>
  <Employee EmployeeID="168">
    <ContactInfo FirstName="Rob" MiddleName="T" LastName="Caron" />
  </Employee>
</Employees>

SELECT Employee.EmployeeID, ContactInfo.FirstName,
   ContactInfo.MiddleName, ContactInfo.LastName
FROM HumanResources.Employee AS Employee
   INNER JOIN Person.Contact AS ContactInfo
   ON ContactInfo.ContactID = Employee.ContactID
WHERE ContactInfo.FirstName = 'Rob'
FOR XML AUTO, ROOT ('Employees'), ELEMENTS;
As you can see in the following XML result set, the column values are now included as child elements, rather than attributes:

<Employees>
  <Employee>
    <EmployeeID>4</EmployeeID>
    <ContactInfo>
      <FirstName>Rob</FirstName>
      <LastName>Walters</LastName>
    </ContactInfo>
  </Employee>
 <Employee>
    <EmployeeID>168</EmployeeID>
    <ContactInfo>
      <FirstName>Rob</FirstName>
      <MiddleName>T</MiddleName>
      <LastName>Caron</LastName>
    </ContactInfo>
  </Employee>
</Employees>

The EXPLICIT Mode


The EXPLICIT mode provides very specific control over your XML, but this mode is much more complex to use than the RAW or AUTO modes. To use this mode, you must build your SELECT statements in such as way as to define the XML hierarchy and structure. In addition, you must create a SELECT statement for each level of that hierarchy and use UNION ALL clauses to join those statements.

There are a number of rules that describe how to define your SELECT statements when using the EXPLICIT mode, and it is beyond the scope of this article to review all those rules, so be sure to refer to the topic “Using EXPLICIT Mode” in SQL Server Books Online for the details about how to construct your SELECT statements. In the meantime, let’s take a look at a few examples that help demonstrate some of the basic elements of the EXPLICIT mode.


SELECT 1 AS Tag,
   NULL AS Parent,
   e.EmployeeID AS [Employee!1!EmployeeID!ELEMENT],
   NULL AS [ContactInfo!2!FirstName!ELEMENT],
   NULL AS [ContactInfo!2!MiddleName!ELEMENT],
   NULL AS [ContactInfo!2!LastName!ELEMENT]
FROM HumanResources.Employee e INNER JOIN Person.Contact c
   ON c.ContactID = e.ContactID
WHERE c.FirstName = 'Rob'
UNION ALL
SELECT 2 AS Tag,
   1 AS Parent,
   e.EmployeeID,
   c.FirstName,
   c.MiddleName,
   c.LastName
FROM HumanResources.Employee e INNER JOIN Person.Contact c
   ON e.ContactID = c.ContactID
WHERE c.FirstName = 'Rob'
ORDER BY [Employee!1!EmployeeID!ELEMENT], [ContactInfo!2!FirstName!ELEMENT]
FOR XML EXPLICIT;
Now the EmployeeID value will be displayed as a child element of <Employee>,the first level element:

<Employee>
  <EmployeeID>4</EmployeeID>
  <ContactInfo>
    <FirstName>Rob</FirstName>
    <LastName>Walters</LastName>
  </ContactInfo>
</Employee>
<Employee>
  <EmployeeID>168</EmployeeID>
  <ContactInfo>
    <FirstName>Rob</FirstName>
    <MiddleName>T</MiddleName>
    <LastName>Caron</LastName>
  </ContactInfo>
</Employee>


The PATH Mode


When you specify the PATH mode in the FOR XML clause, column names (or their aliases) are treated as XPath expressions that determine how the data values will be mapped to the XML result set. By default, XML elements are defined based on column names. You can modify the default behavior by using the at (@) symbol to define attributes or the forward slash (/) to define the hierarchy. Let’s take a look at a few examples to demonstrate how all this works.


SELECT e.EmployeeID, c.FirstName,
   c.MiddleName, c.LastName
FROM HumanResources.Employee AS e
   INNER JOIN Person.Contact AS c
   ON c.ContactID = e.ContactID
WHERE c.FirstName = 'Rob'
FOR XML PATH;
Because no specific attributes or hierarchies have been defined, the query will return the following XML:

<row>
  <EmployeeID>4</EmployeeID>
  <FirstName>Rob</FirstName>
  <LastName>Walters</LastName>
</row>
<row>
  <EmployeeID>168</EmployeeID>
  <FirstName>Rob</FirstName>
  <MiddleName>T</MiddleName>
  <LastName>Caron</LastName>
</row>

Trigger


There are three action query types that you use in SQL which are INSERT, UPDATE and DELETE. So, there are three types of triggers and hybrids that come from mixing and matching the events and timings that fire them. Basically, triggers are classified into two main types:

After Triggers (For Triggers)
Instead Of Triggers


i) After Triggers
These triggers run after an insert, update or delete on a table. They are not supported for views.
AFTER TRIGGERS can be classified further into three types as:
AFTER INSERT Trigger.
AFTER UPDATE Trigger.
AFTER DELETE Trigger.

(ii) Instead Of Triggers
These can be used as an interceptor for anything that anyone tried to do on our table or view. If you define an Instead Of trigger on a table for the Delete operation, they try to delete rows, and they will not actually get deleted (unless you issue another delete instruction from within the trigger)

INSTEAD OF TRIGGERS can be classified further into three types as:

INSTEAD OF INSERT Trigger.
INSTEAD OF UPDATE Trigger.
INSTEAD OF DELETE Trigger.


Nested Triggers


Triggers can be nested to a maximum of 32 levels. If a trigger changes a table on which there is another trigger, the second trigger is activated and can then call a third trigger, and so on.

Trigger Limitations


CREATE TRIGGER must be the first statement in the batch and can apply to only one table.
A trigger is created only in the current database; however, a trigger can reference objects outside the current database.
If the trigger schema name is specified to qualify the trigger, qualify the table name in the same way.
The same trigger action can be defined for more than one user action (for example, INSERT and UPDATE) in the same CREATE TRIGGER statement.
INSTEAD OF DELETE/UPDATE triggers cannot be defined on a table that has a foreign key with a cascade on DELETE/UPDATE action defined.

Any SET statement can be specified inside a trigger. The SET option selected remains in effect during the execution of the trigger and then reverts to its former setting.

SCHEMABINDING

When you use the SchemaBinding keyword while creating a view or function you bind the structure of any underlying tables or views. So what does that mean? It means that as long as that schemabound object exists as a schemabound object (ie you don’t remove schemabinding) you are limited in changes that can be made to the tables or views that it refers to.

PERSISTED

Specifies that the Database Engine will physically store the computed values in the table, and update the values when any other columns on which the computed column depends are updated. Marking a computed column as PERSISTED allows an index to be created on a computed column that is deterministic, but not precise. For more information, see Indexes on Computed Columns. Any computed columns used as partitioning columns of a partitioned table must be explicitly marked PERSISTED. computed_column_expression must be deterministic when PERSISTED is specified.

RANK (Transact-SQL)

If two or more rows tie for a rank, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.

RANK ( ) OVER ( [ partition_by_clause ] order_by_clause )
USE AdventureWorks2012;
GO
SELECT i.ProductID, p.Name, i.LocationID, i.Quantity
    ,RANK() OVER
    (PARTITION BY i.LocationID ORDER BY i.Quantity DESC) AS Rank
FROM Production.ProductInventory AS i
INNER JOIN Production.Product AS p
    ON i.ProductID = p.ProductID
WHERE i.LocationID BETWEEN 3 AND 4
ORDER BY i.LocationID;
GO


DENSE_RANK (Transact-SQL)

Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.

Using PIVOT and UNPIVOT

You can use the PIVOT and UNPIVOT relational operators to change a table-valued expression into another table. PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output. UNPIVOT performs the opposite operation to PIVOT by rotating columns of a table-valued expression into column values.


SELECT <non-pivoted column>,
    [first pivoted column] AS <column name>,
    [second pivoted column] AS <column name>,
    ...
    [last pivoted column] AS <column name>
FROM
    (<SELECT query that produces the data>)
    AS <alias for the source query>
PIVOT
(
    <aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;

Using Common Table Expressions

A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query.

WITH Sales_CTE (SalesPersonID, SalesOrderID, SalesYear)
AS
-- Define the CTE query.
(
    SELECT SalesPersonID, SalesOrderID, YEAR(OrderDate) AS SalesYear
    FROM Sales.SalesOrderHeader
    WHERE SalesPersonID IS NOT NULL
)
-- Define the outer query referencing the CTE name.
SELECT SalesPersonID, COUNT(SalesOrderID) AS TotalSales, SalesYear
FROM Sales_CTE
GROUP BY SalesYear, SalesPersonID
ORDER BY SalesPersonID, SalesYear;
GO

EXECUTE AS Clause

By specifying the context in which the module is executed, you can control which user account the Database Engine uses to validate permissions on objects that are referenced by the module. This provides additional flexibility and control in managing permissions across the object chain that exists between user-defined modules and the objects referenced by those modules. Permissions must be granted to users only on the module itself, without having to grant them explicit permissions on the referenced objects. Only the user that the module is running as must have permissions on the objects accessed by the module.

Functions (except inline table-valued functions), Stored Procedures, and DML Triggers
{ EXEC | EXECUTE } AS { CALLER | SELF | OWNER | 'user_name' }

CALLER
Specifies the statements inside the module are executed in the context of the caller of the module. The user executing the module must have appropriate permissions not only on the module itself, but also on any database objects that are referenced by the module.

SELF
EXECUTE AS SELF is equivalent to EXECUTE AS user_name, where the specified user is the person creating or altering the module. The actual user ID of the person creating or modifying the modules is stored in the execute_as_principal_id column in the sys.sql_modules or sys.service_queues catalog view.

OWNER
Specifies the statements inside the module executes in the context of the current owner of the module. If the module does not have a specified owner, the owner of the schema of the module is used. OWNER cannot be specified for DDL or logon triggers.

USE Sales;
GO
CREATE PROCEDURE dbo.usp_Demo
WITH EXECUTE AS 'CompanyDomain\SqlUser1'
AS
SELECT user_name();
GO  

LAG 
Accesses data from a previous row in the same result set without the use of a self-join in SQL Server 2016. LAG provides access to a row at a given physical offset that comes before the current row. Use this analytic function in a SELECT statement to compare values in the current row with values in a previous row.

SELECT BusinessEntityID, YEAR(QuotaDate) AS SalesYear, SalesQuota AS CurrentQuota,  
       LAG(SalesQuota, 1,0) OVER (ORDER BY YEAR(QuotaDate)) AS PreviousQuota
FROM Sales.SalesPersonQuotaHistory
WHERE BusinessEntityID = 275 and YEAR(QuotaDate) IN ('2005','2006');

LEAD

Accesses data from a subsequent row in the same result set without the use of a self-join in SQL Server 2016. LEAD provides access to a row at a given physical offset that follows the current row. Use this analytic function in a SELECT statement to compare values in the current row with values in a following row.

Why cursors slow


The "overhead" with cursors is merely part of the API. Cursors are how parts of the RDBMS work under the hood. Often CREATE TABLE and INSERT have SELECT statements, and the implementation is the obvious internal cursor implementation.

Using higher-level "set-based operators" bundles the cursor results into a single result set, meaning less API back-and-forth.

Cursors predate modern languages that provide first-class collections. Old C, COBOL, Fortran, etc., had to process rows one at a time because there was no notion of "collection" that could be used widely. Java, C#, Python, etc., have first-class list structures to contain result sets.

The Slow Issue

In some circles, the relational joins are a mystery, and folks will write nested cursors rather than a simple join. I've seen truly epic nested loop operations written out as lots and lots of cursors. Defeating an RDBMS optimization. And running really slowly.

Simple SQL rewrites to replace nested cursor loops with joins and a single, flat cursor loop can make programs run in 100th the time. [They thought I was the god of optimization. All I did was replace nested loops with joins. Still used cursors.]

This confusion often leads to an indictment of cursors. However, it isn't the cursor, it's the misuse of the cursor that's the problem.

The Size Issue

For really epic result sets (i.e., dumping a table to a file), cursors are essential. The set-based operations can't materialize really large result sets as a single collection in memory.

Alternatives

I try to use an ORM layer as much as possible. But that has two purposes. First, the cursors are managed by the ORM component. Second, the SQL is separated from the application into a configuration file. It's not that the cursors are bad. It's that coding all those opens, closes and fetches is not value-add programming.


No comments:

Post a Comment