10 Things You Shouldn't Do with SQL Server (Data Access Developer "Don'ts")
10 Things You Shouldn't Do with SQL Server (Data Access Developer "Don'ts") |
By Doug Seven Published: 4/14/2005 Reader Level: Beginner Intermediate Rated: 4.50 by 12 member(s). |
|
This article is modeled after a presentation I gave at TechEd 2004, which has since been requested by many user groups around the country. I figured it would be easier to share this information through an article than trying to get to every user group in the world. The content of this article is based on an online discussion I had with anyone in the community who chose to participate. I posted a blog entry and a forum entry in a couple places to ask the question, "What are the things you see developers doing with SQL Server data access that they shouldn't?" The list grew to about 25 or 26 things that were hot topics (so much that the Microsoft SQL Server product team was passing the thread around). Through a non-scientific vote, we narrowed the list to the 10 most frequent, most performance inhibiting, or most vulnerable security issues. The list that follows is that list - a non-scientific list of 10 things you shouldn't do with SQL Server (or at least know what you are choosing to do and its consequences). Personally, I can tell you that at some point in my career, I have done almost all of these (Hey, nobody is perfect). And here is the list (in David Letterman count down style)... 10. Add a Low Privilege Account to the Admin Role
The Admin role in SQL Server is designed to allow administrative privileges to accounts that LEGITIMATELY need it. Rarely is this ever your application account. For example, with an ASP.NET application, you should never add the ASP.NET worker process (ASPNET or NETWORK SERVICE) to the Admin role to enable a trusted connection (integrated security). This is just asking for certain disaster. In this example, the ASP.NET worker process was never intended to run under an account with administrative privileges on a SQL Server database; the ASP.NET account is intended as a low privilege account. The ASP.NET worker process is installed when you install the .NET Framework. If you are running the .NET Framework on Windows XP or Windows 2000 the ASP.NET worker process runs under the MachineName\ASPNET account. On Windows Server 2003 the ASP.NET Worker process runs under the NT Authority\Network Service account. By including this account in the administrative role you are opening yourself up to SQL injection attacks, among other things. Instead of granting a low privilege account administrative privileges to ensure that your application can do everything it needs to, take the time to determine the needs of your application. Make all possible attempts to have all of your data access managed in stored procedures. This will enable you to grant EXECUTE privileges for the ASP.NET account (or other low privilege account) on the individual stored procedures. This will not only ensure your application can do all of the things it needs to, but it will also help to tighten the overall security of your application and database. Following is an example of TSQL code to grant the ASP.NET account access to your database, and give it execute permissions to a stored procedure.
-- Windows 2000 / XP
-- Replace "MachineName" with your machine name EXEC sp_grantlogin [MachineName\ASPNET] EXEC sp_grantdbaccess [MachineName\ASPNET], [Alias] GRANT EXECUTE ON [ProcedureName] TO [Alias] GO -- Windows Server 2003 EXEC sp_grantlogin [NT AUTHORITY\NETWORK SERVICE] EXEC sp_grantdbaccess [NT AUTHORITY\NETWORK SERVICE] GRANT EXECUTE ON [ProcedureName] TO [NT AUTHORITY\NETWORK SERVICE] GO 9. @@IDENTITY vs. SCOPE_IDENTITY
This particular issue isn't so much about doing something right or wrong, it is about understanding your options so you choose the right one. Both @@IDENTITY and SCOPE_IDENTITY() return the last identity value (primary key) that was entered by your active session, but in different scenarios they can each return different values. When I say "active session" I am referring to the current activity you are engaging in. For example, if you can a stored procedure, that is what I am referring to as your active session. Each call to t a stored procedure (or user defined function, etc) is a session, unless the a stored procedure is nested in the stored procedure you are calling. In the case of a nested stored procedure or user defined method, while they are separate methods, they are part of the current session, but not part of the current scope. Your scope is limited to the method (stored procedure or user defined function) that you explicitly invoked. This is where the difference between @@IDENTITY and SCOPE_IDENTITY() comes in. @@IDENTITY will return the last identity value entered into a table in your current session (this is limited to your session only, so you won't get identities entered by other users). While @@IDENTITY is limited to the current session, it is not limited to the current scope. In other words, if you have a trigger on a table that causes an identity to be created in another table, you will get the identity that was created last, even if it was the trigger that created it. Now this isn't bad, as long as you ensure that things are done in the correct order. Where this can get ugly is when there is an application revision and a new trigger gets added that gets fired from your stored procedure. Your code didn't anticipate this new trigger, so you could now be getting an incorrect value back. SCOPE_IDENTITY(), like @@IDENTITY, will return the last identity value created in the current session, but it will also limit it to your current scope as well. In other words, it will return the last identity value that you explicitly created, rather than any identity that was created by a trigger or a user defined function. Follow is some sample script that you can run to see the different value you get back before a trigger is added, and after. /*In a test database, create a new table named TY*/
USE SomeTestDatabase CREATE TABLE TABLE_A ( TABLE_A_id int IDENTITY(100,5)PRIMARY KEY, ItemValue varchar(20) NULL) /*INSERT records into TABLE_A*/ INSERT TABLE_A VALUES ('Widget') INSERT TABLE_A VALUES ('Boat') INSERT TABLE_A VALUES ('Car') GO /*Create a new table named TABLE_B*/ CREATE TABLE TABLE_B ( TABLE_B_id int IDENTITY(1,1)PRIMARY KEY, Username varchar(20) NOT NULL) /*INSERT records into TABLE_B*/ INSERT TABLE_B VALUES ('Doug') INSERT TABLE_B VALUES ('Erika') INSERT TABLE_B VALUES ('Lola') GO /*INSERT a record into TABLE_B*/ INSERT TABLE_B VALUES ('Kali') /*SELECT the data to see what values were returned by @@IDENTITY and SCOPE_IDENTITY()*/ SELECT * FROM TABLE_A SELECT * FROM TABLE_B SELECT @@Identity AS [@@Identity], SCOPE_IDENTITY() AS [SCOPE_IDENTITY] GO /*Create the trigger that inserts a row in table TABLE_A when a row is inserted in table TABLE_B*/ CREATE TRIGGER TABLE_B_trig ON TABLE_B FOR INSERT AS BEGIN INSERT TABLE_A VALUES ('Airplane') END GO /*Now INSERT a record into TABLE_B, which will cause the trigger to be fired*/ INSERT TABLE_B VALUES ('Donny') /*SELECT the data to see what values were returned by @@IDENTITY and SCOPE_IDENTITY() - they will be different values. SCOPE_IDENTITY() will return the identity from TABLE_A (the identity that you explicitly created), while @@IDENTITY will return the identity from TABLE_B (the triggered item).*/ SELECT * FROM TABLE_A SELECT * FROM TABLE_B SELECT @@Identity AS [@@Identity], SCOPE_IDENTITY() AS [SCOPE_IDENTITY] GO 8. Fetch Semi-static Data on Each Request of a Resource
Ah, performance. Ultimately that is what we are talking about here. If, in your application, you have some semi-static data (data that doesn't change very frequently), and you are making calls to your data store on each request of the resource, you are missing a huge opportunity to increase the performance of your application. Data that is semi-static (even if it is only static for a short amount of time) can be cached in the application to reduce the overhead associated with making database calls frequently. There are a couple options for caching in your application.
Using the Cache API:
Using Output Caching:
7. Include SQL Data Manipulation Language in Application Code
Embedding SQL code in your application code is simply asking for trouble. Not only could you be opening yourself to a SQL Injection attack, you are also creating code that is harder to maintain than it should be. With hard coded SQL in your application code, any time you want to make even a slight change to the SQL code, you have to recompile the entire application. For example, the following SQL in your application is the code equivalent of a near death experience.
For a full explanation on what can happen with this kind of code, read Stop SQL Injection Attacks Before They Stop You by Paul Litwin. Of course, a better solution that concatenated string (if you MUST have SQL syntax in your code) is to use parameterized queries. Here you can see the same query using parameters (which will aid in preventing SQL injection attacks).
An even better solution is to use stored procedures so that your queries are stored in the database, where they are compiled and optimized, and can be modified without having to recompile your application code.
The motto to live by is "embeddedSql == death;" 6. Abuse SELECT *
Strangely enough it seems that there is an all too common habit to do lazy data fetching. That is, many of us are still using "SELECT * FROM..." when we are writing our data access queries. This is bad news. Apparently many developers are still writing queries like this because at design time there are only a few fields, or the amount of data is small, or some other excuse. But what happens when the scope grows, and a new field is added. An Image field to hold a 1024x768 picture of a user for example. Now every call that uses "SELECT * FROM..." is pulling back this very large image (potentially multiple megabytes). This has HUGE performance implications. This is just lazy. My theory is, at design time you will know all of the queries you need to do, and can write explicit stored procedures to return ONLY the data you NEED for those queries - no exceptions. You should never have to use "SELECT * FROM..." again. 5. Create Stored Procedures without Exception Handling
Every day you write code (I hope). And every day you likely write some exception handling in your code because you know that something could go wrong. The input may not be what you expect, data may not be returned from a query, etc. Strangely enough, not enough of us are writing exception handling in our stored procedures. Is it that strange to think that nothing will ever go wrong in our stored procedures? So you're telling me that you just handle the exception in your application code? Why not handle the exception as close to the source as possible? That is my #5 recommendation. Following is an example of one option for exception handling in a stored procedure. There are lots of methodologies you could use, and this is only one. In this example we rely on an XML file in the application that cross references predefined error codes with human readable, user friendly error messages. The error codes were defined by our application architect.
When the error code is returned to the application you can interrogate its value. If it is "0" then there was no error. If the value is "32" you cross reference that with the ErrorCodes.xml file to return the string value:
If the error code is "1", then you cross reference that with the ErrorCodes.xml file to return the string value:
At the very least you should always include some form of exception handling on INSERT, UPDATE and DELETE statements.
An important note: As soon as you interrogate the @@ERROR value it gets reset to "0". It is critical that you copy the value of @@ERROR to a local variable to ensure you can work with the value without accidentally resetting it.
4. Prefix Stored Procedures with "sp_"
I am often reminded of my early learning with SQL Server when I stumble across this little gem. You see, when I was first learning SQL Server, I looked at how things were named and saw a common theme, stored procedures were named with the Hungarian notation "sp_". Much to my dismay I later learned that "sp_" is the notation for "System Stored Procedure" (why they didn't use "ssp_" I'll never know). I say I am often reminded of this because I see it in code all over the place. I once stumbled into hundreds of these guys in an internal application created and used by one of the largest software companies in the world (name excluded to protect the guilty). Allow me to quote SQL Server Books Online to provide clarity here: System Stored Procedures 3. You Don't Protect the Database Connection String
The database connection string is probably the single most important secret information that your application uses. You MUST protect it at all costs. In the past people just like me (OK, it was me) told you that it was OK to store your connection string in the web.config file. Well I am here to tell you I was wrong (hey, it can happen to any of us). In the early days of ASP.NET (circa 2000) we thought that this was true - that storing the connection string in the web.config was OK. The reality is that this is a big security risk - the web.config file is an XML file - human readable - which means if someone (external or internal) gets a hold of it, you're in big trouble. All secrets are revealed. New Rule: Storing the connection string in the web.config file unencrypted is too risky - store it in the web.config file encrypted. Of course this brings up the next question, "Where do I store my encryption key?" The answer, "You don't - let Windows do it for you using the Data Protection API (DPAPI)." Windows 2000, XP and 2003 all include the Win32 DPAPI. This is an unmanaged API that you can use to do strong encryption of information, while letting Windows manage the key storage. Simple as anything. There is one important fact however - anything encrypted using DPAPI can only be decrypted on the same machine it was encrypted on. That means if I encrypt something on Machine_1, I cannot decrypt it on Machine_2. This means DPAPI is not a good solution for encrypting values you will store in the database, but it is good for encrypting connections strings and storing them in the web.config. You simply need to encrypt the raw connection string on each machine you deploy to, storing the machine specific encrypted value in the web.config file on that machine. The .NET Framework v1.x doesn't include a managed wrapper for the Win32 DPAPI. You will need to write your own wrapper. Fortunately the Patterns & Practices group at Microsoft has helped out by providing a complete tutorial that includes cut-n-paste code (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/secmod/html/secmod21.asp). 2. Accept All Input
In the book, "Writing Secure Code, Second Edition" (MSPress), Michael Howard writes "All Input is Evil" (in fact, that is the name of Chapter 10). That's it...as simple as it sounds...all input is evil. Treat any input from a user - TextBox, QueryString, Cookie - as suspect until you have validated it to ensure it's legitimacy. The .NET Framework ships with a number of tools to validate user input, either at the client or on the server (or both).
1. Access the Database from the Application with the "sa" Account
Honestly I was stunned to see this at the top of the list. I thought we had all learned by now that the "sa" account is an administrative only account. Apparently note, so let me get on the soap box one last time:
Hopefully this Top 10 list has been helpful. Again, this isn't the end all be all of do's and don'ts with SQL Server, but it is a list that was compiled by a number of developers working in client locations, and seeing this stuff happen on a daily basis. Learn from it, love it, leave it. |