January 2009 - Posts
Dear All,
Milis kita secara official sudah memiliki Web site SQLServer-indo.org, dan perkemarin teman teman komunitas dan team DPE MS Indonesia (narendra dan welly) yang sudah berhasil mengintegrasikan User di SQLServer-indo.org dengan Account di Live ID (sehingga memungkinkan teman teman yang memiliki Account di MSN atau Hotmail bisa menggunakan user id dan password yang sama untuk bisa akses dan menjadi member di SQLServer-Indo.org
Apa keuntungan bagi kita? Tentu saja ini akan memudahkan kita untuk melakukan single sign on dengan hanya cukup menggunakan satu User ID (Live ID/MSN-ID/Hotmail-ID) yang dulu namanya .NET passport untuk mengakses semua situs di lingkungan komunitas microsoft (SQLServer-Indo.org, Geeks.Netindonesia.net, Wss-id.org)
Apa yang kita miliki di SQLServer-Indo.org? kita bisa membaca postingan teman teman para aktifis milis SQLServer-Indo (seperti: Hendra EP, Hendrik, David Pakpahan, Deni Kusdeni,kikinoviandi dan yang lainnya termasuk saya J) melakukan diskusi di forum diskusi bahkan mendapatkan blog di SQLServer-indo.org
Bagi yang belum mendaftar di persilahkan registrasi di SQLServer-indo.org, bagi yang minat mendapatkan blog silahkan menghubungi saya.
10. Add a Low
Privilege Account to the Admin Role
The Admin
role in SQL Server is designed to allow administrative privileges to accounts
that LEGITIMATELY need it. Rarely is this ever your application account. For
example, with an ASP.NET application, you should never add the ASP.NET worker
process (ASPNET or NETWORK SERVICE) to the Admin role to enable a trusted
connection (integrated security). This is just asking for certain disaster. In
this example, the ASP.NET worker process was never intended to run under an
account with administrative privileges on a SQL Server database; the ASP.NET
account is intended as a low privilege account.
The
ASP.NET worker process is installed when you install the .NET Framework. If you
are running the .NET Framework on Windows XP or Windows 2000 the ASP.NET worker
process runs under the MachineName\ASPNET account. On Windows Server
2003 the ASP.NET Worker process runs under the NT Authority\Network Service
account. By including this account in the administrative role you are opening
yourself up to SQL injection attacks, among other things.
Instead of
granting a low privilege account administrative privileges to ensure that your
application can do everything it needs to, take the time to determine the needs
of your application. Make all possible attempts to have all of your data access
managed in stored procedures. This will enable you to grant EXECUTE privileges
for the ASP.NET account (or other low privilege account) on the individual
stored procedures. This will not only ensure your application can do all of the
things it needs to, but it will also help to tighten the overall security of
your application and database.
Following
is an example of TSQL code to grant the ASP.NET account access to your database,
and give it execute permissions to a stored procedure.
--
Windows 2000 / XP
-- Replace "MachineName" with your machine
name
EXEC sp_grantlogin [MachineName\ASPNET]
EXEC
sp_grantdbaccess [MachineName\ASPNET],
[Alias]
GRANT EXECUTE ON [ProcedureName] TO
[Alias]
GO
-- Windows Server
2003
EXEC sp_grantlogin [NT AUTHORITY\NETWORK
SERVICE]
EXEC sp_grantdbaccess [NT AUTHORITY\NETWORK SERVICE]
GRANT
EXECUTE ON [ProcedureName] TO [NT
AUTHORITY\NETWORK SERVICE]
GO
9. @@IDENTITY
vs. SCOPE_IDENTITY
This
particular issue isn't so much about doing something right or wrong, it is about
understanding your options so you choose the right one. Both @@IDENTITY and
SCOPE_IDENTITY() return the last identity value (primary key) that was entered
by your active session, but in different scenarios they can each return
different values. When I say "active session" I am referring to the current
activity you are engaging in. For example, if you can a stored procedure, that
is what I am referring to as your active session. Each call to t a stored
procedure (or user defined function, etc) is a session, unless the a stored
procedure is nested in the stored procedure you are calling. In the case of a
nested stored procedure or user defined method, while they are separate methods,
they are part of the current session, but not part of the current scope. Your
scope is limited to the method (stored procedure or user defined function) that
you explicitly invoked. This is where the difference between @@IDENTITY and
SCOPE_IDENTITY() comes in.
@@IDENTITY
will return the last identity value entered into a table in your current session
(this is limited to your session only, so you won't get identities entered by
other users). While @@IDENTITY is limited to the current session, it is not
limited to the current scope. In other words, if you have a trigger on a table
that causes an identity to be created in another table, you will get the
identity that was created last, even if it was the trigger that created it. Now
this isn't bad, as long as you ensure that things are done in the correct order.
Where this can get ugly is when there is an application revision and a new
trigger gets added that gets fired from your stored procedure. Your code didn't
anticipate this new trigger, so you could now be getting an incorrect value
back.
SCOPE_IDENTITY(), like @@IDENTITY, will return the last identity value
created in the current session, but it will also limit it to your current scope
as well. In other words, it will return the last identity value that you
explicitly created, rather than any identity that was created by a trigger or a
user defined function.
Follow is
some sample script that you can run to see the different value you get back
before a trigger is added, and after.
/*In a test database, create a new table named TY*/
USE SomeTestDatabase
CREATE TABLE TABLE_A ( TABLE_A_id int
IDENTITY(100,5)PRIMARY KEY, ItemValue varchar(20) NULL)
/*INSERT records into TABLE_A*/
INSERT TABLE_A VALUES ('Widget')
INSERT TABLE_A VALUES
('Boat')
INSERT TABLE_A VALUES ('Car')
GO
/*Create a new table named TABLE_B*/
CREATE TABLE TABLE_B ( TABLE_B_id int IDENTITY(1,1)PRIMARY KEY,
Username varchar(20) NOT NULL)
/*INSERT records into
TABLE_B*/
INSERT TABLE_B VALUES
('Doug')
INSERT TABLE_B VALUES ('Erika')
INSERT TABLE_B VALUES
('Lola')
GO
/*INSERT a record into
TABLE_B*/
INSERT TABLE_B
VALUES
('Kali')
/*SELECT the data to see what values
were returned by @@IDENTITY and SCOPE_IDENTITY()*/
SELECT * FROM TABLE_A
SELECT * FROM TABLE_B
SELECT @@Identity AS
[@@Identity], SCOPE_IDENTITY() AS
[SCOPE_IDENTITY]
GO
/*Create the trigger
that inserts a row in table TABLE_A when a row is inserted in table
TABLE_B*/
CREATE TRIGGER TABLE_B_trig
ON
TABLE_B
FOR INSERT AS
BEGIN
INSERT TABLE_A VALUES
('Airplane')
END
GO
/*Now INSERT a record
into TABLE_B, which will cause the trigger to be fired*/
INSERT TABLE_B
VALUES ('Donny')
/*SELECT the data to see what values were returned by @@IDENTITY and
SCOPE_IDENTITY() - they will be different values. SCOPE_IDENTITY() will return
the identity from TABLE_A (the identity that you explicitly created), while
@@IDENTITY will return the identity from TABLE_B (the triggered
item).*/
SELECT * FROM TABLE_A
SELECT * FROM
TABLE_B
SELECT @@Identity AS [@@Identity], SCOPE_IDENTITY() AS
[SCOPE_IDENTITY]
GO
8. Fetch
Semi-static Data on Each Request of a Resource
Ah,
performance. Ultimately that is what we are talking about here. If, in your
application, you have some semi-static data (data that doesn't change very
frequently), and you are making calls to your data store on each request of the
resource, you are missing a huge opportunity to increase the performance of your
application. Data that is semi-static (even if it is only static for a short
amount of time) can be cached in the application to reduce the overhead
associated with making database calls frequently.
There are
a couple options for caching in your application.
1. Cache API: The Cache API is
your application-level cache. This is where you can put ANY object and define
rules around how long the object stays in the cache. The size of the cache
(e.g., how many items you can put in it) is dictated by the amount of RAM on the
machine running the application. The nice thing about the Cache API is you can
put any object into the cache, and pull it out later to reuse it, even complex
objects. You can define a sliding expiration time as the life span of the item
in the cache (e.g., keep it in the cache as long it has been used in the past
5-minutes, but kill it after 5-minutes of idle time). You can alternately define
an absolute expiration time (keep the item in cache for 1-hour and then remove
it, regardless of how much it gets used or not). You can also define a file
dependency (e.g., keep the item in cache until this file gets updated). This
works great for caching XML data, and invalidating the cache if the XML file
changes.
2. Output Caching: For data you want to cache, where you don't need access
to the raw data again, you can cache the output of the data. This is output
caching; caching only the rendered HTML, not the actual object that you used to
create the HTML. This is also very easy to implement, as seen in the second
example below.
Using
the Cache API:
DataTable productsTable;
// Insert code here to
get a table of Product data
//The following code puts the object in
cache
Cache.Add(
"ProductsTable", //Name
productsTable, //Object to
cache
null, //CacheDependency
DateTime.Now.AddSeconds(60), //Absolute
Expiration
TimeSpan.Zero, //Sliding
Expiration
CacheItemPriority.High, //Priority
null //onRemoveCallback
);
//The following code can be used to get
the object from cache
if(Cache["ProductsTable"] != null)
productsTable =
(DataTable)Cache["ProductsTable"];
Using Output Caching:
<%-- Sets the cache to 60-seconds --%>
<%@ OutputCache Duration="60" VaryByParam="None" %>
<%-- Sets the cache to 60-seconds and creates a separate cached version of the page based on the "City" parameter --%>
<%@ OutputCache Duration="60" VaryByParam="City" %>
<%-- Sets the cache to 60-seconds and creates a separate cached version of the page for each Accept-Language header --%>
<%@ OutputCache Duration="60" VaryByParam="None" VaryByHeader="Accept-Language" %>
7. Include SQL Data Manipulation Language in
Application Code
Embedding
SQL code in your application code is simply asking for trouble. Not only could
you be opening yourself to a SQL Injection attack, you are also creating code
that is harder to maintain than it should be. With hard coded SQL in your
application code, any time you want to make even a slight change to the SQL
code, you have to recompile the entire application.
For
example, the following SQL in your application is the code equivalent of a near
death experience.
string sql = "SELECT * FROM Users WHERE username='" +
Username.Text
+
"' AND password= '" +
Encrypt(Password.Text) +
"'";
SqlCommand command = new SqlCommand (sql,
connection);
For a full explanation on what can happen with this kind of
code, read Stop SQL Injection Attacks Before They Stop
You by
Paul Litwin.
Of course,
a better solution that concatenated string (if you MUST have SQL syntax in your
code) is to use parameterized queries. Here you can see the same query using
parameters (which will aid in preventing SQL injection
attacks).
string sql = "SELECT * FROM Users WHERE username=@Username AND password=
@Password";
SqlCommand command = new SqlCommand (sql,
connection);
command.Parameters.Add("@Username", SqlDbType.VarChar).Value =
UserName.Text;
command.Parameters.Add("@Password", SqlDbType.VarChar).Value =
Encrypt(Password.Text);
SqlCommand command = new SqlCommand (sql,
connection);
An even better solution is to use stored procedures so that
your queries are stored in the database, where they are compiled and optimized,
and can be modified without having to recompile your application
code.
SqlCommand command = new SqlCommand ("Users_GetUser",
connection);
command.CommandType =
CommandType.StoredProcedure;
command.Parameters.Add("@Username",
SqlDbType.VarChar).Value = UserName.Text;
command.Parameters.Add("@Password",
SqlDbType.VarChar).Value = Encrypt(Password.Text);
SqlCommand command =
new SqlCommand (sql, connection);
The motto to live by is
"embeddedSql == death;"
6. Abuse SELECT
*
Strangely
enough it seems that there is an all too common habit to do lazy data fetching.
That is, many of us are still using "SELECT * FROM..." when we are writing our
data access queries. This is bad news. Apparently many developers are still
writing queries like this because at design time there are only a few fields, or
the amount of data is small, or some other excuse. But what happens when the
scope grows, and a new field is added. An Image field to hold a 1024x768 picture
of a user for example. Now every call that uses "SELECT * FROM..." is pulling
back this very large image (potentially multiple megabytes). This has HUGE
performance implications.
This is
just lazy. My theory is, at design time you will know all of the queries you
need to do, and can write explicit stored procedures to return ONLY the data
you NEED for those queries - no exceptions. You should never have to use "SELECT
* FROM..." again.
5. Create
Stored Procedures without Exception Handling
Every day
you write code (I hope). And every day you likely write some exception handling
in your code because you know that something could go wrong. The input may not
be what you expect, data may not be returned from a query, etc. Strangely
enough, not enough of us are writing exception handling in our stored
procedures. Is it that strange to think that nothing will ever go wrong in our
stored procedures? So you're telling me that you just handle the exception in
your application code? Why not handle the exception as close to the source as
possible? That is my #5 recommendation.
Following
is an example of one option for exception handling in a stored procedure. There
are lots of methodologies you could use, and this is only one. In this example
we rely on an XML file in the application that cross references predefined error
codes with human readable, user friendly error messages. The error codes were
defined by our application architect.
CREATE PROCEDURE dbo.Users_Insert
@Username VARCHAR
(20)
AS
SET NOCOUNT ON
DECLARE @Err INT
SET @Err = 0 –
Success
INSERT Users (Username) VALUES (@Username)
SET @Err = @@ERROR –
This resets @@ERROR to 0
IF (@Err <> 0)
BEGIN
IF (@Err =
547) -- statement conflicted with constraint
BEGIN
SET @Err
= 32 -– Our error code indicating 'Username already in use'
GOTO
abort
END
ELSE
BEGIN
SET @Err = 1 –- Our
error code indicating 'Unspecified error'
END
END
abort:
SET NOCOUNT OFF
RETURN
@Err
GO
When the
error code is returned to the application you can interrogate its value. If it
is "0" then there was no error. If the value is "32" you cross reference that
with the ErrorCodes.xml file to return the string value:
"There
was an error creating your user account. The user name you selected was already
in use. Please choose a new user name and try again."
If the
error code is "1", then you cross reference that with the ErrorCodes.xml file to
return the string value:
"An
unspecified error occured. Please try again. If the problem persists, please
contact Customer Care."
At the
very least you should always include some form of exception handling on INSERT,
UPDATE and DELETE statements.
An
important note: As soon as you interrogate the @@ERROR value
it gets reset to "0". It is critical that you copy the value of @@ERROR to a
local variable to ensure you can work with the value without accidentally
resetting it.
4. Prefix
Stored Procedures with "sp_"
I am often
reminded of my early learning with SQL Server when I stumble across this little
gem. You see, when I was first learning SQL Server, I looked at how things were
named and saw a common theme, stored procedures were named with the Hungarian
notation "sp_". Much to my dismay I later learned that "sp_" is the notation for
"System Stored Procedure" (why they didn't use "ssp_" I'll never know). I say I
am often reminded of this because I see it in code all over the place. I once
stumbled into hundreds of these guys in an internal application created and
used by one of the largest software companies in the world (name excluded to
protect the guilty).
Allow me
to quote SQL Server Books
Online to
provide clarity here:
System
Stored Procedures
Many of
your administrative activities in Microsoft® SQL Server™ 2000 are performed
through a special kind of procedure known as a system stored procedure. System stored procedures are
created and stored in the master database and have the sp_ prefix. System stored procedures can be
executed from any database without having to qualify the stored procedure name
fully using the database name master.
It is
strongly recommended that you do not create any stored procedures
using sp_ as a
prefix. SQL Server always looks for a stored procedure beginning
with sp_ in
this order:
1. The stored procedure in
the master
database.
2. The stored procedure based on any qualifiers provided (database name or
owner).
3. The stored procedure using dbo as the owner, if one is not specified.
Therefore,
although the user-created stored procedure prefixed with sp_ may exist in the current database,
the master
database is always checked first, even if the stored procedure is qualified with
the database name.
Important If any user-created stored procedure has the
same name as a system stored procedure, the user-created stored procedure will
never be executed.
3. You Don't
Protect the Database Connection String
The
database connection string is probably the single most important secret
information that your application uses. You MUST protect it at all costs. In the
past people just like me (OK, it was me) told you that it was OK to store your
connection string in the web.config file. Well I am here to tell you I was wrong
(hey, it can happen to any of us). In the early days of ASP.NET (circa 2000) we
thought that this was true - that storing the connection string in the
web.config was OK. The reality is that this is a big security risk - the
web.config file is an XML file - human readable - which means if someone
(external or internal) gets a hold of it, you're in big trouble. All secrets are
revealed.
New
Rule: Storing the connection string in the
web.config file unencrypted is too risky - store it in the web.config file
encrypted.
Of course
this brings up the next question, "Where do I store my encryption key?" The
answer, "You don't - let Windows do it for you using the Data Protection API
(DPAPI)."
Windows
2000, XP and 2003 all include the Win32 DPAPI. This is an unmanaged API that you
can use to do strong encryption of information, while letting Windows manage the
key storage. Simple as anything. There is one important fact however - anything
encrypted using DPAPI can only be decrypted on the same machine it was encrypted
on. That means if I encrypt something on Machine_1, I cannot decrypt it on
Machine_2. This means DPAPI is not a good solution for encrypting values you
will store in the database, but it is good for encrypting connections strings
and storing them in the web.config. You simply need to encrypt the raw
connection string on each machine you deploy to, storing the machine specific
encrypted value in the web.config file on that machine.
The .NET
Framework v1.x doesn't include a managed wrapper for the Win32 DPAPI. You will
need to write your own wrapper. Fortunately the Patterns & Practices group
at Microsoft has helped out by providing a complete tutorial that includes
cut-n-paste code (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/secmod/html/secmod21.asp).
2. Accept All
Input
In the
book, "Writing Secure Code, Second Edition"
(MSPress),
Michael Howard writes "All Input is Evil" (in fact, that is the name of Chapter
10). That's it...as simple as it sounds...all input is evil. Treat any input
from a user - TextBox, QueryString, Cookie - as suspect until you have validated
it to ensure it's legitimacy.
The .NET
Framework ships with a number of tools to validate user input, either at the
client or on the server (or both).
· ASP.NET includes five (5)
input validation controls: RequiredFieldValidator, RegularExpressionValidator,
CompareValidator, RangeValidator, CustomValidator, plus the ValidationSummary
control.
· Windows Forms controls expose the Validating
event to perform validation tasks.
· The System.Text.RegularExpressions.RegEx class
exposes a powerful regular expression engine.
· HttpUtility.HtmlEncode can be used to HTML
encode text before echoing it to the screen (prevents scripting attacks).
· ASP.NET v1.1 (2003) includes a ValidateRequest
attribute (@ Page directive or Web.config) that prevents some malicious script
from being submitted.
1. Access the
Database from the Application with the "sa" Account
Honestly I
was stunned to see this at the top of the list. I thought we had all learned by
now that the "sa" account is an administrative only account. Apparently note, so
let me get on the soap box one last time:
· NEVER use the "sa" account to
access a database programmatically.
· Use one or more limited-privilege accounts for
programmatic data access (see #1).
o SELECT-only account for data retrieval.
o EXECUTE-only
account for stored procedures.
· Not using "sa" reduces an
attacker's ability to execute harmful commands or system procedures.
· If you really, really need to use the "sa" account, create a new account
named "essay" and see if that help get past the need to use
"sa".
Hopefully
this Top 10 list has been helpful. Again, this isn't the end all be all of do's
and don'ts with SQL Server, but it is a list that was compiled by a number of
developers working in client locations, and seeing this stuff happen on a daily
basis. Learn from it, love it, leave it.
Kita bisa gunakan cara dengan menampung semua
informasi yang duplicate ke temporary data dengan terlebih dahulu menjumlahkan
value yang ada, lalu kemudian data yang duplikat di table kita delete. Setelah
itu data dari table temporary tadi dimasukkan ke dalam table, sehingga sekarang
di table tidak terdapat informasi yang duplikat lagi.
Perlu diingat ketika kita menghapus sebuah row maka
kita akan kehilangan informasi yang mungkin kita butuhkan. Jika column yang ada
hanya berupa 3 column dan informasinya termasuk key field saja maka tidak akan
ada masalah. Yang menjadi masalah adalah ketika ada field lain yang berpengaruh
seperti pada contoh yang saya buat adalah field value ketika kita menghapus data
yang duplikat tetapi nilai field valuenya berbeda kita akan kehilangan informasi
dari field value tersebut.
Karenanya hendaknya informasi penting selain key
tidak menjadi hilang dikarenakan key duplikat. Pada contoh dibawah saya
menjumlahkan terlebih dahulu field value untuk setiap data yang duplikat dan
disimpan ke temptable, untuk seterusnya dimasukkan kembali ke Mytable dengan
nilai field value yang sudah di SUM.
Berikut Contoh Scrip untuk menghilangkan duplicate
data (silahkan di copy paste di Query analyser)
-------------------------- Begin Script
-----------------------------
--Create Table MyTable
CREATE TABLE [dbo].[MyTable](
[FieldA] [nchar](10) COLLATE
SQL_Latin1_General_CP1_CI_AS NOT NULL,
[FieldB] [nchar](10) COLLATE
SQL_Latin1_General_CP1_CI_AS NOT NULL,
[FieldC] [nchar](10) COLLATE
SQL_Latin1_General_CP1_CI_AS NOT NULL,
[Value] [int] NULL
) ON [PRIMARY]
--Insert Data to Table
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A1','B1','C1',10)
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A2','B2','C2',20)
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A3','B3','C3',30)
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A3','B3','C3',40)
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A4','B4','C4',50)
INSERT INTO [dbo].[MyTable]
([FieldA],[FieldB],[FieldC],[Value])
VALUES ('A4','B4','C4',60)
-- Display All MyTable Data
Select * from [MyTable]
--Copy All Duplicate Data To #TempTable Table and Sum
the Duplicate value
Select FieldA,Fieldb,FieldC,Sum(Value) as Value into
#TempTable
from [MyTable]
where cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5))
in (
select
cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5))
from [MyTable]
group by FieldA, FieldB, FieldC
having count(
cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5)))>1)
Group by FieldA,Fieldb,FieldC
--Display All temptable data
select * from #TempTable
--Delete All Duplicate data in MyTable
Delete from [MyTable]
where cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5))
in (
select distinct
cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5))
from [MyTable]
group by FieldA, FieldB, FieldC
having count(cast(FieldA as nvarchar(5)) +
cast(FieldB as nvarchar(5)) +
cast(FieldC as nvarchar(5)))>1)
--Insert Data from Temptable to MyTable
insert into [MyTable] Select
FieldA,Fieldb,FieldC,Value from #TempTable
--Display All MyTable Data
select * from [MyTable]
-- Drop All Table
drop table [MyTable]
drop table #TempTable
------------------------ End Script
----------------------------
Normal
0
false
false
false
EN-US
X-NONE
X-NONE
MicrosoftInternetExplorer4
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-qformat:yes;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin-top:0in;
mso-para-margin-right:0in;
mso-para-margin-bottom:10.0pt;
mso-para-margin-left:0in;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;}
Apakah Partitioned View itu?
Jika kita bekerja dengan jumlah data
besar apalagi jika dalam jumlah data mendekati tera byte, maka kita akan
menghadapi performance issue khususnya dalam mengakses data dalam jumlah row
yang besar.
Ukuran table akan yang akan diakses
akan sangat berpengaruh kepada waktu akses yang dibutuhkan, hal ini tidak dapat
diatasi dengan hanya menentukan index column yang tepat pada suatu table,
tentunya ukuran data akan sangat menentukan lama akses suatu table.
Sementara itu untuk keperluan data
ware house availability dari data sebisa mungkin di keep selamanya didalam
system seperti pada transaksi perbankan atau pun finance yang menuntut agar
semua data dapat disimpan selama mungkin.
Untuk mengatasi masalah ini ada satu
best practice yang bisa kita ikuti yaitu dengan memecah table menjadi partisi
partisi data sehingga jumlah data akan selalu terjaga dan yang paling penting
adalah waktu aksesnya tetap cepat.
|--------- > Sales_Jan
|---------
> Sales_Feb
Table Sales_Data,dipecah |--------- > Sales_Mar
|---------
> Sales_Apr
|---------
> dst…
Pemisahan data seperti ini tentunya
tidak menjadi masalah ketika data yang diambil adalah data transaksi yang
sifatnya mati dan tidak bergerak lagi. Untuk data yang sudah di close kita
tinggal gunakan statement SELECT INTO atau INSERT INTO berdasarkan kelompok
criteria yang diinginkan untuk contoh diatas berdasarkan bulan tertentu.
Proses pemecahan table diatas akan
menjadi masalah jika data yang akan diakses merupakan data hidup dan kita perlu
melakukan operasi CRUD (Create, Retreave, Update dan Delete) kedalamnya.
Tentunya kita memerlukan SP atau TSQL yang smart untuk mengatasi operasi CRUD
dengan memecah table tersebut.
Ternyata untuk memecahkan masalah
ini tidak serumit yang kita bayangkan, SQL server 2000 menyediakan kemampuan
untuk dapat membuat apa yang disebut sebagai Partitioned View.
Membuat Partitioned View
Untuk membuat partitioned view
memang menurut saya agak sedikit tricky, karena dengan cara membuat constrain
di setiap fact table dan membuat view dari table – table tersebut maka kita
sudah membuat partitioned view dari data yang kita inginkan. Hasilnya
performance dari akses data tetap bisa dijaga.
Berikut langkah langkah untuk
membuat partitioned view:
- Buat partition table untuk fact table yang akan dibuat,
contohnya sebagai berikut :
-- Creating the Tables
CREATE TABLE [dbo].[DATA_0501] (
[CustomerId] [varchar] (50)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderId] [varchar] (20)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderDate] [datetime] NOT
NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[DATA_0502] (
[CustomerId] [varchar] (50)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderId] [varchar] (20)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderDate] [datetime] NOT
NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[DATA_0503] (
[CustomerId] [varchar] (50)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderId] [varchar] (20)
COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[OrderDate] [datetime] NOT
NULL
) ON [PRIMARY]
GO
2. Tentukan Constrain untuk setiap
table sesuai dengan criteria data yang akan disimpan
-- Check Constraints
ALTER TABLE [dbo].[DATA_0501] WITH
CHECK ADD CONSTRAINT CHKDATE_DATA_0501
CHECK ([OrderDate] >=
'1-Jan-2005' and [OrderDate] < '1-Feb-2005')
GO
ALTER TABLE [dbo].[DATA_0502] WITH
CHECK ADD CONSTRAINT CHKDATE_DATA_0502
CHECK ([OrderDate] >=
'1-Feb-2005' and [OrderDate] < '1-Mar-2005')
GO
ALTER TABLE [dbo].[DATA_0503] WITH
CHECK ADD CONSTRAINT CHKDATE_DATA_0503
CHECK ([OrderDate] >=
'1-Mar-2005' and [OrderDate] < '1-Apr-2005')
GO
Constrain CHECK akan mengaktifkan
validasi data yang akan masuk kedalam table dimana kondisi data yang
bisa dimasukkan (untuk Insert atau
pun Update) harus memenuhi syarat tertentu misalnya
([OrderDate] >= '1-Jan-2005' and
[OrderDate] < '1-Feb-2005').
Demikian juga untuk table yang lain
akan divalidasi sesuai dengan constrain yang di tentukan pada expresi
CHECK di setiap table.
3. Kemudian buat Partition View untuk
ketiga table tersebut dengan membuat view yang berisi union dari fact
table yang sudah kita buat tadi
-- Create the PV
CREATE VIEW vw_DATA AS
SELECT * FROM DATA_0501
UNION ALL
SELECT * FROM DATA_0502
UNION ALL
SELECT * FROM DATA_0503
GO
4. Untuk mengakses Partitioned View ini
dapat dilakukan dengan memanggil view tersebut seperti intruksi select
pada table biasa
-- Select Data
SELECT * FROM vw_DATA Where
OrderDate = '12-Feb-2005'
-- Insert some data into base table
INSERT INTO DATA_0501 VALUES
('Customer1', 'Order1', '12-Jan-2005')
INSERT INTO DATA_0502 VALUES
('Customer2', 'Order2', '12-Feb-2005')
INSERT INTO DATA_0503 VALUES
('Customer3', 'Order3', '12-Mar-2005')
Demikian juga dengan intruksi update
dengan delete, semua akan diatur oleh Constrain yang ada di table
masing masing sehingga kita tidak
perlu effort terlalu besar untuk mengatur mekanisme Insert, Update dan
delete untuk setiap
masing masing fact table.
Partitioned view akan mempercepat
akses data karena setiap data yang diakses akan otomatis di tujukan pada fact
table yang berhubungan dengan data
yang dicari dan tidak melakukan select kepada semua fact table yang ada di
definisi view.
Hal – Hal yang perlu diperhatikan
Ada beberapa hal yang perlu
diperhatikan dalam menggunakan partitioned view ini yang berhubungan dengan
feature yang ada di SQL 2005. hal yang perlu diperhatikan itu adalah :
Aturan Table
·
Untuk table yang di select pada view yang dibuat hendaknya hanya di select
sekali saja untuk menjaga tidak ada data yang ditampilkan dua kali
·
setiap Fact table tidak boleh memiliki index yang nilainya computed dan Auto
increment.
·
Setiap table memiliki primary key yang sama.
·
setiap table hendaknya memiliki ANSI Padding yang sama.
Aturan Column
·
Semua Column dari setiap table harus termasuk dalam selection List di
partitioned view yang dibuat.
·
Setiap colum harus dituliskan sekali saja pada select statement untuk
setiap table
·
posisi column yang deselect harus sama urutannya pada setiap select statement
dari setiap fact table
·
setiap colum yang di select dari setiap table harus memiliki tipe data yang
sama.
Aturan Penentuan Column partisi
- Column yang akan menjadi batasan partisi hendaknya
memiliki constrain yang merupakan range data yang diharapkan dengan
menggunakan operatorBETWEEN, AND, OR, <, <=, >, >=, =.
- Posisi Column yang menjadi partisi hendaknya ada pada
posisi yang sama pada statement select union di View yang dibuat.
- Column yang akan dijadikan batas nilai partisi
hendaknya merupakan primary key.
Data modification
Untuk melakukan modifikasi hendaknya
dilakukan dari partitioned view yang sudah dibuat, perubahan terhadap data yang
langsung pada fact table akan berakibat data tidak akan tersimpan jika
constrain data tidak terpenuhi pada fact table tersebut.
Kesimpulan
Jika kita menggunakan data yang
sangat banyak mendekati terabyte sedangkan kita membutuhkan akses data yang
cepat penggunakan partitioned view ini akan sangat membantu dalam meningkatkan
performance dari akses data.
Masalah Data partition ini
pada SQL 2005 menjadi semakin mudah dan tidak tricky seperti di SQL 2000,
karena pada SQL 2005 menyediakan fitur table partitioning yang mempercepat
akses data dari setiap table yang kita kehendaki. Untuk lebih jelas
tentang table partitioning di SQL 2005 dapat dilihat di http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5partition.asp
Refferensi:
Berikut artikel
menarik tentang partitioned view termasuk perbandingan kecepatan aksesnya dan
link lain yang bermanfaat:
·
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/createdb/cm_8_des_06_17zr.asp
·
http://www.sqlteam.com/item.asp?ItemID=684
·
http://www.sqlservercentral.com/columnists/njacobson/speedofdpv1.asp
·
http://www.sqlteam.com/item.asp?ItemID=751
Pertanyaan dari judul ini sangat menantang khususnya bagi orang orang
yang selama ini menganggap remeh store procedure dan mengganggap logic itu cukup
kita simpan di coding yang kita buat.
Ada beberapa hal yang perlu diperhatikan khususnya mengenai begitu
pentingnya SP ini sehingga tidak dapat diabaikan dalam hal proses pembuatan
sebuah aplikasi yang berhubungan dengan database.
Dari kegunaanya kita bisa melihat ada beberapa hal yang menjadi nilai
utama dari penggunaan SP dalam membuat aplikasi khususnya yang berinteraksi
dengan data base. adapun kegunaan dari Store procedure adalah:
-
Perubahan
logic yang dilakukan dapat dilakukan dengan dampak sekecil kecilnya karena semua
logic disimpan secara sentralized di server
-
Logic
disimpan di server sehingga operasi yang memerlukan roundtrips antara server dan
client dapat diminimasi.
-
Semua
plans yang di compile disimpan didalam cache, secara default setiap store
procedure plans akan dicompile dan di simpan kedalam cache pada saat pertama
kali Store procedure tersebut exekusi. Oleh karenanya plan sendiri tidak
langsung terbentuk ketika sebuah store procedure di buat atau disimpan di dalam
cache secara permanen. Ada beberapa hal yang menyebabkan sebuah plan
hilang dari cachenya. Penyebab itu diantaranya :
o
Server
restart
o
Tidak
cukupnya memory untuk menyimpan cache.
o
Dihilangkan
dari cache dengan cara :
§
Melakukan
pemanggilan DBCC FREEPROCCACHE untuk
melakukan kill terhadap semua plans dari cache.
§
Data
yang digunakan oleh store procedure berubah sehingga yang beribah sehingga
statistic dari data tersebut tidak dapat divalidasi yang tentunya berakibat pada
plan yang digunakan oleh store procedure tersebut menjadi
invalid.
§
Melakukan
pemanggilan sp_recompile pada object yang digunakan oleh store procedure
tersebut, sehingga ketika kita melakukan sp_recompile TobjectName maka semua store procedure plan yang terlibat
dengan object Tobjectname akan mengalami
invalidate.
Sedangkan advantages lain dalam hal penggunaan SP dibandingkan kita
menggunakan TSQL statement dan query biasa adalah dalam hal pengaturan security
dari store procedure yang diantaranya adalah:
-
Penggunaan
Store procedure akan lebih aman dan lebih mudah untuk di manage khususnya dalam
pengaturan permission dari user terhadap store procedure
tersebut.
-
Jika
seorang owner dari table membuat sebuah store procedure yang aksesnya dilakukan
berdasarkan table tersebut, maka si owner tadi dapat mengatur pemberiaan akses
terhadap store procedure tersebut tanpa perlu memberikan hak akses terhadap
table yang dia miliki. Bayangkan kekacauan yang akan terjadi jika seseorang
melakukan intruksi delete menggunakan tsql seperti ini DELETE Sales WHERE invnum = 1234,
dibandingan dengan intruksi DELETE Sales tentunya sang dba dan
ownerlah yang akan mengalami masalah. maka dengan Store procedure kita bisa
mengurangi resiko yang akan terjadi oleh user yang tidak berhak mengakses table
yang bersangkutan.
-
Keamanan
lain dari yang bisa dilakukan oleh stro procedure adalah keamanan dalam hal
gangguan hacker berupa SQL injection yang artinya dengan store procedure user
akan dibatasi proses interaksi datanya dengan system hanya berupa pengiriman
data lewat parameter.
selain itu dalam hal performance penggunaan SP akan semakin meningkatkan
performa dari aplikasi yang kita miliki dibandingkan dengan menggunakan TSQL
statement biasa dari aplikasi. karena SP memiliki kemampuan untuk melakukan
caching terhadap SP yang dibuat sehingga execution timenya lebih cepat. untuk
masalah penggunaan caching dan process recompile dari SP akan saya bahas pada
artikel selanjutnya dari artikel ini.
More Posts