Favourite Quote

"A Pessimist sees Difficulty in every Opportunity, an Optimist sees Opportunity in every Difficulty"

- Winston Churchill

Friday, April 24, 2009

Scalar function performance

I was tuning a query which was performing very badly and I managed to find the culprit after playing with the query for 1/2 hr. The query goes like this,

We have the following view which has 2 scalar functions in the SELECT clause. We construct our stored procedures based on the views:

SET statistics io ON

SELECT dbo.curriculumunitactivity.curriculumunitactivitystatusid,
dbo.learningactivity.learningactivityid,
dbo.learningactivity.title AS learningactivitytitle,
dbo.learningactivity.overview AS learningactivityoverview,
dbo.learningactivity.SEQUENCE AS learningactivitysequence,
dbo.curriculumunit.curriculumunitid,
dbo.curriculumunit.curriculumid,
dbo.curriculumunit.title AS curriculumunittitle,
dbo.curriculumunit.curriculumunitstatusid,
dbo.learningactivity.learningactivityflags64,
dbo.learningactivityassignment.learningactivityassignmentstatusid,
dbo.learningactivityassignment.duedate,
dbo.curriculumunit.parentcurriculumunitid,
dbo.curriculumunit.rootcurriculumunitid,
dbo.learningactivity.learningactivitytypeid,
dbo.curriculumunit.curriculumunittypeid,
dbo.learningactivity.learningactivitystatusid,
dbo.learningactivityassignment.learningactivityassignmentid,
dbo.learningactivityassignment.duedateappointmentid,
dbo.learningactivityassignment.assignedcontexttypeid,
dbo.learningactivityassignment.assignmentby,
dbo.learningactivityassignment.assignmentdate,
dbo.learningactivityassignment.retractedby,
dbo.learningactivityassignment.retracteddate,
dbo.learningactivityassignment.assignedtoid,
dbo.learningactivityassignment.assignedtotypeid,
dbo.curriculumunitactivity.curriculumunitactivityid,
dbo.learningactivityassignment.assignedcontextid,
dbo.orgmember.memberid AS assignedbymemberid,
dbo.org.name AS assignedbyorgname,
dbo.learningactivityassignment.createddate AS assignmentcreateddate,
dbo.learningactivity.durationinminutes,
dbo.learningactivityassignment.rewardpoint,
dbo.learningactivityassignment.learningactivityassignmentflag64,
dbo.classroom.classroomid,
classroom.classroomtitle,
dbo.Fngetrootcurriculumunittitle(dbo.curriculumunit.curriculumunitid) AS rootcurriculumunittitle,
dbo.Fngetnumberoflearningactivityassignmentparticipants(dbo.learningactivityassignment.learningactivityassignmentid) AS totalnumberofparticipants
FROM dbo.learningactivity
INNER JOIN dbo.curriculumunitactivity
ON dbo.learningactivity.learningactivityid = dbo.curriculumunitactivity.learningactivityid
INNER JOIN dbo.learningactivityassignment
ON dbo.learningactivity.learningactivityid = dbo.learningactivityassignment.learningactivityid
AND dbo.curriculumunitactivity.curriculumunitactivityid = dbo.learningactivityassignment.assignedcontextid
AND dbo.learningactivityassignment.assignedcontexttypeid = 4
INNER JOIN dbo.curriculumunit
ON dbo.curriculumunitactivity.curriculumunitid = dbo.curriculumunit.curriculumunitid
INNER JOIN dbo.orgmember
ON dbo.learningactivityassignment.assignmentby = dbo.orgmember.orgmemberid
INNER JOIN dbo.org
ON dbo.org.orgid = dbo.orgmember.orgid
INNER JOIN dbo.classroom
ON dbo.learningactivityassignment.assignedtoid = dbo.classroom.classroomid
WHERE dbo.learningactivityassignment.assignedtotypeid IN (2,3)


Now we have our stored procedures based on this view and filter the records by passing the stored procedure parameter. It will be something like,
SELECT * FROM View WHERE C1=@param1
Instead of * we used have the relevant fields including the ones related to scalar function.

The problem was the procedure took more than 5 minutes and I could figure out that the problem was with the scalar function present in the SELECT list of the view. I came to know this only by Adam Mechanic's blog entry on the same issue some time back.

Actually what happens is that the view does not have any filter and would return numerous record. Say for example if the view returns 500 records the scalar function would be executed 500 times. This is a major performance bottleneck.

Initially I could not figure out the problem as I was checking the execution plan and SET STATISTICS IO output which did not show any difference when I tested with and without the functions in the SELECT list of the view. However when I ran profiler, I can see that there were numerous reads (almost 100 times more than what was shown in Statistics IO output) when the scalar function was present in the SELECT list of the view.

So to fix this I removed the function from the view and instead use it in the stored procedure directly and the result came in less than a second. The reason was we had filters applied in the procedures as a result the number of records where very minimal. Scalar functions are not bad unless you use it in an inappropriate place as we did. If we can convert the scalar function to inline function it would even be better. The following blog post from Adam Mechanic might be of great help to someone facing the same issue.

0 comments: