Saturday, December 24, 2016

Sitecore Deadlock Analysis for Item Bucketing

Here I am sharing my detailed analysis of Sitecore Dead-lock which was occurring on our production site because of using a  Fast query for Item bucketing search.

Recently we faced the site down issue; sometimes our site went down for some time like 2-3 minutes in some intervals and then automatically up, and we also noticed that traffic was huge on the site(concurrent request was around 1000) during the downtime of the site.

After Detail analysis of the logs we found some deadlock victims as below:

Exception: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 65) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Source: .Net SqlClient Data Provider
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Sitecore.Data.DataProviders.Sql.DataProviderCommand.ExecuteNonQuery()

Then we switched our focus towards the database and tried to figure out the exact query that Cause the deadlock, and we got the below SQL query that was victim of the deadlock

(@value1 uniqueidentifier,@value2 nvarchar(38))SELECT DISTINCT [i].[ID] [ID], [i].[ParentID] [ParentID] FROM
[Items] [i] INNER JOIN [Descendants] ON [i].[ID] = [Descendants].[Descendant] INNER JOIN (SELECT DISTINCT [i].[ID] [ID],
[i].[ParentID] [ParentID] FROM [Items] [i] INNER JOIN (SELECT DISTINCT [i].[ID] [ID], [i].[ParentID] [ParentID] FROM
[Items] [i] INNER JOIN (SELECT DISTINCT [i].[ID] [ID], [i].[ParentID] [ParentID] FROM [Items] [i] WHERE LOWER([i].[Name]) =
'sitecore' AND [i].[ParentID] = @value1) [a] ON [i].[ParentID] = [a].[ID] WHERE LOWER([i].[Name]) = 'content') [a] ON [i].[ParentID] =
[a].[ID] WHERE LOWER([i].[Name]) = 'home') [a] ON [Descendants].[Ancestor] = [a].[ID] WHERE ([i].[TemplateID] = @value2)

Now at that particular time we figured out the problem of site-down, but we had not even the solution for this problem because we don’t know the root cause I means the piece of code which is creating the problem, so finally we more digged in to the code and figured out the exact problem,

"Actually we were using the fast query for getting the bucketing item" as below:

string fastQuery = string.Format("fast:/sitecore/content/Home/ /#Events#//*[@@templateid='{0}' and @Event From != '' and (@Event To >= '{1}') and @ POIs= \"%{2}%\"]", SitecoreContext.NewEventContentItemTemplateID.ToString(), DateTime.Now.ToString("yyyy-MM-dd"), this.ContextItem.ID.ToString());
var eventsItem = Sitecore.Context.Database.SelectItems(fastQuery).OrderBy(x => x["Event From"]).ToList();

We were searching events from around 5000 bucketing items, and due to high concurrent load(1000 request) it occurred the dead lock situation, then we changed the query(Index Content Search) to get the item bucketing items as per the Sitecore recommendation as we already knew but didn’t implement :)

Now we are getting the bucket items from Sitecore context search query from Index as below

  var bucket = Sitecore.Context.Database.GetItem(Constants.ItemID.EventsBucketID);

                using (var ctx = ContentSearchManager.GetIndex("IndexName".ToScSetting()).CreateSearchContext())
                    var today = DateTime.Today;
                    var pred = BaseEventPredicate(bucket, true)
                    var evts = ctx.GetQueryable<SearchEvent>(new CultureExecutionContext(Sitecore.Context.Language.CultureInfo))
                        .OrderBy(i => i.DateWeight)
                        .ThenBy(i => i.EventFrom)
                        .Select(i => i.GetItem())
                    return evts;

Now there are no error logs related With Dead-Lock, no site down

Now I was curious to reproduce the same error on the dev box, as this issue only occurred in production, so first thing I had to give some load on the dev box to reproduce the same error:

So I used visual studio load test and reproduce the same error on my local machine as below:

On the same time I watched the SQL profiler for the same load and found out the same dead lock issue as below:

So the conclusion of this post is, "never use Sitecore Fast query for Item Bucketing search".

Always use Sitecore Context Search Query using Index.

I hope this article will help you.

Happy Sitecoring:)