From SharePoint 2010 Microsoft promises us, that they now are able to handle large list – namely because they now have implemented “Query trottling” that prohibits (limits?) queries, that return too many items. So you think to yourself: Wow – that’s nice! Now I can implement applications that supports large amounts of data; You start implementing your solution; You test it on your development box (where you probably like me are running as Farm Admin, so you can install your new application) – you even do a test with at least 5001 items in a list (over your throttle limit), and everything seems to be working. Then you hand it over to the customer, for them to install, and it works for a couple of days – but as soon as the customer inserts item number 5001, everything stops to work.
Why is that?
First of all: The Query Throttling isn’t a hard limit on your returns. It’s a “if I think I would return to many items, or at least have to process too many items for giving you your result in a specified sort order etc. I WILL THROW AN EXCEPTION, and thus return nothing to you”. It doesn’t matter if you’ve specified an item limit on your query – YOU WILL GET NOTHING!
It’s even so, that if you were to query for
|var l = from c in rep.Caseswhere c.ItemID=22select c|
You would get an exception (if it wasn’t the case that ItemID was indexed, which of course it is – but more on indexing later).
So you go back to your documentation, and try to see what you’re doing wrong: Microsoft had promised us, that they supported large lists right?
|If you read the book “Inside SharePoint 2010” this concept is explained, though not particularly thorough. It also has more than hints, that the throttle limit is a limit, and therefore should work like a hard “ItemLimit” property – you can just take a look at the example for query throttling to get you convinced. I know that Ted Pattison et. Al. are not in any way Microsoft employees, but the book is published by Microsoft press, so you would imagine that there had been some kind of review process on the book.|
You look at the query property “QueryThrottleMode”, and try to set this properly, and of course it still works on your development environment, since you’re running as Farm Administrator, but it doesn’t do anything for your production environment.
The documentation hints, that there are 3 kinds of users in play, when it comes to throttling:
1) Various kind of ordinary users (which will be 99% of the users using your application)
2) Administrators/Auditors – in our setup that was around 2 users
3) The Farm administrator (but the real Farm admin doesn’t want to give you his password, saying that that would be a *security breach*)
So let’s take a look at what the QueryThrottleMode does for these user types, when querying for more than the 5000 default item limit, in in a list with 5001 items:
So now we have 2 users being able to use our application – which is a progress from before, since then we had no users on the application.
The first thought, of course, is:
Could we run the query with elevated rights?
Yes – of course we could, but that would then just postpone the problem until we get to the override limit. We also lose any kind of security trimming on the list, since the administrator has (and should have) the right to see everything.
Seeing this, it makes it kind of hard to see where the support for large lists lies. It also makes absolutely no sense, why there is a QueryThrottlemode, to do something when handling large lists, when most users are “ordinary” users.
The next question is:
Should we scrap our application, and build it in something, that supports large amount of data?
That’s probably not an option – since you’ve already wasted too much money on the current implementation, though it’s theoretically possible, since you’ve build a nice layered application, like I showed in my first posting.
|Forgive me for liking, when I can reference myself|
There is actually something you can do: The indexing feature on your lists will actually enable you to search in a list, with elements over your throttle limit. But there is a lot of things you need to know about this.
Your query must contain a “where” element using your index attributes – this is where element must bring your items down below your throttle limit.
Sorting should preferably be done on indexed columns.
Using multiple indexes in a query is not an option – this is why I in my fist post didn’t want to do a:
|var l = from c in rep.Cases|
where c.Cpr = “facepalm” ||
c.Name = “facepalm” ||
c.Title = “facepalm”