Hi,
The "problem" is that the JCR query implementation is lazy when it comes to returning the result. Experience has shown that in many cases people are not interested in the full result set, but only in the first X items of it. That is why the query implementation does not compute all results immediately when you start to read the result set. Instead it computes the result in chunks, and because of that it does not yet know all the results when you start reading from the result set. This is probably because the internal result set must first be filtered through the ACLs: the raw result set (for example, the one provided by Lucene) is never returned as-is to you as a user of this API. So until the query implementation has read the whole result set and done all the necessary checks, it simply does not know how many results there will be. Doing that takes time, especially if you have millions of results (does such a query make sense at all?).
The easiest way to force the query implementation to do this is to add an "order by" clause to the query, because then it has to run through all the result items to order them appropriately. In that case the size will be reported correctly.
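To make the behaviour a bit more concrete, here is a small self-contained simulation in plain Java. It does not use the JCR API and is not how the query implementation is actually written; the class LazyResultDemo, the LazyResult type, the chunk size, and the "/secret" access rule are all made up for illustration. It only shows the idea: hits are checked chunk by chunk, the size is reported as -1 (the value the JCR RangeIterator.getSize() contract uses for "unknown") while unchecked hits remain, and an ordered result has to check everything up front, so its size is known immediately.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

class LazyResultDemo {

    /** Hypothetical result set: filters raw hits through an access check, one chunk at a time. */
    static final class LazyResult implements Iterator<String> {
        private final List<String> rawHits;       // stands in for the unfiltered index hits
        private final Predicate<String> canRead;  // stands in for the ACL check
        private final int chunkSize;              // assumed chunk size, purely illustrative
        private final List<String> checked = new ArrayList<>(); // hits that passed the check so far
        private int rawPos = 0;   // index of the next raw hit to check
        private int readPos = 0;  // index of the next checked hit to hand out

        LazyResult(List<String> rawHits, Predicate<String> canRead, int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunkSize must be positive");
            }
            this.rawHits = rawHits;
            this.canRead = canRead;
            this.chunkSize = chunkSize;
        }

        /** Runs the access check on the next chunk of raw hits. */
        private void loadChunk() {
            int end = Math.min(rawPos + chunkSize, rawHits.size());
            for (; rawPos < end; rawPos++) {
                String hit = rawHits.get(rawPos);
                if (canRead.test(hit)) {
                    checked.add(hit);
                }
            }
        }

        private boolean exhausted() {
            return rawPos >= rawHits.size();
        }

        @Override
        public boolean hasNext() {
            // Load further chunks only when the reader has used up the checked hits.
            while (readPos >= checked.size() && !exhausted()) {
                loadChunk();
            }
            return readPos < checked.size();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return checked.get(readPos++);
        }

        /** Returns the total size, or -1 while raw hits remain unchecked. */
        long getSize() {
            return exhausted() ? checked.size() : -1;
        }

        /** Simulates "order by": sorting needs every hit, so all chunks are checked up front. */
        static LazyResult ordered(List<String> rawHits, Predicate<String> canRead, int chunkSize) {
            LazyResult result = new LazyResult(rawHits, canRead, chunkSize);
            while (!result.exhausted()) {
                result.loadChunk();
            }
            result.checked.sort(Comparator.naturalOrder());
            return result;
        }
    }

    public static void main(String[] args) {
        List<String> hits = List.of("/c", "/secret/x", "/a", "/b", "/secret/y", "/d");
        Predicate<String> canRead = path -> !path.startsWith("/secret");

        LazyResult lazy = new LazyResult(hits, canRead, 2);
        System.out.println(lazy.getSize());    // -1: nothing has been checked yet
        lazy.next();
        System.out.println(lazy.getSize());    // -1: only the first chunk has been checked

        LazyResult sorted = LazyResult.ordered(hits, canRead, 2);
        System.out.println(sorted.getSize());  // 4: every hit was checked in order to sort
    }
}
```

Note that the ordered variant pays the full cost before the first item is returned, which is exactly the time the lazy variant tries to save when you only want the first few items.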
kind regards,
Jörg