ScheduledThreadPoolExecutor[1]. Maybe this suits the requirements. And
> Hi,
>
> Marcel Reutegger wrote:
>> Hi,
>>
>> 2009/7/12 Jukka Zitting <[hidden email]>:
>>> Hi,
>>>
>>> 2009/7/8 Marcel Reutegger <[hidden email]>:
>>>> - parallel execution of some work. this is primarily to make use of
>>>> multi-core processors. execution should be distributed over and
>>>> executed by N threads, where N is a factor of the number of available
>>>> processors.
>>> If I recall correctly we debated this already earlier. My point was
>>> that limiting the number of tasks to the number of available
>>> processors may not be a good approach as the tasks may be IO-bound or
>>> block for other reasons, in which case having more task threads would
>>> give you better throughput. But I recall being proven wrong, did we
>>> have some benchmark for that? Do you remember where this discussion
>>> was?
>>
>> I don't remember either... But let's just start a new one.
>>
>> I think this very much depends on the work that needs to be distributed. there
>> is no proof that one way is better than the other. for CPU intensive work we'd
>> probably want to limit the number of concurrent tasks. for I/O intensive work
>> the concurrency should be higher.
>>
>> my above point was rather related to CPU intensive work. e.g. creating a posting
>> list while content is indexed. but of course there might be other work that may
>> be parallelized more aggressively.
>>
>> I guess the actual pool shouldn't care about that. some utility on top
>> of the pool should provide that functionality, i.e. execute a number of
>> tasks with a given level of concurrency. the utility would then dispatch
>> the tasks to the pool accordingly.
>>
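A minimal sketch of such a utility on top of the pool, assuming Java 8 or later. BoundedDispatcher is a hypothetical name, not an existing Jackrabbit class, and the Semaphore is just one way to cap the number of tasks in flight. The pool itself can stay unbounded; each caller picks its own level of concurrency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;
import java.util.concurrent.Semaphore;

// Hypothetical helper, not an existing Jackrabbit class.
final class BoundedDispatcher {
    private final Executor pool;      // shared pool, possibly unbounded
    private final Semaphore permits;  // caps tasks in flight from this dispatcher

    BoundedDispatcher(Executor pool, int concurrency) {
        this.pool = pool;
        this.permits = new Semaphore(concurrency);
    }

    /** Runs all tasks, at most 'concurrency' at a time; results keep task order. */
    <T> List<T> invokeAll(List<? extends Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        List<FutureTask<T>> futures = new ArrayList<FutureTask<T>>();
        for (Callable<T> task : tasks) {
            permits.acquire(); // blocks the submitter while the limit is reached
            FutureTask<T> future = new FutureTask<T>(() -> {
                try {
                    return task.call();
                } finally {
                    permits.release(); // free the slot even if the task fails
                }
            });
            futures.add(future);
            try {
                pool.execute(future);
            } catch (RuntimeException e) {
                permits.release(); // task was rejected and will never run
                throw e;
            }
        }
        List<T> results = new ArrayList<T>();
        for (FutureTask<T> future : futures) {
            results.add(future.get()); // rethrows a task failure as ExecutionException
        }
        return results;
    }
}
```

For CPU intensive work the limit could be derived from Runtime.getRuntime().availableProcessors(); for I/O intensive work the caller would simply pass a higher number.
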
>>>> - Timers used in TransactionContext and MultiIndex. This could be
>>>> turned into a scheduling mechanism that could also be used by the
>>>> ClusterNode sync. Other classes that use periodic checks in a
>>>> background thread: DatabaseJournal (ClusterRevisionJanitor),
>>>> CooperativeFileLock (watch dog).
>>> Yep. Perhaps we could also reuse some of the scheduling functionality in Sling.
>>
>> I'm not sure this is needed. the java rt library already comes with
>> Timer and TimerTask classes. our needs are very simple and I'm not sure
>> that justifies a new dependency.
>
> Yes, AFAICT Java also has ThreadPool implementations. If not, I'd still
> urge us _not_ to reinvent the wheel and to take something existing, even
> if it would add a single dependency.
>
> Regards
> Felix
>
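For the periodic checks discussed above, a rough sketch of what one shared scheduler from java.util.concurrent could look like instead of one Timer per class. PeriodicChecks and its method names are made up for illustration, and the pool size of 1 is an arbitrary assumption.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper; the names are illustrative only.
final class PeriodicChecks {
    // One scheduler thread shared by all periodic checks (size is an assumption).
    private final ScheduledExecutorService scheduler = new ScheduledThreadPoolExecutor(1);

    /** Schedules 'check' to run repeatedly, waiting 'period' between runs. */
    ScheduledFuture<?> every(long period, TimeUnit unit, Runnable check) {
        return scheduler.scheduleWithFixedDelay(() -> {
            try {
                check.run();
            } catch (RuntimeException e) {
                // An exception escaping here would suppress all later runs,
                // so report it and keep the schedule alive.
                System.err.println("periodic check failed: " + e);
            }
        }, period, period, unit);
    }

    /** Stops the scheduler and cancels pending runs. */
    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

A cluster sync or a janitor would each call every(...) once and keep the returned ScheduledFuture to cancel its own check later.
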
>>
>>>> the more I think about it, the more I like your idea. but we should be
>>>> careful with a maximum size for a repository wide pool. extensive use
>>>> of the pool by a module should not lock up another module just because
>>>> there are no more idle threads. maybe that global pool shouldn't have
>>>> a maximum size...
>>> That might make sense. Perhaps we should have some concept of
>>> sub-pools (that borrow from the main pool) with fixed limits for tasks
>>> that need them (see above).
>>
>> hmm, that doesn't sound flexible and generic. I just thought again how cool
>> it would be if we could deploy jackrabbit into Google App Engine. that however
>> requires that all background threads are removed. if we have that generic
>> pool and client code adjusted accordingly it could be as easy as turning
>> the pool into a direct executor variant ;) well, that's very optimistic but
>> sounds promising to me...
>>
>> regards
>> marcel
>>
>
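To illustrate the "direct executor variant" mentioned above: a small sketch, assuming client code is written against the java.util.concurrent.Executor interface only. DirectExecutorDemo and refreshIndex are hypothetical names, not Jackrabbit code.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class DirectExecutorDemo {
    /** Runs every task synchronously on the caller's thread; starts no threads. */
    static final Executor DIRECT = Runnable::run;

    // Hypothetical client code: it only knows the Executor interface.
    static void refreshIndex(Executor executor, Runnable work) {
        executor.execute(work);
    }

    public static void main(String[] args) {
        Runnable work = () ->
                System.out.println("ran on " + Thread.currentThread().getName());

        ExecutorService pooled = Executors.newFixedThreadPool(2);
        refreshIndex(pooled, work);  // runs on a pool thread
        pooled.shutdown();

        refreshIndex(DIRECT, work);  // runs on the calling thread
    }
}
```

The trade-off is that with the direct variant the caller blocks until the work is done, so periodic background checks would need some other trigger in an environment without threads.
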