Rendering a single high-quality image may take several hours, or even days. The complexity of both the model and the lighting simulation may require excessive computer resources. To reduce the total rendering time and to accommodate large and complex models that exceed the capacity of a single-processor system, a parallel renderer may provide a viable alternative to sequential computing. In this paper, a data-parallel strategy is applied to allow large models to be distributed over the processors' memories. The resulting uneven workload is then balanced by scheduling demand-driven tasks on the same set of processors. Tasks are scheduled in either demand-driven or data-parallel fashion, according to ray coherence. The implications of this hybrid algorithm for performance, caching, and memory usage are investigated.