StarPU Handbook - StarPU Extensions
Scheduling contexts represent abstract sets of workers that allow programmers to control the distribution of computational resources (i.e. CPUs and GPUs) among concurrent kernels. The main goal is to minimize interference between multiple parallel kernels executing at the same time, by partitioning the underlying pool of workers into contexts. Scheduling contexts additionally allow a different scheduling policy to be used for each target resource set.
By default, the application submits tasks to an initial context, which has at its disposal all the computation resources available to StarPU (all the workers). If the application programmer launches several kernels simultaneously, these kernels are by default executed within this initial context, using a single scheduling policy (see TaskSchedulingPolicy). If, however, the programmer knows the demands of these kernels and the specifics of the machine used to execute them, the workers can be divided among several contexts. These scheduling contexts isolate the execution of each kernel and allow each one to use its own scheduling policy.
Scheduling contexts may be created in two ways: either the programmer indicates the set of workers corresponding to each context (provided the identifiers of the workers running within StarPU are known), or the programmer does not provide any worker list and lets the hypervisor assign workers to each context according to their needs (see Scheduling Context Hypervisor).
Both cases require a call to the function starpu_sched_ctx_create(), which takes as input the worker list (the exact list or a NULL pointer), the number of workers (or -1 to designate all workers on the platform), a name for the context, and a list of optional parameters such as the scheduling policy, terminated by a 0. The scheduling policy can be given either as a character string naming a StarPU predefined policy or as a pointer to a custom policy. The function returns the identifier of the created context, which is then used to indicate the context to which tasks are submitted. A basic example is available in the file examples/sched_ctx/sched_ctx.c.
Note: The parallel greedy and parallel heft scheduling policies do not support the existence of several disjoint contexts on the machine. Combined workers are constructed from the entire topology of the machine, not only from the part belonging to a context.
If no scheduling policy is specified when creating the context, the context is used as another type of resource: a parallel worker. A parallel worker is a context without a scheduler (possibly delegated to another runtime). For more information, see Creating Parallel Workers On A Machine. It is therefore mandatory to specify a scheduler in order to use contexts in the traditional way described here.
To create a context with the default scheduler, that is, either the one selected through the environment variable STARPU_SCHED or the StarPU default scheduler, one can explicitly pass the option STARPU_SCHED_CTX_POLICY_NAME with the empty string "" as the policy name.
A full example is available in the file examples/sched_ctx/two_cpu_contexts.c.
Contexts can also be used to group a set of SMs of an NVIDIA GPU in order to isolate parallel kernels and allow them to co-execute on a specified partition of the GPU.
Each context will be mapped to a stream and users can indicate the number of SMs. The context can be added to a larger context already grouping CPU cores. This larger context can use a scheduling policy that assigns tasks to both CPUs and contexts (partitions of the GPU) based on performance models adjusted to the number of SMs.
The GPU implementation of the task must be modified accordingly so that it receives the number of SMs as a parameter.
A full example is available in the file examples/sched_ctx/gpu_partition.c.
A scheduling context can be modified dynamically. The application may change its requirements during execution, and the programmer can add workers to a context or remove those no longer needed. For instance, given two scheduling contexts sched_ctx1 and sched_ctx2, some of the workers of sched_ctx1 can be moved to sched_ctx2 after part of the tasks have been executed.
An example is available in the file examples/sched_ctx/sched_ctx_remove.c.
The application may submit tasks to several contexts, either simultaneously or sequentially. If several submission threads are used, the function starpu_sched_ctx_set_context() may be called just before starpu_task_submit(). StarPU then considers that the current thread submits its tasks to the corresponding context. An example is available in the file examples/sched_ctx/gpu_partition.c.
When the application cannot dedicate a submission thread to each context, the identifier of the context must be given explicitly, using either the function starpu_task_submit_to_ctx() or the argument STARPU_SCHED_CTX of starpu_task_insert(). An example is available in the file examples/sched_ctx/sched_ctx.c.
When a context is no longer needed, it must be deleted. The application can indicate which context should inherit the resources of the deleted one. All the tasks of the context should have been executed before it is deleted. The programmer may therefore either use a barrier and then delete the context directly, or simply indicate that no further tasks will be submitted to the context (so that its workers are moved to the inheritor once its last task has executed) and delete the context at the end of the execution, after a final barrier.
A full example is available in the file examples/sched_ctx/sched_ctx.c.
A context may have no resources at the beginning or at some point during the execution. Tasks can still be submitted to such a context; they are executed as soon as the context is given resources. StarPU keeps a list of the pending tasks and submits them when workers are added to the context.
The full example is available in the file examples/sched_ctx/sched_ctx_empty.c.
However, if resources are never allocated to the context, the application will not terminate. If the pending tasks have low priority, the application can tell StarPU not to submit them by calling the function starpu_sched_ctx_stop_task_submission().