Thursday, July 16, 2009

The Concept of Process







  • Processes are among the most useful abstractions in operating systems (OS) theory and design, since they offer a unified framework to describe all the various activities of a computer as they are managed by the OS. The term process was (allegedly) first used by the designers of Multics in the '60s, to mean something more general than a job in a multiprogramming environment. Similar ideas, however, were at the heart of many independent system design efforts at the time, so it's rather difficult to point at one particular person or team as the originator of the concept.
    As is common for concepts discovered and re-discovered many times in the field before making their way into textbooks, several definitions have been proposed for the term process, including picturesque ones like "the animated spirit of a program". We'd rather draw upon the very general ideas of system theory instead, and regard a process as a representation of the state of an instance of a program in execution.
    In this definition, the word instance (also "image", "activation") refers to the fact that in a multiprogramming environment several copies of the same program (or of a piece of executable code common to different programs) may be concurrently executed by different users or applications. Instead of maintaining several copies of the program's executable code in main memory, it is often possible to store just one copy of it, and maintain a description of the current status (program counter position, values of the variables, etc.) of each executing activation of it. Main memory usage is thus reduced. This technique is called code reentrance, and its implementation requires both careful crafting of the reentrant routines, whose instructions constitute the permanent part of the activation, and provisions in the OS to maintain an activation record for the temporary part of each activation, such as the program counter value, variable values, a pointer back to the calling routine and to its activation record, etc.
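The idea of one shared piece of code with a separate activation record per execution can be sketched with Python generators. This is only an illustrative analogy, not an OS mechanism: each call to the generator function below creates a distinct activation, with its own local variables and resume point, of the same shared code object.

```python
# One shared piece of code, many activations: each generator object keeps
# its own "activation record" (local variables and resume position), so
# the activations can be interleaved without interfering with each other.

def counter(start):
    value = start          # local state: lives in this activation's frame
    while True:
        yield value        # suspend; the frame records where to resume
        value += 1

a = counter(0)    # activation A: its own frame, its own locals
b = counter(100)  # activation B: same code object, separate frame

print(next(a))  # 0
print(next(b))  # 100
print(next(a))  # 1   (A's state was preserved independently of B)
print(next(b))  # 101
```

Note that `a` and `b` share one code object (the "permanent part") while holding separate frames (the "temporary part"), mirroring the reentrance scheme described above.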
    Similarly to the way in which activation records allow distinguishing between different activations of the same piece of executable code by maintaining information about their status, a process description allows an OS to manage, without chaos ensuing, the concurrent execution of different programs that all share the same resources in terms of processors, memory, and peripherals. Again, the keyword here is state, i.e., in system-theory parlance, all the information that, along with knowledge of the current and future input values, allows predicting the evolution of a deterministic system like a program.
    What information is this? Obviously the program's executable code is a part of it, as is the data the program needs (variables, I/O buffers, etc.), but this is not enough. The OS also needs to know about the execution context of the program, which includes, at the very least, the content of the processor registers and the work space in main memory, and often additional information like a priority value, whether the process is running or waiting for the completion of an I/O event, and so on.
    Consider the scheme in Fig. 1, which depicts a simple process implementation. There are two processes, A and B, each with its own instructions, data, and context, stored in main memory. The OS maintains, also in memory, a list of pointers to these processes, and perhaps some additional information for each of them. The content of a "current process" location identifies which process is currently being executed, and the processor registers contain data relevant to that particular process. Among them are the base and top addresses of the area of memory reserved to the process: an error condition is trapped if the program being executed tries to write to a memory word whose address lies outside those bounds. This provides process protection and prevents unwanted interference. When the OS decides, according to a predefined policy, that the time has come to suspend the current process, the entire contents of the processor registers are saved in the process's context area, and the registers are restored with the context of another process. Since the program counter register of the latter process is restored too, its execution resumes automatically from the previous suspension point.
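The base/top bounds check mentioned above can be sketched as follows. All names here are illustrative; in a real machine this comparison is performed by hardware on every memory access, with the bounds held in dedicated registers.

```python
# Simulated base/top memory protection: every write by a process is checked
# against the bounds of its reserved area, and an out-of-bounds access
# raises a trap instead of silently corrupting another process's memory.

MEMORY_SIZE = 1024
memory = [0] * MEMORY_SIZE

class MemoryProtectionFault(Exception):
    """Raised when a process touches an address outside its bounds."""

def write(base, top, address, value):
    # The hardware would perform this comparison on every access.
    if not (base <= address < top):
        raise MemoryProtectionFault(f"address {address} outside [{base}, {top})")
    memory[address] = value

# Suppose process A owns words [0, 512) and process B owns [512, 1024).
write(0, 512, 100, 42)       # fine: inside A's area
try:
    write(0, 512, 700, 99)   # A tries to write into B's area
except MemoryProtectionFault as e:
    print("trapped:", e)
```

The trap hands control to the OS, which can then terminate or signal the offending process, so the error never propagates into another process's memory.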


  • Process State

The process state consists of everything necessary to resume the process's execution if it is temporarily set aside. It includes at least the following:

  • Code for the program.
  • The program's static data.
  • The program's dynamic data.
  • The program's procedure call stack.
  • Contents of the general-purpose registers.
  • Contents of the program counter (PC).
  • Contents of the program status word (PSW).
  • Operating system resources in use.


  • Process Control Block

A Process Control Block (PCB, also called Task Control Block or Task Struct) is a data structure in the operating system kernel containing the information needed to manage a particular process. The PCB is "the manifestation of a process in an operating system".[1]

Included information

Implementations differ, but in general a PCB will include, directly or indirectly:

  • The identifier of the process (a process identifier, or PID).
  • Register values for the process, including, notably, the program counter (PC) value.
  • The address space of the process.
  • Priority (a higher-priority process gets preference; e.g., the nice value on Unix operating systems).
  • Process accounting information, such as when the process was last run, how much CPU time it has accumulated, etc.
  • A pointer to the next PCB, i.e., the PCB of the next process to run.
  • I/O information (I/O devices allocated to this process, list of open files, etc.).
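A PCB holding the fields listed above can be sketched as a plain data structure. The field names here are illustrative only; real kernels use much larger structures (e.g., Linux's `task_struct`).

```python
# A minimal PCB sketch mirroring the fields listed above (names illustrative).
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int                       # process identifier (PID)
    program_counter: int = 0       # saved PC value
    registers: dict = field(default_factory=dict)   # saved register values
    state: str = "ready"           # e.g. "running", "ready", "waiting"
    priority: int = 0              # scheduling priority (cf. Unix nice value)
    open_files: list = field(default_factory=list)  # I/O information
    cpu_time_used: float = 0.0     # accounting information
    next_pcb: "PCB" = None         # link to the PCB of the next process

pcb = PCB(pid=42, priority=5)
pcb.open_files.append("/tmp/log")
print(pcb.pid, pcb.state)  # 42 ready
```

Using `default_factory` for the mutable fields ensures each PCB gets its own register dictionary and file list, just as each real process has its own context area.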
During a context switch, the running process is stopped and another process is given a chance to run. The kernel must stop the execution of the running process, copy out the values in hardware registers to its PCB, and update the hardware registers with the values from the PCB of the new process.
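The save/restore step of a context switch can be sketched as follows, modeling the hardware registers as one dictionary and each process's context area as another. This is a simulation under simplified assumptions, not kernel code.

```python
# Simulated context switch: save the running process's registers into its
# PCB context area, then load the new process's saved context.

cpu_registers = {"pc": 0, "r0": 0, "r1": 0}

processes = {
    "A": {"context": {"pc": 100, "r0": 1, "r1": 2}},
    "B": {"context": {"pc": 200, "r0": 7, "r1": 8}},
}

def context_switch(old, new):
    # 1. Copy the hardware registers out to the old process's PCB.
    processes[old]["context"] = dict(cpu_registers)
    # 2. Update the hardware registers from the new process's PCB.
    cpu_registers.update(processes[new]["context"])

# Start running A, let it make some progress, then switch to B.
cpu_registers.update(processes["A"]["context"])
cpu_registers["pc"] = 104          # A executed an instruction
context_switch("A", "B")
print(cpu_registers["pc"])  # 200  (B resumes from its own saved PC)
```

Because the program counter is part of the saved context, B resumes exactly where it was suspended, and A's progress (pc = 104) is preserved in its PCB for the next switch back.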

  • Threads

[Figure: a process with two threads of execution.]
In computer science, a thread of execution results from a fork of a computer program into two or more concurrently running tasks. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources.
On a single processor, multithreading generally occurs by time-division multiplexing (as in multitasking): the processor switches between different threads. This context switching generally happens frequently enough that the user perceives the threads or tasks as running at the same time. On a multiprocessor or multi-core system, the threads or tasks will generally run at the same time, with each processor or core running a particular thread or task. Support for threads in programming languages varies: a number of languages simply do not support having more than one execution context inside the same program executing at the same time. Examples of such languages include Python and OCaml, whose runtimes limit parallelism through a central lock, called the "Global Interpreter Lock" in Python and the "master lock" in OCaml. Other languages may be limited because they use user threads, which are not visible to the kernel and thus cannot be scheduled to run concurrently; kernel threads, which are visible to the kernel, can run concurrently.
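The claim that threads within one process share memory can be demonstrated with Python's standard `threading` module: both workers below update the same variable and the same list. (In CPython the Global Interpreter Lock mentioned above means these threads are interleaved rather than run in parallel; the explicit lock still guards the read-modify-write on the shared counter.)

```python
# Two threads in one process sharing memory: both append to the same list
# and increment the same counter. A lock makes the update atomic.
import threading

shared = []
counter = 0
lock = threading.Lock()

def worker(name, n):
    global counter
    for _ in range(n):
        with lock:
            counter += 1
            shared.append(name)

t1 = threading.Thread(target=worker, args=("A", 1000))
t2 = threading.Thread(target=worker, args=("B", 1000))
t1.start(); t2.start()
t1.join(); t2.join()

print(counter)      # 2000: both threads updated the same variable
print(len(shared))  # 2000
```

Two separate processes running this code would each see their own `counter`; only threads of the same process observe each other's writes directly, which is exactly the sharing described above.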
Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an operating system allows programmers to manipulate threads via the system call interface; threads implemented this way are called kernel threads. A lightweight process (LWP) is a specific type of kernel thread that shares state and information with the other LWPs of its process.
Programs can also implement user-space threads by using timers, signals, or other methods to interrupt their own execution, performing a sort of ad hoc time-slicing.


