Pipeline (software)




In software engineering, a pipeline consists of a chain of processing elements (processes, threads, coroutines, etc.), arranged so that the output of each element is the input of the next. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits.
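As an illustration of the chaining idea, the following minimal sketch builds a pipeline from Python generators, which behave like simple coroutines: each element consumes the output of the previous one, one record at a time. The stage names (read_records, drop_blank, to_upper, run_pipeline) are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator, List


def read_records(lines: Iterable[str]) -> Iterator[str]:
    """Source element: turn raw lines into records without trailing newlines."""
    for line in lines:
        yield line.rstrip("\n")


def drop_blank(records: Iterable[str]) -> Iterator[str]:
    """Filter element: pass on only the records that contain non-blank text."""
    for record in records:
        if record.strip():
            yield record


def to_upper(records: Iterable[str]) -> Iterator[str]:
    """Transform element: convert each record to upper case."""
    for record in records:
        yield record.upper()


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Chain the elements so each one's output is the next one's input."""
    return list(to_upper(drop_blank(read_records(lines))))
```

Because generators are lazy, a record travels through all three elements before the next record is read, so no intermediate list is built between elements.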

The concept is also called the pipes and filters design pattern. It was named by analogy to a physical pipeline.


MULTIPROCESSED PIPELINES

Pipelines are often implemented in a multitasking OS by launching all elements at the same time as processes and automatically servicing the data read requests of each process with the data written by the upstream process. In this way, the scheduler naturally switches the CPU among the processes so as to minimize its idle time. In other common models, elements are implemented as lightweight threads or as coroutines, to reduce the OS overhead often involved with processes. Depending on the OS, threads may be scheduled directly by the OS or by a thread manager. Coroutines are always scheduled by a coroutine manager of some form.
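A rough sketch of the process-based model, assuming a Python interpreter is available to act as both elements: two child processes are started at the same time, and the standard output of the first is connected to the standard input of the second through an OS pipe. The function name run_two_stage_pipeline and the two inline child programs are hypothetical and exist only for illustration.

```python
import subprocess
import sys

# Child program 1 (hypothetical): upper-case every input line.
UPPER_SRC = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)

# Child program 2 (hypothetical): prefix every input line with its number.
NUMBER_SRC = (
    "import sys\n"
    "for i, line in enumerate(sys.stdin, 1):\n"
    "    sys.stdout.write(str(i) + ':' + line)\n"
)


def run_two_stage_pipeline(text: str) -> str:
    """Start both elements as processes and connect them with an OS pipe."""
    first = subprocess.Popen(
        [sys.executable, "-c", UPPER_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    second = subprocess.Popen(
        [sys.executable, "-c", NUMBER_SRC],
        stdin=first.stdout, stdout=subprocess.PIPE, text=True,
    )
    # The parent no longer needs its copy of the connecting pipe.
    first.stdout.close()

    # Fine for small inputs; large inputs would need concurrent feeding
    # and draining so that the parent itself does not block on a full pipe.
    first.stdin.write(text)
    first.stdin.close()
    output = second.stdout.read()

    first.wait()
    second.wait()
    return output
```

Both children run concurrently, and the OS scheduler decides which of them gets the CPU whenever one is waiting for data.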

Usually, read and write requests are blocking operations. The source process, upon writing, is suspended until all of its data has been written to the destination process; likewise, the destination process, upon reading, is suspended until at least some of the requested data has been obtained from the source process. In a linear pipeline this cannot lead to a deadlock, where both processes would wait indefinitely for each other to respond, since at least one of the two processes will soon have its request serviced by the operating system and continue to run.
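The blocking behaviour can be imitated with threads and bounded queues. This is only an analogy under stated assumptions: a queue.Queue hands over whole items rather than partial byte counts, and the names producer, stage, and run_threaded_pipeline are hypothetical. A put blocks while the buffer is full, and a get blocks while it is empty, so neither side spins while waiting.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def producer(out_q: queue.Queue, items) -> None:
    """Source element: write every item downstream, then signal the end."""
    for item in items:
        out_q.put(item)  # blocks while the downstream buffer is full
    out_q.put(_DONE)


def stage(in_q: queue.Queue, out_q: queue.Queue, func) -> None:
    """Middle element: read, transform, and write until the stream ends."""
    while True:
        item = in_q.get()  # blocks while the upstream buffer is empty
        if item is _DONE:
            out_q.put(_DONE)
            return
        out_q.put(func(item))


def run_threaded_pipeline(items, buffer_size: int = 2) -> list:
    """Run producer -> squaring stage -> collector with bounded buffers."""
    q1 = queue.Queue(maxsize=buffer_size)
    q2 = queue.Queue(maxsize=buffer_size)
    threads = [
        threading.Thread(target=producer, args=(q1, items)),
        threading.Thread(target=stage, args=(q1, q2, lambda x: x * x)),
    ]
    for t in threads:
        t.start()

    results = []
    while True:
        item = q2.get()
        if item is _DONE:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results
```

Since the chain is linear and the collector always drains the last queue, some element can always make progress, which mirrors the no-deadlock argument above.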

For performance, most operating systems that implement pipes use pipe buffers, which allow the source process to provide more data than the destination process is currently able or willing to receive. Under most Unix and Unix-like operating systems, a separate command, typically called "buffer", is also available; it implements a pipe buffer of potentially much larger and configurable size. This command can be useful when the destination process is significantly slower than the source process but the source process should still complete its task as soon as possible. For example, the source process might be a command that reads an audio track from a CD, and the destination process a command that compresses the waveform audio data to a format such as Ogg Vorbis. In this case, buffering the entire track in a pipe buffer would allow the CD drive to spin down more quickly and let the user remove the CD from the drive before the encoding process has finished.

Such a buffer command can be implemented using the operating system's primitives for reading and writing data. Wasteful active waiting can be avoided by using facilities such as poll or select, or by using multithreading.
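A minimal sketch of the multithreading approach, not the actual "buffer" command: one thread reads from the source as fast as it can into an unbounded in-memory queue, while the calling thread writes to the (possibly slower) destination. The function name buffered_copy and its chunk_size parameter are assumptions made for this example.

```python
import io
import queue
import threading


def buffered_copy(src, dst, chunk_size: int = 64 * 1024) -> int:
    """Copy src to dst through an unbounded buffer; return bytes copied."""
    chunks: queue.Queue = queue.Queue()  # unbounded: the reader never waits

    def reader() -> None:
        while True:
            chunk = src.read(chunk_size)
            chunks.put(chunk)
            if not chunk:  # an empty read means end of input
                return

    reader_thread = threading.Thread(target=reader)
    reader_thread.start()

    total = 0
    while True:
        chunk = chunks.get()  # sleeps until data arrives; no active waiting
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)

    reader_thread.join()
    return total


if __name__ == "__main__":
    # Small demonstration with in-memory streams standing in for real pipes.
    source = io.BytesIO(b"abc" * 1000)
    sink = io.BytesIO()
    print(buffered_copy(source, sink, chunk_size=256))
```

The reader finishes as soon as the source is exhausted, regardless of how slowly the destination accepts data, at the cost of holding the pending data in memory.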


VM/CMS AND MVS

CMS Pipelines is a port of the pipeline idea to VM/CMS and MVS systems. It supports much more complex pipeline structures than Unix shells, with steps that take multiple input streams and produce multiple output streams. (Such functionality is supported by the Unix kernel, but few programs use it and none of the shells provide a syntax for it.) Because IBM mainframe operating systems work differently, CMS Pipelines implements internally many steps that in Unix are separate external programs, although it can also call external programs for their functionality. Also, because files on IBM mainframes are record-oriented, these pipelines operate in a record-oriented rather than stream-oriented manner.


PSEUDO-PIPELINES

On single-tasking operating systems, the processes of a pipeline have to be executed one by one in sequential order; thus the output of each process must be saved to a temporary file, which is then read by the next process. Since there is no parallelism or CPU switching, this version is called a "pseudo-pipeline".
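The sequential scheme can be sketched as follows, with plain functions standing in for the processes: each stage runs to completion and writes its whole output to a temporary file before the next stage starts reading it. The helper names run_stage_to_tempfile and run_pseudo_pipeline are hypothetical, and each stage is assumed to map one input line to one output string.

```python
import os
import tempfile


def run_stage_to_tempfile(stage, input_path: str) -> str:
    """Run one stage to completion, saving its output in a new temp file."""
    fd, output_path = tempfile.mkstemp(suffix=".tmp", text=True)
    with os.fdopen(fd, "w") as out, open(input_path) as inp:
        for line in inp:
            out.write(stage(line))
    return output_path


def run_pseudo_pipeline(stages, text: str) -> str:
    """Run the stages strictly one after another via temporary files."""
    fd, path = tempfile.mkstemp(suffix=".tmp", text=True)
    with os.fdopen(fd, "w") as first_input:
        first_input.write(text)

    temp_paths = [path]
    try:
        for stage in stages:
            # The next stage starts only after the previous one has finished.
            path = run_stage_to_tempfile(stage, path)
            temp_paths.append(path)
        with open(path) as final_output:
            return final_output.read()
    finally:
        for temp_path in temp_paths:
            os.remove(temp_path)  # clean up every intermediate file
```

Unlike a true pipeline, no stage produces output while another is consuming it, and the intermediate data must fit on disk in full.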