Information About

Microprogram




The elements composing a microprogram exist on a lower conceptual level than the more familiar assembler instructions. Each element is differentiated by the "micro" prefix to avoid confusion: microprogram, microcode, microinstruction, microassembler, etc.

Microprograms are carefully designed and optimized for the fastest possible execution, since a slow microprogram would yield a slow machine instruction which would in turn cause all programs using that instruction to be slow. The microprogrammer must have extensive low-level hardware knowledge of the computer circuitry, as the microcode controls this. The microcode is written by the CPU engineer during the design phase.

On most computers using microcode, the microcode doesn't reside in the main system memory, but exists in a special high-speed memory called the control store. This memory might be read-only memory, or it might be read-write memory, in which case the microcode would be loaded into the control store from some other storage medium as part of the initialization of the CPU. If the microcode is in read-write memory, it can be altered to correct bugs in the instruction set, or to implement new machine instructions. Microcode can also allow one computer microarchitecture to emulate another, usually more complex, architecture.

Microprograms consist of a series of microinstructions. These microinstructions control the CPU at a very fundamental level. For example, a single typical microinstruction might specify the following operations:

  • Connect Register 1 to the "A" side of the ALU

  • Connect Register 7 to the "B" side of the ALU

  • Set the ALU to perform two's-complement addition

  • Set the ALU's carry input to zero

  • Store the result value in Register 8

  • Update the "condition codes" with the ALU status flags ("Negative", "Zero", "Overflow", and "Carry")

  • Microjump to MicroPC nnn for the next microinstruction


To simultaneously control all of these features, the microinstruction is often very wide, for example, 56 bits or more.
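As a rough illustration of why the word gets so wide, the operations listed above can be packed into one control word, one field per operation. The sketch below is only an assumption-laden example: the field names, field widths, the ALU_ADD encoding and the jump target 0x123 (a stand-in for the unspecified "nnn") are all hypothetical and not taken from any real machine.

```python
# Hypothetical field layout, most significant field first: (name, width in bits).
FIELDS = [
    ("a_src", 4),       # register connected to the ALU "A" side
    ("b_src", 4),       # register connected to the ALU "B" side
    ("alu_op", 4),      # ALU function select
    ("carry_in", 1),    # ALU carry input
    ("dest", 4),        # register that receives the result
    ("set_cc", 1),      # 1 = update the condition codes
    ("next_addr", 12),  # MicroPC of the next microinstruction
]
WIDTH = sum(width for _, width in FIELDS)  # total bits in one microinstruction

ALU_ADD = 0b0001  # assumed encoding for two's-complement addition


def encode(**values):
    """Pack named field values into a single integer control word."""
    unknown = set(values) - {name for name, _ in FIELDS}
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word):
    """Split a control word back into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


# The example microinstruction from the list above (0x123 is an arbitrary target).
micro = encode(a_src=1, b_src=7, alu_op=ALU_ADD, carry_in=0,
               dest=8, set_cc=1, next_addr=0x123)
print(f"{WIDTH}-bit word: {micro:0{WIDTH}b}")
print(decode(micro))
```

Even this toy layout needs 30 bits; a real design with more registers, buses and functional units adds further fields, which is how control words reach 56 bits or more.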


THE REASON FOR MICROPROGRAMMING

Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially CPU instruction sets were "hard-wired". Each machine instruction (add, shift, move) was implemented directly with circuitry. This provided fast performance, but as instruction sets grew more complex, hard-wired instruction sets became more difficult to design and debug.

Microcode alleviated that problem by allowing CPU design engineers to write a microprogram to implement a machine instruction rather than design circuitry for it. Even late in the design process, microcode could easily be changed, whereas hard-wired instructions could not. This greatly facilitated CPU design and led to more complex instruction sets.

Another advantage of microcode was the implementation of more complex machine instructions. In the 1960s through the late 1970s, much programming was done in assembly language, a symbolic equivalent of machine instructions. The more abstract and higher level the machine instruction, the greater the programmer productivity. The ultimate extension of this was the "Directly Executable High Level Language" design. In such designs, each statement of a high-level language such as PL/I would be entirely and directly executed by microcode, without compilation. The IBM Future Systems Project and Data General Fountainhead Processor were examples of this.

Microprogramming also helped alleviate the memory bandwidth problem. During the 1970s, CPU speeds grew more quickly than memory speeds. Numerous acceleration techniques such as memory block transfer, memory pre-fetch and multi-level caches helped reduce this. However, high-level machine instructions (made possible by microcode) helped further. Fewer, more complex machine instructions require less memory bandwidth. For example, complete operations on character strings could be done as a single machine instruction, thus avoiding multiple instruction fetches.

Architectures using this approach included the IBM System/360 and Digital Equipment Corporation VAX, the instruction sets of which were implemented by complex microprograms. The approach of using increasingly complex microcode-implemented instruction sets was later called CISC.


OTHER BENEFITS

A processor's microprograms operate on a more primitive, totally different and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it possible to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.

Doing so is important if binary program compatibility is a priority. That way, previously existing programs can run on totally new hardware without requiring revision and recompilation. However, there may be a performance penalty for this approach. The tradeoffs between application backward compatibility and CPU performance are hotly debated by CPU design engineers.

The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most of the System/360 implementations actually used hardware that implemented a much simpler underlying microarchitecture. For example, the System/360 Model 30 had 8-bit data paths to the arithmetic logic unit (ALU) and main memory and implemented the general-purpose registers in a special unit of higher-speed core memory, and the System/360 Model 40 had 8-bit data paths to the ALU and 16-bit data paths to main memory and also implemented the general-purpose registers in a special unit of higher-speed core memory. The Model 50 and Model 65 had full 32-bit data paths and implemented the general-purpose registers in faster transistor circuits. In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware and spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduced the amount of unique system software that had to be written for each model.

A similar approach was used by Digital Equipment Corporation in their VAX family of computers. Initially a 32-bit TTL processor in conjunction with supporting microcode implemented the programmer-visible architecture. Later VAX versions used different microarchitectures, yet the programmer-visible architecture didn't change.

Microprogramming also reduced the cost of field changes to correct defects (bugs) in the processor; a bug could often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.


HISTORY


In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ad hoc methods. The control store was a two-dimensional lattice: one dimension accepted "control time pulses" from the CPU's internal clock, and the other connected to control signals on gates and other circuits. A "pulse distributor" would take the pulses generated by the CPU clock and break them up into eight separate time pulses, each of which would activate a different row of the lattice. When the row was activated, it would activate the control signals connected to it.

Described another way, the signals transmitted by the control store are being played much like a player piano roll. That is, they are controlled by a sequence of very wide words constructed of bits, and they are "played" sequentially. In a control store, however, the "song" is short and repeated continuously.

In 1951 Maurice Wilkes enhanced this concept by adding conditional execution, a concept akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first generated signals in the manner of the Whirlwind control store, while the second selected which row of signals (the microprogram instruction word, as it were) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term microprogramming to describe this feature and distinguish it from a simple control store.
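The two-matrix idea can be modelled in a few lines. The sketch below is a loose illustration rather than a reconstruction of Wilkes's design: the row numbers, signal names and the single condition flag are all made up for the example.

```python
# First matrix: the control signals raised by each row (names are hypothetical).
SIGNALS = {
    0: ["load_acc"],
    1: ["test_sign"],
    2: ["negate_acc"],
    3: ["store_acc"],
}

# Second matrix: the row to activate on the next cycle. A tuple offers two
# alternatives, chosen by an internal condition: (if clear, if set).
NEXT_ROW = {
    0: 1,
    1: (3, 2),   # conditional row: skip the negate step unless the flag is set
    2: 3,
    3: None,     # end of this short microprogram
}


def run(flag):
    """Step through the rows and return the control signals in order."""
    row, trace = 0, []
    while row is not None:
        trace.extend(SIGNALS[row])
        nxt = NEXT_ROW[row]
        row = nxt[int(flag)] if isinstance(nxt, tuple) else nxt
    return trace


print(run(flag=False))
print(run(flag=True))
```

Without the tuple entry in the second matrix, the model degenerates to a fixed sequence of rows, which corresponds to the simple control store described earlier.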


EXAMPLES OF MICROPROGRAMMED SYSTEMS

  • Most models of the IBM System/360 series were microprogrammed:

  • The Model 25 was unique among System/360 models in using the top 16k bytes of core storage to hold the control storage for the microprogram. The 2025 used a 16-bit microarchitecture with seven control words (or microinstructions).

  • The Model 30, the slowest model in the line, used an 8-bit microarchitecture with only a few hardware registers; everything that the programmer saw was emulated by the microprogram.

  • The Model 40 used 56-bit control words. The 2040 box implements both the System/360 main processor and the multiplex channel (the I/O processor).

  • The Model 50 had two internal datapaths which operated in parallel: a 32-bit datapath used for arithmetic operations, and an 8-bit data path used in some logical operations. The control store used 90-bit microinstructions.

  • The Model 85 had separate instruction fetch (I-unit) and execution (E-unit) to provide high performance. The I-unit is hardware controlled. The E-unit is microprogrammed with 108-bit control words.

  • The Digital Equipment Corporation PDP-11 processors, with the exception of the PDP-11/20, were microprogrammed.

  • The Burroughs B700 "microprocessor" executed application-level opcodes using sequences of 16-bit microinstructions stored in main memory; each of these was either a register-load operation or mapped to a single 56-bit "nanocode" instruction stored in read-only memory. This allowed comparatively simple hardware to act either as a mainframe peripheral controller or to be packaged as a standalone computer.

  • The Burroughs B1700 was implemented with radically different hardware including bit-addressable main memory but had a similar multi-layer organisation.

  • In common with many other complex mechanical devices, Charles Babbage's Analytical Engine used banks of cams to control each operation, i.e. it had a read-only control store. As such it deserves to be recognised as the first microprogrammed computer to be designed, even if it has not yet been realised in hardware.

  • The VU0 and VU1 vector units in the Sony PlayStation 2 are microprogrammable; in fact, VU1 was only accessible via microcode for the first several generations of the SDK.



IMPLEMENTATION

Each microinstruction in a microprogram provides the bits which control the functional elements that internally comprise a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less-complex programming challenge.

To take advantage of this, computers were divided into several parts:

A microsequencer picks the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the instruction register and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.
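A sequencer of this kind can be sketched as a single next-address function. The three operations, the dispatch table and the opcode values below are assumptions chosen for illustration; real sequencers differ in which choices they offer.

```python
def next_address(micro_pc, seq_op, target, opcode, dispatch):
    """Return the control-store address of the next microinstruction.

    seq_op and target come from the current control word; opcode comes
    from the instruction register. All names here are hypothetical.
    """
    if seq_op == "next":       # plain counter behaviour
        return micro_pc + 1
    if seq_op == "jump":       # address taken from the control word
        return target
    if seq_op == "dispatch":   # address chosen by the machine opcode
        return dispatch[opcode]
    raise ValueError(f"unknown sequencer operation: {seq_op}")


# Made-up mapping from machine opcode to microroutine entry point.
DISPATCH = {0x1A: 0x040, 0x5A: 0x080}

print(next_address(5, "next", 0, 0x1A, DISPATCH))
print(hex(next_address(5, "jump", 0x100, 0x1A, DISPATCH)))
print(hex(next_address(5, "dispatch", 0, 0x5A, DISPATCH)))
```

The "simplest sequencer" mentioned above corresponds to supporting only the jump case, where every control word names its successor.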

A register set is a fast memory containing the data of the central processing unit. It may include the program counter, stack pointer, and other numbers that are not easily accessible to the application programmer. Often the register set is triple-ported, that is, two registers can be read, and a third written at the same time.

An arithmetic and logic unit (ALU) performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions as well.
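One cycle through a triple-ported register set and an ALU with the four operations just named might look like the following sketch. The 16-bit width, the register count and the operation names are assumptions made for the example.

```python
MASK = 0xFFFF  # assumed 16-bit datapath

# The four basic ALU functions named in the text; "not" and "shr" ignore b.
ALU_OPS = {
    "add": lambda a, b: a + b,
    "and": lambda a, b: a & b,
    "not": lambda a, b: ~a,
    "shr": lambda a, b: a >> 1,
}


def cycle(regs, op, a_sel, b_sel, dest):
    """Read two registers, run the ALU, and write a third in one step."""
    a, b = regs[a_sel], regs[b_sel]        # two read ports
    result = ALU_OPS[op](a, b) & MASK      # truncate to the datapath width
    regs[dest] = result                    # one write port
    return result


regs = [0] * 16
regs[1], regs[7] = 0x00F0, 0x0011
cycle(regs, "add", 1, 7, 8)   # R8 <- R1 + R7
cycle(regs, "not", 8, 0, 9)   # R9 <- NOT R8
print(hex(regs[8]), hex(regs[9]))
```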

There may also be a memory address register and a memory data register, used to access the main computer storage.

Together, these elements form an "execution unit". Most modern CPUs have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code.

These elements could often be bought together as a single chip. This chip came in a fixed width which would form a "slice" through the execution unit. These were known as "bit slice" chips. The AMD Am2900 is the best-known example of a bit slice processor.

The parts of the execution units, and the execution units themselves, are interconnected by a bundle of wires called a bus.

Programmers develop microprograms. The basic tools are software: a microassembler allows a programmer to define the table of bits symbolically. A simulator program is intended to execute the bits in the same way as the electronics, and allows much more freedom to debug the microprogram.
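A toy microassembler shows what "defining the table of bits symbolically" amounts to. The source syntax (field=value tokens), the field positions and the symbol table below are invented for this sketch and do not correspond to any particular tool.

```python
# Hypothetical control-word layout: field name -> (bit offset, width).
FIELDS = {"alu": (8, 4), "a": (4, 4), "b": (0, 4)}

# Hypothetical symbolic names for ALU function codes.
SYMBOLS = {"add": 1, "and": 2, "not": 3, "shr": 4}


def assemble_line(line):
    """Turn one line of 'field=value' tokens into a control word."""
    word = 0
    for token in line.split():
        field, _, text = token.partition("=")
        shift, width = FIELDS[field]
        value = SYMBOLS[text] if text in SYMBOLS else int(text, 0)
        if value >> width:  # nonzero for too-large or negative values
            raise ValueError(f"{token!r} does not fit in {width} bits")
        word |= value << shift
    return word


def assemble(source):
    """Assemble every non-blank line into the table of control words."""
    return [assemble_line(line) for line in source.splitlines() if line.strip()]


table = assemble("""
alu=add a=1 b=7
alu=and a=2 b=3
""")
print([f"{word:03x}" for word in table])
```

A simulator would then consume the resulting table word by word, which is why the two tools are usually developed against the same field definitions.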

After the microprogram is finalized and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a programmable logic array. No known computer program can produce optimal logic, but even reasonably good logic can vastly reduce the number of transistors from the number required for a ROM control store. This reduces the cost and power used by a CPU.

Microcode can be characterized as horizontal or vertical. This refers primarily to whether each microinstruction directly controls CPU elements (horizontal microcode), or requires subsequent decoding by combinational logic before doing so (vertical microcode). Consequently each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.
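The width difference can be seen with a single field that selects one of several registers. In the sketch below, which assumes eight register enable lines, the horizontal form stores one bit per line, while the vertical form stores only the register number and relies on a decoder to recreate the enable lines.

```python
N_REGS = 8  # assumed number of register enable lines


def horizontal_field(reg):
    """Horizontal: one stored bit per enable line, usable without decoding."""
    return [1 if i == reg else 0 for i in range(N_REGS)]


def vertical_field(reg):
    """Vertical: store only the register number, in binary."""
    width = (N_REGS - 1).bit_length()
    return [(reg >> bit) & 1 for bit in reversed(range(width))]


def decode_vertical(bits):
    """The combinational decoder that a vertical field needs."""
    reg = int("".join(map(str, bits)), 2)
    return horizontal_field(reg)


print(horizontal_field(5))                  # 8 bits in the control store
print(vertical_field(5))                    # 3 bits in the control store
print(decode_vertical(vertical_field(5)))   # same enable lines after decoding
```

The saving comes at a price: an encoded field can name only one register at a time, whereas the horizontal form could in principle enable several lines in the same microinstruction.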


Horizontal microcode

A typical horizontal microprogram control word has a field, a range of bits, to control each piece of electronics in the CPU. For example, one simple arrangement might be: