Subroutine
In computer programming, a subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.
Subroutines may be defined within programs, or separately in libraries that can be used by many programs. In different programming languages, a subroutine may be called a routine, subprogram, function, method, or procedure. Technically, these terms all have different definitions. The generic, umbrella term callable unit is sometimes used.
The name subprogram suggests a subroutine behaves in much the same way as a computer program that is used as one step in a larger program or another subprogram. A subroutine is often coded so that it can be started several times and from several places during one execution of the program, including from other subroutines, and then branch back (return) to the next instruction after the call, once the subroutine's task is done. The idea of a subroutine was initially conceived by John Mauchly during his work on ENIAC, and recorded in a January 1947 Harvard symposium on "Preparation of Problems for EDVAC-type Machines". Maurice Wilkes, David Wheeler, and Stanley Gill are generally credited with the formal invention of this concept, which they termed a closed subroutine, contrasted with an open subroutine or macro. However, Alan Turing had discussed subroutines in a 1945 paper on design proposals for the NPL ACE, going so far as to invent the concept of a return address stack.
Subroutines are a powerful programming tool, and the syntax of many programming languages includes support for writing and using them. Judicious use of subroutines (for example, through the structured programming approach) will often substantially reduce the cost of developing and maintaining a large program, while increasing its quality and reliability. Subroutines, often collected into libraries, are an important mechanism for sharing and trading software. The discipline of object-oriented programming is based on objects and methods (which are subroutines attached to these objects or object classes).
In the compiling method called threaded code, the executable program is basically a sequence of subroutine calls.
The content of a subroutine is its body, which is the piece of program code that is executed when the subroutine is called or invoked.
A subroutine may be written so that it expects to obtain one or more data values from the calling program to substitute for its parameters (or formal parameters). The calling program provides actual values for these parameters, called arguments. Different programming languages may use different conventions for passing arguments (a brief example follows the list):
- Call by value: The argument is evaluated and a copy of the value is passed to the subroutine
- Call by reference: A reference to the argument, typically its address, is passed
- Call by result: The parameter value is copied back to the argument on return from the subroutine
- Call by value-result: The argument value is copied in on entry to the subroutine and the parameter value is copied back to the argument on return
- Call by name: Like a macro – the parameters are replaced with the unevaluated argument expressions
- Call by constant value: Like call by value, except that the parameter is treated as a constant
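For illustration, a minimal C++ sketch (the function names are made up for this example) contrasts call by value with call by reference, the two conventions most commonly met in practice:

```cpp
#include <iostream>

// Call by value: the subroutine works on a copy, so the caller's variable is untouched.
void increment_copy(int n) { n += 1; }

// Call by reference: the subroutine receives a reference to the caller's variable.
void increment_in_place(int& n) { n += 1; }

int main() {
    int x = 5;
    increment_copy(x);      // x is still 5
    increment_in_place(x);  // x is now 6
    std::cout << x << '\n'; // prints 6
}
```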
The subroutine may return a computed value to its caller (its return value), or provide various result values or output parameters. Indeed, a common use of subroutines is to implement mathematical functions, in which the purpose of the subroutine is purely to compute one or more results whose values are entirely determined by the arguments passed to the subroutine. (Examples might include computing the logarithm of a number or the determinant of a matrix.) This type is called a function.
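As a small illustrative sketch in C++ (the name is hypothetical), a function of this kind computes its result entirely from its arguments and returns it to the caller:

```cpp
// Pure function in the mathematical sense: the return value is fully
// determined by the arguments (the determinant of a 2x2 matrix, ad - bc).
double determinant2x2(double a, double b, double c, double d) {
    return a * d - b * c;
}
```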
A subroutine call may also have side effects such as modifying data structures in a computer memory, reading from or writing to a peripheral device, creating a file, halting the program or the machine, or even delaying the program's execution for a specified time. A subprogram with side effects may return different results each time it is called, even if it is called with the same arguments. An example is a random number subroutine, available in many languages, that returns a different pseudo-random number each time it is called. The widespread use of subroutines with side effects is a characteristic of imperative programming languages.
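A rough sketch of such a subroutine in C++, assuming a simple linear congruential generator (the constants are conventional textbook values, used here only for illustration):

```cpp
#include <cstdint>

// The static variable is hidden state: each call changes it (a side effect),
// so repeated calls with no arguments return different values.
std::uint32_t next_pseudo_random() {
    static std::uint32_t state = 12345u;
    state = 1103515245u * state + 12345u;  // linear congruential step
    return state;
}
```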
A subroutine can be coded so that it may call itself recursively, at one or more places, to perform its task. This method allows direct implementation of functions defined by mathematical induction and recursive divide and conquer algorithms.
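The factorial function, defined inductively by 0! = 1 and n! = n · (n - 1)!, is a classic illustration; the recursive C++ sketch below transcribes the definition directly:

```cpp
// Direct transcription of the inductive definition of factorial.
unsigned long long factorial(unsigned int n) {
    if (n == 0) return 1;         // base case
    return n * factorial(n - 1);  // recursive case: the subroutine calls itself
}
```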
A subroutine whose purpose is to compute one boolean-valued function (that is, to answer a yes/no question) is sometimes called a predicate. In logic programming languages, often all subroutines are called predicates, since they primarily determine success or failure.
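A minimal example of a predicate (an illustrative name, not taken from any library):

```cpp
// A predicate: answers a yes/no question about its argument.
bool is_even(int n) {
    return n % 2 == 0;
}
```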
A subroutine that returns no value or returns a null value is sometimes called a procedure. Procedures usually modify their arguments and are a core part of procedural programming.
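A typical procedure in this sense returns nothing and acts entirely through its arguments, as in this illustrative C++ swap:

```cpp
// A procedure: returns no value; its whole effect is to modify its arguments.
void swap_ints(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}
```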
High-level programming languages usually include specific constructs to do the following (a short combined example follows the list):
- Delimit the part of the program (body) that makes up the subroutine
- Assign an identifier (name) to the subroutine
- Specify the names and data types of its parameters and return values
- Provide a private naming scope for its temporary variables
- Identify variables outside the subroutine that are accessible within it
- Call the subroutine
  - The main program contains the address of the subprogram
  - The subprogram contains the address of the instruction following the call in the main program (the return address)
- Provide values to its parameters
- Specify the return values from within its body
- Return to the calling program
- Dispose of the values returned by a call
- Handle any exceptional conditions encountered during the call
- Package subroutines into a module, library, object, or class
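A short C++ sketch (all names are illustrative) shows several of these constructs together: a delimited body, a name, typed parameters and a return type, a local variable in a private scope, a call that supplies argument values, and a return to the caller:

```cpp
#include <iostream>

// Definition: identifier, parameter names and types, return type, and a body.
int add(int a, int b) {
    int sum = a + b;   // temporary variable in the subroutine's private scope
    return sum;        // specify the return value and return to the caller
}

int main() {
    int result = add(2, 3);      // call the subroutine, providing its parameter values
    std::cout << result << '\n'; // use (dispose of) the returned value
    return 0;
}
```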
In strictly functional programming languages such as Haskell, subprograms can have no side effects, which means that they cannot change the state of the program. Functions will always return the same result if repeatedly called with the same arguments. Such languages typically only support functions, since subroutines that do not return a value have no use unless they can cause a side effect.
In programming languages such as C, C++, and C#, subroutines may also simply be called functions, not to be confused with mathematical functions or functional programming, which are different concepts.
A language's compiler will usually translate procedure calls and returns into machine instructions according to a well-defined calling convention, so that subroutines can be compiled separately from the programs that call them. The instruction sequences executed at the entry to and exit from a subroutine, which set up and later dismantle its execution environment, are called the procedure's prologue and epilogue.
The advantages of breaking a program into subroutines include:
- Decomposing a complex programming task into simpler steps: this is one of the two main tools of structured programming, along with data structures
- Reducing duplicate code within a program
- Enabling reuse of code across multiple programs
- Dividing a large programming task among various programmers or various stages of a project
- Hiding implementation details from users of the subroutine
- Improving readability of code by replacing a block of code with a function call whose descriptive name summarizes what the block does. This makes the calling code concise and readable even if the function is not meant to be reused.
- Improving traceability (i.e. most languages offer ways to obtain the call trace, which includes the names of the involved subroutines and perhaps even more information such as file names and line numbers); without decomposition into subroutines, debugging would be severely impaired
Compared to using in-line code, invoking a subroutine imposes some computational overhead in the call mechanism.
A subroutine typically requires standard housekeeping code – both at the entry to, and exit from, the function (function prologue and epilogue – usually saving general purpose registers and return address as a minimum).
Most modern implementations of a subroutine call use a call stack, a special case of the stack data structure, to implement subroutine calls and returns. Each procedure call creates a new entry, called a stack frame, at the top of the stack; when the procedure returns, its stack frame is deleted from the stack, and its space may be used for other procedure calls. Each stack frame contains the private data of the corresponding call, which typically includes the procedure's parameters and internal variables, and the return address.
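The C++ sketch below (hypothetical names) annotates when frames would typically be pushed and popped during a simple call chain:

```cpp
// Each call pushes a frame holding that call's parameters, locals, and
// return address; each return pops that frame again.
int inner(int x) {           // frame for inner: parameter x
    return x * 2;
}                            // inner's frame is popped on return

int outer(int y) {           // frame for outer: parameter y, local d
    int d = inner(y + 1);    // pushes a frame for inner on top of outer's
    return d + y;
}                            // outer's frame is popped on return

int main() {
    return outer(3);         // while inner runs, the stack holds: main, outer, inner
}
```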
The call sequence can be implemented by a sequence of ordinary instructions (an approach still used in reduced instruction set computing (RISC) and very long instruction word (VLIW) architectures), but many traditional machines designed since the late 1960s have included special instructions for that purpose.
The call stack is usually implemented as a contiguous area of memory. It is an arbitrary design choice whether the bottom of the stack is the lowest or highest address within this area, so that the stack may grow forwards or backwards in memory; however, many architectures chose to have it grow backwards, toward lower addresses.
Some designs, notably some Forth implementations, used two separate stacks, one mainly for control information (like return addresses and loop counters) and the other for data. The former was, or worked like, a call stack and was only indirectly accessible to the programmer through other language constructs while the latter was more directly accessible.
When stack-based procedure calls were first introduced, an important motivation was to save precious memory. With this scheme, the compiler does not have to reserve separate space in memory for the private data (parameters, return address, and local variables) of each procedure. At any moment, the stack contains only the private data of the calls that are currently active (namely, which have been called but haven't returned yet). Indeed, the call stack mechanism can be viewed as the earliest and simplest method for automatic memory management. However, another advantage of the call stack method is that it allows recursive subroutine calls, since each nested call to the same procedure gets a separate instance of its private data.
One disadvantage of the call stack mechanism is the increased cost of a procedure call and its matching return. The extra cost includes incrementing and decrementing the stack pointer (and, in some architectures, checking for stack overflow), and accessing the local variables and parameters by frame-relative addresses, instead of absolute addresses. The cost may be realized in increased execution time, or increased processor complexity, or both.
This overhead is most obvious and objectionable in leaf procedures or leaf functions, which return without making any procedure calls themselves. To reduce that overhead, many modern compilers try to delay the use of a call stack until it is really needed. For example, the call of a procedure P may store the return address and parameters of the called procedure in certain processor registers, and transfer control to the procedure's body by a simple jump. If the procedure P returns without making any other call, the call stack is not used at all. If P needs to call another procedure Q, it will then use the call stack to save the contents of any registers (such as the return address) that will be needed after Q returns.
A subprogram may find it useful to make use of a certain amount of scratch space; that is, memory used during the execution of that subprogram to hold intermediate results. Variables stored in this scratch space are termed local variables, and the scratch space is termed an activation record. An activation record typically has a return address that tells it where to pass control back to when the subprogram finishes.
If a subprogram can be executed properly even when another execution of the same subprogram is already in progress, that subprogram is said to be reentrant. A recursive subprogram must be reentrant. Reentrant subprograms are also useful in multi-threaded situations since multiple threads can call the same subprogram without fear of interfering with each other.
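A hedged C++ sketch of the difference (function names invented for the example): the first version keeps its working data in a shared static buffer and is therefore not reentrant, while the second keeps all of its state in parameters and locals:

```cpp
#include <string>

// Not reentrant: concurrent or nested executions share (and may clobber) 'buffer'.
const char* label_not_reentrant(int id) {
    static std::string buffer;
    buffer = "id-" + std::to_string(id);
    return buffer.c_str();
}

// Reentrant: every execution works only on its own parameters and local values.
std::string label_reentrant(int id) {
    return "id-" + std::to_string(id);
}
```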
In object-oriented programming, when a series of functions with the same name can accept different parameter profiles or parameters of different types, each of the functions is said to be overloaded.
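In C++, for example, overloaded functions share a name but differ in their parameter lists, and the compiler selects the appropriate one at each call site (the names below are illustrative):

```cpp
// Overloads: same name, different parameter profiles.
int area(int side) { return side * side; }                 // square
int area(int width, int height) { return width * height; } // rectangle
double area(double radius) {                               // circle
    return 3.14159265358979 * radius * radius;
}
```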
A closure is a subprogram together with the values of some of its variables captured from the environment in which it was created. Closures were a notable feature of the Lisp programming language, introduced by John McCarthy.
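In C++, a lambda expression that captures a local variable gives a rough equivalent; the sketch below (hypothetical names) returns a counter that remembers its own captured state:

```cpp
#include <functional>
#include <iostream>

// make_counter returns a closure: a callable that captures its own copy of
// 'count' from the environment in which it was created.
std::function<int()> make_counter() {
    int count = 0;
    return [count]() mutable { return ++count; };
}

int main() {
    auto next = make_counter();
    std::cout << next() << '\n'; // 1
    std::cout << next() << '\n'; // 2 -- the captured count persists between calls
    std::cout << next() << '\n'; // 3
}
```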
A wide variety of conventions for the coding of subroutines have been developed. Pertaining to their naming, many developers have adopted the approach that the name of a subroutine should be a verb when it does a certain task, an adjective when it makes some inquiry, and a noun when it is used to substitute variables. Examples: append(), empty(), list_head().
There is a significant runtime overhead in calling a subroutine, including passing the arguments, branching to the subprogram, and branching back to the caller. A method used to eliminate this overhead is inline expansion or inlining of the subprogram's body at each call site (versus branching to the subroutine and back). Not only does this avoid the call overhead, but it also allows the compiler to optimize the procedure's body more effectively by taking into account the context and arguments at that call. Inlining, however, will usually increase the code size, unless the program contains only one call to the subroutine.
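In C++, for instance, the inline keyword is only a hint, and modern compilers decide for themselves whether to expand a call in place; a small sketch (names illustrative):

```cpp
// The compiler may substitute the body at each call site instead of emitting
// a call, trading code size for the elimination of call/return overhead.
inline int square(int x) { return x * x; }

int sum_of_squares(int a, int b) {
    return square(a) + square(b);  // likely expanded in-line here
}
```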