gnu inline asm
::: index
asm keyword, assembly language in C, inline assembly language, mixing
assembly language and C
:::
How to Use Inline Assembly Language in C Code
The asm keyword allows you to embed assembler instructions within C
code. GCC provides two forms of inline asm statements. A basic asm
statement is one with no operands (see basic-asm), while an extended asm
statement (see extended-asm) includes one or more operands. The extended
form is preferred for mixing C and assembly language within a function,
but to include assembly language at top level you must use basic asm.
In short: basic asm has no operands, while extended asm has at least one. Use extended asm inside functions and basic asm at file scope.
You can also use the asm
keyword to override the assembler name for a
C symbol, or to place a C variable in a specific register.
::: index
basic asm, assembly language in C, basic
:::
Basic Asm --- Assembler Instructions Without Operands
A basic asm
statement has the following syntax:
asm asm-qualifiers ( AssemblerInstructions )
For the C language, the asm
keyword is a GNU extension. When writing C
code that can be compiled with -ansi
and the -std
options that select C
dialects without GNU extensions, use __asm__
instead of asm
(see
alternate-keywords
). For the C++
language, asm
is a standard keyword, but __asm__
can be used for
code compiled with -fno-asm
.
Qualifiers
volatile
- The optional volatile qualifier has no effect. All basic asm blocks
  are implicitly volatile.
inline
- If you use the inline qualifier, then for inlining purposes the size
  of the asm statement is taken as the smallest size possible (see
  size-of-an-asm).
Parameters
{AssemblerInstructions}
- This is a literal string that specifies the assembler code. The
  string can contain any instructions recognized by the assembler,
  including directives. GCC does not parse the assembler instructions
  themselves and does not know what they mean or even whether they are
  valid assembler input.

  You may place multiple assembler instructions together in a single
  asm string, separated by the characters normally used in assembly
  code for the system. A combination that works in most places is a
  newline to break the line, plus a tab character (written as \n\t).
  Some assemblers allow semicolons as a line separator. However, note
  that some assembler dialects use semicolons to start a comment.
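
As a minimal sketch of the separator advice above, the following hypothetical helper places two instructions in one basic asm string, joined by a newline and a tab. It assumes that `nop` is a valid mnemonic for the target assembler, which is true for x86, ARM, AArch64, RISC-V, and PowerPC but is not guaranteed on every target. The function name and return value are illustrative only.

```c
#include <assert.h>

/* Emits two no-op instructions from a single basic asm statement so
   that they stay adjacent in the output. Assumption: "nop" is accepted
   by the target assembler. */
static int emit_two_nops(void)
{
    __asm__ ("nop\n\t"   /* newline + tab separates the instructions */
             "nop");
    return 2;            /* number of instructions requested */
}
```

Adjacent string literals are concatenated by the compiler, so writing one instruction per source line keeps the template readable without changing what the assembler sees.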
Remarks
Using extended asm
(see extended-asm
)
typically produces smaller, safer, and more efficient code, and in most
cases it is a better solution than basic asm
. However, there are two
situations where only basic asm
can be used:
- Extended asm statements have to be inside a C function, so to write
  inline assembly language at file scope ('top-level'), outside of C
  functions, you must use basic asm. You can use this technique to emit
  assembler directives, define assembly language macros that can be
  invoked elsewhere in the file, or write entire functions in assembly
  language. Basic asm statements outside of functions may not use any
  qualifiers.
- Functions declared with the naked attribute also require basic asm
  (see function-attributes).
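
The first case can be sketched as follows: a whole function written in a file-scope basic asm statement. This is only an illustration under stated assumptions: an x86-64 ELF target using the System V calling convention, the default AT&T dialect (it would not assemble with -masm=intel), and a hypothetical function name `asm_add_one`. Other targets fall back to plain C so the example still builds.

```c
#include <assert.h>

int asm_add_one(int x);   /* hypothetical name; defined below */

#if defined(__x86_64__) && defined(__ELF__)
/* File-scope basic asm: no qualifiers are allowed here, and register
   names take a single % because the string is copied verbatim.
   .pushsection/.popsection restore whatever section the compiler was
   emitting into before this statement. */
__asm__ (
    ".pushsection .text\n\t"
    ".globl asm_add_one\n\t"
    ".type asm_add_one, @function\n"
    "asm_add_one:\n\t"
    "leal 1(%rdi), %eax\n\t"   /* first int argument arrives in %edi */
    "ret\n\t"
    ".size asm_add_one, .-asm_add_one\n\t"
    ".popsection"
);
#else
/* Portable fallback for targets the assembly above does not cover. */
int asm_add_one(int x) { return x + 1; }
#endif
```

Calling `asm_add_one(41)` from C is expected to return 42 on either path.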
Safely accessing C data and calling functions from basic asm
is more
complex than it may appear. To access C data, it is better to use
extended asm
.
Do not expect a sequence of asm
statements to remain perfectly
consecutive after compilation. If certain instructions need to remain
consecutive in the output, put them in a single multi-instruction asm
statement. Note that GCC's optimizers can move asm
statements
relative to other code, including across jumps.
asm
statements may not perform jumps into other asm
statements. GCC
does not know about these jumps, and therefore cannot take account of
them when deciding how to optimize. Jumps from asm
to C labels are
only supported in extended asm
.
Under certain circumstances, GCC may duplicate (or remove duplicates of)
your assembly code when optimizing. This can lead to unexpected
duplicate symbol errors during compilation if your assembly code defines
symbols or labels.
::: warning
::: title
Warning
:::
The C standards do not specify semantics for asm
, making it a
potential source of incompatibilities between compilers. These
incompatibilities may not produce compiler warnings/errors.
:::
GCC does not parse basic asm's {AssemblerInstructions}, which means
there is no way to communicate to the compiler what is happening inside
them. GCC has no visibility of symbols in the asm
and may discard them
as unreferenced. It also does not know about side effects of the
assembler code, such as modifications to memory or registers. Unlike
some compilers, GCC assumes that no changes to general purpose registers
occur. This assumption may change in a future release.
To avoid complications from future changes to the semantics and the
compatibility issues between compilers, consider replacing basic asm
with extended asm
. See How to convert from basic asm to extended
asm for information
about how to perform this conversion.
The compiler copies the assembler instructions in a basic asm
verbatim
to the assembly language output file, without processing dialects or any
of the %
operators that are available
with extended asm
. This results in minor differences between basic
asm
strings and extended asm
templates. For example, to refer to
registers you might use %eax
in basic
asm
and %%eax
in extended asm
.
On targets such as x86 that support multiple assembler dialects, all
basic asm
blocks use the assembler dialect specified by the
-masm
command-line option (see
x86-options
). Basic asm
provides no
mechanism to provide different assembler strings for different dialects.
For basic asm
with non-empty assembler string GCC assumes the
assembler block does not change any general purpose registers, but it
may read or write any globally accessible variable.
Here is an example of basic asm
for i386:
/* Note that this code will not compile with -masm=intel */
#define DebugBreak() asm("int $3")
::: index
extended asm, assembly language in C, extended
:::
Extended Asm - Assembler Instructions with C Expression Operands
With extended asm
you can read and write C variables from assembler
and perform jumps from assembler code to C labels. Extended asm
syntax
uses colons (:
) to delimit the operand
parameters after the assembler template:
asm asm-qualifiers ( AssemblerTemplate
: OutputOperands
[ : InputOperands
[ : Clobbers ] ])
asm asm-qualifiers ( AssemblerTemplate
: OutputOperands
: InputOperands
: Clobbers
: GotoLabels)
where in the last form, {asm-qualifiers} contains goto (and in the
first form, it does not).
The asm
keyword is a GNU extension. When writing code that can be
compiled with -ansi
and the various
-std
options, use __asm__
instead
of asm
(see alternate-keywords
).
Qualifiers
volatile
- The typical use of extended asm statements is to manipulate input
  values to produce output values. However, your asm statements may
  also produce side effects. If so, you may need to use the volatile
  qualifier to disable certain optimizations. See volatile.
inline
- If you use the inline qualifier, then for inlining purposes the size
  of the asm statement is taken as the smallest size possible (see
  size-of-an-asm).
goto
- This qualifier informs the compiler that the asm statement may
  perform a jump to one of the labels listed in the {GotoLabels}. See
  gotolabels.
Parameters
{AssemblerTemplate}
- This is a literal string that is the template for the assembler
  code. It is a combination of fixed text and tokens that refer to the
  input, output, and goto parameters. See assemblertemplate.
{OutputOperands}
- A comma-separated list of the C variables modified by the
  instructions in the {AssemblerTemplate}. An empty list is permitted.
  See outputoperands.
{InputOperands}
- A comma-separated list of C expressions read by the instructions in
  the {AssemblerTemplate}. An empty list is permitted. See
  inputoperands.
{Clobbers}
- A comma-separated list of registers or other values changed by the
  {AssemblerTemplate}, beyond those listed as outputs. An empty list
  is permitted. See clobbers-and-scratch-registers.
{GotoLabels}
- When you are using the goto form of asm, this section contains the
  list of all C labels to which the code in the {AssemblerTemplate}
  may jump. See gotolabels.

  asm statements may not perform jumps into other asm statements, only
  to the listed {GotoLabels}. GCC's optimizers do not know about other
  jumps; therefore they cannot take account of them when deciding how
  to optimize.
The total number of input + output + goto operands is limited to 30.
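
To illustrate the five-part goto form, here is a small sketch that jumps from the template to a C label named in {GotoLabels}. It assumes an x86 target, the AT&T dialect, and a compiler that supports asm goto; the helper name `is_nonzero` and the label `nonzero` are hypothetical. Other targets use a plain C fallback.

```c
#include <assert.h>

/* Returns 1 when x is nonzero and 0 otherwise. */
static int is_nonzero(int x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto ("testl %0, %0\n\t"     /* set ZF from x */
                  "jnz %l[nonzero]"      /* %l[name] expands to the label */
                  : /* no outputs */
                  : "r" (x)
                  : "cc"
                  : nonzero);
    return 0;
nonzero:
    return 1;
#else
    return x != 0;   /* fallback for targets without this sketch */
#endif
}
```

Because the label appears in the operand list, the compiler knows about both exits from the statement and can optimize the surrounding code accordingly.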
Remarks
The asm
statement allows you to include assembly instructions directly
within C code. This may help you to maximize performance in
time-sensitive code or to access assembly instructions that are not
readily available to C programs.
Note that extended asm statements must be inside a function. Only
basic asm may be outside functions (see basic-asm). Functions declared
with the naked attribute also require basic asm (see
function-attributes).
While the uses of asm
are many and varied, it may help to think of an
asm
statement as a series of low-level instructions that convert input
parameters to output parameters. So a simple (if not particularly
useful) example for i386 using asm
might look like this:
int src = 1;
int dst;
asm ("mov %1, %0\n\t"
"add $1, %0"
: "=r" (dst)
: "r" (src));
printf("%d\n", dst);
This code copies src to dst and adds 1 to dst.
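
The fragment above can be wrapped in a function so that it compiles on its own. The sketch below assumes an x86 target and the default AT&T dialect; the function name `copy_plus_one` is hypothetical, and a plain C fallback is supplied for other targets.

```c
#include <assert.h>

/* Copies src into a fresh variable and adds 1 to the copy. */
static int copy_plus_one(int src)
{
    int dst;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("mov %1, %0\n\t"   /* dst = src */
             "add $1, %0"       /* dst += 1  */
             : "=r" (dst)       /* %0: write-only output in a register */
             : "r" (src)        /* %1: input in a register */
             : "cc");           /* add changes the flags */
#else
    dst = src + 1;              /* fallback for non-x86 targets */
#endif
    return dst;
}
```

The input is read only by the first instruction, before the output is written, so no early-clobber modifier is needed here.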
::: index
volatile asm, asm volatile
:::
Volatile
GCC's optimizers sometimes discard asm
statements if they determine
there is no need for the output variables. Also, the optimizers may move
code out of loops if they believe that the code will always return the
same result (i.e. none of its input values change between calls). Using
the volatile
qualifier disables these optimizations. asm
statements
that have no output operands and asm goto statements are implicitly
volatile.
This i386 code demonstrates a case that does not use (or require) the
volatile
qualifier. If it is performing assertion checking, this code
uses asm
to perform the validation. Otherwise, dwRes
is unreferenced
by any code. As a result, the optimizers can discard the asm
statement, which in turn removes the need for the entire DoCheck
routine. By omitting the volatile
qualifier when it isn't needed you
allow the optimizers to produce the most efficient code possible.
void DoCheck(uint32_t dwSomeValue)
{
uint32_t dwRes;
// Assumes dwSomeValue is not zero.
asm ("bsfl %1,%0"
: "=r" (dwRes)
: "r" (dwSomeValue)
: "cc");
assert(dwRes > 3);
}
The next example shows a case where the optimizers can recognize that
the input (dwSomeValue
) never changes during the execution of the
function and can therefore move the asm
outside the loop to produce
more efficient code. Again, using the volatile
qualifier disables this
type of optimization.
void do_print(uint32_t dwSomeValue)
{
uint32_t dwRes;
for (uint32_t x=0; x < 5; x++)
{
// Assumes dwSomeValue is not zero.
asm ("bsfl %1,%0"
: "=r" (dwRes)
: "r" (dwSomeValue)
: "cc");
printf("%u: %u %u\n", x, dwSomeValue, dwRes);
}
}
The following example demonstrates a case where you need to use the
volatile
qualifier. It uses the x86 rdtsc
instruction, which reads
the computer's time-stamp counter. Without the volatile
qualifier,
the optimizers might assume that the asm
block will always return the
same value and therefore optimize away the second call.
uint64_t msr;
asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
"shl $32, %%rdx\n\t" // Shift the upper bits left.
"or %%rdx, %0" // 'Or' in the lower bits.
: "=a" (msr)
:
: "rdx");
printf("msr: %llx\n", (unsigned long long) msr);
// Do other work...
// Reprint the timestamp
asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
"shl $32, %%rdx\n\t" // Shift the upper bits left.
"or %%rdx, %0" // 'Or' in the lower bits.
: "=a" (msr)
:
: "rdx");
printf("msr: %llx\n", (unsigned long long) msr);
GCC's optimizers do not treat this code like the non-volatile code in
the earlier examples. They do not move it out of loops or omit it on the
assumption that the result from a previous call is still valid.
Note that the compiler can move even volatile asm
instructions
relative to other code, including across jump instructions. For example,
on many targets there is a system register that controls the rounding
mode of floating-point operations. Setting it with a volatile asm
statement, as in the following PowerPC example, does not work reliably.
asm volatile("mtfsf 255, %0" : : "f" (fpenv));
sum = x + y;
The compiler may move the addition back before the volatile asm
statement. To make it work as expected, add an artificial dependency to
the asm
by referencing a variable in the subsequent code, for example:
asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
sum = x + y;
Under certain circumstances, GCC may duplicate (or remove duplicates of)
your assembly code when optimizing. This can lead to unexpected
duplicate symbol errors during compilation if your asm
code defines
symbols or labels. Using %=
(see
assemblertemplate
) may help resolve this
problem.
::: index
asm assembler template
:::
Assembler Template
An assembler template is a literal string containing assembler
instructions. The compiler replaces tokens in the template that refer to
inputs, outputs, and goto labels, and then outputs the resulting string
to the assembler. The string can contain any instructions recognized by
the assembler, including directives. GCC does not parse the assembler
instructions themselves and does not know what they mean or even whether
they are valid assembler input. However, it does count the statements
(see size-of-an-asm
).
You may place multiple assembler instructions together in a single asm
string, separated by the characters normally used in assembly code for
the system. A combination that works in most places is a newline to
break the line, plus a tab character to move to the instruction field
(written as \n\t). Some assemblers
allow semicolons as a line separator. However, note that some assembler
dialects use semicolons to start a comment.
Do not expect a sequence of asm
statements to remain perfectly
consecutive after compilation, even when you are using the volatile
qualifier. If certain instructions need to remain consecutive in the
output, put them in a single multi-instruction asm
statement.
Accessing data from C programs without using input/output operands (such
as by using global symbols directly from the assembler template) may not
work as expected. Similarly, calling functions directly from an
assembler template requires a detailed understanding of the target
assembler and ABI.
Since GCC does not parse the assembler template, it has no visibility of
any symbols it references. This may result in GCC discarding those
symbols as unreferenced unless they are also listed as input, output, or
goto operands.
Special format strings
In addition to the tokens described by the input, output, and goto
operands, these tokens have special meanings in the assembler template:
%%
- Outputs a single % into the assembler code.
%=
- Outputs a number that is unique to each instance of the asm
  statement in the entire compilation. This option is useful when
  creating local labels and referring to them multiple times in a
  single template that generates multiple assembler instructions.
%{
%|
%}
- Outputs {, |, and } characters (respectively) into the assembler
  code. When unescaped, these characters have special meaning to
  indicate multiple assembler dialects, as described below.
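
A brief sketch of %= in use: the template below defines a local label and branches to it, and %= gives each copy of the statement its own label name, so duplication or inlining by the optimizers should not produce duplicate symbols. It assumes an x86 target, the AT&T dialect, and an ELF-style `.L` local label prefix; the helper name `clamp_to_nonnegative` is hypothetical.

```c
#include <assert.h>

/* Returns x, or 0 when x is negative. */
static int clamp_to_nonnegative(int x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("testl %0, %0\n\t"     /* set SF from x */
             "jns .Lkeep%=\n\t"     /* non-negative: skip the reset */
             "xorl %0, %0\n"        /* negative: x = 0 */
             ".Lkeep%=:"            /* label made unique by %= */
             : "+r" (x)             /* read and written */
             :
             : "cc");
#else
    if (x < 0)
        x = 0;                      /* fallback for non-x86 targets */
#endif
    return x;
}
```

The jump stays inside a single asm statement, which is the supported way to branch without listing goto labels.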
Multiple assembler dialects in asm templates
On targets such as x86, GCC supports multiple assembler dialects. The
-masm
option controls which dialect
GCC uses as its default for inline assembler. The target-specific
documentation for the -masm
option
contains the list of supported dialects, as well as the default dialect
if the option is not specified. This information may be important to
understand, since assembler code that works correctly when compiled
using one dialect will likely fail if compiled using another. See
x86-options
.
If your code needs to support multiple assembler dialects (for example,
if you are writing public headers that need to support a variety of
compilation options), use constructs of this form:
{ dialect0 | dialect1 | dialect2... }
This construct outputs dialect0
when using dialect #0 to compile the
code, dialect1
for dialect #1, etc. If there are fewer alternatives
within the braces than the number of dialects the compiler supports, the
construct outputs nothing.
For example, if an x86 compiler supports two dialects (att, intel),
an assembler template such as this:
"bt{l %[Offset],%[Base] | %[Base],%[Offset]}; jc %l2"
is equivalent to one of
"btl %[Offset],%[Base] ; jc %l2" /* att dialect */
"bt %[Base],%[Offset]; jc %l2" /* intel dialect */
Using that same compiler, this code:
"xchg{l}\t{%%}ebx, %1"
corresponds to either
"xchgl\t%%ebx, %1" /* att dialect */
"xchg\tebx, %1" /* intel dialect */
There is no support for nesting dialect alternatives.
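
As a small, hedged illustration of the brace syntax, the following helper should assemble under either -masm=att or -masm=intel on an x86 target, because the template supplies the mnemonic suffix and operand order for each dialect. The name `add_u32` is hypothetical, and a C fallback covers other targets.

```c
#include <assert.h>

/* Adds b to a using one instruction written for both x86 dialects. */
static unsigned add_u32(unsigned a, unsigned b)
{
#if defined(__x86_64__) || defined(__i386__)
    /* att:   addl %1, %0   (source first, destination second)
       intel: add  %0, %1   (destination first, source second) */
    __asm__ ("add{l %1, %0| %0, %1}"
             : "+r" (a)
             : "r" (b)
             : "cc");
#else
    a += b;   /* fallback for non-x86 targets */
#endif
    return a;
}
```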
::: index
asm output operands
:::
Output Operands
An asm
statement has zero or more output operands indicating the names
of C variables modified by the assembler code.
In this i386 example, old
(referred to in the template string as %0
)
and *Base
(as %1
) are outputs and Offset
(%2
) is an input:
bool old;
__asm__ ("btsl %2,%1\n\t" // Turn on zero-based bit #Offset in Base.
"sbb %0,%0" // Use the CF to calculate old.
: "=r" (old), "+rm" (*Base)
: "Ir" (Offset)
: "cc");
return old;
Operands are separated by commas. Each operand has this format:
[ [asmSymbolicName] ] constraint (cvariablename)
{asmSymbolicName}
- Specifies a symbolic name for the operand. Reference the name in the
  assembler template by enclosing it in square brackets (i.e.
  %[Value]). The scope of the name is the asm statement that contains
  the definition. Any valid C variable name is acceptable, including
  names already defined in the surrounding code. No two operands
  within the same asm statement can use the same symbolic name.

  When not using an {asmSymbolicName}, use the (zero-based) position of
  the operand in the list of operands in the assembler template. For
  example if there are three output operands, use %0 in the template
  to refer to the first, %1 for the second, and %2 for the third.
{constraint}
- A string constant specifying constraints on the placement of the
  operand; see constraints for details.

  Output constraints must begin with either = (a variable overwriting
  an existing value) or + (when reading and writing). When using =, do
  not assume the location contains the existing value on entry to the
  asm, except when the operand is tied to an input; see inputoperands.

  After the prefix, there must be one or more additional constraints
  (see constraints) that describe where the value resides. Common
  constraints include r for register and m for memory. When you list
  more than one possible location (for example, "=rm"), the compiler
  chooses the most efficient one based on the current context. If you
  list as many alternates as the asm statement allows, you permit the
  optimizers to produce the best possible code. If you must use a
  specific register, but your Machine Constraints do not provide
  sufficient control to select the specific register you want, local
  register variables may provide a solution (see
  local-register-variables).
{cvariablename}
- Specifies a C lvalue expression to hold the output, typically a
  variable name. The enclosing parentheses are a required part of the
  syntax.
When the compiler selects the registers to use to represent the output
operands, it does not use any of the clobbered registers (see
clobbers-and-scratch-registers
).
Output operand expressions must be lvalues. The compiler cannot check
whether the operands have data types that are reasonable for the
instruction being executed. For output expressions that are not directly
addressable (for example a bit-field), the constraint must allow a
register. In that case, GCC uses the register as the output of the
asm
, and then stores that register into the output.
Operands using the +
constraint
modifier count as two operands (that is, both as input and output)
towards the total maximum of 30 operands per asm
statement.
Use the &
constraint modifier (see
modifiers
) on all output operands that
must not overlap an input. Otherwise, GCC may allocate the output
operand in the same register as an unrelated input operand, on the
assumption that the assembler code consumes its inputs before producing
outputs. This assumption may be false if the assembler code actually
consists of more than one instruction.
The same problem can occur if one output parameter
({a}
) allows a register constraint and
another output parameter ({b}
) allows a
memory constraint. The code generated by GCC to access the memory
address in {b}
can contain registers
which might be shared by {a}
, and GCC
considers those registers to be inputs to the asm. As above, GCC assumes
that such input registers are consumed before any outputs are written.
This assumption may result in incorrect behavior if the asm
statement
writes to {a}
before using
{b}
. Combining the
&
modifier with the register constraint
on {a}
ensures that modifying
{a}
does not affect the address
referenced by {b}
. Otherwise, the
location of {b}
is undefined if
{a}
is modified before using
{b}
.
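
The early-clobber rule can be seen in a multi-instruction template that writes its output before it has finished reading its inputs. The sketch below assumes an x86 target and the AT&T dialect; the helper name `double_then_add` is hypothetical. Without the & modifier, the compiler would be free to place `out` and `b` in the same register, and the first instruction would then destroy `b`.

```c
#include <assert.h>
#include <stdint.h>

/* Computes 2 * a + b. */
static uint32_t double_then_add(uint32_t a, uint32_t b)
{
    uint32_t out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %1, %0\n\t"   /* out = a (output written early)   */
             "addl %0, %0\n\t"   /* out = 2 * a                      */
             "addl %2, %0"       /* out += b (input still needed)    */
             : "=&r" (out)       /* & keeps out away from a and b    */
             : "r" (a), "r" (b)
             : "cc");
#else
    out = 2u * a + b;            /* fallback for non-x86 targets */
#endif
    return out;
}
```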
asm
supports operand modifiers on operands (for example
%k2
instead of simply
%2
). Typically these modifiers are
hardware dependent. The list of supported modifiers for x86 is found at
x86operandmodifiers
.
If the C code that follows the asm
makes no use of any of the output
operands, use volatile
for the asm
statement to prevent the
optimizers from discarding the asm
statement as unneeded (see
volatile
).
This code makes no use of the optional
{asmSymbolicName}
. Therefore it
references the first output operand as %0
(were there a second, it
would be %1
, etc). The number of the first input operand is one
greater than that of the last output operand. In this i386 example, that
makes Mask
referenced as %1
:
uint32_t Mask = 1234;
uint32_t Index;
asm ("bsfl %1, %0"
: "=r" (Index)
: "r" (Mask)
: "cc");
That code overwrites the variable Index (=), placing the value in a
register (r). Using the generic r constraint instead of a constraint
for a specific register allows the compiler to pick the register to
use, which can result in more efficient code. This may not be possible
if an assembler instruction requires a specific register.
The following i386 example uses the
{asmSymbolicName}
syntax. It produces
the same result as the code above, but some may consider it more
readable or more maintainable since reordering index numbers is not
necessary when adding or removing operands. The names aIndex
and
aMask
are only used in this example to emphasize which names get used
where. It is acceptable to reuse the names Index
and Mask
.
uint32_t Mask = 1234;
uint32_t Index;
asm ("bsfl %[aMask], %[aIndex]"
: [aIndex] "=r" (Index)
: [aMask] "r" (Mask)
: "cc");
Here are some more examples of output operands.
uint32_t c = 1;
uint32_t d;
uint32_t *e = &c;
asm ("mov %[e], %[d]"
: [d] "=rm" (d)
: [e] "rm" (*e));
Here, d
may either be in a register or in memory. Since the compiler
might already have the current value of the uint32_t
location pointed
to by e
in a register, you can enable it to choose the best location
for d
by specifying both constraints.
::: index
asm flag output operands
:::
Flag Output Operands
Some targets have a special register that holds the 'flags' for the
result of an operation or comparison. Normally, the contents of that
register are either unmodified by the asm, or the asm
statement is
considered to clobber the contents.
On some targets, a special form of output operand exists by which
conditions in the flags register may be outputs of the asm. The set of
conditions supported is target specific, but the general rule is that
the output variable must be a scalar integer, and the value is boolean.
When supported, the target defines the preprocessor symbol
__GCC_ASM_FLAG_OUTPUTS__
.
Because of the special nature of the flag output operands, the
constraint may not include alternatives.
Most often, the target has only one flags register, which is thus an
implied operand of many instructions. In this case, the operand should
not be referenced within the assembler template via %0
etc, as
there's no corresponding text in the assembly language.
- ARM AArch64
-
The flag output constraints for the ARM family are of the form
=@cc{cond} where {cond} is one of the standard conditions defined in
the ARM ARM for ConditionHolds.
eq
- Z flag set, or equal
ne
- Z flag clear or not equal
cs
hs
- C flag set or unsigned greater than or equal
cc
lo
- C flag clear or unsigned less than
mi
- N flag set or 'minus'
pl
- N flag clear or 'plus'
vs
- V flag set or signed overflow
vc
- V flag clear
hi
- unsigned greater than
ls
- unsigned less than or equal
ge
- signed greater than or equal
lt
- signed less than
gt
- signed greater than
le
- signed less than or equal
The flag output constraints are not supported in thumb1 mode.
- x86 family
-
The flag output constraints for the x86 family are of the form
=@cc{cond} where {cond} is one of the standard conditions defined in
the ISA manual for jcc or setcc.
a
- 'above' or unsigned greater than
ae
- 'above or equal' or unsigned greater than or equal
b
- 'below' or unsigned less than
be
- 'below or equal' or unsigned less than or equal
c
- carry flag set
e
z
- 'equal' or zero flag set
g
- signed greater than
ge
- signed greater than or equal
l
- signed less than
le
- signed less than or equal
o
- overflow flag set
p
- parity flag set
s
- sign flag set
na
nae
nb
nbe
nc
ne
ng
nge
nl
nle
no
np
ns
nz
- 'not' {flag}, or inverted versions of those above
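
The following sketch shows a flag output on x86: the carry flag left by btsl is returned directly through an =@ccc operand, so no sbb or setc instruction is needed in the template and the operand is never referenced as %0. It is guarded by __GCC_ASM_FLAG_OUTPUTS__ and falls back to plain C elsewhere. The helper name `test_and_set_bit` is hypothetical, and limiting `bit` to 0..31 is an assumption of this example, not a rule from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sets the given bit in *word and returns its previous value. */
static bool test_and_set_bit(uint32_t *word, unsigned bit)
{
    assert(bit < 32);   /* keep the access inside one 32-bit word */
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && \
    (defined(__x86_64__) || defined(__i386__))
    bool old;
    __asm__ ("btsl %2, %1"          /* CF = old bit, then set the bit */
             : "=@ccc" (old),       /* output taken from the carry flag */
               "+rm" (*word)
             : "Ir" (bit));
    return old;
#else
    bool old = ((*word >> bit) & 1u) != 0;
    *word |= UINT32_C(1) << bit;    /* fallback without flag outputs */
    return old;
#endif
}
```

Note that the template numbers the remaining operands as usual: `*word` is %1 and `bit` is %2, even though the flag operand %0 has no text of its own.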
::: index
asm input operands, asm expressions
:::
Input Operands
Input operands make values from C variables and expressions available to
the assembly code.
Operands are separated by commas. Each operand has this format:
[ [asmSymbolicName] ] constraint (cexpression)
{asmSymbolicName}
- Specifies a symbolic name for the operand. Reference the name in the
  assembler template by enclosing it in square brackets (i.e.
  %[Value]). The scope of the name is the asm statement that contains
  the definition. Any valid C variable name is acceptable, including
  names already defined in the surrounding code. No two operands
  within the same asm statement can use the same symbolic name.

  When not using an {asmSymbolicName}, use the (zero-based) position of
  the operand in the list of operands in the assembler template. For
  example if there are two output operands and three inputs, use %2 in
  the template to refer to the first input operand, %3 for the second,
  and %4 for the third.
{constraint}
- A string constant specifying constraints on the placement of the
  operand; see constraints for details.

  Input constraint strings may not begin with either = or +. When you
  list more than one possible location (for example, "irm"), the
  compiler chooses the most efficient one based on the current context.
  If you must use a specific register, but your Machine Constraints do
  not provide sufficient control to select the specific register you
  want, local register variables may provide a solution (see
  local-register-variables).

  Input constraints can also be digits (for example, "0"). This
  indicates that the specified input must be in the same place as the
  output constraint at the (zero-based) index in the output constraint
  list. When using {asmSymbolicName} syntax for the output operands,
  you may use these names (enclosed in brackets []) instead of digits.
{cexpression}
- This is the C variable or expression being passed to the asm
  statement as input. The enclosing parentheses are a required part of
  the syntax.
When the compiler selects the registers to use to represent the input
operands, it does not use any of the clobbered registers (see
clobbers-and-scratch-registers
).
If there are no output operands but there are input operands, place two
consecutive colons where the output operands would go:
__asm__ ("some instructions"
: /* No outputs. */
: "r" (Offset / 8));
::: warning
::: title
Warning
:::
Do not modify the contents of input-only operands (except for inputs
tied to outputs). The compiler assumes that on exit from the asm
statement these operands contain the same values as they had before
executing the statement.
:::
It is not possible to use clobbers to inform the compiler that the
values in these inputs are changing. One common work-around is to tie
the changing input variable to an output variable that never gets used.
Note, however, that if the code that follows the asm statement makes
no use of any of the output operands, the GCC optimizers may discard the
asm statement as unneeded (see volatile).
asm
supports operand modifiers on operands (for example
%k2
instead of simply
%2
). Typically these modifiers are
hardware dependent. The list of supported modifiers for x86 is found at
x86operandmodifiers
.
In this example using the fictitious combine
instruction, the
constraint "0"
for input operand 1 says that it must occupy the same
location as output operand 0. Only input operands may use numbers in
constraints, and they must each refer to an output operand. Only a
number (or the symbolic assembler name) in the constraint can guarantee
that one operand is in the same place as another. The mere fact that
foo
is the value of both operands is not enough to guarantee that they
are in the same place in the generated assembler code.
asm ("combine %2, %0"
: "=r" (foo)
: "0" (foo), "g" (bar));
Here is an example using symbolic names.
asm ("cmoveq %1, %2, %[result]"
: [result] "=r"(result)
: "r" (test), "r" (new), "[result]" (old));
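
Since combine is fictitious, here is the same matching-constraint idea with a real instruction. This is a sketch that assumes an x86 target and the AT&T dialect; the helper name `add_tied` is hypothetical. The "0" constraint forces `acc` into the same register as output operand 0, which suits a two-operand instruction that overwrites one of its sources.

```c
#include <assert.h>
#include <stdint.h>

/* Returns acc + addend using a two-operand add. */
static uint32_t add_tied(uint32_t acc, uint32_t addend)
{
    uint32_t result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("addl %2, %0"        /* %0 starts out holding acc */
             : "=r" (result)
             : "0" (acc),         /* tied to output operand 0 */
               "r" (addend)
             : "cc");
#else
    result = acc + addend;        /* fallback for non-x86 targets */
#endif
    return result;
}
```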
::: index
asm clobbers, asm scratch registers
:::
Clobbers and Scratch Registers
While the compiler is aware of changes to entries listed in the output
operands, the inline asm
code may modify more than just the outputs.
For example, calculations may require additional registers, or the
processor may overwrite a register as a side effect of a particular
assembler instruction. In order to inform the compiler of these changes,
list them in the clobber list. Clobber list items are either register
names or the special clobbers (listed below). Each clobber list item is
a string constant enclosed in double quotes and separated by commas.
Clobber descriptions may not in any way overlap with an input or output
operand. For example, you may not have an operand describing a register
class with one member when listing that register in the clobber list.
Variables declared to live in specific registers (see
explicit-register-variables
) and used as
asm
input or output operands must have no part mentioned in the
clobber description. In particular, there is no way to specify that
input operands get modified without also specifying them as output
operands.
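
The last point can be illustrated with an instruction that modifies its own input registers. In this sketch, `rep stosb` advances the destination pointer and counts the length register down to zero, so both are declared as read-write outputs with + rather than listed as clobbers. It assumes an x86 target with the direction flag clear, as the common ABIs require on function entry; the helper name `fill_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stores `value` into `count` bytes starting at dst. */
static void fill_bytes(void *dst, unsigned char value, size_t count)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile ("rep stosb"
                      : "+D" (dst),     /* destination pointer, advanced */
                        "+c" (count)    /* counter, decremented to zero  */
                      : "a" (value)     /* byte to store, left unchanged */
                      : "memory");      /* writes memory not named above */
#else
    unsigned char *p = dst;
    while (count--)
        *p++ = value;                   /* fallback for non-x86 targets */
#endif
}
```

Listing "edi" or "ecx" as clobbers instead would overlap with the operands and is rejected by the compiler.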
When the compiler selects which registers to use to represent input and
output operands, it does not use any of the clobbered registers. As a
result, clobbered registers are available for any use in the assembler
code.
Another restriction is that the clobber list should not contain the
stack pointer register. This is because the compiler requires the value
of the stack pointer to be the same after an asm
statement as it was
on entry to the statement. However, previous versions of GCC did not
enforce this rule and allowed the stack pointer to appear in the list,
with unclear semantics. This behavior is deprecated and listing the
stack pointer may become an error in future versions of GCC.
Here is a realistic example for the VAX showing the use of clobbered
registers:
asm volatile ("movc3 %0, %1, %2"
: /* No outputs. */
: "g" (from), "g" (to), "g" (count)
: "r0", "r1", "r2", "r3", "r4", "r5", "memory");
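For readers without a VAX at hand, here is a smaller sketch of a side-effect clobber on x86. The mull instruction writes its 64-bit product to %edx:%eax, so %edx is overwritten even though the code only wants the low half. The function name mul_low32 and the fallback branch are assumptions for illustration.

```c
/* Returns the low 32 bits of a * b.
   mull multiplies %eax by its operand and stores the result in
   %edx:%eax, so %edx is destroyed as a side effect and must be
   named in the clobber list. */
unsigned int mul_low32(unsigned int a, unsigned int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int lo;
    __asm__ ("mull %2"
             : "=a" (lo)              /* low half comes back in %eax */
             : "0" (a), "r" (b)       /* a starts in %eax            */
             : "edx", "cc");          /* high half and flags         */
    return lo;
#else
    return a * b;  /* fallback for other targets */
#endif
}
```

Because "edx" is listed as a clobber, the compiler will not choose it for the "r" operand, which is the behavior the paragraphs above describe.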
Also, there are two special clobber arguments:
"cc"
- The "cc" clobber indicates that the assembler code modifies the
  flags register. On some machines, GCC represents the condition codes
  as a specific hardware register; "cc" serves to name this register.
  On other machines, condition code handling is different, and
  specifying "cc" has no effect. But it is valid no matter what the
  target.
"memory"
- The "memory" clobber tells the compiler that the assembly code
  performs memory reads or writes to items other than those listed in
  the input and output operands (for example, accessing the memory
  pointed to by one of the input parameters). To ensure memory
  contains correct values, GCC may need to flush specific register
  values to memory before executing the asm. Further, the compiler
  does not assume that any values read from memory before an asm
  remain unchanged after that asm; it reloads them as needed. Using
  the "memory" clobber effectively forms a read/write memory barrier
  for the compiler.

  Note that this clobber does not prevent the processor from doing
  speculative reads past the asm statement. To prevent that, you
  need processor-specific fence instructions.
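A common use of the "memory" clobber is an empty asm that acts purely as a compiler barrier. The sketch below assumes hypothetical names (shared_flag, compiler_barrier, publish_then_read); it emits no instructions and, as noted above, is not a substitute for a processor fence.

```c
static int shared_flag;

/* An empty asm with a "memory" clobber generates no instructions, but
   the compiler must assume that memory was read and written here. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Stores value, then reloads it from memory after the barrier instead
   of reusing a register copy. */
int publish_then_read(int value)
{
    shared_flag = value;
    compiler_barrier();
    return shared_flag;
}
```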
Flushing registers to memory has performance implications and may be an
issue for time-sensitive code. You can provide better information to GCC
to avoid this, as shown in the following examples. At a minimum,
aliasing rules allow GCC to know what memory doesn't need to be
flushed.
Here is a fictitious sum of squares instruction that takes two pointers
to floating point values in memory and produces a floating point
register output. Notice that x and y both appear twice in the asm
parameters, once to specify memory accessed, and once to specify a base
register used by the asm. You won't normally be wasting a register by
doing this as GCC can use the same register for both purposes. However,
it would be foolish to use both %1 and %3 for x in this asm and expect
them to be the same. In fact, %3 may well not be a register. It might
be a symbolic memory reference to the object pointed to by x.
asm ("sumsq %0, %1, %2"
: "+f" (result)
: "r" (x), "r" (y), "m" (*x), "m" (*y));
Here is a fictitious *z++ = *x++ * *y++ instruction. Notice that the
x, y and z pointer registers must be specified as input/output because
the asm modifies them.
asm ("vecmul %0, %1, %2"
: "+r" (z), "+r" (x), "+r" (y), "=m" (*z)
: "m" (*x), "m" (*y));
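The same pattern can be tried with a real instruction. In this sketch, the x86 movsl instruction copies four bytes from the address in the source index register to the address in the destination index register and advances both pointers, so the pointers are "+" operands and the memory they touch is described separately. The function name copy_word_advance is hypothetical, and the code assumes the direction flag is clear, as the usual x86 ABIs require on function entry.

```c
#include <stdint.h>

/* Copies *src to *dst and returns the advanced destination pointer. */
uint32_t *copy_word_advance(uint32_t *dst, const uint32_t *src)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("movsl"
             : "+D" (dst), "+S" (src),  /* both pointers are modified */
               "=m" (*dst)              /* memory written by the asm  */
             : "m" (*src));             /* memory read by the asm     */
    return dst;
#else
    *dst = *src;                        /* fallback for other targets */
    return dst + 1;
#endif
}
```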
An x86 example where the string memory argument is of unknown length.
asm("repne scasb"
: "=c" (count), "+D" (p)
: "m" (*(const char (*)[]) p), "0" (-1), "a" (0));
If you know the above will only be reading a ten byte array then you
could instead use a memory input like: "m" (*(const char (*)[10]) p).
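Wrapped in a complete function, the repne scasb fragment becomes a string-length routine. This is a sketch under stated assumptions: the name scan_length is made up, the asm branch is restricted to GCC on x86, the direction flag is assumed clear, and a "cc" clobber has been added because scasb sets the flags.

```c
#include <stddef.h>

/* Returns the length of the NUL-terminated string s. */
size_t scan_length(const char *s)
{
#if defined(__GNUC__) && !defined(__clang__) && \
    (defined(__x86_64__) || defined(__i386__))
    size_t count;
    const char *p = s;
    __asm__ ("repne scasb"
             : "=c" (count), "+D" (p)
             : "m" (*(const char (*)[]) p),  /* string of unknown length */
               "0" ((size_t) -1),            /* counter starts at -1     */
               "a" (0)                       /* byte to search for       */
             : "cc");
    /* The counter was decremented once per byte examined, including
       the terminating NUL, so it now holds -(length + 2). */
    return ~count - 1;
#else
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
#endif
}
```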
Here is an example of a PowerPC vector scale implemented in assembly,
complete with vector and condition code clobbers, and some initialized
offset registers that are unchanged by the asm.
void
dscal (size_t n, double *x, double alpha)
{
  asm ("/* lots of asm here */"
       : "+m" (*(double (*)[n]) x), "+&r" (n), "+b" (x)
       : "d" (alpha), "b" (32), "b" (48), "b" (64),
         "b" (80), "b" (96), "b" (112)
       : "cr0",
         "vs32","vs33","vs34","vs35","vs36","vs37","vs38","vs39",
         "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47");
}
Rather than allocating fixed registers via clobbers to provide scratch
registers for an asm statement, an alternative is to define a variable
and make it an early-clobber output as with a2 and a3 in the example
below. This gives the compiler register allocator more freedom. You can
also define a variable and make it an output tied to an input as with
a0 and a1, tied respectively to ap and lda. Of course, with tied
outputs your asm can't use the input value after modifying the output
register since they are one and the same register.
What's more, if you omit the early-clobber on the output, it is
possible that GCC might allocate the same register to another of the
inputs if GCC could prove they had the same value on entry to the asm.
This is why a1 has an early-clobber. Its tied input, lda, might
conceivably be known to have the value 16 and without an early-clobber
share the same register as %11. On the other hand, ap can't be the same
as any of the other inputs, so an early-clobber on a0 is not needed. It
is also not desirable in this case. An early-clobber on a0 would cause
GCC to allocate a separate register for the
"m" (*(const double (*)[]) ap) input.
Note that tying an input to an output is the way to set up an
initialized temporary register modified by an asm statement. An input
not tied to an output is assumed by GCC to be unchanged; for example,
"b" (16) below sets up %11 to 16, and GCC might use that register in
following code if the value 16 happened to be needed. You can even use
a normal asm output for a scratch if all inputs that might share the
same register are consumed before the scratch is used. The VSX
registers clobbered by the asm statement could have used this technique
except for GCC's limit on the number of asm parameters.
static void
dgemv_kernel_4x4 (long n, const double *ap, long lda,
                  const double *x, double *y, double alpha)
{
  double *a0;
  double *a1;
  double *a2;
  double *a3;
  __asm__
    (
     /* lots of asm here */
     "#n=%1 ap=%8=%12 lda=%13 x=%7=%10 y=%0=%2 alpha=%9 o16=%11\n"
     "#a0=%3 a1=%4 a2=%5 a3=%6"
     :
       "+m" (*(double (*)[n]) y),       // 0
       "+&r" (n),                       // 1
       "+b" (y),                        // 2
       "=b" (a0),                       // 3
       "=&b" (a1),                      // 4
       "=&b" (a2),                      // 5
       "=&b" (a3)                       // 6
     :
       "m" (*(const double (*)[n]) x),  // 7
       "m" (*(const double (*)[]) ap),  // 8
       "d" (alpha),                     // 9
       "r" (x),                         // 10
       "b" (16),                        // 11
       "3" (ap),                        // 12
       "4" (lda)                        // 13
     :
       "cr0",
       "vs32","vs33","vs34","vs35","vs36","vs37",
       "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47"
     );
}
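The early-clobber scratch technique can also be shown on a small scale. The following sketch, with the hypothetical name triple_sum, computes (a + b) * 3 on x86. The scratch variable is an early-clobber output because it is written before the second input is read, while out is a plain output because every input has been consumed by the time it is written.

```c
/* Computes (a + b) * 3 using a compiler-chosen scratch register. */
unsigned int triple_sum(unsigned int a, unsigned int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int out, scratch;
    __asm__ ("movl %2, %1\n\t"     /* scratch = a, written before b is read */
             "addl %3, %1\n\t"     /* scratch += b                          */
             "imull $3, %1, %0"    /* out = scratch * 3, inputs now dead    */
             : "=r" (out), "=&r" (scratch)
             : "r" (a), "r" (b)
             : "cc");
    return out;
#else
    return (a + b) * 3u;           /* fallback for other targets */
#endif
}
```

Compared with naming a fixed register in the clobber list, this leaves the choice of scratch register to the allocator.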
Goto Labels
asm goto allows assembly code to jump to one or more C labels. The
{GotoLabels} section in an asm goto statement contains a
comma-separated list of all C labels to which the assembler code may
jump. GCC assumes that asm execution falls through to the next
statement (if this is not the case, consider using the
__builtin_unreachable intrinsic after the asm statement).
Optimization of asm goto may be improved by using the hot and cold
label attributes (see label-attributes).
If the assembler code does modify anything, use the "memory" clobber
to force the optimizers to flush all register values to memory and
reload them if necessary after the asm statement.
Also note that an asm goto statement is always implicitly considered
volatile.
Be careful when you set output operands inside asm goto only on some
possible control flow paths. If you don't set up the output on a given
path and never use it on that path, it is okay. Otherwise, you should
use the + constraint modifier, which marks the operand as both an input
and an output. With this modifier you will have the correct values on
all possible paths from the asm goto.
To reference a label in the assembler template, prefix it with %l
(lowercase L) followed by its (zero-based) position in {GotoLabels}
plus the number of input and output operands. An output operand with
constraint modifier + is counted as two operands because it is
considered as one output and one input operand. For example, if the asm
has three inputs, one output operand with constraint modifier +, and
one output operand with constraint modifier =, and references two
labels, refer to the first label as %l6 and the second as %l7.
Alternately, you can reference labels using the actual C label name
enclosed in brackets. For example, to reference a label named carry,
you can use %l[carry]. The label must still be listed in the
{GotoLabels} section when using this approach. It is better to use the
named references for labels, as in this case you can avoid counting
input and output operands and the special treatment of output operands
with constraint modifier +.
Here is an example of asm goto for i386:
asm goto (
"btl %1, %0\n\t"
"jc %l2"
: /* No outputs. */
: "r" (p1), "r" (p2)
: "cc"
: carry);
return 0;
carry:
return 1;
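The fragment above can be placed in a complete function as follows. This sketch uses the named form %l[carry] rather than a numeric position; the function name bit_is_set and the fallback branch are assumptions for illustration. With a register operand, btl takes the bit index modulo 32.

```c
/* Returns 1 if the given bit of word is set, 0 otherwise. */
int bit_is_set(unsigned int word, unsigned int bit)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ goto ("btl %1, %0\n\t"    /* copy the selected bit into CF */
                  "jc %l[carry]"      /* jump to the C label if set    */
                  : /* No outputs. */
                  : "r" (word), "r" (bit)
                  : "cc"
                  : carry);
    return 0;
carry:
    return 1;
#else
    return (int) ((word >> (bit & 31u)) & 1u);  /* fallback */
#endif
}
```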
The following example shows an asm goto that uses a memory clobber.
int frob(int x)
{
  int y;
  asm goto ("frob %%r5, %0; jc %l[error]; mov (%1), %%r5"
            : /* No outputs. */
            : "r"(x), "r"(&y)
            : "r5", "memory"
            : error);
  return y;
error:
  return -1;
}
The following example shows an asm goto that uses an output.
int foo(int count)
{
  asm goto ("dec %0; jb %l[stop]"
            : "+r" (count)
            :
            :
            : stop);
  return count;
stop:
  return 0;
}
The following artificial example shows an asm goto that sets up an
output only on one path inside the asm goto. Usage of constraint
modifier = instead of + would be wrong as factor is used on all paths
from the asm goto.
int foo(int inp)
{
  int factor = 0;
  asm goto ("cmp %1, 10; jb %l[lab]; mov 2, %0"
            : "+r" (factor)
            : "r" (inp)
            :
            : lab);
lab:
  return inp * factor; /* return 2 * inp or 0 if inp < 10 */
}
x86 Operand Modifiers
References to input, output, and goto operands in the assembler template
of extended asm statements can use modifiers to affect the way the
operands are formatted in the code output to the assembler. For example,
the following code uses the h and b modifiers for x86:
uint16_t num;
asm volatile ("xchg %h0, %b0" : "+a" (num) );
These modifiers generate this assembler code:
xchg %ah, %al
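As a runnable variant of the snippet above, this sketch wraps the xchg in a function that swaps the two bytes of a 16-bit value. The name swap_bytes16 and the C fallback are assumptions for this example.

```c
#include <stdint.h>

/* Swaps the high and low bytes of a 16-bit value. */
uint16_t swap_bytes16(uint16_t num)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* %h0 prints the high-byte register name (%ah) and %b0 the
       low-byte name (%al) of the register holding operand 0. */
    __asm__ ("xchg %h0, %b0" : "+a" (num));
#else
    num = (uint16_t) ((num >> 8) | (num << 8));  /* fallback */
#endif
    return num;
}
```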
The rest of this discussion uses the following code for illustrative
purposes.
int main()
{
  int iInt = 1;
top:
  asm volatile goto ("some assembler instructions here"
                     : /* No outputs. */
                     : "q" (iInt), "X" (sizeof(unsigned char) + 1), "i" (42)
                     : /* No clobbers. */
                     : top);
}
With no modifiers, this is what the output from the operands would be
for the att and intel dialects of assembler:

| Operand | att | intel |
|---|---|---|
| %0 | %eax | eax |
| %1 | $2 | 2 |
| %3 | $.L3 | OFFSET FLAT:.L3 |
| %4 | $8 | 8 |
| %5 | %xmm0 | xmm0 |
| %7 | $0 | 0 |

The table below shows the list of supported modifiers and their effects.

| Modifier | Description | Operand | att | intel |
|---|---|---|---|---|
| A | Print an absolute memory reference. | %A0 | *%rax | rax |
| b | Print the QImode name of the register. | %b0 | %al | al |
| B | Print the opcode suffix of b. | %B0 | b | |
| c | Require a constant operand and print the constant expression with no punctuation. | %c1 | 2 | 2 |
| d | Print duplicated register operand for AVX instruction. | %d5 | %xmm0, %xmm0 | xmm0, xmm0 |
| E | Print the address in Double Integer (DImode) mode (8 bytes) when the target is 64-bit. Otherwise mode is unspecified (VOIDmode). | %E1 | %(rax) | [rax] |
| g | Print the V16SFmode name of the register. | %g0 | %zmm0 | zmm0 |
| h | Print the QImode name for a 'high' register. | %h0 | %ah | ah |
| H | Add 8 bytes to an offsettable memory reference. Useful when accessing the high 8 bytes of SSE values. For a memref in (%rax), it generates 8(%rax). | %H0 | 8(%rax) | 8[rax] |
| k | Print the SImode name of the register. | %k0 | %eax | eax |
| l | Print the label name with no punctuation. | %l3 | .L3 | .L3 |
| L | Print the opcode suffix of l. | %L0 | l | |
| N | Print maskz. | %N7 | {z} | {z} |
| p | Print raw symbol name (without syntax-specific prefixes). | %p2 | 42 | 42 |
| P | If used for a function, print the PLT suffix and generate PIC code. For example, emit foo@PLT instead of 'foo' for the function foo(). If used for a constant, drop all syntax-specific prefixes and issue the bare constant. See p above. | | | |
| q | Print the DImode name of the register. | %q0 | %rax | rax |
| Q | Print the opcode suffix of q. | %Q0 | q | |
| R | Print embedded rounding and sae. | %R4 | {rn-sae}, | , {rn-sae} |
| r | Print only sae. | %r4 | {sae}, | , {sae} |
| s | Print a shift double count, followed by the assembler's argument delimiter. | %s1 | $2, | 2, |
| S | Print the opcode suffix of s. | %S0 | s | |
| t | Print the V8SFmode name of the register. | %t5 | %ymm0 | ymm0 |
| T | Print the opcode suffix of t. | %T0 | t | |
| V | Print naked full integer register name without %. | %V0 | eax | eax |
| w | Print the HImode name of the register. | %w0 | %ax | ax |
| W | Print the opcode suffix of w. | %W0 | w | |
| x | Print the V4SFmode name of the register. | %x5 | %xmm0 | xmm0 |
| y | Print "st(0)" instead of "st" as a register. | %y6 | %st(0) | st(0) |
| z | Print the opcode suffix for the size of the current integer operand (one of b/w/l/q). | %z0 | l | |
| Z | Like z, with special suffixes for x87 instructions. | | | |

x86 Floating-Point asm Operands
On x86 targets, there are several rules on the usage of stack-like
registers in the operands of an asm. These rules apply only to the
operands that are stack-like registers:

- Given a set of input registers that die in an asm, it is necessary
  to know which are implicitly popped by the asm, and which must be
  explicitly popped by GCC.

  An input register that is implicitly popped by the asm must be
  explicitly clobbered, unless it is constrained to match an output
  operand.

- For any input register that is implicitly popped by an asm, it is
  necessary to know how to adjust the stack to compensate for the pop.
  If any non-popped input is closer to the top of the reg-stack than
  the implicitly popped register, it would not be possible to know
  what the stack looked like; it's not clear how the rest of the
  stack 'slides up'.

  All implicitly popped input registers must be closer to the top of
  the reg-stack than any input that is not implicitly popped.

  It is possible that if an input dies in an asm, the compiler might
  use the input register for an output reload. Consider this example:

      asm ("foo" : "=t" (a) : "f" (b));

  This code says that input b is not popped by the asm, and that
  the asm pushes a result onto the reg-stack, i.e., the stack is one
  deeper after the asm than it was before. But, it is possible that
  reload may think that it can use the same register for both the
  input and the output.

  To prevent this from happening, if any input operand uses the f
  constraint, all output register constraints must use the &
  early-clobber modifier.

  The example above is correctly written as:

      asm ("foo" : "=&t" (a) : "f" (b));

- Some operands need to be in particular places on the stack. All
  output operands fall in this category, because GCC has no other way
  to know which registers the outputs appear in unless you indicate
  this in the constraints.

  Output operands must specifically indicate which register an output
  appears in after an asm. =f is not allowed: the operand constraints
  must select a class with a single register.

- Output operands may not be 'inserted' between existing stack
  registers. Since no 387 opcode uses a read/write operand, all output
  operands are dead before the asm, and are pushed by the asm. It
  makes no sense to push anywhere but the top of the reg-stack.

  Output operands must start at the top of the reg-stack: output
  operands may not 'skip' a register.

- Some asm statements may need extra stack space for internal
  calculations. This can be guaranteed by clobbering stack registers
  unrelated to the inputs and outputs.

This asm takes one input, which is internally popped, and produces two
outputs.
asm ("fsincos" : "=t" (cos), "=u" (sin) : "0" (inp));
This asm takes two inputs, which are popped by the fyl2xp1 opcode, and
replaces them with one output. The st(1) clobber is necessary for the
compiler to know that fyl2xp1 pops both inputs.
asm ("fyl2xp1" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
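To try the fsincos statement in context, the sketch below wraps it in a function. The name x87_sincos and the convention of returning 0 on targets without x87 support are assumptions made for this example. After fsincos, the cosine is at the top of the register stack ("t") and the sine is just below it ("u").

```c
/* Computes the sine and cosine of angle with the x87 fsincos
   instruction. Returns 1 on success, or 0 if this target has no x87
   support, in which case the outputs are left untouched. */
int x87_sincos(double angle, double *sin_out, double *cos_out)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    double s, c;
    __asm__ ("fsincos"
             : "=t" (c), "=u" (s)   /* st(0) = cosine, st(1) = sine   */
             : "0" (angle));        /* input popped, tied to output 0 */
    *sin_out = s;
    *cos_out = c;
    return 1;
#else
    (void) angle;
    (void) sin_out;
    (void) cos_out;
    return 0;
#endif
}
```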
MSP430 Operand Modifiers
The list below describes the supported modifiers and their effects for
MSP430.

| Modifier | Description |
|---|---|
| A | Select low 16-bits of the constant/register/memory operand. |
| B | Select high 16-bits of the constant/register/memory operand. |
| C | Select bits 32-47 of the constant/register/memory operand. |
| D | Select bits 48-63 of the constant/register/memory operand. |
| H | Equivalent to B (for backwards compatibility). |
| I | Print the inverse (logical NOT) of the constant value. |
| J | Print an integer without a # prefix. |
| L | Equivalent to A (for backwards compatibility). |
| O | Offset of the current frame from the top of the stack. |
| Q | Use the A instruction postfix. |
| R | Inverse of condition code, for unsigned comparisons. |
| W | Subtract 16 from the constant value. |
| X | Use the X instruction postfix. |
| Y | Subtract 4 from the constant value. |
| Z | Subtract 1 from the constant value. |
| b | Append .B, .W or .A to the instruction, depending on the mode. |
| d | Offset 1 byte of a memory reference or constant value. |
| e | Offset 3 bytes of a memory reference or constant value. |
| f | Offset 5 bytes of a memory reference or constant value. |
| g | Offset 7 bytes of a memory reference or constant value. |
| p | Print the value of 2, raised to the power of the given constant. Used to select the specified bit position. |
| r | Inverse of condition code, for signed comparisons. |
| x | Equivalent to X, but only for pointers. |

Constraints for asm Operands
Here are specific details on what constraint letters you can use with
asm operands. Constraints can say whether an operand may be in a
register, and which kinds of register; whether the operand can be a
memory reference, and which kinds of address; whether the operand may be
an immediate constant, and which possible values it may have.
Constraints can also require two operands to match. Side-effects aren't
allowed in operands of inline asm, unless < or > constraints are used,
because there is no guarantee that the side effects will happen exactly
once in an instruction that can update the addressing register.
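As a small illustration of how constraint letters select the operand kind, this sketch combines a memory operand ("m") with an immediate constant ("i") in one x86 instruction. The function name add_five_in_memory is hypothetical, and the example assumes the default AT&T assembler dialect.

```c
/* Adds 5 directly to the int that p points to. */
void add_five_in_memory(int *p)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("addl %1, %0"
             : "+m" (*p)     /* read-modify-write memory operand */
             : "i" (5)       /* immediate integer constant       */
             : "cc");
#else
    *p += 5;                 /* fallback for other targets */
#endif
}
```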
Controlling Names Used in Assembler Code
You can specify the name to be used in the assembler code for a C
function or variable by writing the asm (or __asm__) keyword after
the declarator. It is up to you to make sure that the assembler names
you choose do not conflict with any other assembler symbols, or
reference registers.
Assembler names for data
This sample shows how to specify the assembler name for data:
int foo asm ("myfoo") = 2;
This specifies that the name to be used for the variable foo in the
assembler code should be myfoo rather than the usual _foo.
On systems where an underscore is normally prepended to the name of a C
variable, this feature allows you to define names for the linker that do
not start with an underscore.
GCC does not support using this feature with a non-static local variable
since such variables do not have assembler names. If you are trying to
put the variable in a particular register, see
explicit-register-variables.
Assembler names for functions
To specify the assembler name for functions, write a declaration for the
function before its definition and put asm there, like this:
int func (int x, int y) asm ("MYFUNC");
int func (int x, int y)
{
  /* ... */
}
This specifies that the name to be used for the function func in the
assembler code should be MYFUNC.
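A complete, compilable variant of the pattern might look like the sketch below. The names scale_by_two and SCALE_BY_TWO are hypothetical. C code keeps calling the function by its C name; only the symbol emitted to the assembler changes.

```c
/* Declaration carrying the assembler name; it precedes the definition. */
int scale_by_two(int x) __asm__ ("SCALE_BY_TWO");

/* C callers use scale_by_two; the emitted symbol is SCALE_BY_TWO. */
int scale_by_two(int x)
{
    return 2 * x;
}
```

Listing the symbols of the resulting object file, for example with nm, should show SCALE_BY_TWO rather than scale_by_two.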
Variables in Specified Registers
::: {#explicit-reg-vars}
GNU C allows you to associate specific hardware registers with C
variables. In almost all cases, allowing the compiler to assign
registers produces the best code. However under certain unusual
circumstances, more precise control over the variable storage is
required.
:::
Both global and local variables can be associated with a register. The
consequences of performing this association are very different between
the two, as explained in the sections below.
Defining Global Register Variables
::: {#global-reg-vars}
You can define a global register variable and associate it with a
specified register like this:
:::
register int *foo asm ("r12");
Here r12 is the name of the register that should be used. Note that
this is the same syntax used for defining local register variables, but
for a global variable the declaration appears outside a function. The
register keyword is required, and cannot be combined with static.
The register name must be a valid register name for the target platform.
Do not use type qualifiers such as const and volatile, as the
outcome may be contrary to expectations. In particular, using the
volatile qualifier does not fully prevent the compiler from optimizing
accesses to the register.
Registers are a scarce resource on most systems and allowing the
compiler to manage their usage usually results in the best code.
However, under special circumstances it can make sense to reserve some
globally. For example this may be useful in programs such as programming
language interpreters that have a couple of global variables that are
accessed very often.
After defining a global register variable, for the current compilation
unit:
- If the register is a call-saved register, call ABI is affected: the
  register will not be restored in function epilogue sequences after
  the variable has been assigned. Therefore, functions cannot safely
  return to callers that assume standard ABI.
- Conversely, if the register is a call-clobbered register, making
  calls to functions that use standard ABI may lose contents of the
  variable. Such calls may be created by the compiler even if none are
  evident in the original program, for example when libgcc functions
  are used to make up for unavailable instructions.
- Accesses to the variable may be optimized as usual and the register
  remains available for allocation and use in any computations,
  provided that observable values of the variable are not affected.
- If the variable is referenced in inline assembly, the type of access
  must be provided to the compiler via constraints (see constraints).
  Accesses from basic asms are not supported.
Note that these points only apply to code that is compiled with the
definition. The behavior of code that is merely linked in (for example
code from libraries) is not affected.
If you want to recompile source files that do not actually use your
global register variable so they do not use the specified register for
any other purpose, you need not actually add the global register
declaration to their source code. It suffices to specify the compiler
option -ffixed-reg (see code-gen-options) to reserve the register.
Declaring the variable
Global register variables cannot have initial values, because an
executable file has no means to supply initial contents for a register.
When selecting a register, choose one that is normally saved and
restored by function calls on your machine. This ensures that code which
is unaware of this reservation (such as library routines) will restore
it before returning.
On machines with register windows, be sure to choose a global register
that is not affected magically by the function call mechanism.
Using the variable
When calling routines that are not aware of the reservation, be cautious
if those routines call back into code which uses them. As an example, if
you call the system library version of qsort, it may clobber your
registers during execution, but (if you have selected appropriate
registers) it will restore them before returning. However it will not
restore them before calling qsort's comparison function. As a
result, global values will not reliably be available to the comparison
function unless the qsort function itself is rebuilt.
Similarly, it is not safe to access the global register variables from
signal handlers or from more than one thread of control. Unless you
recompile them specially for the task at hand, the system library
routines may temporarily use the register for other things. Furthermore,
since the register is not reserved exclusively for the variable,
accessing it from handlers of asynchronous signals may observe unrelated
temporary values residing in the register.
On most machines, longjmp restores to each global register variable
the value it had at the time of the setjmp. On some machines, however,
longjmp does not change the value of global register variables. To be
portable, the function that called setjmp should make other
arrangements to save the values of the global register variables, and to
restore them in a longjmp. This way, the same thing happens regardless
of what longjmp does.
Specifying Registers for Local Variables
::: {#local-reg-vars}
You can define a local register variable and associate it with a
specified register like this:
:::
register int *foo asm ("r12");
Here r12 is the name of the register that should be used. Note that
this is the same syntax used for defining global register variables, but
for a local variable the declaration appears within a function. The
register keyword is required, and cannot be combined with static.
The register name must be a valid register name for the target platform.
Do not use type qualifiers such as const and volatile, as the
outcome may be contrary to expectations. In particular, when the const
qualifier is used, the compiler may substitute the variable with its
initializer in asm statements, which may cause the corresponding
operand to appear in a different register.
As with global register variables, it is recommended that you choose a
register that is normally saved and restored by function calls on your
machine, so that calls to library routines will not clobber it.
The only supported use for this feature is to specify registers for
input and output operands when calling Extended asm (see
extended-asm). This may be necessary if the constraints for a
particular machine don't provide sufficient control to select the
desired register. To force an operand into a register, create a local
variable and specify the register name after the variable's
declaration. Then use the local variable for the asm operand and
specify any constraint letter that matches the register:
register int *p1 asm ("r0") = ...;
register int *p2 asm ("r1") = ...;
register int *result asm ("r0");
asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
::: warning
::: title
Warning
:::
In the above example, be aware that a register (for example r0) can be
call-clobbered by subsequent code, including function calls and library
calls for arithmetic operators on other variables (for example the
initialization of p2). In this case, use temporary variables for
expressions between the register assignments:
:::
int *t1 = ...;
register int *p1 asm ("r0") = ...;
register int *p2 asm ("r1") = t1;
register int *result asm ("r0");
asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
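The sysint examples use placeholder registers and a fictitious instruction. As a runnable sketch of the same technique on x86-64, the function below pins its operands to r8 and r9, registers for which x86 has no dedicated constraint letter, and then uses the generic "r" constraint. The function name add_via_fixed_regs and the choice of registers are assumptions for illustration only.

```c
/* Adds two values, forcing the asm operands into r8 and r9. */
long add_via_fixed_regs(long a, long b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    register long lhs __asm__ ("r8") = a;
    register long rhs __asm__ ("r9") = b;
    /* "r" matches any general register; the register variables
       decide which one is actually used for each operand. */
    __asm__ ("addq %1, %0"
             : "+r" (lhs)
             : "r" (rhs)
             : "cc");
    return lhs;
#else
    return a + b;            /* fallback for other targets */
#endif
}
```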
Defining a register variable does not reserve the register. Other than
when invoking the Extended asm, the contents of the specified register
are not guaranteed. For this reason, the following uses are explicitly
not supported. If they appear to work, it is only happenstance, and
may stop working as intended due to (seemingly) unrelated changes in
surrounding code, or even minor changes in the optimization of a future
version of gcc:
- Passing parameters to or from Basic asm
- Passing parameters to or from Extended asm without using input or
  output operands.
- Passing parameters to or from routines written in assembler (or
  other languages) using non-standard calling conventions.
Some developers use Local Register Variables in an attempt to improve
gcc's allocation of registers, especially in large functions. In this
case the register name is essentially a hint to the register allocator.
While in some instances this can generate better code, improvements are
subject to the whims of the allocator/optimizers. Since there are no
guarantees that your improvements won't be lost, this usage of Local
Register Variables is discouraged.
On the MIPS platform, there is a related use for local register
variables with slightly different characteristics (see
gccint:mips-coprocessors).
Size of an asm
Some targets require that GCC track the size of each instruction used in
order to generate correct code. Because the final length of the code
produced by an asm statement is only known by the assembler, GCC must
make an estimate as to how big it will be. It does this by counting the
number of instructions in the pattern of the asm and multiplying that
by the length of the longest instruction supported by that processor.
(When working out the number of instructions, it assumes that any
occurrence of a newline or of whatever statement separator character is
supported by the assembler, typically ;, indicates the end of an
instruction.)
Normally, GCC's estimate is adequate to ensure that correct code is
generated, but it is possible to confuse the compiler if you use pseudo
instructions or assembler macros that expand into multiple real
instructions, or if you use assembler directives that expand to more
space in the object file than is needed for a single instruction. If
this happens then the assembler may produce a diagnostic saying that a
label is unreachable.
This size is also used for inlining decisions. If you use asm inline
instead of just asm, then for inlining purposes the size of the asm is
taken as the minimum size, ignoring how many instructions GCC thinks it
is.
This article is from cnblogs (博客园). Author: ijpq. When reposting, please credit the original link: https://www.cnblogs.com/ijpq/p/18287218