unsafe 非类型安全指针 Uint and uintptr in golang 非类型安全指针
小结
1、
概念 转换
A uintptr is an integer, not a reference.
Converting a Pointer to a uintptr creates an integer value with no pointer semantics.
Even if a uintptr holds the address of some object,the garbage collector will not update that uintptr's value
if the object moves, nor will that uintptr keep the object from being reclaimed.
Converting a Pointer to a uintptr produces the memory address of the value pointed at, as an integer.
a Pointer----> <----a pointer value of any type
a Pointer----> <----a uintptr
1.1.2
a Pointer 1---->a uintptr----> a Pointer 2
应用:
access fields in a struct or elements of an array:
equivalent to f := unsafe.Pointer(&s.f)
f := unsafe.Pointer(uintptr(unsafe.Pointer(&s)) + unsafe.Offsetof(s.f))
equivalent to e := unsafe.Pointer(&x[i])
e := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) + i*unsafe.Sizeof(x[0]))
If p points into an allocated object, it can be advanced through the object
by conversion to uintptr, addition of an offset, and conversion back to Pointer.
将指针转化为uintptr,再加上偏移量,最后转换回指针:可以用来访问结构体的字段或数组的元素
注意还有其他用法,详情请见 源码 src\unsafe\unsafe.go
unsafe\unsafe.go
// Copyright 2009 The Go Authors. All rights reserved. // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. /* Package unsafe contains operations that step around the type safety of Go programs. Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines. */ package unsafe // ArbitraryType is here for the purposes of documentation only and is not actually // part of the unsafe package. It represents the type of an arbitrary Go expression. type ArbitraryType int // IntegerType is here for the purposes of documentation only and is not actually // part of the unsafe package. It represents any arbitrary integer type. type IntegerType int // Pointer represents a pointer to an arbitrary type. There are four special operations // available for type Pointer that are not available for other types: // - A pointer value of any type can be converted to a Pointer. // - A Pointer can be converted to a pointer value of any type. // - A uintptr can be converted to a Pointer. // - A Pointer can be converted to a uintptr. // Pointer therefore allows a program to defeat the type system and read and write // arbitrary memory. It should be used with extreme care. // // The following patterns involving Pointer are valid. // Code not using these patterns is likely to be invalid today // or to become invalid in the future. // Even the valid patterns below come with important caveats. // // Running "go vet" can help find uses of Pointer that do not conform to these patterns, // but silence from "go vet" is not a guarantee that the code is valid. // // (1) Conversion of a *T1 to Pointer to *T2. // // Provided that T2 is no larger than T1 and that the two share an equivalent // memory layout, this conversion allows reinterpreting data of one type as // data of another type. An example is the implementation of // math.Float64bits: // // func Float64bits(f float64) uint64 { // return *(*uint64)(unsafe.Pointer(&f)) // } // // (2) Conversion of a Pointer to a uintptr (but not back to Pointer). // // Converting a Pointer to a uintptr produces the memory address of the value // pointed at, as an integer. The usual use for such a uintptr is to print it. // // Conversion of a uintptr back to Pointer is not valid in general. // // A uintptr is an integer, not a reference. // Converting a Pointer to a uintptr creates an integer value // with no pointer semantics. // Even if a uintptr holds the address of some object, // the garbage collector will not update that uintptr's value // if the object moves, nor will that uintptr keep the object // from being reclaimed. // // The remaining patterns enumerate the only valid conversions // from uintptr to Pointer. // // (3) Conversion of a Pointer to a uintptr and back, with arithmetic. // // If p points into an allocated object, it can be advanced through the object // by conversion to uintptr, addition of an offset, and conversion back to Pointer. // // p = unsafe.Pointer(uintptr(p) + offset) // // The most common use of this pattern is to access fields in a struct // or elements of an array: // // // equivalent to f := unsafe.Pointer(&s.f) // f := unsafe.Pointer(uintptr(unsafe.Pointer(&s)) + unsafe.Offsetof(s.f)) // // // equivalent to e := unsafe.Pointer(&x[i]) // e := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) + i*unsafe.Sizeof(x[0])) // // It is valid both to add and to subtract offsets from a pointer in this way. // It is also valid to use &^ to round pointers, usually for alignment. // In all cases, the result must continue to point into the original allocated object. // // Unlike in C, it is not valid to advance a pointer just beyond the end of // its original allocation: // // // INVALID: end points outside allocated space. // var s thing // end = unsafe.Pointer(uintptr(unsafe.Pointer(&s)) + unsafe.Sizeof(s)) // // // INVALID: end points outside allocated space. // b := make([]byte, n) // end = unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) + uintptr(n)) // // Note that both conversions must appear in the same expression, with only // the intervening arithmetic between them: // // // INVALID: uintptr cannot be stored in variable // // before conversion back to Pointer. // u := uintptr(p) // p = unsafe.Pointer(u + offset) // // Note that the pointer must point into an allocated object, so it may not be nil. // // // INVALID: conversion of nil pointer // u := unsafe.Pointer(nil) // p := unsafe.Pointer(uintptr(u) + offset) // // (4) Conversion of a Pointer to a uintptr when calling syscall.Syscall. // // The Syscall functions in package syscall pass their uintptr arguments directly // to the operating system, which then may, depending on the details of the call, // reinterpret some of them as pointers. // That is, the system call implementation is implicitly converting certain arguments // back from uintptr to pointer. // // If a pointer argument must be converted to uintptr for use as an argument, // that conversion must appear in the call expression itself: // // syscall.Syscall(SYS_READ, uintptr(fd), uintptr(unsafe.Pointer(p)), uintptr(n)) // // The compiler handles a Pointer converted to a uintptr in the argument list of // a call to a function implemented in assembly by arranging that the referenced // allocated object, if any, is retained and not moved until the call completes, // even though from the types alone it would appear that the object is no longer // needed during the call. // // For the compiler to recognize this pattern, // the conversion must appear in the argument list: // // // INVALID: uintptr cannot be stored in variable // // before implicit conversion back to Pointer during system call. // u := uintptr(unsafe.Pointer(p)) // syscall.Syscall(SYS_READ, uintptr(fd), u, uintptr(n)) // // (5) Conversion of the result of reflect.Value.Pointer or reflect.Value.UnsafeAddr // from uintptr to Pointer. // // Package reflect's Value methods named Pointer and UnsafeAddr return type uintptr // instead of unsafe.Pointer to keep callers from changing the result to an arbitrary // type without first importing "unsafe". However, this means that the result is // fragile and must be converted to Pointer immediately after making the call, // in the same expression: // // p := (*int)(unsafe.Pointer(reflect.ValueOf(new(int)).Pointer())) // // As in the cases above, it is invalid to store the result before the conversion: // // // INVALID: uintptr cannot be stored in variable // // before conversion back to Pointer. // u := reflect.ValueOf(new(int)).Pointer() // p := (*int)(unsafe.Pointer(u)) // // (6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer. // // As in the previous case, the reflect data structures SliceHeader and StringHeader // declare the field Data as a uintptr to keep callers from changing the result to // an arbitrary type without first importing "unsafe". However, this means that // SliceHeader and StringHeader are only valid when interpreting the content // of an actual slice or string value. // // var s string // hdr := (*reflect.StringHeader)(unsafe.Pointer(&s)) // case 1 // hdr.Data = uintptr(unsafe.Pointer(p)) // case 6 (this case) // hdr.Len = n // // In this usage hdr.Data is really an alternate way to refer to the underlying // pointer in the string header, not a uintptr variable itself. // // In general, reflect.SliceHeader and reflect.StringHeader should be used // only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual // slices or strings, never as plain structs. // A program should not declare or allocate variables of these struct types. // // // INVALID: a directly-declared header will not hold Data as a reference. // var hdr reflect.StringHeader // hdr.Data = uintptr(unsafe.Pointer(p)) // hdr.Len = n // s := *(*string)(unsafe.Pointer(&hdr)) // p possibly already lost // type Pointer *ArbitraryType // Sizeof takes an expression x of any type and returns the size in bytes // of a hypothetical variable v as if v was declared via var v = x. // The size does not include any memory possibly referenced by x. // For instance, if x is a slice, Sizeof returns the size of the slice // descriptor, not the size of the memory referenced by the slice. // The return value of Sizeof is a Go constant. func Sizeof(x ArbitraryType) uintptr // Offsetof returns the offset within the struct of the field represented by x, // which must be of the form structValue.field. In other words, it returns the // number of bytes between the start of the struct and the start of the field. // The return value of Offsetof is a Go constant. func Offsetof(x ArbitraryType) uintptr // Alignof takes an expression x of any type and returns the required alignment // of a hypothetical variable v as if v was declared via var v = x. // It is the largest value m such that the address of v is always zero mod m. // It is the same as the value returned by reflect.TypeOf(x).Align(). // As a special case, if a variable s is of struct type and f is a field // within that struct, then Alignof(s.f) will return the required alignment // of a field of that type within a struct. This case is the same as the // value returned by reflect.TypeOf(s.f).FieldAlign(). // The return value of Alignof is a Go constant. func Alignof(x ArbitraryType) uintptr // The function Add adds len to ptr and returns the updated pointer // Pointer(uintptr(ptr) + uintptr(len)). // The len argument must be of integer type or an untyped constant. // A constant len argument must be representable by a value of type int; // if it is an untyped constant it is given type int. // The rules for valid uses of Pointer still apply. func Add(ptr Pointer, len IntegerType) Pointer // The function Slice returns a slice whose underlying array starts at ptr // and whose length and capacity are len. // Slice(ptr, len) is equivalent to // // (*[len]ArbitraryType)(unsafe.Pointer(ptr))[:] // // except that, as a special case, if ptr is nil and len is zero, // Slice returns nil. // // The len argument must be of integer type or an untyped constant. // A constant len argument must be non-negative and representable by a value of type int; // if it is an untyped constant it is given type int. // At run time, if len is negative, or if ptr is nil and len is not zero, // a run-time panic occurs. func Slice(ptr *ArbitraryType, len IntegerType) []ArbitraryType
math\unsafe.go
// Copyright 2009 The Go Authors. All rights reserved. // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. package math import "unsafe" // Float32bits returns the IEEE 754 binary representation of f, // with the sign bit of f and the result in the same bit position. // Float32bits(Float32frombits(x)) == x. func Float32bits(f float32) uint32 { return *(*uint32)(unsafe.Pointer(&f)) } // Float32frombits returns the floating-point number corresponding // to the IEEE 754 binary representation b, with the sign bit of b // and the result in the same bit position. // Float32frombits(Float32bits(x)) == x. func Float32frombits(b uint32) float32 { return *(*float32)(unsafe.Pointer(&b)) } // Float64bits returns the IEEE 754 binary representation of f, // with the sign bit of f and the result in the same bit position, // and Float64bits(Float64frombits(x)) == x. func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) } // Float64frombits returns the floating-point number corresponding // to the IEEE 754 binary representation b, with the sign bit of b // and the result in the same bit position. // Float64frombits(Float64bits(x)) == x. func Float64frombits(b uint64) float64 { return *(*float64)(unsafe.Pointer(&b)) }
package main import ( "fmt" "unsafe" ) func f() { a := [16]int{3: 3, 9: 9, 11: 11} eleSize := int(unsafe.Sizeof(a[0])) p9 := &a[9] up9 := unsafe.Pointer(p9) p3 := (*int)(unsafe.Add(up9, -6*eleSize)) s := unsafe.Slice(p9, 5)[:3] t := unsafe.Slice((*int)(nil), 0) _ = unsafe.Add(up9, 7*eleSize) _ = unsafe.Slice(p9, 8) fmt.Println(a) fmt.Println(*p3) fmt.Println(s) fmt.Println(len(s), cap(s)) fmt.Println(t == nil) i := unsafe.Add(up9, 7*eleSize) j := unsafe.Add(up9, 16*eleSize) i1 := unsafe.Add(up9, 1*eleSize) j1 := unsafe.Add(up9, 2*eleSize) j3 := unsafe.Add(up9, 17*eleSize) fmt.Println(i, j) fmt.Println(i1, j1, j3) } func main() { f() }
[0 0 0 3 0 0 0 0 0 9 0 11 0 0 0 0]
3
[9 0 11]
3 5
true
0xc0000d4100 0xc0000d4148
0xc0000d40d0 0xc0000d40d8 0xc0000d4150
unsafe package - unsafe - pkg.go.dev https://pkg.go.dev/unsafe
https://pkg.go.dev/unsafe
非类型安全指针 - Go语言101(通俗版Go白皮书) https://gfw.go101.org/article/unsafe.html
(1) Conversion of a *T1 to Pointer to *T2.
Provided that T2 is no larger than T1 and that the two share an equivalent memory layout, this conversion allows reinterpreting data of one type as data of another type. An example is the implementation of math.Float64bits:
func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) }
Understanding uintptr in Golang - Welcome To Golang By Example https://golangbyexample.com/understanding-uintptr-golang/
Overview
This is an unsigned integer type which is large enough to hold any pointer address. Therefore its size is platform dependent. It is just an integer representation of an address.
Properties
- A uintptr can be converted to unsafe.Pointer and viceversa. Later we will talk about where conversion of uintptr to unsafe.Pointer is useful.
- Arithmetic can be performed on the uintptr. Do note here arithmetic cannot be performed in a pointer in Go or unsafe.Pointer in Go.
- uintptr even though it holds a pointer address, is just a value and does not reference any object. Therefore
- Its value will not be updated if the corresponding object moves. Eg When goroutine stack changes
- The corresponding object can be garbage collected. The GC does not consider uintptr as live references and hence they can be garbage collected.
Purpose
uintptr can be used for below purposes:
- One purpose of uintptr is to be used along with unsafe.Pointer for unsafe memory access. Arithmetic operations cannot be performed on unsafe.Pointer. To perform such arithmetic
- unsafe. Pointer is converted to uintptr
- arithmetic is then performed on uintptr
- uintptr is converted back to unsafe.Pointer to access the object now pointed by the address
Be careful that the above steps should be atomic with respect to Garbage Collector, otherwise it could lead to issues. For eg after the first step 1, the referenced object is liable to be collection. If that happens then after step 3, the pointer will be an invalid Go pointer and can crash the program. Look at the unsafe package documentation.
https://golang.org/pkg/unsafe/#Pointer
It lists down when the above conversion can be safe. See the below code for the scenario mentioned above.
In the below code we are doing arithmetic like below to get to address of field “b” in struct sample and then printing the value at that address. This below code is atomic with reference to the garbage collector.
p := unsafe.Pointer(uintptr(unsafe.Pointer(s)) + unsafe.Offsetof(s.b))
package main
import (
"fmt"
"unsafe"
)
type sample struct {
a int
b string
}
func main() {
s := &sample{a: 1, b: "test"}
//Getting the address of field b in struct s
p := unsafe.Pointer(uintptr(unsafe.Pointer(s)) + unsafe.Offsetof(s.b))
//Typecasting it to a string pointer and printing the value of it
fmt.Println(*(*string)(p))
}
Output:
test
- Another purpose of uintptr is when you want to save the pointer address value for printing it or storing it. Since the address is just stored and does not reference anything, the corresponding object can be garbage collected.
See below code where we are converting an unsafe.Pointer to uintptr and printing it. Also, note as mentioned before too one the unsafe.Pointer is converted to uinptr, the reference is lost and the reference variable can be garbage collected.
package main
import (
"fmt"
"unsafe"
)
type sample struct {
a int
b string
}
func main() {
s := &sample{
a: 1,
b: "test",
}
//Get the address as a uintptr
startAddress := uintptr(unsafe.Pointer(s))
fmt.Printf("Start Address of s: %d\n", startAddress)
}
Output:
The output will be dependent upon the machine as it is an address.
Start Address of s: 824634330992
atomic.LoadUintptr() Function in Golang With Examples - GeeksforGeeks https://www.geeksforgeeks.org/atomic-loaduintptr-function-in-golang-with-examples/
In Go language, atomic packages supply lower-level atomic memory that is helpful is implementing synchronization algorithms. The LoadUintptr() function in Go language is used to atomically load *addr. This function is defined under the atomic package. Here, you need to import “sync/atomic” package in order to use these functions.
Syntax:
func LoadUintptr(addr *uintptr) (val uintptr)
Here, addr indicates address.
Note: (*uintptr) is the pointer to a uintptr value. And uintptr is an integer type that is too large that it can contain the bit pattern of any pointer.
Return value: It returns the value loaded to the *addr.
Example 1:
// Program to illustrate the usage of // LoadUintptr function in Golang // Including main package package main // importing fmt and sync/atomic import ( "fmt" "sync/atomic" ) // Main function func main() { // Assigning values // to the uintptr var ( i uintptr = 98 j uintptr = 255 k uintptr = 6576567667788 l uintptr = 5 ) // Calling LoadUintptr method // with its parameters load_1 := atomic.LoadUintptr(&i) load_2 := atomic.LoadUintptr(&j) load_3 := atomic.LoadUintptr(&k) load_4 := atomic.LoadUintptr(&l) // Displays uintptr value // loaded in the *addr fmt.Println(load_1) fmt.Println(load_2) fmt.Println(load_3) fmt.Println(load_4) } |
Output:
98 255 6576567667788 5
Example 2:
// Program to illustrate the usage of // LoadUintptr function in Golang // Including main package package main // Importing fmt and sync/atomic import ( "fmt" "sync/atomic" ) // Main function func main() { // Declaring u var u uintptr // For loop for i := 1; i < 1000; i += 1 { // Function with // AddUintptr method go func() { atomic.AddUintptr(&u, 9) }() } // Prints loaded values address fmt.Println(atomic.LoadUintptr(&u)) } |
Output:
1818 // A random value is returned in each run
In the above example, the new values are returned from AddUintptr() method in each call until the loop stops, LoadUintptr() method loads these new uintptr values. And these values are stored in different addresses which can be random one so, the output of the LoadUintptr() method here in each run is different. So, here a random value is returned in the output.
go - Uint and uintptr in golang - Stack Overflow https://stackoverflow.com/questions/69182402/uint-and-uintptr-in-golang
非类型安全指针 - Go语言101(通俗版Go白皮书) https://gfw.go101.org/article/unsafe.html
unsafe - The Go Programming Language https://golang.google.cn/pkg/unsafe/#Pointer
type Pointer
Pointer represents a pointer to an arbitrary type. There are four special operations available for type Pointer that are not available for other types:
- A pointer value of any type can be converted to a Pointer. - A Pointer can be converted to a pointer value of any type. - A uintptr can be converted to a Pointer. - A Pointer can be converted to a uintptr.
Pointer therefore allows a program to defeat the type system and read and write arbitrary memory. It should be used with extreme care.
The following patterns involving Pointer are valid. Code not using these patterns is likely to be invalid today or to become invalid in the future. Even the valid patterns below come with important caveats.
Running "go vet" can help find uses of Pointer that do not conform to these patterns, but silence from "go vet" is not a guarantee that the code is valid.
(1) Conversion of a *T1 to Pointer to *T2.
Provided that T2 is no larger than T1 and that the two share an equivalent memory layout, this conversion allows reinterpreting data of one type as data of another type. An example is the implementation of math.Float64bits:
func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) }
(2) Conversion of a Pointer to a uintptr (but not back to Pointer).
Converting a Pointer to a uintptr produces the memory address of the value pointed at, as an integer. The usual use for such a uintptr is to print it.
Conversion of a uintptr back to Pointer is not valid in general.
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
The remaining patterns enumerate the only valid conversions from uintptr to Pointer.
(3) Conversion of a Pointer to a uintptr and back, with arithmetic.
If p points into an allocated object, it can be advanced through the object by conversion to uintptr, addition of an offset, and conversion back to Pointer.
p = unsafe.Pointer(uintptr(p) + offset)
The most common use of this pattern is to access fields in a struct or elements of an array:
// equivalent to f := unsafe.Pointer(&s.f) f := unsafe.Pointer(uintptr(unsafe.Pointer(&s)) + unsafe.Offsetof(s.f)) // equivalent to e := unsafe.Pointer(&x[i]) e := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) + i*unsafe.Sizeof(x[0]))
It is valid both to add and to subtract offsets from a pointer in this way. It is also valid to use &^ to round pointers, usually for alignment. In all cases, the result must continue to point into the original allocated object.
Unlike in C, it is not valid to advance a pointer just beyond the end of its original allocation:
// INVALID: end points outside allocated space. var s thing end = unsafe.Pointer(uintptr(unsafe.Pointer(&s)) + unsafe.Sizeof(s)) // INVALID: end points outside allocated space. b := make([]byte, n) end = unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) + uintptr(n))
Note that both conversions must appear in the same expression, with only the intervening arithmetic between them:
// INVALID: uintptr cannot be stored in variable // before conversion back to Pointer. u := uintptr(p) p = unsafe.Pointer(u + offset)
Note that the pointer must point into an allocated object, so it may not be nil.
// INVALID: conversion of nil pointer u := unsafe.Pointer(nil) p := unsafe.Pointer(uintptr(u) + offset)
(4) Conversion of a Pointer to a uintptr when calling syscall.Syscall.
The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.
If a pointer argument must be converted to uintptr for use as an argument, that conversion must appear in the call expression itself:
syscall.Syscall(SYS_READ, uintptr(fd), uintptr(unsafe.Pointer(p)), uintptr(n))
The compiler handles a Pointer converted to a uintptr in the argument list of a call to a function implemented in assembly by arranging that the referenced allocated object, if any, is retained and not moved until the call completes, even though from the types alone it would appear that the object is no longer needed during the call.
For the compiler to recognize this pattern, the conversion must appear in the argument list:
// INVALID: uintptr cannot be stored in variable // before implicit conversion back to Pointer during system call. u := uintptr(unsafe.Pointer(p)) syscall.Syscall(SYS_READ, uintptr(fd), u, uintptr(n))
(5) Conversion of the result of reflect.Value.Pointer or reflect.Value.UnsafeAddr from uintptr to Pointer.
Package reflect's Value methods named Pointer and UnsafeAddr return type uintptr instead of unsafe.Pointer to keep callers from changing the result to an arbitrary type without first importing "unsafe". However, this means that the result is fragile and must be converted to Pointer immediately after making the call, in the same expression:
p := (*int)(unsafe.Pointer(reflect.ValueOf(new(int)).Pointer()))
As in the cases above, it is invalid to store the result before the conversion:
// INVALID: uintptr cannot be stored in variable // before conversion back to Pointer. u := reflect.ValueOf(new(int)).Pointer() p := (*int)(unsafe.Pointer(u))
(6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer.
As in the previous case, the reflect data structures SliceHeader and StringHeader declare the field Data as a uintptr to keep callers from changing the result to an arbitrary type without first importing "unsafe". However, this means that SliceHeader and StringHeader are only valid when interpreting the content of an actual slice or string value.
var s string hdr := (*reflect.StringHeader)(unsafe.Pointer(&s)) // case 1 hdr.Data = uintptr(unsafe.Pointer(p)) // case 6 (this case) hdr.Len = n
In this usage hdr.Data is really an alternate way to refer to the underlying pointer in the string header, not a uintptr variable itself.
In general, reflect.SliceHeader and reflect.StringHeader should be used only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual slices or strings, never as plain structs. A program should not declare or allocate variables of these struct types.
// INVALID: a directly-declared header will not hold Data as a reference. var hdr reflect.StringHeader hdr.Data = uintptr(unsafe.Pointer(p)) hdr.Len = n s := *(*string)(unsafe.Pointer(&hdr)) // p possibly already lost
type Pointer *ArbitraryType
func Add
func Add(ptr Pointer, len IntegerType) Pointer
The function Add adds len to ptr and returns the updated pointer Pointer(uintptr(ptr) + uintptr(len)). The len argument must be of integer type or an untyped constant. A constant len argument must be representable by a value of type int; if it is an untyped constant it is given type int. The rules for valid uses of Pointer still apply.
非类型安全指针
我们已经从Go中的指针一文中学习到关于指针的各种概念和规则。 从那篇文章中,我们得知,相对于C指针,Go指针有很多限制。 比如,Go指针不支持算术运算,并且对于任意两个指针值,很可能它们不能转换到对方的类型。
事实上,在那篇文章中解释的指针的完整称呼应该为类型安全指针。 虽然类型安全指针有助于我们轻松写出安全的代码,但是有时候施加在类型安全指针上的限制也确实导致我们不能写出最高效的代码。
实际上,Go也支持限制较少的非类型安全指针。 非类型安全指针和C指针类似,它们都很强大,但同时也都很危险。 在某些情形下,通过非类型安全指针的帮助,我们可以写出效率更高的代码; 但另一方面,使用非类型安全指针也导致我们可能轻易地写出潜在的不安全的代码,这些潜在的不安全点很难在它们产生危害之前被及时发现。
使用非类型安全指针的另外一个较大的风险是Go中目前提供的非类型安全指针机制并不受到Go 1 兼容性保证的保护。 使用了非类型安全指针的代码可能从今后的某个Go版本开始将不再能编译通过,或者运行行为发生了变化。
如果出于种种原因,你确实希望在你的代码中使用非类型安全指针,你不仅需要提防上述风险,你还需遵守Go官方文档中列出的非类型安全指针使用模式,并清楚地知晓使用非类型安全指针带来的效果。否则,你很难使用非类型安全指针写出安全的代码。
关于unsafe
标准库包
unsafe
标准库包来使用非类型安全指针。 非类型安全指针unsafe.Pointer
被声明定义为:
type Pointer *ArbitraryType
当然,这不是一个普通的类型定义。这里的ArbitraryType
仅仅是暗示unsafe.Pointer
类型值可以被转换为任意类型安全指针(反之亦然)。换句话说,unsafe.Pointer
类似于C语言中的void*
。
非类型安全指针是指底层类型为unsafe.Pointer
的类型。
非类型安全指针的零值也使用预声明的nil
标识符来表示。
unsafe
标准库包只提供了三个函数:
func Alignof(variable ArbitraryType) uintptr
。 此函数用来取得一个值在内存中的地址对齐保证(address alignment guarantee)。 注意,同一个类型的值做为结构体字段和非结构体字段时地址对齐保证可能是不同的。 当然,这和具体编译器的实现有关。对于目前的标准编译器,同一个类型的值做为结构体字段和非结构体字段时的地址对齐保证总是相同的。 gccgo编译器对这两种情形是区别对待的。func Offsetof(selector ArbitraryType) uintptr
。 此函数用来取得一个结构体值的某个字段的地址相对于此结构体值的地址的偏移。 在一个程序中,对于同一个结构体类型的不同值的对应相同字段,此函数的返回值总是相同的。func Sizeof(variable ArbitraryType) uintptr
。 此函数用来取得一个值的尺寸(亦即此值的类型的尺寸)。 在一个程序中,对于同一个类型的不同值,此函数的返回值总是相同的。
- 这三个函数的返回值的类型均为内置类型
uintptr
。下面我们将了解到uintptr
类型的值可以转换为非类型安全指针(反之亦然)。 - 尽管这三个函数之一的任何调用的返回结果在同一个编译好的程序中总是一致的,但是这样的一个调用在不同架构的操作系统中(或者使用不同的编译器编译时)的返回值可能是不一样的。
- 这三个函数的调用总是在编译时刻被估值,估值结果为类型为
uintptr
的常量。 - 传递给
Offsetof
函数的实参必须为一个字段选择器形式value.field
。 此选择器可以表示一个内嵌字段,但此选择器的路径中不能包含指针类型的隐式字段。
package main
import "fmt"
import "unsafe"
func main() {
var x struct {
a int64
b bool
c string
}
const M, N = unsafe.Sizeof(x.c), unsafe.Sizeof(x)
fmt.Println(M, N) // 16 32
fmt.Println(unsafe.Alignof(x.a)) // 8
fmt.Println(unsafe.Alignof(x.b)) // 1
fmt.Println(unsafe.Alignof(x.c)) // 8
fmt.Println(unsafe.Offsetof(x.a)) // 0
fmt.Println(unsafe.Offsetof(x.b)) // 8
fmt.Println(unsafe.Offsetof(x.c)) // 16
}
下面是一个展示了上面提到的最后一个注意点的例子:
package main
import "fmt"
import "unsafe"
func main() {
type T struct {
c string
}
type S struct {
b bool
}
var x struct {
a int64
*S
T
}
fmt.Println(unsafe.Offsetof(x.a)) // 0
fmt.Println(unsafe.Offsetof(x.S)) // 8
fmt.Println(unsafe.Offsetof(x.T)) // 16
// 此行可以编译过,因为选择器x.c中的隐含字段T为非指针。
fmt.Println(unsafe.Offsetof(x.c)) // 16
// 此行编译不过,因为选择器x.b中的隐含字段S为指针。
//fmt.Println(unsafe.Offsetof(x.b)) // error
// 此行可以编译过,但是它将打印出字段b在x.S中的偏移量.
fmt.Println(unsafe.Offsetof(x.S.b)) // 0
}
注意,上面程序中的注释所暗示的输出结果是此程序在AMD64架构上使用标准编译器1.17版本编译时的结果。
unsafe
包提供的这三个函数看上去并不怎么危险。 它们的原型在以后的Go 1版本中几乎不可能会发生改变。 Rob Pike甚至曾经将这几个函数挪到其它包中。 unsafe
包的危险性基本上来自于非类型安全指针。它们和C指针一样危险,这是Go安全指针千方百计设法去避免的。
IntegerType
。它的定义如下。 此类型不代表着一个具体类型,它只是表示任意整数类型(有点泛型的意思)。
type IntegerType int
Go 1.17引入的两个函数为:
func Add(ptr Pointer, len IntegerType) Pointer
。 此函数在一个(非安全)指针表示的地址上添加一个偏移量,然后返回表示新地址的一个指针。 此函数以一种更正规的形式部分地覆盖了下面将要介绍的使用模式3中展示的合法用法。func Slice(ptr *ArbitraryType, len IntegerType) []ArbitraryType
。 此函数用来从一个任意(安全)指针派生出一个指定长度的切片。
package main
import (
"fmt"
"unsafe"
)
func main() {
a := [16]int{3: 3, 9: 9, 11: 11}
fmt.Println(a)
eleSize := int(unsafe.Sizeof(a[0]))
p9 := &a[9]
up9 := unsafe.Pointer(p9)
p3 := (*int)(unsafe.Add(up9, -6 * eleSize))
fmt.Println(*p3) // 3
s := unsafe.Slice(p9, 5)[:3]
fmt.Println(s) // [9 0 11]
fmt.Println(len(s), cap(s)) // 3 5
t := unsafe.Slice((*int)(nil), 0)
fmt.Println(t == nil) // true
// 下面是两个不正确的调用。因为它们
// 的返回结果引用了未知的内存块。
_ = unsafe.Add(up9, 7 * eleSize)
_ = unsafe.Slice(p9, 8)
}
非类型安全指针相关的类型转换
- 一个类型安全指针值可以被显式转换为一个非类型安全指针类型,反之亦然。
- 一个uintptr值可以被显式转换为一个非类型安全指针类型,反之亦然。 但是,注意,一个nil非类型安全指针类型不应该被转换为uintptr并进行算术运算后再转换回来。
通过使用这些转换规则,我们可以将任意两个类型安全指针转换为对方的类型,我们也可以将一个安全指针值和一个uintptr值转换为对方的类型。
然而,尽管这些转换在编译时刻是合法的,但是它们中一些在运行时刻并非是合法和安全的。 这些转换摧毁了Go的类型系统(不包括非类型安全指针部分)精心设立的内存安全屏障。 我们必须遵循本文后面要介绍的一些用法指示来使用非类型安全指针才能写出合法并安全的代码。
我们需要知道的一些事实
在开始介绍合法的非类型安全指针使用模式之前,我们需要知道一些事实。
事实一:非类型安全指针值是指针但uintptr值是整数
每一个非零安全或者不安全指针值均引用着另一个值。但是一个uintptr值并不引用任何值,它被看作是一个整数,尽管常常它存储的是一个地址的数字表示。
Go是一门支持垃圾回收的语言。 当一个Go程序在运行中,Go运行时(runtime)将不时地检查哪些内存块将不再被程序中的任何仍在使用中的值所引用并且回收这些内存块。 指针在这一过程中扮演着重要的角色。值与值之间和内存块与值之间的引用关系是通过指针来表征的。
既然一个uintptr值是一个整数,那么它可以参与算术运算。
下一节中的例子将展示指针和uintptr值的不同。
事实二:不再被使用的内存块的回收时间点是不确定的
在运行时刻,一次新的垃圾回收过程可能在一个不确定的时间启动,并且此过程可能需要一段不确定的时长才能完成。 所以一个不再被使用的内存块的回收时间点是不确定的。
一个例子:import "unsafe"
// 假设此函数不会被内联(inline)。
//go:noinline
func createInt() *int {
return new(int)
}
func foo() {
p0, y, z := createInt(), createInt(), createInt()
var p1 = unsafe.Pointer(y) // 和y一样引用着同一个值
var p2 = uintptr(unsafe.Pointer(z))
// 此时,即使z指针值所引用的int值的地址仍旧存储
// 在p2值中,但是此int值已经不再被使用了,所以垃圾
// 回收器认为可以回收它所占据的内存块了。另一方面,
// p0和p1各自所引用的int值仍旧将在下面被使用。
// uintptr值可以参与算术运算。
p2 += 2; p2--; p2--
*p0 = 1 // okay
*(*int)(p1) = 2 // okay
*(*int)(unsafe.Pointer(p2)) = 3 // 危险操作!
}
在上面这个例子中,值p2
仍旧在使用这个事实并不能保证曾经被z
指针值所引用的int
值所占的内存块一定还没有被回收。 换句话说,当*(*int)(unsafe.Pointer(p2)) = 3
被执行的时候,此内存块有可能已经被回收了。 所以,继续通过解引用值p2
中存储的地址是非常危险的,因为此内存块可能已经被重新分配给其它值使用了。
事实三:一个值的地址在程序运行中可能改变
详情请阅读内存块一文(见链接所指一节的尾部)。 这里我们只需要知道当一个协程的栈的大小改变时,开辟在此栈上的内存块需要移动,从而相应的值的地址将改变。
事实四:一个值的生命范围可能并没有代码中看上去的大
比如中下面这个例子,值t
仍旧在使用中并不能保证被值t.y
所引用的值仍在被使用。
type T struct {
x int
y *[1<<23]byte
}
func bar() {
t := T{y: new([1<<23]byte)}
p := uintptr(unsafe.Pointer(&t.y[0]))
... // 使用t.x和t.y
// 一个聪明的编译器能够觉察到值t.y将不会再被用到,
// 所以认为t.y值所占的内存块可以被回收了。
*(*byte)(unsafe.Pointer(p)) = 1 // 危险操作!
println(t.x) // ok。继续使用值t,但只使用t.x字段。
}
事实五:*unsafe.Pointer
是一个类型安全指针类型
是的,类型*unsafe.Pointer
是一个类型安全指针类型。 它的基类型为unsafe.Pointer
。 既然它是一个类型安全指针类型,根据上面列出的类型转换规则,它的值可以转换为类型unsafe.Pointer
,反之亦然。
package main
import "unsafe"
func main() {
x := 123 // 类型为int
p := unsafe.Pointer(&x) // 类型为unsafe.Pointer
pp := &p // 类型为*unsafe.Pointer
p = unsafe.Pointer(pp)
pp = (*unsafe.Pointer)(p)
}
如何正确地使用非类型安全指针?
unsafe
标准库包的文档中列出了六种非类型安全指针的使用模式。 下面将对它们逐一进行讲解。
使用模式一:将类型*T1
的一个值转换为非类型安全指针值,然后将此非类型安全指针值转换为类型*T2
。
利用前面列出的非类型安全指针相关的转换规则,我们可以将一个*T1
值转换为类型*T2
,其中T1
和T2
为两个任意类型。 然而,我们只有在T1
的尺寸不小于T2
并且此转换具有实际意义的时候才应该实施这样的转换。
通过将一个*T1
值转换为类型*T2
,我们也可以将一个T1
值转换为类型T2
。
math
标准库包中的Float64bits
函数。 此函数将一个float64
值转换为一个uint64
值。 在此转换过程中,此float64
值在内存中的每个位(bit)都保持不变。 函数math.Float64frombits
为此转换的逆转换。
func Float64bits(f float64) uint64 {
return *(*uint64)(unsafe.Pointer(&f))
}
func Float64frombits(b uint64) float64 {
return *(*float64)(unsafe.Pointer(&b))
}
请注意,函数调用math.Float64bits(aFloat64)
的结果和显式转换uint64(aFloat64)
的结果不同。
[]MyString
值和一个[]string
值转换为对方的类型。 结果切片和被转换的切片将共享底层元素。(这样的转换是不可能通过安全的方式来实现的。)
package main
import (
"fmt"
"unsafe"
)
func main() {
type MyString string
ms := []MyString{"C", "C++", "Go"}
fmt.Printf("%s\n", ms) // [C C++ Go]
// ss := ([]string)(ms) // 编译错误
ss := *(*[]string)(unsafe.Pointer(&ms))
ss[1] = "Rust"
fmt.Printf("%s\n", ms) // [C Rust Go]
// ms = []MyString(ss) // 编译错误
ms = *(*[]MyString)(unsafe.Pointer(&ss))
}
当然,从Go 1.17开始,我们也可以使用unsafe.Slice((*string)(&ms[0]), len(ms))
来实现此类型转换。
func ByteSlice2String(bs []byte) string {
return *(*string)(unsafe.Pointer(&bs))
}
此实现借鉴于strings
标准库包中的Builder
类型的String
方法的实现。 字节切片的尺寸比字符串的尺寸要大,并且它们的底层结构类似,所以此转换(对于当前的主流Go编译器来说)是安全的。 即使这样,此实现也只推荐在标准库中使用,而不推荐在用户代码中使用。 在用户代码中,最好尽量使用文末提供的另一种实现。
func String2ByteSlice(s string) []byte {
return *(*[]byte)(unsafe.Pointer(&s)) // 危险
}
在后面的模式六中展示了一种合法的(无需复制底层字节序列即可)将一个字符串转换为字节切片的实现。
注意:当运用上面展示的使用非类型安全指针将一个字节切片转换为字符串的技巧时,请确保结果字符串在使用过程中绝对不修改此字节切片中的字节值。
使用模式二:将一个非类型安全指针值转换为一个uintptr值,然后使用此uintptr值。
此模式不是很有用。一般我们将最终的转换结果uintptr值输出到日志中用来调试,但是有很多其它安全并且简洁的途径也可以实现此目的。
一个例子:package main
import "fmt"
import "unsafe"
func main() {
type T struct{a int}
var t T
fmt.Printf("%p\n", &t) // 0xc6233120a8
println(&t) // 0xc6233120a8
fmt.Printf("%x\n", uintptr(unsafe.Pointer(&t))) // c6233120a8
}
输出地址在每次运行中可能都会不同。
使用模式三:将一个非类型安全指针转换为一个uintptr值,然后此uintptr值参与各种算术运算,再将算术运算的结果uintptr值转回非类型安全指针。
package main
import "fmt"
import "unsafe"
type T struct {
x bool
y [3]int16
}
const N = unsafe.Offsetof(T{}.y)
const M = unsafe.Sizeof(T{}.y[0])
func main() {
t := T{y: [3]int16{123, 456, 789}}
p := unsafe.Pointer(&t)
// "uintptr(p) + N + M + M"为t.y[2]的内存地址。
ty2 := (*int16)(unsafe.Pointer(uintptr(p)+N+M+M))
fmt.Println(*ty2) // 789
}
其实,对于这样地址加减运算,更推荐使用上面介绍的Go 1.17中引入的unsafe.Add
函数来完成。
unsafe.Pointer(uintptr(p) + N + M + M)
不应该像下面这样被拆成两行。 请阅读下面的代码中的注释以获取原因。
func main() {
t := T{y: [3]int16{123, 456, 789}}
p := unsafe.Pointer(&t)
// ty2 := (*int16)(unsafe.Pointer(uintptr(p)+N+M+M))
addr := uintptr(p) + N + M + M
// ...(一些其它操作)
// 从这里到下一行代码执行之前,t值将不再被任何值
// 引用,所以垃圾回收器认为它可以被回收了。一旦
// 它真地被回收了,下面继续使用t.y[2]值的曾经
// 的地址是非法和危险的!另一个危险的原因是
// t的地址在执行下一行之前可能改变(见事实三)。
// 另一个潜在的危险是:如果在此期间发生了一些
// 操作导致协程堆栈大小改变的情况,则记录在addr
// 中的地址将失效。
ty2 := (*int16)(unsafe.Pointer(addr))
fmt.Println(*ty2)
}
这样的bug是非常微妙和很难被觉察到的,并且爆发出来的几率是相当得低。 一旦这样的bug爆发出来,将很让人摸不到头脑。这也是使用非类型安全指针被认为是危险操作的原因之一。
中间uintptr值可以参与&^
清位运算来进行内存对齐计算,只要保证转换前后的非类型安全指针同时指向同一个内存块,整个转换就是合法安全的。
另一个需要注意的细节是最好不要将一个内存块的结尾边界地址存储在一个(安全或非安全)指针中。 这样做将导致紧随着此内存块的另一个内存块因为被引用而不会被垃圾回收掉,或者因为形成非法指针而导致程序崩溃(取决于具体编译器实现)。 请阅读这个问答以获取更多解释。
使用模式四:将非类型安全指针值转换为uintptr
值并传递给syscall.Syscall
函数调用。
// 假设此函数不会被内联。
func DoSomething(addr uintptr) {
// 对处于传递进来的地址处的值进行读写...
}
上面这个函数是危险的原因在于此函数本身不能保证传递进来的地址处的内存块一定没有被回收。 如果此内存块已经被回收了或者被重新分配给了其它值,那么此函数内部的操作将是非法和危险的。
然而,syscall
标准库包中的Syscall
函数的原型为:
func Syscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err Errno)
那么此函数是如何保证处于传递给它的地址参数值a1
、a2
和a3
处的内存块在此函数执行过程中一定没有被回收和被移动呢? 此函数无法做出这样的保证。事实上,是编译器做出了这样的保证。 这是syscall.Syscall
这样的函数的特权。其它自定义函数无法享受到这样的待遇。
我们可以认为编译器针对每个syscall.Syscall
函数调用中的每个被转换为uintptr
类型的非类型安全指针实参添加了一些指令,从而保证此非类型安全指针所引用着的内存块在此调用返回之前不会被垃圾回收和移动。
注意:在Go 1.15之前,类型转换表达式uintptr(anUnsafePointer)
可以呈现为相关实参的子表达式。 但是,从Go 1.15开始,使用此模式的要求变得略加严格:相关实参必须呈现为uintptr(anUnsafePointer)
这种形式。
syscall.Syscall(syscall.SYS_READ, uintptr(fd),
uintptr(unsafe.Pointer(p)), uintptr(n))
但下面这个调用则是危险的:
u := uintptr(unsafe.Pointer(p))
// 被p所引用着的值在此时有可能会被回收掉,
// 或者它的地址已经发生了改变。
syscall.Syscall(SYS_READ, uintptr(fd), u, uintptr(n))
// 相关实参必须呈现为"uintptr(anUnsafePointer)"
// 这种形式。事实上,Go 1.15之前,此调用是合法的;
// 但是Go 1.15略改了一点规则。
syscall.Syscall(SYS_XXX, uintptr(uintptr(fd)),
uint(uintptr(unsafe.Pointer(p))), uintptr(n))
再提醒一次,此使用模式不适用于其它自定义函数。
使用模式五:将reflect.Value.Pointer
或者reflect.Value.UnsafeAddr
方法的uintptr
返回值立即转换为非类型安全指针。
reflect
标准库包中的Value
类型的Pointer
和UnsafeAddr
方法都返回一个uintptr
值,而不是一个unsafe.Pointer
值。 这样设计的目的是避免用户不引用unsafe
标准库包就可以将这两个方法的返回值(如果是unsafe.Pointer
类型)转换为任何类型安全指针类型。
这样的设计需要我们将这两个方法的调用的uintptr
结果立即转换为非类型安全指针。 否则,将出现一个短暂的可能导致处于返回的地址处的内存块被回收掉的时间窗。 此时间窗是如此短暂以至于此内存块被回收掉的几率非常之低,因而这样的编程错误造成的bug的重现几率亦十分得低。
p := (*int)(unsafe.Pointer(reflect.ValueOf(new(int)).Pointer()))
而下面这个调用是危险的:
u := reflect.ValueOf(new(int)).Pointer()
// 在这个时刻,处于存储在u中的地址处的内存块
// 可能会被回收掉。
p := (*int)(unsafe.Pointer(u))
注意:此使用模式也适用于Windows系统中的syscall.Proc.Call和syscall.LazyProc.Call系统调用。
使用模式六:将一个reflect.SliceHeader
或者reflect.StringHeader
值的Data
字段转换为非类型安全指针,以及其逆转换。
和上一小节中提到的同样的原因,reflect
标准库包中的SliceHeader
和StringHeader
类型的Data
字段的类型被指定为uintptr
,而不是unsafe.Pointer
。
我们可以将一个字符串的指针值转换为一个*reflect.StringHeader
指针值,从而可以对此字符串的内部进行修改。 类似地,我们可以将一个切片的指针值转换为一个*reflect.SliceHeader
指针值,从而可以对此切片的内部进行修改。
reflect.StringHeader
的例子:
package main
import "fmt"
import "unsafe"
import "reflect"
func main() {
a := [...]byte{'G', 'o', 'l', 'a', 'n', 'g'}
s := "Java"
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s))
hdr.Data = uintptr(unsafe.Pointer(&a))
hdr.Len = len(a)
fmt.Println(s) // Golang
// 现在,字符串s和切片a共享着底层的byte字节序列,
// 从而使得此字符串中的字节变得可以修改。
a[2], a[3], a[4], a[5] = 'o', 'g', 'l', 'e'
fmt.Println(s) // Google
}
一个使用了
reflect.SliceHeader
的例子:
package main
import (
"fmt"
"unsafe"
"reflect"
)
func main() {
a := [6]byte{'G', 'o', '1', '0', '1'}
bs := []byte("Golang")
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
hdr.Data = uintptr(unsafe.Pointer(&a))
hdr.Len = 2
hdr.Cap = len(a)
fmt.Printf("%s\n", bs) // Go
bs = bs[:cap(bs)]
fmt.Printf("%s\n", bs) // Go101
}
一般说来,我们只应该从一个已经存在的字符串值得到一个
*reflect.StringHeader
指针, 或者从一个已经存在的切片值得到一个*reflect.SliceHeader
指针, 而不应该从一个StringHeader
值生成一个字符串,或者从一个SliceHeader
值生成一个切片。 比如,下面的代码是不安全的:
var hdr reflect.StringHeader
hdr.Data = uintptr(unsafe.Pointer(new([5]byte)))
// 在此时刻,上一行代码中刚开辟的数组内存块已经不再被任何值
// 所引用,所以它可以被回收了。
hdr.Len = 5
s := *(*string)(unsafe.Pointer(&hdr)) // 危险!
下面是一个展示了如何通过使用非类型安全途径将一个字符串转换为字节切片的例子。 和使用类型安全途径进行转换不同,使用非类型安全途径避免了复制一份底层字节序列。
package main
import (
"fmt"
"reflect"
"strings"
"unsafe"
)
func String2ByteSlice(str string) (bs []byte) {
strHdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
sliceHdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
sliceHdr.Data = strHdr.Data
sliceHdr.Cap = strHdr.Len
sliceHdr.Len = strHdr.Len
return
}
func main() {
// str := "Golang"
// 对于官方标准编译器来说,上面这行将使str中的字节
// 开辟在不可修改内存区。所以这里我们使用下面这行。
str := strings.Join([]string{"Go", "land"}, "")
s := String2ByteSlice(str)
fmt.Printf("%s\n", s) // Goland
s[5] = 'g'
fmt.Println(str) // Golang
}
reflect
标准库包中SliceHeader
和StringHeader
类型的文档提到这两个结构体类型的定义不保证在以后的版本中不发生改变。好在目前的两个主流Go编译器(标准编译器和gccgo编译器)都认可当前版本中的定义。这也可以看作是使用非类型安全指针的另一个(较低的)潜在风险。
注意:当使用上面展示的使用非类型安全指针将一个字符串转换为字节切片时,请确保结果此源字符串的生命期内务必不要修改结果字节切片中的字节值(上面的例子违背了此原则)。 事实上,更为推荐的是最好永远不要修改结果字节切片中的字节值。此非类型安全方式的目的主要是为了在局部感知范围内避免一次内存开辟,而不是一种通用的方式。
我们可以使用类似的实现(如下所示)来将一个字节切片转换为字符串。 此实现被模式一中展示的方法略为安全一些(但是也更慢一些)。func ByteSlice2String(bs []byte) (str string) {
sliceHdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
strHdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
strHdr.Data = sliceHdr.Data
strHdr.Len = sliceHdr.Len
return
}
同样地,请确保结果此结果字符串的生命期内务必不要修改实参字节切片中的字节值。
最后,顺便举一个违背了模式三的使用原则的例子:package main
import (
"fmt"
"reflect"
"unsafe"
)
func Example_Bad() *byte {
var str = "godoc"
hdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
pbyte := (*byte)(unsafe.Pointer(hdr.Data + 2))
return pbyte // *pbyte == 'd'
}
func main() {
fmt.Println(string(*Example_Bad()))
}
下面是两个正确的实现:
func Example_Good1() *byte {
var str = "godoc"
hdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
pbyte := (*byte)(unsafe.Pointer(
uintptr(unsafe.Pointer(hdr.Data)) + 2))
return pbyte
}
// 从Go 1.17开始也可以使用此实现。
func Example_Good2() *byte {
var str = "godoc"
hdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
pbyte := (*byte)(unsafe.Add(unsafe.Pointer(hdr.Data), 2))
return pbyte
}
上面这几个例子借鉴自Bryan C. Mills在slack中发表的一个留言。
总结一下
从上面解释中,我们得知,对于某些情形,非类型安全机制可以帮助我们写出运行效率更高的代码。 但是,使用非类型安全指针也使得我们可能轻易地写出一些重现几率非常低的微妙的bug。 一个含有这样的bug的程序很可能在很长一段时间内都运行正常,但是突然变得不正常甚至崩溃。 这样的bug很难发现和调试。
我们只应该在不得不使用非类型安全机制的时候才使用它们。 特别地,当我们使用非类型安全机制时,请务必遵循上面列出的使用模式。
重申一次,我们应该知晓当前的非类型安全机制规则和使用模式可能在以后的Go版本中完全失效。 当然,目前没有任何迹象表明这种变化将很快会来到。 但是,一旦发生这种变化,本文中列出的当前是正确的代码将变得不再安全甚至编译不通过。 所以,在实践中,请尽量保证能够将使用了非类型安全机制的代码轻松改为使用安全途径实现。
最后值得提一下的是,Go官方工具链1.14中加入了一个-gcflags=all=-d=checkptr
编译器动态分析选项(在Windows平台上推荐使用工具链1.15+)。 当此选项被使用的时候,编译出的程序在运行时会监测到很多(但并非所有)非类型安全指针的错误使用。一旦错误的使用被监测到,恐慌将产生。 感谢Matthew Dempsky实现了此特性。
你不知道的Go unsafe.Pointer uintptr原理和玩法 - sunsky303 - 博客园 https://www.cnblogs.com/sunsky303/p/11820500.html
你不知道的Go unsafe.Pointer uintptr原理和玩法
unsafe.Pointer
这个类型比较重要,它是实现定位和读写的内存的基础,Go runtime大量使用它。官方文档对该类型有四个重要描述:
(1)任何类型的指针都可以被转化为Pointer
(2)Pointer可以被转化为任何类型的指针
(3)uintptr可以被转化为Pointer
(4)Pointer可以被转化为uintptr
大多数指针类型会写成T,表示是“一个指向T类型变量的指针”。unsafe.Pointer是特别定义的一种指针类型(译注:类似C语言中的void类型的指针),它可以包含任意类型变量的地址。当然,我们不可以直接通过*p来获取unsafe.Pointer指针指向的真实变量的值,因为我们并不知道变量的具体类型。和普通指针一样,unsafe.Pointer指针也是可以比较的,并且支持和nil常量比较判断是否为空指针。
一个普通的T类型指针可以被转化为unsafe.Pointer类型指针,并且一个unsafe.Pointer类型指针也可以被转回普通的指针,被转回普通的指针类型并不需要和原始的T类型相同。
package main import ( "fmt" "unsafe" "reflect" ) type W struct { b byte i int32 j int64 } //通过将float64类型指针转化为uint64类型指针,我们可以查看一个浮点数变量的位模式。 func Float64bits(f float64) uint64 { fmt.Println(reflect.TypeOf(unsafe.Pointer(&f))) //unsafe.Pointer fmt.Println(reflect.TypeOf((*uint64)(unsafe.Pointer(&f)))) //*uint64 return *(*uint64)(unsafe.Pointer(&f)) } func Uint(i int)uint{ return *(*uint)(unsafe.Pointer(&i)) } type Uint6 struct { low [2]byte high uint32 } //func (u *Uint6) SetLow() { // fmt.Printf("i=%d\n", this.i) //} // //func (u *Uint6) SetHigh() { // fmt.Printf("j=%d\n", this.j) //} func writeByPointer(){ uint6 := &Uint6{} lowPointer:=(*[2]byte)(unsafe.Pointer(uint6)) *lowPointer = [2]byte{1,2} //unsafe.Offsetof会计算padding后的偏移距离 //必须将unsafe.Pointer转化成 uintptr类型才能进行指针的运算,uintptr 与 unsafe.Pointer 之间可以相互转换。 highPointer:=(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(uint6))+unsafe.Offsetof(uint6.high))) fmt.Printf("addr %x addr %x size %v size %v size %v align %v offset %v \n", uintptr(unsafe.Pointer(uint6)),uintptr(unsafe.Pointer(uint6))+unsafe.Sizeof(uint6.low),unsafe.Sizeof([2]byte{1,2}),unsafe.Sizeof(uint6.low), unsafe.Sizeof(uint6.high), unsafe.Alignof(uint6.low), unsafe.Offsetof(uint6.high)) *highPointer = uint32(9) //借助于 unsafe.Pointer,我们实现了像 C 语言中的指针偏移操作。可以看出,这种不安全的操作使得我们可以在任何地方直接访问结构体中未公开的成员,只要能得到这个结构体变量的地址。 fmt.Printf("%+v %v %v %v \n", uint6, &uint6,&uint6.low[0], &uint6.high) } type T struct { t1 byte t2 int32 t3 int64 t4 string t5 bool } func main() { fmt.Printf("%#x %#b \n", Float64bits(11.3), Float64bits(4)) // "0x3ff0000000000000" var intA int =99 uintA:=Uint(intA) fmt.Printf("%#v %v %v \n", intA, reflect.TypeOf(uintA), uintA) var w W = W{} //在struct中,它的对齐值是它的成员中的最大对齐值。 fmt.Printf("%v, %v, %v, %v, %v, %v, %v, %v\n", unsafe.Alignof(w), unsafe.Alignof(w.b), unsafe.Alignof(w.i), unsafe.Alignof(w.j), unsafe.Sizeof(w),unsafe.Sizeof(w.b),unsafe.Sizeof(w.i),unsafe.Sizeof(w.j), ) fmt.Println(unsafe.Alignof(byte(0))) fmt.Println(unsafe.Alignof(int8(0))) fmt.Println(unsafe.Alignof(uint8(0))) fmt.Println(unsafe.Alignof(int16(0))) fmt.Println(unsafe.Alignof(uint16(0))) fmt.Println(unsafe.Alignof(int32(0))) fmt.Println(unsafe.Alignof(uint32(0))) fmt.Println(unsafe.Alignof(int64(0))) fmt.Println(unsafe.Alignof(uint64(0))) fmt.Println(unsafe.Alignof(uintptr(0))) fmt.Println(unsafe.Alignof(float32(0))) fmt.Println(unsafe.Alignof(float64(0))) //fmt.Println(unsafe.Alignof(complex(0, 0))) fmt.Println(unsafe.Alignof(complex64(0))) fmt.Println(unsafe.Alignof(complex128(0))) fmt.Println(unsafe.Alignof("")) fmt.Println(unsafe.Alignof(new(int))) fmt.Println(unsafe.Alignof(struct { f float32 ff float64 }{})) fmt.Println(unsafe.Alignof(make(chan bool, 10))) fmt.Println(unsafe.Alignof(make([]int, 10))) fmt.Println(unsafe.Alignof(make(map[string]string, 10))) t := &T{1, 2, 3, "", true} fmt.Println("sizeof :") fmt.Println(unsafe.Sizeof(*t)) fmt.Println(unsafe.Sizeof(t.t1)) fmt.Println(unsafe.Sizeof(t.t2)) fmt.Println(unsafe.Sizeof(t.t3)) fmt.Println(unsafe.Sizeof(t.t4)) fmt.Println(unsafe.Sizeof(t.t5)) //这里以0x0作为基准内存地址。打印出来总共占用40个字节。t.t1 为 char,对齐值为 1,0x0 % 1 == 0,从0x0开始,占用一个字节;t.t2 为 int32,对齐值为 4,0x4 % 4 == 0,从 0x4 开始,占用 4 个字节;t.t3 为 int64,对齐值为 8,0x8 % 8 == 0,从 0x8 开始,占用 8 个字节;t.t4 为 string,对齐值为 8,0x16 % 8 == 0,从 0x16 开始, 占用 16 个字节(string 内部实现是一个结构体,包含一个字节类型指针和一个整型的长度值);t.t5 为 bool,对齐值为 1,0x32 % 8 == 0,从 0x32 开始,占用 1 个字节。从上面分析,可以知道 t 的对齐值为 8,最后 bool 之后会补齐到 8 的倍数,故总共是 40 个字节。 fmt.Println("Offsetof : ") fmt.Println(unsafe.Offsetof(t.t1)) fmt.Println(unsafe.Offsetof(t.t2)) fmt.Println(unsafe.Offsetof(t.t3)) fmt.Println(unsafe.Offsetof(t.t4)) fmt.Println(unsafe.Offsetof(t.t5)) writeByPointer() //CPU看待内存是以block为单位的,就像是linux下文件大小的单位IO block为4096一样, //是一种牺牲空间换取时间的做法, 我们一定要注意不要浪费空间, //struct类型定义的时候一定要将占用内从空间小的类型放在前面, 充足利用padding, 才能提升内存、cpu效率 }
go run PLAY.go unsafe.Pointer *uint64 unsafe.Pointer *uint64 0x402699999999999a 0b100000000010000000000000000000000000000000000000000000000000000 99 uint 99 8, 1, 4, 8, 16, 1, 4, 8 1 1 1 2 2 4 4 8 8 8 4 8 4 8 8 8 8 8 8 8 sizeof : 40 1 4 8 16 1 Offsetof : 0 4 8 16 32 addr c00008e038 addr c00008e03a size 2 size 2 size 4 align 1 offset 4 &{low:[1 2] high:9} 0xc00008a010 0xc00008e038 0xc00008e03c
uintptr
// uintptr is an integer type that is large enough to hold the bit pattern of
// any pointer.
type uintptr uintptr
uintptr是golang的内置类型,是能存储指针的整型,在64位平台上底层的数据类型是,
typedef unsigned long long int uint64;
typedef uint64 uintptr;
一个unsafe.Pointer指针也可以被转化为uintptr类型,然后保存到指针型数值变量中(注:这只是和当前指针相同的一个数字值,并不是一个指针),然后用以做必要的指针数值运算。(uintptr是一个无符号的整型数,足以保存一个地址)这种转换虽然也是可逆的,但是将uintptr转为unsafe.Pointer指针可能会破坏类型系统,因为并不是所有的数字都是有效的内存地址。
许多将unsafe.Pointer指针转为原生数字,然后再转回为unsafe.Pointer类型指针的操作也是不安全的。比如下面的例子需要将变量x的地址加上b字段地址偏移量转化为*int16类型指针,然后通过该指针更新x.b:
package main import ( "fmt" "unsafe" ) func main() { var x struct { a bool b int16 c []int } /** unsafe.Offsetof 函数的参数必须是一个字段 x.f, 然后返回 f 字段相对于 x 起始地址的偏移量, 包括可能的空洞. */ /** uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b) 指针的运算 */ // 和 pb := &x.b 等价 pb := (*int16)(unsafe.Pointer(uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b))) *pb = 42 fmt.Println(x.b) // "42" }
上面的写法尽管很繁琐,但在这里并不是一件坏事,因为这些功能应该很谨慎地使用。不要试图引入一个uintptr类型的临时变量,因为它可能会破坏代码的安全性(注:这是真正可以体会unsafe包为何不安全的例子)。
下面段代码是错误的:
// NOTE: subtly incorrect!
tmp := uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)
pb := (*int16)(unsafe.Pointer(tmp))
*pb = 42
产生错误的原因很微妙。有时候垃圾回收器会移动一些变量以降低内存碎片等问题。这类垃圾回收器被称为移动GC。当一个变量被移动,所有的保存改变量旧地址的指针必须同时被更新为变量移动后的新地址。从垃圾收集器的视角来看,一个unsafe.Pointer是一个指向变量的指针,因此当变量被移动是对应的指针也必须被更新;但是uintptr类型的临时变量只是一个普通的数字,所以其值不应该被改变。上面错误的代码因为引入一个非指针的临时变量tmp,导致垃圾收集器无法正确识别这个是一个指向变量x的指针。当第二个语句执行时,变量x可能已经被转移,这时候临时变量tmp也就不再是现在的&x.b地址。第三个向之前无效地址空间的赋值语句将彻底摧毁整个程序!
总结
第一是 unsafe.Pointer
可以让你的变量在不同的指针类型转来转去,也就是表示为任意可寻址的指针类型。第二是 uintptr
常用于与 unsafe.Pointer
打配合,用于做指针运算,和C (*void)指针一样。
unsafe是不安全的,所以我们应该尽可能少的使用它,比如内存的操纵,这是绕过Go本身设计的安全机制的,不当的操作,可能会破坏一块内存,而且这种问题非常不好定位。
当然必须的时候我们可以使用它,比如底层类型相同的数组之间的转换;比如使用sync/atomic包中的一些函数时;还有访问Struct的私有字段时;该用还是要用,不过一定要慎之又慎。
还有,整个unsafe包都是用于Go编译器的,不用运行时,在我们编译的时候,Go编译器已经把他们都处理了。