Strings in Swift (1): Overview

First, get a feel for the number of string-related source files.

String overview

  • It is a struct
  • It has a single stored property, of type _StringGuts
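One observable consequence of this layout can be checked from ordinary code: because String is a struct wrapping a single _StringGuts value, the value itself has a small, fixed size no matter how long the text is. The sketch below assumes a 64-bit platform, where that size is two machine words; the helper name stringFootprint is made up for illustration.

```swift
// Hypothetical helper: the in-line size of a String value, in bytes.
// The character data of a large string lives elsewhere, so this number
// does not grow with the length of the text.
func stringFootprint() -> Int {
    return MemoryLayout<String>.size
}

// Assumption: a 64-bit platform, where a String value is two 8-byte words.
print(stringFootprint()) // 16 on the 64-bit platforms assumed here
```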

 The actual contents of a String live in __StringStorage or __SharedStringStorage.

  private static func create(
    realCodeUnitCapacity: Int, countAndFlags: CountAndFlags
  ) -> __StringStorage {
    let storage = Builtin.allocWithTailElems_2(
      __StringStorage.self,
      realCodeUnitCapacity._builtinWordValue, UInt8.self,
      1._builtinWordValue, Optional<_StringBreadcrumbs>.self)
#if arch(i386) || arch(arm)
    storage._realCapacity = realCodeUnitCapacity
    storage._count = countAndFlags.count
    storage._flags = countAndFlags.flags
#else
    storage._realCapacityAndFlags =
      UInt64(truncatingIfNeeded: realCodeUnitCapacity)
    storage._countAndFlags = countAndFlags
#endif

    storage._breadcrumbsAddress.initialize(to: nil)
    storage.terminator.pointee = 0 // nul-terminated

    // NOTE: We can't _invariantCheck() now, because code units have not been
    // initialized. But, _StringGuts's initializer will.
    return storage
  }

This is where the memory is actually allocated.
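Builtin.allocWithTailElems_2 is only reachable from inside the standard library, but the same idea (one heap object with the code units allocated right behind the header) can be sketched with the public ManagedBuffer class. This is a simplified model, not the real storage: ToyHeader, makeToyStorage and readToyStorage are hypothetical names, and the sketch has a single tail region, with no breadcrumbs slot.

```swift
// Hypothetical header: only a count, not the real CountAndFlags.
struct ToyHeader {
    var count: Int
}

// Allocates the header and the UTF-8 code units in one heap allocation.
func makeToyStorage(_ text: String) -> ManagedBuffer<ToyHeader, UInt8> {
    let utf8 = Array(text.utf8)
    // One extra element for the nul terminator.
    let buffer = ManagedBuffer<ToyHeader, UInt8>.create(
        minimumCapacity: utf8.count + 1
    ) { _ in
        ToyHeader(count: utf8.count)
    }
    buffer.withUnsafeMutablePointerToElements { elements in
        for (offset, byte) in utf8.enumerated() {
            (elements + offset).initialize(to: byte)
        }
        (elements + utf8.count).initialize(to: 0) // nul-terminated
    }
    return buffer
}

// Rebuilds a String from the tail-allocated code units.
func readToyStorage(_ buffer: ManagedBuffer<ToyHeader, UInt8>) -> String {
    return buffer.withUnsafeMutablePointers { header, elements -> String in
        let bytes = UnsafeBufferPointer(start: elements,
                                        count: header.pointee.count)
        return String(decoding: bytes, as: UTF8.self)
    }
}
```

As in create above, the code units are not part of the object's declared fields; they sit in the tail of the same allocation, which is why the header has to record how many of them there are.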

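Flag bits

A String carries several flag bits that indicate what kind of string it is. There are 4 of them in total, and together they are called the discriminator.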

On 64-bit platforms, the discriminator is the most significant 4 bits of the bridge object.
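As a small illustration of "the most significant 4 bits", the sketch below shifts a 64-bit word right by 60 to isolate them. The function name and the sample bit patterns are made up for this example.

```swift
// Hypothetical helper: returns the top 4 bits of a 64-bit word (0...15).
func discriminatorNibble(of rawBits: UInt64) -> UInt8 {
    return UInt8(truncatingIfNeeded: rawBits >> 60)
}

print(discriminatorNibble(of: 0xE500_0000_0000_0000)) // 14, i.e. 0b1110
print(discriminatorNibble(of: 0x0000_7FFF_0000_1000)) // 0
```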

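Strings can be roughly divided into small strings and large strings.

Almost every string operation branches on whether the string is small. Checking whether the string is ASCII is one example: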
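The split can be observed from outside the standard library, with heavy caveats. The sketch below peeks at the raw bits of a String value and tests the small-string bit. It assumes the 64-bit, non-Android layout (two 64-bit words, the second being the bridge object) and depends on implementation details that may change; looksSmall is a hypothetical helper that returns nil wherever that assumption is not known to hold. On these platforms a small string holds at most 15 UTF-8 code units.

```swift
// Assumption: a String is two 64-bit words and the second word is the
// bridge object whose top bits form the discriminator.
func looksSmall(_ string: String) -> Bool? {
#if (arch(x86_64) || arch(arm64)) && !os(Android)
    guard MemoryLayout<String>.size == MemoryLayout<(UInt64, UInt64)>.size else {
        return nil
    }
    let words = unsafeBitCast(string, to: (UInt64, UInt64).self)
    // Same mask as the isSmall check in _StringObject.
    return (words.1 & 0x2000_0000_0000_0000) != 0
#else
    return nil
#endif
}

let short = "hello"
let long = String(repeating: "a", count: 40)
print(looksSmall(short) as Any, looksSmall(long) as Any)
```

This is for exploration only; production code should never rely on the bit layout.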

  //
  // Whether the string is all ASCII
  //
  @inlinable
  internal var isASCII: Bool {
    @inline(__always) get {
      if isSmall { return smallIsASCII }
      return _countAndFlags.isASCII
    }
  }
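The branching pattern can be modelled with a toy enum. This is only a sketch of the control flow: ToyStringObject and its cases are hypothetical, and the small case derives ASCII-ness by scanning its bytes purely for simplicity, which is not a claim about how smallIsASCII is implemented.

```swift
// Toy model, not the real _StringObject: the two cases stand in for the
// small and large representations.
enum ToyStringObject {
    // Small: code units stored inline.
    case small(bytes: [UInt8])
    // Large: ASCII-ness cached as a flag next to the count.
    case large(count: Int, isASCIIFlag: Bool)

    var isSmall: Bool {
        if case .small = self { return true }
        return false
    }

    // Mirrors the shape of isASCII above: one answer for small strings,
    // a cached flag for everything else.
    var isASCII: Bool {
        switch self {
        case .small(let bytes):
            return bytes.allSatisfy { $0 < 0x80 }
        case .large(_, let isASCIIFlag):
            return isASCIIFlag
        }
    }
}

let smallValue = ToyStringObject.small(bytes: Array("abc".utf8))
let largeValue = ToyStringObject.large(count: 100, isASCIIFlag: false)
print(smallValue.isASCII, largeValue.isASCII) // true false
```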

Reading and testing the flag bits in _StringObject

Getting the flag bits

  internal var discriminatedObjectRawBits: UInt64 {
    return Builtin.reinterpretCast(_object)
  }

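This returns the raw bits of the bridge object; the discriminator is its most significant 4 bits.

Checking whether the string is immortal (its storage does not need reference counting; this is not the same as being immutable)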

  @inlinable
  internal var isImmortal: Bool {
    @inline(__always) get {
      return (discriminatedObjectRawBits & 0x8000_0000_0000_0000) != 0
    }
  }

Checking whether it is a small string

  internal var isSmall: Bool {
    @inline(__always) get {
      return (discriminatedObjectRawBits & 0x2000_0000_0000_0000) != 0
    }
  }

Checking whether it can provide contiguous UTF-8 code units

  // Whether this string can provide access to contiguous UTF-8 code units:
  //   - Small strings can by spilling to the stack
  //   - Large native strings can through an offset
  //   - Shared strings can:
  //     - Cocoa strings which respond to e.g. CFStringGetCStringPtr()
  //     - Non-Cocoa shared strings
  @inlinable
  internal var providesFastUTF8: Bool {
    @inline(__always) get {
      return (discriminatedObjectRawBits & 0x1000_0000_0000_0000) == 0
    }
  }
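The three checks can be tried out on plain integers. The sketch below copies the masks from the excerpts above into a stand-alone struct; ToyDiscriminator is a hypothetical type and the sample bit patterns are chosen by hand, not read from real strings.

```swift
// Hypothetical wrapper around a raw 64-bit pattern, reusing the masks
// from the _StringObject excerpts above.
struct ToyDiscriminator {
    let rawBits: UInt64

    // Bit 63 set: immortal.
    var isImmortal: Bool {
        return (rawBits & 0x8000_0000_0000_0000) != 0
    }
    // Bit 61 set: small string.
    var isSmall: Bool {
        return (rawBits & 0x2000_0000_0000_0000) != 0
    }
    // Bit 60 clear: contiguous UTF-8 is available.
    var providesFastUTF8: Bool {
        return (rawBits & 0x1000_0000_0000_0000) == 0
    }
}

let a = ToyDiscriminator(rawBits: 0xE000_0000_0000_0000)
print(a.isImmortal, a.isSmall, a.providesFastUTF8) // true true true

let b = ToyDiscriminator(rawBits: 0x1000_0000_0000_0000)
print(b.isImmortal, b.isSmall, b.providesFastUTF8) // false false false
```

Note that providesFastUTF8 is the only one of the three that tests for a cleared bit, so an all-zero discriminator still reports fast UTF-8 access.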

posted on 2019-03-20 07:55  花老🐯