[Repost] Inside Swift

Original article: http://www.eswick.com/2014/06/inside-swift/

 

Inside Swift

Swift is Apple’s new programming language, said by many to ‘replace’ Objective-C. This is not the case. I’ve spent some time reverse engineering Swift binaries and the runtime, and I’ve found out quite a bit about it. So far, the verdict is this: Swift is Objective-C without messages.

Objects

Believe it or not, Swift objects are actually Objective-C objects. In a Mach-O binary, the __objc_classlist section contains data for each class in the binary. The structure is like so:

struct objc_class {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t vtable;
    uint64_t data;
};

(note: all structures are from 64-bit builds)

Note the data entry. It points to a structure listing the methods, ivars, protocols, etc. of the class. Normally, data is 8-byte-aligned, so its low bits are 0. However, for Swift classes, the lowest bit of data will be 1.
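As a minimal sketch of how a tool could use that flag, the C below tests the lowest bit of data and masks it off to recover the aligned address. The helper names (is_swift_class, class_data_address, SWIFT_CLASS_FLAG) are made up for illustration, and masking only bit 0 is an assumption based on the description above, not something taken from the runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as listed in the article (64-bit builds). */
struct objc_class {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t vtable;
    uint64_t data;
};

/* Assumption: bit 0 of `data` is the Swift marker described in the text. */
#define SWIFT_CLASS_FLAG ((uint64_t)1)

/* Returns true when the lowest bit of `data` is set. */
static bool is_swift_class(const struct objc_class *cls) {
    assert(cls != NULL);
    return (cls->data & SWIFT_CLASS_FLAG) != 0;
}

/* Clears the flag bit to recover the aligned address of the class data. */
static uint64_t class_data_address(const struct objc_class *cls) {
    assert(cls != NULL);
    return cls->data & ~SWIFT_CLASS_FLAG;
}
```

A disassembler script could call is_swift_class on each entry read from __objc_classlist and follow class_data_address instead of the raw data value.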

Classes

The actual structure for Swift classes is a bit odd. Swift classes have no Objective-C methods. We’ll get to that later. Variables for Swift classes are stored as ivars. The Swift getter and setter methods actually modify the ivar values. Oddly, ivars for Swift classes have no type encoding. The pointer that is normally supposed to point to the type encoding is NULL. This is presumably due to the fact that the Objective-C runtime is not supposed to deal with Swift variables itself.

Inheritance

Inheritance in Swift is as you would expect. In Swift, a Square that is a subclass of Shape will also be a subclass of Shape in the Objective-C class. However, what if a class in Swift doesn’t have a superclass?

e.g.

class Shape { }

In this case, the Shape class would be a subclass of SwiftObject. SwiftObject is a root Objective-C class, similar to NSObject. It has no superclass, meaning the isa points to itself. Its purpose is to use Swift runtime methods for things like allocation and deallocation, instead of the standard Objective-C runtime. For example, - (void)retain does not call objc_retain, but instead calls swift_retain.

Class Methods

Like I mentioned earlier, classes for Swift objects have no methods. Instead, they have been replaced with C++-like functions, mangling and all. This is likely why Swift has been said to be much faster than Objective-C; there is no more need for objc_msgSend to find and call method implementations.

In Objective-C, method implementations are like so:

type method(id self, SEL _cmd, id arg1, id arg2, ...)

Swift methods are very similar, but with a slightly different argument layout. self is passed as the last argument, and there is no selector.

type method(id arg1, id arg2, ..., id self)
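To make the difference in argument order concrete, here is a small sketch in plain C. Everything in it (mock_shape, the two typedefs, the scaled_sides functions) is a stand-in invented for illustration; it does not use real runtime types and only mirrors the two layouts described above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an object; a real object has a header before its ivars. */
struct mock_shape {
    long number_of_sides;
};

/* Objective-C style: self first, then the selector, then the arguments. */
typedef long (*objc_style_imp)(void *self, const char *cmd, long factor);

/* Swift style as described above: arguments first, self last, no selector. */
typedef long (*swift_style_fn)(long factor, void *self);

static long objc_scaled_sides(void *self, const char *cmd, long factor) {
    (void)cmd; /* the selector is not needed by the implementation itself */
    assert(self != NULL);
    return ((struct mock_shape *)self)->number_of_sides * factor;
}

static long swift_scaled_sides(long factor, void *self) {
    assert(self != NULL);
    return ((struct mock_shape *)self)->number_of_sides * factor;
}
```

When writing a hook, the practical consequence is that the replacement function must declare self as its final parameter, as the hooks later in this post do.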

vtable

Just like in C++, Swift classes have a vtable which lists the methods in the class. It is located directly after the class data in the binary, and looks something like this:

struct swift_vtable_header {
    uint32_t vtable_size;
    uint32_t unknown_000;
    uint32_t unknown_001;
    uint32_t unknown_002;
    void* nominalTypeDescriptor;
    // vtable pointers
};

From what I can tell, the vtable for a Swift class is only used when it is visible during compile time. Otherwise, it finds the mangled symbol.

Name Mangling

Swift keeps metadata about functions (and more) in their respective symbols, which is called name mangling. This metadata includes the function’s name (obviously), attributes, module name, argument types, return type, and more. Take this for example:

class Shape {
    func numberOfSides() -> Int {
        return 5
    }
}

For a method on this class named simpleDescription with the same signature (no arguments, returning Int), the mangled name is _TFC9swifttest5Shape17simpleDescriptionfS0_FT_Si. Here’s the breakdown:

_T – The prefix for all Swift symbols. Everything will start with this.

F – Function.

C – Function of a class. (method)

9swifttest – The module name, with a prefixed length.

5Shape – The class name the function belongs to, again, with a prefixed length.

17simpleDescription – The function name.

f – The function attribute. In this case it’s ‘f’, which is just a normal function. We’ll get to that in a minute.

S0_FT – I’m not exactly sure what this means, but it appears to mark the start of the arguments and return type.

‘_’ – This underscore separates the argument types from the return type. Since the function takes no arguments, it comes directly after S0_FT.

S – This is the beginning of the return type. The ‘S’ stands for Swift; the return type is a Swift builtin type. The next character determines the type.

i – This is the Swift builtin type. A lowercase ‘i’, which stands for Int.
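The length-prefixed parts of that breakdown are easy to pull apart mechanically. The sketch below is a hypothetical helper (read_identifier and split_method_symbol are names invented here), and it only handles the _TFC form shown above, where module, class, and function names appear back to back; it is not a full demangler.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Reads one length-prefixed identifier (e.g. "9swifttest") at *cursor.
 * Copies the name into out (NUL-terminated) and advances *cursor past it.
 * Returns 0 on success, -1 if the input is malformed or out is too small.
 */
static int read_identifier(const char **cursor, char *out, size_t out_size) {
    assert(cursor != NULL && *cursor != NULL && out != NULL);
    const char *p = *cursor;

    size_t len = 0;
    while (isdigit((unsigned char)*p)) {
        len = len * 10 + (size_t)(*p - '0');
        if (len >= out_size) return -1; /* too long; also guards overflow */
        p++;
    }
    if (len == 0 || strlen(p) < len) return -1;

    memcpy(out, p, len);
    out[len] = '\0';
    *cursor = p + len;
    return 0;
}

/*
 * Splits a symbol of the form _TFC<module><class><function>... into its
 * three names. Each buffer must hold name_size bytes. On success, *rest
 * points at the remainder (the function attribute and type information).
 */
static int split_method_symbol(const char *symbol, size_t name_size,
                               char *module, char *class_name, char *function,
                               const char **rest) {
    static const char prefix[] = "_TFC";
    assert(symbol != NULL && rest != NULL);
    if (strncmp(symbol, prefix, sizeof prefix - 1) != 0) return -1;

    const char *cursor = symbol + sizeof prefix - 1;
    if (read_identifier(&cursor, module, name_size) != 0) return -1;
    if (read_identifier(&cursor, class_name, name_size) != 0) return -1;
    if (read_identifier(&cursor, function, name_size) != 0) return -1;
    *rest = cursor;
    return 0;
}
```

For the symbol above this yields swifttest, Shape, and simpleDescription, with fS0_FT_Si left over. Accessor symbols place the attribute character before the property name, so they would need a slightly different splitter.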

Function Attributes

Character  Type
f          Normal Function
s          Setter
g          Getter
d          Destructor
D          Deallocator
c          Constructor
C          Allocator

Swift Builtins

Character  Type
a          Array
b          Bool
c          UnicodeScalar
d          Double
f          Float
i          Int
u          UInt
Q          ImplicitlyUnwrappedOptional
S          String

There’s a lot more to name mangling than just functions, but I’ve just given a brief overview.
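The two tables above translate directly into lookup functions. This is only a sketch: function_attribute_name and swift_builtin_name are hypothetical names, and the mappings cover nothing beyond the characters listed in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maps a function attribute character to its meaning, per the first table. */
static const char *function_attribute_name(char c) {
    switch (c) {
    case 'f': return "Normal Function";
    case 's': return "Setter";
    case 'g': return "Getter";
    case 'd': return "Destructor";
    case 'D': return "Deallocator";
    case 'c': return "Constructor";
    case 'C': return "Allocator";
    default:  return NULL; /* not listed in the table */
    }
}

/*
 * Maps a two-character type code such as "Si" to a builtin type name.
 * Returns NULL unless the code is 'S' followed by a listed character.
 */
static const char *swift_builtin_name(const char *code) {
    assert(code != NULL);
    if (strlen(code) < 2 || code[0] != 'S') return NULL;
    switch (code[1]) {
    case 'a': return "Array";
    case 'b': return "Bool";
    case 'c': return "UnicodeScalar";
    case 'd': return "Double";
    case 'f': return "Float";
    case 'i': return "Int";
    case 'u': return "UInt";
    case 'Q': return "ImplicitlyUnwrappedOptional";
    case 'S': return "String";
    default:  return NULL; /* not listed in the table */
    }
}
```

Passing the tail of a symbol, for example the Si at the end of the simpleDescription name, to swift_builtin_name gives back Int.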

Function Hooking

Enough with semantics, let’s get to the fun part! Let’s say we have a class like so:

class Shape {
    var numberOfSides: Int

    init() {
        numberOfSides = 5
    }
}

Let’s say we want to change the numberOfSides to 4. There are multiple ways to do this. We could use MobileSubstrate to hook into the getter method, and change the return value, like so:

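#include <dlfcn.h>
#include <substrate.h>

// Pointer to the Swift getter; self is its only (and last) argument.
int (*numberOfSides)(id self);

// Replacement getter: always report 4 sides.
MSHook(int, numberOfSides, id self) {
    return 4;
}

%ctor {
    // Look the getter up by its mangled name, then install the hook.
    numberOfSides = (int (*)(id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapeg13numberOfSidesSi");
    MSHookFunction(numberOfSides, MSHake(numberOfSides));
}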

If we create an instance of Shape and print out the value of numberOfSides, we see 4! That wasn’t so bad, was it? Now, I know what you’re thinking; “aren’t you supposed to return an object instead of a 4 literal?”

Well, in Swift, a lot of the builtin types are literals. An Int, for example, is the same as an int in C (although it could be a long – don’t hold me to that). A little note, the String type is a little bit odd; it’s a little-endian UTF-16 string, so no C literals can be used.

Let’s do the same thing, but this time, we’ll hook the setter instead of the getter.

void (*setNumberOfSides)(int newNumber, id self);

MSHook(void, setNumberOfSides, int newNumber, id self) {
    _setNumberOfSides(4, self);
}

%ctor {
    setNumberOfSides = (void (*)(int newNumber, id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapes13numberOfSidesSi");
    MSHookFunction(setNumberOfSides, MSHake(setNumberOfSides));
}

Try it again and... it’s still 5. What is happening, you ask? Well, in certain places in Swift, functions are inlined. The class constructor is one of these places. It directly sets the numberOfSides ivar. So, the setter will only be called if the number is set again from the top level code. Call it from there and, what do you know, we get 4.

Finally, let’s change numberOfSides by directly setting the ivar.

void (*setNumberOfSides)(int newNumber, id self);

MSHook(void, setNumberOfSides, int newNumber, id self) {
    MSHookIvar<int>(self, "numberOfSides") = 4;
}

%ctor {
    setNumberOfSides = (void (*)(int newNumber, id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapes13numberOfSidesSi");
    MSHookFunction(setNumberOfSides, MSHake(setNumberOfSides));
}

This works. It’s not recommended, but it works.

That’s all I have to write about for now. There are quite a few other things that I’m looking at, including witness tables, but I don’t know enough about them to write about them yet. A lot of things in this post are subject to change. They’re just what I’ve reverse engineered so far by looking at the runtime and binaries compiled with Swift.

What I’ve found here is very good. It means that MobileSubstrate will not die along with Objective-C, and tweaks can still be made! I wonder what the future has in store for the jailbreaking scene… maybe Logos could be updated to automatically mangle names? Or even a library that deals with common Swift types…
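As a rough idea of what automatic mangling could look like, the sketch below builds the accessor symbol names used in the dlsym calls earlier in this post. mangle_accessor is a hypothetical helper, and the pattern it emits (_TFC, module, class, attribute character, property, then S plus a builtin character) is inferred only from those two example symbols, so treat it as an assumption and not a complete mangler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the mangled symbol for a property accessor on a class:
 *   _TFC <len><module> <len><class> <attr> <len><property> S<builtin>
 * attr is 'g' (getter) or 's' (setter); builtin is a character from the
 * Swift Builtins table, e.g. 'i' for Int.
 * Returns 0 on success, -1 if attr is unsupported or out is too small.
 */
static int mangle_accessor(char *out, size_t out_size,
                           const char *module, const char *class_name,
                           char attr, const char *property, char builtin) {
    assert(out != NULL && module != NULL);
    assert(class_name != NULL && property != NULL);
    if (attr != 'g' && attr != 's') return -1;

    int written = snprintf(out, out_size, "_TFC%zu%s%zu%s%c%zu%sS%c",
                           strlen(module), module,
                           strlen(class_name), class_name,
                           attr,
                           strlen(property), property,
                           builtin);
    /* snprintf reports the full length; treat truncation as failure. */
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Called with swifttest, Shape, 'g', numberOfSides, and 'i', it reproduces the getter symbol from the first hook, and the result can be handed straight to dlsym.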

If you find out more about how Swift works, don’t hesitate to let me know!

posted @ 2014-06-11 20:07  Proteas