Dynamic Library Design Guidelines

https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/DynamicLibraries/100-Articles/DynamicLibraryDesignGuidelines.html

 

Specifying Your Library’s Interface

The most important aspect to define before implementing a dynamic library is its interface to its clients. The public interface affects how easily clients can use the library, how the library is developed and maintained, and how well the apps that use the library perform:

  • Ease of use: A library with a few easily understandable public symbols is far easier to use than one that exports all the symbols it defines.

  • Ease of maintenance: A library that has a small set of public symbols and an adequate set of private symbols is far easier to maintain because there are few client entry points to test. Also, developers can change the private symbols to improve the library in newer versions without affecting clients that were linked with an earlier version.

  • Performance: Designing a dynamic library so that it exports the minimum number of symbols reduces the time the dynamic loader takes to load the library into a process. The fewer exported symbols a library has, the faster the dynamic loader loads it.

The following sections show how to determine which of the library’s symbols to export, how to name them, and how to export them.

Deciding What Symbols to Export

Reducing the set of symbols your library exports makes the library easy to use and easy to maintain. With a reduced symbol set, the users of your library are exposed only to the symbols that are relevant to them. And with few public symbols, you are free to make substantial changes to the internal interfaces, such as adding or removing internal symbols that do not affect clients of your library.

Global variables should never be exported. Providing uncontrolled access to a library’s global variables leaves the library open to problems caused by clients assigning inappropriate values to those variables. It’s also difficult to make changes to global variables from one version of your library to another without making newer revisions incompatible with clients that were not linked with them. One of the main features of dynamic libraries is the fact that, when implemented correctly, clients can use newer versions of them without relinking. If clients need to access a value stored in a global variable, your library should export accessor functions but not the global variable itself. Adhering to this guideline allows library developers to change the definitions of global variables between versions of the library, without introducing incompatible revisions.
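As a minimal sketch of this guideline, assume a hypothetical library that once kept a timeout in whole seconds and now stores it internally in milliseconds. The names netlib_timeout_seconds and netlib_set_timeout_seconds are invented for illustration; the point is that the variable stays private while the exported accessor functions keep the same contract.

```c
/* Hypothetical example: all names and values are illustrative only. */

/* Private to this file: 'static' keeps the variable from being exported.
 * Its unit changed from seconds to milliseconds; clients cannot tell. */
static long long timeout_ms = 5000;

/* Exported accessor: still reports the timeout in seconds. */
int netlib_timeout_seconds(void) {
    return (int)(timeout_ms / 1000);
}

/* Exported mutator: rejects negative values instead of storing them.
 * Returns 0 on success and -1 if the value was rejected. */
int netlib_set_timeout_seconds(int seconds) {
    if (seconds < 0) {
        return -1;
    }
    timeout_ms = (long long)seconds * 1000;
    return 0;
}
```

Because clients only ever call the two functions, the storage type and unit of the variable can change again in a later version without breaking clients linked against this one.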

If your library needs the functionality implemented by functions it exports, you should consider implementing internal versions of the functions, adding wrapper functions to them, and exporting the wrappers. For example, your library may have a function whose arguments must be validated, but you’re certain that the library always provides valid values when invoking the function. The internal version of the function could be optimized by removing validation code from it, making internal use more efficient. The validation code can then be placed in the wrapper function, maintaining the validation process for clients. In addition, you can further change the internal implementation of the function to include more parameters, for example, while maintaining the external version the same.

Having wrapper functions call internal versions adds a small amount of overhead to each call, which matters most when clients call the function repeatedly. However, the advantages of flexible maintenance for you and a stable interface for your clients greatly outweigh this negligible performance impact.
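The wrapper pattern might look like the following sketch. The label_store and label_store_internal names and the 30-byte buffer are assumptions made for this example. The internal function takes an extra length parameter that clients never see, and it would be kept out of the exported symbols by one of the techniques described later in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LABEL_CAPACITY 30

static char label[LABEL_CAPACITY] = {'\0'};

/* Internal version: callers inside the library guarantee a non-NULL string
 * whose length fits the buffer, so no validation is repeated here.
 * The assert only documents that assumption in debug builds. */
void label_store_internal(const char *text, size_t length) {
    assert(text != NULL && length < LABEL_CAPACITY);
    memcpy(label, text, length);
    label[length] = '\0';
}

/* Exported wrapper: validates client input, then delegates.
 * Returns 0 on success and -1 if the input was rejected. */
int label_store(const char *text) {
    if (text == NULL) {
        return -1;
    }
    size_t length = strlen(text);
    if (length >= LABEL_CAPACITY) {
        return -1;
    }
    label_store_internal(text, length);
    return 0;
}

/* Exported accessor for the stored value. */
const char *label_get(void) {
    return label;
}
```

The signature of label_store_internal can change freely between versions, because only label_store and label_get form the client-facing interface.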

Naming Exported Symbols

The dynamic loader doesn’t detect naming conflicts between the symbols exported by the dynamic libraries it loads. When a client contains a reference to a symbol that two or more of its dependent libraries export, the dynamic loader binds the reference to the first dependent library that exports the symbol in the client’s dependent library list. The dependent library list is a list of the client’s dependent libraries in the order they were specified when the client was linked with them. Also, when the dlsym(3) function is invoked, the dynamic loader returns the address of the first symbol it finds in the specified scope (global, local, or next) with a matching name. For details on symbol-search scope, see Using Symbols.

To ensure that your library’s clients always have access to the symbols your library exports, the symbols must have unique names in a process’s namespace. One way is for apps to use two-level namespaces. Another is to add prefixes to every exported symbol. This is the convention used by most of the OS X frameworks, such as Carbon and Cocoa. For more information on two-level namespace, see Executing Mach-O Files in Mach-O Programming Topics.
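The prefix convention can be sketched as follows, for a hypothetical library called Geo. The GeoPoint type and geo_distance_squared function are invented for this example and are not part of any real framework.

```c
/* Hypothetical "Geo" library: every exported name carries a Geo/geo_ prefix
 * so that it is unlikely to collide with symbols from other libraries. */

typedef struct {
    double x;
    double y;
} GeoPoint;

/* An unprefixed name such as distance_squared() could easily clash with
 * another library's export; the prefixed name is far less likely to. */
double geo_distance_squared(GeoPoint a, GeoPoint b) {
    double dx = a.x - b.x;
    double dy = a.y - b.y;
    return dx * dx + dy * dy;
}
```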

Symbol Exporting Strategies

After you have identified the symbols you want to expose to your library’s users, you must devise a strategy for exporting them or for not exporting the rest of the symbols. This process is also known as setting the visibility of the symbols—that is, whether they are accessible to clients. Public or exported symbols are accessible to clients; private, hidden, or unexported symbols are not accessible to clients. In OS X, there are several ways of specifying the visibility of a library’s symbols:

  • The static storage class: This is the easiest way to indicate that you don’t want to export a symbol.

  • The exported symbols list or the unexported symbols list: The list is a file with the names of symbols to export or a list of symbols to keep private. The symbol names must include the underscore (_) prefix. You can use only one type of list when generating the dynamic library file.

  • The visibility attribute: You place this attribute in the definition of symbols in implementation files to set the visibility of symbols individually. It gives you more granular control over which symbols are public or private.

  • The compiler -fvisibility command-line option: This option specifies at compilation time the visibility of symbols with unspecified visibility in implementation files. This option, combined with the visibility attribute, is the most safe and convenient way of identifying public symbols.

  • The weak_import attribute: Placing this attribute in the declaration of a symbol in a header file tells the compiler to generate a weak reference to the symbol. This feature is called weak linking; symbols with the weak_import attribute are called weakly linked symbols. With weak linking, clients do not fail to launch when the version of the dependent library found at launch time or load time doesn’t export a weakly linked symbol referenced by the client. It’s important to place the weak_import attribute in the header files that the source files of the library’s clients use, so that the client developers know that they must ensure the existence of the symbol before using it. Otherwise, the client would crash or function incorrectly when it attempts to use the symbol. See Using Weakly Linked Symbols for further details on weakly linked symbols. For more information on symbol definitions, see Executing Mach-O Files in Mach-O Programming Topics.

  • The -weak_library command-line option: This option, which is passed through to the static linker when a client is linked, marks the specified library and all references to its exported symbols as weakly linked.

To illustrate how to set the visibility of a library’s symbols, let’s start with a dynamic library that allows its clients to set a value kept in a global variable in the library, and to retrieve the value. Listing 1 shows the code that makes up the library.

Listing 1  A simple dynamic library

/* File: Person.h */
char* name(void);
void set_name(char* name);
 
/* File: Person.c */
#include "Person.h"
#include <string.h>
char _person_name[30] = {'\0'};
char* name(void) {
    return _person_name;
}
 
void _set_name(char* name) {
   strcpy(_person_name, name);
}
 
void set_name(char* name) {
    if (name == NULL) {
        _set_name("");
    }
    else {
        _set_name(name);
    }
}

The intent of the library’s developer is to provide clients the ability to set the value of _person_name with the set_name function and to let them obtain the value of the variable with the name function. However, the library exports more than the name and set_name functions, as shown by the output of the nm command-line tool:

% clang -dynamiclib Person.c -o libPerson.dylib
% nm -gm libPerson.dylib
                 (undefined) external ___strcpy_chk (from libSystem)
0000000000001020 (__DATA,__common) external __person_name     // Inadvertently exported
0000000000000e80 (__TEXT,__text) external __set_name          // Inadvertently exported
0000000000000e70 (__TEXT,__text) external _name
0000000000000ec0 (__TEXT,__text) external _set_name
                 (undefined) external dyld_stub_binder (from libSystem)

Note that the _person_name global variable and the _set_name function are exported along with the name and set_name functions. There are many options to remove _person_name and _set_name from the symbols exported by the library. This section explores a few.

The first option is to add the static storage class to the definitions of _person_name and _set_name in Person.c, as shown in Listing 2.

Listing 2  Person module hiding a symbol with the static storage class

/* File: Person.c */
#include "Person.h"
#include <string.h>
 
static char _person_name[30] = {'\0'};        // Added 'static' storage class
char* name(void) {
    return _person_name;
}
 
static void _set_name(char* name) {           // Added 'static' storage class
   strcpy(_person_name, name);
}
 
void set_name(char* name) {
    if (name == NULL) {
        _set_name("");
    }
    else {
        _set_name(name);
    }
}

Now the nm output looks like this:

                 (undefined) external ___strcpy_chk (from libSystem)
0000000000000e80 (__TEXT,__text) external _name
0000000000000e90 (__TEXT,__text) external _set_name
                 (undefined) external dyld_stub_binder (from libSystem)

This means that the library exports only name and set_name. The listing also shows some undefined symbols, including ___strcpy_chk. These are not exports; they are references to symbols the library obtains from its dependent libraries.

Note: You should always use the static storage class for symbols that you want to keep private for a specific file. It’s a very effective fail-safe measure against inadvertently exposing symbols that should be hidden from clients.

The problem with this approach is that it hides the internal _set_name function from other modules in the library. If the library’s developer trusts that any internal call to _set_name doesn’t need to be validated but wants to validate all client calls, the symbol must be visible to other modules within the library but not to the library’s clients. Therefore, the static storage class is not appropriate when a symbol must be hidden from clients but visible to all the library’s modules.

A second option for exposing only the symbols intended for client use is to have an exported symbols file that lists the symbols to export; all other symbols are hidden. Listing 3 shows the export_list file.

Listing 3  File listing the names of the symbols to export

# File: export_list
_name
_set_name

To compile the library, you use the clang -exported_symbols_list option to specify the file containing the names of the symbols to export, as shown here:

% clang -dynamiclib Person.c -exported_symbols_list export_list -o libPerson.dylib

The third and most convenient option for exposing only name and set_name is to set the visibility attribute in their implementations to "default" and set the -fvisibility compiler command-line option to hidden when compiling the library’s source files. Listing 4 shows how the Person.c file looks after setting the visibility attribute for the symbols to be exported.

Listing 4  Person module using visibility attribute to export symbols

/* File: Person.c */
#include "Person.h"
#include <string.h>
 
// Symbolic name for visibility("default") attribute.
#define EXPORT __attribute__((visibility("default")))
 
char _person_name[30] = {'\0'};
 
EXPORT                        // Symbol to export
char* name(void) {
    return _person_name;
}
 
void _set_name(char* name) {
   strcpy(_person_name, name);
}
 
EXPORT                        // Symbol to export
void set_name(char* name) {
    if (name == NULL) {
        _set_name("");
    }
    else {
        _set_name(name);
    }
}

The library would then be compiled using the following command:

% clang -dynamiclib Person.c -fvisibility=hidden -o libPerson.dylib

The -fvisibility=hidden command-line option tells the compiler to set the visibility of any symbols without a visibility attribute to hidden, thereby hiding them from the library’s clients. For details on the visibility attribute and the -fvisibility command-line option, see http://gcc.gnu.org/onlinedocs/gcc and the clang man page.
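The same attribute can also mark individual symbols as hidden, which may be useful when a project cannot be compiled with -fvisibility=hidden throughout. The following is a sketch under that assumption; the EXPORT and INTERNAL macros and the counter functions are hypothetical names chosen for illustration. A hidden symbol remains callable from the library’s other source files but is not exported to clients.

```c
/* Hypothetical macros and function names, for illustration only. */
#define EXPORT   __attribute__((visibility("default")))
#define INTERNAL __attribute__((visibility("hidden")))

/* File-private state: 'static' already keeps it out of the exports. */
static int counter = 0;

/* Hidden: other source files in the library can call this function,
 * but it is not exported to clients. */
INTERNAL
void counter_add_unchecked(int amount) {
    counter += amount;
}

/* Exported: validates the argument before delegating.
 * Returns 0 on success and -1 if the amount is negative. */
EXPORT
int counter_add(int amount) {
    if (amount < 0) {
        return -1;
    }
    counter_add_unchecked(amount);
    return 0;
}

/* Exported accessor for the current value. */
EXPORT
int counter_value(void) {
    return counter;
}
```

Running nm -gm on a library built from this file should list counter_add and counter_value as external symbols but not counter_add_unchecked.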

Following these symbol-exporting guidelines ensures that libraries export only the symbols you want to make available to your clients, simplifying the use of the library by its clients and facilitating its maintenance by its developers. The document How to Write Shared Libraries provides an in-depth analysis of symbol exporting strategies. This document is available at http://people.redhat.com/drepper/dsohowto.pdf.

Locating External Resources

When you need to locate resources your library or program needs at runtime—such as frameworks, images, and so on—you can use either of the following methods:

  • Executable-relative location. To specify a file path relative to the location of the main executable, not the referencing library, place the @executable_path macro at the beginning of the path. For example, in an app package that contains private frameworks (which, in turn, contain shared libraries), any of the libraries can locate an app resource called MyImage.tiff inside the package by specifying the path @executable_path/../Resources/MyImage.tiff. Because @executable_path resolves to the binary inside the MacOS directory in the app bundle, the resource file path must specify the Resources directory as a subdirectory of the MacOS parent directory (the Contents directory). For a detailed discussion of directory bundles, see Bundle Programming Guide.

  • Library-relative location. To specify a file path relative to the location of the library itself, place the @loader_path macro at the beginning of the pathname. Library-relative location allows you to locate library resources within a directory hierarchy regardless of where the main executable is located.

Library Dependencies

When you develop a dynamic library, you specify its dependent libraries by linking your source code with them. When a client of your library tries to load it, your library’s dependent libraries must be present in the file system for your library to load successfully. (See Run-Path Dependent Libraries to learn about installing dependent libraries in a relocatable directory.) Depending on how the client loads your library, some or all of your library’s references to symbols exported by its dependent libraries are resolved. You should consider using the dlsym(3) function to get the address of symbols when they are needed instead of having references that may always have to be resolved at load time. See Using Symbols for details.

The more dependent libraries your library has, the longer it takes for your library to load. Therefore, you should link your library only with those dynamic libraries required at load time. After you compile your library, you can view its dependent libraries in a shell with the otool -L command.

Any dynamic libraries your library seldom uses or whose functionality is needed only when performing specific tasks should be used as runtime loaded libraries; that is, they should be opened with the dlopen(3) function. For example, when a module in your library needs to perform a task that requires the use of a nondependent library, the module should use dlopen to load the library, use the library to perform its task, and close the library with dlclose(3) when finished. For additional information on loading libraries at runtime, see Opening Dynamic Libraries.
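The open, use, and close sequence described above could be sketched as follows. The run_optional_task helper, the task_function signature, and any library or symbol names passed to it are assumptions made for this example; only dlopen, dlsym, dlerror, and dlclose are real functions, all declared in dlfcn.h.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature assumed for the function being looked up; a real library
 * would document its own. */
typedef int (*task_function)(void);

/* Opens the library at library_path, calls the function named symbol_name,
 * stores its return value in *result, and closes the library again.
 * Returns 0 on success and -1 if the library or symbol is unavailable. */
int run_optional_task(const char *library_path, const char *symbol_name,
                      int *result) {
    void *handle = dlopen(library_path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        const char *message = dlerror();
        fprintf(stderr, "dlopen failed: %s\n",
                message ? message : "unknown error");
        return -1;
    }

    task_function task = (task_function)dlsym(handle, symbol_name);
    if (task == NULL) {
        const char *message = dlerror();
        fprintf(stderr, "dlsym failed: %s\n",
                message ? message : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *result = task();   /* Use the library only for this one task. */
    dlclose(handle);    /* Release it as soon as the task is done. */
    return 0;
}
```

Because the library is opened only when the task runs, a missing library causes a recoverable error at that point rather than preventing your own library from loading.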

You should also keep to a minimum the number of external references to symbols in dependent libraries. This practice further reduces your library’s load time.

You must disclose to your library’s users all the libraries your library uses and whether they are dependent libraries. When users of your dynamic library link their images, the static linker must be able to find all your library’s dependent libraries, either through the link line or symbolic links. Also, because your dynamic library loads successfully even when some or all the libraries it opens at runtime are not present at load time, users of your library must know which dynamic libraries your library opens at runtime and under which circumstances. Your library’s users can use that information when investigating unexpected behavior by your library.

Module Initializers and Finalizers

When dynamic libraries are loaded, they may need to prepare resources or perform special initialization before doing anything else. Conversely, when the libraries are unloaded, they may need to perform some finalization processes. These tasks are performed by initializer functions and finalizer functions, also called constructors and destructors.

Note: Apps can also define and use initializers and finalizers. However, this section focuses on their use in dynamic libraries.

Initializers can safely use symbols from dependent libraries because the dynamic loader executes the static initializers of an image’s dependent libraries before invoking the image’s static initializers.

You indicate that a function is an initializer by adding the constructor attribute to its definition. The destructor attribute identifies finalizer functions. Initializers and finalizers must not be exported. A dynamic library’s initializers are executed in the order they are encountered by the compiler. Its finalizers, on the other hand, are executed in the reverse of that order.

For example, Listing 5 shows a set of initializers and finalizers defined identically in two files File1.c and File2.c in a dynamic library called Inifi.

Listing 5  Inifi initializer and finalizer definitions

/* Files: File1.c, File2.c */
#include <stdio.h>
__attribute__((constructor))
static void initializer1() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}
 
__attribute__((constructor))
static void initializer2() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}
 
__attribute__((constructor))
static void initializer3() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}
 
__attribute__((destructor))
static void finalizer1() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}
 
__attribute__((destructor))
static void finalizer2() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}
 
__attribute__((destructor))
static void finalizer3() {
    printf("[%s] [%s]\n", __FILE__, __FUNCTION__);
}

Continuing the example, the Inifi dynamic library is the sole dependent library of the Trial program, generated from the Trial.c file, shown in Listing 6.

Listing 6  The Trial.c file

/* Trial.c */
#include <stdio.h>
int main(int argc, char** argv) {
    printf("[%s] [%s] Finished loading. Now quitting.\n", __FILE__, __FUNCTION__);
    return 0;
}

Listing 7 shows the output produced by the Trial app.

Listing 7  Execution order of a dynamic library’s initializers and finalizers

% clang -dynamiclib File1.c File2.c -fvisibility=hidden -o libInifi.dylib
% clang Trial.c libInifi.dylib -o trial
% ./trial
[File1.c] [initializer1]
[File1.c] [initializer2]
[File1.c] [initializer3]
[File2.c] [initializer1]
[File2.c] [initializer2]
[File2.c] [initializer3]
[Trial.c] [main] Finished loading. Now quitting.
[File2.c] [finalizer3]
[File2.c] [finalizer2]
[File2.c] [finalizer1]
[File1.c] [finalizer3]
[File1.c] [finalizer2]
[File1.c] [finalizer1]

Although you can have as many static initializers and finalizers in an image as you want, you should consolidate your initialization and finalization code into one initializer and one finalizer per module, as needed. You may also choose to have one initializer and one finalizer per library.
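A consolidated module might look like the following sketch. The log and cache subsystems and all function names are hypothetical. Using a single initializer and a single finalizer makes the setup and teardown order explicit in the code instead of relying on the order in which the compiler encounters several attributed functions.

```c
/* Hypothetical module with two subsystems; all names are illustrative. */

static int log_ready = 0;
static int cache_ready = 0;

static void log_setup(void)      { log_ready = 1; }
static void cache_setup(void)    { cache_ready = log_ready; } /* needs the log */
static void cache_teardown(void) { cache_ready = 0; }
static void log_teardown(void)   { log_ready = 0; }

/* The module's only initializer: sets up subsystems in dependency order.
 * It is static, so it is not exported. */
__attribute__((constructor))
static void module_initializer(void) {
    log_setup();
    cache_setup();
}

/* The module's only finalizer: tears down in the reverse order. */
__attribute__((destructor))
static void module_finalizer(void) {
    cache_teardown();
    log_teardown();
}

/* Lets the rest of the library check that initialization has run. */
int module_is_ready(void) {
    return log_ready && cache_ready;
}
```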

In OS X v10.4 and later, static initializers can access the arguments given to the current program. By defining the initializer’s parameters as you would define the parameters to a program’s main function, you can get the number of arguments given, the arguments themselves, and the process’s environment variables. In addition, to guard against an initializer or finalizer being called twice, you should conditionalize your initialization and finalization code inside the function. Listing 8 shows the definition of a static initializer that has access to the program’s arguments and conditionalizes its initialization code.

Listing 8  Definition of a static initializer

__attribute__((constructor))
static void initializer(int argc, char** argv, char** envp) {
    static int initialized = 0;   // Explicit type; implicit int is not valid in modern C
    if (!initialized) {
        // Initialization code.
        initialized = 1;
    }
}

Note: Some operating systems support a naming convention for initializers and finalizers, _init and _fini. This convention is not supported in OS X.

Posted 2015-03-19 10:57 by WeChat official account 共鸣圈