Why I wrote this article?
When I learn more about Android’s graphics system, and do more work about how to use CPU/GPU in more paralleled way to improve the graphics performance in Android, I start to think that there are actually some big design mistakes in Android graphics system, especially the rendering architecture in the client side.
Some mistakes have been solved after 3.x, especially above 4.1, but others can never be solved due to the compatible reason. As developers, we need to know how the Android graphics system work, how to utilize the new features Android 3.x and 4.x provided, and how to do the optimazation and overcome the shortage of Android.
Direct Rendering
In the very beginning, Android’s client/server architecture of its graphics system choose the direct rendering mode, Android team develop a lightweight Window Compositor – SurfaceFlinger by themself, it response to create native windows, allocate graphics buffers and composite the window surface to display (via framebuffer). In client side, apps (through Android runtime) can use CPU to access the graphics buffer of native window directly (through Skia), or use GPU to manipulate it indirectly (through GLES). This direct rendeing mode is more appropriate than the traditional X in desktop Linux.
Window Composite in Server Side
SurfaceFlinger use GLES1 to do the window composition (via texture operation) in the very beginning, this cause two issues :
- GPU may have more efficient way to do the window composition than use GLES;
- When the client side also use GLES to render the window surface, it may competes with the server about the GPU resources;
Just mention, when GPU do not support GLES1, Android has a built-in SW version GLES1 stack, and it can use copybit HAL module provided by chip vendor to accelerated the texture operation.
So, after 3.0, Android introduce hwcomposer HAL module to solve these issues, also abandon the old copybit module. Chip vendor can through the implementation of hwcomposer module to declare they can do all of or part of windows’ composition by themselves, usually with dedicated hardware and always more efficient than use GLES. Also, after 4.1, hwcomposer can provide the vsync signal of the display, so that Android can sync three things together :
- the windows composition in server side
- the rendering of window surface in client side
- the refresh of display
Rendering Architecture in Client Side
The most mistake Android team make is the rendering architecture of its GUI framework :
- It do not has a layer rendering architecture (or called scene graph in some GUI fw);
- It do not has a dedicated rendering thread to render the window surface;
- It’s rendering only use CPU until 3.0;
The first one is partially support after 3.0, the third is support after 3.0, but the second problem can never be solved…
Compare to iOS
In iOS, every View has a Layer as its backing store, app can create more Layers for better performance. View’s content will be drew into its Layer, as long as the content do not changed, the View do not need to draw itself again. iOS do a lot of things to avoid change the content of a View, many many properties of a View can be changed without affect to its content, such as background, border, opacity, position, transformation, even the geometry!!!
The composition of Layers done by another dedicated rendering thread, it always use GPU to draw each Layer to the window surface. The main thread only reponse to handle touch event, relayout the Views, draw View’s content into its Layer when necessary, etc… So the main thread only use CPU and the rendering thread use GPU mostly, and I think there will be just a few synchronization between these two threads, and they can run concurrently in most of time.
But in Android, the main thread need to do everything, such as handle touch events, relayout views, dequeue/enqueue graphics buffer, draw views’ content on window surface, and other things need to be done by app… And it only use CPU before 3.0!!! Even the position of a View just change one pixel, Android need to redraw its content along with the contents of other Views overlapped, and the content drawing is very expensive for CPU!!!
The Improvements
A lot improvements have been made after 3.0 to overcome the shortage of previous versions. Android 3.0 introduce a new hwui module, and it can use GPU to accelerated the drawing of View’s content, it create a hw accelerated Canvas to replace the old sw Canvas, the new Canvas use OpenGL ES as its backend instead of use SkCanvas from Skia.
Android 3.0 also introduce the DisplayList mechanism, DisplayList can be considered as a 2D drawing commands buffer, every View has its own DisplayList , and when its onDraw method called, all drawing commands issue through the input Canvas actually store into its own DisplayList. When every DisplayList are ready, Android will draw all the DisplayLists, it actually turn the 2D drawing commands into GLES commands to use GPU to render the window surface. So the rendering of View Hierarchy actually separated into two steps, generate View’s DisplayList, and then draw the DisplayLists, and the second one use GPU mostly.
When app invalidate a View, Android need to regenerate its DisplayList, but the overlapped View’s DisplayList can keep untouched Also, Android 4.1 introduce DisplayList properties, DisplayList now can has many properties such as opacity, transformation, etc…, and the changed of some properties of View will just cause changed of corresponding properties of its DisplayList and need not to regenerate it. These two improvements can save some times by avoid regenerate DisplayLists unnecessary.
Although Android can never has a layer rendering architecture, it actually introduce some Layer support after 3.0, a View can has a Layer as its backing store. The so called HW Layer actually back by a FBO, if the content of View is too complicated and unlikely to change in the future, use Layer may help. Also, when a View is animating (but content do not change), cache itself and its parent with Layers may also help. But use Layer with caution, because it increase the memory usage, if your want to use Layers during animation, your may need to release them when the animation is finish, Android 4.2 provide new animation API to help you about that. Also, because Android use GLES to draw the content of View, so most Views’ drawing will be fast enough, and the use of Layer may be unnecessary.
Android 4.0 also introduce a new type of native window – SurfaceTextureClient (back by a SurfaceTexture) and its Java wrapper TextureView, app can create and own this kind of native window and response to its composition. If the content of View is too complicated and continue to change, TextureView will be very helpful, app can use another thread to generate the content of TextureView and notify the main thread to update, and main thread can use the TextureView as a normal texture and draw it directly on the window of current Activity. (TextureView can also replace the usage of original SurfaceView and GLSurfaceView)
Android 4.1 make the touch event handling and ui drawing sync with the vsync signal of display, it also use triple buffers to avoid block the main thread too often because it need to wait the SurfaceFlinger to do the page flip and release the previous front buffer, and SurfaceFlinger will always sync with vsync signal of display.
The OpenGL Renderer for hw accelerated Canvas is continue be improved and become faster, especially for complicated shape drawing.
But…
But Android can never has a dedicated rendering thread… Although the drawing is much faster than before, and keep the changed of everything as little as possible during animating, it still need to share the 16ms interval with other jobs in main thread to achieve 60 fps.
So…
So, as developer, we need to utilize the improvements in higher version Android as much as possible :
- Turn on the GPU acceleration switch above Android 3.0;
- Use the new Animation Framework for your animation;
- Use Layer and TextureView when necessary;
- etc…
And avoid to block the main thread as much as possible, that means :
- If your handle the touch events too long, do it in another thread;
- If your need to load a file from sdcard or read data from database, do it in another thread;
- If your need to decode a bitmap, do it in another thread;
- If your View’s content is too complicated, use Layer, if it continue to change, use TextureView;
- Even your can use another standalone window (such as SurfaceView) as a overlay and render it in another thread;
- etc…