A Brief Look at Analyzing Google Certification Failures

1. Overview

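Google certification for an Android TV set-top box project comprises nine test suites: CTS, GTS, STS, VTS, CTS-ON-GSI, TVTS, SmokeTest, CtsVerifier, and BTS. For the detailed requirements, see the documents GTVS Requirements and ATV Help#Android TV Certification.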

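This article focuses on one technical step in the certification process of an Android TV project: analyzing xTS failures. The suites differ in what they test, how they test it, and how their test cases are implemented, but the analysis workflow is always the same: confirm the failure, confirm what the test checks, then fix it.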

P.S. BTS is not covered in this article.

2. Confirming the failures

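When a test report comes in, do developers need to analyze every failure in it? Usually not.

In practice many failures are environment-related: no camera or HDMI connected, no Wi-Fi network with IPv6 support or a poor network, no factory reset before the retry, and so on. There are also waiver items, failures that Google accepts, generally because of a bug in Google's own test case.

Here are a few environment-related failures that show up often: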

armeabi-v7a CtsMediaTestCases
Test Result Details
android.media.cts.DecodeAccuracyTest#testSurfaceViewLargerHeightDecodeAccuracy[22] fail junit.framework.AssertionFailedError: With the best matched border crop (0.0, 0.0), greatest pixel difference is 167 at (60, 32) which is over the allowed difference 90
armeabi-v7a CtsNetTestCases
Test Result Details
android.net.cts.DnsTest#testDnsWorks fail junit.framework.AssertionFailedError: [RERUN] DNS could not resolve www.google.com. Check internet connection
armeabi-v7a CtsCameraTestCases
Test Result Details
android.hardware.camera2.cts.CameraManagerTest#testCameraManagerGetDeviceIdList fail junit.framework.AssertionFailedError: External camera is not connected on device with FEATURE_CAMERA_EXTERNAL
android.hardware.cts.CameraTest#testCameraExternalConnected fail junit.framework.AssertionFailedError: Devices with external camera support must have a camera connected for testing
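
A first pass over a report can be partly automated by matching failure messages against known environment hints. The following is a minimal sketch, not an official tool: the keyword table `ENV_HINTS` and the function `classify_failure` are hypothetical, and the keywords are simply lifted from the messages quoted above.

```python
# Illustrative keyword table built from the failure messages quoted above.
# It is an assumption for this sketch, not a list published by Google.
ENV_HINTS = {
    "Check internet connection": "network",
    "External camera is not connected": "camera",
    "must have a camera connected": "camera",
}


def classify_failure(message):
    """Return the suspected environment cause, or None if no hint matches."""
    for hint, cause in ENV_HINTS.items():
        if hint in message:
            return cause
    return None


failures = [
    "junit.framework.AssertionFailedError: [RERUN] DNS could not resolve "
    "www.google.com. Check internet connection",
    "junit.framework.AssertionFailedError: External camera is not connected "
    "on device with FEATURE_CAMERA_EXTERNAL",
    "junit.framework.AssertionFailedError: Pattern length pattern:499995000, "
    "actual:1192958",
]
causes = [classify_failure(message) for message in failures]
print(causes)
```

A result of `None` only means that no hint matched; such a failure still has to be looked at by a person.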

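Below is a current waiver for the CTS-on-GSI run; it applies to the 2020-10 system.img with the VTS_9.0_R14 tool. Waivers are updated continuously: once vendors report an issue, Google fixes it in the next release, so keep your list in sync.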

armeabi-v7a CtsOsTestCases
Test Result Details
android.os.cts.LocaleListTest#testRepeatedArguments fail junit.framework.AssertionFailedError: expected: but was:

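For automated suites such as CTS, fix the environment as far as possible, then factory-reset and retry until the number of failures stops dropping. What remains can be treated as real failures, and the report and logs can be sent to the developers of the corresponding modules.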
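
The "retry until the count stops dropping" rule can be written down as a small loop. This is only a sketch under stated assumptions: `run_retry` is a hypothetical callback standing in for a factory reset plus a retry run of the suite, and `max_rounds` is an arbitrary safety limit.

```python
def retry_until_stable(run_retry, initial_failures, max_rounds=10):
    """Re-run failed tests until a round no longer reduces the failure set.

    run_retry is a hypothetical callback: it receives the set of currently
    failing test names, re-runs them (after a factory reset, in a corrected
    environment), and returns the set that still fails.
    """
    failures = set(initial_failures)
    for _ in range(max_rounds):
        if not failures:
            break
        remaining = set(run_retry(failures))
        if len(remaining) >= len(failures):
            break  # no improvement: treat what is left as real failures
        failures = remaining
    return failures


# Simulated device: the network test passes on retry, the real bug stays.
def fake_retry(failing):
    return {name for name in failing if name != "DnsTest#testDnsWorks"}


real = retry_until_stable(
    fake_retry, {"DnsTest#testDnsWorks", "ConsumerIrTest#test_timing"})
print(real)
```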

The smoke test and CtsVerifier are manual tests. If an item fails there, it is almost always a real problem and should be analyzed right away.

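3. Background knowledge before analysis

3.1 Tools, source code, and form of the xTS tests

The automated suites all work in a similar way: tests are installed on the Android device as APKs and run there. Some STS tests push binaries to the device and execute them, and VTS runs commands to collect low-level information.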

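| Test | Where to get the tool | Where to get the source | Test form | Purpose |
|---|---|---|---|---|
| CTS | CTS suite, public download. See: Compatibility Test Suite Downloads | AOSP | Runs on the device as JUnit tests and APKs; includes unit tests and functional tests. See: Types of test cases | Android platform compatibility. CDD + Android SDK/NDK/APIs |
| STS | STS suite, not public; released together with the security patches. See: Security Test Suite | Partly in AOSP | Device-side JUnit tests, plus host-side binaries pushed to the device. See: Security Test Suite#Types of test cases | Tests security vulnerabilities; updated monthly together with the security patches. Security patch compliance |
| VTS | VTS suite, not public. See: Vendor Test Suite (VTS) and Generic System Image (GSI) | AOSP | Mainly executes shell commands. See: Device shell commands | After the GSI is flashed, tests everything below the framework: HAL, drivers, and kernel. Treble compliance |
| CTS-ON-GSI | Same as VTS | AOSP | Same as CTS | Runs CTS once more after the GSI is flashed. Treble compliance |
| GTS | GTS suite, not public. See: Downloading and Running GTS | Obtained by decompiling | Same as CTS | Verifies that the GMS applications are integrated correctly. Platform implementation for GTVS services |
| TVTS | TVTS suite, not public. See: TV Test Suite | Obtained by decompiling | Same as CTS | Verifies the performance of the GMS applications. Performance |
| CtsVerifier | CtsVerifier package, public download. See: Compatibility Test Suite Downloads | AOSP | Install CtsVerifier.apk and test item by item; semi-automatic | Supplement to CTS for checks that need human judgment. CDD + Android SDK/NDK/APIs |
| SmokeTest | Spreadsheet, not public. See: Smoke Test plan; YouTube and Play Movies Video Test Pack | | Manual testing, item by item, following the spreadsheet | Verifies that basic functions work. Android TV functional; YouTube & Play Movies compatibility & performance |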

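3.2 From the report to the test source

3.2.1 Device-side test cases packaged as APKs

Most test cases take this form: they are built into a test APK, pushed to the Android device, run there, and report their results back. Their source can be located as follows.

Take the camera failure from section 2 as an example and look up the code for the matching tag. Two sites are useful here: https://android.googlesource.com and https://cs.android.com. I usually search for the test class or method name on cs.android.com and, once I have the path, open the exact code for the tag on googlesource.

Alternatively, keep a local AOSP checkout and switch the relevant sub-repository to the matching branch or tag whenever needed.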

  • CtsCameraTestCases is the module name. A module is usually built into an APK or JAR inside the test suite. In the suite directory:

    ./android-cts/testcases/CtsCameraTestCases.apk
    ./android-cts/testcases/CtsCameraTestCases.config


  • android.hardware.camera2.cts.CameraManagerTest is the package name plus the class name of the test case. In the AOSP tree:

    platform/cts/tests/camera/src/android/hardware/camera2/cts$


  • testCameraManagerGetDeviceIdList is the test method name.

    CameraManagerTest.java
    140:    public void testCameraManagerGetDeviceIdList() throws Exception {


  • Locate the line that throws.

    Test cases written in Java print the call stack of the exception. Look at two files in the log and result directories: logs/2020.11.10_15.03.26/inv_xxx/host_log_xxx.txt and results/2020.11.10_15.03.26/test_result.xml

        <TestCase name="android.hardware.camera2.cts.CameraManagerTest">
          <Test result="pass" name="testCameraManagerGetCameraCharacteristics" />
          <Test result="fail" name="testCameraManagerGetDeviceIdList">
            <Failure message="junit.framework.AssertionFailedError: External camera is not connected on device with FEATURE_CAMERA_EXTERNAL&#13;">
              <StackTrace>junit.framework.AssertionFailedError: External camera is not connected on device with FEATURE_CAMERA_EXTERNAL
    	at junit.framework.Assert.fail(Assert.java:50)
    	at junit.framework.Assert.assertTrue(Assert.java:20)
    	at android.hardware.camera2.cts.CameraManagerTest.testCameraManagerGetDeviceIdList(CameraManagerTest.java:172)
    	at java.lang.reflect.Method.invoke(Native Method)
    	at junit.framework.TestCase.runTest(TestCase.java:168)
    	at junit.framework.TestCase.runBare(TestCase.java:134)
    	at junit.framework.TestResult$1.protect(TestResult.java:115)
    	at androidx.test.internal.runner.junit3.AndroidTestResult.runProtected(AndroidTestResult.java:73)
    	at junit.framework.TestResult.run(TestResult.java:118)
    	at androidx.test.internal.runner.junit3.AndroidTestResult.run(AndroidTestResult.java:51)
    	at junit.framework.TestCase.run(TestCase.java:124)
    	at androidx.test.internal.runner.junit3.NonLeakyTestSuite$NonLeakyTest.run(NonLeakyTestSuite.java:62)
    	at androidx.test.internal.runner.junit3.AndroidTestSuite$2.run(AndroidTestSuite.java:101)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:458)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    	at java.lang.Thread.run(Thread.java:764)
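
The lookup described above (class, method, then the throwing line) can be scripted against test_result.xml. The following is a rough sketch that relies only on the element names visible in the excerpt (`TestCase`, `Test`, `Failure`, `StackTrace`); the `<Result>` wrapper in the sample and the helper name `failed_tests` are assumptions made so that the snippet runs on its own.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of the excerpt above. The <Result> wrapper is an
# assumption made so the snippet parses on its own.
SAMPLE = """<Result>
  <TestCase name="android.hardware.camera2.cts.CameraManagerTest">
    <Test result="pass" name="testCameraManagerGetCameraCharacteristics" />
    <Test result="fail" name="testCameraManagerGetDeviceIdList">
      <Failure message="junit.framework.AssertionFailedError: External camera is not connected">
        <StackTrace>junit.framework.AssertionFailedError: External camera is not connected
\tat junit.framework.Assert.fail(Assert.java:50)
\tat android.hardware.camera2.cts.CameraManagerTest.testCameraManagerGetDeviceIdList(CameraManagerTest.java:172)
</StackTrace>
      </Failure>
    </Test>
  </TestCase>
</Result>"""


def failed_tests(xml_text):
    """Yield (class, method, file, line) for every failed <Test>."""
    root = ET.fromstring(xml_text)
    for case in root.iter("TestCase"):
        cls = case.get("name")
        for test in case.iter("Test"):
            if test.get("result") != "fail":
                continue
            method = test.get("name")
            trace = test.findtext("Failure/StackTrace", default="")
            # First stack frame that belongs to the test method itself.
            frame = re.search(
                re.escape(cls + "." + method) + r"\((\w+\.java):(\d+)\)", trace)
            where = (frame.group(1), int(frame.group(2))) if frame else (None, None)
            yield (cls, method) + where


results = list(failed_tests(SAMPLE))
for cls, method, src, line in results:
    print(f"{cls}#{method} -> {src}:{line}")
```

The file name and line number from the matching frame are what you open at the corresponding tag in the source.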
    

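What about GTS and TVTS? They are not open source, so the only option is to decompile the relevant module's APK or JAR from the test suite and read the test code. I usually just drag it into jadx-gui, which gives navigation and line numbers.

3.2.2 Host-side test cases

Some VTS test cases are typical examples. Consider this one.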

armeabi-v7a VtsKernelProcFileApi
Test Result Details
VtsKernelProcFileApi#testProcMemInfoTest fail Failed to parse! Failed to parse line MemTotal: 2056324 kB according to rule {:name}: {:lu}
  • Locate the test module

    :~/platform/test/vts-testcase/kernel/api/proc$

    Confirm the module name **VtsKernelProcFileApi**

    test/vts-testcase/kernel/api/proc$ ag VtsKernelProcFileApi
    Android.bp
    18:    name: "VtsKernelProcFileApi",
    
    AndroidTest.xml
    26:        <option name="test-module-name" value="VtsKernelProcFileApi" />
    27:        <option name="test-case-path" value="vts/testcases/kernel/api/proc/VtsKernelProcFileApiTest" />
    
    VtsKernelProcFileApiTest.py
    123:class VtsKernelProcFileApiTest(base_test.BaseTestClass):
    
  • Locate the test method

    ag ProcMemInfoTest
    ProcMemInfoTest.py
    35:class ProcMemInfoTest(KernelProcFileTestBase.KernelProcFileTestBase):
    
    VtsKernelProcFileApiTest.py
    32:from vts.testcases.kernel.api.proc import ProcMemInfoTest
    63:    ProcMemInfoTest.ProcMemInfoTest(),
    
  • Locate the line that raises the error

    test/vts-testcase/kernel/api/proc$ ag "Failed to parse line"
    KernelProcFileTestBase.py
    159:            raise SyntaxError("Failed to parse line %s according to rule %s" %
    

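    Unlike the case in 3.2.1, this test case does not run on the device, so there is no crash call stack; searching for the error message is how you find the line.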
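
Where `ag` is not available, the same message search is easy to reproduce. The following is a minimal sketch, assuming a local checkout; `find_message` is a hypothetical helper, and the demo file is a stand-in written to a temporary directory, not the real KernelProcFileTestBase.py.

```python
import os
import tempfile


def find_message(root, needle, suffix=".py"):
    """Return (relative path, line number) for each line containing needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((os.path.relpath(path, root), number))
    return hits


# Demo on a throwaway directory; the file content is a stand-in, not the real
# KernelProcFileTestBase.py.
with tempfile.TemporaryDirectory() as tree:
    with open(os.path.join(tree, "KernelProcFileTestBase.py"), "w") as out:
        out.write("def parse(line, rule):\n"
                  "    raise SyntaxError('Failed to parse line %s' % line)\n")
    hits = find_message(tree, "Failed to parse line")
print(hits)
```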

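4. A few analysis cases

We now have the report and logs, a device and scenario that reproduce the failure 100% of the time, and the source code. Nothing stands in the way of analyzing the bug. From here on the process is the same as for any other Android bug, or indeed any other problem in programming.

Next, we walk through several xTS failures to get a concrete feel for the analysis.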

4.1 cts

Suite / Plan CTS / cts
Suite / Build 10_r5 / 6723298

armeabi-v7a CtsHardwareTestCases
Test Result Details
android.hardware.consumerir.cts.ConsumerIrTest#test_timing fail junit.framework.AssertionFailedError: Pattern length pattern:499995000, actual:1192958
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.assertTrue(Assert.java:20)
at android.hardware.consumerir.cts.ConsumerIrTest.test_timing(ConsumerIrTest.java:94)
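  • Conclusion first.

    This failure appeared on Android Q. The BSP had assigned the relevant GPIO to another requirement, so the driver for consumerIr, the infrared transmit function, was never implemented. The test case calls the transmit API and expects the call to take about as long as the pattern itself: the measured time must be between 0.5 and 1.5 times the expected value. Because our driver is an empty implementation, the call returns with an error almost immediately, the elapsed time is far too short, and the test fails.

  • Solution.

    There are two obvious options. We do not need infrared transmission, so the feature can be removed, which is what we did. The other option is to actually implement it at the lower layers.

  • Detailed analysis

    Test code: cts/tests/tests/hardware/src/android/hardware/consumerir/cts/ConsumerIrTest.java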

        public void test_timing() {
            if (!mHasConsumerIr) {
                // Skip the test if consumer IR is not present.
                return;
            }
    
            ConsumerIrManager.CarrierFrequencyRange[] freqs = mCIR.getCarrierFrequencies();
            // Transmit two seconds for min and max for each frequency range
            int[] pattern = {11111, 22222, 33333, 44444, 55555, 66666, 77777, 88888, 99999};
            long totalXmitTimeNanos = 0; // get the length of the pattern
            for (int slice : pattern) {
                totalXmitTimeNanos += slice * 1000; // add the time in nanoseconds
            }
            double margin = 0.5; // max fraction xmit is allowed to be off timing
    
            for (ConsumerIrManager.CarrierFrequencyRange range : freqs) {
                // test min freq
                long currentTime = SystemClock.elapsedRealtimeNanos();
                mCIR.transmit(range.getMinFrequency(), pattern);
                long newTime = SystemClock.elapsedRealtimeNanos();
                String msg = String.format("Pattern length pattern:%d, actual:%d",
                        totalXmitTimeNanos, newTime - currentTime);
                assertTrue(msg, newTime - currentTime >= totalXmitTimeNanos * (1.0 - margin));//1
                assertTrue(msg, newTime - currentTime <= totalXmitTimeNanos * (1.0 + margin));
    
                // test max freq
                currentTime = SystemClock.elapsedRealtimeNanos();
                mCIR.transmit(range.getMaxFrequency(), pattern);
                newTime = SystemClock.elapsedRealtimeNanos();
                msg = String.format("Pattern length pattern:%d, actual:%d",
                        totalXmitTimeNanos, newTime - currentTime);
                assertTrue(msg, newTime - currentTime >= totalXmitTimeNanos * (1.0 - margin));
                assertTrue(msg, newTime - currentTime <= totalXmitTimeNanos * (1.0 + margin));
            }
        }
    
    

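    That is the test code. The assertion at comment //1 fails, which means the measured time was too short. Why? The call being timed is ConsumerIrManager#transmit, so we follow that method down.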

        /**
         * Transmit an infrared pattern
         * <p>
         * This method is synchronous; when it returns the pattern has
         * been transmitted. Only patterns shorter than 2 seconds will
         * be transmitted.
         * </p>
         *
         * @param carrierFrequency The IR carrier frequency in Hertz.
         * @param pattern The alternating on/off pattern in microseconds to transmit.
         */
        public void transmit(int carrierFrequency, int[] pattern) {
            if (mService == null) {
                Log.w(TAG, "failed to transmit; no consumer ir service.");
                return;
            }
    
            try {
                mService.transmit(mPackageName, carrierFrequency, pattern);
            } catch (RemoteException e) {
                throw e.rethrowFromSystemServer();
            }
        }    
    

    As the name says, it takes a carrier frequency and a pattern and transmits infrared. Next, look at the service implementation.

    frameworks/base/services/core/java/com/android/server/ConsumerIrService.java

        @Override
        public void transmit(String packageName, int carrierFrequency, int[] pattern) {
    ...
            // Right now there is no mechanism to ensure fair queing of IR requests
            synchronized (mHalLock) {
                int err = halTransmit(carrierFrequency, pattern);//1
    ...
    

    At comment //1 execution enters native code: through JNI into the HAL, which then talks to the driver implementation.

    frameworks/base/services/core/jni/com_android_server_ConsumerIrService.cpp

    #include <android/hardware/ir/1.0/IConsumerIr.h>
    #include <nativehelper/ScopedPrimitiveArray.h>
    
    using ::android::hardware::ir::V1_0::IConsumerIr;
    using ::android::hardware::ir::V1_0::ConsumerIrFreqRange;
    using ::android::hardware::hidl_vec;
    
    namespace android {
    
    static sp<IConsumerIr> mHal;
    
    static jboolean halOpen(JNIEnv* /* env */, jobject /* obj */) {
        // TODO(b/31632518)
        mHal = IConsumerIr::getService();
        return mHal != nullptr;
    }
    
    static jint halTransmit(JNIEnv *env, jobject /* obj */, jint carrierFrequency,
       jintArray pattern) {
        ScopedIntArrayRO cPattern(env, pattern);
        if (cPattern.get() == NULL) {
            return -EINVAL;
        }
        hidl_vec<int32_t> patternVec;
        patternVec.setToExternal(const_cast<int32_t*>(cPattern.get()), cPattern.size());
    
        bool success = mHal->transmit(carrierFrequency, patternVec);//1
        return success ? 0 : -1;
    }
    

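    From here we need the concrete HAL implementation, which requires knowing how the HAL layer is put together. For background, see the article "Android Treble架构解析" (an analysis of the Android Treble architecture).

    Going through the HAL layer, we end up at the driver-side implementation under the hardware directory. Together with the log, it shows that writing the sysfs node failed.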

    hardware/xxx/consumerir/consumerir.cpp

    static int consumerir_transmit(struct consumerir_device *dev __unused,
        int carrier_freq, const int pattern[], int pattern_len) {
    ...
        writeSys(IR_xxx_SEND, pPatterns, strlen(pPatterns));
    

    Log:

    10-02 07:20:49.706  3273  3273 E ConsumerIrHal: writeSysFs, open /sys/devices/virtual/irblaster/xxx/send fail. No such file or directory
    

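    From the framework side, this is enough to draw a conclusion: the failure comes from a broken driver implementation. The node does not exist, nothing is ever transmitted, and so the call takes almost no time. The issue was handed over to the BSP team, who later removed FEATURE_CONSUMER_IR from the framework. That in turn set up another VTS problem, covered next in section 4.2.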
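
The timing window that test_timing enforces in section 4.1 can be checked by hand. The following sketch redoes the arithmetic with the pattern and margin from the test code and the actual value from the failure message; `timing_window` is a hypothetical helper name.

```python
PATTERN_US = [11111, 22222, 33333, 44444, 55555, 66666, 77777, 88888, 99999]
MARGIN = 0.5  # same tolerance as in the test


def timing_window(pattern_us, margin):
    """Return (expected, lower bound, upper bound) in nanoseconds."""
    expected_ns = sum(pattern_us) * 1000  # microseconds -> nanoseconds
    return expected_ns, expected_ns * (1.0 - margin), expected_ns * (1.0 + margin)


expected, low, high = timing_window(PATTERN_US, MARGIN)
actual = 1192958  # value from the failure message
print(expected, low, high, low <= actual <= high)
```

The expected duration is 499995000 ns (about 0.5 s), matching the number in the failure message, so the accepted window is roughly 0.25 s to 0.75 s. The measured 1192958 ns is about 1.2 ms, far below the lower bound.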

4.2 vts

Suite / Plan VTS / vts
Suite / Build 10_r5 / 6719887

armeabi-v7a VtsTrebleFrameworkVintfTest
Test Result Details
VtsTrebleFrameworkVintfTest#SystemVendorTest.ServedHwbinderHalsAreInManifest_32bit fail test/vts-testcase/hal/treble/vintf/SystemVendorTest.cpp:131 Expected: (manifest_hwbinder_hals.find(name)) != (manifest_hwbinder_hals.end()), actual: 4-byte object <E8-17 F6-FF> vs 4-byte object <E8-17 F6-FF> android.hardware.ir@1.0::IConsumerIr/default is being served, but it is not in a manifest
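  • Conclusion first.

    When the feature was removed for the CTS failure from section 4.1, only two things were deleted: the feature file android.hardware.consumerir.xml and the following hidl entry in vendor/etc/vintf/manifest.xml: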

    <hal format="hidl">
        <name>android.hardware.ir</name>
        <transport>hwbinder</transport>
        <version>1.0</version>
        <interface>
            <name>IConsumerIr</name>
            <instance>default</instance>
        </interface>
        <fqname>@1.0::IConsumerIr/default</fqname>
    </hal>
    

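    The service in init.rc was not removed as well. As a result the HAL is no longer declared, yet the service process still starts at boot.

    This test requires the declared HALs and the running services to match: SystemVendorTest.ServedHwbinderHalsAreInManifest_32bit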

    $ ps -e |grep ir
    system        3357     1   11020   3912 0                   0 S android.hardware.ir@1.0-service
    
    $ logcat |grep -i 3357
    01-01 00:00:10.747  3357  3357 I ConsumerIrHal: consumerir hal open success
    01-01 00:00:10.753  3357  3357 I ServiceManagement: Registered android.hardware.ir@1.0::IConsumerIr/default (start delay of 207ms)
    01-01 00:00:10.753  3357  3357 I ServiceManagement: Removing namespace from process name android.hardware.ir@1.0-service to ir@1.0-service.
    01-01 00:00:10.754  3357  3357 I android.hardware.ir@1.0-service: Registration complete for android.hardware.ir@1.0::IConsumerIr/default.
    
  • Solution

    Finish the job: remove the ConsumerIr HAL and everything below it from the build.

    $ git diff
    diff --git a/products/xxx/xxx.mk b/products/xxx/xxx.mk
    index be77805..9cdab74 100644
    --- a/products/xxx/xxx.mk
    +++ b/products/xxx/xxx.mk
    @@ -237,16 +237,19 @@ endif
     #                      ConsumerIr
     #
     #########################################################################
    -PRODUCT_PACKAGES += \
    +#---20201107---
    +#delete ConsumerIr service for cts/vts
    +
    +#PRODUCT_PACKAGES += \
         consumerir.app
     #PRODUCT_COPY_FILES += \
     #    frameworks/native/data/etc/android.hardware.consumerir.xml:$(TARGET_COPY_OUT_VENDOR)/etc/permissions/android.hardware.consumerir.xml
     #consumerir hal
    -PRODUCT_PACKAGES += \
    -    android.hardware.ir@1.0-impl \
    -    android.hardware.ir@1.0-service
    +#PRODUCT_PACKAGES += \
    +#    android.hardware.ir@1.0-impl \
    +#    android.hardware.ir@1.0-service
    
  • Detailed analysis

    Test code: test/vts-testcase/hal/treble/vintf/SystemVendorTest.cpp

    // This needs to be tested besides
    // SingleManifestTest.ServedHwbinderHalsAreInManifest because some HALs may
    // refuse to provide its PID, and the partition cannot be inferred.
    TEST_F(SystemVendorTest, ServedHwbinderHalsAreInManifest) {
      auto device_manifest = VintfObject::GetDeviceHalManifest();
      ASSERT_NE(device_manifest, nullptr) << "Failed to get device HAL manifest.";
      auto fwk_manifest = VintfObject::GetFrameworkHalManifest();
      ASSERT_NE(fwk_manifest, nullptr) << "Failed to get framework HAL manifest.";
    
      std::set<std::string> manifest_hwbinder_hals;
    
      insert(&manifest_hwbinder_hals, GetHwbinderHals(fwk_manifest));
      insert(&manifest_hwbinder_hals, GetHwbinderHals(device_manifest));
    
      Return<void> ret = default_manager_->list([&](const auto &list) {
        for (const auto &name : list) {
          // TODO(b/73774955): use standardized parsing code for fqinstancename
          if (std::string(name).find(IBase::descriptor) == 0) continue;
    
          EXPECT_NE(manifest_hwbinder_hals.find(name), manifest_hwbinder_hals.end())
              << name << " is being served, but it is not in a manifest.";//1
        }
      });
      EXPECT_TRUE(ret.isOk());
    }
    

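    The assertion at comment //1 fails. Read literally: the HAL service has started and registered itself with the service manager, but manifest.xml does not declare it. This code touches on init starting the services declared in rc files, service registration with the service manager, and hidl service registration. These are classic mechanisms, and the logic is clear.

  • One more thing

    Some readers may ask at this point: why go to all this trouble? Would it not have been enough to delete only the feature back in 4.1?

    That is a good idea and a sensible first thought. When we tried it, however, the system_server process crashed, because startup code forbids this combination. So deleting only the feature is not an option.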

    frameworks/base/services/java/com/android/server/SystemServer.java

        private void startOtherServices(@NonNull TimingsTraceAndSlog t) {
    ...
                if (!isWatch) {
                    t.traceBegin("StartConsumerIrService");
                    consumerIr = new ConsumerIrService(context);
                    ServiceManager.addService(Context.CONSUMER_IR_SERVICE, consumerIr);
                    t.traceEnd();
                }
    

    frameworks/base/services/core/java/com/android/server/ConsumerIrService.java

        ConsumerIrService(Context context) {
            mContext = context;
            PowerManager pm = (PowerManager)context.getSystemService(
                    Context.POWER_SERVICE);
            mWakeLock = pm.newWakeLock(PowerManager.PARTIAL_WAKE_LOCK, TAG);
            mWakeLock.setReferenceCounted(true);
    
            mHasNativeHal = halOpen();//1
            if (mContext.getPackageManager().hasSystemFeature(PackageManager.FEATURE_CONSUMER_IR)) {
                if (!mHasNativeHal) {
                    throw new RuntimeException("FEATURE_CONSUMER_IR present, but no IR HAL loaded!");
                }
            } else if (mHasNativeHal) {
                throw new RuntimeException("IR HAL present, but FEATURE_CONSUMER_IR is not set!");
            }
        }
    

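    The code after comment //1 enforces the constraint in both directions, so the HAL and the feature must be present together. Otherwise system_server keeps crashing and eventually triggers the Rescue Party mechanism.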
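
Section 4.2 rests on two consistency rules: every served HAL must be declared in a manifest, and the IR HAL and FEATURE_CONSUMER_IR must be present or absent together. The sketch below models both rules in simplified form. `unlisted_hals` and `consumer_ir_startup` are hypothetical names, and the first one leaves out the IBase descriptor filtering that the real test performs.

```python
def unlisted_hals(served, manifest):
    """HAL instances that are running but not declared in any manifest."""
    return sorted(set(served) - set(manifest))


def consumer_ir_startup(has_feature, has_hal):
    """Model of the two-way check in the ConsumerIrService constructor."""
    if has_feature and not has_hal:
        return "crash: FEATURE_CONSUMER_IR present, but no IR HAL loaded!"
    if has_hal and not has_feature:
        return "crash: IR HAL present, but FEATURE_CONSUMER_IR is not set!"
    return "ok"


served = ["android.hardware.ir@1.0::IConsumerIr/default"]
manifest = []  # the hidl entry was deleted from manifest.xml
print(unlisted_hals(served, manifest))

for feature, hal in [(True, True), (False, True), (True, False), (False, False)]:
    print(feature, hal, consumer_ir_startup(feature, hal))
```

Only the two matching combinations start cleanly, which is why the feature and the HAL had to be removed together.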

4.3 gts

Suite / Plan GTS / gts
Suite / Build 7.0_r3 / 6045416

armeabi-v7a GtsSecurityHostTestCases
Test Result Details
com.google.android.security.gts.SELinuxHostTest#testNoExemptionsForSocketsBetweenCoreAndVendorBan fail junit.framework.AssertionFailedError: Policy exempts domains from ban on socket communications between core and vendor: [hal_audio_default]
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.TestCase.fail(TestCase.java:227)
at com.google.android.security.gts.SELinuxHostTest.testNoExemptionsForSocketsBetweenCoreAndVendorBan(SELinuxHostTest.java:221)
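  • Conclusion first.

    This is a Google waiver item.

  • Detailed analysis.

    Logic of the test case:

    It uses the Linux executable sepolicy-analyze to parse the /sys/fs/selinux/policy file from the set-top box and requires that the command print nothing. The command is:
    sepolicy-analyze policy attribute socket_between_core_and_vendor_violators
    In other words, no type may be associated with the attribute "socket_between_core_and_vendor_violators". Read literally, the attribute marks domains that are exempt from the ban on sockets between core and vendor.

    Analyzing the test code: decompile, then locate the test. The listing below shows the sibling method testNoExemptionsForVendorExecutingCore; judging from the failure message, the failing method presumably has the same shape with the attribute socket_between_core_and_vendor_violators.
    ./com/google/android/security/gts/SELinuxHostTest.java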

        public void testNoExemptionsForVendorExecutingCore() throws Exception {
            if (isFullTrebleDevice()) {
                Set<String> types = sepolicyAnalyzeGetTypesAssociatedWithAttribute("vendor_executes_system_violators");//1
                if (!types.isEmpty()) {
                    List<String> sortedTypes = new ArrayList(types);
                    Collections.sort(sortedTypes);
                    fail("Policy exempts vendor domains from ban on executing files in /system: " + sortedTypes);//2
                }
            }
        }
    

    Comment //1: step into sepolicyAnalyzeGetTypesAssociatedWithAttribute() to confirm the test logic.

    Comment //2: the test fails here because the types set is not empty; it contains '[hal_audio_default]'.

       private Set<String> sepolicyAnalyzeGetTypesAssociatedWithAttribute(String attribute) throws Exception {
            BufferedReader in;
            Throwable th;
            Throwable th2;
            Set<String> types = new HashSet();
            ProcessBuilder pb = new ProcessBuilder(new String[]{this.mSepolicyAnalyze.getAbsolutePath(), this.mDevicePolicyFile.getAbsolutePath(), "attribute", attribute});//1
    ......
                in = new BufferedReader(new InputStreamReader(p.getInputStream()));
                th = null;
                while (true) {
                    try {
                        String type = in.readLine();
                        if (type != null) {
                            types.add(type.trim());//2 
                        }}} 
    ......
            return types;
    ......
        }
    

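    Comment //1: ProcessBuilder starts a process that runs the Linux command: sepolicy-analyze policy attribute socket_between_core_and_vendor_violators

    Comment //2: each line of standard output is trimmed and stored in the result set.

    The basic logic is now clear: any output from this command means a violation. The next question is what the tool 'sepolicy-analyze' actually does.
    Searching the Android source tree turns up the source of this host executable:
    system/sepolicy/tools/sepolicy-analyze/
    From online material, the source, and the README, the purpose of the command is: parse the policy file and return the types associated with the given attribute.

    Searching the project for where the association is made leads to this file: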

    ./system/sepolicy/vendor/hal_audio_default.te:1
    type hal_audio_default, domain, socket_between_core_and_vendor_violators;
    

    The git log shows that the following commit introduced it. It is a Google auto patch, and the item was later confirmed as a waiver.

    commit 1234567
    Author: xxxxxx
    Date:   Mon Feb 17 11:33:16 2020 +0800
    
        auto patch added:CecAudio
    
    diff --git a/vendor/hal_audio_default.te b/vendor/hal_audio_default.te
    index 0dc2170..9da0f1b 100644
    --- a/vendor/hal_audio_default.te
    +++ b/vendor/hal_audio_default.te
    @@ -1,4 +1,4 @@
    -type hal_audio_default, domain;
    +type hal_audio_default, domain, socket_between_core_and_vendor_violators; # the association is added here; this is the root cause
     hal_server_domain(hal_audio_default, hal_audio)
    
  • Related material:

    system/sepolicy/tools/sepolicy-analyze/README

    ATTRIBUTE (attribute)
    sepolicy-analyze out/target/product//root/sepolicy attribute
    Displays the types associated with the specified attribute name.

    The restriction is described in detail in the following files, as part of the Android Treble architecture:
    system/sepolicy/prebuilts/api/26.0/public/domain.te
    system/sepolicy/prebuilts/api/27.0/public/domain.te
    system/sepolicy/prebuilts/api/28.0/public/domain.te
    system/sepolicy/public/domain.te

    # On full TREBLE devices, socket communications between core components and vendor components are
    # not permitted.
    full_treble_only(`
      # Most general rules first, more specific rules below.
    
      # Core domains are not permitted to initiate communications to vendor domain sockets.
      # We are not restricting the use of already established sockets because it is fine for a process
      # to obtain an already established socket via some public/official/stable API and then exchange
      # data with its peer over that socket. The wire format in this scenario is dicatated by the API
      # and thus does not break the core-vendor separation.
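
The decompiled helper sepolicyAnalyzeGetTypesAssociatedWithAttribute from this section boils down to "run a command, collect its stdout lines into a set, fail if the set is not empty". Here is a minimal Python sketch of that pattern. `types_for_attribute` is a hypothetical name, and a small Python child process stands in for sepolicy-analyze so that the snippet runs anywhere; unlike the Java code, it also drops empty lines.

```python
import subprocess
import sys


def types_for_attribute(command):
    """Run command and return the non-empty, trimmed stdout lines as a set."""
    done = subprocess.run(command, capture_output=True, text=True, check=True)
    return {line.strip() for line in done.stdout.splitlines() if line.strip()}


# Stand-in for: sepolicy-analyze <policy file> attribute <attribute name>.
# A tiny Python child process prints one type so the sketch runs anywhere.
fake_tool = [sys.executable, "-c", "print('hal_audio_default')"]
types = types_for_attribute(fake_tool)
if types:
    print("Policy exempts domains:", sorted(types))
```

With the real tool, the command list would carry the path to sepolicy-analyze, the policy file pulled from the device, the word `attribute`, and the attribute name, in the order shown in the decompiled ProcessBuilder call.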
    

4.4 cts verifier

CtsVerifier 9.0-r11

Failed item: Bluetooth Test --> Bluetooth LE Secure Client Test --> 01 BlueTooth LE Client Test

[Screenshot: verifier-fail]

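  • Conclusion first.

    The failure was introduced by a Bluetooth driver change and handed over to the BSP team to fix.

    Summary of the test flow:
    After the client tests pass, the app turns Bluetooth off with mAdapter.disable() and back on with mAdapter.enable(). Tracing the Bluetooth framework flow shows that re-enabling Bluetooth after the disable times out and fails. The test Activity therefore never receives the broadcast and cannot make the button selectable.

  • Detailed analysis.

    Test code
    1. The test Activity; provides the UI and contains no test logic
    cts/apps/CtsVerifier/src/com/android/cts/verifier/bluetooth/BleSecureClientStartActivity.java
    2. Parent class of (1); handles the test results
    cts/apps/CtsVerifier/src/com/android/cts/verifier/bluetooth/BleClientTestBaseActivity.java
    3. The test Service; actually executes the test items
    cts/apps/CtsVerifier/src/com/android/cts/verifier/bluetooth/BleClientService.java

    Flow
    1. BleSecureClientStartActivity starts. This page is only the Activity UI; the main methods and flow live in the parent Activity, so focus on the parent class.
    2. onCreate() of the parent class BleClientTestBaseActivity:
    shows the page, sets listeners on the pass/fail buttons at the bottom, and disables the pass button
    initializes the ListView of test items
    Back in the subclass, onCreate() continues: shows the info dialog and starts BleClientService to prepare the test

    3. onResume() of the parent class BleClientTestBaseActivity:
    registers the BroadcastReceiver mBroadcast for broadcasts from the test service BleClientService

    4. The test starts
    This is a paired Server-Client test between two devices. It is triggered automatically; the details are omitted here and concern the Bluetooth communication with the Server side.
    Each ListView item sends a broadcast when it finishes. On receiving it, the Activity ORs the result into mPassed. If everything goes well, mPassed ends up equal to PASS_FLAG_ALL,
    which means every test item passed.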

    private static final int PASS_FLAG_ALL = 0x3FFFF;
    

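    The Client side then turns Bluetooth off with mAdapter.disable() and back on with mAdapter.enable(). Only if enabling succeeds does the Activity make the pass button selectable.
    Because of the driver problem, enable() fails and the button is never enabled.

    Key logs

    All tests pass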

    04-21 16:52:05.901 D/BluetoothGatt( 6338): onClientConnectionState() - status=0 clientIf=6 device=7B:D0:42:AC:47:B6
    04-21 16:52:05.901 D/BleClientService( 6338): onConnectionStateChange: status= 0, newState= 0
    04-21 16:52:05.901 D/BluetoothGatt( 6338): close()
    04-21 16:52:05.901 D/BluetoothGatt( 6338): unregisterApp() - mClientIf=6
    04-21 16:52:05.915 D/BleClientTestBase( 6338): Processing com.android.cts.verifier.bluetooth.BLE_BLUETOOTH_DISCONNECTED
    04-21 16:52:05.921 D/BleClientTestBase( 6338): Passed Flags has changed from 0x0003FDFF to 0x0003FFFF. Delta=0x00000200
    04-21 16:52:05.921 D/BleClientTestBase( 6338): All Tests Passed.
    

    Bluetooth turns off normally

    04-21 16:52:06.931 D/AdapterProperties( 3406): Setting state to TURNING_OFF
    04-21 16:52:07.038 I/AdapterState( 3406): BLE_TURNING_OFF : entered 
    04-21 16:52:07.057 I/bt_btif_core( 3406): btif_disable_bluetooth finished
    

    Bluetooth enable times out

    04-21 16:52:17.051 D/BluetoothManagerService( 3297): enable(com.android.cts.verifier):  mBluetooth =null mBinding = false mState = OFF
    04-21 16:52:17.288 D/BluetoothAdapterService( 7058): bleOnProcessStart()
    04-21 16:52:17.290 D/BluetoothManagerService( 3297): MESSAGE_BLUETOOTH_STATE_CHANGE: OFF > BLE_TURNING_ON
    04-21 16:52:17.290 D/BluetoothManagerService( 3297): Sending BLE State Change: OFF > BLE_TURNING_ON
    ......
    04-21 16:52:21.292 E/AdapterState( 7058): BLE_TURNING_ON : BLE_START_TIMEOUT
    04-21 16:52:21.293 I/AdapterState( 7058): BLE_TURNING_OFF : entered 
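
Two facts can be read straight off the key logs above: which flag completed the pass mask, and how long the stack waited before giving up on BLE start. The sketch below extracts both. `flag_change` and `elapsed_ms` are hypothetical helpers, and the year prefixed for timestamp parsing is arbitrary because logcat lines carry no year.

```python
import re
from datetime import datetime

FLAGS = re.compile(r"Passed Flags has changed from (0x[0-9A-Fa-f]+) to (0x[0-9A-Fa-f]+)")
PASS_FLAG_ALL = 0x3FFFF


def flag_change(line):
    """Return (old, new, delta) parsed from a 'Passed Flags' log line."""
    old, new = (int(value, 16) for value in FLAGS.search(line).groups())
    return old, new, old ^ new


def elapsed_ms(start_line, end_line):
    """Milliseconds between two logcat lines from the same day."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    # logcat omits the year; prefix an arbitrary one so parsing is unambiguous.
    start = datetime.strptime("2000-" + start_line[:18], fmt)
    end = datetime.strptime("2000-" + end_line[:18], fmt)
    return round((end - start).total_seconds() * 1000)


old, new, delta = flag_change(
    "04-21 16:52:05.921 D/BleClientTestBase( 6338): Passed Flags has changed "
    "from 0x0003FDFF to 0x0003FFFF. Delta=0x00000200")
waited = elapsed_ms(
    "04-21 16:52:17.290 D/BluetoothManagerService( 3297): Sending BLE State Change: OFF > BLE_TURNING_ON",
    "04-21 16:52:21.292 E/AdapterState( 7058): BLE_TURNING_ON : BLE_START_TIMEOUT")
print(hex(delta), new == PASS_FLAG_ALL, waited)
```

The last flag to arrive is 0x200, after which the mask equals PASS_FLAG_ALL; the enable attempt then runs for about four seconds before BLE_START_TIMEOUT is logged.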
    

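5. Lessons learned

When a failure lands on your desk, first establish where it came from. Was a new tool version introduced? Did the previous build fail too? What changed recently? This information goes a long way toward narrowing the scope.

xTS covers a large and varied set of tests, and in my experience it is hard for one person to stay on top of all of them. Hand failures in modules you do not know well to the owners of those modules, for example media, networking, or drivers.

Make good use of search engines to find direct or indirect answers and background.

Comparison and bisection are your last resort. If you have no lead or progress is slow and time is short, start rolling back versions. Bisection takes O(log n) steps.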
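
As an illustration of the bisection idea, here is a minimal sketch that searches an ordered list of builds for the first failing one. It assumes the first build is known good, the last is known bad, and the failure persists once introduced; `is_bad` is a hypothetical callback standing in for "flash this build and re-run the failing test".

```python
def first_bad_build(builds, is_bad):
    """Binary-search an ordered build list for the first failing build.

    Assumes builds[0] is good, builds[-1] is bad, and that every build after
    the first bad one is bad as well. is_bad is a hypothetical callback that
    flashes a build and re-runs the failing test.
    """
    low, high = 0, len(builds) - 1  # low is known good, high is known bad
    checks = 0
    while high - low > 1:
        mid = (low + high) // 2
        checks += 1
        if is_bad(builds[mid]):
            high = mid
        else:
            low = mid
    return builds[high], checks


builds = [f"build_{n:02d}" for n in range(16)]
culprit, checks = first_bad_build(builds, lambda build: build >= "build_11")
print(culprit, checks)
```

With 16 builds the culprit is found after 4 test runs instead of up to 14 with a linear walk, which matters when each run means flashing a device.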

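When verifying a fix, knowing the build system saves a lot of effort. Replace only the APK, the .so, or the single file involved; even flashing one img is much faster than building a full OTA package.

6. References and tools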

posted @ 2020-11-19 01:18 by 秋城