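ProGuard Source Code Analysis (7): Obfuscation
This chapter covers one of the most important steps in ProGuard: obfuscation, which is driven by the Obfuscator class.
The purpose of obfuscation is plain: to obscure the meaning of the code by replacing meaningful names.
The entry point is easy to find:
the execute method of Obfuscator.
Let's see which visitors the obfuscator sets up for us: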
ClassVisitor memberInfoLinker =
configuration.useUniqueClassMemberNames ?
(ClassVisitor)new AllMemberVisitor(new MethodLinker()) :
(ClassVisitor)new BottomClassFilter(new MethodLinker());
We follow this straight down to the final visitor, MethodLinker:
public void visitAnyClass(Clazz clazz)
{
// Collect all non-private members in this class hierarchy.
clazz.hierarchyAccept(true, true, true, false,
new AllMethodVisitor(
new MemberAccessFilter(0, ClassConstants.INTERNAL_ACC_PRIVATE | ClassConstants.INTERNAL_ACC_STATIC,
this)));
// Clean up for the next class hierarchy.
memberMap.clear();
}
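This call passes MethodLinker itself (this) as the visitor that AllMethodVisitor applies to every method in the hierarchy, restricted by MemberAccessFilter to methods whose access flags are acceptable, that is, methods that are neither private nor static.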
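To make the filtering rule concrete, here is a minimal sketch. It assumes that the two int arguments of MemberAccessFilter mean "flags that must be set" and "flags that must be unset", which is what their use above suggests. The class and method names are hypothetical; only the two flag values come from the JVM class file format.

```java
// Hypothetical stand-in for the filtering rule; this is not ProGuard's class.
final class AccessFilterSketch {
    // Access flag values as defined by the JVM class file format.
    static final int ACC_PUBLIC  = 0x0001;
    static final int ACC_PRIVATE = 0x0002;
    static final int ACC_STATIC  = 0x0008;

    // Accept a member only if every required-set bit is present
    // and none of the required-unset bits is present.
    static boolean accepted(int accessFlags, int requiredSet, int requiredUnset) {
        return (accessFlags & requiredSet) == requiredSet
            && (accessFlags & requiredUnset) == 0;
    }

    public static void main(String[] args) {
        int unset = ACC_PRIVATE | ACC_STATIC;
        System.out.println(accepted(ACC_PUBLIC, 0, unset));              // true: public instance method
        System.out.println(accepted(ACC_PUBLIC | ACC_STATIC, 0, unset)); // false: static method
        System.out.println(accepted(ACC_PRIVATE, 0, unset));             // false: private method
    }
}
```

With 0 as the required-set mask, nothing is demanded, so the filter only excludes private and static methods.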
Since its job is clearly to process methods, we go straight to the callback that handles them:
public void visitAnyMember(Clazz clazz, Member member)
{
String name = member.getName(clazz);
String descriptor = member.getDescriptor(clazz);
if (name.equals(ClassConstants.INTERNAL_METHOD_NAME_CLINIT) ||
name.equals(ClassConstants.INTERNAL_METHOD_NAME_INIT))
{
return;
}
String key = name + ' ' + descriptor;
Member otherMember = (Member)memberMap.get(key);
if (otherMember == null)
{
Member thisLastMember = lastMember(member);
memberMap.put(key, thisLastMember);
} else {
link(member, otherMember);
}
}
It first reads the method's name and descriptor and skips <clinit> and <init>. The comment in the source explains this better than I could:
// Special cases: <clinit> and <init> are always kept unchanged.
// We can ignore them here.
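It then uses the name and descriptor as a key to look the method up in memberMap, which works like a flyweight pool:
Member otherMember = (Member)memberMap.get(key);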
if (otherMember == null)
{
// Get the last method in the chain.
Member thisLastMember = lastMember(member);
// Store the new class method in the map.
memberMap.put(key, thisLastMember);
}
else
{
// Link both members.
link(member, otherMember);
}
If the method is not in the map yet, it is added, but only after lastMember has been called on it.
Let's look at the lastMember method:
public static Member lastMember(Member member)
{
Member lastMember = member;
while (lastMember.getVisitorInfo() != null &&
lastMember.getVisitorInfo() instanceof Member)
{
lastMember = (Member)lastMember.getVisitorInfo();
}
return lastMember;
}
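In other words, as long as a member's visitor info is itself a Member, the loop follows that reference, and the method returns the last member in the chain. This style resembles the chain of responsibility pattern, though that is only my guess from how it feels. Moving on: what if the method is already in the map?
Let's go back to the arguments of clazz.hierarchyAccept: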
public void hierarchyAccept(boolean visitThisClass, //true
boolean visitSuperClass, //true
boolean visitInterfaces, //true
boolean visitSubclasses,//false
ClassVisitor classVisitor);
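A method with the same name and descriptor may well be present in this class, in its superclasses, and in its interfaces all at once.
So the purpose of link(member, otherMember); is to merge these methods into one chain that ends at a common member.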
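The chaining idea can be sketched in a few lines. The body of link is not shown in this article, so the sketch assumes it joins two chains at their last elements unless they already share one. Node, last and link are hypothetical names for illustration, and ProGuard's Member interface is much richer than this.

```java
// Hypothetical minimal model of linked members; not ProGuard's actual code.
final class LinkSketch {
    static final class Node {
        final String owner;   // class that declares the method (illustrative only)
        Object visitorInfo;   // null, or the next Node in the chain
        Node(String owner) { this.owner = owner; }
    }

    // Same loop as lastMember: follow visitorInfo while it points to another Node.
    static Node last(Node node) {
        Node last = node;
        while (last.visitorInfo instanceof Node) {
            last = (Node) last.visitorInfo;
        }
        return last;
    }

    // Assumed behaviour of link: join the two chains at their ends,
    // unless both already end in the same node.
    static void link(Node a, Node b) {
        Node lastA = last(a);
        Node lastB = last(b);
        if (lastA != lastB) {
            lastB.visitorInfo = lastA;
        }
    }

    public static void main(String[] args) {
        Node inInterface = new Node("interface");
        Node inSuper = new Node("superclass");
        Node inThis = new Node("this class");
        link(inSuper, inInterface);
        link(inThis, inSuper);
        // All three declarations now resolve to the same last node.
        System.out.println(last(inInterface) == last(inThis)); // true
    }
}
```

Once every declaration of a method resolves to one shared node, anything stored on that node, presumably the new name, applies to the whole group at once.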
Now let's continue with the obfuscation process:
NameMarker nameMarker = new NameMarker();
ClassPoolVisitor classPoolvisitor =
ClassSpecificationVisitorFactory.createClassPoolVisitor(configuration.keep,
nameMarker,
nameMarker,
false,
false,
true);
Earlier chapters mentioned ClassSpecificationVisitorFactory.createClassPoolVisitor several times, so we won't dig into it again and go straight to NameMarker:
public void visitProgramClass(ProgramClass programClass)
{
keepClassName(programClass);
// Make sure any outer class names are kept as well.
programClass.attributesAccept(this);
}
public void keepClassName(Clazz clazz)
{
ClassObfuscator.setNewClassName(clazz,
clazz.getName());
}
static void setNewClassName(Clazz clazz, String name)
{
clazz.setVisitorInfo(name);
}
As we can see, keepClassName simply stores the class's current name in its VisitorInfo as the name to keep.
public void visitAnyAttribute(Clazz clazz, Attribute attribute) {}
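This shows that it does nothing for attributes in general. Judging by the comment above, only the attributes that carry outer class names get special handling.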
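Here is a minimal sketch of this marking trick, under the assumption that a later renaming pass treats a non-null visitor info as "already named". The Clazz model and the method names keep and assignIfMissing are hypothetical and only illustrate the idea.

```java
// Hypothetical model: a class is just a name plus a visitorInfo slot.
final class KeepSketch {
    static final class Clazz {
        final String name;
        Object visitorInfo;   // holds the new name once one has been decided
        Clazz(String name) { this.name = name; }
    }

    // Mirrors keepClassName: the "new" name of a kept class is its current name.
    static void keep(Clazz clazz) {
        clazz.visitorInfo = clazz.name;
    }

    // A later pass only assigns a generated name where none is stored yet.
    static void assignIfMissing(Clazz clazz, String generated) {
        if (clazz.visitorInfo == null) {
            clazz.visitorInfo = generated;
        }
    }

    public static void main(String[] args) {
        Clazz kept = new Clazz("com/example/Main");
        Clazz other = new Clazz("com/example/Helper");
        keep(kept);
        assignIfMissing(kept, "com/example/a");
        assignIfMissing(other, "com/example/a");
        System.out.println(kept.visitorInfo);  // com/example/Main
        System.out.println(other.visitorInfo); // com/example/a
    }
}
```

Keeping a name is therefore not a separate flag: it is a rename to the same name, decided before the obfuscator gets a chance to pick one.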
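Next, since library classes do not need to be obfuscated, the names of the library classes and their members are kept as they are: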
libraryClassPool.classesAccept(nameMarker);
libraryClassPool.classesAccept(new AllMemberVisitor(nameMarker));
Next:
// Mark attributes that have to be kept.
AttributeVisitor attributeUsageMarker =
new NonEmptyAttributeFilter(
new AttributeUsageMarker());
AttributeVisitor optionalAttributeUsageMarker =
configuration.keepAttributes == null ? null :
new AttributeNameFilter(new ListParser(new NameParser()).parse(configuration.keepAttributes),
attributeUsageMarker);
programClassPool.classesAccept(
new AllAttributeVisitor(true,
new RequiredAttributeFilter(attributeUsageMarker,
optionalAttributeUsageMarker)));
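This marks the class attributes that have to be kept. Skipping the code that matters less here, we go straight to the main visitor of the obfuscation step: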
programClassPool.classesAccept(
new ClassObfuscator(programClassPool,
classNameFactory,
packageNameFactory,
configuration.useMixedCaseClassNames,
configuration.keepPackageNames,
configuration.flattenPackageHierarchy,
configuration.repackageClasses,
configuration.allowAccessModification));
Tracing further, the obfuscated name is obtained like this:
String name = programClass.getName();
String newName = ClassObfuscator.newClassName(programClass);
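The name comes from ClassObfuscator.newClassName, which simply returns the visitor info stored on the class.
That value is set in the visitProgramClass method of ClassObfuscator: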
public void visitProgramClass(ProgramClass programClass)
{
// Does this class still need a new name?
newClassName = newClassName(programClass);
if (newClassName == null)
{
// Make sure the outer class has a name, if it exists. The name will
// be stored as the new class name, as a side effect, so we'll be
// able to use it as a prefix.
programClass.attributesAccept(this);
// Figure out a package prefix. The package prefix may actually be
// the an outer class prefix, if any, or it may be the fixed base
// package, if classes are to be repackaged.
String newPackagePrefix = newClassName != null ?
newClassName + ClassConstants.INTERNAL_INNER_CLASS_SEPARATOR :
newPackagePrefix(ClassUtil.internalPackagePrefix(programClass.getName()));
// Come up with a new class name, numeric or ordinary.
newClassName = newClassName != null && numericClassName ?
generateUniqueNumericClassName(newPackagePrefix) :
generateUniqueClassName(newPackagePrefix);
setNewClassName(programClass, newClassName);
}
}
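For classes marked by keep, newClassName(programClass) returns the name that the keep visitor stored earlier, so the body of the if statement is skipped. The effect is that only classes not covered by keep get a new name. Let's see how a name is chosen: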
private String generateUniqueClassName(String newPackagePrefix)
{
// Find the right name factory for this package.
NameFactory classNameFactory =
(NameFactory)packagePrefixClassNameFactoryMap.get(newPackagePrefix);
if (classNameFactory == null)
{
// We haven't seen classes in this package before.
// Create a new name factory for them.
classNameFactory = new SimpleNameFactory(useMixedCaseClassNames);
if (this.classNameFactory != null)
{
classNameFactory =
new DictionaryNameFactory(this.classNameFactory,
classNameFactory);
}
packagePrefixClassNameFactoryMap.put(newPackagePrefix,
classNameFactory);
}
return generateUniqueClassName(newPackagePrefix, classNameFactory);
}
private String generateUniqueClassName(String newPackagePrefix,
NameFactory classNameFactory)
{
// Come up with class names until we get an original one.
String newClassName;
String newMixedCaseClassName;
do
{
// Let the factory produce a class name.
newClassName = newPackagePrefix +
classNameFactory.nextName();
newMixedCaseClassName = mixedCaseClassName(newClassName);
}
while (classNamesToAvoid.contains(newMixedCaseClassName));
// Explicitly make sure the name isn't used again if we have a
// user-specified dictionary and we're not allowed to have mixed case
// class names -- just to protect against problematic dictionaries.
if (this.classNameFactory != null &&
!useMixedCaseClassNames)
{
classNamesToAvoid.add(newMixedCaseClassName);
}
return newClassName;
}
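As we can see, the next candidate name comes from a name factory, and each package prefix has its own factory. ProGuard has several name factory implementations; we look at the default one, SimpleNameFactory: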
private String name(int index)
{
// Which cache do we need?
List cachedNames = generateMixedCaseNames ?
cachedMixedCaseNames :
cachedLowerCaseNames;
// Do we have the name in the cache?
if (index < cachedNames.size())
{
return (String)cachedNames.get(index);
}
// Create a new name and cache it.
String name = newName(index);
cachedNames.add(index, name);
return name;
}
private String newName(int index)
{
// If we're allowed to generate mixed-case names, we can use twice as
// many characters.
int totalCharacterCount = generateMixedCaseNames ?
2 * CHARACTER_COUNT :
CHARACTER_COUNT;
int baseIndex = index / totalCharacterCount;
int offset = index % totalCharacterCount;
char newChar = charAt(offset);
String newName = baseIndex == 0 ?
new String(new char[] { newChar }) :
(name(baseIndex-1) + newChar);
return newName;
}
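The naming rule is very simple: it goes through the 26 letters in order (twice as many characters when mixed-case names are allowed), and once they are used up it extends to longer names in the same order.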
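To see the resulting sequence, here is a small re-creation of the lower-case case only. It repeats the arithmetic of newName but leaves out the cache. The charAt helper is not shown in this article, so the sketch assumes it maps offsets 0..25 to 'a'..'z'. NameSketch is a hypothetical name.

```java
// Hypothetical re-creation of the lower-case naming scheme; caching is omitted.
final class NameSketch {
    static final int CHARACTER_COUNT = 26;

    // Assumption: offsets 0..25 map to 'a'..'z'.
    static char charAt(int offset) {
        return (char) ('a' + offset);
    }

    // Same arithmetic as newName, applied recursively.
    static String name(int index) {
        int baseIndex = index / CHARACTER_COUNT;
        int offset = index % CHARACTER_COUNT;
        char newChar = charAt(offset);
        return baseIndex == 0
            ? String.valueOf(newChar)
            : name(baseIndex - 1) + newChar;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 25, 26, 27, 51, 52, 701, 702};
        for (int i : samples) {
            // Prints a, b, z, aa, ab, az, ba, zz, aaa for the samples above.
            System.out.println(i + " -> " + name(i));
        }
    }
}
```

So the names run a, b, ..., z, aa, ab, ..., az, ba, ..., zz, aaa, and so on: all one-letter names first, then all two-letter names, with no gaps.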
Next, let's see how these names are applied to the class files.
Here is the visitor:
public void visitProgramClass(ProgramClass programClass)
{
// Rename this class.
programClass.thisClassConstantAccept(this);
// Rename the class members.
programClass.fieldsAccept(this);
programClass.methodsAccept(this);
}
This is ClassRenamer. Its handling of the class constant is:
public void visitClassConstant(Clazz clazz, ClassConstant classConstant)
{
// Update the Class entry if required.
String newName = ClassObfuscator.newClassName(clazz);
if (newName != null)
{
// Refer to a new Utf8 entry.
classConstant.u2nameIndex =
new ConstantPoolEditor((ProgramClass)clazz).addUtf8Constant(newName);
}
}
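It is easy to see that it adds the new name to the constant pool and points the constant's name index at that new entry.
Now let's look at how it handles members: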
public void visitProgramMember(ProgramClass programClass,
ProgramMember programMember)
{
// Has the class member name changed?
String name = programMember.getName(programClass);
String newName = MemberObfuscator.newMemberName(programMember);
if (newName != null &&
!newName.equals(name))
{
programMember.u2nameIndex =
new ConstantPoolEditor(programClass).addUtf8Constant(newName);
}
}
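The idea is the same: a new name has been chosen for the member beforehand, and the member's name index is then pointed at a new constant pool entry that holds it.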
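The "add an entry, then repoint the index" pattern can be sketched with a toy constant pool. Everything here is hypothetical: a real class file holds typed constant pool entries, and ConstantPoolEditor does more than append a string. The nameIndex field plays the role of u2nameIndex.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of renaming through a constant pool index.
final class RenameSketch {
    static final class ToyClass {
        final List<String> utf8Pool = new ArrayList<>();
        int nameIndex;   // plays the role of u2nameIndex

        String name() {
            return utf8Pool.get(nameIndex);
        }

        // Append a string and return its index (no deduplication in this sketch).
        int addUtf8(String value) {
            utf8Pool.add(value);
            return utf8Pool.size() - 1;
        }
    }

    // Rename by pointing the index at a fresh entry; the old string is left untouched.
    static void rename(ToyClass clazz, String newName) {
        if (newName != null && !newName.equals(clazz.name())) {
            clazz.nameIndex = clazz.addUtf8(newName);
        }
    }

    public static void main(String[] args) {
        ToyClass clazz = new ToyClass();
        clazz.nameIndex = clazz.addUtf8("com/example/Helper");
        rename(clazz, "a/a");
        System.out.println(clazz.name());          // a/a
        System.out.println(clazz.utf8Pool.size()); // 2
    }
}
```

Note that in this sketch the old string stays in the pool. Nothing is overwritten in place, which keeps any other entry that still refers to the old string valid.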