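A Collection of Common Python Usage Patterns
Using Dictionaries
A dictionary is, simply put, a collection of key-value pairs. The session below shows how to create one and use it. The interactive sessions in this post were recorded in Python 2, so they use the print statement.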
>>> d={1:'hello',2:'world',3:'come',4:'on'}
>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> d[1]
'hello'
>>> d['a']
Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    d['a']
KeyError: 'a'
Looking up a key that is not in the dictionary raises KeyError. There are three ways to avoid this.
1. Check whether the key is in the dictionary first
>>> for key in d:
    print key

1
2
3
4
>>> 'a' in d
False
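The loop above also shows how to print every key in a dictionary.
2. Use the dictionary's get method, which returns None, or a default value of your choosing, when the key does not exist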
>>> d.get('a')
>>> d.get('a','not exist')
'not exist'
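3. Use defaultdict, a dictionary type in the collections module that supplies a default value for missing keys
>>> from collections import defaultdict
>>> dd = defaultdict(lambda:'not exist')
>>> dd['key1']='hello'
>>> dd['key1']
'hello'
>>> dd['key2']
'not exist'
Two more points are worth knowing. First, in the Python 2 used here, the order in which keys are stored is unrelated to the order in which they were inserted (since Python 3.7, dictionaries preserve insertion order when iterated). Second, a key must be hashable, which in practice means an immutable object such as a number, a string, or a tuple of immutable values: the dictionary finds a value through the hash of its key, so the key must not change. In terms of implementation, a dictionary trades space for time, using a relatively large amount of memory in exchange for fast lookups. A list, for example, cannot be used as a key: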
>>> L=[1,2]
>>> table={}
>>> table[L]='a list'
Traceback (most recent call last):
  File "<pyshell#44>", line 1, in <module>
    table[L]='a list'
TypeError: unhashable type: 'list'
Another common dictionary operation is iterating over all of its key-value pairs:
>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> for key,value in d.items():
    print key,value

1 hello
2 world
3 come
4 on
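The three ways of handling a missing key, along with iteration over key-value pairs, can be sketched in Python 3 syntax, where print is a function. The word list and the variable names (`words`, `counts_check`, `counts_get`, `counts_default`, `points`) are made up for illustration and do not come from the sessions above.

```python
from collections import defaultdict

words = ["hello", "world", "hello", "come", "on", "hello"]

# 1. Check membership before reading the key.
counts_check = {}
for word in words:
    if word in counts_check:
        counts_check[word] += 1
    else:
        counts_check[word] = 1

# 2. Use get() with a default value for missing keys.
counts_get = {}
for word in words:
    counts_get[word] = counts_get.get(word, 0) + 1

# 3. Let defaultdict create the missing entry (int() returns 0).
counts_default = defaultdict(int)
for word in words:
    counts_default[word] += 1

# Iterate over key-value pairs with items().
for word, count in counts_get.items():
    print(word, count)

# Keys must be hashable: a tuple works where a list would raise TypeError.
points = {(1, 2): "a tuple key"}
```

All three loops build the same counts; which one reads best is largely a matter of taste, although defaultdict removes the branching entirely.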
Sorting in Python
Python offers two main ways to sort: the built-in sorted function and the sort method built into list. This section covers both.
The simplest use of sorted:
>>> sorted([5,3,6,2,1])
[1, 2, 3, 5, 6]
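The sort method of a list:
>>> L = [5, 2, 3, 1, 4]
>>> L.sort()
>>> L
[1, 2, 3, 4, 5]
As the code above shows, the list sort method returns nothing; it sorts the list in place, so the list's previous order is lost. sorted, by contrast, returns a new sorted list and leaves the argument it was given unchanged.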
>>> L=[5, 2, 3, 1, 4]
>>> sorted(L)
[1, 2, 3, 4, 5]
>>> L
[5, 2, 3, 1, 4]
sorted also works on any iterable, for example a dictionary, in which case it sorts the keys:
>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> sorted(d)
[1, 2, 3, 4]
Beyond this basic usage, you can pass a key function; each item of the iterable is then compared by the value that function returns for it:
>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
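To make the effect of the key function more concrete, here is a small sketch in Python 3 syntax. It reuses the sentence and the dictionary `d` from the sessions above; the result names (`plain`, `ignoring_case`, `pairs_by_value`, `keys_by_value_length`) are illustrative only.

```python
words = "This is a test string from Andrew".split()

# Without a key, uppercase letters sort before lowercase ones.
plain = sorted(words)

# With key=str.lower, each word is compared by its lowercased form,
# but the original words are what end up in the result.
ignoring_case = sorted(words, key=str.lower)

d = {1: 'hello', 2: 'world', 3: 'come', 4: 'on'}

# Sort the (key, value) pairs by value instead of by key.
pairs_by_value = sorted(d.items(), key=lambda pair: pair[1])

# Sort the keys by the length of the value they map to.
keys_by_value_length = sorted(d, key=lambda k: len(d[k]))
```

The key function is called once per item, and only its return values are compared; the items themselves are never modified.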
The sort key can likewise be the element at a particular index of each item. The example below sorts on index 2, the age, and the reverse parameter controls whether the order is descending:
>>> student_tuples = [
        ('john', 'A', 15),
        ('jane', 'B', 12),
        ('dave', 'B', 10),
    ]
>>> sorted(student_tuples, key=lambda student: student[2])   # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> sorted(student_tuples, key=lambda student: student[2], reverse=True)   # sort by age, descending
[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]
You can also sort by an attribute of an object:
>>> class Student:
        def __init__(self, name, grade, age):
            self.name = name
            self.grade = grade
            self.age = age
        def __repr__(self):
            return repr((self.name, self.grade, self.age))

>>> student_objects = [
        Student('john', 'A', 15),
        Student('jane', 'B', 12),
        Student('dave', 'B', 10),
    ]
>>> sorted(student_objects, key=lambda student: student.age)   # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
Writing a lambda every time just to fetch one field is more verbose than it needs to be. Python 2.5 and later provide itemgetter and attrgetter in the operator module, which are more convenient:
>>> from operator import itemgetter, attrgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> sorted(student_objects, key=attrgetter('age'))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
Sometimes you need to sort by one field and then by a second. This is supported as well: pass several indexes or attribute names.
>>> sorted(student_tuples, key=itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
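It may help to see what itemgetter and attrgetter actually return when called. The sketch below, in Python 3 syntax, is self-contained and includes the import from the operator module. It assumes a namedtuple called `Student` as a stand-in for the class used above; that substitution is only for brevity.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# A namedtuple stands in for the Student class (an assumption for brevity).
Student = namedtuple("Student", ["name", "grade", "age"])

student_tuples = [("john", "A", 15), ("jane", "B", 12), ("dave", "B", 10)]
student_objects = [Student(*fields) for fields in student_tuples]

# itemgetter(2) builds a function equivalent to: lambda item: item[2]
get_age = itemgetter(2)
first_age = get_age(student_tuples[0])

# With several indexes it returns a tuple, and tuples compare
# element by element, which is what gives the two-level sort.
get_grade_and_age = itemgetter(1, 2)
first_pair = get_grade_and_age(student_tuples[0])

by_age = sorted(student_tuples, key=get_age)
by_grade_then_age = sorted(student_objects, key=attrgetter("grade", "age"))
names = [student.name for student in by_grade_then_age]
```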
Since Python 2.2, sorting has been stable. Stability simply means that items with equal keys keep their original relative order after sorting.
>>> data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]
>>> sorted(data, key=itemgetter(0))
[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]
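The biggest benefit of a stable sort is that a complex sort can be broken into a series of simple steps. For example, to order the students by grade descending and, within the same grade, by age ascending, sort by age first and then by grade in reverse: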
>>> s = sorted(student_objects, key=attrgetter('age'))
>>> sorted(s, key=attrgetter('grade'), reverse=True)
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
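The step-by-step idea can be wrapped in a small helper. The following is a sketch in Python 3 syntax, not part of the original sessions: `multisort` and its `specs` parameter are hypothetical names, and a namedtuple again stands in for the Student class. The helper applies the sorts from the least significant attribute to the most significant one, relying on stability to keep earlier orderings among equal keys.

```python
from collections import namedtuple
from operator import attrgetter

Student = namedtuple("Student", ["name", "grade", "age"])


def multisort(items, specs):
    """Sort by several (attribute, reverse) pairs; earlier pairs take priority."""
    result = list(items)
    # Apply the least significant sort first; each later (more significant)
    # stable sort preserves the existing order among equal keys.
    for attribute, reverse in reversed(specs):
        result.sort(key=attrgetter(attribute), reverse=reverse)
    return result


students = [
    Student("john", "A", 15),
    Student("jane", "B", 12),
    Student("dave", "B", 10),
]

# Grade descending, then age ascending within the same grade.
ordered = multisort(students, [("grade", True), ("age", False)])
names = [student.name for student in ordered]

# Grade ascending, then age descending within the same grade.
ordered_other = multisort(students, [("grade", False), ("age", True)])
names_other = [student.name for student in ordered_other]
```

The first call reproduces the order shown in the session above. Sorting with reverse=True still preserves the original order of items whose keys are equal, which is why the steps compose.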
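Internally, Python's sort is implemented with the Timsort algorithm.
Listing All Files in a Directory with Python
The most common and convenient way is to call the listdir function of the os module: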
>>> import os
>>> l = os.listdir('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result')
>>> l
['averageMethMotif', 'Crick', 'MythylationLevel', 'old_1_10_output_Crick.read', 'old_1_10_output_CrickFinal.read', 'old_1_10_output_Waston.read', 'SingleCount', 'Waston']
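listdir lists the entries of a single directory, subdirectories included, but it does not descend into them, so it cannot list the files of a multi-level tree.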
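Because listdir returns plain names for files and subdirectories alike, telling them apart takes an extra check. The sketch below, in Python 3 syntax, builds a throwaway directory with tempfile so that it can run anywhere; the file and directory names in it are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: two files, one subdirectory holding one file.
    os.mkdir(os.path.join(root, "sub"))
    for name in ("a.txt", "b.read"):
        open(os.path.join(root, name), "w").close()
    open(os.path.join(root, "sub", "nested.txt"), "w").close()

    # listdir returns names only, in arbitrary order, one level deep.
    entries = sorted(os.listdir(root))

    # Join each name with the directory before testing what it is.
    files = [e for e in entries if os.path.isfile(os.path.join(root, e))]
    dirs = [e for e in entries if os.path.isdir(os.path.join(root, e))]
```

Note that nested.txt does not appear in `entries`, which is the limitation described above.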
Recursive traversal with a callback: os.path.walk (Python 2 only; this function was removed in Python 3)
>>> import os
>>> def processDirectory ( args, dirname, filenames ):
        print 'Directory',dirname
        for filename in filenames:
            print ' File',filename

>>> os.path.walk('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result',processDirectory,None)
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result
 File averageMethMotif
 File Crick
 File MythylationLevel
 File old_1_10_output_Crick.read
 File old_1_10_output_CrickFinal.read
 File old_1_10_output_Waston.read
 File SingleCount
 File Waston
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\averageMethMotif
 File AverageOfmethyOfCG.txt
 File AverageOfmethyOfCHG.txt
 File AverageOfmethyOfCHH.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Crick
 File BaseRatio.txt
 File Column1Dist.txt
 File Column1Dist3set.txt
 File Column1Dist4set.txt
 File Column2Dist.txt
 File Column2Dist3set.txt
 File Column2Dist4set.txt
 File Column3Dist.txt
 File Column3Dist3set.txt
 File Column3Dist4set.txt
 File Column4Dist.txt
 File Column4Dist3set.txt
 File Column4Dist4set.txt
 File Column5Dist.txt
 File Column5Dist3set.txt
 File Column5Dist4set.txt
 File Column6Dist.txt
 File Column6Dist3set.txt
 File Column6Dist4set.txt
 File Column7Dist.txt
 File Column7Dist3set.txt
 File Column8Dist.txt
 File delDist.txt
 File delDist3set.txt
 File delDist4set.txt
 File Fre2Set.txt
 File Fre3Set.txt
 File Fre4Set.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\MythylationLevel
 File average_of_methylation
 File highMethylation.ATCGmap
 File lowMethylation.ATCGmap
 File NoMethylation.ATCGmap
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\SingleCount
 File 2setCount.txt
 File 3setCount.txt
 File SingleCount.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Waston
 File BaseRatio.txt
 File Column1Dist.txt
 File Column1Dist3set.txt
 File Column1Dist4set.txt
 File Column2Dist.txt
 File Column2Dist3set.txt
 File Column2Dist4set.txt
 File Column3Dist.txt
 File Column3Dist3set.txt
 File Column3Dist4set.txt
 File Column4Dist.txt
 File Column4Dist3set.txt
 File Column4Dist4set.txt
 File Column5Dist.txt
 File Column5Dist3set.txt
 File Column5Dist4set.txt
 File Column6Dist.txt
 File Column6Dist3set.txt
 File Column6Dist4set.txt
 File Column7Dist.txt
 File Column7Dist3set.txt
 File Column8Dist.txt
 File delDist.txt
 File delDist3set.txt
 File delDist4set.txt
 File Fre2Set.txt
 File Fre3Set.txt
 File Fre4Set.txt
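Traversal without a callback: os.walk, which also visits the whole tree but yields a (dirpath, dirnames, filenames) tuple for each directory so that a plain for loop is enough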
>>> import os
>>> for dirpath,dirnames,filenames in os.walk('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result'):
        print 'Directory',dirpath
        for filename in filenames:
            print 'File',filename

Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result
File old_1_10_output_Crick.read
File old_1_10_output_CrickFinal.read
File old_1_10_output_Waston.read
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\averageMethMotif
File AverageOfmethyOfCG.txt
File AverageOfmethyOfCHG.txt
File AverageOfmethyOfCHH.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Crick
File BaseRatio.txt
File Column1Dist.txt
File Column1Dist3set.txt
File Column1Dist4set.txt
File Column2Dist.txt
File Column2Dist3set.txt
File Column2Dist4set.txt
File Column3Dist.txt
File Column3Dist3set.txt
File Column3Dist4set.txt
File Column4Dist.txt
File Column4Dist3set.txt
File Column4Dist4set.txt
File Column5Dist.txt
File Column5Dist3set.txt
File Column5Dist4set.txt
File Column6Dist.txt
File Column6Dist3set.txt
File Column6Dist4set.txt
File Column7Dist.txt
File Column7Dist3set.txt
File Column8Dist.txt
File delDist.txt
File delDist3set.txt
File delDist4set.txt
File Fre2Set.txt
File Fre3Set.txt
File Fre4Set.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\MythylationLevel
File average_of_methylation
File highMethylation.ATCGmap
File lowMethylation.ATCGmap
File NoMethylation.ATCGmap
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\SingleCount
File 2setCount.txt
File 3setCount.txt
File SingleCount.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Waston
File BaseRatio.txt
File Column1Dist.txt
File Column1Dist3set.txt
File Column1Dist4set.txt
File Column2Dist.txt
File Column2Dist3set.txt
File Column2Dist4set.txt
File Column3Dist.txt
File Column3Dist3set.txt
File Column3Dist4set.txt
File Column4Dist.txt
File Column4Dist3set.txt
File Column4Dist4set.txt
File Column5Dist.txt
File Column5Dist3set.txt
File Column5Dist4set.txt
File Column6Dist.txt
File Column6Dist3set.txt
File Column6Dist4set.txt
File Column7Dist.txt
File Column7Dist3set.txt
File Column8Dist.txt
File delDist.txt
File delDist3set.txt
File delDist4set.txt
File Fre2Set.txt
File Fre3Set.txt
File Fre4Set.txt
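For readers on Python 3, where os.path.walk no longer exists, the os.walk loop can be turned into a reusable function. This is a minimal sketch: `list_all_files` is a hypothetical helper name, and the temporary tree only borrows a few names from the listing above so that the example runs without the original data.

```python
import os
import tempfile


def list_all_files(root):
    """Return the paths of all files under root, relative to root."""
    found = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    # A small stand-in tree: one top-level file and two subdirectories.
    os.makedirs(os.path.join(root, "Crick"))
    os.makedirs(os.path.join(root, "Waston"))
    for relative in (
        "top.read",
        os.path.join("Crick", "BaseRatio.txt"),
        os.path.join("Waston", "BaseRatio.txt"),
    ):
        open(os.path.join(root, relative), "w").close()

    all_files = list_all_files(root)
    for path in all_files:
        print(path)
```

Joining `dirpath` and `filename` matters here: the names in `filenames` are bare, so on their own they cannot be opened from another working directory.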
Filtering files with the glob module
>>> import glob
>>> for filename in glob.glob('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result//Waston//*.txt'):
        print filename
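The session above shows no output, so here is a runnable sketch in Python 3 syntax that makes the filtering visible. It uses a temporary directory with invented file names. The recursive `**` pattern is an addition beyond the original post; it relies on the `recursive=True` argument available in Python 3.5 and later.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "Waston"))
    for relative in (
        "summary.txt",
        "raw.read",
        os.path.join("Waston", "BaseRatio.txt"),
    ):
        open(os.path.join(root, relative), "w").close()

    # "*.txt" matches only names in the given directory that end in .txt.
    top_level_txt = glob.glob(os.path.join(root, "*.txt"))

    # With recursive=True, "**" matches zero or more directory levels.
    all_txt = glob.glob(os.path.join(root, "**", "*.txt"), recursive=True)

    top_names = sorted(os.path.basename(path) for path in top_level_txt)
    all_names = sorted(os.path.basename(path) for path in all_txt)
```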
Element-wise Arithmetic on Two Lists in Python
Two lists cannot be subtracted from each other directly:
>>> y1=[1,2,4]
>>> y2=[2,3,4]
>>> y2-y1
Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    y2-y1
TypeError: unsupported operand type(s) for -: 'list' and 'list'
Method 1:
Convert the lists to numpy arrays, compute the result, then convert it back to a list:
>>> import numpy
>>> numpy.array(y2) - numpy.array(y1)
array([1, 1, 0])
>>> list(numpy.array(y2) - numpy.array(y1))
[1, 1, 0]
Method 2:
Use the zip function, which takes several sequences as arguments and returns tuples that pair up their corresponding elements:
>>> map(lambda x: x[0]-x[1], zip(y2, y1))
[1, 1, 0]
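In Python 3, map returns a lazy iterator rather than a list, so the expression above needs to be wrapped in list() to show its result. The sketch below gives three equivalent spellings without numpy; the variable names are illustrative.

```python
import operator

y1 = [1, 2, 4]
y2 = [2, 3, 4]

# A list comprehension over zip is usually the most readable form.
diff_comprehension = [b - a for a, b in zip(y1, y2)]

# The map/lambda version from above, made into a list explicitly.
diff_map = list(map(lambda pair: pair[0] - pair[1], zip(y2, y1)))

# map also accepts several iterables, so operator.sub can be used directly.
diff_operator = list(map(operator.sub, y2, y1))
```

Both zip and map stop at the end of the shorter input, so lists of unequal length are silently truncated rather than raising an error.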