A Collection of Common Python Usage Patterns

Using Dictionaries

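A dictionary is, simply put, a collection of key-value pairs. The session below creates one and does some basic lookups. (The interactive sessions in this post use Python 2 syntax, such as the print statement.)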

>>> d={1:'hello',2:'world',3:'come',4:'on'}
>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> d[1]
'hello'
>>> d['a']

Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    d['a']
KeyError: 'a'

Looking up a key that is not in the dictionary raises KeyError. There are three ways to avoid this.

1. Check whether the key is in the dictionary first

>>> for key in d:
	print key

	
1
2
3
4
>>> 'a' in d
False

The session above also shows how to loop over a dictionary and print all of its keys.

2. Use the dictionary's get method, which returns None (or a default value you supply) when the key does not exist

>>> d.get('a')
>>> d.get('a','not exist')
'not exist'

3. Use defaultdict, a data structure provided by the collections module

>>> from collections import defaultdict
>>> dd = defaultdict(lambda:'not exist')
>>> dd['key1']='hello'
>>> dd['key1']
'hello'
>>> dd['key2']
'not exist'
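
The three approaches above can be put side by side. The following is a minimal sketch in Python 3 syntax (the sessions above use Python 2); the helper name lookup_with_check and the 'not exist' default are illustrative choices, not part of any library.

```python
from collections import defaultdict

d = {1: 'hello', 2: 'world', 3: 'come', 4: 'on'}

def lookup_with_check(mapping, key, default='not exist'):
    """Approach 1: test membership before indexing (hypothetical helper)."""
    if key in mapping:
        return mapping[key]
    return default

# Approach 2: dict.get returns None, or the supplied default, for a missing key.
via_get = d.get('a', 'not exist')

# Approach 3: defaultdict calls the factory for a missing key.
# Note that the lookup itself inserts the key into the defaultdict.
dd = defaultdict(lambda: 'not exist', d)   # dd starts as a copy of d
via_defaultdict = dd['a']

print(lookup_with_check(d, 'a'), via_get, via_defaultdict)
print('a' in d, 'a' in dd)  # the plain dict is untouched; dd now holds 'a'
```

One practical difference: get never modifies the dictionary, whereas a lookup on a defaultdict stores the default value under the missing key.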
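
Two more points. First, in the Python 2 used here, the order in which a dictionary stores its keys has nothing to do with the order in which they were inserted (since Python 3.7, dictionaries are guaranteed to preserve insertion order). Second, keys must be hashable, which in practice means immutable objects such as numbers, strings, and tuples of hashable items: a dictionary locates a value through the hash of its key, so that hash must never change. In terms of implementation, a dictionary trades space for time; it uses a relatively large amount of memory in exchange for fast lookups.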

>>> L=[1,2]
>>> table={}
>>> table[L]='a list'

Traceback (most recent call last):
  File "<pyshell#44>", line 1, in <module>
    table[L]='a list'
TypeError: unhashable type: 'list'
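
When list-like data is needed as a key, converting it to a tuple is a common workaround. The snippet below is a small sketch in Python 3 syntax; the names point_names and coords are made up for illustration.

```python
point_names = {}

# A tuple of hashable items is itself hashable, so it can serve as a key.
point_names[(1, 2)] = 'a tuple key'

# A list is mutable and unhashable; using it as a key raises TypeError.
try:
    point_names[[1, 2]] = 'a list key'
except TypeError as exc:
    error_message = str(exc)

# Converting the list to a tuple first makes the lookup work.
coords = [1, 2]
looked_up = point_names[tuple(coords)]

print(looked_up)
print(error_message)
```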

Another common operation is iterating over all of a dictionary's key-value pairs.

>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> for key,value in d.items():
	print key,value

	
1 hello
2 world
3 come
4 on

Sorting in Python

Python offers two main ways to sort: the built-in sorted function and the list type's own sort method. This section covers both.

Basic usage of sorted and list.sort

>>> sorted([5,3,6,2,1])
[1, 2, 3, 5, 6]
>>> L = [5, 2, 3, 1, 4]
>>> L.sort()
>>> L
[1, 2, 3, 4, 5]
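
As the session shows, list.sort returns nothing (None); it sorts the list in place, so the original order is lost. sorted, by contrast, returns a new sorted list and leaves the argument it was given unchanged.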

>>> L=[5, 2, 3, 1, 4]
>>> sorted(L)
[1, 2, 3, 4, 5]
>>> L
[5, 2, 3, 1, 4]
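
The difference matters most when the result of list.sort is assigned by mistake. A short sketch (Python 3 syntax; the variable names are illustrative):

```python
numbers = [5, 2, 3, 1, 4]

# sorted() builds and returns a new list; the input is left alone.
ascending = sorted(numbers)

# list.sort() reorders the list in place and returns None,
# so assigning its result is a common mistake.
in_place = list(numbers)          # copy first if the original order matters
result_of_sort = in_place.sort()

print(numbers)         # still in the original order
print(ascending)
print(in_place)
print(result_of_sort)  # None
```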

sorted also works on any iterable, such as a dictionary (iterating over a dictionary yields its keys):

>>> d
{1: 'hello', 2: 'world', 3: 'come', 4: 'on'}
>>> sorted(d)
[1, 2, 3, 4]

Beyond the basic usage, you can pass a key function. It is called once on each item, and the items are compared by the values it returns.

>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
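
A key function also makes it possible to order dictionary entries by value rather than by key. A minimal sketch, reusing the dictionary from the first section (Python 3 syntax):

```python
d = {1: 'hello', 2: 'world', 3: 'come', 4: 'on'}

# Sorting the dictionary itself sorts its keys.
keys_sorted = sorted(d)

# Sorting items() with a key function orders the (key, value) pairs by value.
pairs_by_value = sorted(d.items(), key=lambda pair: pair[1])

print(keys_sorted)
print(pairs_by_value)
```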

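The key can also be the element at a given index of each item. The example below sorts by the age stored at index 2; the reverse parameter controls whether the result is in descending order.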

>>> student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]
>>> sorted(student_tuples, key=lambda student: student[2])   # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> sorted(student_tuples, key=lambda student: student[2], reverse=True)   # sort by age, descending
[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]

You can also sort objects by one of their attributes.

>>> class Student:
        def __init__(self, name, grade, age):
            self.name = name
            self.grade = grade
            self.age = age
        def __repr__(self):
            return repr((self.name, self.grade, self.age))

        
>>> student_objects = [
    Student('john', 'A', 15),
    Student('jane', 'B', 12),
    Student('dave', 'B', 10),
]
>>> sorted(student_objects, key=lambda student: student.age)   # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
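
Writing a lambda every time just to fetch an item or attribute is tedious. The operator module provides itemgetter and attrgetter (available since Python 2.4, with support for multiple arguments since 2.5), which are more convenient.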

>>> from operator import itemgetter, attrgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> sorted(student_objects, key=attrgetter('age'))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]

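You may also need to sort by one field first and then by a second. Both functions support this by accepting several arguments (here, grade and then age).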

>>> sorted(student_tuples, key=itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
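
To see why passing several indexes works, it helps to look at what itemgetter actually returns. The following sketch (Python 3 syntax, variable names chosen for illustration) shows that the multi-index form produces a tuple, and tuples compare element by element.

```python
from operator import itemgetter

student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]

# itemgetter(2) behaves like lambda student: student[2].
by_age = sorted(student_tuples, key=itemgetter(2))

# With several indexes, itemgetter returns a tuple, and tuples compare
# element by element: grade first, then age to break ties.
grade_then_age = itemgetter(1, 2)
sort_key_for_jane = grade_then_age(student_tuples[1])
by_grade_then_age = sorted(student_tuples, key=grade_then_age)

print(sort_key_for_jane)
print(by_grade_then_age)
```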

Since Python 2.2, sorting is guaranteed to be stable. Stable simply means that items with equal keys keep their original relative order after sorting.

>>> data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]
>>> sorted(data, key=itemgetter(0))
[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]

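The biggest benefit of a stable sort is that a complex sort can be broken down into a series of simple steps. For example, to order by grade descending and, within each grade, by age ascending, sort by age first and then sort again by grade in reverse: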

>>> s = sorted(student_objects, key=attrgetter('age'))
>>> sorted(s, key=attrgetter('grade'), reverse=True)
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
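
The multi-pass idea can be wrapped in a small helper. This is only a sketch in Python 3 syntax: the function name multisort, its (index, reverse) spec format, and the extra 'anna' record are assumptions made for illustration.

```python
from operator import itemgetter

def multisort(items, specs):
    """Sort by several (index, reverse) specs, most significant first."""
    result = list(items)
    # Apply the least significant key first; each later stable pass
    # keeps the order established by earlier passes among equal keys.
    for index, reverse in reversed(specs):
        result.sort(key=itemgetter(index), reverse=reverse)
    return result

student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
    ('anna', 'A', 11),   # extra illustrative record
]

# Grade descending, then age ascending within each grade.
ordered = multisort(student_tuples, [(1, True), (2, False)])
print(ordered)
```

The helper copies its input, so the original list keeps its order, in the same spirit as sorted.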

Internally, Python's sort is implemented with the Timsort algorithm.


Listing All Files in a Directory with Python

The most common and convenient approach is the listdir function in the os module.

>>> import os
>>> l=os.listdir('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result')
>>> l
['averageMethMotif', 'Crick', 'MythylationLevel', 'old_1_10_output_Crick.read', 'old_1_10_output_CrickFinal.read', 'old_1_10_output_Waston.read', 'SingleCount', 'Waston']

listdir lists the entries of a single directory (both files and subdirectories), but it does not recurse into the subdirectories.
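
Because listdir returns bare names for files and directories alike, telling them apart takes an extra check. The sketch below (Python 3 syntax) builds a throwaway directory with tempfile so that it runs anywhere; the file and directory names are made up.

```python
import os
import tempfile

# Build a small throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'Crick'))
    for name in ('a.read', 'b.read'):
        open(os.path.join(root, name), 'w').close()

    entries = sorted(os.listdir(root))
    # listdir returns bare names, so join them with the directory
    # before asking whether each one is a file or a directory.
    files = [n for n in entries if os.path.isfile(os.path.join(root, n))]
    dirs = [n for n in entries if os.path.isdir(os.path.join(root, n))]

print(entries)
print(files)
print(dirs)
```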

Recursive traversal with os.path.walk (callback-based; Python 2 only, removed in Python 3)

>>> import os
>>> def processDirectory ( args, dirname, filenames ):
	    print 'Directory',dirname
	    for filename in filenames:
		print ' File',filename

		
>>> os.path.walk('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result',processDirectory,None)
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result
 File averageMethMotif
 File Crick
 File MythylationLevel
 File old_1_10_output_Crick.read
 File old_1_10_output_CrickFinal.read
 File old_1_10_output_Waston.read
 File SingleCount
 File Waston
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\averageMethMotif
 File AverageOfmethyOfCG.txt
 File AverageOfmethyOfCHG.txt
 File AverageOfmethyOfCHH.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Crick
 File BaseRatio.txt
 File Column1Dist.txt
 File Column1Dist3set.txt
 File Column1Dist4set.txt
 File Column2Dist.txt
 File Column2Dist3set.txt
 File Column2Dist4set.txt
 File Column3Dist.txt
 File Column3Dist3set.txt
 File Column3Dist4set.txt
 File Column4Dist.txt
 File Column4Dist3set.txt
 File Column4Dist4set.txt
 File Column5Dist.txt
 File Column5Dist3set.txt
 File Column5Dist4set.txt
 File Column6Dist.txt
 File Column6Dist3set.txt
 File Column6Dist4set.txt
 File Column7Dist.txt
 File Column7Dist3set.txt
 File Column8Dist.txt
 File delDist.txt
 File delDist3set.txt
 File delDist4set.txt
 File Fre2Set.txt
 File Fre3Set.txt
 File Fre4Set.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\MythylationLevel
 File average_of_methylation
 File highMethylation.ATCGmap
 File lowMethylation.ATCGmap
 File NoMethylation.ATCGmap
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\SingleCount
 File 2setCount.txt
 File 3setCount.txt
 File SingleCount.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Waston
 File BaseRatio.txt
 File Column1Dist.txt
 File Column1Dist3set.txt
 File Column1Dist4set.txt
 File Column2Dist.txt
 File Column2Dist3set.txt
 File Column2Dist4set.txt
 File Column3Dist.txt
 File Column3Dist3set.txt
 File Column3Dist4set.txt
 File Column4Dist.txt
 File Column4Dist3set.txt
 File Column4Dist4set.txt
 File Column5Dist.txt
 File Column5Dist3set.txt
 File Column5Dist4set.txt
 File Column6Dist.txt
 File Column6Dist3set.txt
 File Column6Dist4set.txt
 File Column7Dist.txt
 File Column7Dist3set.txt
 File Column8Dist.txt
 File delDist.txt
 File delDist3set.txt
 File delDist4set.txt
 File Fre2Set.txt
 File Fre3Set.txt
 File Fre4Set.txt

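Traversal without a callback, using os.walk

os.walk still visits every subdirectory, but it does so as a generator that yields a (dirpath, dirnames, filenames) tuple for each directory, so no callback function is needed. Note that filenames here contains only files, whereas the names passed to the os.path.walk callback also include subdirectories.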

>>> import os
>>> for dirpath,dirnames,filenames in os.walk('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result'):
	print 'Directory',dirpath
	for filename in filenames:
		print 'File',filename

		
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result
File old_1_10_output_Crick.read
File old_1_10_output_CrickFinal.read
File old_1_10_output_Waston.read
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\averageMethMotif
File AverageOfmethyOfCG.txt
File AverageOfmethyOfCHG.txt
File AverageOfmethyOfCHH.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Crick
File BaseRatio.txt
File Column1Dist.txt
File Column1Dist3set.txt
File Column1Dist4set.txt
File Column2Dist.txt
File Column2Dist3set.txt
File Column2Dist4set.txt
File Column3Dist.txt
File Column3Dist3set.txt
File Column3Dist4set.txt
File Column4Dist.txt
File Column4Dist3set.txt
File Column4Dist4set.txt
File Column5Dist.txt
File Column5Dist3set.txt
File Column5Dist4set.txt
File Column6Dist.txt
File Column6Dist3set.txt
File Column6Dist4set.txt
File Column7Dist.txt
File Column7Dist3set.txt
File Column8Dist.txt
File delDist.txt
File delDist3set.txt
File delDist4set.txt
File Fre2Set.txt
File Fre3Set.txt
File Fre4Set.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\MythylationLevel
File average_of_methylation
File highMethylation.ATCGmap
File lowMethylation.ATCGmap
File NoMethylation.ATCGmap
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\SingleCount
File 2setCount.txt
File 3setCount.txt
File SingleCount.txt
Directory E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result\Waston
File BaseRatio.txt
File Column1Dist.txt
File Column1Dist3set.txt
File Column1Dist4set.txt
File Column2Dist.txt
File Column2Dist3set.txt
File Column2Dist4set.txt
File Column3Dist.txt
File Column3Dist3set.txt
File Column3Dist4set.txt
File Column4Dist.txt
File Column4Dist3set.txt
File Column4Dist4set.txt
File Column5Dist.txt
File Column5Dist3set.txt
File Column5Dist4set.txt
File Column6Dist.txt
File Column6Dist3set.txt
File Column6Dist4set.txt
File Column7Dist.txt
File Column7Dist3set.txt
File Column8Dist.txt
File delDist.txt
File delDist3set.txt
File delDist4set.txt
File Fre2Set.txt
File Fre3Set.txt
File Fre4Set.txt
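
In practice the full path of each file is usually wanted, not just its name. The following is a minimal sketch in Python 3 syntax; the helper name list_all_files and the temporary directory layout are illustrative assumptions.

```python
import os
import tempfile

def list_all_files(top):
    """Return the full path of every file under top, recursively."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            # filenames are bare names; join with dirpath for a usable path.
            paths.append(os.path.join(dirpath, filename))
    return sorted(paths)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'Crick'))
    for relative in ('top.read', os.path.join('Crick', 'BaseRatio.txt')):
        open(os.path.join(root, relative), 'w').close()
    # Show paths relative to the temporary root for readability.
    found = [os.path.relpath(p, root) for p in list_all_files(root)]

print(found)
```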

Filtering files with the glob module

>>> import glob
>>> for filename in glob.glob('E://workspace2//RefactorMythelation//src//sourceCode//old_1_10_output_result//Waston//*.txt'):
	print filename
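
A plain pattern such as *.txt only matches inside the one directory named in the pattern. In Python 3.5 and later, glob.glob also accepts recursive=True, which lets ** match nested directories. The sketch below uses a temporary directory with made-up file names so that it is self-contained.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'Waston'))
    for relative in ('notes.txt', 'data.read',
                     os.path.join('Waston', 'BaseRatio.txt')):
        open(os.path.join(root, relative), 'w').close()

    # '*.txt' matches only within the one directory named in the pattern.
    top_level = glob.glob(os.path.join(root, '*.txt'))
    # With recursive=True, '**' also descends into subdirectories.
    everywhere = glob.glob(os.path.join(root, '**', '*.txt'), recursive=True)

    # Show paths relative to the temporary root for readability.
    top_level = sorted(os.path.relpath(p, root) for p in top_level)
    everywhere = sorted(os.path.relpath(p, root) for p in everywhere)

print(top_level)
print(everywhere)
```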



Element-wise Arithmetic on Two Lists

Two lists cannot be subtracted from each other directly.

>>> y1=[1,2,4]
>>> y2=[2,3,4]
>>> y2-y1

Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    y2-y1
TypeError: unsupported operand type(s) for -: 'list' and 'list'

Method 1:

Convert the lists to numpy arrays, subtract them, and convert the result back to a list.

>>> import numpy
>>> numpy.array(y2) - numpy.array(y1)
array([1, 1, 0])
>>> (numpy.array(y2) - numpy.array(y1)).tolist()
[1, 1, 0]

Method 2:

Use the zip function, which takes several sequences as arguments and returns tuples that pair up their corresponding elements.

>>> map(lambda x: x[0]-x[1], zip(y2, y1))
[1, 1, 0]
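
The same zip-based subtraction can be written as a list comprehension, which behaves identically in Python 2 and Python 3. The sketch below uses Python 3 syntax and also shows the map form wrapped in list(), since map returns a lazy iterator in Python 3.

```python
y1 = [1, 2, 4]
y2 = [2, 3, 4]

# zip pairs corresponding elements; the comprehension subtracts each pair.
difference = [b - a for a, b in zip(y1, y2)]

# In Python 3, map returns a lazy iterator, so wrap it in list()
# to get the same result as the Python 2 session above.
difference_via_map = list(map(lambda pair: pair[0] - pair[1], zip(y2, y1)))

print(difference)
print(difference_via_map)
```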








Posted 2015-10-25 17:26 by ttabbss