Advanced Python 3

zip

zip(*iterables)

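zip accepts any number of iterables and packs the elements at matching positions into tuples. In Python 3 it does not return a list: it returns a lazy zip iterator that yields these tuples, so wrap it in list() when you need all of them at once.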

E.g.1:

a = [1, 2, 3, 4, 5]
b = ['a', 'b', 'c', 'd', 'e']
c = zip(a, b)
print(type(c))
print(c)  # prints the zip object itself, not its contents
print(list(c))

<class 'zip'>
<zip object at 0x00000143FAE1AF80>
[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]

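If the iterables differ in length, zip stops as soon as the shortest one is exhausted, so the result has as many tuples as the shortest iterable has elements.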

E.g.2:

a = [1, 2, 3, 4, 5]
b = ['a', 'b', 'c', 'd']
c = zip(a, b)
print(list(c))

[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
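
When the extra elements of the longer input should not be dropped silently, the standard library provides itertools.zip_longest, which pads the missing positions instead. A minimal sketch reusing the lists from E.g.2; the fill value '?' is an arbitrary choice made for illustration:

```python
import itertools

a = [1, 2, 3, 4, 5]
b = ['a', 'b', 'c', 'd']

# zip stops at the shortest input, so the 5 is dropped
truncated = list(zip(a, b))

# zip_longest runs to the longest input and pads with fillvalue
padded = list(itertools.zip_longest(a, b, fillvalue='?'))

print(truncated)
print(padded)
```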

E.g.3:

names = ["zhangsan", "lisi", "wangwu", "zhaoliu", "sunqi"]
weights = [72, 68, 72, 66, 76]

user_weight = dict(zip(names, weights))
print(user_weight)

{'zhangsan': 72, 'lisi': 68, 'wangwu': 72, 'zhaoliu': 66, 'sunqi': 76}

Example:

Shuffling a dataset is a common step when training machine learning models. With zip() it can be done as follows:

import random
X = [1, 2, 3, 4, 5, 6]
y = [0, 1, 0, 0, 1, 1]
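# Pair each sample with its label; convert to a list so it can be shuffled in place
zipped_data = list(zip(X, y))
# random.shuffle shuffles the list in place and returns None
random.shuffle(zipped_data)
# zip(*...) unzips the pairs, map(list, ...) turns each tuple into a list, list() collects the result
new_zipped_data = list(map(list, zip(*zipped_data)))
# The shuffled samples and labels
new_X, new_y = new_zipped_data[0], new_zipped_data[1]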
print('X:', X, '\n', 'y:', y)
print('new_X:', new_X, '\n', 'new_y:', new_y)

Output (no random seed is set, so the result may differ on each run):

X: [1, 2, 3, 4, 5, 6]
 y: [0, 1, 0, 0, 1, 1]
new_X: [6, 1, 5, 2, 3, 4]
 new_y: [1, 0, 1, 1, 0, 0]
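
If the shuffle needs to be reproducible, one option is to use a seeded random.Random instance. The sketch below is only an illustration: shuffle_together is a hypothetical helper name, the seed 42 is arbitrary, and the inputs are assumed to be non-empty and of equal length.

```python
import random

X = [1, 2, 3, 4, 5, 6]
y = [0, 1, 0, 0, 1, 1]


def shuffle_together(samples, labels, seed):
    """Return shuffled copies of samples and labels, keeping pairs aligned."""
    pairs = list(zip(samples, labels))
    # A private Random instance leaves the global random state untouched
    random.Random(seed).shuffle(pairs)
    new_samples, new_labels = zip(*pairs)
    return list(new_samples), list(new_labels)


new_X, new_y = shuffle_together(X, y, seed=42)
print('new_X:', new_X)
print('new_y:', new_y)
```

Calling the helper again with the same seed gives the same order, and the original X and y are left unchanged.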

enumerate

enumerate(iterable, start=0)

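enumerate yields both the index and the value of each item in an iterable such as a list or tuple. What it actually returns is a lazy iterator of type enumerate that produces (index, value) tuples; the start argument sets the first index.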

E.g.1:

seq = [1, 2, '3', 'hello world']
for i, element in enumerate(seq):
    print(i, element)

0 1
1 2
2 3
3 hello world

Like zip, enumerate returns an iterator rather than a list:

E.g.2:

seq = [1, 2, '3', 'hello world']
print(type(enumerate(seq)))
print(enumerate(seq))
print(list(enumerate(seq)))

<class 'enumerate'>
<enumerate object at 0x000001FC9787AFC0>
[(0, 1), (1, 2), (2, '3'), (3, 'hello world')]
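
The start parameter from the signature above is not used in the examples so far. A small sketch with made-up data shows its effect:

```python
seq = ['a', 'b', 'c']

# start only changes the reported numbers, not which items are visited
numbered = list(enumerate(seq, start=1))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```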

map

map(function, iterable, ...)

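map calls function on every element of the iterable. In Python 3 it returns a lazy map iterator that yields the results, not a new list; wrap it in list() to get a list.

E.g.1: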

d = [1, 2, 3]


def func(s):
    return s * 100


print(map(func, d))
print(type(map(func, d)))
print(list(map(func, d)))

<map object at 0x0000019A46BF5E50>
<class 'map'>
[100, 200, 300]
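
Because a map object is an iterator, it can be consumed only once. The following sketch, using the same input list, illustrates this:

```python
d = [1, 2, 3]

m = map(lambda s: s * 100, d)
first_pass = list(m)   # consumes the iterator
second_pass = list(m)  # nothing is left to yield

print(first_pass)   # [100, 200, 300]
print(second_pass)  # []
```

If the results are needed more than once, store the list rather than the map object.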

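map can take several iterables, in which case function receives one argument from each of them. If the iterables differ in length, map stops at the shortest one.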

E.g.2:

d = [1, 2, 3]
e = [7, 8, 9, 10]


def func(a, b):
    return a * 100 + b


print(list(map(func, d, e)))

[107, 208, 309]

A lambda can define the small function inline, which makes the code more concise.

d = [1, 2, 3]
e = [7, 8, 9, 10]
print(list(map(lambda a, b: a * 100 + b, d, e)))

[107, 208, 309]

reduce

functools.reduce(function, iterable, [initial_value])

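reduce applies function cumulatively to the items of iterable, from left to right, and reduces them to a single value.

function -- a function of two arguments: the accumulated value and the next item

iterable -- an iterable object

initial_value -- optional; if given, it is used as the starting accumulated value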

E.g.1:

import functools  # python3


def add(x, y):
    print("x = %d, y = %d" % (x, y))
    return x + y


print(functools.reduce(add, [1, 2, 3, 4, 5]))
print(functools.reduce(add, [1, 2, 3, 4, 5], 8))

x = 1, y = 2
x = 3, y = 3
x = 6, y = 4
x = 10, y = 5
15
x = 8, y = 1
x = 9, y = 2
x = 11, y = 3
x = 14, y = 4
x = 18, y = 5
23

Here too, a lambda can define the small function inline and make the code more concise.

E.g.2:

import functools  # python3

print(functools.reduce(lambda x, y: x + y, [1, 2, 3, 4, 5], 8))
print(functools.reduce(lambda x, y: 10 * x + y, [1, 2, 3, 4, 5], 8))

23
812345
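
To make the accumulation explicit, here is a rough pure-Python equivalent of reduce. It is a sketch for illustration only: my_reduce and the _MISSING sentinel are hypothetical names, not part of functools.

```python
import functools

_MISSING = object()  # sentinel: distinguishes "no initial value" from None


def my_reduce(function, iterable, initial_value=_MISSING):
    """Hypothetical re-implementation of reduce, for illustration only."""
    it = iter(iterable)
    if initial_value is _MISSING:
        try:
            acc = next(it)  # without an initial value, start from the first item
        except StopIteration:
            raise TypeError('my_reduce() of empty iterable with no initial value') from None
    else:
        acc = initial_value
    for item in it:
        acc = function(acc, item)  # fold the next item into the accumulator
    return acc


# ((((8*10 + 1)*10 + 2)*10 + 3)*10 + 4)*10 + 5
print(my_reduce(lambda x, y: 10 * x + y, [1, 2, 3, 4, 5], 8))  # 812345
```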

filter

filter(function, iterable)

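filter calls function on every element of iterable and keeps the elements for which it returns a true value. In Python 3 the result is a lazy filter iterator, not a list.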

E.g.1:

print(filter(lambda a: a % 2 == 0, range(20)))
print(list(filter(lambda a: a % 2 == 0, range(20))))

<filter object at 0x0000025BE94AD190>
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
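
For comparison, the same selection can be written as a list comprehension, and passing None as the function keeps only the truthy items. A short sketch (the sample values in the second part are made up):

```python
evens_filter = list(filter(lambda a: a % 2 == 0, range(20)))
evens_comprehension = [a for a in range(20) if a % 2 == 0]
print(evens_filter == evens_comprehension)  # True

# With None as the function, filter keeps the items that are truthy
truthy = list(filter(None, [0, 1, '', 'a', None, [], [0]]))
print(truthy)  # [1, 'a', [0]]
```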

The collections module

namedtuple

collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

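namedtuple is a factory function that creates a custom tuple subclass with a fixed number of elements, each of which can be accessed by attribute name instead of by index. It is a convenient way to define a small data type that is immutable like a tuple but readable like an object with named fields.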

E.g.1:

import collections

Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(1, 2)
print(p.x)
print(p.y)

pp = Point(x=3, y=4)
print(pp.x)
print(pp.y)

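The fields of a namedtuple cannot be reassigned; trying to do so raises AttributeError.
We can also verify that an instance of Point is an instance of tuple, because Point is a subclass of tuple.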

E.g.2:

import collections

Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

print(isinstance(p, Point))
print(isinstance(p, tuple))

True
True
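
The immutability mentioned above can be checked directly. The sketch below also uses two methods that namedtuple classes provide, _replace and _asdict:

```python
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

try:
    p.x = 10  # fields are read-only
except AttributeError as exc:
    print('cannot assign:', exc)

# _replace returns a new object; the original is unchanged
moved = p._replace(x=10)
print(p, moved)

# _asdict converts the fields to a dict
print(p._asdict())
```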

defaultdict

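With a plain dict, looking up a key that does not exist raises KeyError. If you want a default value instead, use defaultdict. When a missing key is accessed with [], defaultdict calls the factory function passed to its constructor, stores the result under that key, and returns it. In all other respects it behaves like dict.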

E.g.1:

import collections

dd = collections.defaultdict(lambda: 'N/A')
dd['key1'] = 'abc'

print(dd['key1'])

print(dd['key2'])

abc
N/A

Counting occurrences:

E.g.2:

import collections

cnt = collections.defaultdict(int)
for char in ["a", "b", "c", "b", "d", "a", "a"]:
    cnt[char] += 1
print(cnt)

defaultdict(<class 'int'>, {'a': 3, 'b': 2, 'c': 1, 'd': 1})
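
Two further points may be worth seeing in code: list is a common factory for grouping values, and reading a missing key with [] inserts it, while .get() does not. A minimal sketch with made-up data:

```python
import collections

pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]

groups = collections.defaultdict(list)
for kind, name in pairs:
    groups[kind].append(name)  # a missing key starts as an empty list
print(dict(groups))

# Reading a missing key with [] stores the default under that key
dd = collections.defaultdict(int)
value = dd['missing']
print(value, 'missing' in dd)  # 0 True

# .get() does not call the factory and does not insert anything
print(dd.get('other'), 'other' in dd)  # None False
```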

Counter

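Counter is a simple counter: a dict subclass that maps each element to the number of times it occurs. For example, it can count how often each character appears in a string: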

E.g.1:

import collections

c = collections.Counter('gallahad')
print(c)

Counter({'a': 3, 'l': 2, 'g': 1, 'h': 1, 'd': 1})
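
A few other Counter operations are often useful. The sketch below continues with the same string:

```python
import collections

c = collections.Counter('gallahad')

print(c.most_common(2))  # [('a', 3), ('l', 2)]
print(c['z'])            # 0, missing elements count as zero

c.update('aa')  # add more observations
print(c['a'])   # 5
```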

The itertools module

chain

itertools.chain(*iterables)

chain links several iterables together into a single, longer iterator that yields the items of each one in turn.

E.g.1:

import itertools
for c in itertools.chain('ABC', 'XYZ'):
    print(c)

A
B
C
X
Y
Z
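
When the iterables are themselves stored in one container, chain.from_iterable does the same job without unpacking. A short sketch with a made-up nested list:

```python
import itertools

nested = [[1, 2], [3], [], [4, 5]]

# Flattens exactly one level of nesting
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```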

groupby

itertools.groupby(iterable, key=None)

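groupby collects consecutive elements of an iterable that share the same key into groups. Only adjacent elements are grouped together, so the input usually has to be sorted by the same key first if each key should appear only once.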

E.g.1:

import itertools
for key, group in itertools.groupby('AAABBBCCAAA'):
    print(key, list(group))

A ['A', 'A', 'A']
B ['B', 'B', 'B']
C ['C', 'C']
A ['A', 'A', 'A']

E.g.2:

import itertools
data = [
    ("1班", "刘一", 93),
    ("1班", "陈二", 72),
    ("1班", "张三", 81),
    ("2班", "李四", 98),
    ("2班", "王五", 91),
    ("3班", "赵六", 86),
    ("3班", "孙七", 48),
    ("3班", "周八", 89),
    ("3班", "吴九", 64),
    ("3班", "郑十", 79),
]

for key, group in itertools.groupby(data, key=lambda m: m[0]):
    print(key, list(group))

1班 [('1班', '刘一', 93), ('1班', '陈二', 72), ('1班', '张三', 81)]
2班 [('2班', '李四', 98), ('2班', '王五', 91)]
3班 [('3班', '赵六', 86), ('3班', '孙七', 48), ('3班', '周八', 89), ('3班', '吴九', 64), ('3班', '郑十', 79)]
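
The records above happen to be ordered by class already. When the input is not ordered, equal keys end up in separate groups unless the data is sorted first. A sketch with made-up words, where first_letter is a hypothetical key function:

```python
import itertools

words = ['apple', 'bob', 'avocado', 'cherry', 'banana', 'cat']


def first_letter(word):
    return word[0]


# Without sorting, equal keys that are not adjacent form separate groups
unsorted_keys = [key for key, _ in itertools.groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'c', 'b', 'c']

# Sorting by the same key first yields one group per key
ordered = sorted(words, key=first_letter)
grouped = {key: list(group) for key, group in itertools.groupby(ordered, key=first_letter)}
print(grouped)
```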

Date and time modules

datetime

datetime is the module in Python's standard library for working with dates and times.

E.g.1:

import datetime
now = datetime.datetime.now()
print(now)
print(type(now))
print(now.timestamp())  # timestamp() returns a float in seconds; the fractional part is the sub-second portion (microsecond precision)

2020-08-18 19:40:09.972192
<class 'datetime.datetime'>
1597750809.972192

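Dates and times entered by users usually arrive as strings, so they first have to be converted from str to datetime. This is done with datetime.datetime.strptime, which takes the string and a format string describing its layout.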

E.g.2:

import datetime
cday = datetime.datetime.strptime('2019-6-20 18:19:59', '%Y-%m-%d %H:%M:%S')
print(cday)

2019-06-20 18:19:59
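
strptime raises ValueError when the string does not match the format, which matters for user input. One possible way to handle it is sketched below; parse_datetime is a hypothetical helper name, and returning None on failure is just one design choice:

```python
import datetime


def parse_datetime(text, fmt='%Y-%m-%d %H:%M:%S'):
    """Hypothetical helper: return a datetime, or None if text does not match fmt."""
    try:
        return datetime.datetime.strptime(text, fmt)
    except ValueError:
        return None


print(parse_datetime('2019-6-20 18:19:59'))  # 2019-06-20 18:19:59
print(parse_datetime('2019/6/20'))           # None
```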

In the other direction, a datetime object has to be converted to str before it can be shown to the user. This is done with the strftime method, which also takes a format string.

E.g.3:

import datetime

now = datetime.datetime.now()
print(now.strftime('%Y-%m-%d %H:%M:%S'))

2020-08-18 19:53:20
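
strftime and strptime use the same format codes, so a formatted string can be parsed back with the format that produced it. The sketch below uses a fixed datetime so that the output is deterministic; the round trip is exact here only because the value has no microseconds, which this format does not include.

```python
import datetime

fmt = '%Y-%m-%d %H:%M:%S'
moment = datetime.datetime(2020, 8, 18, 19, 53, 20)

text = moment.strftime(fmt)
print(text)  # 2020-08-18 19:53:20

# Parsing the string with the same format gives back an equal datetime
restored = datetime.datetime.strptime(text, fmt)
print(restored == moment)  # True
```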

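Adding to or subtracting from a date and time simply moves the datetime forward or backward and produces a new datetime. The + and - operators can be used directly, together with the timedelta class.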

import datetime
now = datetime.datetime.now()
print(now)

print(now + datetime.timedelta(hours=10))
print(now - datetime.timedelta(days=1))
print(now + datetime.timedelta(days=2, hours=12))

2020-08-18 20:35:57.941536
2020-08-19 06:35:57.941536
2020-08-17 20:35:57.941536
2020-08-21 08:35:57.941536
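
Subtraction also works between two datetime objects, in which case the result is a timedelta. A short sketch with two made-up, fixed moments:

```python
import datetime

start = datetime.datetime(2020, 8, 18, 20, 0, 0)
end = datetime.datetime(2020, 8, 21, 8, 30, 0)

elapsed = end - start  # a timedelta
print(elapsed)                  # 2 days, 12:30:00
print(elapsed.days)             # 2
print(elapsed.total_seconds())  # 217800.0
```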

posted @ 2020-08-18 23:47 熠丶
