[Python] Basic Data Types

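List (list)

Dynamic: the length is not fixed, and elements can be added, removed, and modified freely (implemented in CPython as a dynamic array of references, not a linked list)
Can hold elements of any data type
Supports negative indexing and slicing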
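The properties above can be seen in a few lines. This is a minimal sketch; the variable names are illustrative and do not come from the original post.

```python
# Illustrative names only; a minimal sketch of list behaviour.
items = [1, "two", 3.0]   # elements of different types can be mixed
items.append(4)           # add an element at the end
items[1] = 2              # modify an element in place
del items[2]              # remove the element at index 2

last = items[-1]          # a negative index counts from the end
first_two = items[:2]     # slicing returns a new list
print(items, last, first_two)
```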

import timeit

# Time 100,000 constructions of a six-element list
t = timeit.timeit(stmt="x=[1,2,3,4,5,6]", number=100000)
print(t)

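Tuple (tuple)

Static: the length is fixed, and elements cannot be added, removed, or reassigned (a fixed-size array)
Can hold elements of any data type
Supports negative indexing and slicing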
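A short sketch of what "static" means in practice (the names here are illustrative): item assignment fails, and "extending" a tuple actually builds a new one.

```python
# Illustrative names only; a minimal sketch of tuple behaviour.
point = (1, "two", 3.0)

error = None
try:
    point[0] = 10             # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)

extended = point + (4,)       # concatenation creates a new tuple
last = point[-1]              # negative indexing works as with lists
tail = point[1:]              # slicing returns a new tuple
print(error, extended, last, tail)
```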

import timeit

# Time 100,000 constructions of a six-element tuple
t = timeit.timeit(stmt="x=(1,2,3,4,5,6)", number=100000)
print(t)

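Dictionary (dict)

A collection of key-value pairs (a hash table); since Python 3.7 it preserves insertion order
Keys must be hashable (in practice, immutable) and unique
Lookup, insertion, and deletion are O(1) on average
Each entry in the underlying hash table stores three items: the hash, the key, and the value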
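The points above can be sketched as follows. The dictionary contents and variable names are made up for illustration.

```python
# Illustrative data; a minimal sketch of dict behaviour.
prices = {"apple": 3, "pear": 5}
prices["kiwi"] = 7                 # insert a new key
prices["apple"] = 4                # overwrite the value of an existing key
del prices["pear"]                 # delete a key

keys_in_order = list(prices)       # insertion order is kept (Python 3.7+)
missing = prices.get("plum", 0)    # lookup with a default for absent keys

unhashable_rejected = False
try:
    prices[["a", "list"]] = 1      # a list is mutable, so it cannot be a key
except TypeError:
    unhashable_rejected = True
print(keys_in_order, missing, unhashable_rejected)
```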

d = {'b': 1, 'a': 2, 'c': 10}
# sorted() returns a list of (key, value) tuples
d_sorted_by_key = sorted(d.items(), key=lambda x: x[0])
d_sorted_by_value = sorted(d.items(), key=lambda x: x[1])
print(d_sorted_by_key)
print(d_sorted_by_value)

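Set (set)

An unordered collection of elements (a hash table)
Elements must be hashable (in practice, immutable) and unique
Lookup, insertion, and deletion are O(1) on average
Each entry in the underlying hash table stores only the hash and the element itself; unlike a dict, there is no associated value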
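A minimal sketch of these properties, using made-up values:

```python
# Illustrative values; a minimal sketch of set behaviour.
seen = {3, 1, 3, 2}          # the duplicate 3 is stored only once
seen.add(5)                  # insert an element
seen.discard(1)              # remove an element (no error if it is absent)

has_two = 2 in seen          # membership test, O(1) on average
common = seen & {2, 5, 9}    # set intersection

unhashable_rejected = False
try:
    seen.add([1, 2])         # a list is mutable, so it cannot be an element
except TypeError:
    unhashable_rejected = True
print(sorted(seen), has_two, sorted(common), unhashable_rejected)
```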

import time

def find_unique_price_using_set(products):
    # Each add() is O(1) on average, so the whole loop is O(n)
    unique_price_set = set()
    for _, price in products:
        unique_price_set.add(price)
    return len(unique_price_set)

def find_unique_price_using_list(products):
    # Each `not in` check scans the list, so the whole loop is O(n^2)
    unique_price_list = []
    for _, price in products:
        if price not in unique_price_list:
            unique_price_list.append(price)
    return len(unique_price_list)

# Build 10,000 (id, price) pairs; `ids` avoids shadowing the built-in id()
ids = list(range(0, 10000))
prices = list(range(20000, 30000))
products = list(zip(ids, prices))

start_using_list = time.perf_counter()
find_unique_price_using_list(products)
end_using_list = time.perf_counter()
print("time elapsed using list: {}".format(end_using_list - start_using_list))

start_using_set = time.perf_counter()
find_unique_price_using_set(products)
end_using_set = time.perf_counter()
print("time elapsed using set: {}".format(end_using_set - start_using_set))

String (string)

Immutable: changing a string means creating a new string
Splitting
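A minimal sketch of string immutability (the names are illustrative): assigning to a character fails, so a "modified" string is always a new object built from pieces of the old one.

```python
# Illustrative names only; a minimal sketch of string immutability.
s = "hello"

error = None
try:
    s[0] = "H"               # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

t = "H" + s[1:]              # builds a new string; s itself is unchanged
joined = "-".join(["a", "b", "c"])   # join builds one new string from many parts
print(s, t, joined)
```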

def query_data(namespace, table):
    # Placeholder query: only reports which table and namespace would be read
    print("data in " + table + " at " + namespace)

path = 'hive://ads/traning_table'
# Drop the scheme, then split the remainder into namespace and table
namespace = path.split('//')[1].split('/')[0]
table = path.split('//')[1].split('/')[1]
query_data(namespace, table)

Formatted output

name = input('your name: ')
gender = input('are you a boy? (y/n) ')

# Named placeholders are filled from the dict via ** unpacking
welcome_str = 'Welcome to the matrix {prefix}{name}.'
welcome_dic = {
    'prefix': 'Mr. ' if gender == 'y' else 'Mrs. ',
    'name': name
}

print('authorizing...')
print(welcome_str.format(**welcome_dic))
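The same message could also be produced with an f-string (Python 3.6+). In this sketch the sample values are hard-coded instead of read with input(); that is an assumption made only so the snippet runs non-interactively.

```python
# Sample values are hypothetical stand-ins for the input() calls.
name = "Neo"
gender = "y"

prefix = "Mr. " if gender == "y" else "Mrs. "
# Expressions inside {} are evaluated when the f-string is created
welcome = f"Welcome to the matrix {prefix}{name}."
print(welcome)
```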

Data cleaning

import re

def parse(text):
    # Replace punctuation and newlines with spaces
    text = re.sub(r'[^\w]', ' ', text)
    # Convert to lowercase
    text = text.lower()
    # Build the list of all words
    word_list = text.split(' ')
    # Drop empty strings
    word_list = filter(None, word_list)
    # Build a dict mapping each word to its frequency
    word_cnt = {}
    for word in word_list:
        if word not in word_cnt:
            word_cnt[word] = 0
        word_cnt[word] += 1
    # Sort by frequency, highest first
    sorted_word_cnt = sorted(word_cnt.items(), key=lambda kv: kv[1], reverse=True)
    return sorted_word_cnt

with open('in.txt', 'r') as fin:
    text = fin.read()

word_and_freq = parse(text)

with open('out.txt', 'w') as fout:
    for word, freq in word_and_freq:
        fout.write('{} {}\n'.format(word, freq))

... (the input is Martin Luther King Jr.'s "I Have a Dream")
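As an alternative to the manual counting loop, the standard library's collections.Counter can do the same job. This is a sketch of a different technique, not the post's original method; the function name parse_with_counter and the sample sentence are hypothetical.

```python
import re
from collections import Counter

def parse_with_counter(text):
    # \w+ picks out runs of word characters, so punctuation and
    # whitespace are skipped without a separate cleaning step
    words = re.findall(r"\w+", text.lower())
    # most_common() returns (word, count) pairs, highest count first
    return Counter(words).most_common()

sample = "I have a dream. A dream today!"
result = parse_with_counter(sample)
print(result)
```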

json
- dumps(): packs (serializes) a dict into a string
- loads(): unpacks (deserializes) a string into a dict

import json

params = {
    'symbol': '123456',
    'type': 'limit',
    'price': 123.4,
    'amount': 23
}

params_str = json.dumps(params)
print('after json serialization')
print('type of params_str = {}, params_str = {}'.format(type(params_str), params_str))

original_params = json.loads(params_str)
print('after json deserialization')
print('type of original_params = {}, original_params = {}'.format(type(original_params), original_params))
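One caveat worth keeping in mind: a dumps()/loads() round trip does not always return an identical object, because JSON has fewer types than Python. The sketch below uses made-up data to show two such cases.

```python
import json

# Illustrative data: an int key and a tuple value
data = {1: "one", "pair": (1, 2)}

restored = json.loads(json.dumps(data))
# JSON object keys are always strings, and JSON has arrays but no tuples,
# so the int key comes back as "1" and the tuple comes back as a list
print(restored)
```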

posted @ 2020-04-13 22:55 cxc1357
