Learning Python Modules: openpyxl

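Introduction to openpyxl
- openpyxl is a Python library for reading and writing Excel 2010 documents (.xlsx). Older Excel formats require additional libraries. openpyxl is a fairly comprehensive tool that can both read and modify Excel documents, whereas many other Excel-related projects support only reading or only writing.
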
Installing openpyxl
- openpyxl is an open-source project. Install it with the following command:
    pip3 install openpyxl

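Basic usage of openpyxl
- To work with Excel you first need its basic concepts: columns are named with letters and rows with numbers. The top-left cell has the coordinate A1, the cell below it is A2, and the cell to its right is B1.
- openpyxl has three classes at different levels: Workbook represents a workbook, Worksheet represents a sheet, and Cell represents a single cell. Each class has many attributes and methods.
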
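Typical workflow for working with Excel
- To open or create an Excel file, create a Workbook object.
- To get a sheet, first create a Workbook object, then use that object to obtain a Worksheet object.
- To read the data in a sheet, obtain the Worksheet object and then get from it the Cell objects that represent individual cells.
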
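The three steps above can be sketched as follows. This is only a minimal illustration that works in memory; the sheet title "demo" and the cell contents are made up for the example.

```python
from openpyxl import Workbook

wb = Workbook()        # step 1: a Workbook object (new, in memory)
ws = wb.active         # step 2: a Worksheet obtained from the workbook
ws.title = "demo"      # hypothetical sheet name
ws["A1"] = "hello"     # write a value through the worksheet
cell = ws["A1"]        # step 3: a Cell obtained from the worksheet
print(cell.coordinate, cell.value)
```

Nothing is written to disk until wb.save() is called with a file name.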
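Workbook object

A Workbook object represents one Excel document, so create one before doing anything else. To create a new document, call the Workbook class directly. To read an existing document, use openpyxl's load_workbook function; it accepts several parameters, but only filename is required. filename can be a file name or an already opened file object. Note that Workbook() does not take a file name; the name is given later, when calling save().
    >>> import openpyxl
    >>> excel = openpyxl.Workbook()                   # new, empty workbook; save later with excel.save('hello.xlsx')
    >>> excel1 = openpyxl.load_workbook('abc.xlsx')   # existing file
    >>>
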
PS: Workbook() and load_workbook() both return a Workbook object.
A Workbook object provides many attributes and methods, most of them related to sheets. Some of the attributes:
- active: the currently active Worksheet
- worksheets: all Worksheets, as a list
- read_only: whether the document was opened in read_only mode
- encoding: the character encoding of the document
- properties: the document metadata, such as title, creator, and creation date
- sheetnames: the names of the sheets in the workbook, as a list
    >>> import openpyxl
    >>> excel2 = openpyxl.load_workbook('abc.xlsx')
    >>> excel2.active
    <Worksheet "abc">
    >>> excel2.read_only
    False
    >>> excel2.worksheets
    [<Worksheet "abc">, <Worksheet "def">]
    >>> excel2.properties
    <openpyxl.packaging.core.DocumentProperties object>
    Parameters:
    creator='openpyxl', title=None, description=None, subject=None, identifier=None, language=None, created=datetime.datetime(2006, 9, 16, 0, 0), modified=datetime.datetime(2018, 2, 5, 7, 25, 18), lastModifiedBy='Are you SuperMan', category=None, contentStatus=None, version=None, revision=None, keywords=None, lastPrinted=None
    >>> excel2.encoding
    'utf-8'
    >>>

Methods provided by Workbook:
- get_sheet_names: get the names of all sheets (deprecated; use the Workbook's sheetnames attribute instead)
- get_sheet_by_name: get a Worksheet by its name (deprecated; use wb['sheet name'] on the Workbook instead)
- get_active_sheet: get the active sheet (deprecated; use the active attribute instead)
- remove_sheet: delete a sheet (newer releases prefer the remove method)
- create_sheet: create an empty sheet
- copy_worksheet: copy a sheet within the Workbook
    >>> excel2.get_sheet_names()
    Warning (from warnings module):
      File "__main__", line 1
    DeprecationWarning: Call to deprecated function get_sheet_names (Use wb.sheetnames).
    ['abc', 'def']
    >>> excel2.sheetnames
    ['abc', 'def']
    >>> excel2.get_sheet_by_name('def')
    Warning (from warnings module):
      File "__main__", line 1
    DeprecationWarning: Call to deprecated function get_sheet_by_name (Use wb[sheetname]).
    <Worksheet "def">
    >>> excel2['def']
    <Worksheet "def">
    >>> excel2.get_active_sheet()
    Warning (from warnings module):
      File "__main__", line 1
    DeprecationWarning: Call to deprecated function get_active_sheet (Use the .active property).
    <Worksheet "abc">
    >>> excel2.create_sheet('ghk')
    <Worksheet "ghk">

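The sheet-management methods listed above can be combined as in the sketch below. It assumes a recent openpyxl release and uses wb.remove() in place of the older remove_sheet; the sheet names "extra" and "front" are invented for the example.

```python
from openpyxl import Workbook

wb = Workbook()                        # a new workbook starts with one sheet
first = wb.active
extra = wb.create_sheet("extra")       # appended after the existing sheets
front = wb.create_sheet("front", 0)    # inserted at position 0
dup = wb.copy_worksheet(first)         # copy of the first sheet, added at the end
wb.remove(extra)                       # delete the "extra" sheet again
print(wb.sheetnames)
```

copy_worksheet only copies within the same workbook, so the source sheet must belong to wb.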
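Worksheet object
- Once you have a Worksheet object you can read the sheet's attributes, get the data in its cells, and modify its contents. openpyxl offers very flexible ways to access cells and data. Commonly used Worksheet attributes:
- title: the title of the sheet
- dimensions: the size of the sheet, meaning the area that contains data, given as "top-left coordinate:bottom-right coordinate"
- max_row: the largest row number in use
- min_row: the smallest row number in use
- max_column: the largest column number in use
- min_column: the smallest column number in use
- rows: the cells (Cell objects) row by row, as a generator
- columns: the cells (Cell objects) column by column, as a generator
- freeze_panes: the frozen panes
- values: the contents (data) of the sheet row by row, as a generator
PS: freeze_panes is a little unusual. It is mainly used on large sheets to freeze the top rows or the left columns, which then stay visible while the user scrolls. It can be set to a Cell object or to a cell coordinate string; the rows above that cell and the columns to its left are frozen (the cell's own row and column are not). For example, set freeze_panes to A2 to freeze the first row, to B1 to freeze the first column, and to B2 to freeze both the first row and the first column. A value of None means nothing is frozen.
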
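The freeze_panes rules described above can be tried out with a short sketch like this one. The header row is made up, and the workbook is kept in memory rather than saved.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["name", "q1", "q2"])   # hypothetical header row

ws.freeze_panes = "A2"            # freeze the first row only
ws.freeze_panes = "B1"            # freeze the first column only
ws.freeze_panes = "B2"            # freeze the first row and the first column
frozen = ws.freeze_panes          # read back the current setting

ws.freeze_panes = None            # remove the freeze again
unfrozen = ws.freeze_panes
print(frozen, unfrozen)
```

Each assignment replaces the previous setting, so only the last one before saving takes effect in the file.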
Commonly used Worksheet methods:
- iter_rows: iterate over the cells row by row; accepts the parameters min_row, max_row, min_col, and max_col
- iter_cols: iterate over the cells column by column
- append: append a row of data at the end of the sheet
- merge_cells: merge several cells
- unmerge_cells: undo a merge
    >>> for row in excel2['abc'].iter_rows(min_row=2, max_row=4, min_col=2, max_col=4):
    ...     print(row)
    ...
    (<Cell 'abc'.B2>, <Cell 'abc'.C2>, <Cell 'abc'.D2>)
    (<Cell 'abc'.B3>, <Cell 'abc'.C3>, <Cell 'abc'.D3>)
    (<Cell 'abc'.B4>, <Cell 'abc'.C4>, <Cell 'abc'.D4>)

PS: As the Worksheet attributes and methods show, most of them return Cell objects. A Cell object represents one cell. You can get a Cell either by its Excel coordinate or with the Worksheet's cell method.
    >>> excel2['abc']['A1']
    <Cell 'abc'.A1>
    >>> excel2['abc'].cell(row=1, column=2)
    <Cell 'abc'.B1>
    >>>

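Cell object
- The Cell object is fairly simple. Commonly used attributes:
- row: the row the cell is in
- column: the column the cell is in
- value: the value of the cell
- coordinate: the coordinate of the cell
    >>> excel2['abc'].cell(row=1, column=2).coordinate
    'B1'
    >>> excel2['abc'].cell(row=1, column=2).value
    'test'
    >>> excel2['abc'].cell(row=1, column=2).row
    1
    >>> excel2['abc'].cell(row=1, column=2).column
    'B'
- Note: the 'B' above comes from an older openpyxl release. From openpyxl 2.6 on, column returns the column number (2 here) and the letter is available as column_letter.
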
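For reference, here is a small in-memory sketch of the Cell attributes listed above. It assumes openpyxl 2.6 or later, where column is an integer and column_letter holds the letter; the value "test" mirrors the session above.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
cell = ws.cell(row=1, column=2, value="test")   # create or fetch B1 and set its value

print(cell.coordinate)      # the Excel-style address of the cell
print(cell.row)             # row number
print(cell.column)          # column number (assumes openpyxl >= 2.6)
print(cell.column_letter)   # column letter (assumes openpyxl >= 2.6)
print(cell.value)
```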
Several ways to print the data in a sheet
    # ---------- Method 1 ----------
    >>> for row in excel2['abc'].rows:
    ...     print(*[cell.value for cell in row])
    ...
    # ---------- Method 2 ----------
    >>> for row in excel2['abc'].values:
    ...     print(*row)
    ...
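A third option, sketched here under the assumption that a recent openpyxl release is installed, is to pass values_only=True to iter_rows so that each row arrives as a plain tuple of values. The sample data is invented for the illustration.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["name", "score"])   # hypothetical header row
ws.append(["alice", 90])
ws.append(["bob", 85])

# Skip the header and collect each remaining row as a tuple of values
rows = list(ws.iter_rows(min_row=2, values_only=True))
for row in rows:
    print(*row)
```

This combines the range selection of iter_rows with the convenience of the values attribute.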
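Worked examples

1. Installation
    pip install openpyxl
To insert image files into a workbook you also need pillow. The installer the author used was PIL-fork-1.1.7.win-amd64-py2.7.exe; on current Python versions, pip install pillow does the same job.
Cell styles are grouped into the following classes:
- font: font size, font color, underline, and so on
- fill: fill color and so on
- border: cell borders
- alignment: text alignment
- number_format: data format
- protection: write protection
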
2. Create an Excel file and write different kinds of content
    # -*- coding: utf-8 -*-
    import datetime
    import time

    from openpyxl import Workbook

    wb = Workbook()   # create the workbook object
    ws = wb.active    # grab the active worksheet (the first sheet)

    # Data can be assigned directly to cells
    ws['A1'] = 42                            # write a number
    ws['B1'] = "你好" + "automation test"     # write Chinese text

    # Rows can also be appended
    ws.append([1, 2, 3])                     # write several cells at once

    # Python types are converted automatically
    ws['A2'] = datetime.datetime.now()       # write the current time
    # Write a time using a custom text format
    ws['A3'] = time.strftime("%Y年%m月%d日 %H时%M分%S秒", time.localtime())

    # Save the file
    wb.save("e:\\sample.xlsx")

3. Create sheets
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook

    wb = Workbook()
    ws1 = wb.create_sheet("Mysheet")      # create a sheet (appended at the end by default)
    ws1.title = "New Title"               # rename the sheet
    ws2 = wb.create_sheet("Mysheet", 0)   # the second argument sets the insert position
    ws2.title = "你好"                     # sheet titles are text (unicode) strings
    ws1.sheet_properties.tabColor = "1072BA"   # background color of the sheet tab

    # Get a particular sheet object
    print(wb["你好"])
    print(wb["New Title"])

    # Get all sheet names and iterate over them
    print(wb.sheetnames)
    for sheet_name in wb.sheetnames:
        print(sheet_name)
    print("*" * 50)
    for sheet in wb:
        print(sheet.title)

    # Copy a sheet
    wb["New Title"]["A1"] = "zeke"
    source = wb["New Title"]
    target = wb.copy_worksheet(source)
    # The copy can be renamed afterwards, e.g. target.title = 'new2'

    # Save the file
    wb.save("e:\\sample.xlsx")

4. Working with cells
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook

    wb = Workbook()
    ws1 = wb.create_sheet("Mysheet")          # create a sheet
    ws1["A1"] = 123.11
    ws1["B2"] = "你好"
    d = ws1.cell(row=4, column=2, value=10)   # write 10 into B4 and get the Cell back

    print(ws1["A1"].value)
    print(ws1["B2"].value)
    print(d.value)

    # Save the file
    wb.save("e:\\sample.xlsx")

5. Working with ranges of cells
- Both ws.rows and ws.iter_rows() return generator objects.
- Apart from those two, a single row or a single column is a tuple, and several rows or several columns form a two-dimensional tuple.
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook

    wb = Workbook()
    ws1 = wb.create_sheet("Mysheet")   # create a sheet
    ws1["A1"] = 1
    ws1["A2"] = 2
    ws1["A3"] = 3
    ws1["B1"] = 4
    ws1["B2"] = 5
    ws1["B3"] = 6
    ws1["C1"] = 7
    ws1["C2"] = 8
    ws1["C3"] = 9

    # A single column
    print(ws1["A"])
    for cell in ws1["A"]:
        print(cell.value)

    # Several columns: visit every value
    print(ws1["A:C"])
    for column in ws1["A:C"]:
        for cell in column:
            print(cell.value)

    # Several rows
    row_range = ws1[1:3]
    print(row_range)
    for row in row_range:
        for cell in row:
            print(cell.value)

    print("*" * 50)
    for row in ws1.iter_rows(min_row=1, min_col=1, max_col=3, max_row=3):
        for cell in row:
            print(cell.value)

    # All rows
    print(ws1.rows)
    for row in ws1.rows:
        print(row)

    print("*" * 50)
    # All columns
    print(ws1.columns)
    for col in ws1.columns:
        print(col)

    wb.save("e:\\sample.xlsx")
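- Working with percentages
    # -*- coding: utf-8 -*-
    # Note: guess_types is a switch from older openpyxl releases. It appears to
    # have been dropped in the 2.6 series; on newer releases the assignment has
    # no effect and "12%" is stored unchanged as text.
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    wb.guess_types = True
    ws = wb.active
    ws["D1"] = "12%"
    print(ws["D1"].value)   # with type guessing on, prints the decimal value
    # Save the file
    wb.save("e:\\sample.xlsx")

    # -*- coding: utf-8 -*-
    # The same script with type guessing switched off
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    wb.guess_types = False
    ws = wb.active
    ws["D1"] = "12%"
    print(ws["D1"].value)   # prints the percentage string
    wb.save("e:\\sample.xlsx")
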
Get all column objects:
    # coding=utf-8
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    ws = wb.active

    cols = []
    for col in ws.iter_cols():
        cols.append(col)

    print(cols)                # all columns
    print(cols[0])             # the first column
    print(cols[0][0])          # the Cell object in the first row of the first column
    print(cols[0][0].value)    # the value in the first row of the first column
    print("*" * 30)
    print(cols[-1])            # the last column
    print(cols[-1][-1])        # the Cell object in the last row of the last column
    print(cols[-1][-1].value)  # the value in the last row of the last column

6. Working with an existing file
    # -*- coding: utf-8 -*-
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    wb.guess_types = True   # guess cell types (only honored by older openpyxl releases)
    ws = wb.active
    ws["D1"] = "12%"
    print(ws["D1"].value)

    # Save the file
    wb.save("e:\\sample.xlsx")
    # Note: if the original file contains images or charts, they may be lost when saving.

7. Cell types
    # -*- coding: utf-8 -*-
    import datetime

    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    ws = wb.active
    wb.guess_types = True   # only honored by older openpyxl releases

    ws["A1"] = datetime.datetime(2010, 7, 21)
    print(ws["A1"].number_format)

    ws["A2"] = "12%"
    print(ws["A2"].number_format)

    ws["A3"] = 1.1
    print(ws["A3"].number_format)

    ws["A4"] = "中国"
    print(ws["A4"].number_format)

    # Save the file
    wb.save("e:\\sample.xlsx")
Output reported by the author:
    yyyy-mm-dd h:mm:ss
    0%
    General
    General
A cell with the default format shows General, a cell formatted as a number shows '0.00_ ', and a percentage shows 0%. A number written directly from Python gets the General format; a numeric format has to be set in Excel or by assigning to the cell's number_format.

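Since a plain number ends up with the General format, the format can also be set from Python by assigning to number_format. The sketch below is one way to do that; the values and the format codes "0%" and "0.00" are illustrative choices, not taken from the original.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

ws["A1"] = 0.12
default_format = ws["A1"].number_format   # format before any change
ws["A1"].number_format = "0%"             # display the stored 0.12 as a percentage

ws["A2"] = 1234.5
ws["A2"].number_format = "0.00"           # display with two decimal places

print(default_format, ws["A1"].number_format, ws["A2"].number_format)
```

The stored value stays a number (0.12), so formulas and sorting keep working; only the display changes.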
8. Using formulas
    # -*- coding: utf-8 -*-
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    ws1 = wb.active
    ws1["A1"] = 1
    ws1["A2"] = 2
    ws1["A3"] = 3
    ws1["A4"] = "=SUM(1, 1)"
    ws1["A5"] = "=SUM(A1:A3)"

    # Both lines print the formula text, not the result:
    # openpyxl does not evaluate formulas, so the computed value is not available here.
    print(ws1["A4"].value)
    print(ws1["A5"].value)

    # Save the file
    wb.save("e:\\sample.xlsx")

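To see this behavior without touching a real file, the sketch below saves a workbook to an in-memory buffer and loads it twice, once normally and once with load_workbook's data_only=True option. data_only returns the value Excel cached the last time it calculated the file; a file written only by openpyxl has no such cached value, which is what this example assumes and demonstrates.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active
for n in (1, 2, 3):
    ws.append([n])              # fills A1..A3
ws["A4"] = "=SUM(A1:A3)"

buf = BytesIO()
wb.save(buf)                    # save to memory instead of a path
data = buf.getvalue()

# Default load: the cell holds the formula text
formula = load_workbook(BytesIO(data)).active["A4"].value
# data_only load: the cell holds the cached result, if Excel ever stored one
cached = load_workbook(BytesIO(data), data_only=True).active["A4"].value
print(formula, cached)
```

Opening and saving the file in Excel itself stores the calculated results, after which a data_only load can return them.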
9. Merging cells
    # -*- coding: utf-8 -*-
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    ws = wb.active

    ws.merge_cells('A2:D2')
    ws.unmerge_cells('A2:D2')
    # Running the unmerge step on its own in a script raises an error;
    # the range has to be merged again before it can be unmerged.

    # or equivalently
    ws.merge_cells(start_row=2, start_column=1, end_row=2, end_column=4)
    ws.unmerge_cells(start_row=2, start_column=1, end_row=2, end_column=4)

    # Save the file
    wb.save("e:\\sample.xlsx")

10. Inserting an image
- Pillow must be installed first. The installer the author used: PIL-fork-1.1.7.win-amd64-py2.7.exe
    # -*- coding: utf-8 -*-
    from openpyxl import load_workbook
    from openpyxl.drawing.image import Image

    wb = load_workbook('e:\\sample.xlsx')
    ws1 = wb.active

    img = Image('e:\\1.png')
    ws1.add_image(img, 'A1')   # anchor the image at cell A1

    # Save the file
    wb.save("e:\\sample.xlsx")

11. Hiding cells
    # -*- coding: utf-8 -*-
    from openpyxl import load_workbook

    wb = load_workbook('e:\\sample.xlsx')
    ws1 = wb.active

    # Group columns A to D and hide them
    ws1.column_dimensions.group('A', 'D', hidden=True)
    # Author's note: ws1.row_dimensions had no group method in the release used
    # here (newer releases may provide one).

    # Save the file
    wb.save("e:\\sample.xlsx")

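Individual rows or columns can also be hidden through the hidden attribute of their dimension objects. The following is a small in-memory sketch; column B and row 3 are arbitrary choices for the example.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
for i in range(1, 6):
    ws.append([i, i * 10, i * 100])   # hypothetical sample data in A1:C5

ws.column_dimensions["B"].hidden = True   # hide a single column
ws.row_dimensions[3].hidden = True        # hide a single row

print(ws.column_dimensions["B"].hidden, ws.row_dimensions[3].hidden)
```

Hidden rows and columns keep their data; they are only collapsed when the file is viewed in Excel.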
12. Drawing a bar chart
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.chart import BarChart, Reference

    wb = Workbook()
    ws = wb.active
    for i in range(10):
        ws.append([i])   # one value per row in column A

    values = Reference(ws, min_col=1, min_row=1, max_col=1, max_row=10)
    chart = BarChart()
    chart.add_data(values)
    ws.add_chart(chart, "E15")   # place the chart at cell E15

    # Save the file
    wb.save("e:\\sample.xlsx")

13. Drawing a pie chart
    # -*- coding: utf-8 -*-
    from copy import deepcopy

    from openpyxl import Workbook
    from openpyxl.chart import PieChart, ProjectedPieChart, Reference
    from openpyxl.chart.series import DataPoint

    data = [
        ['Pie', 'Sold'],
        ['Apple', 50],
        ['Cherry', 30],
        ['Pumpkin', 10],
        ['Chocolate', 40],
    ]

    wb = Workbook()
    ws = wb.active
    for row in data:
        ws.append(row)

    pie = PieChart()
    labels = Reference(ws, min_col=1, min_row=2, max_row=5)
    data = Reference(ws, min_col=2, min_row=1, max_row=5)
    pie.add_data(data, titles_from_data=True)
    pie.set_categories(labels)
    pie.title = "Pies sold by category"

    # Cut the first slice out of the pie
    first_slice = DataPoint(idx=0, explosion=20)
    pie.series[0].data_points = [first_slice]
    ws.add_chart(pie, "D1")

    ws = wb.create_sheet(title="Projection")
    data = [
        ['Page', 'Views'],
        ['Search', 95],
        ['Products', 4],
        ['Offers', 0.5],
        ['Sales', 0.5],
    ]
    for row in data:
        ws.append(row)

    projected_pie = ProjectedPieChart()
    projected_pie.type = "pie"
    projected_pie.splitType = "val"   # split by value
    labels = Reference(ws, min_col=1, min_row=2, max_row=5)
    data = Reference(ws, min_col=2, min_row=1, max_row=5)
    projected_pie.add_data(data, titles_from_data=True)
    projected_pie.set_categories(labels)
    ws.add_chart(projected_pie, "A10")

    projected_bar = deepcopy(projected_pie)
    projected_bar.type = "bar"
    projected_bar.splitType = 'pos'   # split by position
    ws.add_chart(projected_bar, "A27")

    # Save the file
    wb.save("e:\\sample.xlsx")

14. Defining a table range and its style
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.worksheet.table import Table, TableStyleInfo

    wb = Workbook()
    ws = wb.active

    data = [
        ['Apples', 10000, 5000, 8000, 6000],
        ['Pears', 2000, 3000, 4000, 5000],
        ['Bananas', 6000, 6000, 6500, 6000],
        ['Oranges', 500, 300, 200, 700],
    ]

    # Add column headings. NB: these must be strings
    ws.append(["Fruit", "2011", "2012", "2013", "2014"])
    for row in data:
        ws.append(row)

    tab = Table(displayName="Table1", ref="A1:E5")

    # Add a default style with striped rows and banded columns.
    # showFirstColumn / showLastColumn: give the first / last column the same color as the style's header row
    # showRowStripes / showColumnStripes: alternate colors by row / by column
    style = TableStyleInfo(name="TableStyleMedium9", showFirstColumn=True,
                           showLastColumn=True, showRowStripes=True,
                           showColumnStripes=True)
    tab.tableStyleInfo = style
    ws.add_table(tab)

    # Save the file
    wb.save("e:\\sample.xlsx")

15. Setting the font color of a cell
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.styles import Font

    wb = Workbook()
    ws = wb.active

    a1 = ws['A1']
    d4 = ws['D4']
    # The original used colors.RED from openpyxl.styles; that constant may be
    # missing in newer releases, so a hex color code is used here instead.
    # Any color code works, e.g. color="FFBB00".
    RED = "FFFF0000"
    ft = Font(color=RED)
    a1.font = ft
    d4.font = ft

    # To change the font of a cell, assign a new Font object
    a1.font = Font(color=RED, italic=True)   # italic; the change only affects A1
    a1.value = "abc"

    # Save the file
    wb.save("e:\\sample.xlsx")

16. Setting the font name and size
    # -*- coding: utf-8 -*-
    from copy import copy

    from openpyxl import Workbook
    from openpyxl.styles import Font

    wb = Workbook()
    ws = wb.active

    a1 = ws['A1']
    a1.value = "abc"

    ft1 = Font(name='宋体', size=14)
    ft2 = copy(ft1)       # copy the font object
    ft2.name = "Tahoma"   # changing the copy leaves ft1 untouched
    print(ft1.name)
    print(ft2.name)
    print(ft2.size)       # the size is carried over from ft1
    a1.font = ft1

    # Save the file
    wb.save("e:\\sample.xlsx")

17. Setting the font for a row or a column
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.styles import Font

    wb = Workbook()
    ws = wb.active

    col = ws.column_dimensions['A']
    col.font = Font(bold=True)            # make column A bold
    row = ws.row_dimensions[1]
    row.font = Font(underline="single")   # underline the first row

    # Save the file
    wb.save("e:\\sample.xlsx")

18. Setting a cell's border, font, color, size, and background color
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.styles import NamedStyle, Font, Border, Side, PatternFill

    wb = Workbook()
    ws = wb.active

    highlight = NamedStyle(name="highlight")
    highlight.font = Font(bold=True, size=20, color="ff0100")
    highlight.fill = PatternFill("solid", fgColor="DDDDDD")   # background fill
    bd = Side(style='thick', color="000000")
    highlight.border = Border(left=bd, top=bd, right=bd, bottom=bd)

    print(dir(ws["A1"]))            # list the attributes available on a cell
    ws["A1"].style = highlight      # apply the named style to A1

    # Save the file
    wb.save("e:\\sample.xlsx")

19. Commonly used styles and attribute settings
    # -*- coding: utf-8 -*-
    from openpyxl import Workbook
    from openpyxl.styles import PatternFill, Border, Side, Alignment, Protection, Font

    wb = Workbook()
    ws = wb.active

    ft = Font(name='微软雅黑', size=11, bold=False, italic=False, vertAlign=None,
              underline='none', strike=False, color='FF000000')
    fill = PatternFill(fill_type="solid", start_color='FFEEFFFF', end_color='FF001100')

    # Available border styles: 'hair', 'medium', 'dashDot', 'dotted', 'mediumDashDot',
    # 'dashed', 'mediumDashed', 'mediumDashDotDot', 'dashDotDot', 'slantDashDot',
    # 'double', 'thick', 'thin'
    # diagonal is the diagonal line through the cell
    bd = Border(left=Side(border_style="thin", color='FF001000'),
                right=Side(border_style="thin", color='FF110000'),
                top=Side(border_style="thin", color='FF110000'),
                bottom=Side(border_style="thin", color='FF110000'),
                diagonal=Side(border_style=None, color='FF000000'),
                diagonal_direction=0,
                outline=Side(border_style=None, color='FF000000'),
                vertical=Side(border_style=None, color='FF000000'),
                horizontal=Side(border_style=None, color='FF110000'))
    alignment = Alignment(horizontal='general', vertical='bottom', text_rotation=0,
                          wrap_text=False, shrink_to_fit=False, indent=0)
    number_format = 'General'
    protection = Protection(locked=True, hidden=False)

    ws["B5"].font = ft
    ws["B5"].fill = fill
    ws["B5"].border = bd
    ws["B5"].alignment = alignment
    ws["B5"].number_format = number_format
    ws["B5"].protection = protection   # the original built this object but never applied it
    ws["B5"].value = "zeke"

    # Save the file
    wb.save("e:\\sample.xlsx")