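Formatting (pretty-printing) XML
Many people probably format XML with XDocument as shown below; it is, after all, the API Microsoft recommends.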
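    using System;
    using System.Xml.Linq;

    // Pretty-prints the XML; returns the input unchanged if it cannot be parsed.
    string FormatXml(string xml)
    {
        try
        {
            XDocument doc = XDocument.Parse(xml);
            return doc.ToString();
        }
        catch (Exception)
        {
            return xml;
        }
    }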
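However, for a document like the following, where the same namespace URI is declared both as the default namespace and under a prefix (here ss), the method above returns markup that differs from the input: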
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Created>2015-06-05T18:19:34Z</Created>
    <LastSaved>2015-06-05T18:19:39Z</LastSaved>
    <Version>16.00</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG/>
    <RemovePersonalInformation/>
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>12645</WindowHeight>
    <WindowWidth>22260</WindowWidth>
    <WindowTopX>1170</WindowTopX>
    <WindowTopY>0</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
      <Alignment ss:Vertical="Bottom"/>
      <Borders/>
      <Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11" ss:Color="#000000"/>
      <Interior/>
      <NumberFormat/>
      <Protection/>
    </Style>
  </Styles>
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75">
    </Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3"/>
        <Footer x:Margin="0.3"/>
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
      </PageSetup>
      <Print>
        <ValidPrinterInfo/>
        <PaperSizeIndex>9</PaperSizeIndex>
        <HorizontalResolution>600</HorizontalResolution>
        <VerticalResolution>600</VerticalResolution>
      </Print>
      <Selected/>
      <ProtectObjects>False</ProtectObjects>
      <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
  </Worksheet>
</Workbook>
After formatting with XDocument, the ss: prefix has been added automatically to every element in the spreadsheet namespace:
<?xml version="1.0" encoding="utf-8"?>
<?mso-application progid="Excel.Sheet"?>
<ss:Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Created>2015-06-05T18:19:34Z</Created>
    <LastSaved>2015-06-05T18:19:39Z</LastSaved>
    <Version>16.00</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG />
    <RemovePersonalInformation />
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>12645</WindowHeight>
    <WindowWidth>22260</WindowWidth>
    <WindowTopX>1170</WindowTopX>
    <WindowTopY>0</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <ss:Styles>
    <ss:Style ss:ID="Default" ss:Name="Normal">
      <ss:Alignment ss:Vertical="Bottom" />
      <ss:Borders />
      <ss:Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11" ss:Color="#000000" />
      <ss:Interior />
      <ss:NumberFormat />
      <ss:Protection />
    </ss:Style>
  </ss:Styles>
  <ss:Worksheet ss:Name="Sheet1">
    <ss:Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75"></ss:Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3" />
        <Footer x:Margin="0.3" />
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75" />
      </PageSetup>
      <Print>
        <ValidPrinterInfo />
        <PaperSizeIndex>9</PaperSizeIndex>
        <HorizontalResolution>600</HorizontalResolution>
        <VerticalResolution>600</VerticalResolution>
      </Print>
      <Selected />
      <ProtectObjects>False</ProtectObjects>
      <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
  </ss:Worksheet>
</ss:Workbook>
Solution: use XmlDocument
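    using System;
    using System.Text;
    using System.Xml;

    // Pretty-prints the XML through XmlDocument; returns the input unchanged if it cannot be parsed.
    static string FormatXml(string xml)
    {
        try
        {
            var doc = new XmlDocument();
            doc.LoadXml(xml);

            StringBuilder output = new StringBuilder();
            XmlWriterSettings settings = new XmlWriterSettings
            {
                Indent = true
            };
            // Writing into a StringBuilder is why the declaration reports utf-16.
            using (XmlWriter writer = XmlWriter.Create(output, settings))
            {
                doc.Save(writer);
            }
            return output.ToString();
        }
        catch (Exception)
        {
            return xml;
        }
    }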
After formatting, the element names stay as they were (no prefix is added):
<?xml version="1.0" encoding="utf-16"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Created>2015-06-05T18:19:34Z</Created>
    <LastSaved>2015-06-05T18:19:39Z</LastSaved>
    <Version>16.00</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG />
    <RemovePersonalInformation />
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>12645</WindowHeight>
    <WindowWidth>22260</WindowWidth>
    <WindowTopX>1170</WindowTopX>
    <WindowTopY>0</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
      <Alignment ss:Vertical="Bottom" />
      <Borders />
      <Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11" ss:Color="#000000" />
      <Interior />
      <NumberFormat />
      <Protection />
    </Style>
  </Styles>
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75"></Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3" />
        <Footer x:Margin="0.3" />
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75" />
      </PageSetup>
      <Print>
        <ValidPrinterInfo />
        <PaperSizeIndex>9</PaperSizeIndex>
        <HorizontalResolution>600</HorizontalResolution>
        <VerticalResolution>600</VerticalResolution>
      </Print>
      <Selected />
      <ProtectObjects>False</ProtectObjects>
      <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
  </Worksheet>
</Workbook>
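The difference between the two APIs can probably be reproduced on a much smaller document. The sketch below is only an illustration: the class name PrefixDemo, the sample string, and the URI urn:example are made up for this example and do not come from the Excel file above. The sample binds one URI both as the default namespace and under the prefix p, then serializes it once through XDocument and once through XmlDocument so the element names can be compared.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

static class PrefixDemo
{
    // Hypothetical sample: the default namespace and the prefix "p" share one URI.
    public const string Sample =
        "<a xmlns=\"urn:example\" xmlns:p=\"urn:example\"><b/></a>";

    // Round-trips the text through LINQ to XML without indentation.
    public static string ViaXDocument(string xml)
    {
        return XDocument.Parse(xml).ToString(SaveOptions.DisableFormatting);
    }

    // Round-trips the text through the DOM, which stores each node's prefix.
    public static string ViaXmlDocument(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.OuterXml;
    }

    static void Main()
    {
        Console.WriteLine(ViaXDocument(Sample));
        Console.WriteLine(ViaXmlDocument(Sample));
    }
}
```

Comparing the two printed lines shows whether the root element keeps its unprefixed name; with the full Excel document above, only the XmlDocument route did.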
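The line break (CR+LF) problem
If an element's value contains \r\n line breaks, XDocument and XmlDocument convert them to \n by default when the document is read, so saving the document as-is loses the \r.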
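In addition, the XML files that Excel produces with Save As write line breaks as a character reference (【&#10;】), which is hard to restore directly.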
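The reading side can be checked with a short sketch. It assumes nothing beyond XDocument.Parse; the class name LineEndDemo, the helper names, and the Data element are hypothetical. It reads one value that contains a literal CR+LF and one that spells the same two characters as the character references &#13;&#10;, then prints each value with its control characters made visible.

```csharp
using System;
using System.Xml.Linq;

static class LineEndDemo
{
    // Parses the XML text and returns the text content of the root element.
    public static string ReadValue(string xml)
    {
        return XDocument.Parse(xml).Root.Value;
    }

    // Makes CR and LF visible so the two results can be compared by eye.
    public static string Show(string s)
    {
        return s.Replace("\r", "\\r").Replace("\n", "\\n");
    }

    static void Main()
    {
        // Literal CR+LF inside the element content.
        Console.WriteLine(Show(ReadValue("<Data>a\r\nb</Data>")));
        // The same characters written as character references.
        Console.WriteLine(Show(ReadValue("<Data>a&#13;&#10;b</Data>")));
    }
}
```

Line-end normalization applies to literal line breaks in the input, not to characters produced by character references, which is why the two inputs are worth comparing.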
If you do not care about the layout of the XML, the problem can be solved as follows:
    using System;
    using System.Text;
    using System.Xml;

    // Re-serializes the XML with every line break written as \r\n; no indentation is applied.
    static string FormatXml(string xml)
    {
        try
        {
            var doc = new XmlDocument();
            doc.LoadXml(xml);

            StringBuilder output = new StringBuilder();
            XmlWriterSettings settings = new XmlWriterSettings
            {
                NewLineChars = "\r\n",
                NewLineOnAttributes = true, // has no effect while Indent is false
                NewLineHandling = NewLineHandling.Replace, // rewrite line breaks as NewLineChars
                CheckCharacters = false,
                Indent = false
            };
            using (XmlWriter writer = XmlWriter.Create(output, settings))
            {
                doc.Save(writer);
            }
            return output.ToString();
        }
        catch (Exception)
        {
            return xml;
        }
    }
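To see what the NewLineHandling setting does in isolation, here is a minimal sketch that writes a single element directly with XmlWriter. The class name NewLineDemo and the Data element are hypothetical, and OmitXmlDeclaration is set only to keep the output short. It writes the same text, containing a bare \n, under each of the three NewLineHandling values.

```csharp
using System;
using System.Text;
using System.Xml;

static class NewLineDemo
{
    // Writes <Data>text</Data> with the given handling and returns the markup.
    public static string Write(string text, NewLineHandling handling)
    {
        var output = new StringBuilder();
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true, // assumption: keep the demo output short
            NewLineChars = "\r\n",
            NewLineHandling = handling
        };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteElementString("Data", text);
        }
        return output.ToString();
    }

    static void Main()
    {
        foreach (NewLineHandling handling in Enum.GetValues(typeof(NewLineHandling)))
        {
            string xml = Write("a\nb", handling);
            // Make CR and LF visible in the console output.
            Console.WriteLine(handling + ": " + xml.Replace("\r", "\\r").Replace("\n", "\\n"));
        }
    }
}
```

With Replace, the writer emits NewLineChars for each line break in text content, which is the behavior the method above relies on to put the \r back.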