云中客

梦想有多大,就能走多远

  博客园 :: 首页 :: 博问 :: 闪存 :: 新随笔 :: 联系 :: 订阅 订阅 :: 管理 ::

XML整形

估计如下一样使用XDocument的人比较多,毕竟也是微软推荐使用的。

        string FormatXml(string Xml)
        {
            try
            {
                XDocument doc = XDocument.Parse(Xml);
                return doc.ToString();
            }
            catch (Exception)
            {
                return Xml;
            }
        }

当出现如下文档(默认命名空间,前缀命名空间都定义)时,以上方法返回的值格式变了:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Created>2015-06-05T18:19:34Z</Created>
  <LastSaved>2015-06-05T18:19:39Z</LastSaved>
  <Version>16.00</Version>
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
  <AllowPNG/>
  <RemovePersonalInformation/>
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>12645</WindowHeight>
  <WindowWidth>22260</WindowWidth>
  <WindowTopX>1170</WindowTopX>
  <WindowTopY>0</WindowTopY>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11"
    ss:Color="#000000"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Sheet1">
  <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1"
   x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75">
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Print>
    <ValidPrinterInfo/>
    <PaperSizeIndex>9</PaperSizeIndex>
    <HorizontalResolution>600</HorizontalResolution>
    <VerticalResolution>600</VerticalResolution>
   </Print>
   <Selected/>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
</Workbook>

XDocument整形之后,前缀显示自动加上了:

<?xml version="1.0" encoding="utf-8"?>
<?mso-application progid="Excel.Sheet"?>
<ss:Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Created>2015-06-05T18:19:34Z</Created>
    <LastSaved>2015-06-05T18:19:39Z</LastSaved>
    <Version>16.00</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG />
    <RemovePersonalInformation />
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>12645</WindowHeight>
    <WindowWidth>22260</WindowWidth>
    <WindowTopX>1170</WindowTopX>
    <WindowTopY>0</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <ss:Styles>
    <ss:Style ss:ID="Default" ss:Name="Normal">
      <ss:Alignment ss:Vertical="Bottom" />
      <ss:Borders />
      <ss:Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11" ss:Color="#000000" />
      <ss:Interior />
      <ss:NumberFormat />
      <ss:Protection />
    </ss:Style>
  </ss:Styles>
  <ss:Worksheet ss:Name="Sheet1">
    <ss:Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75"></ss:Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3" />
        <Footer x:Margin="0.3" />
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75" />
      </PageSetup>
      <Print>
        <ValidPrinterInfo />
        <PaperSizeIndex>9</PaperSizeIndex>
        <HorizontalResolution>600</HorizontalResolution>
        <VerticalResolution>600</VerticalResolution>
      </Print>
      <Selected />
      <ProtectObjects>False</ProtectObjects>
      <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
  </ss:Worksheet>
</ss:Workbook>

解决办法:使用XmlDocument

        static string FormatXml(string xml)
        {
            try
            {
                var doc = new XmlDocument();
                doc.LoadXml(xml);

                StringBuilder output = new StringBuilder();
                XmlWriterSettings settings = new XmlWriterSettings
                {
                    Indent = true,
                    Async = true
                };
                using (XmlWriter writer = XmlWriter.Create(output, settings))
                {
                    doc.Save(writer);
                }
                return output.ToString();
            }
            catch (Exception)
            {
                return xml;
            }
        }

整形之后保持原样:

<?xml version="1.0" encoding="utf-16"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Created>2015-06-05T18:19:34Z</Created>
    <LastSaved>2015-06-05T18:19:39Z</LastSaved>
    <Version>16.00</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG />
    <RemovePersonalInformation />
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>12645</WindowHeight>
    <WindowWidth>22260</WindowWidth>
    <WindowTopX>1170</WindowTopX>
    <WindowTopY>0</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
      <Alignment ss:Vertical="Bottom" />
      <Borders />
      <Font ss:FontName="游ゴシック" x:CharSet="128" x:Family="Modern" ss:Size="11" ss:Color="#000000" />
      <Interior />
      <NumberFormat />
      <Protection />
    </Style>
  </Styles>
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="18.75"></Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3" />
        <Footer x:Margin="0.3" />
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75" />
      </PageSetup>
      <Print>
        <ValidPrinterInfo />
        <PaperSizeIndex>9</PaperSizeIndex>
        <HorizontalResolution>600</HorizontalResolution>
        <VerticalResolution>600</VerticalResolution>
      </Print>
      <Selected />
      <ProtectObjects>False</ProtectObjects>
      <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
  </Worksheet>
</Workbook>

 

改行(CR+LF)问题

标签的值中如果包含\r\n换行符,XDocument以及XmlDocument读取之后会默认被转换为\n,如果就这样保存会少了\r。

而且Excel另存为的xml格式文件中的改行符为【&#10;】,很难直接复原。

如果不在意xml格式的话可以通过如下方法解决:

        static string FormatXml(string xml)
        {
            try
            {
                var doc = new XmlDocument();
                doc.LoadXml(xml);

                StringBuilder output = new StringBuilder();
                XmlWriterSettings settings = new XmlWriterSettings
                {
                    NewLineChars = "\r\n",
                    NewLineOnAttributes = true,
                    NewLineHandling = NewLineHandling.Replace,
                    CheckCharacters = false,
                    Indent = false,
                    Async = true
                };
                using (XmlWriter writer = XmlWriter.Create(output, settings))
                {
                    doc.Save(writer);
                }
                return output.ToString();
            }
            catch (Exception)
            {
                return xml;
            }
        }

 

posted on 2019-05-29 13:12  走遍江湖  阅读(710)  评论(0编辑  收藏  举报