How to parse XML data and what are the methods?

2020-12-08 10:58:12 A paper umbrella

Question: how do you parse XML data, and what approaches are available?

Last time we covered four ways of parsing JSON, so this time let's look at four ways of parsing XML.

Four ways of parsing

  • DOM parsing
  • SAX parsing
  • JDOM parsing
  • DOM4J parsing

Worked examples

DOM parsing

DOM stands for Document Object Model. A DOM-based XML parser converts an XML document into a collection of objects in memory, usually called the DOM tree, and the application works with the XML data by operating on this object model. XML is tree-structured, so DOM represents it as a tree of nodes. The top of the DOM tree is the Document node, which represents the whole document, and a document has exactly one root element.

Note: when working with DOM, every run of text is also a node, called a text node.
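The note about text nodes is easy to overlook, because the whitespace used to indent an XML file also becomes text nodes. The following is a minimal sketch using only the JDK's built-in DOM parser; the class name TextNodeDemo, the helper countChildren, and the sample XML are made up for illustration and are not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class: shows that whitespace between elements is kept as text nodes.
class TextNodeDemo {

    // Counts the direct children of the root element that have the given node type.
    static int countChildren(String xml, short nodeType) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Two <person> elements, separated by line breaks and indentation.
        String xml = "<people>\n  <person>a</person>\n  <person>b</person>\n</people>";
        System.out.println("element children: " + countChildren(xml, Node.ELEMENT_NODE));
        System.out.println("text children: " + countChildren(xml, Node.TEXT_NODE));
    }
}
```

When looping over getChildNodes(), it is therefore usually necessary to check getNodeType() before casting a child to Element.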

Core interfaces

DOM parsing has four core interfaces:

Document: represents the whole XML document and is the root of the DOM tree. It is the entry point for accessing and manipulating the document's data; every element in the XML file can be reached through the Document node.

Node: occupies a pivotal position in the DOM tree. Most of the core DOM interfaces, such as Document and Element, inherit from Node, and every Node represents one node in the DOM tree.

NodeList: represents an ordered collection of nodes, for example the children of a node. A NodeList is live: changes to the document are reflected in it directly.

NamedNodeMap: represents a mapping between a group of nodes and their unique names. It is mainly used to represent attribute nodes.
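To make the four interfaces more concrete, here is a small sketch that touches each of them with the JDK's built-in parser. The class DomInterfacesDemo, its helper methods, and the sample XML are assumptions chosen for illustration. It shows that a NodeList obtained before a change reflects the change afterwards, and that attributes are read through a NamedNodeMap.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class touching Document, Node, NodeList and NamedNodeMap.
class DomInterfacesDemo {

    // Document: the entry point to the whole tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // NodeList: reads the child count through the same NodeList object
    // before and after appending a new child to the root element.
    static int[] liveChildCounts(Document doc) {
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();
        int before = children.getLength();
        root.appendChild(doc.createElement("person"));
        int after = children.getLength();
        return new int[] {before, after};
    }

    // NamedNodeMap: looks up an attribute node by name; returns null if it is absent.
    static String attributeValue(Element element, String name) {
        NamedNodeMap attributes = element.getAttributes();
        Node attribute = attributes.getNamedItem(name);
        return attribute == null ? null : attribute.getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<people><person id=\"1\">lebyte</person></people>");
        // Node: every item in the tree, including this element, is a Node.
        Element person = (Element) doc.getDocumentElement().getFirstChild();
        System.out.println("id = " + attributeValue(person, "id"));
        int[] counts = liveChildCounts(doc);
        System.out.println("children before/after append: " + counts[0] + "/" + counts[1]);
    }
}
```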

DOM parsing process

To read a document with DOM, a program follows these steps:

① Create a DocumentBuilderFactory: DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
② Create a DocumentBuilder: DocumentBuilder builder = factory.newDocumentBuilder();
③ Create a Document: Document doc = builder.parse("path of the file to parse");
④ Create a NodeList: NodeList nl = doc.getElementsByTagName("name of the nodes to read");
⑤ Read the XML information
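The five steps above can be sketched end to end as follows. To keep the example self-contained it parses an in-memory string instead of a file; the class DomReadDemo, the method readNames, and the sample XML (with a "name" tag) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class following the five DOM reading steps.
class DomReadDemo {

    // Returns the text content of every <name> element in the document.
    static List<String> readNames(InputStream in) throws Exception {
        // Step 1: create the factory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Step 2: create the builder
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Step 3: parse the input into a Document
        Document doc = builder.parse(in);
        // Step 4: select the nodes to read
        NodeList nl = doc.getElementsByTagName("name");
        // Step 5: read the information
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nl.getLength(); i++) {
            names.add(nl.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person><name>lebyte</name><age>10</age></person></people>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readNames(in));
    }
}
```

To read from disk instead, pass a java.io.File to builder.parse(); the rest of the steps stay the same.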

SAX parsing

SAX (Simple API for XML) parses an XML file step by step, in document order. SAX has no official standards body behind it: it does not belong to any standards organization, company, or individual, and anyone can use it.

Unlike DOM, SAX reads the document sequentially, which makes it a fast way to read XML data. As the SAX parser scans the document it fires a series of events: at the start and end of the document and at the start and end of each element, it calls the corresponding handler methods, which do the actual processing, until the whole document has been scanned.

To parse with SAX, first create a SAX parser.

// 1. Create a parser factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// 2. Get the parser
SAXParser parser = factory.newSAXParser();
// 3. Locate the file; MySaxHandler is the event handler and extends DefaultHandler
File file = new File("resource/demo01.xml");
// 4. Parse (the String overload of parse() expects a URI, so pass the File itself)
parser.parse(file, new MySaxHandler());
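The snippet above uses a MySaxHandler class without showing it. The following is one possible sketch of such a handler, not the author's actual implementation: it extends DefaultHandler and simply records the events in the order the parser fires them. The events list, the static parse helper, and the sample XML are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each SAX event as a string, in document order.
class MySaxHandler extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    // characters() may be called several times per text run, so text is buffered.
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0);
        events.add("start:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            events.add("text:" + value);
        }
        text.setLength(0);
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses an in-memory XML string and returns the recorded events.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        MySaxHandler handler = new MySaxHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<people><name>lebyte</name></people>")) {
            System.out.println(event);
        }
    }
}
```

Because the handler only ever sees one event at a time, any state that spans events (here, the buffered text) has to be kept in fields of the handler.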

JDOM parsing

The standard XML processing APIs are DOM and SAX, but from a development perspective each has its own trade-offs: DOM can modify a document but is not suitable for reading large files, while SAX can read large files but cannot modify them. JDOM aims to combine the two: JDOM = DOM's ability to modify + SAX's ability to read large files. JDOM is a free, open source component that can be downloaded directly from www.jdom.org.

Common JDOM classes for working with XML:

Document: represents the whole XML document, which is a tree structure

Element: represents an XML element and provides methods for manipulating its child elements, text, attributes, and namespaces

Attribute: represents an attribute of an element

Text: represents XML text content

XMLOutputter: the XML output writer, implemented on top of the JDK's streams

Format: controls the encoding, style, and layout of the XML output

JDOM's output operations are much more convenient and intuitive than traditional DOM's, and producing the output is straightforward. What we have seen so far is JDOM's DOM-style support, but JDOM also supports SAX, so SAX can be used to do the parsing.

// Create a SAX-based builder
SAXBuilder builder = new SAXBuilder();
File file = new File("resource/demo01.xml");
// Build the document
Document doc = builder.build(file);
// Get the root element
Element root = doc.getRootElement();
System.out.println(root.getName());
// Get all child elements of the root; getChildren(name) returns only the direct children with that tag name
List<Element> list = root.getChildren();
System.out.println(list.size());
for (Element e : list) {
    // Get the element's name and the text inside it
    String name = e.getName();
    System.out.println(name + "=" + e.getText());
    System.out.println("==================");
}

DOM4J parsing

dom4j is an easy-to-use open source library for working with XML, XPath, and XSLT. It runs on the Java platform, uses the Java collections framework, and fully supports DOM, SAX, and JAXP. Download links:

http://www.dom4j.org/dom4j-1....

http://sourceforge.net/projec...

DOM4J and JDOM are both free, open source XML components. Because many development frameworks, such as Hibernate and Spring, use DOM4J, it is worth getting to know this component. Neither is better or worse: frameworks tend to use DOM4J, while application code more commonly uses JDOM. DOM4J also offers many additional features; for example, its output formatting is very good.

File file = new File("resource/outputdom4j.xml");
SAXReader reader = new SAXReader();
// Read the file into a Document
Document doc = reader.read(file);
// Get the root element of the document
Element root = doc.getRootElement();
// Iterate over all child elements of the root element
Iterator<Element> iter = root.elementIterator();
while (iter.hasNext()) {
    Element child = iter.next();
    System.out.println("value = " + child.getText());
}

Extension: creating XML

Creating with DOM

To generate an XML file, create the document with the newDocument() method.

Writing a DOM document out is rather cumbersome in itself. Write the code once, then copy it whenever it is needed.

public static void createXml() throws Exception {
    // Get the parser factory
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Get the parser
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Create the document
    Document doc = builder.newDocument();
    // Create the elements and set up their relationships
    Element root = doc.createElement("people");
    Element person = doc.createElement("person");
    Element name = doc.createElement("name");
    Element age = doc.createElement("age");
    name.appendChild(doc.createTextNode("lebyte"));
    age.appendChild(doc.createTextNode("10"));
    doc.appendChild(root);
    root.appendChild(person);
    person.appendChild(name);
    person.appendChild(age);
    // Write the document out
    // Get the transformer factory and a transformer
    TransformerFactory tsf = TransformerFactory.newInstance();
    Transformer ts = tsf.newTransformer();
    // Set the output encoding
    ts.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    // Wrap the DOM node in a DOMSource, which holds the source tree for the transformation
    DOMSource source = new DOMSource(doc);
    // StreamResult holds the result of the transformation
    File file = new File("src/output.xml");
    StreamResult result = new StreamResult(file);
    ts.transform(source, result);
}
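Since the output step is the same for every document, it can be wrapped in a small helper and reused. The sketch below is one way to do that with the JDK's Transformer API; the class DomWriteDemo and the methods toXmlString and buildPeople are hypothetical names, and it writes to a String rather than a file so the result is easy to inspect.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class: builds a small document and serializes it to a String.
class DomWriteDemo {

    // Serializes any DOM document to a String, without the XML declaration.
    static String toXmlString(Document doc) throws Exception {
        Transformer ts = TransformerFactory.newInstance().newTransformer();
        ts.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ts.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        ts.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds the same people/person/name/age structure as createXml().
    static Document buildPeople() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("people");
        Element person = doc.createElement("person");
        Element name = doc.createElement("name");
        Element age = doc.createElement("age");
        name.appendChild(doc.createTextNode("lebyte"));
        age.appendChild(doc.createTextNode("10"));
        doc.appendChild(root);
        root.appendChild(person);
        person.appendChild(name);
        person.appendChild(age);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXmlString(buildPeople()));
    }
}
```

To write to a file instead, replace the StringWriter with a File in the StreamResult constructor, as createXml() does.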

Creating with SAX

// Create a SAXTransformerFactory object
SAXTransformerFactory stf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
try {
    // Use the SAXTransformerFactory to create a TransformerHandler
    TransformerHandler handler = stf.newTransformerHandler();
    // Get the Transformer from the TransformerHandler
    Transformer tf = handler.getTransformer();
    // Set the Transformer's output properties
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    // Create a Result object and associate it with the handler
    File file = new File("src/output.xml");
    if (!file.exists()) {
        file.createNewFile();
    }
    Result result = new StreamResult(new FileOutputStream(file));
    handler.setResult(result);
    // Write the XML content through the handler
    // Start the document
    handler.startDocument();
    AttributesImpl attr = new AttributesImpl();
    // Open the root element bookstore
    handler.startElement("", "", "bookstore", attr);
    attr.clear();
    attr.addAttribute("", "", "id", "", "1");
    handler.startElement("", "", "book", attr);
    attr.clear();
    handler.startElement("", "", "name", attr);
    String title = "Rehabilitation guidelines for cervical spondylosis";
    handler.characters(title.toCharArray(), 0, title.length());
    handler.endElement("", "", "name");
    // Close the elements
    handler.endElement("", "", "book");
    handler.endElement("", "", "bookstore");
    handler.endDocument();
} catch (SAXException e) {
    e.printStackTrace();
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} catch (TransformerConfigurationException e) {
    e.printStackTrace();
}

Creating with JDOM

// Create the elements
Element person = new Element("person");
Element name = new Element("name");
Element age = new Element("age");
// Create an attribute
Attribute id = new Attribute("id", "1");
// Set the text
name.setText("lebyte");
age.setText("10");
// Set up the relationships
Document doc = new Document(person);
person.addContent(name);
name.setAttribute(id);
person.addContent(age);
// Write the document to a file
XMLOutputter out = new XMLOutputter();
File file = new File("resource/outputjdom.xml");
out.output(doc, new FileOutputStream(file.getAbsoluteFile()));

Creating with DOM4J

// Use DocumentHelper to create the Document object
Document document = DocumentHelper.createDocument();
// Create the elements and set up their relationships
Element person = document.addElement("person");
Element name = person.addElement("name");
Element age = person.addElement("age");
// Set the text
name.setText("lebyte");
age.setText("10");
// Create the output format
OutputFormat of = OutputFormat.createPrettyPrint();
of.setEncoding("utf-8");
// Output to a file
File file = new File("resource/outputdom4j.xml");
XMLWriter writer = new XMLWriter(new FileOutputStream(file), of);
// Write
writer.write(document);
writer.flush();
writer.close();

Copyright notice
This article was written by [A paper umbrella]. Please include a link to the original when reposting. Thank you.
https://chowdera.com/2020/12/20201208105756491i.html