Background

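Digester was originally part of Apache Struts, the classic web MVC framework, where it parsed struts-config.xml to configure the controller. Because its design proved broadly useful, it was later moved into the Apache Commons project.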

Introduction

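Digester is a wrapper around SAX that addresses some shortcomings of parsing XML with SAX directly. It keeps track of the relationships between elements and further encapsulates the handling of XML nodes. To use it, we only need to define the parsing rules in advance.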
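To make the shortcoming concrete, here is a minimal sketch of what plain SAX leaves to the caller. SAX only reports start and end events, so the handler has to keep its own stack to know which element is the parent of which. The class and method names (`RawSaxDemo`, `NestingHandler`, `parentChildLinks`) are made up for this illustration; only the JDK's SAX API is used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class RawSaxDemo {

    /** Records "parent -> child" pairs; SAX itself does not track nesting. */
    static class NestingHandler extends DefaultHandler {
        // Stack of currently open element names, innermost on top
        private final Deque<String> open = new ArrayDeque<>();
        final List<String> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            String parent = open.peek(); // null while handling the root element
            if (parent != null) {
                links.add(parent + " -> " + qName);
            }
            open.push(qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop();
        }
    }

    /** Parses the given XML string and returns the parent/child pairs in document order. */
    static List<String> parentChildLinks(String xml) throws Exception {
        NestingHandler handler = new NestingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Server><Service><Connector/><Engine><Host/></Engine></Service></Server>";
        parentChildLinks(xml).forEach(System.out::println);
    }
}
```

Digester maintains this kind of stack internally and lets us attach rules to element paths instead of writing the bookkeeping by hand.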

Usage example

First, add the Maven dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-digester3</artifactId>
    <version>3.2</version>
</dependency>

Take the following XML as an example:

<?xml version="1.0" encoding="UTF-8"?>
<Server port="8005" shutdown="SHUTDOWN">
    this is Server start
    <Service name="Catalina">
        this is Service start
        <Connector port="8080"
                   protocol="HTTP/1.1"
                   connectionTimeout="20000"
                   redirectPort="8443" />
        <Engine name="Catalina" defaultHost="localHost">
            this is Engine start
            <Host name="localhost"
                  appBase="webapps"
                  unpackWARs="true"
                  autoDeploy="true">
                this is Host
            </Host>
            this is Engine end
        </Engine>
        this is Service end
    </Service>
    this is Server end
</Server>

We need to define a Java class for each of the Server, Service, Connector, Engine, and Host nodes.

The Server class:

package com.sunchaser.sparrow.javase.base.xml.digester;

import lombok.Data;

import java.util.ArrayList;
import java.util.List;

/**
 * @author sunchaser admin@lilu.org.cn
 * @since JDK8 2021/8/18
 */
@Data
public class MyServer {
    private Integer port;
    private String shutdown;

    private List<MyService> myServiceList = new ArrayList<>();

    public void addMyService(MyService myService) {
        myServiceList.add(myService);
    }
}

The Service class:

package com.sunchaser.sparrow.javase.base.xml.digester;

import lombok.Data;

import java.util.ArrayList;
import java.util.List;

/**
 * @author sunchaser admin@lilu.org.cn
 * @since JDK8 2021/8/18
 */
@Data
public class MyService {
    private String name;

    private List<MyConnector> myConnectorList = new ArrayList<>();

    private MyEngine myEngine;

    public void addMyConnector(MyConnector myConnector) {
        myConnectorList.add(myConnector);
    }
}

The Connector class:

package com.sunchaser.sparrow.javase.base.xml.digester;

import lombok.Data;

/**
 * @author sunchaser admin@lilu.org.cn
 * @since JDK8 2021/8/18
 */
@Data
public class MyConnector {
    private Integer port;
    private String protocol;
    private Integer connectionTimeout;
    private Integer redirectPort;
}

The Engine class:

package com.sunchaser.sparrow.javase.base.xml.digester;

import lombok.Data;

import java.util.ArrayList;
import java.util.List;

/**
 * @author sunchaser admin@lilu.org.cn
 * @since JDK8 2021/8/18
 */
@Data
public class MyEngine {
    private String name;
    private String defaultHost;

    private List<MyHost> myHostList = new ArrayList<>();

    public void addMyHost(MyHost myHost) {
        myHostList.add(myHost);
    }
}

The Host class:

package com.sunchaser.sparrow.javase.base.xml.digester;

import lombok.Data;

/**
 * @author sunchaser admin@lilu.org.cn
 * @since JDK8 2021/8/23
 */
@Data
public class MyHost {
    private String name;
    private String appBase;
    private String unpackWARs;
    private String autoDeploy;
}

Now let's look at how Digester is used:

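private void parse() throws IOException, SAXException {
    // Create the Digester object with its parsing rules
    Digester digester = createDigester();
    // Push this object onto the top of the Digester's object stack,
    // so the set-next rule on Server has a target to call setMyServer on
    digester.push(this);
    // Open the XML file from the classpath and run the parse;
    // try-with-resources closes the stream afterwards
    try (InputStream is = DigesterParseXmlTest.class.getResourceAsStream("/xml/server.xml")) {
        digester.parse(is);
    }
}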

The createDigester() method creates the Digester object and defines the set of parsing rules:

private Digester createDigester() {
    Digester digester = new Digester();
    // false: do not validate the XML against a DTD
    digester.setValidating(false);
    // When the Server node is encountered, create a MyServer object and push it onto the stack
    digester.addObjectCreate("Server",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyServer");
    // Call the matching setters on MyServer for each attribute of the Server node
    digester.addSetProperties("Server");
    // Pass the top object (the MyServer created above) to the setMyServer method
    // of the object just below it on the stack.
    // That object is the DigesterParseXmlTest instance pushed via digester.push(this) before parsing
    digester.addSetNext("Server",
            "setMyServer",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyServer");

    // When a Service node under Server is encountered, create a MyService object and push it onto the stack
    digester.addObjectCreate("Server/Service",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyService");
    // Call the matching setters on MyService for each attribute of the Server/Service node
    digester.addSetProperties("Server/Service");
    // Pass the top object (the MyService created above) to the addMyService method
    // of the object just below it on the stack (MyServer)
    digester.addSetNext("Server/Service",
            "addMyService",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyService");

    // When a Connector node under Server/Service is encountered, create a MyConnector object and push it onto the stack
    digester.addObjectCreate("Server/Service/Connector",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyConnector");
    // Call the matching setters on MyConnector for each attribute of the Server/Service/Connector node
    digester.addSetProperties("Server/Service/Connector");
    // Pass the top object (the MyConnector created above) to the addMyConnector method
    // of the object just below it on the stack (MyService)
    digester.addSetNext("Server/Service/Connector",
            "addMyConnector",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyConnector");

    // At parse time, the Connector end tag is reached at this point, so MyConnector
    // is popped and MyService becomes the top of the stack again

    // When an Engine node under Server/Service is encountered, create a MyEngine object and push it onto the stack
    digester.addObjectCreate("Server/Service/Engine",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyEngine");
    // Call the matching setters on MyEngine for each attribute of the Server/Service/Engine node
    digester.addSetProperties("Server/Service/Engine");
    // Pass the top object (the MyEngine created above) to the setMyEngine method
    // of the object just below it on the stack (MyService)
    digester.addSetNext("Server/Service/Engine",
            "setMyEngine",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyEngine");

    // When a Host node under Server/Service/Engine is encountered, create a MyHost object and push it onto the stack
    digester.addObjectCreate("Server/Service/Engine/Host",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyHost");
    // Call the matching setters on MyHost for each attribute of the Server/Service/Engine/Host node
    digester.addSetProperties("Server/Service/Engine/Host");
    // Pass the top object (the MyHost created above) to the addMyHost method
    // of the object just below it on the stack (MyEngine)
    digester.addSetNext("Server/Service/Engine/Host",
            "addMyHost",
            "com.sunchaser.sparrow.javase.base.xml.digester.MyHost");
    // At parse time, the end tags </Host>, </Engine>, </Service>, </Server> are then reached
    // in turn, and the corresponding objects are popped off the stack one by one
    return digester;
}

Now let's test it:

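import com.alibaba.fastjson.JSON; // assumption: JSON here is fastjson's JSON class
import org.apache.commons.digester3.Digester;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.xml.sax.SAXException;

import java.io.IOException;
import java.io.InputStream;

public class DigesterParseXmlTest {
    private static final Logger LOGGER = LoggerFactory.getLogger(DigesterParseXmlTest.class);
    private MyServer myServer;

    public MyServer getMyServer() {
        return myServer;
    }

    // Called by the set-next rule registered on the Server node
    public void setMyServer(MyServer myServer) {
        this.myServer = myServer;
    }

    public static void main(String[] args) throws IOException, SAXException {
        DigesterParseXmlTest dpxt = new DigesterParseXmlTest();
        dpxt.parse();
        LOGGER.info("{}", JSON.toJSONString(dpxt.getMyServer()));
    }
    // ...... (the parse() and createDigester() methods shown above)
}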

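Because of the rules we defined, a MyServer object is created when the Server element is parsed. When that element ends, the MyServer object is passed to the setMyServer method of the object just below it on the stack, which is the DigesterParseXmlTest instance. So once parse() has finished, we can read the MyServer reference held by DigesterParseXmlTest.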
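The stack mechanics can be illustrated without Digester at all. The following is a simplified sketch, not Digester's actual implementation: it uses a plain `Deque` and reflection to imitate the three steps of pushing a holder object, pushing a newly created object, and handing the top object to the one below it. `ObjectStackDemo`, `Root`, `Child`, and `setNext` are hypothetical names chosen for this example.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Deque;

public class ObjectStackDemo {

    public static class Child { }

    public static class Root {
        private Child child;
        public void setChild(Child child) { this.child = child; }
        public Child getChild() { return child; }
    }

    /** Imitates a set-next step: calls next.methodName(top) and leaves the stack unchanged. */
    static void setNext(Deque<Object> stack, String methodName) throws Exception {
        Object top = stack.pop();
        Object next = stack.peek();   // the object just below the top
        stack.push(top);              // restore the stack
        Method m = next.getClass().getMethod(methodName, top.getClass());
        m.invoke(next, top);
    }

    static Root run() throws Exception {
        Deque<Object> stack = new ArrayDeque<>();
        Root root = new Root();
        stack.push(root);             // plays the role of digester.push(this)
        stack.push(new Child());      // start tag: the created object goes on top
        setNext(stack, "setChild");   // end tag: hand the top object to the one below
        stack.pop();                  // end tag: the created object leaves the stack
        return root;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run().getChild() != null);
    }
}
```

After `run()` the holder still sits at the bottom of the stack and owns the created object, which mirrors why DigesterParseXmlTest holds the MyServer reference after parsing.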

Run the main method, and the parsing result shows up in the log:

17:54:26.062 [main] INFO com.sunchaser.sparrow.javase.base.xml.digester.DigesterParseXmlTest - {"myServiceList":[{"myConnectorList":[{"connectionTimeout":20000,"port":8080,"protocol":"HTTP/1.1","redirectPort":8443}],"myEngine":{"defaultHost":"localHost","myHostList":[{"appBase":"webapps","autoDeploy":"true","name":"localhost","unpackWARs":"true"}],"name":"Catalina"},"name":"Catalina"}],"port":8005,"shutdown":"SHUTDOWN"}

That is how to parse XML with Digester.

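Digester defines an abstract class, Rule, that abstracts a set of actions, or in other words describes the lifecycle of parsing an XML element: begin (the element starts), body (the element's body text is handled), end (the element ends), and finish (all elements have been parsed). It also ships with a number of Rule implementations for us to use.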
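The order in which these four callbacks fire is easiest to see in a trace. The sketch below is a JDK-only imitation built on SAX, not Digester's own Rule class: `SimpleRule`, `TraceRule`, and `MiniDispatcher` are hypothetical names, the dispatcher supports only one rule per exact path, and it trims body text, which are simplifying assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class RuleLifecycleDemo {

    /** Hypothetical stand-in for a rule: four lifecycle callbacks, no-ops by default. */
    static abstract class SimpleRule {
        void begin(String name, Attributes attrs) { }
        void body(String name, String text) { }
        void end(String name) { }
        void finish() { }
    }

    /** Records every callback it receives into a shared event list. */
    static class TraceRule extends SimpleRule {
        private final List<String> events;
        TraceRule(List<String> events) { this.events = events; }
        @Override void begin(String name, Attributes attrs) { events.add("begin " + name); }
        @Override void body(String name, String text) { events.add("body " + name + " [" + text + "]"); }
        @Override void end(String name) { events.add("end " + name); }
        @Override void finish() { events.add("finish"); }
    }

    /** Maps an element path such as "Server/Service" to a rule and drives its callbacks. */
    static class MiniDispatcher extends DefaultHandler {
        private final Map<String, SimpleRule> rules = new LinkedHashMap<>();
        private final Deque<String> path = new ArrayDeque<>();         // root first
        private final Deque<StringBuilder> texts = new ArrayDeque<>(); // innermost on top

        void addRule(String pattern, SimpleRule rule) { rules.put(pattern, rule); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            path.addLast(qName);
            texts.push(new StringBuilder());
            SimpleRule rule = rules.get(String.join("/", path));
            if (rule != null) rule.begin(qName, attrs);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (!texts.isEmpty()) texts.peek().append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String text = texts.pop().toString().trim();
            SimpleRule rule = rules.get(String.join("/", path));
            if (rule != null) {
                rule.body(qName, text); // body text is only complete at the end tag
                rule.end(qName);
            }
            path.removeLast();
        }

        @Override
        public void endDocument() {
            for (SimpleRule rule : rules.values()) rule.finish();
        }
    }

    /** Parses the XML with trace rules on "Server" and "Server/Service" and returns the events. */
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        MiniDispatcher dispatcher = new MiniDispatcher();
        dispatcher.addRule("Server", new TraceRule(events));
        dispatcher.addRule("Server/Service", new TraceRule(events));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), dispatcher);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<Server>hi<Service>svc</Service></Server>").forEach(System.out::println);
    }
}
```

In this sketch the inner element completes its begin, body, and end sequence before the outer element's body and end fire, and finish is called once per registered rule after the whole document has been read.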