Configuring log4j2 in Spark to Write Logs to Kafka

Background

Configure log4j2 for Spark, and solve the Kafka Kerberos authentication problem.

This post uses a new Spark log collection architecture: Spark Log ==> Kafka ==> NiFi ==> Splunk. With this architecture, business logs are collected in near real time and displayed in Splunk. It is also easy to integrate with existing Spark programs and requires only minimal configuration.

  • log4j2: collects the Spark application logs.
  • Kafka: stores all application logs, organized by country.
  • NiFi: consumes all logs from Kafka and applies some properties.
  • Splunk: displays the logs.

Configuring log4j2 in the Spark Application

Include the dependencies in pom.xml. (Note that log4j 2.4 is quite old and predates the Log4Shell fixes; a version of at least 2.17.1 is strongly recommended.)

		<dependency>
			<groupId>org.apache.logging.log4j</groupId>
			<artifactId>log4j-api</artifactId>
			<version>2.4</version>
		</dependency>
		<dependency>
			<groupId>org.apache.logging.log4j</groupId>
			<artifactId>log4j-core</artifactId>
			<version>2.4</version>
		</dependency>
		<!-- log4j2's Kafka appender uses the Kafka producer API from kafka-clients.
		     (kafka-log4j-appender is the appender for log4j 1.x, not log4j2.) -->
		<dependency>
			<groupId>org.apache.kafka</groupId>
			<artifactId>kafka-clients</artifactId>
			<version>2.6.0</version>
		</dependency>

Using Logger in Code

Use log4j2 to replace the original logging tool, such as java.util.logging.Logger.

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// a global logger
static final Logger logger = LogManager.getLogger(YourApplication.class);

logger.info("info level");
logger.warn("warn level");
logger.error("error level");

log4j2 configuration

Put log4j2.xml in the resources directory:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="info" name="Log-Appender-Config">
    <Appenders>
        <Kafka name="Kafka" topic="YourProjectLogTopic">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} %-5p [%-t] %F:%L - %m"/>
            <Property name="bootstrap.servers">xxx:9092</Property>
            <Property name="security.protocol">SASL_PLAINTEXT</Property>
            <Property name="sasl.mechanism">GSSAPI</Property>
            <Property name="sasl.kerberos.service.name">kafka</Property>
        </Kafka>
        <Async name="Async">
            <AppenderRef ref="Kafka"/>
        </Async>

        <Console name="stdout" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} %-5p [%-7t] %F:%L - %m%n"/>
        </Console>

    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Kafka"/>
            <AppenderRef ref="stdout"/>
        </Root>
    </Loggers>
</Configuration>
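Because log4j2.xml sits under resources, it is bundled into the application jar and found on the classpath automatically. If you keep it outside the jar instead, one common approach (a sketch; the file names here are placeholders) is to ship it with spark-submit and point both the driver and executor JVMs at it:

```shell
spark-submit \
  --files ./log4j2.xml \
  --conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
  --conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
  ...
```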

Configure jaas.conf

Passing jaas.conf as a Java system property when using spark-submit is recommended:

--files ./PROJECT-jaas.conf,./PROJECT.keytab \
--conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=./PROJECT-jaas.conf" \
--conf spark.driver.extraJavaOptions="-Djava.security.auth.login.config=./PROJECT-jaas.conf" \
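The contents of PROJECT-jaas.conf are not shown in the original post. A minimal sketch for a Kafka client authenticating via GSSAPI, with the keytab path and principal as placeholder assumptions, would be:

```
KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="./PROJECT.keytab"
    principal="your_principal@YOUR.REALM";
};
```

The keytab is shipped alongside the JAAS file via --files, so the relative path ./PROJECT.keytab resolves in each container's working directory.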

Kafka Topic

Topic naming guidelines: it is recommended to derive the topic name from the project name (e.g. the YourProjectLogTopic used in log4j2.xml above).
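As a sketch of that convention (the project name and topic suffix here are assumptions, and the kafka-topics.sh invocation is shown but not executed):

```shell
# Derive the log topic name from the project name, matching the
# "topic" attribute used in log4j2.xml above.
PROJECT="YourProject"
TOPIC="${PROJECT}LogTopic"
echo "$TOPIC"

# The topic would then be created with the standard Kafka CLI, e.g.:
#   kafka-topics.sh --bootstrap-server xxx:9092 --create \
#     --topic "$TOPIC" --partitions 3 --replication-factor 2
```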

posted @ 2022-05-30 14:10  stone_la