FreezeJ' Blog

Deploying Filebeat 7.14

2022-04-02

OS used: CentOS 7.8

Commands

wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.14.2-linux-x86_64.tar.gz
tar xf filebeat-7.14.2-linux-x86_64.tar.gz 
mv filebeat-7.14.2-linux-x86_64 /data/filebeat-system  # the "system" suffix distinguishes this instance; one machine may run several Filebeat instances
mkdir -p /data/filebeat-system/logs  # log directory
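
Before unpacking, the download can optionally be checked against a published checksum. The sketch below is not part of the original steps. It assumes a checksum file in the usual `sha512sum` format (hash first, then file name), for example one fetched from the tarball URL with `.sha512` appended, which is an assumption about where Elastic publishes it. `verify_sha512` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the archive's SHA-512 matches the
# first field of the checksum file.
verify_sha512() {
    archive="$1"
    sumfile="$2"
    expected=$(awk '{print $1; exit}' "$sumfile")
    actual=$(sha512sum "$archive" | awk '{print $1}')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example call (file names are assumptions, adjust as needed):
# verify_sha512 filebeat-7.14.2-linux-x86_64.tar.gz \
#               filebeat-7.14.2-linux-x86_64.tar.gz.sha512 || echo "checksum mismatch"
```

If the helper returns non-zero, it is safer to download the tarball again than to continue.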

Compiling go-daemon

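Go programs do not daemonize themselves easily, so Filebeat relies on a separate go-daemon helper to run in the background. The default tarball does not include this binary, so it has to be compiled manually: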

With systemd this tool is not needed, but to stay compatible with older systems (CentOS 6.x) this post uses the more traditional SysV init approach.

cd /data/
git clone https://github.com/elastic/go-daemon.git
cd /data/go-daemon && make
cp ./god /data/filebeat-system/filebeat-god
mv /data/go-daemon /tmp/  # the source tree is no longer needed after the build
/data/filebeat-system/filebeat-god -v
# Output:
# Go daemon v1.2
# http://github.com/fiorix/go-daemon
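
The build step above can fail on a minimal install that lacks the build tools. The following sketch checks for them first. The tool list (git, make, gcc) is an assumption about what the go-daemon build needs, and `missing_tools` is a hypothetical helper name.

```shell
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed prerequisites for cloning and compiling go-daemon.
missing=$(missing_tools git make gcc)
if [ -n "$missing" ]; then
    echo "install these before running make:" $missing
fi
```

On CentOS the missing packages can usually be installed with yum before rerunning make.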

Init script

/etc/init.d/filebeat-system
The paths and contents of the init script have all been adapted to filebeat-system. If your directory name differs, substitute it accordingly.

#!/bin/bash
#
# filebeat          filebeat shipper
#
# chkconfig: 2345 98 02
# description: Starts and stops a single filebeat instance on this system
#

### BEGIN INIT INFO
# Provides:          filebeat
# Required-Start:    $local_fs $network $syslog
# Required-Stop:     $local_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Filebeat sends log files to Logstash or directly to Elasticsearch.
# Description:       Filebeat is a shipper part of the Elastic Beats
#                    family. Please see: https://www.elastic.co/beats
### END INIT INFO



PATH=/usr/bin:/sbin:/bin:/usr/sbin
export PATH

[ -f /etc/sysconfig/filebeat-system ] && . /etc/sysconfig/filebeat-system
pidfile=${PIDFILE-/var/run/filebeat-system.pid}
agent=${BEATS_AGENT-/data/filebeat-system/filebeat}
args="-c /data/filebeat-system/filebeat.yml --path.home /data/filebeat-system --path.config /data/filebeat-system --path.data /data/filebeat-system --path.logs /data/filebeat-system/logs"
test_args="-e test config"
beat_user="${BEAT_USER:-root}"
wrapper="/data/filebeat-system/filebeat-god"
wrapperopts="-r / -n -p $pidfile"
user_wrapper="su"
user_wrapperopts="$beat_user -c"
RETVAL=0
DEFAULT_GODEBUG="madvdontneed=1"
export GODEBUG=${GODEBUG-$DEFAULT_GODEBUG}

# Custom environment variables (adjust to your environment)
# Look up the machine's public IP and export it so the config file can reference it
PUBLIC_IP=`curl -s --connect-timeout 5 ip.sb` || PUBLIC_IP=`curl -s --connect-timeout 5 http://members.3322.org/dyndns/getip`
[ $? -eq 0 ] && export PUBLIC_IP || export PUBLIC_IP="0.0.0.0"

# Source function library.
. /etc/rc.d/init.d/functions

# Determine if we can use the -p option to daemon, killproc, and status.
# RHEL < 5 can't.
if status | grep -q -- '-p' 2>/dev/null; then
    daemonopts="--pidfile $pidfile"
    pidopts="-p $pidfile"
fi

if command -v runuser >/dev/null 2>&1; then
    user_wrapper="runuser"
fi

[ "$beat_user" != "root" ] && wrapperopts="$wrapperopts -u $beat_user"

test() {
    $user_wrapper $user_wrapperopts "$agent $args $test_args"
}

start() {
    echo -n $"Starting filebeat-system: "
    test
    if [ $? -ne 0 ]; then
        echo
        exit 1
    fi
    daemon $daemonopts $wrapper $wrapperopts -- $agent $args
    RETVAL=$?
    echo
    return $RETVAL
}

stop() {
    echo -n $"Stopping filebeat-system: "
    killproc $pidopts $wrapper
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && rm -f ${pidfile}
}

restart() {
    test
    if [ $? -ne 0 ]; then
        return 1
    fi
    stop
    start
}

rh_status() {
    status $pidopts $wrapper
    RETVAL=$?
    return $RETVAL
}

rh_status_q() {
    rh_status >/dev/null 2>&1
}

case "$1" in
    start)
        start
    ;;
    stop)
        stop
    ;;
    restart)
        restart
    ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
    ;;
    status)
        rh_status
    ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart}"
        exit 1
esac

exit $RETVAL
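
The PUBLIC_IP lookup in the script above only checks curl's exit status, so an error page or an empty reply from the lookup service would still be exported. A minimal sketch of a stricter variant follows. `is_ipv4` and `detect_public_ip` are hypothetical helper names, not part of the original script, and only IPv4 answers are accepted.

```shell
# Hypothetical helper: succeed only for a dotted-quad IPv4 address.
is_ipv4() {
    # Reject empty input, non-digit/dot characters, and misplaced dots.
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its octets
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Hypothetical helper: query each URL in order and print the first valid
# answer; print the script's fallback value if none of them works.
detect_public_ip() {
    for url in "$@"; do
        ip=$(curl -s --connect-timeout 5 "$url" 2>/dev/null | tr -d '[:space:]')
        if is_ipv4 "$ip"; then
            echo "$ip"
            return 0
        fi
    done
    echo "0.0.0.0"
    return 1
}

# Possible use inside the init script:
# PUBLIC_IP=$(detect_public_ip ip.sb http://members.3322.org/dyndns/getip)
# export PUBLIC_IP
```

The fallback value 0.0.0.0 is kept so that the config file always sees a defined variable.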

Starting the service

chmod +x /etc/init.d/filebeat-system  # the init script must be executable
chkconfig --add filebeat-system
chkconfig filebeat-system on  # start at boot
systemctl daemon-reload
systemctl start filebeat-system  # start the service
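
After starting, it can be useful to confirm that a live process stands behind the pidfile. The sketch below assumes the default pidfile path from the init script (/var/run/filebeat-system.pid) and should be run as root, because `kill -0` reports failure for processes the caller is not allowed to signal. `pid_alive` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the pidfile holds the PID of a running process.
pid_alive() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Anything that is not a plain number counts as "not running".
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null   # signal 0 only tests that the process exists
}

if pid_alive /var/run/filebeat-system.pid; then
    echo "filebeat-system is running"
else
    echo "filebeat-system is not running"
fi
```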

Postscript

Changing the collection paths

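If Elasticsearch, Logstash, and Kibana were themselves deployed from binary tarballs, their logs are not written to the default /var/log paths. To enable the elasticsearch, logstash, and kibana modules, the paths in the module manifests need to be changed: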

# elasticsearch
sed -i 's@/var/log/elasticsearch@/data/elasticsearch/logs@g'  /data/filebeat-system/module/elasticsearch/*/manifest.yml
# kibana
sed -i 's@/var/log/kibana@/data/kibana/logs@g'  /data/filebeat-system/module/kibana/*/manifest.yml
sed -i 's@kibana.stdout@kibana.log@g'  /data/filebeat-system/module/kibana/*/manifest.yml
# logstash
sed -i 's@/var/log/logstash@/data/logstash/logs@g'  /data/filebeat-system/module/logstash/*/manifest.yml

systemctl restart filebeat-system
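
The sed commands above edit the manifests in place with no way back. A more cautious variant keeps a backup of every file it touches, as sketched below. `rewrite_log_path` is a hypothetical helper name, and the paths in the example call are the ones used in this post.

```shell
# Hypothetical helper: replace one path prefix with another in each given
# file, leaving the original next to it as <file>.bak.
# The '@' delimiter assumes neither path contains '@'.
rewrite_log_path() {
    old="$1"
    new="$2"
    shift 2
    for f in "$@"; do
        [ -f "$f" ] || continue   # skips unmatched globs and missing files
        sed -i.bak "s@${old}@${new}@g" "$f"
    done
}

rewrite_log_path /var/log/elasticsearch /data/elasticsearch/logs \
    /data/filebeat-system/module/elasticsearch/*/manifest.yml
```

Comparing a manifest with its .bak copy using diff shows exactly what changed before the service is restarted.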

Extra custom configuration (requires restarting Filebeat)

Adding cloud instance metadata

Edit filebeat.yml. Reference: https://www.elastic.co/guide/en/beats/filebeat/7.14/add-cloud-metadata.html

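With its default settings, add_cloud_metadata does not fetch instance metadata on Alibaba Cloud or Tencent Cloud. Those providers have to be enabled explicitly: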

processors:
  - add_cloud_metadata:
      providers: ["alibaba", "tencent"]

Data actually obtained on Tencent Cloud:

"cloud": {
    "service": {
        "name": "CVM"
    },
    "provider": "qcloud",
    "instance": {
        "id": "ins-xxxxxxxx"
    },
    "region": "china-south-gz",
    "availability_zone": "gz-azone3"
}

Adding a public IP field

Edit filebeat.yml. Reference: https://www.elastic.co/guide/en/beats/filebeat/7.14/add-fields.html

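The machines vary and there is no consistent hostname convention, so on cloud hosts the default add_host_metadata yields only a meaningless hostname and the private IP. That makes log analysis and locating a machine inconvenient: it is not clear which host produced a log line (or an extra lookup is needed). An additional public IP field solves this.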

This relies on the PUBLIC_IP environment variable, which is defined in the init script earlier in this post.

processors:
  - add_fields:
      target: "host"
      fields:
        public_ip: ${PUBLIC_IP}

Result:

"host": {
    "id": "xxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "ip": ["172.16.0.x", "fe80::xxxx:xx:xxxx:xxxx"],
    "mac": ["xx:xx:xx:xx:xx:xx"],
    "hostname": "VM-x-x-centos",
    "public_ip": "xxx.xxx.xxx.xxx",
    // remaining fields omitted
}

Custom processors

See: https://github.com/ytpay/filebeat-processors
I have not used this yet, but the built-in processors cannot handle highly customized operations, so I expect the need to come up. Bookmarking it for now.

Tags: ELK