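In one of our company's projects, a single service runs on several servers. Whenever developers wanted to look at the PHP or Go program logs, they had to pull them over FTP and search server by server, so tracking down one problem could take a long time. We recently cut server costs and released several cloud hosts, keeping one machine with Docker on it to run Loki and Grafana. Logs are collected with promtail, and the promtail process is managed by supervisor.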
Docker-based installation
I. Install Loki and Grafana with docker-compose
1.1 Install the docker-compose command
curl -L https://get.daocloud.io/docker/compose/releases/download/1.21.1/docker-compose-`uname -s`-`uname -m` -o /usr/bin/docker-compose
chmod +x /usr/bin/docker-compose
1.2 Loki directory structure
[root@loki ~]# mkdir loki
[root@loki ~]# tree loki
loki
├── config
│   └── loki
│       ├── config.yaml
│       └── config.yamlbak
└── docker-compose.yaml
[root@loki ~]# cd docker-compose/loki
1.3 Write the docker-compose file for Loki and Grafana
[root@loki loki]# vim docker-compose.yaml
[root@loki loki]# cat docker-compose.yaml
version: "3"
networks:
  loki:
services:
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
      - "9095:9095"
    command: -config.file=/etc/loki/config.yaml
    volumes:
      - ./config/loki:/etc/loki
      - /data/loki:/loki
    networks:
      - loki
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - /data/grafana:/var/lib/grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: 123456
      GF_SERVER_HTTP_PORT: 3000
    networks:
      - loki
1.4 Create the Loki configuration file
In the current directory, create the config/loki directory
cd docker-compose/loki
mkdir -p config/loki/
vim config/loki/config.yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9095
  grpc_server_max_recv_msg_size: 1572864000  # max gRPC message size the server will receive; default is 4 MB
  grpc_server_max_send_msg_size: 1572864000  # max gRPC message size the server will send; default is 4 MB

ingester:
  lifecycler:
    address: 172.19.72.235
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  wal:
    dir: /loki/wal

compactor:
  working_directory: /loki/persistent  # compactor working directory, generally also used as the persistence directory
  compaction_interval: 10m             # compaction interval
  retention_enabled: true              # enable retention (deletion of expired data)
  retention_delete_delay: 5m           # how long after expiry before data is deleted
  retention_delete_worker_count: 150   # number of workers that delete expired data

schema_config:
  configs:
    - from: "2023-10-23"
      index:
        period: 24h
        prefix: loki_index_
      object_store: filesystem  # storage backend: local files
      schema: v11
      store: boltdb-shipper

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-index  # index directory
    cache_location: /loki/boltdb-cache          # cache directory
  filesystem:
    directory: /loki/chunks                     # chunks directory

limits_config:
  retention_period: 240h  # how long logs are kept before they expire
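Create the data directories and give them 777 permissions
Without 777 permissions, startup fails with mkdir /loki/chunks: permission denied, or with a similar error for one of the other directories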
mkdir /data/loki
mkdir /data/grafana
chmod 777 /data/loki
chmod 777 /data/grafana
Start Loki and Grafana
docker-compose up -d
docker-compose logs   # view the logs
docker-compose ps     # view the running containers
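Before moving on to promtail, it can help to confirm that Loki is actually accepting requests. The following is a minimal sketch, assuming Loki is reachable on localhost:3100 as mapped in docker-compose.yaml. The function name wait_for_loki, the LOKI_URL variable, and the retry numbers are illustrative and not part of the original setup. It polls Loki's /ready endpoint, which normally answers with a non-200 status for a short while after startup.

```shell
#!/usr/bin/env bash
# Base URL of Loki; an assumption, override it if Loki runs elsewhere.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Poll /ready until it returns a success status or the retries run out.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 2)
wait_for_loki() {
  local retries="${1:-30}" delay="${2:-2}" i
  for ((i = 1; i <= retries; i++)); do
    # -f makes curl fail on HTTP errors such as 503 while Loki is warming up
    if curl -fsS -o /dev/null "${LOKI_URL}/ready"; then
      echo "Loki is ready"
      return 0
    fi
    sleep "$delay"
  done
  echo "Loki did not become ready after ${retries} attempts" >&2
  return 1
}
```

Calling wait_for_loki 30 2 right after docker-compose up -d waits up to about a minute. If it fails, docker-compose logs loki is the next place to look.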
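II. Install promtail
The servers whose logs need to be collected do not have Docker installed, so download the release package directly and start promtail from the command line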
wget https://github.com/grafana/loki/releases/download/v2.9.2/promtail-linux-amd64.zip   # latest version at the time of writing, same as the Loki version
unzip promtail-linux-amd64.zip
vim promtail-local-config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
  grpc_server_max_recv_msg_size: 1572864000
  grpc_server_max_send_msg_size: 1572864000

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://172.19.72.235:3100/loki/api/v1/push

scrape_configs:
  - job_name: api
    static_configs:
      - targets:
          - 172.19.72.235
        labels:
          job: api
          __path__: /data/runtime/logs/*/*.log
  - job_name: websoket
    static_configs:
      - targets:
          - 172.19.72.235
        labels:
          job: websoket
          __path__: /data/logs/*.log
Start promtail
./promtail-linux-amd64 --config.file=promtail-local-config.yaml
Once it starts without problems, add it to supervisor
[program:promtail-log]
directory=/usr/local/data/promtail
command=/usr/local/data/promtail/promtail-linux-amd64 -config.file=promtail-local-config.yaml
autostart=true
autorestart=true
startsecs=5
priority=1
stopsignal=INT
stopwaitsecs=11
stopasgroup=true
killasgroup=true
[root@api1 promtail]# supervisorctl update
[root@api1 promtail]# supervisorctl status
promtail-log                     RUNNING   pid 11360, uptime 1 days, 0:38:48
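A RUNNING status only shows that the process is alive, not that logs are reaching Loki. One way to check, sketched here under the assumption that Loki is reachable at the address used in the promtail config, is to ask Loki which values it has seen for the job label. The function name loki_label_values and the LOKI_URL variable are illustrative.

```shell
#!/usr/bin/env bash
# Loki address taken from the promtail clients section; adjust if yours differs.
LOKI_URL="${LOKI_URL:-http://172.19.72.235:3100}"

# Print the JSON list of values Loki knows for a label (default: job).
loki_label_values() {
  local label="${1:-job}"
  curl -fsS "${LOKI_URL}/loki/api/v1/label/${label}/values"
}
```

After promtail has pushed some lines, loki_label_values job should return a JSON document whose data array includes the job labels from the config above (api and websoket). An empty list usually means nothing has been pushed yet or the __path__ patterns match no files.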
III. Summary of problems encountered:
Problem 1: permissions
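If the loki_loki_1 container fails to start, the cause is almost always that /loki/persistent, /loki/wal, /loki/chunks, /loki/boltdb-index, or /loki/boltdb-cache cannot be created. These directories are mounted from /data/loki on the local disk, so giving /data/loki 777 permissions and restarting the service is enough.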
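A narrower alternative to chmod 777 is to hand the directory to the user the container runs as. The sketch below is not what was done in this setup. It assumes the image declares a numeric user in its metadata (the official Loki and Grafana images have done so in the versions I have seen, but verify for yours), and the helper names is_numeric_user and chown_for_image are hypothetical.

```shell
#!/usr/bin/env bash
# True if the argument looks like "UID" or "UID:GID".
is_numeric_user() {
  local re='^[0-9]+(:[0-9]+)?$'
  [[ "$1" =~ $re ]]
}

# chown a host directory to the user configured in a locally available image.
# $1 = image name, $2 = host directory
chown_for_image() {
  local image="$1" dir="$2" user
  user="$(docker image inspect --format '{{.Config.User}}' "$image")" || return 1
  if ! is_numeric_user "$user"; then
    # An empty value means the image runs as root; a name cannot be mapped on the host.
    echo "image ${image} runs as '${user:-root}', not a numeric UID; skipping" >&2
    return 1
  fi
  chown -R "$user" "$dir"
}
```

For example, chown_for_image grafana/loki:latest /data/loki followed by a restart of the containers. If the helper refuses, falling back to the chmod 777 approach above still works.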
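Problem 2: send/receive errors between promtail and Loki
When the log volume gets too large, promtail reports the following error, and Loki reports errors on the receiving side as well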
status: 500. message: rpc error: code = resourceexhausted desc = trying to send message larger than max (5066121 vs. 4194304)
Add the following parameters to the server section of both the Loki and the promtail configuration files, then restart the services
grpc_server_max_recv_msg_size: 1572864000
grpc_server_max_send_msg_size: 1572864000
The configuration files above already include these two parameters
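The numbers in the error message and in the config are byte counts, which makes them hard to compare at a glance. A small sketch of the conversion (the helper name bytes_to_mib is illustrative):

```shell
#!/usr/bin/env bash
# Convert a byte count to whole MiB using integer arithmetic (rounds down).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mib 4194304      # the default limit quoted in the error: 4
bytes_to_mib 5066121      # the rejected message, just under 5 MiB: 4
bytes_to_mib 1572864000   # the value set in the configs: 1500
```

So the rejected message was only slightly above the 4 MiB default, while the configured limit of 1500 MiB leaves a very large margin. A smaller value would probably also have worked.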
IV. Configure Grafana
4.1 Open Grafana in a browser
4.2 Add a data source
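The data source can be added in the Grafana UI by choosing the Loki type and entering Loki's URL. As a scripted alternative, here is a sketch that uses Grafana's HTTP API. It assumes Grafana is on localhost:3000 with the admin password from docker-compose.yaml, and that Grafana reaches Loki through the compose service name loki. The variable and function names are illustrative.

```shell
#!/usr/bin/env bash
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
GRAFANA_AUTH="${GRAFANA_AUTH:-admin:123456}"   # admin user and the password set in docker-compose.yaml

# Build the JSON body describing a Loki data source.
# $1 = data source name (default Loki), $2 = Loki URL as seen from Grafana
loki_datasource_json() {
  local name="${1:-Loki}" url="${2:-http://loki:3100}"
  printf '{"name":"%s","type":"loki","access":"proxy","url":"%s"}\n' "$name" "$url"
}

# POST the data source definition to Grafana.
add_loki_datasource() {
  loki_datasource_json "$@" |
    curl -fsS -X POST -u "$GRAFANA_AUTH" \
      -H 'Content-Type: application/json' \
      --data @- "${GRAFANA_URL}/api/datasources"
}
```

Running add_loki_datasource with no arguments registers a data source named Loki pointing at http://loki:3100. If Grafana runs outside the compose network, pass the host address instead, for example add_loki_datasource Loki http://172.19.72.235:3100.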
At this point, Grafana can be handed over to the developers.