I needed to set up another server at home and decided to monitor its health from my smart home, which runs on Home Assistant. Neither a quick nor a careful search turned up a universal solution, so I reinvented the wheel and built my own.
Introduction: we will monitor CPU load and temperature, RAM and swap usage, free disk space, uptime, overall system load, the temperature and SMART status of each disk, and the state of the RAID (the server runs Ubuntu Server 20 with a simple software RAID 1). The disks are WD Green; the motherboard is a GA-525 with an integrated Atom 525.
A Mosquitto broker was already configured on the smart home server, so MQTT was chosen as the way to transfer the data.
The first sections explain how each piece of data is collected; the scripts that send the data and the Home Assistant settings come at the end.
All commands in the examples are run as root.
Table of contents
Collecting system sensor readings
Collecting system load data
Collecting hard drive health data
Collecting RAID status data
Sending the collected data
Configuring Home Assistant
Collecting system sensor readings
To read the built-in sensors we will use the sensors utility.
If it is not installed, install it: apt-get install lm-sensors
First, find all the available sensors. Run sensors-detect and answer y to every question. Then check the result: sensors
Note that on my machine sensors only began showing all the detected sensors after a reboot. Possibly some kind of bug, I don't know.
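The sensors utility can also print its readings as JSON, which is much easier to handle in a script: sensors -A -u -j
To pick individual values out of the JSON we need a parser. A convenient one is jq. On Ubuntu it is installed with:
apt-get install jq
jq addresses values by a path, much like xpath.
For example, to read the CPU temperature, the fan speed and the temp3 sensor: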
sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input'
sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input'
sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input'
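One caveat with the jq paths above: when a key does not exist (for instance, the chip has a different name on other hardware), jq prints null instead of failing. A minimal sketch of a guard that a script could apply before using a reading. The helper name is_number is hypothetical and not part of the original scripts, and the sample values stand in for real jq output:

```shell
#!/bin/bash
# is_number is a hypothetical helper, not part of the original scripts.
# It succeeds only for plain decimal readings such as 41, 41.000 or -3.5.
is_number() {
    printf '%s\n' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?$'
}

# Sample values stand in for real jq output here.
for reading in "41.000" "null" ""; do
    if is_number "$reading"; then
        echo "ok: $reading"
    else
        echo "skipped: '$reading'"
    fi
done
```

A reading that fails the check can then be logged and left out rather than sent on as the literal string null.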
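Collecting system load data
Memory usage is read with the free utility. With the -m flag it reports values in mebibytes.
The output is a table, so the value we need has to be cut out of it. For example, the total amount of RAM: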
free -m | grep "Mem" | awk '{print $2}'
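grep selects the row we need and awk prints the required column. Used memory and the swap figures are read the same way, from other rows and columns.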
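The row and column selection can also be done in a single awk call that computes the percentage at once. The sketch below assumes a hypothetical helper named used_percent and runs on captured sample output rather than on a live system. It also prints 0 when the total is zero, which is the case for swap on a machine without a swap area:

```shell
#!/bin/bash
# used_percent is a hypothetical helper: it reads `free -m` output on stdin
# and prints the integer percentage used for the given row label (Mem or Swap).
used_percent() {
    awk -v row="$1:" '$1 == row {
        if ($2 > 0) { printf "%d\n", $3 * 100 / $2 }   # used / total
        else { print 0 }                               # e.g. no swap configured
    }'
}

# Typical call: LC_ALL=C free -m | used_percent Mem
# Demonstration on captured sample output:
sample='              total        used        free      shared  buff/cache   available
Mem:           3934        1210         512          12        2211        2450
Swap:             0           0           0'
printf '%s\n' "$sample" | used_percent Mem
printf '%s\n' "$sample" | used_percent Swap
```

Setting LC_ALL=C in the real call is a precaution: the row labels printed by free may be translated in other locales.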
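Disk usage is read with df. It lists every mounted file system, so once again we filter for the device we need, take the Use% column and strip the percent sign with sed: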
df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//'
df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//'
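Note that grep matches substrings, so a pattern such as /dev/md127p1 would also match a device named /dev/md127p10 if one existed. A sketch of an exact match on the first column, using a hypothetical helper disk_used_percent and invented sample output (the second partition is made up for the illustration):

```shell
#!/bin/bash
# disk_used_percent is a hypothetical helper: it reads `df` output on stdin and
# prints the Use% value, without the percent sign, for an exact device name.
disk_used_percent() {
    awk -v dev="$1" '$1 == dev { sub(/%$/, "", $5); print $5 }'
}

# Typical call: df | disk_used_percent /dev/md127p1
# Demonstration on invented sample output:
sample='Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/md127p1    30832548 10251072  18992192  36% /
/dev/md127p10   10190100   204800   9447764   3% /srv'
printf '%s\n' "$sample" | disk_used_percent /dev/md127p1
```

Keep in mind that the fifth column of df is the share of space used, not the share that is free.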
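The overall system load is available in /proc/loadavg. The first three numbers are the load averages over the last 1, 5 and 15 minutes. As a rule of thumb, a value that stays above the number of CPU cores means the system has more work queued than it can process. We take the 15-minute value: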
cat /proc/loadavg | awk '{print $3}'
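Because the load average only makes sense relative to the number of cores, it can be convenient to normalize it. A small sketch under that assumption; load_per_core is a hypothetical helper and the input line is sample data:

```shell
#!/bin/bash
# load_per_core is a hypothetical helper: it takes the contents of
# /proc/loadavg on stdin and divides the 15-minute average by a core count.
load_per_core() {
    LC_ALL=C awk -v cores="$1" '{ printf "%.2f\n", $3 / cores }'
}

# Typical call: load_per_core "$(nproc)" < /proc/loadavg
# Demonstration on a sample line, assuming 2 cores:
echo "0.52 0.58 0.60 1/389 12345" | load_per_core 2
```

A result near or above 1.00 would then mean the same thing on any machine, whatever its core count.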
Uptime is read with uptime:
uptime | awk '{print $3}' | sed 's/,$//'
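The third field of uptime is the number of days only once the machine has been up for more than a day; before that it holds hours and minutes or a minute count. As an alternative sketch (not the method used in the scripts below), the seconds counter in /proc/uptime can be converted directly. uptime_days is a hypothetical helper:

```shell
#!/bin/bash
# uptime_days is a hypothetical helper: it converts the first field of
# /proc/uptime (seconds since boot, with a fractional part) to whole days.
uptime_days() {
    seconds=${1%%.*}              # drop the fractional part
    echo $(( seconds / 86400 ))   # 86400 seconds in a day
}

# Typical call: uptime_days "$(cut -d' ' -f1 /proc/uptime)"
# Demonstration on sample values (two days, then one hour):
uptime_days "172800.52"
uptime_days "3600.00"
```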
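CPU load is measured with mpstat. It is not installed by default and comes with the sysstat package: apt install sysstat. We are interested in the last column of its output, %idle:
LC_ALL=C mpstat | grep all | awk '{print $NF}'
Run without an interval, mpstat reports the average since boot rather than the load at this moment. The last field ($NF) is used instead of a fixed column number because the column count depends on whether the time stamp includes AM/PM.
The value is the idle percentage, so the load is 100 minus that value. bash has no floating-point arithmetic, so we use bc:
cpuidle=$(LC_ALL=C mpstat | grep all | awk '{print $NF}')
cpuload=$(echo "100-$cpuidle" | bc -l)
echo "CPU load: $cpuload"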
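If the load over a short recent interval is wanted instead of an average since boot, one option is to sample the cpu line of /proc/stat twice and compare the counters. This is a different technique from the mpstat call above, sketched here with a hypothetical helper cpu_busy_percent and two invented sample lines:

```shell
#!/bin/bash
# cpu_busy_percent is a hypothetical helper. It takes two "cpu ..." lines from
# /proc/stat, sampled some time apart, and prints the share of non-idle time
# between them as an integer percentage.
cpu_busy_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user .. steal
            idle = $5                                          # idle column
            if (NR == 1) { total1 = total; idle1 = idle }
        }
        END {
            dt = total - total1
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (dt - (idle - idle1)) * 100 / dt
        }'
}

# Typical call:
#   a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat)
#   cpu_busy_percent "$a" "$b"
# Demonstration on two invented samples:
cpu_busy_percent "cpu 100 0 100 800 0 0 0 0 0 0" "cpu 150 0 150 900 0 0 0 0 0 0"
```

As with 100 minus %idle, time spent waiting for I/O counts as busy in this sketch.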
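Collecting hard drive health data
Drive temperature is read with hddtemp. Install it first:
apt-get install hddtemp
By default it prints the device and model along with the temperature; with the -n flag it prints only the number:
hddtemp /dev/sda -n
SMART data is read with smartmontools: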
apt-get install smartmontools
To print everything the drive reports, use the -a flag:
smartctl -a /dev/sda
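The output is long, and most of it is not needed for routine monitoring. The attributes worth watching are:
Raw_Read_Error_Rate: the rate of hardware errors while reading data from the disk surface;
Reallocated_Sector_Ct: the number of sectors remapped to the spare area;
Seek_Error_Rate: the rate of head positioning errors;
Spin_Retry_Count: the number of repeated attempts to spin the platters up to working speed;
Reallocated_Event_Count: the number of remap operations attempted;
Offline_Uncorrectable: the number of sectors that could not be corrected during offline scans.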
Conveniently, smartctl can also output JSON. Add the -j flag:
smartctl -a -j /dev/sda
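As before, individual values are addressed with an xpath-like path in jq (the table indices below are the ones for my drives and may differ on other models):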
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value' #Raw_Read_Error_Rate
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value' #Reallocated_Sector_Ct
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value' #Seek_Error_Rate
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value' #Spin_Retry_Count
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value' #Reallocated_Event_Count
smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value' #Offline_Uncorrectable
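Since the position of an attribute in the table depends on the drive model, selecting by attribute name is less fragile than a fixed index. A sketch of that idea, assuming the name field that smartctl includes in each table entry; smart_attr is a hypothetical helper:

```shell
#!/bin/bash
# smart_attr is a hypothetical helper: it reads `smartctl -a -j` output on stdin
# and prints the raw value of the attribute with the given name.
smart_attr() {
    jq --arg name "$1" \
        '.ata_smart_attributes.table[] | select(.name == $name) | .raw.value'
}

# Typical call: smartctl -a -j /dev/sda | smart_attr Reallocated_Sector_Ct
```

If the drive does not report the attribute at all, the helper prints nothing, which a calling script can detect as an empty value.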
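The overall verdict, "passed" or "failed", is printed with the -H flag. It can be combined with -j to get JSON.
Reading it from the JSON: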
smartctl -a /dev/sda -j | jq '.smart_status.passed' #smart_status
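The drive can also run self-tests.
They are worth running regularly, for example from cron. A short test:
smartctl -t short /dev/sda
takes about 2 minutes. A long test:
smartctl -t long /dev/sda
takes much longer; smartctl prints the estimated duration when the test starts.
All of this, including scheduled tests, can also be handled by the smartd daemon, which ships with smartmontools. Configuring smartd is not covered here.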
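Collecting RAID status data
The current state of the arrays is shown by cat /proc/mdstat
To verify that the mirrors are consistent, start a check on each array and then read its mismatch counter: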
echo 'check' >/sys/block/md126/md/sync_action
echo 'check' >/sys/block/md127/md/sync_action
cat /sys/block/md126/md/mismatch_cnt
cat /sys/block/md127/md/mismatch_cnt
If the value is 0, the array is consistent.
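The mismatch counter says nothing about a drive that has dropped out of the array; that shows up in /proc/mdstat as an underscore in the member map, for example [U_] instead of [UU]. A sketch that counts such arrays, using a hypothetical helper degraded_arrays and invented sample contents:

```shell
#!/bin/bash
# degraded_arrays is a hypothetical helper: it reads /proc/mdstat on stdin and
# prints how many arrays show a missing member, e.g. [U_] instead of [UU].
degraded_arrays() {
    grep -E -c '\[[U_]*_[U_]*]'
}

# Typical call: degraded_arrays < /proc/mdstat
# Demonstration on invented sample contents (md127 has lost one member):
sample='Personalities : [raid1]
md126 : active raid1 sdb2[1] sda2[0]
      976630464 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sda1[0]
      31440896 blocks super 1.2 [2/1] [U_]'
printf '%s\n' "$sample" | degraded_arrays
```

The resulting number could be published to its own topic alongside the mismatch counters.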
Sending the collected data
The data is sent with the mosquitto command-line clients. Install them:
apt-get install mosquitto-clients
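The readings change at different rates, so I split the collection into three scripts: one for the system (system.sh), one for the disks and RAID (drives.sh), and one for SMART (smart.sh):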
touch system.sh && touch drives.sh && touch smart.sh
chmod u+x system.sh && chmod u+x drives.sh && chmod u+x smart.sh
Their contents:
system.sh
#!/bin/bash
# Collect system readings and publish them over MQTT
ip=xx.xx.xx.xx
usr="xx"
pass="xx"
tempdrive1=$(hddtemp "/dev/sda" -n)
echo "Drive 1 temperature: $tempdrive1"
tempdrive2=$(hddtemp "/dev/sdb" -n)
echo "Drive 2 temperature: $tempdrive2"
tempcpu=$(sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input')
echo "CPU temperature: $tempcpu"
fan=$(sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input')
echo "Fan speed: $fan"
temp3=$(sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input')
echo "temp3 temperature: $temp3"
totalram=$(free -m | grep "Mem" | awk '{print $2}')
echo "Total RAM: $totalram"
usedram=$(free -m | grep "Mem" | awk '{print $3}')
echo "Used RAM: $usedram"
usedrampercent=$((usedram * 100 / totalram))
echo "Used RAM, %: $usedrampercent"
totalswap=$(free -m | grep "Swap" | awk '{print $2}')
echo "Total swap: $totalswap"
usedswap=$(free -m | grep "Swap" | awk '{print $3}')
echo "Used swap: $usedswap"
# Avoid division by zero when no swap is configured
if [ "$totalswap" -gt 0 ]; then
    usedswappercent=$((usedswap * 100 / totalswap))
else
    usedswappercent=0
fi
echo "Used swap, %: $usedswappercent"
averageload=$(cat /proc/loadavg | awk '{print $3}')
echo "Average load: $averageload"
uptimedata=$(uptime | awk '{print $3}' | sed 's/,$//')
echo "Uptime: $uptimedata"
# %idle is the last column of the mpstat output
cpuidle=$(LC_ALL=C mpstat | grep all | awk '{print $NF}')
cpuload=$(echo "100-$cpuidle" | bc -l) # bc, because bash has no floating-point arithmetic
echo "CPU load: $cpuload"
echo " "
echo " "
# Values are quoted so that an empty reading cannot shift the arguments
mosquitto_pub -h "$ip" -t "srv/tempdrive1" -m "$tempdrive1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/tempdrive2" -m "$tempdrive2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/tempcpu" -m "$tempcpu" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/fan" -m "$fan" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/temp3" -m "$temp3" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/usedrampercent" -m "$usedrampercent" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/usedswappercent" -m "$usedswappercent" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/averageload" -m "$averageload" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/uptimedata" -m "$uptimedata" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/cpuload" -m "$cpuload" -u "$usr" -P "$pass"
drives.sh
#!/bin/bash
# Collect RAID mismatch counters and disk usage and publish them over MQTT
ip=xx.xx.xx.xx
usr="xx"
pass="xx"
raid_system_status=$(cat /sys/block/md126/md/mismatch_cnt)
echo "RAID system mismatch count: $raid_system_status"
raid_var_status=$(cat /sys/block/md127/md/mismatch_cnt)
echo "RAID var mismatch count: $raid_var_status"
# Column 5 of df is Use%, the share of space used
freesystemdisk=$(df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//')
echo "System disk used, %: $freesystemdisk"
freedatadisk=$(df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//')
echo "Data disk used, %: $freedatadisk"
echo " "
echo " "
mosquitto_pub -h "$ip" -t "srv/raid_system_status" -m "$raid_system_status" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/raid_var_status" -m "$raid_var_status" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/freesystemdisk" -m "$freesystemdisk" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/freedatadisk" -m "$freedatadisk" -u "$usr" -P "$pass"
smart.sh
#!/bin/bash
# Collect SMART attributes of both drives and publish them over MQTT
ip=xx.xx.xx.xx
usr="xx"
pass="xx"
Raw_Read_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate 1: $Raw_Read_Error_Rate1"
Reallocated_Sector_Ct1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct 1: $Reallocated_Sector_Ct1"
Seek_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate 1: $Seek_Error_Rate1"
Spin_Retry_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count 1: $Spin_Retry_Count1"
Reallocated_Event_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count 1: $Reallocated_Event_Count1"
Offline_Uncorrectable1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable 1: $Offline_Uncorrectable1"
smart_status1=$(smartctl -a /dev/sda -j | jq '.smart_status.passed')
echo "SMART status 1: $smart_status1"
Raw_Read_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate 2: $Raw_Read_Error_Rate2"
Reallocated_Sector_Ct2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct 2: $Reallocated_Sector_Ct2"
Seek_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate 2: $Seek_Error_Rate2"
Spin_Retry_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count 2: $Spin_Retry_Count2"
Reallocated_Event_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count 2: $Reallocated_Event_Count2"
Offline_Uncorrectable2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable 2: $Offline_Uncorrectable2"
smart_status2=$(smartctl -a /dev/sdb -j | jq '.smart_status.passed')
echo "SMART status 2: $smart_status2"
echo " "
echo " "
mosquitto_pub -h "$ip" -t "srv/Raw_Read_Error_Rate1" -m "$Raw_Read_Error_Rate1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Sector_Ct1" -m "$Reallocated_Sector_Ct1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Seek_Error_Rate1" -m "$Seek_Error_Rate1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Spin_Retry_Count1" -m "$Spin_Retry_Count1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Event_Count1" -m "$Reallocated_Event_Count1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Offline_Uncorrectable1" -m "$Offline_Uncorrectable1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Raw_Read_Error_Rate2" -m "$Raw_Read_Error_Rate2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Sector_Ct2" -m "$Reallocated_Sector_Ct2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Seek_Error_Rate2" -m "$Seek_Error_Rate2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Spin_Retry_Count2" -m "$Spin_Retry_Count2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Event_Count2" -m "$Reallocated_Event_Count2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Offline_Uncorrectable2" -m "$Offline_Uncorrectable2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/smart_status1" -m "$smart_status1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/smart_status2" -m "$smart_status2" -u "$usr" -P "$pass"
On the receiving side, the Mosquitto broker has to be set up and connected to Home Assistant.
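The three scripts repeat the same mosquitto_pub call for every value. One way to shorten them is a small wrapper; the sketch below assumes a hypothetical helper named publish and the same placeholder connection settings as the scripts. It also skips empty readings, which would otherwise be sent as empty messages:

```shell
#!/bin/bash
# Connection settings, placeholders as in the scripts above.
ip=xx.xx.xx.xx
usr="xx"
pass="xx"

# publish is a hypothetical helper: it sends one value to the srv/<name> topic
# and skips empty readings instead of publishing them.
publish() {
    if [ -z "$2" ]; then
        echo "skipping $1: empty value" >&2
        return 1
    fi
    mosquitto_pub -h "$ip" -t "srv/$1" -m "$2" -u "$usr" -P "$pass"
}

# Typical calls inside a collection script:
#   publish tempcpu "$tempcpu"
#   publish fan "$fan"
```

Keeping the topic prefix in one place also makes it easier to monitor a second server under a different prefix later.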
Configuring Home Assistant
All that remains is to describe the sensors in the Home Assistant configuration.
sensor:
  - platform: mqtt
    state_topic: "srv/tempdrive1"
    name: "nextcloud drive 1 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/tempdrive2"
    name: "nextcloud drive 2 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/tempcpu"
    name: "nextcloud CPU temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/fan"
    name: "nextcloud fan speed"
    unit_of_measurement: "rpm"
  - platform: mqtt
    state_topic: "srv/temp3"
    name: "nextcloud temp3 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/usedrampercent"
    name: "nextcloud RAM usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/usedswappercent"
    name: "nextcloud SWAP usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freesystemdisk"
    name: "nextcloud system disk usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freedatadisk"
    name: "nextcloud data disk usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/averageload"
    name: "nextcloud average load"
  - platform: mqtt
    state_topic: "srv/uptimedata"
    name: "nextcloud uptime"
  - platform: mqtt
    state_topic: "srv/cpuload"
    name: "nextcloud CPU load"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate1"
    name: "nextcloud drive 1 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct1"
    name: "nextcloud drive 1 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate1"
    name: "nextcloud drive 1 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count1"
    name: "nextcloud drive 1 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count1"
    name: "nextcloud drive 1 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable1"
    name: "nextcloud drive 1 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status1"
    name: "nextcloud drive 1 SMART status"
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate2"
    name: "nextcloud drive 2 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct2"
    name: "nextcloud drive 2 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate2"
    name: "nextcloud drive 2 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count2"
    name: "nextcloud drive 2 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count2"
    name: "nextcloud drive 2 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable2"
    name: "nextcloud drive 2 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status2"
    name: "nextcloud drive 2 SMART status"
  - platform: mqtt
    state_topic: "srv/raid_system_status"
    name: "nextcloud RAID system status"
  - platform: mqtt
    state_topic: "srv/raid_var_status"
    name: "nextcloud RAID var status"
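The sensor list is long and repetitive, so it can be generated rather than typed by hand. A sketch under that assumption: ha_sensor is a hypothetical helper that prints one entry for the configuration above, and the two names passed to it here are only examples:

```shell
#!/bin/bash
# ha_sensor is a hypothetical helper: it prints one MQTT sensor entry.
# Arguments: topic suffix, sensor name, optional unit of measurement.
ha_sensor() {
    printf '  - platform: mqtt\n'
    printf '    state_topic: "srv/%s"\n' "$1"
    printf '    name: "%s"\n' "$2"
    if [ -n "$3" ]; then
        printf '    unit_of_measurement: "%s"\n' "$3"
    fi
}

echo "sensor:"
ha_sensor tempcpu "nextcloud CPU temperature" "°C"
ha_sensor averageload "nextcloud average load"
```

The output can be redirected to a file and pasted into the configuration after a visual check.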
The screenshot shows that the server in question is meant for nextcloud. Its internal metrics can also be added to HA: nextcloud has a wonderful API for this, and HA has a built-in integration.